Anna Smith

Anna Smith from Rent the Runway talks about how they've evolved their data pipeline over time to deal with infrastructure constraints, disparate data sources, and changes in those sources and their data quality, all while still serving reports and data back to the website with minimal downtime. Anna also covers how they used Luigi to keep reporting robust without forcing non-technical analysts to learn Python.


This video was recorded at the NYC Data Engineering meetup at Spotify in NYC.

Joe Crobak

In this talk, Joe Crobak, formerly of Foursquare, gives a brief overview of how a workflow engine fits into a standard Hadoop-based analytics stack. He also gives an architectural overview of Azkaban, Luigi, and Oozie, elaborating on features, tools, and practices that can help build a Hadoop workflow system from scratch or improve an existing one.

This video was recorded at the NYC Data Engineering meetup at eBay.