Silviu Calinoiu

This talk shows how to build an ETL pipeline with Google Cloud Dataflow/Apache Beam that ingests textual data into a BigQuery table. Google engineer Silviu Calinoiu gives a live coding demo and discusses the concepts as he codes. You don't need any previous background with big data frameworks, although people familiar with Spark or Flink will recognize some similar concepts. Because of the way the framework operates, the same code scales easily from gigabyte-sized files to terabyte-sized files.

Joe Crobak

Big data processing with Apache Hadoop, Spark, Storm, and friends is all the rage right now. But getting started with one of these systems requires an enormous amount of infrastructure, and there is an overwhelming number of decisions to make. Often you don't even know what kinds of questions you can or should be answering with your data.
