Diving into Spark

Apache Spark is a next-generation engine for large-scale data processing, built with Scala. Dean Chen, a software engineer at eBay, discusses how Spark takes advantage of Scala's functional idioms to produce an expressive and intuitive API for big data analysis. Dean covers the design of Spark RDDs and how this abstraction enables the Spark execution engine to be extended to support a wide variety of use cases: Spark SQL, Spark Streaming, MLlib, and GraphX.


This was recorded at the Scala Bay meetup at PayPal.