An Introduction to Mining Big Data with Apache Spark

Mining big data can be a frustrating experience because of its inherent complexity and a lack of tools. Reynold Xin and Aaron Davidson are committers and PMC members for Apache Spark, and they use the framework to mine big data at Databricks. In this presentation and interactive demo, you'll learn about data mining workflows, the architecture and benefits of Spark, and practical use cases for the framework.


This talk was given at the SF Data Mining Meetup group hosted by Trulia in San Francisco.

