An Introduction to Mining Big Data with Apache Spark

Mining Big Data can be an incredibly frustrating experience because of its inherent complexity and a lack of tools. Reynold Xin and Aaron Davidson are committers and PMC members for Apache Spark, and they use the framework to mine Big Data at Databricks. In this presentation and interactive demo, you'll learn about data mining workflows, the architecture and benefits of Spark, and practical use cases for the framework.

This talk was given at the SF Data Mining Meetup group hosted by Trulia in San Francisco.

Reynold Xin is a Co-Founder at Databricks

Databricks was founded out of the UC Berkeley AMPLab by the creators of Apache Spark. We’ve been working for the past six years on cutting-edge systems to extract value from Big Data. We believe that Big Data is a huge opportunity that is still largely untapped, and we’re working to revolutionize what you can do with it.