Adam Gibson

Deep learning is all the rage in advanced analytics. How does it work and how can it scale? Adam Gibson, Data Scientist and Co-founder of Skymind, explains why representation learning is an advance over traditional machine learning techniques. He also gives a demo of a working deep belief net (DBN) with a tour through DL4J's API, showing how a DBN extracts features and classifies data.


This talk was given at the SF Data Mining meetup at Trulia.

Reynold Xin

Mining big data can be an incredibly frustrating experience because of its inherent complexity and a lack of tools. Reynold Xin and Aaron Davidson are Committers and PMC Members for Apache Spark and use the framework to mine big data at Databricks. In this presentation and interactive demo, you'll learn about data mining workflows, the architecture and benefits of Spark, and practical use cases for the framework.

Rosaria Silipo

Open source tools usually delegate their support service to community forums. How reliable is this strategy? In this talk, Rosaria Silipo answers that question along with another: "Who says that Open Source Software does not have support?" She measures the efficiency of the community forum of KNIME, an open source data analytics platform, from 2007 to 2012. Techniques commonly used in social media analysis, such as web crawling, web analytics, text mining, and network analytics, are applied to investigate the forum's characteristics. Each part is described in detail during this presentation. This talk was recorded at the SF Data Mining meetup at inPowered.

John Jensen

John Jensen and Mike Sherman speak about their problem domain at Rich Relevance, where they provide content personalization as a service, mostly to retailers. Unlike Pandora, they don't use intrinsic similarity metrics built on in-depth knowledge of the domain in which they are recommending. This talk was recorded at the SF Data Mining meetup at Pandora HQ.

Todd Holloway

Recommendation engines typically produce a list of recommendations in one of two ways: through collaborative filtering or content-based filtering. Collaborative filtering approaches build a model from a user's past behavior (items previously purchased or selected, and/or numerical ratings given to those items) as well as similar decisions made by other users, then use that model to predict items (or ratings for items) that the user may be interested in. Content-based filtering approaches use a set of discrete characteristics of an item to recommend additional items with similar properties.
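
To make the contrast concrete, here is a minimal sketch of both approaches in plain Python. It is an illustration only, not the method presented in the talk: the function names, the toy ratings, the item tags, and the choice of cosine similarity are all assumptions made for this example.

```python
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def collaborative_scores(ratings, user):
    """Predict ratings for items the user has not rated, using the
    similarity-weighted average of other users' ratings."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine(ratings[user], other_ratings)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, rating in other_ratings.items():
            if item in ratings[user]:
                continue  # only score unseen items
            totals[item] = totals.get(item, 0.0) + sim * rating
            weights[item] = weights.get(item, 0.0) + sim
    return {item: totals[item] / weights[item] for item in totals}


def content_scores(features, liked, seen):
    """Score each unseen item by its best feature similarity to a liked item."""
    return {
        item: max((cosine(vec, features[l]) for l in liked), default=0.0)
        for item, vec in features.items()
        if item not in seen
    }


# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "ann": {"A": 5, "B": 4},
    "bob": {"A": 5, "B": 4, "C": 5},
    "cat": {"A": 1, "C": 2, "D": 1},
}

# Hypothetical item characteristics: item -> {tag: weight}.
features = {
    "A": {"rock": 1, "guitar": 1},
    "B": {"rock": 1, "drums": 1},
    "C": {"rock": 1, "guitar": 1, "live": 1},
    "D": {"jazz": 1, "piano": 1},
}

collab = collaborative_scores(ratings, "ann")
content = content_scores(features, liked=["A", "B"], seen={"A", "B"})

# Rank unseen items from most to least recommended under each approach.
collab_ranking = sorted(collab, key=collab.get, reverse=True)
content_ranking = sorted(content, key=content.get, reverse=True)
```

The collaborative function never looks at what the items are, only at who rated them and how. The content-based function never looks at other users, only at item characteristics. That difference is why the first needs rating history from many users, while the second needs good item descriptions.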