Asim Jalis Asim Jalis on

Big Data applications need to ingest streaming data and analyze it. HBase is great at ingesting streaming data but not so good at analytics. On the other hand, HDFS is great at analytics but not at ingesting streaming data. Frequently applications ingest data into HBase and then move it to HDFS for analytics.

Continue
John Myles White John Myles White on

In this talk, "Streaming Data Analysis and Online Learning," John Myles White of Facebook surveys some basic methods for analyzing data in a streaming manner. He focuses on using stochastic gradient descent (SGD) to fit models to data sets that arrive in small chunks, discussing some basic implementation issues and demonstrating the effectiveness of SGD for problems like linear and logistic regression as well as matrix factorization. He also describes how these methods allow ML systems to adapt to user data in real-time. This talk was recorded at the New York Open Statistical Programming meetup at Knewton.

Continue