Nick Elprin, founder of Domino Data Lab, talks about how to deploy predictive models into production, specifically in the context of a corporate enterprise use case. Nick demonstrates an easy way to “operationalize” your predictive models by exposing them as low-latency web services that can be consumed by production applications. In the context of a real-world use case this translates into more subtle requirements for hosting predictive models, including zero-downtime upgrades and retraining/redeploying against new data. Nick also focuses on the best practices for writing code that will make your predictive models easier to deploy.
In this presentation, Dr. Michael Kane (Associate Research Scientist, Yale University) introduces a scalable, distributed software design that supports communication patterns beyond those available in existing MapReduce frameworks, making it appropriate for a more general class of computing challenges.
Meetup organizers and business owners have the same question: "Where should I put my event, store, or factory to maximize attendance?" You could pay consultants thousands of dollars or figure it out with a free weekend and R!
There are exciting applications of network science and graphical modeling in recent brain imaging studies. Watch as Ivor Cribben examines the challenges of estimating group-level dynamic connectivity structure across subjects and outlines novel data-driven statistical methods to estimate connectivity. Techniques discussed include:
In this talk on Machine Learning Distributed GBM, Earl Hathaway, resident Data Scientist at 0xdata, talks about distributed GBM, one of the most popular machine learning algorithms used in data mining competitions. He will discuss where distributed GBM is applicable, and review recent KDD & Kaggle uses of machine learning and distributed GBM. Also, Cliff Click, CTO of 0xdata, will talk about implementation and design choices of a Distributed GBM. This talk was recorded at the SF Data Mining meetup at Trulia.
In this talk, Tal Galili, the founder of R-bloggers, presents his recent work "dendextend," an R package for visualizing and comparing trees of hierarchical clusterings (a.k.a. dendrograms). This talk was recorded at the New York Statistical Programming meetup at Knewton.
In this talk, "Using Go for Statistical Programming," Aditya Mukerjee, student at Cornell Tech, discusses how to use Google's Go programming language for statistics. This talk was recorded at the New York Open Statistical Programming meetup at Knewton.
This talk is by Adam Ilardi, a data scientist at eBay, and was recorded at the NY Scala meetup at eBay NYC. Adam talks about eBay's transition from Pig and raw Cascading to Scalding and explains other ways they use Scala.
Today's talk was recorded at the recent Bay Area UseR meetup. Truly a killer talk here! Drew Linzer covers the dynamic Bayesian forecasting model he used at VOTAMATIC to correctly call the outcome of all 50 states in his final forecast, which was posted on Election Day at 7:29 a.m. PST.
Today we've got the last of our November R user talks. It's a lightning talk by Aurobindo Tripathy on Skin Detection Using Random Forest Decision Trees.
We were lucky to attend the Bay Area R users group last week, where we recorded Laurent Gautier's talk on the RPy2 bridge, which lets you use Python as the glue language for developing applications while using R as the statistics and data analysis engine. He also demonstrated how a web application could be developed around an existing R script.