PredictionIO: Build and Deploy ML Applications in a Fraction of the Time

In this talk, Simon Chan (co-founder of PredictionIO) introduces the latest developments and shows how to use PredictionIO to build and deploy predictive engines in real production environments. PredictionIO is an open source machine learning server built on Apache Spark and MLlib. It is designed for data scientists and developers to build predictive engines for real-world applications in a fraction of the time normally required.

Using PredictionIO’s DASE design pattern, Simon illustrates how developers can build machine learning applications with separation of concerns (SoC) in mind.
“D” stands for Data Source and Data Preparator, which take care of preparing data for model training.
“A” stands for Algorithm, which is where the code for one or more algorithms is implemented. MLlib, the machine learning library of Apache Spark, is natively supported here.
“S” stands for Serving, which handles the application logic during the retrieval of predicted results.
Finally, “E” stands for Evaluation.
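
To make the separation of concerns concrete, here is a minimal, language-agnostic sketch of the four DASE roles in plain Python. It is not the PredictionIO API: every class, function, and field name below (Event, DataSource, Preparator, PopularityAlgorithm, Serving, evaluate_precision, run_pipeline) is a hypothetical stand-in, and the "algorithm" is a toy popularity counter rather than an MLlib model.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical raw event: a user interacted with an item."""
    user: str
    item: str


class DataSource:
    """D (part 1): reads raw events from wherever they are stored."""

    def __init__(self, events):
        self.events = events

    def read_training(self):
        return list(self.events)


class Preparator:
    """D (part 2): turns raw events into the form the algorithm expects."""

    def prepare(self, events):
        # Drop duplicate (user, item) pairs so each user counts once per item.
        return sorted({(e.user, e.item) for e in events})


class PopularityAlgorithm:
    """A: trains a model and answers queries. Toy stand-in for a real learner."""

    def train(self, pairs):
        # The "model" is simply a count of distinct users per item.
        return Counter(item for _, item in pairs)

    def predict(self, model, query):
        return [item for item, _ in model.most_common(query["num"])]


class Serving:
    """S: applies application logic to predictions before returning them."""

    def __init__(self, blocked_items):
        self.blocked_items = set(blocked_items)

    def serve(self, query, predictions):
        return [p for p in predictions if p not in self.blocked_items]


def evaluate_precision(served, relevant):
    """E: fraction of served items that are in the relevant (held-out) set."""
    if not served:
        return 0.0
    return sum(1 for item in served if item in relevant) / len(served)


def run_pipeline(events, query, blocked_items=()):
    """Wire the components together: D -> A -> S."""
    raw = DataSource(events).read_training()
    prepared = Preparator().prepare(raw)
    algorithm = PopularityAlgorithm()
    model = algorithm.train(prepared)
    predictions = algorithm.predict(model, query)
    return Serving(blocked_items).serve(query, predictions)
```

Because each role sits behind its own small interface, the algorithm could be swapped for a different learner, or the serving rules changed, without touching the data-reading or evaluation code. That independence is the point of the pattern.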

Simon also covers upcoming development work, including new Engine Templates for various business scenarios.


This video was recorded at the SF Data Mining meetup in SF.