Information Theoretic Metrics for Multi-Class Predictor Evaluation
The most common metrics used to evaluate a classifier are accuracy, precision, recall and F1 score.
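As a rough illustration of the four metrics named above, here is a minimal Python sketch that computes them for a multi-class problem. The function names (`accuracy`, `per_class_metrics`) and the toy labels are assumptions made for this example, not taken from the talk, and per-class precision, recall and F1 are computed one-vs-rest.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_class_metrics(y_true, y_pred):
    """One-vs-rest precision, recall and F1 for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Hypothetical toy data, for illustration only.
y_true = ["cat", "cat", "dog", "dog", "bird"]
y_pred = ["cat", "dog", "dog", "dog", "cat"]

acc = accuracy(y_true, y_pred)
scores = per_class_metrics(y_true, y_pred)
macro_f1 = sum(m["f1"] for m in scores.values()) / len(scores)
```

Averaging the per-class F1 values with equal weight gives a macro F1; weighting by class frequency is another common choice.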
Globally Scalable Web Document Classification Using Word2Vec
Extracting information from unstructured web documents is a common problem for many applications, and determining which category each document belongs to can be especially challenging at planetary scale.
New Workflows for Building Data Pipelines
As companies continue to become more data-driven, data pipelines have become much more complicated, and we need new tools and workflows for managing them. In this talk, Joe Doliner, co-founder of Pachyderm, looks at some of the current data pipelining challenges and how he envisions them being solved in the future.