Yael Elmatad

Many data scientists work within the realm of machine learning, and their problems are often addressable with techniques such as classifiers and recommendation engines. However, the data scientists at Tapad have often had to look outside the standard machine learning toolkit and draw inspiration from more traditional engineering algorithms. This has enabled them to solve a scaling problem with their Device Graph's connected component and to maintain time-consistency in cluster identification week over week.

Sandeep Jain

The Gist: Ever wonder why you keep getting ads for Budweiser when you're clearly a Coors aficionado? In this Lyceum, Sandeep Jain, technical advisor at Axial, will dive into the algorithms and systems that decide how online ads are delivered. Never again will you wonder why you're being hawked bad beer. This Lyceum will mix behavioral economics, game theory, distributed systems, and graph theory into one fun and informative talk.


Speaker Bio: Sandeep is currently a technical advisor to Axial. Before that, he cofounded Reschedge, a SaaS enterprise recruiting tool that was recently sold to HireVue. He started his career at Google, where he spent five years working on Google Maps and DoubleClick products. He finished his career there as the technical lead of the display advertising backend.

Max Sklar

When it comes to recommendation systems and natural language processing, data that can be modeled as a multinomial or as a vector of counts is ubiquitous. For example, if there are two possible user-generated ratings (like and dislike), then each item is represented as a vector of two counts. In a higher-dimensional case, each document may be expressed as a vector of word counts, with the vector large enough to encompass all the important words in that corpus of documents. The Dirichlet distribution is one of the basic probability distributions for describing this type of data. In this talk, Max Sklar, from Foursquare, takes a closer look at the Dirichlet distribution and its properties, as well as some of the ways it can be computed efficiently. This talk was recorded at the NYC Machine Learning meetup at Pivotal Labs.
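
To make the like/dislike example concrete, here is a minimal sketch of how a Dirichlet prior can be combined with a vector of counts. It is not taken from the talk: the function names, the symmetric prior of 1.0 per category, and the example counts are all assumptions chosen for illustration. It relies on two standard facts: the posterior under a Dirichlet prior is again a Dirichlet with the counts added to the prior parameters, and a Dirichlet draw can be obtained by normalizing independent gamma draws.

```python
import random
from typing import List, Sequence


def dirichlet_posterior_mean(counts: Sequence[int], alpha: Sequence[float]) -> List[float]:
    """Posterior mean of the category probabilities.

    With a Dirichlet(alpha) prior and observed counts, the posterior is
    Dirichlet(alpha + counts), whose mean is the normalized parameter vector.
    """
    if len(counts) != len(alpha):
        raise ValueError("counts and alpha must have the same length")
    posterior = [a + c for a, c in zip(alpha, counts)]
    total = sum(posterior)
    return [p / total for p in posterior]


def sample_dirichlet(alpha: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one probability vector by normalizing independent Gamma(alpha_i, 1) draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


# Hypothetical items, each described by [likes, dislikes].
prior = [1.0, 1.0]  # assumed symmetric prior: one pseudo-count per rating
few_ratings = dirichlet_posterior_mean([3, 0], prior)
many_ratings = dirichlet_posterior_mean([300, 100], prior)

# One random draw from the posterior of the second item.
posterior_draw = sample_dirichlet([301.0, 101.0], random.Random(0))

print(few_ratings, many_ratings, posterior_draw)
```

With only three likes and no dislikes, the prior pulls the estimated like-rate away from 100%, while the item with 400 ratings stays close to its raw ratio of 0.75. This smoothing of sparse counts is one common reason to model such data with a Dirichlet.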

Jeroen Janssens

In this talk, Jeroen Janssens, senior data scientist at YPlan, introduces both the outlier-selection and one-class classification settings. He then presents a novel algorithm called Stochastic Outlier Selection (SOS). The SOS algorithm computes an outlier probability for each data point. These probabilities are more intuitive than the unbounded outlier scores computed by existing outlier-selection algorithms. Jeroen has evaluated SOS on a variety of real-world and synthetic datasets and compared it to four state-of-the-art outlier-selection algorithms. The results show that SOS performs better while being more robust to data perturbations and parameter settings. Jeroen's blog post on the subject contains links to the d3 demo. This talk was recorded at the NYC Machine Learning meetup at Pivotal Labs.
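
To give a feel for what a per-point outlier probability looks like, here is a simplified sketch in the spirit of SOS. It is not the speaker's implementation. It assumes Euclidean distance and a single fixed `bandwidth` for every point, whereas the published algorithm tunes a variance per point through a perplexity parameter. The function name and the toy data are hypothetical.

```python
import math
from typing import List, Sequence


def outlier_probabilities(points: Sequence[Sequence[float]], bandwidth: float = 1.0) -> List[float]:
    """Simplified SOS-style outlier probabilities (fixed bandwidth, an assumption).

    1. Affinity of point i for point j decays with their squared distance.
    2. Each row of affinities is normalized into binding probabilities.
    3. A point's outlier probability is the probability that no other
       point binds to it: the product over j of (1 - b[j][i]).
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two points")
    binding: List[List[float]] = []
    for i in range(n):
        row = []
        for j in range(n):
            if i == j:
                row.append(0.0)  # a point has no affinity for itself
            else:
                d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                row.append(math.exp(-d2 / (2.0 * bandwidth ** 2)))
        total = sum(row)
        if total == 0.0:
            # All affinities underflowed: fall back to uniform binding.
            row = [0.0 if j == i else 1.0 / (n - 1) for j in range(n)]
        else:
            row = [a / total for a in row]
        binding.append(row)

    probabilities = []
    for i in range(n):
        p = 1.0
        for j in range(n):
            if j != i:
                p *= 1.0 - binding[j][i]
        probabilities.append(p)
    return probabilities


# Toy data: four points in a tight square plus one far-away point.
data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
probs = outlier_probabilities(data, bandwidth=1.0)
for point, prob in zip(data, probs):
    print(point, round(prob, 3))
```

Because every result lies between 0 and 1, a threshold such as 0.5 has a direct interpretation, which is the advantage over unbounded scores that the talk highlights. In this toy run the isolated point receives a probability close to 1 while the four clustered points stay well below 0.5.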
