Machine Learning Meetup - Targeted Online Advertising

Here's a new talk on targeted online advertising recorded at one of the NYC Machine Learning meetups. Two presenters from Media6Degrees spoke about their respective papers from the recent Knowledge Discovery and Data Mining (KDD) conference. Claudia Perlich presented "Bid Optimizing and Inventory Scoring in Targeted Online Advertising" and Troy Raeder presented "Design Principles of Massive, Robust Prediction Systems." Full abstracts and audio below.


Description from Meetup.com:

At the Machine Learning Meetup hosted at Pivotal Labs, two presenters from Media6Degrees each talked about their papers from the Knowledge Discovery and Data Mining (KDD) conference. Claudia Perlich presented "Bid Optimizing and Inventory Scoring in Targeted Online Advertising" and Troy Raeder presented "Design Principles of Massive, Robust Prediction Systems." Abstracts and bios below.

Bid Optimizing and Inventory Scoring in Targeted Online Advertising

Billions of online display advertising spots are purchased on a daily basis through real-time bidding exchanges (RTBs). Advertising companies bid for these spots on behalf of a company or brand in order to display banner advertisements. These bidding decisions must be made in fractions of a second after the potential purchaser is informed of which location (Internet site) has a spot available and who would see the advertisement. The entire transaction must be completed in near real time to avoid delaying the page load and to maintain a good user experience. This paper presents a bid-optimization approach, implemented in production at Media6Degrees, for bidding on these advertising opportunities at an appropriate price. The approach combines several supervised learning algorithms with second-price auction theory to determine the correct price and ensure that the right message is delivered to the right person at the right time.
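The paper describes the actual production approach; purely as a rough illustration of the general idea, here is a minimal Python sketch that scores an impression with a supervised model and bids the resulting expected value, which is the truthful strategy in a second-price auction. The model choice, feature shapes, and dollar amounts are all assumptions for illustration, not the paper's method.

```python
# Minimal sketch (not the paper's actual production logic) of pricing a bid
# for a second-price auction: estimate the value of showing this user the ad,
# then bid that expected value. All names and numbers are illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per past impression; features describe
# the user/site, and the label is 1 if the user later converted.
X_train = np.random.rand(1000, 5)
y_train = (np.random.rand(1000) < 0.05).astype(int)

conversion_model = LogisticRegression().fit(X_train, y_train)

VALUE_PER_CONVERSION = 2.00   # assumed advertiser value, in dollars
MAX_BID = 0.01                # assumed per-impression cap, in dollars

def price_bid(impression_features):
    """Return a bid (in dollars) for one RTB impression."""
    p_convert = conversion_model.predict_proba([impression_features])[0, 1]
    expected_value = p_convert * VALUE_PER_CONVERSION
    return min(expected_value, MAX_BID)

print(price_bid(np.random.rand(5)))
```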

Design Principles of Massive, Robust Prediction Systems

Most data mining research is concerned with building high-quality classification models in isolation. In massive production systems, however, the ability to monitor and maintain performance over time while growing in size and scope is equally important. Many external factors may degrade classification performance, including changes in data distribution, noise or bias in the source data, and the evolution of the system itself. A well-functioning system must gracefully handle all of these. This paper lays out a set of design principles for large-scale autonomous data mining systems and then demonstrates our application of these principles within the m6d automated ad targeting system. We describe a comprehensive set of quality control processes that allow us to monitor and maintain thousands of distinct classification models automatically, and to add new models, take on new data, and correct poorly-performing models without manual intervention or system disruption.
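The paper lays out the design principles themselves; as a hedged sketch of what the kind of automated quality control described above might look like in code (not m6d's actual system), the snippet below assumes a registry of deployed models with baseline AUCs and a feed of recently labeled holdout traffic, and flags any model whose AUC has slipped beyond a tolerance so it can be retrained or rolled back automatically. All names and thresholds are hypothetical.

```python
# Illustrative sketch only: score each deployed model on recent held-out data
# and flag models whose performance has degraded, so they can be retrained or
# rolled back without manual intervention.
from sklearn.metrics import roc_auc_score

# Hypothetical registry: model_id -> (fitted model, baseline AUC at deploy time)
model_registry = {}          # e.g. {"brand_123": (fitted_model, 0.78), ...}
AUC_DROP_THRESHOLD = 0.05    # assumed tolerance before a model is flagged

def check_models(recent_data):
    """recent_data: model_id -> (X_holdout, y_holdout) of fresh labeled traffic."""
    flagged = []
    for model_id, (model, baseline_auc) in model_registry.items():
        X, y = recent_data[model_id]
        current_auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
        if baseline_auc - current_auc > AUC_DROP_THRESHOLD:
            flagged.append((model_id, baseline_auc, current_auc))
    return flagged  # candidates for automatic retraining or rollback
```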

Troy Raeder Bio:

Troy Raeder earned B.S., M.S., and Ph.D. degrees in Computer Science from the University of Notre Dame and is currently a Data Scientist at M6D. He has published academic articles in a number of venues, including Pattern Recognition and the Journal of Machine Learning Research. His current research interests include machine learning for online advertising, learning under shifting distributions, and the development of large-scale machine learning algorithms and systems.

Claudia Perlich Bio:

Claudia Perlich has been Chief Scientist at Media6Degrees, a startup that specializes in targeted online display advertising, since 2010. She received her Ph.D. in Information Systems from the Stern School of Business, New York University, in 2005 and holds additional graduate degrees in Computer Science. Claudia joined the Data Analytics Research group at the IBM T.J. Watson Research Center in 2004, where she continued her research on data analytics and machine learning for complex real-world domains and applications. She is the author of more than 50 scientific publications, holds multiple patents in the area of machine learning, has won various data mining competitions and best paper awards, and speaks regularly at conferences and other public events.

