HiScore: A New Python Package for Making Scores

Scores let domain experts communicate the quality of a complex, multi-faceted object to a broader audience. Scores are ubiquitous; everything from NFL quarterbacks to the security risk of software has a score. Scoring also has commercial potential: beyond obvious applications such as credit scoring, both Klout (social media reputation scoring) and Walkscore (neighborhood walkability assessment) have been acquired in the past twelve months.

HiScore is a Python package that gives domain experts a new way to quickly create and improve scoring functions: reference sets, which are sets of representative objects that have been assigned scores. HiScore is currently used by a major environmental non-profit as well as by IES, a startup that assesses the safety and sustainability of fracking wells.
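To make the idea of a reference set concrete, here is a minimal sketch in plain Python. It does not use HiScore's own API: the names `reference_set`, `dominates`, and `monotonicity_violations` are hypothetical, the attribute values and scores are made up for illustration, and the sketch assumes every attribute is one where more is better. It only checks that an expert's reference set is internally consistent, meaning that an object that is at least as good on every attribute never receives a lower score.

```python
from itertools import permutations

# Hypothetical reference set: attribute tuple -> expert-assigned score.
# Values are illustrative only; both attributes are assumed "higher is better".
reference_set = {
    (0, 0): 0,
    (5, 2): 40,
    (5, 8): 70,
    (10, 10): 100,
}


def dominates(a, b):
    """Return True if a is at least as good as b on every attribute."""
    return all(x >= y for x, y in zip(a, b))


def monotonicity_violations(ref):
    """Return pairs (a, b) where a dominates b yet has a strictly lower score."""
    return [
        (a, b)
        for a, b in permutations(ref, 2)
        if dominates(a, b) and ref[a] < ref[b]
    ]


violations = monotonicity_violations(reference_set)
```

An empty `violations` list means the reference set can, at least in principle, be matched by a monotone scoring function; any pair it reports is a point the expert would need to revisit.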

HiScore relies on being able to interpolate through the reference set in an understandable and justifiable way. In technical terms, HiScore needs a good solution to the multivariate monotone scattered data interpolation problem. Monotone scattered data interpolation turns out to be trivial in one dimension and devilishly hard in higher dimensions. We discuss several failed approaches and false starts before finally arriving at the quasi-Kriging algorithmic foundation of HiScore. We conclude with applications, including the intuitive creation of complex scores with dozens of attributes.
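The one-dimensional case can be illustrated with a short sketch. This is ordinary piecewise-linear interpolation, not the quasi-Kriging algorithm discussed in the talk; the function name `interpolate_1d` and the sample points are hypothetical. It assumes the reference points have distinct attribute values and that their scores never decrease as the attribute increases, in which case the interpolant is monotone as well.

```python
from bisect import bisect_right


def interpolate_1d(points, x):
    """Piecewise-linear interpolation through (value, score) pairs.

    Assumes distinct values. Inputs outside the reference range are
    clamped to the score of the nearest end point.
    """
    pts = sorted(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # now xs[i - 1] <= x < xs[i]
    t = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + t * (ys[i] - ys[i - 1])


# Illustrative reference set with a single attribute.
reference = [(0, 0), (50, 60), (100, 100)]
score = interpolate_1d(reference, 25)
```

With more than one attribute there is no single ordering of the reference points to walk along, which is one way to see why this simple approach does not carry over.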

The theoretical basis of HiScore is joint work with Ken Judd (Stanford).


GitHub repo here.

This video was recorded at the SF Bay Area Machine Learning meetup at Thumbtack in SF.