Calvin French-Owen

Data is critical to building great apps. Engineers and analysts can understand how customers interact with their brand at any time of day, from any place, on any device, and use that information to build a product customers love. But there are countless ways to track, manage, transform, and analyze that data. And when companies are also trying to understand experiences across devices and the effect of mobile marketing campaigns, data engineering can be even trickier. What’s the right way to use data to help customers better engage with your app?

In this all-star panel, hear from mobile experts at Instacart, Branch Metrics, Pandora, Invoice2Go, Gametime and Segment on the best practices they use for tracking mobile data and powering their analytics.

Che Horder is the Director of Analytics at Instacart, and previously led a data science and engineering team at Netflix as Director of Marketing Analytics.

Gautam Joshi is the Engineering Program Manager of Analytics at Pandora and formerly worked at CNET/CBSi and Rdio. He helped create sustainable solutions for deriving meaning from large datasets. He’s a huge fan of music and technology, a California native and a proud Aggie.

Mada Seghete is the co-founder of Branch Metrics, a powerful tool that helps mobile app developers use data to grow and optimize their apps.

Beth Jubera is Senior Software Engineer at Invoice2Go, and was previously a Systems Engineer at IBM.

John Hession is VP of Growth at Gametime, and was previously Director of Mobile Operations and Client Strategy at Conversant.

Nick Chamandy

Simple “random-user” A/B experiment designs fall short in the face of complex dependence structures. These can come in the form of large-scale social graphs or, more recently, spatio-temporal network interactions in a two-sided transportation marketplace. Naive designs are susceptible to statistical interference, which can lead to biased estimates of the treatment effect under study.

Ben Packer

With the world’s largest residential energy dataset at their fingertips, Opower is uniquely situated to use Machine Learning to tackle problems in demand-side management. Their communication platform, which reaches millions of energy customers, allows them to build those solutions into their products and make a measurable impact on energy efficiency, customer satisfaction and cost to utilities.

In this talk, Opower surveys several Machine Learning projects that they’ve been working on. These projects vary from predicting customer propensity to clustering load curves for behavioral segmentation, and leverage supervised and unsupervised techniques.

Ben Packer is the Principal Data Scientist at Opower. Ben earned a bachelor's degree in Cognitive Science and a master's degree in Computer Science at the University of Pennsylvania. He then spent half a year living in a cookie factory before coming out to the West Coast, where he did his Ph.D. in Machine Learning and Artificial Intelligence at Stanford.

Justine Kunz is a Data Scientist at Opower. She recently completed her master’s degree in Computer Science at the University of Michigan with a concentration in Big Data and Machine Learning. Now she works on turning ideas into products from the initial Machine Learning research to the production pipeline.

This talk is from the Data Science for Sustainability meetup in June 2016.

Amir Najmi

Scalable web technology has greatly reduced the marginal cost of serving users. Thus, an individual business today may support a very large user base. With so much data, one might imagine that it is easy to obtain statistical significance in live experiments. However, this is not always the case. Often, the very business models enabled by the web require answers for which our data is information poor.

Greg Dingle

Tech businesses know how they're doing by numbers on a screen. The weakest link in the process of analysis is usually the part in front of the keyboard. People are not designed to think about abstract quantities. Scientists in the field of decision science have described for decades now exactly how people go wrong. You can overcome your biases only by being aware of them. Greg Dingle will walk you through some common biases, examples, and corrective measures.

Chris Becker

One of our key missions on the search team at Shutterstock is to constantly improve the reliability and speed of our search system.  To do this well, we need to be able to measure many aspects of our system’s health.  In this post we’ll go into some of the key metrics that we use at Shutterstock to measure the overall health of our search system.

The image above shows our search team’s main health dashboard.  Anytime we get an alert, a single glance at this dashboard can usually point us toward which part of the system is failing.

At a high level, the health metrics for our search system focus on its ability to respond to search requests, and its ability to index new content.  Each of these capabilities is handled by several different systems working together, and requires a handful of core metrics to monitor its end-to-end functionality.

One of our key metrics is the rate of traffic that the search service is currently receiving.  Since our search service serves traffic from multiple sites, we also have other dashboards that break down those metrics further for each site.  In addition to the total number of requests we see, we also measure the rate of memcache hits and misses, the error rate, and the number of searches returning zero results.
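The post doesn’t show how these counters are emitted, but the pattern it describes maps naturally onto per-request counters in the plain StatsD wire format. The sketch below is a hypothetical illustration, not Shutterstock’s actual instrumentation; the metric names and the `record_search` helper are invented for the example.

```python
import socket

# Hypothetical StatsD daemon address; adjust for your environment.
STATSD_ADDR = ("127.0.0.1", 8125)

def statsd_line(name, value, metric_type="c"):
    """Format one metric in the plain StatsD wire format, e.g. 'search.web.requests:1|c'."""
    return f"{name}:{value}|{metric_type}"

def record_search(site, cache_hit, error, zero_results, sock=None):
    """Emit one counter per tracked dimension for a single search request.

    'site' keys the per-site breakdown mentioned above; the metric name
    scheme here is invented for illustration.
    """
    lines = [statsd_line(f"search.{site}.requests", 1)]
    lines.append(statsd_line(f"search.{site}.memcache.{'hit' if cache_hit else 'miss'}", 1))
    if error:
        lines.append(statsd_line(f"search.{site}.errors", 1))
    if zero_results:
        lines.append(statsd_line(f"search.{site}.zero_results", 1))
    if sock is not None:
        for line in lines:
            sock.sendto(line.encode(), STATSD_ADDR)
    return lines
```

A dashboard backend (Graphite, Datadog, etc.) can then sum these counters into the request-rate, hit/miss, error, and zero-result graphs described above.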

One of the most critical metrics we focus on is our search service latency.  This varies greatly depending on the type of query, number of results, and type of sort order being used, so this metric is also broken down into more detail in other dashboards.  For the most part we aim to maintain response times of 300ms or less for 95% of our queries.  Our search service runs a number of different processes before a query reaches our Solr pool (language identification, spellcheck, translation, etc.), so this latency represents the sum total of all those processes.
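A 95th-percentile target like this is straightforward to check over a window of latency samples. The following is a minimal sketch of that check, assuming the simple nearest-rank definition of a percentile; the 300 ms budget mirrors the target stated above, while the function names are invented for illustration.

```python
import math

def percentile(samples_ms, pct):
    """Nearest-rank percentile over a window of latency samples (milliseconds)."""
    ordered = sorted(samples_ms)
    k = max(0, math.ceil(pct / 100 * len(ordered)) - 1)
    return ordered[k]

# Alert threshold mirroring the stated 300 ms / 95th-percentile target.
P95_BUDGET_MS = 300

def latency_alert(samples_ms):
    """True when the 95th-percentile latency in this window exceeds the budget."""
    return percentile(samples_ms, 95) > P95_BUDGET_MS
```

In practice a metrics system computes percentiles from timer histograms rather than raw samples, but the alerting logic is the same shape.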

In addition to search service latency, we also track latency on our Solr cluster itself.  Our Solr pool will only see queries that did not have a hit in memcache, so the queries that run there may be a little slower on average.

When something in the search service fails or times out, we also track the rate of each type of error that the search service may return.  There’s always a steady stream of garbage traffic from bots generating queries that may error out, so we see a small but consistent stream of failed queries.  If a search service node is restarted we may also see a blip in HTTP 502 errors, although that’s a problem we’re trying to address by improving our load balancer’s responsiveness in taking nodes out of the pool before they go down.

A big part of the overall health of our system also includes making sure that we’re serving up new content in a timely manner.  Another graph on our dashboard tracks the volume and burndown of items in our message queues, which serve as our pipeline for ingesting new images, videos, and other assets into our Solr index.  This ensures that content is making it into our indexing pipeline, where all the data needed to make it searchable is processed.  If the indexing system stops being able to process data, that will usually cause the burndown rate of each queue to come to a halt.
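The burndown rate described here is just the drain rate computed from successive queue-depth samples; a stalled pipeline shows a nonzero backlog with a near-zero rate. A minimal sketch of that check, with invented function names and an assumed sampling format of `(timestamp_seconds, queue_depth)` pairs:

```python
def burndown_rate(samples):
    """Items drained per second between the first and last depth sample.

    samples: list of (timestamp_seconds, queue_depth) tuples, oldest first.
    Negative when the queue is growing faster than it drains.
    """
    (t0, d0), (t1, d1) = samples[0], samples[-1]
    return (d0 - d1) / (t1 - t0)

def queue_stalled(samples, min_rate=0.1):
    """Flag a queue whose backlog is nonzero but barely draining.

    min_rate is a hypothetical threshold in items/second; tune per queue.
    """
    return samples[-1][1] > 0 and burndown_rate(samples) < min_rate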

There are other ways in which our indexing pipeline may fail too, so we also have another metric that measures the amount of content that is making it through our indexing system, getting into Solr, and showing up in the actual output of Solr queries.  Each document that goes into Solr receives a timestamp when it was indexed.  One of our monitoring scripts then polls Solr at regular intervals to see how many documents were added or modified in a recent window of time.  This helps us serve our contributors well by making sure that their new content is being made available to customers in a timely manner.
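That polling check amounts to a count-only Solr query over a date range. The sketch below builds such a query URL; the `indexed_at` field name and the Solr URL are assumptions for illustration, not Shutterstock’s actual schema, and the real monitoring script would fetch the URL and alert when `numFound` drops to zero.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

def recent_docs_query(solr_base_url, window_minutes=10):
    """Build a count-only Solr query: how many docs were indexed in the last window?

    Assumes each document gets a date field stamped at index time;
    'indexed_at' is a hypothetical field name.
    """
    since = datetime.now(timezone.utc) - timedelta(minutes=window_minutes)
    since_str = since.strftime("%Y-%m-%dT%H:%M:%SZ")
    # rows=0 asks Solr for just the match count, not the documents themselves.
    params = {"q": f"indexed_at:[{since_str} TO NOW]", "rows": 0, "wt": "json"}
    return f"{solr_base_url}/select?{urlencode(params)}"
```

The monitoring script would parse `response["numFound"]` from the JSON reply and compare it against the expected ingest rate for that window.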

Behind the scenes we also have a whole host of other dashboards that break out the health and performance of each system covered in this dashboard, as well as metrics for other services in our search ecosystem.  When we’re deploying new features or troubleshooting issues, having metrics like these helps us quickly determine the impact and resolve it.

This article first appeared on Shutterbits.