
Experimenting With Statistical Rigor to Make Data-Driven Taco Decisions



Let’s say you just moved to a new city, and you are trying to identify delicious restaurants in the area. You happen to love tacos (because who doesn’t?!), and there are three taco restaurants near you. Without knowing which restaurant is best, you want to make an educated guess before choosing one. So, you open your favorite food app and try to decide based on ratings and reviews.

How can you steer the decision-making process? Here are three approaches you can take: 

Approach 1: The Leave-It-to-the-Food-Blogger Method

You follow a food blogger online who writes about restaurants in your area, and you really trust their opinion. Their review of restaurant A says, “I’m never coming back again.” They give restaurant B an average rating. However, when writing about restaurant C, they exclaim, “This is the best taco I’ve ever had in my life!” According to the blogger’s opinion, restaurant C is the winning choice.

Approach 2: The Overall Ratings Method 

Another simple decision-making method is to look at the overall ratings, as most people do. Restaurant A has an overall rating of 4.8, restaurant B has an overall rating of 4.9, and restaurant C has an overall rating of 5. In conclusion, restaurant C, again, seems to be the obvious winner. 

Although these two approaches are common, they highlight two frequent problems that people face when they make decisions, even with the help of data.

Problem 1: Dependence on Small Data Points

This is when you rely heavily on a handful of data points, or even a single one, which often leads to a decision based on incomplete or biased information, such as one blogger’s review.

Problem 2: Lack of Systematic Analysis 

This is when you don’t perform a rigorous and systematic analysis of all the data you have at hand. Instead, you simply compare averages, such as the overall ratings found online.

Don’t Be Led to the Wrong Taco

In approach 1, the food blogger’s review might be completely subjective and turn out to be an unpopular opinion that contradicts what most people think.

In approach 2, if restaurant C is a brand-new restaurant with only three reviews, then its rating is unlikely to be representative of everyone else’s opinion. The restaurant owner may have received those five-star reviews from friends and family on opening night. Plus, a 4.9 average rating for restaurant B doesn’t mean it’s better than restaurant A with 4.8 stars; a gap that small could easily be due to chance.
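To see why three reviews say so little, here is a minimal Python sketch that puts a 95% Wilson score interval around the share of five-star reviews. The review counts (3 of 3 for the new restaurant, 450 of 500 for an established one) are hypothetical numbers chosen for illustration, and `wilson_interval` is a helper name made up for this example.

```python
import math

def wilson_interval(successes, n, z=1.96):
    """Return the (lower, upper) Wilson score interval for a proportion.

    z=1.96 corresponds to a 95% confidence level.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    # Clamp to [0, 1] to guard against floating-point overshoot.
    return max(0.0, center - half_width), min(1.0, center + half_width)

# Hypothetical counts, not taken from any real restaurant.
new_low, new_high = wilson_interval(3, 3)        # brand-new restaurant
est_low, est_high = wilson_interval(450, 500)    # established restaurant

print(f"3 of 3 five-star reviews:     {new_low:.2f} to {new_high:.2f}")
print(f"450 of 500 five-star reviews: {est_low:.2f} to {est_high:.2f}")
```

With these made-up counts, the plausible range for the three-review restaurant stretches from roughly 0.44 to 1.00, while the 500-review restaurant is pinned down to about 0.87 to 0.92. A perfect score built on three reviews is compatible with a restaurant that fewer than half of diners would rate five stars.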

These mistakes are harmless in this context, because if you pick the wrong restaurant, it’s not the end of the world. If worst comes to worst, you ingest some bad taco meat. However, in the world of product development, a wrong decision could lead to catastrophic consequences, with millions of dollars at stake. That’s why a rigorous and systematic approach to experimentation is essential for product dev teams.

Approach 3: Choosing Taco Kitchens Like a Statistician

Here’s a third approach you should try. It’s how you’d make a data-driven dining decision if you were a statistician. 

Instead of relying on a single person’s opinion, the statistician looks at all of the data available and makes a decision based on thorough analysis. First, they’d rule out restaurant C, because three reviews don’t provide enough information about the restaurant. Sure, they could be adventurous and just go with it, but they are a rational, risk-averse thinker. Next, they’d perform a statistical comparison between restaurant A and restaurant B and find that the difference between their ratings is not statistically significant. In this case, the ratings alone can’t determine which option is better.
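One way to make that comparison concrete is a permutation test, sketched below in Python. This is an illustration under stated assumptions, not the method the statistician in the story necessarily used: the individual star ratings are hypothetical (40 reviews per restaurant, averaging 4.8 and 4.9), and `permutation_test` is a helper written for this example.

```python
import random

def permutation_test(sample_a, sample_b, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean_b - mean_a) and an approximate p-value.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a, n_b = len(sample_a), len(sample_b)
    observed = sum(sample_b) / n_b - sum(sample_a) / n_a
    pooled = list(sample_a) + list(sample_b)
    extreme = 0
    for _ in range(n_permutations):
        # Reassign the ratings to the two restaurants at random.
        rng.shuffle(pooled)
        diff = sum(pooled[n_a:]) / n_b - sum(pooled[:n_a]) / n_a
        # Small tolerance guards against floating-point ties.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    return observed, extreme / n_permutations

# Hypothetical star ratings: 40 reviews each, averaging 4.8 and 4.9.
ratings_a = [5] * 34 + [4] * 4 + [3] * 2
ratings_b = [5] * 37 + [4] * 2 + [3] * 1

observed_diff, p_value = permutation_test(ratings_a, ratings_b)
print(f"Observed difference in mean rating: {observed_diff:.2f}")
print(f"Approximate p-value: {p_value:.3f}")
```

With these made-up ratings, the p-value lands far above the conventional 0.05 threshold. Randomly shuffling the reviews between the two restaurants produces a 0.1-star gap quite often, so the gap by itself is weak evidence that one restaurant is better.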

Time to dig deeper. To respond like a statistician, you’d conduct a sentiment analysis of each restaurant’s reviews and discover that 80% of restaurant A’s reviews praise the quality of its food. By contrast, only 50% of restaurant B’s reviews mention food quality; the rest focus on ambience and service. You determine that the difference between 50% and 80% is statistically significant. You didn’t technically run an experiment by manipulating variables here, but to the best of your ability, you land on restaurant A as the best option. So you go with it. After all, food quality overrides ambience in your decision-making process.
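Whether 80% versus 50% counts as statistically significant depends on how many reviews stand behind each percentage. The Python sketch below runs a two-proportion z-test, one common choice for this kind of comparison, on two hypothetical scenarios: 200 reviews per restaurant and 10 reviews per restaurant. The review counts and the function name `two_proportion_z_test` are assumptions for illustration; the original example does not say how many reviews were analyzed.

```python
import math

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test for the difference between two proportions.

    Returns the z statistic and the p-value under the normal approximation.
    """
    p_a = hits_a / n_a
    p_b = hits_b / n_b
    # Pooled proportion under the null hypothesis that both rates are equal.
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical review counts: 80% vs. 50% of reviews praising food quality.
z_large, p_large = two_proportion_z_test(160, 200, 100, 200)
z_small, p_small = two_proportion_z_test(8, 10, 5, 10)

print(f"200 reviews each: z = {z_large:.2f}, p = {p_large:.2g}")
print(f" 10 reviews each: z = {z_small:.2f}, p = {p_small:.2g}")
```

With 200 reviews per restaurant, the p-value falls far below 0.05, so the difference is very unlikely to be chance. With only 10 reviews each, the same 80% and 50% yield a p-value above 0.05. The normal approximation is rough at such small counts, but the direction of the result illustrates why sample size matters.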

Did You Make the Right Data-Driven Decision? 

Fast forward to 3 years after your dining decision. You are now extremely familiar with the area. Based on your experience and all of your friends’ opinions, you know that restaurant A and restaurant B are equally loved. Restaurant C, unfortunately, has closed due to insufficient customer traffic. 

You think back to day one, when you were deciding between restaurants. Had you gone with either of the first two approaches, your decision would have been wrong.

Statistical Analysis Matters to Product Development

Even without going deep into statistical jargon and detail, this example hopefully gives you some intuition about why experimentation and statistical analysis matter. Product development is, and should be, a very similar process: the decision to ship a product should rely on a systematic, quantitative analysis of crucial metrics that compares different options, rather than on single data points or a simple comparison of averages. With tacos, it might not matter much if you make the wrong decision, but for software features, statistical rigor could save your career. At Split, our statistical engine is built with the utmost rigor to make your feature decisions as safe as possible, without requiring any statistical knowledge from the user.

In a future blog post, I will explain in more statistical detail why having a large number of data points and using a systematic hypothesis testing framework makes the process more rigorous.

Check back soon.

