When measuring latency, keeping track of individual measurements is a good first-pass solution. It gives you the flexibility to perform aggregations later, experiment with different ways of constructing histograms, and explore metrics in specific time windows, among other tests.
However, as your scale increases, memory constraints can make it impossible to record every individual measurement. At this point, you’ll need techniques that let you keep a growing number of measurements in a constrained amount of memory. We’ve found that creating a latency histogram is a great way to aggregate latency measurements.
What is the ideal bucket size for the histogram? That’s the key question you’ll need to answer if you decide to set up a latency histogram. You’ll likely consider two bucketization approaches, but I’ve found one to be vastly superior to the other:
1. Lay out the bucket sizes as an arithmetic sequence
This concept is best illustrated with an example:
[1ms, 3ms, 5ms, 7ms, 9ms, 11ms, …. 2000ms]
In the example above, bucket 0 contains the count of latencies that were <= 1ms. Bucket 1 contains the count of latencies > 1ms and <= 3ms. The difference between consecutive bucket boundaries in this arithmetic sequence is 2ms.
This approach is better than recording each latency separately, but with fixed-width buckets, it will take roughly 1,000 buckets to measure latencies up to 2000ms. Most of those buckets will be empty, which wastes memory.
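To make the bucket count concrete, here is a minimal Python sketch of the arithmetic layout. The helper names `arithmetic_boundaries` and `record` are hypothetical, and clamping latencies above the last boundary into the final bucket is an assumption of this sketch, not something the approach above prescribes.

```python
from bisect import bisect_left


def arithmetic_boundaries(first_ms, step_ms, max_ms):
    """Return upper bounds first_ms, first_ms + step_ms, ... until max_ms is covered."""
    bounds = [first_ms]
    while bounds[-1] < max_ms:
        bounds.append(bounds[-1] + step_ms)
    return bounds


def record(counts, bounds, latency_ms):
    """Increment the first bucket whose upper bound is >= latency_ms."""
    index = bisect_left(bounds, latency_ms)
    # Assumption: latencies above the last bound are clamped into the last bucket.
    counts[min(index, len(counts) - 1)] += 1


bounds = arithmetic_boundaries(1, 2, 2000)
counts = [0] * len(bounds)
for latency_ms in [0.4, 2.0, 3.0, 3.1, 250.0]:
    record(counts, bounds, latency_ms)

print(len(bounds))                        # 1001 buckets
print(sum(1 for c in counts if c == 0))   # 997 of them are still empty
```

With these five sample latencies, only four of the 1001 buckets are touched, which is the waste described above.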
2. Lay out the bucket sizes as a geometric sequence
Again, we’ll start with an example:
[1ms, 1.5ms, 2.25ms, 3.38ms, 5.06ms, 7.59ms, …. 2000ms]
Here each bucket boundary is 150% of the previous bucket’s boundary. The advantage of this style of bucketization is that we get very granular data at low latencies, which is the interesting part of the distribution, and the data becomes less granular for larger buckets. With about 20 buckets, you can capture latencies up to 2000ms, and each bucket will be used well.
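The geometric layout can be sketched the same way. This is an illustrative Python example, not a reference implementation: the helper names `geometric_boundaries` and `record_latency` are hypothetical, the first boundary is assumed to be 1ms, and the last boundary is assumed to be the first one at or above 2000ms.

```python
from bisect import bisect_left


def geometric_boundaries(first_ms, ratio, max_ms):
    """Return upper bounds first_ms, first_ms * ratio, ... until max_ms is covered."""
    bounds = [first_ms]
    while bounds[-1] < max_ms:
        bounds.append(bounds[-1] * ratio)
    return bounds


def record_latency(counts, bounds, latency_ms):
    """Increment the first bucket whose upper bound is >= latency_ms."""
    index = bisect_left(bounds, latency_ms)
    # Assumption: latencies above the last bound are clamped into the last bucket.
    counts[min(index, len(counts) - 1)] += 1


geo_bounds = geometric_boundaries(1.0, 1.5, 2000.0)
geo_counts = [0] * len(geo_bounds)
for latency_ms in [0.4, 1.2, 4.0, 300.0]:
    record_latency(geo_counts, geo_bounds, latency_ms)

print(len(geo_bounds))   # 20 boundaries cover latencies up to 2000ms
print(geo_bounds[:4])    # [1.0, 1.5, 2.25, 3.375]
```

Under these assumptions the sketch yields 20 boundaries. The exact count depends on where the first and last boundaries are placed, but it stays near log(2000) / log(1.5), which is about 18.7 multiplications by 1.5.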
I greatly prefer the geometric sequence over the arithmetic one, and I’m not alone. According to Ben Sigelman, the geometric sequence is the default bucketization used at Google to measure latency histograms. I strongly recommend using a geometric sequence to bucketize your latency histogram.