New Trends in Application Monitoring Systems from Velocity 2016

July 24, 2016

So, why is anomaly detection not that common? First, it is really hard to do well. Second, its value drops drastically if it doesn’t control for false positives (normal behavior incorrectly flagged as anomalous).

At this year’s O’Reilly Velocity conference, there were many interesting sessions on anomaly detection in application performance monitoring (APM). In particular, two of my ex-colleagues from LinkedIn, Ritesh Maheshwari and Yang Yang, gave a fantastic talk titled ‘Anomaly Detection for Real User Monitoring Data’. While a video is not yet available, you can see the slides here.

First, a quick explanation of Real User Monitoring (RUM). It is the idea that performance optimizations, whether deep in the backend or in the UI layer, should result in a faster experience for the end user. Hence, it is important to measure performance from the perspective of a real user. Companies achieve this by inserting RUM JavaScript libraries into their apps that measure page load time, client render time, and similar metrics against dimensions like CDN PoP, geo, and page type.
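To make "metrics against dimensions" concrete, here is a minimal sketch of slicing RUM data by one dimension. It assumes each beacon reported by the browser has already been parsed into a dictionary; the field names (`pop`, `page_load_ms`) and the helper `median_by_dimension` are hypothetical, not taken from the talk or from any particular RUM library.

```python
from collections import defaultdict
from statistics import median


def median_by_dimension(beacons, metric, dimension):
    """Group RUM beacons by one dimension and return the median metric per group."""
    grouped = defaultdict(list)
    for beacon in beacons:
        grouped[beacon[dimension]].append(beacon[metric])
    return {key: median(values) for key, values in grouped.items()}


# Hypothetical beacons; real ones would carry many more fields.
beacons = [
    {"pop": "lax", "page_load_ms": 100},
    {"pop": "lax", "page_load_ms": 300},
    {"pop": "fra", "page_load_ms": 200},
]
per_pop = median_by_dimension(beacons, "page_load_ms", "pop")
```

A median is used here instead of a mean because page load times tend to have long tails; that is a design choice for this sketch, not something the presentation prescribes.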

I’ve highlighted two important topics from their presentation:

Their anomaly detection algorithm was simple yet powerful at detecting sustained anomalies (anomalies that last for a while). Engineers learn from experience that threshold-based anomaly detection is broken: yesterday’s threshold is today’s normal. Ritesh and Yang used a sign test to detect whether, say, today’s page load times were anomalous compared with yesterday’s or with the same time a week ago. Besides being simple, the approach adapts to the recent baseline and targets sustained shifts, so it handles false positives better.
By connecting RUM with anomaly detection, they were able to quickly determine a high-level root cause. For instance, if the anomaly was in connection time, they could be confident that the problem lay in their network, down to the region or PoP where it occurred. Similarly, if the anomaly was in first byte time or page download time, they could be confident that the problem lay on the server side (CDN origin).
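The sign test idea from the first topic can be sketched in a few lines. This is an illustration under stated assumptions, not the presenters' implementation: it assumes the metric has been sampled at matching times in the current and baseline windows (for example, one median page load time per minute), and the function names and the `alpha` cutoff are hypothetical.

```python
from math import comb


def sign_test_p_value(current, baseline):
    """One-sided sign test on paired samples.

    Returns the probability of seeing at least this many
    'current > baseline' pairs if increases and decreases
    were equally likely.
    """
    if len(current) != len(baseline):
        raise ValueError("series must be the same length")
    # Ties carry no direction, so the sign test drops them.
    diffs = [c - b for c, b in zip(current, baseline) if c != b]
    n = len(diffs)
    if n == 0:
        return 1.0
    k = sum(1 for d in diffs if d > 0)
    # P(X >= k) for X ~ Binomial(n, 0.5)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def is_sustained_anomaly(current, baseline, alpha=0.01):
    """Flag the window only if the shift is unlikely to be chance (alpha is an assumed cutoff)."""
    return sign_test_p_value(current, baseline) < alpha


baseline = [100] * 10
sustained = [120] * 10        # slower in every sample
spike = [500] + [90] * 9      # one large spike, otherwise faster
```

With these inputs, `is_sustained_anomaly(sustained, baseline)` is true while `is_sustained_anomaly(spike, baseline)` is false. Because the test looks only at the direction of each difference and not its size, a single extreme sample cannot trigger an alert on its own, and because the baseline is yesterday or last week rather than a fixed number, no threshold needs to be retuned as normal behavior drifts.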

In summary, combining RUM with their anomaly detection technique is a very promising and interesting new approach to analysis for modern engineering teams.

