Investigating Performance Regressions with Trends
To us, dogfooding means using Sentry to improve Sentry. In this article, you’ll see how we used Performance to improve our search infrastructure.
Recently, we extended our performance monitoring support to PHP and Serverless, and we’re bringing it to Ruby and Java + Spring Boot soon too. As some of you may have noticed, there’s also a new view in Performance: Trends.
Trends shows you the most improved and regressed transactions in relation to releases. At a glance, you can see how new code changes impact the performance of your application and by how much.
As you might expect, we started watching Sentry’s performance trends on everything we shipped. William Mak, a software engineer on our Visibility team, noticed that our new feature, Web Vitals, might have introduced a performance regression: Trends showed that the new endpoint for the Web Vitals histograms had experienced a dramatic increase in duration.
Opening up the event details view from Trends, we saw the following:
The most drastic change here appears to happen in the query to Snuba, our search infrastructure.
Here we see the trend on the transaction for the Snuba query filtered on
It turns out that there was a general bug in Snuba that caused the Abstract Syntax Tree to be processed multiple times, and the new translators, which process their own children, made it worse. For most queries the tree is fairly flat, but the histogram calls have nested functions seven layers deep, which exposed the exponential growth in processing time with each layer.
So to ensure each node would only be processed once, the team built a cache to store nodes that were already translated.
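To make the shape of the problem concrete, here is a minimal sketch of the idea. It is not Snuba's actual code: the Node class and the nested, translate_naive, and translate_cached functions are hypothetical names invented for illustration, and the "translation" just uppercases a node's name. The sketch assumes that each node's children end up being visited twice (once by the translator and once by the generic tree walk), which is one way a doubling per layer can arise.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Node:
    """Hypothetical AST node: a name plus zero or more child nodes."""
    name: str
    children: Tuple["Node", ...] = ()


def nested(depth: int) -> Node:
    """Build a chain of `depth` nested function calls around one column."""
    node = Node("column")
    for i in range(depth):
        node = Node(f"f{i}", (node,))
    return node


def translate_naive(node: Node, stats: Dict[str, int]) -> Node:
    """Translate a node, visiting every child twice (the assumed bug)."""
    stats["calls"] += 1
    # The translator processes its own children...
    for child in node.children:
        translate_naive(child, stats)
    # ...and the generic tree walk then processes the same children again.
    translated = tuple(translate_naive(c, stats) for c in node.children)
    return Node(node.name.upper(), translated)


def translate_cached(node: Node, stats: Dict[str, int],
                     cache: Dict[Node, Node]) -> Node:
    """Same double visit, but each distinct node is translated only once."""
    if node in cache:
        return cache[node]
    stats["calls"] += 1
    for child in node.children:
        translate_cached(child, stats, cache)
    translated = tuple(translate_cached(c, stats, cache) for c in node.children)
    result = Node(node.name.upper(), translated)
    cache[node] = result
    return result


tree = nested(7)  # seven layers of nested functions, as in the histogram calls

naive_stats = {"calls": 0}
translate_naive(tree, naive_stats)

cached_stats = {"calls": 0}
translate_cached(tree, cached_stats, {})

print(naive_stats["calls"])   # 255 translations: roughly doubles per layer
print(cached_stats["calls"])  # 8 translations: one per node
```

In this sketch the cache is keyed on the node itself, which works because the frozen dataclass is hashable by value. A real implementation may key on something else and would need to decide how long the cache lives.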
To a fault, code is loyal to its author. It’s why communicating with your code is critical. And it’s why Trends in performance monitoring is so valuable: it not only helps you understand the ups and downs but can also point you in the right direction. Now that dog will hunt.