Welcome Jan Crisostomo

Join us in welcoming Jan Crisostomo to the Sentry team!

If you’re an engineer, you may recognize Jan’s work from such emails as, “Engineering opportunities with [Company]!”, “Just following up on my previous email”, and “PLZ DON’T BLOCK ME”. Jan has recruited and managed talent programs for tech companies like Google and Dropbox, most recently at Bessemer Venture Partners where she advised and recruited for a variety of early-stage startups.

Jan will help build the Sentry team as Head of Talent, which she finds ironic since she has no talents. When she’s not self-consciously crafting third-person biographies, Jan can be found sipping tea, traveling, and writing chapters for her inevitable cult classic The 4-Hour Nap.

The Troubles With iOS Symbolication

Who does not love iOS? It’s a great operating system. However, there is one type of person who has a love/hate relationship with iOS: engineers who have to debug crashes on iOS devices. iOS makes debugging crashes trickier than most environments, which in turn makes the job of tools like Sentry and Crashlytics that much harder. In this blog post we want to give you a bit of insight into how Sentry deals with iOS crashes and what is necessary for you to have an enjoyable crash reporting experience.

Crashing in the First Place

The first step in handling a crash is generating a report we can actually send to Sentry. For this to work you need something that can produce a basic backtrace at the moment a crash happens. There are two popular libraries for iOS that can do that: KSCrash and PLCrashReporter. These two libraries hook into different parts of the OS to respond to errors and to extract a backtrace. This in itself is already a very complex undertaking and we’re glad that others have done this task for us.

There are many different situations that can cause crashes and each of them has different characteristics. I don’t want to go into too much detail here, but it’s important to understand that not all crashes will result in the same quality of reporting. An extreme case on iOS is C++ exceptions, which will not create proper backtraces because of how the exception system works.

When we manage to capture a stacktrace and some important data about the crash, we persist it temporarily on the device. The next time you start the application we send that information to the server. The stacktrace is the most interesting part, and it is also the first piece of complexity that spills over to the server side.

To give you a bit of an idea what this looks like, here is an example stacktrace after you extract it:

CrashLibiOS                     0x100077c4c
CrashProbeiOS                   0x200050220
UIKit                           0x31d104d30
UIKit                           0x31d104cb0
UIKit                           0x31d0ef128
UIKit                           0x31d10459c
UIKit                           0x31d68f628
UIKit                           0x31d68b6c0
UIKit                           0x31d68b1e0
UIKit                           0x31d68a49c
UIKit                           0x31d0ff30c
UIKit                           0x31d0cfda0
UIKit                           0x31d8b975c
UIKit                           0x31d8b3130
CoreFoundation                  0x3111ffb5c
CoreFoundation                  0x3111ff4a4
CoreFoundation                  0x3111fd0a4
CoreFoundation                  0x31112b2b8
GraphicsServices                0x314690198
UIKit                           0x31d13a7fc
UIKit                           0x31d135534
CrashProbeiOS                   0x20004f2a4
libdyld.dylib                   0x30f0f65b8

So the first step would be to find some names for those addresses. This process is often called “symbolizing” or “symbolicating”. We can already see where the addresses are located because the device sends us a list of loaded images (object files) and where they are loaded into memory. To find the names we need to look at symbol tables.
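As a rough illustration of that first step, locating an address within the list of loaded images could look like the sketch below. The image names echo the stacktrace above, but the load addresses and sizes are invented for the example, and findImage is a hypothetical helper rather than part of any crash reporting library.

```javascript
// Hypothetical list of loaded images as a device might report it.
// Base addresses and sizes are made up for illustration.
const loadedImages = [
  { name: 'CrashLibiOS', base: 0x100070000, size: 0x10000 },
  { name: 'CrashProbeiOS', base: 0x200048000, size: 0x10000 },
  { name: 'UIKit', base: 0x31d0c0000, size: 0x900000 },
];

// Return the image containing `address` plus the offset into that image,
// or null if the address falls outside every known image.
function findImage(images, address) {
  for (const image of images) {
    if (address >= image.base && address < image.base + image.size) {
      return { name: image.name, offset: address - image.base };
    }
  }
  return null;
}

const hit = findImage(loadedImages, 0x31d104d30);
console.log(hit.name, '0x' + hit.offset.toString(16));
```

The offset into the image, rather than the absolute address, is what a symbol table lookup would then work with, since images can be loaded at different addresses on every launch.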

Stacktraces on iOS

So as you can see, stacktraces are fairly incomplete. While we can easily find out which frameworks the addresses are contained in, it’s unlikely that you will be able to find the function names for them on the device. There are two cases you have to keep apart here: one where the symbols are in fact missing, and one where symbols are marked as redacted.

Missing symbols are typically what you have in release builds of your own applications. In release builds most of the symbols you encounter are not actually on the device, so if we were to try to locate the function names on the device we would not succeed. Instead they are stored in what is commonly referred to as a “dSYM file”. Technically a dSYM file is a Mach-O file just like an executable, but it only contains the symbol table and debug information. So while they could be in the same file, they usually are not.

When I said that “most” symbols are not on the device, this refers to the fact that some symbols need to be in the file. This is because most applications on iOS are written in Objective-C, which implements methods through a mechanism based on the idea of sending messages from object to object. These messages are referred to as “selectors” and they are essentially the name of the method. Because the runtime needs those names to dispatch messages, they have to remain in the binary.

PLCrashReporter and some other tools often attempt to find such symbols even if the normal symbols are not on the device. For the bulk of the symbols, however, you need to do this on the server.

The second case of missing function names we need to concern ourselves with is a weirder one: redacted symbols.

Redacted Symbols

Redacted symbols are symbols that are indeed available on the device but that tools like KSCrash or PLCrashReporter cannot access. When iOS loads system libraries it removes symbol names, so that anyone who attempts to read a symbol by parsing the framework will only come across a symbol with the name <redacted>. This is most likely done to save some memory or for security reasons. Because all system frames share the same constant string as a symbol, there is a lot that does not have to be loaded into memory.

The downside is that we are not able to tell you which function in UIKit caused your crash. When you hook your phone up to Xcode you can see such symbols though. So how does that work? The answer is a bit bizarre and requires some understanding of what happens when iOS loads the system libraries.

When iOS redacts symbols it stores a copy of the original symbol on the file system in a cache file that is not accessible on non-rooted devices. The file is named dyld_shared_cache_arm64 for arm64, and so on for other architectures. From the file name you can see that this is considered a cache file. This means the file is updated as redacted symbols are added to it. Apple built this system primarily to support the flow where you debug your own device. If you run your own app and hook it up to the debugger, all the frames that you are interested in will have their redacted symbols added to the cache file. When you connect the phone to Xcode, Xcode will go in and “prepare the device for development”, which essentially downloads the cache file and runs it through a process where dummy debug symbols are built for it. It will in fact create a folder structure below ~/Library/Developer/Xcode/iOS DeviceSupport for your version of iOS and put new Mach-O files in there with symbols recovered from the cache file.

Now you can guess what the problem with this is: if a symbol has never been seen, it won’t be in the cache file. This is particularly noticeable if you are working with “legacy” architectures. For instance, if you hook up an arm64 device to Xcode it will be able to extract some armv7 symbols, but it will most likely not find all of them. Your chances are probably higher if you run a lot of 32-bit apps to populate the cache, but you might as well just hook up an older device instead. Whenever you add a device to Xcode it will merge together the symbols it extracts.

This shows one of the core issues that come up with symbolizing on iOS: you need to collect as many of these debug symbols as possible.

Symbolicating App and System

Sentry uses two separate systems for resolving functions. For customer debug symbols we use our own LLVM-based symbolication library for Python. We fetch debug symbols from our S3-backed asset storage and then symbolicate based on the symbols we have on our device. This scales quite well to the workload caused by apps. These are typically large symbol files, but there are not that many per app.

On the other hand, dealing with symbols from the system libraries is a different story. There are thousands of symbol files, and because the cache might be incomplete we actually want to be quite fuzzy when matching against them. An example of this fuzziness is that we might be dealing with incomplete debug symbols for system libraries from one SDK. In that case we want to try a few older versions as well in case we find matches there.

To achieve this goal we wrote a separate system we call the Sentry symbol server. It is a simple HTTP service written in Rust that takes a batch request of addresses to symbolicate and responds with the function names if it finds them. It uses a custom file format that can be memory mapped. A separate build process creates these memory-mappable files and uploads them to S3. At regular intervals the server checks back with S3 and fetches new memory maps if necessary.
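To give a feel for what resolving a batch of addresses involves, here is a minimal sketch of the general technique: keep symbols sorted by start offset and binary search for the last symbol that begins at or before the requested offset. This is not the symbol server's actual code or file format. The table contents and the findSymbol and symbolicateBatch helpers are hypothetical.

```javascript
// Hypothetical symbol table for one image, sorted by start offset.
// Offsets are invented; the names are taken from the symbolicated report.
const uikitSymbols = [
  { start: 0x1000, name: '_UIApplicationMain' },
  { start: 0x2400, name: '-[UIApplication sendEvent:]' },
  { start: 0x3a00, name: '-[UIWindow sendEvent:]' },
];

// Binary search for the last symbol whose start is <= offset.
// Without symbol sizes, any offset past the last start maps to the last symbol.
function findSymbol(symbols, offset) {
  let lo = 0;
  let hi = symbols.length - 1;
  let found = null;
  while (lo <= hi) {
    const mid = Math.floor((lo + hi) / 2);
    if (symbols[mid].start <= offset) {
      found = symbols[mid];
      lo = mid + 1;
    } else {
      hi = mid - 1;
    }
  }
  return found ? found.name : null;
}

// Resolve a whole batch in one call; unresolved offsets come back as null.
function symbolicateBatch(symbols, offsets) {
  return offsets.map((offset) => findSymbol(symbols, offset));
}

console.log(symbolicateBatch(uikitSymbols, [0x2410, 0x0fff, 0x3a00]));
```

A sorted, flat layout like this is also the kind of structure that lends itself to memory mapping, since lookups need no parsing step before the search.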

The Final Result

After symbolication our boring crash report from before looks more like this:

CrashLibiOS        -[CRLCrashNULL crash] (CRLCrashNULL.m:37)
CrashProbeiOS      -[CRLDetailViewController doCrash] (CRLDetailViewController.m:53)
UIKit              -[UIApplication sendAction:to:from:forEvent:]
UIKit              -[UIControl sendAction:to:forEvent:]
UIKit              -[UIControl _sendActionsForEvents:withEvent:]
UIKit              -[UIControl touchesEnded:withEvent:]
UIKit              __UIGestureEnvironmentSortAndSendDelayedTouches
UIKit              __UIGestureEnvironmentUpdate
UIKit              -[UIGestureEnvironment _deliverEvent:toGestureRecognizers:usingBlock:]
UIKit              -[UIGestureEnvironment _updateGesturesForEvent:window:]
UIKit              -[UIWindow sendEvent:]
UIKit              -[UIApplication sendEvent:]
UIKit              ___dispatchPreprocessedEventFromEventQueue
UIKit              ___handleEventQueue
CoreFoundation     ___CFRunLoopDoSources0
CoreFoundation     ___CFRunLoopRun
CoreFoundation     _CFRunLoopRunSpecific
GraphicsServices   _GSEventRunModal
UIKit              -[UIApplication _run]
UIKit              _UIApplicationMain
CrashProbeiOS      main (main.m:16)
libdyld.dylib      _start

This then allows us to render the crash report in a more presentable way. Because we know which symbols are from your app and which ones are from the system, we can by default hide frames you likely don’t care about:

[Image: iOS stacktrace example]
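The idea of hiding system frames by default can be sketched in a few lines. The inApp flag and the visibleFrames helper are assumptions made for this example and do not describe how Sentry stores or renders frames internally.

```javascript
// Symbolicated frames; `inApp` is a hypothetical flag marking frames that
// come from the app's own images rather than from system frameworks.
const frames = [
  { image: 'CrashLibiOS', fn: '-[CRLCrashNULL crash]', inApp: true },
  { image: 'UIKit', fn: '-[UIControl sendAction:to:forEvent:]', inApp: false },
  { image: 'UIKit', fn: '_UIApplicationMain', inApp: false },
  { image: 'CrashProbeiOS', fn: 'main', inApp: true },
];

// Show only app frames unless the caller asks for the full stacktrace.
function visibleFrames(allFrames, showSystem = false) {
  return showSystem ? allFrames : allFrames.filter((frame) => frame.inApp);
}

console.log(visibleFrames(frames).map((frame) => frame.fn));
```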

In an Ideal World

In an ideal world Apple would provide a web service that does what our symbol server does. You give it the UUID of the image you want to symbolicate and an address in it, and you get back the symbol at that address. At present the process of collecting all the symbols from different SDK versions is slow, requires a lot of manual labour, and is not even guaranteed to always succeed.

Future Plans

Sadly we are limited to providing system symbol resolution on our cloud-hosted version. There are some concerns about the redistribution of system symbol files, which is why we currently cannot offer this service to on-prem customers.

If you are interested in support for system symbolication on on-prem installations, leave your feedback in the forums. We are playing with the idea of making our symbol server a public API in case there is demand for it.

If this article was of interest to you, let us know. We might do a followup where we explain our heuristics and the technical challenges of doing server-side symbolication.

Open Source Sprint at Sentry

In March, we hosted a sprint for the SF Python community to help newcomers interested in contributing to open source. For two days, over 50 people came together in the Sentry offices to work on widely used projects like Mypy, Zulip, Certbot, and of course, Sentry. In particular, five attendees merged their first commits to a major open source project.

If you’d like to see what was built, check out the PRs below! And if you’d like to start contributing to Sentry, check out our docs for setting up your development environment and come chat with us on our community forum.

Contributors to Sentry

Welcome Evan Ralston

Evan Ralston joins Sentry this week as an operations engineer.

Evan previously worked at a number of Bay Area startups where he helped improve the uptime of their web services. He will be helping Sentry’s operations team get ahead of projects, instead of dealing with systems problems reactively.

When not automating himself out of a job, Evan likes convincing people to watch Steven Universe.

Tips for Reducing JavaScript Error Noise

If you’re using Sentry to monitor and debug browser JavaScript issues, you might be suffering from a common affliction: noisy, low-value errors that make it harder for you and your team to identify high-priority issues.

This happens because browser JavaScript is perhaps the single most complex environment from which to capture errors – because it’s not just one environment! There are multiple major browsers, JavaScript engines, operating systems, and browser extension ecosystems, all of which come together to make capturing good errors difficult.

Sentry does a decent job out of the box cutting through all this noise, but for the best results, it needs your help. Below are a few additional steps you can take to configure Sentry to greatly reduce the amount of noisy errors you receive.

Whitelist your URLs

Sentry’s browser JavaScript SDK (Raven.js) will, by default, pick up any uncaught error triggered from your web app. That includes code running on your page that isn’t necessarily authored or controlled by you, such as browser extensions, malware, or 3rd-party applications like chat widgets, analytics, and ad code.

To ignore such problematic errors, you can configure Raven.js to whitelist errors originating solely from your own code:

Raven.config('your-dsn', {
    whitelistUrls: [
        'www.example.com/static/js', // your code
        'ajax.googleapis.com'        // code served from Google CDN
    ]
}).install();

This example configuration ensures that only errors that originate from scripts served from www.example.com/static/js and ajax.googleapis.com are reported to the Sentry server. This small configuration change is the easiest, most impactful change you can make to reduce errors.
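To make the effect of whitelistUrls more concrete, the sketch below shows one way such a check could behave. It is not Raven.js’s actual implementation. isWhitelisted is a hypothetical helper that treats string entries as substrings and regular expressions as patterns tested against the URL of the script that threw.

```javascript
// Hypothetical check: does the script URL match any whitelist entry?
// Strings match as substrings; regular expressions are tested directly.
function isWhitelisted(scriptUrl, whitelist) {
  return whitelist.some((entry) =>
    entry instanceof RegExp ? entry.test(scriptUrl) : scriptUrl.includes(entry)
  );
}

const whitelist = ['www.example.com/static/js', 'ajax.googleapis.com'];

console.log(isWhitelisted('https://www.example.com/static/js/app.js', whitelist)); // true
console.log(isWhitelisted('chrome-extension://abc/inject.js', whitelist)); // false
```

Under this model an error thrown from a browser extension or a third-party widget never matches, so it would be dropped before being sent.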

Use inbound data filters

Inbound data filters are a Sentry feature designed to discard known low-value errors from your projects. They are easily toggled inside your Sentry project settings, and any errors they discard as a result do not count towards your account quota.

There are 3 filters that are particularly valuable for JavaScript developers:

  • Legacy browsers – old browsers like IE9 produce low-fidelity error reports that aren’t always actionable
  • 3rd-party extensions – automatically drop errors from known browser extensions, malware, and ad scripts
  • Web crawlers – drop errors triggered from known web crawlers like Google Bot

Inbound filters are not as powerful as configuring Raven.js to whitelist error URLs, but they’re nice because they can be enabled with a single click from inside your project settings.

Note that inbound data filters are open source and contributed to by the community. If you are encountering an error that you feel should be discarded globally by these filters, consider opening an issue/pull request.

Use the latest version of Raven.js

Sentry’s browser JavaScript SDK is under active development, and changes are frequently made to both improve the quality of error reports and reduce the quantity of low-value errors.

For example, Raven.js 3.12.0 now suppresses back-to-back duplicate errors by default. This is a life-saver if you are suffering from errors that trigger from asynchronous loops (e.g. from setTimeout or XMLHttpRequest callbacks). In long lived applications, errors like these can result in thousands of events for a single user!
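The idea behind suppressing back-to-back duplicates can be sketched as follows. This is a simplified illustration and not the code Raven.js ships: createDuplicateFilter is a hypothetical helper that compares only the error name and message, whereas a real SDK may compare more than that.

```javascript
// Returns a function that reports whether an error repeats the previous one.
// Only consecutive repeats count; an older error seen again later is not a duplicate.
function createDuplicateFilter() {
  let lastKey = null;
  return function isDuplicate(error) {
    const key = `${error.name}: ${error.message}`;
    const duplicate = key === lastKey;
    lastKey = key;
    return duplicate;
  };
}

const isDuplicate = createDuplicateFilter();
console.log(isDuplicate(new TypeError('x is null'))); // false, first occurrence
console.log(isDuplicate(new TypeError('x is null'))); // true, same error again
console.log(isDuplicate(new RangeError('too big'))); // false, a different error
```

A timer callback that fails on every tick would therefore produce one reported event instead of one per tick.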

To get the best experience, keep your copy of Raven.js up to date. Sentry will tell you when there’s a new version available, but it’s also worth checking the changelog periodically to see what’s new.

Use source maps

Source maps don’t just make debugging your production stack traces easier, they make it easier for Sentry to group errors into individual issues. This means that events bucket into a smaller, more manageable set of issues, which means less noise in your issue stream and fewer 2 AM emails about broken code.

Making source maps a part of your build and deploy process isn’t as easy as toggling a button, but our in-depth source map documentation has everything you need to get started. Besides helping to reduce noise, source maps may be the single most profound improvement you can make to your monitoring and debugging workflow.

Ignore troublesome errors

Some errors you’re just never going to fix. When that happens, it might be time to declare bankruptcy and ignore them entirely.

You can either ignore the error via the Sentry UI, or configure Raven.js to discard it client-side using the ignoreErrors option. Doing so from Raven.js is ideal because errors discarded at the client level do not reach Sentry’s servers and do not count against your event quota.

Here’s what it looks like:

Raven.config('your-dsn', {
    ignoreErrors: [
        'Can\'t execute code from freed script',
        /SecurityError\: DOM Exception 18$/
    ]
}).install();

Be careful though: once you make this change, you’ll never see these errors again. And error strings that were previously just nuisances could become bigger problems down the line that you’ll never be informed of. Choose your ignoreErrors array wisely!

Note that browsers can produce different error messages for the same fundamental error. For a single ReferenceError, you may need to input multiple strings/regexes to cover all possible browsers.
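As an illustration of covering one error across browsers, the sketch below lists messages that different browsers may produce for a reference to an undefined variable named foo. The exact wording varies by browser and version, so treat the strings as examples. shouldIgnore is a hypothetical helper that approximates the matching with substrings and regular expressions, and is not Raven.js’s own code.

```javascript
// Example messages for the same ReferenceError on an undefined variable `foo`.
// Wording differs by browser and version; verify against your own events.
const ignoreErrors = [
  'foo is not defined',       // e.g. Chrome, Firefox
  "Can't find variable: foo", // e.g. Safari
  /^'foo' is undefined$/,     // e.g. older Internet Explorer
];

// Hypothetical matcher: strings match as substrings, regexes are tested directly.
function shouldIgnore(message, patterns) {
  return patterns.some((pattern) =>
    pattern instanceof RegExp ? pattern.test(message) : message.includes(pattern)
  );
}

console.log(shouldIgnore("ReferenceError: Can't find variable: foo", ignoreErrors)); // true
console.log(shouldIgnore('Unexpected token <', ignoreErrors)); // false
```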

Upload your source files (and source maps)

When Sentry encounters an error triggered from a JavaScript file, it attempts to download that source file from your app servers in order to correlate line and column information with actual source content. This source content is the basis of Sentry’s error grouping algorithm.

If your source files are only accessible over the web, there are a lot of bad things that can happen. For example, you might have a build process that removes old JavaScript files from servers as you deploy new ones. If your users trigger errors from older cached scripts, those scripts will no longer be available when Sentry goes to download them. Not having access to that content can mess up the grouping algorithm, which means separate issues will be created for errors that would normally be bucketed under an existing issue.

To avoid these and other interruption scenarios (e.g. network availability), we strongly recommend you upload your production JavaScript files and source maps as release artifacts. This means that Sentry will always have direct access to these files, ensuring maximum grouping effectiveness.

Uploading source files is done using the Sentry API, and is pretty simple:

$ curl https://sentry.io/api/0/projects/:organization_slug/:project_slug/releases/:release/files/ \
  -X POST \
  -H 'Authorization: Bearer YOUR_TOKEN_HERE' \
  -F file=@app.js.map \
  -F name="http://example.com/app.js.map"
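If your build tooling is written in JavaScript, the same request could be issued from a script. The sketch below mirrors the curl call above and assumes Node.js 18 or newer, where fetch, FormData, and Blob are available as globals. releaseFilesUrl and uploadArtifact are hypothetical helper names, and the slugs, release, and token are placeholders you would supply yourself.

```javascript
// Build the release files endpoint used by the curl example.
// Each path segment is URL-encoded in case it contains special characters.
function releaseFilesUrl(organizationSlug, projectSlug, release) {
  const segments = [organizationSlug, projectSlug, 'releases', release, 'files'];
  return `https://sentry.io/api/0/projects/${segments.map(encodeURIComponent).join('/')}/`;
}

// Upload one artifact as multipart form data, like `curl -F file=@... -F name=...`.
// `contents` is the file body (string or Buffer); `name` is the URL it is served from.
async function uploadArtifact({ org, project, release, token, name, contents }) {
  const form = new FormData();
  form.append('file', new Blob([contents]), name.split('/').pop());
  form.append('name', name);
  const response = await fetch(releaseFilesUrl(org, project, release), {
    method: 'POST',
    headers: { Authorization: `Bearer ${token}` },
    body: form,
  });
  if (!response.ok) {
    throw new Error(`Upload failed with status ${response.status}`);
  }
  return response.status;
}

console.log(releaseFilesUrl('my-org', 'my-project', '1.0.0'));
```

Calling uploadArtifact once per file from a deploy script keeps the artifacts in step with each release.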

To learn more about artifacts and releases, please see our documentation.

We’re here to help

We just discussed 6 ways you can reduce noise from browser JavaScript projects – some easy, some more involved. As always, if you want additional assistance with reducing JavaScript noise, please reach out on the Sentry forum or contact our helpful support team.