
Terms of Service Update

We talked about changing our Terms of Service in support of some new R&D at Sentry. We put that on hold as we realized two critical missteps on our side:

  1. We didn’t do a good enough job explaining what we’re trying to accomplish and instead caused some justifiable reactions around the idea of Sentry using your IP within generative AI.

  2. We hadn’t considered the variable use cases, and the need for documented consent mechanisms within many of them.

So we went back to the drawing board in an effort to re-align these goals with our broader stance on customer trust. I’m going to talk about what we’re doing around this, give more details about the shape of these investments, and hopefully leave you with assurances that, at the end of the day, our stance on data security and privacy is not changing, and continues to be of the highest importance.

Scaling Noise

Sentry’s been around for 15 years, and while our fingerprinting technology is still inspiring a whole suite of companies, it no longer solves the problem. If you’re unfamiliar with how Sentry works under the hood, the core of our value offering was always a simple flow: take an error, deduplicate it, and make that error as actionable as possible. Fingerprinting is critical to the success of the product, as it’s the way we deduplicate errors. You might think it’s as simple as hashing stack traces together, but as soon as you get into the nuances of runtimes, you’ll find it goes much deeper than that. Unfortunately, the problem has only gotten more complicated over the years, particularly with the rise of JavaScript and the increasing scale of consumer apps.
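To make the "hash stack traces together" intuition concrete, here's a deliberately naive sketch of a fingerprint, assuming a hypothetical frame format. This is illustrative only, not Sentry's actual grouping logic, which normalizes far more (minified symbols, runtime differences, injected third-party code):

```python
import hashlib

def naive_fingerprint(frames):
    """Naive fingerprint: hash the module and function of each frame.

    Hypothetical sketch; real grouping is much more involved.
    """
    normalized = []
    for frame in frames:
        # Deliberately ignore line numbers: they churn with every
        # deploy and would split one logical error into many groups.
        normalized.append(f"{frame['module']}.{frame['function']}")
    return hashlib.sha256("\n".join(normalized).encode()).hexdigest()

frames = [
    {"module": "app.views", "function": "checkout"},
    {"module": "app.payments", "function": "charge_card"},
]
print(naive_fingerprint(frames))
```

Even this toy version shows why the problem is hard: every normalization choice (keep line numbers? keep third-party frames?) trades false splits against false merges, and the right trade-off shifts as runtimes and toolchains evolve.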

We see more and more patterns of behavior that are simply hard to maintain static heuristics against. Examples include browser extensions, which often inject code directly into your application, but even if you set those aside, you still have an ongoing stream of changes to transpilers and the JavaScript runtimes themselves (yes, there’s more than one). All of these challenges have made it more and more difficult to maintain the level of data quality, and trust, that Sentry desires. Combine that with the increasing volume of errors, even from smaller applications, and the problems only compound. While some of this is simply due to the growth of the internet, much of it comes from our increasing reliance on upstream services, which means many of our root cause mechanisms tend to break down due to cascading failures from third-party services and code.

How’s this related to chatbots, you might ask? Well, it simply isn’t. This is one of our mistakes with the Terms change. Most of what we’re trying to improve is our ability to train internal models to better fingerprint errors and to better prioritize data within the Sentry system. In these implementations, we’re not talking about generative AI. What we’re talking about is taking things like error messages and stack traces and using them to influence a set of algorithms that could be applied across many customers. That means that while the inputs might come from customer data, the outputs never pose a risk of exposing customer information. We have a number of ideas in this vein, and we don’t ultimately know how viable they will be, but we do know that the scale of data is increasing the need for these kinds of systems.

The Elephant

Generative AI is scary, in part because it’s so unpredictable right now, and far too early in its life cycle. We ourselves have strict policies on not allowing new vendors who will use our own data to train their models. Part of that’s due to the heavy nature of our commitments on protecting customer data, but also because of the unknowns. No one wants their data exposed, and we are no different. A good example of this philosophy is the current “Suggested Fix” feature, which uses an OpenAI model. Even if you have a Data Protection Addendum (DPA) signed with us, we still require you to sign an additional amendment to allow the transfer of data to OpenAI since OpenAI is not listed as a default subprocessor in our DPA. If no DPA is in place, we require explicit consent from developers before their data is sent to OpenAI.

Any future AI features that we offer will be gated in a similar manner: requiring explicit consent from our users and clearly articulating what data will be used and where it will be sent. When we talked with customers about the Terms change, we found that most people assumed we were after investments using this technology. It’s important to clarify that the ToS changes we are implementing are entirely independent of these features, which will always be opt-in. In addition to the customer-facing consent mechanisms we’ll be shipping, we’re also developing an internal program to guide our team on the levels of scrutiny required for different types of data usage. All forms of data use, even those with consent, will undergo a thorough internal review before being offered to customers.

Going forward we will rely on consent in cases where we need to leverage customer data which may be considered IP, PII, or otherwise identifiable. What this looks like in practice varies, but let’s use the stack trace as an example. Even though a stack trace might be entirely third-party code, code that’s exclusively open source, we will classify it as customer-specific and require consent to use the data for functionality that is shared with other customers. A consent mechanism will exist in any situation where we cannot guarantee that the type of data used (e.g. custom tags), or the type of information output (e.g. if we were to use it in generative AI), will not be identifiable.

For non-identifiable, aggregatable customer data (for example, the latency across your endpoints), we’re not going to require consent. This data doesn’t pose any risk to you, and it will help inform things like our severity algorithms, as well as enabling features that could provide you with a significantly better experience. To illustrate the kind of data, and a simplistic use case, let’s take Web Vitals. Most folks are relying on synthetic benchmarks, often trusting whatever Google tells them the industry looks like. By taking this non-identifiable information and aggregating it, we can actually give customers useful benchmarks. While we don’t think this one is controversial and is pretty standard in other terms of service, it’s not something we’ve done in the past.
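The Web Vitals benchmarking idea can be sketched in a few lines. This is a hypothetical illustration of the aggregation step, not Sentry's implementation; the function name and the per-org input shape are assumptions:

```python
from statistics import quantiles

def pooled_benchmark(samples_by_org):
    """Pool anonymized per-org timing samples (e.g. LCP in ms) and
    report percentile benchmarks.

    Hypothetical sketch: only aggregates leave the pool, so no single
    customer's data is identifiable in the output.
    """
    pooled = [v for samples in samples_by_org.values() for v in samples]
    cuts = quantiles(pooled, n=100)  # 99 percentile cut points
    return {"p50": cuts[49], "p75": cuts[74], "p90": cuts[89]}

benchmarks = pooled_benchmark({
    "org_a": [1200, 1800, 2400, 3100],
    "org_b": [900, 1500, 2100, 2700],
})
print(benchmarks)
```

The point of the example is the direction of the data flow: org identifiers exist only on the input side, and the output is a handful of percentiles that any customer can compare themselves against, in place of synthetic industry benchmarks.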

What does that consent look like? We don’t know yet. We’ve talked about both opt-in and opt-out functionality, based on the level of concern from customers. Right now we’re going with an opt-in model, and we believe customers will value the product enough that they’ll opt in. We’ll make sure the uses of the data remain clear, and that the corresponding consent mechanisms vary based on those uses. That is, if you consent to sharing data with us to improve our fingerprinting algorithms, that will not be the same consent prompt that triggers the “Suggested Fix” feature, which means you’ll have granular control over the ways Sentry utilizes your data.
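Granular, per-use consent boils down to a small amount of bookkeeping. The sketch below is entirely hypothetical (scope names and class are illustrative, not Sentry's design), but it shows the opt-in semantics described above: consent is scoped per feature, and absence of a grant means no:

```python
# Illustrative consent scopes; one per distinct use of customer data.
CONSENT_SCOPES = {"fingerprint_training", "suggested_fix_openai"}

class ConsentStore:
    """Hypothetical per-organization, per-scope consent registry."""

    def __init__(self):
        self._granted = {}  # org_id -> set of granted scopes

    def grant(self, org_id, scope):
        if scope not in CONSENT_SCOPES:
            raise ValueError(f"unknown scope: {scope}")
        self._granted.setdefault(org_id, set()).add(scope)

    def allows(self, org_id, scope):
        # Opt-in model: no record means no consent.
        return scope in self._granted.get(org_id, set())
```

Under this shape, granting consent for fingerprint training says nothing about “Suggested Fix”: each feature checks its own scope before touching any data.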

We’re working on writing up the new Terms of Service now, and while we don’t yet have a release date, this time we’ll give customers additional notice before the new terms go into effect. We’ll let you know via email when they’re available.
