Prioritizing Platform Stability at One of Fast Company’s Most Innovative Companies of 2022
The fitness industry is no stranger to ‘smart’ equipment, and what distinguishes one product from another ultimately comes down to user experience. Product success depends on stability, something top of mind for developers at Tonal.
“Compared to the size of our user base and how active they are, we have relatively few users reporting bugs. We have very stable products and the goal is to keep it that way,” says Max Lapides, Sr. Manager of Mobile Software Engineering at Tonal.
Ranked as one of New York Magazine’s best smart home training solutions of 2022 and Men’s Health’s best connected cable machine of 2022, Tonal literally sets the bar for smart home trainers. To maintain that standard, its developers take a different approach to making sure the product ‘does what it says on the box’: instead of reducing issues, they work toward avoiding them altogether.
“We don’t have the mindset of reducing errors and crashes. We don’t have those; they’re the exception, not the rule.”
Workflow for a bug-free UX
Okay, so it’s not that there aren’t any errors. Rather, the aim is to prevent the ones that do occur from ever impacting customers.
An important part of that is monitoring, and integrating automated error reporting into the development workflow. Most users probably don’t know what that means, or even care, but when asked what they think of Tonal, they often say that it works and doesn’t give them any trouble.
One of the ways Max’s team achieves this is by using debug symbols to symbolicate obfuscated error logs, including those of third-party frameworks, and attach additional context from stack traces.
For example, a stack trace from an obfuscated build is redacted and, for all intents and purposes, useless for debugging. But once the team has uploaded their debug symbols to Sentry, they end up with a symbolicated stack trace.
“Without Sentry, we would need to collect these debug symbol files (which Sentry collects automatically from App Store Connect and from uploads from our CI system) and then manually use them to convert the obfuscated data to something human-readable.”
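To give a feel for what that manual conversion involves, here is a minimal sketch of the core idea behind symbolication: looking up a raw frame address in a symbol table to recover a function name. The symbol table, addresses, and function names are invented for illustration. Real debug symbol files and Sentry’s own processing are considerably more involved, so treat this as a toy model rather than a description of either.

```python
from bisect import bisect_right

# Hypothetical symbol table: (start_address, function_name), sorted by address.
# In a real build this information comes from the debug symbol files.
SYMBOLS = [
    (0x1000, "main"),
    (0x1400, "WorkoutSession.start"),
    (0x1A00, "BluetoothManager.connect"),
]
STARTS = [start for start, _ in SYMBOLS]


def symbolicate(address):
    """Return 'function + offset' for a raw address, or the hex address if unknown.

    Simplification: symbols have no end address here, so any address past the
    last symbol's start is attributed to that last symbol.
    """
    index = bisect_right(STARTS, address) - 1
    if index < 0:
        return hex(address)  # address precedes every known symbol
    start, name = SYMBOLS[index]
    return f"{name} + {address - start}"


def symbolicate_trace(addresses):
    """Symbolicate every frame of a raw stack trace, preserving frame order."""
    return [symbolicate(address) for address in addresses]
```

Calling `symbolicate_trace([0x1010, 0x1A10])` turns two opaque addresses into `main + 16` and `BluetoothManager.connect + 16`, which is the kind of readable frame a developer can act on.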
Another way they maintain a smooth UX is by configuring breadcrumbs, so that developers investigating an issue see not only a timeline of the user’s actions that led to an error, but also all the context needed to reproduce and resolve it fast.
“At a certain point, I realized that these user events that we’re already tracking for analytics purposes also represent a stream of the user’s actions in the app that’s useful for debugging. So, we are now also sending these user events to Sentry as error breadcrumbs that let us see exactly what a user did before an issue occurred without leaving Sentry.”
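The pattern Max describes, reusing analytics events as a debugging trail, can be sketched in a few lines. This is an illustration under stated assumptions, not Tonal’s implementation: `BreadcrumbTrail`, `track_event`, and `build_error_report` are hypothetical names, and the trail is hand-rolled with the standard library rather than built on any Sentry SDK.

```python
from collections import deque
from datetime import datetime, timezone


class BreadcrumbTrail:
    """Keeps the most recent user events so they can accompany an error report."""

    def __init__(self, max_breadcrumbs=100):
        # A bounded deque drops the oldest entry automatically once it is full.
        self._crumbs = deque(maxlen=max_breadcrumbs)

    def record(self, category, message, data=None):
        self._crumbs.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "category": category,
            "message": message,
            "data": data or {},
        })

    def snapshot(self):
        """Return the retained breadcrumbs, oldest first."""
        return list(self._crumbs)


trail = BreadcrumbTrail(max_breadcrumbs=3)


def track_event(name, properties=None):
    """Hypothetical analytics hook: one call feeds both analytics and breadcrumbs."""
    # The existing analytics call would go here; it is omitted in this sketch.
    trail.record("user", name, properties)


def build_error_report(error):
    """Bundle an error with the breadcrumbs recorded before it occurred."""
    return {"error": repr(error), "breadcrumbs": trail.snapshot()}
```

Because the trail is bounded, only the most recent events survive, which keeps reports small while still showing what the user did just before the failure. In a real app, the SDK’s own breadcrumb API would take the place of the hand-rolled trail.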
For example, a QA engineer looking at edge cases during new feature testing might pick up on a UI issue. Sentry shows them real-time data on the issue, such as the HTTP requests a user made leading up to it and any navigation they did, along with detailed context including the team responsible, the Flutter version, and the build number. That makes it easier to file an actionable bug report with enough detail to reproduce and ultimately resolve the issue.
No such thing as fixing code “on the fly”
Named one of Fast Company’s 10 Most Innovative Companies of 2022, Tonal follows a strict two-week sprint and deployment cadence that has been in place since 2019 and from which the team hasn’t once deviated.
This level of consistency and predictability in releases fits their overall model of stability and a consistently performant user experience. The thing about mobile, however, is that there’s no such thing as fixing code on the fly once it’s been deployed.
“We need to rebuild the app for both Android and iOS platforms and submit it for review, which usually takes 24-48 hours. Then we can release it to both Google Play and the App Store, and then we have to wait for users’ devices to automatically update, which can take up to another 3 days for critical adoption.”
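Those figures add up quickly. As a rough tally, using only the numbers quoted above and ignoring the time needed to diagnose, fix, and rebuild (the function name and the best-case assumption are ours, for illustration):

```python
REVIEW_HOURS = (24, 48)       # store review window, per the quote
ADOPTION_HOURS_MAX = 3 * 24   # "up to another 3 days" for devices to update


def hotfix_delay_hours():
    """Return (best, worst) hours between submitting a fix and users having it."""
    best = REVIEW_HOURS[0]  # assumes devices update immediately, which is optimistic
    worst = REVIEW_HOURS[1] + ADOPTION_HOURS_MAX
    return best, worst
```

Under these assumptions the worst case is 120 hours, or five days, before a fix reaches most users, which is why catching issues before release matters so much on mobile.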
That’s why Max’s team holds their builds in QA and monitors them on the releases dashboard for about a week before deployment. Releases are standardized, with individual build numbers, and once they’ve passed QA the team declares the latest build number the “gold master,” ready to ship to production.
“When a critical issue is identified with Sentry, we’re generally able to fix it within 48 hours. The goal is that these issues never hit production.”
With a focus on platform stability, resilience, and user experience, Max’s team combines their development experience with custom Sentry solutions, giving them the ability to:
Proactively monitor for errors, with detailed context baked in
Prioritize those with a direct impact on platform stability and UX
Maintain a strict release schedule without sacrificing quality
Analyze in-app user behavior to speed up time to resolution
Maintain their competitive edge in a crowded market
“Sentry helps us maintain platform stability and prevents us from shipping something that has a direct effect on our users. A good day for us is when we don’t have crashes… which is every day.”
If you'd like to learn more, check out our full conversation with Max here.