The Game of Telephone that Hurts User Experience
Despite the best efforts of modern development to eliminate all bugs prior to release, crashes still happen. The commit and integration processes have improved dramatically with the spread of tools like GitHub, Bitbucket, GitLab, Travis CI, CircleCI, and Jenkins, yet most organizations still rely on dated methods to spot and fix the errors plaguing their applications in production. By relying on customers to report problems, developers get only a portion of the story and have no means to close the loop with the user on the most important details.
The “Game of Telephone” that results from an error report usually starts only after enough crashes have affected a large enough portion of the customer base that some of those users contact customer or technical support. Support's first step is to document and categorize the problem, usually by asking the user for screenshots and a write-up.
Customer support then hands the report to the quality assurance team, which tries to recreate the error from the description in a shared doc and runs a battery of test scenarios to better understand its context, origins, breadth, and scope. Even then, many of the important details, not least of which is impact, cannot be recaptured by QA alone.
The next line of defense is to share the collected details with an engineering manager, whose best guess about prioritization and how to fix the problem leads to a triage and post-mortem investigation into log files. At this point in the Game of Telephone, crucial details have been lost, and there’s likely more concern about preventing collateral brand damage than interest in reengaging the users to get firsthand insight and pinpoint the crash details.
By the time the who, where, what, why, and when of the error have been aggregated to determine an optimal solution, days or weeks have likely passed, and deployment of the fix may still be held up by a scheduled push cycle. If, on the other hand, the error were disastrous and widespread enough, the investigation and triage would likely have happened sooner, but as an all-hands-on-deck scenario that may prevent any other product work from getting done.
The irony of this scenario is that it affects even the most agile of DevOps organizations. Modern engineering practices are by and large adopted to push development closer to production so that teams can respond quickly and correctly to changes in the market. But, for too long, that ideal has dropped off at deployment, and the development team, which has the greatest vested interest in learning from the live customer experience, has had far too little intelligence about exceptions or ability to react to them.
Automated error tracking fundamentally closes the gap between developer and user so that product teams can focus on what they do best: creating apps that make their customers’ lives better. Sentry was built specifically with modern development teams in mind. As soon as an exception occurs, developers can see their full code directly in the stack trace, along with the context and events leading up to the error. This is the key information needed to isolate, reproduce, and fix the issue as part of your existing workflow before your customer even knows there’s a problem.
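To make that idea concrete, here is a minimal sketch of the kind of information an automated error report might gather at the moment an exception is raised: the stack trace, a trail of recent events, and some user context. It uses only the Python standard library and is not Sentry's API. Every name in it (add_breadcrumb, build_error_report, checkout, run_checkout, the user_id field) is a hypothetical chosen for illustration.

```python
import traceback
from collections import deque
from datetime import datetime, timezone

# Hypothetical in-memory trail of recent events leading up to an error.
# The name and size limit are illustrative assumptions.
_breadcrumbs = deque(maxlen=20)


def add_breadcrumb(message):
    """Record a timestamped event so it can be attached to a later report."""
    _breadcrumbs.append(
        {"time": datetime.now(timezone.utc).isoformat(), "message": message}
    )


def build_error_report(exc, context=None):
    """Collect the exception type, stack frames, breadcrumbs, and context."""
    frames = traceback.extract_tb(exc.__traceback__)
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "stack": [
            {"file": f.filename, "line": f.lineno, "function": f.name, "code": f.line}
            for f in frames
        ],
        "breadcrumbs": list(_breadcrumbs),
        "context": dict(context or {}),
    }


def checkout(cart):
    """Toy application code: fails with ZeroDivisionError on an empty cart."""
    add_breadcrumb("checkout started")
    total = sum(item["price"] for item in cart)
    add_breadcrumb("total computed")
    return total / len(cart)


def run_checkout(cart, user_id):
    """Run checkout and return (result, report); report is None on success."""
    try:
        return checkout(cart), None
    except Exception as exc:
        # A real tracker would send this report to a service; here it is
        # simply returned to the caller.
        return None, build_error_report(exc, {"user_id": user_id})
```

Calling run_checkout([], "user-123") returns no result and a report that names the failing function, the events recorded just before the failure, and the affected user, which are the details that tend to get lost when a crash is relayed by hand.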
By using Sentry, you can finally stop relying on the laggy, leaky Game of Telephone that makes reproducing errors so difficult and, instead, go right to the source and keep the customer’s experience as job one.