Let’s Get Ready to Monitorrrrr!
At Sentry, we often get asked how error monitoring is different from APM or logging. To help answer this, I’ve broken down the different types of monitoring into four major categories:
- Systems Monitoring
- Log Management
- Application Performance Management (APM) Monitoring
- Error/Crash Monitoring (Sentry!)
Systems monitoring involves checking that data center equipment works and tracking the system’s overall resource consumption. Are my network and hardware working as expected? Are RAM and CPU consumption within acceptable bounds?
Typically, system admins and/or operations engineers use and implement these systems.
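To make the "within acceptable bounds" question concrete, here is a minimal sketch of a threshold check. The metric names, the `Bound` type, and the thresholds are all hypothetical, and the readings are hard-coded; a real monitoring agent would sample them from the host.

```python
from dataclasses import dataclass


@dataclass
class Bound:
    """Acceptable range for one metric (hypothetical thresholds)."""
    low: float
    high: float


def out_of_bounds(readings: dict[str, float], bounds: dict[str, Bound]) -> list[str]:
    """Return the names of metrics whose reading falls outside its bound."""
    return [
        name
        for name, value in readings.items()
        if name in bounds and not (bounds[name].low <= value <= bounds[name].high)
    ]


# Illustrative readings as percentages, not real measurements.
readings = {"cpu_percent": 97.0, "ram_percent": 61.5}
bounds = {"cpu_percent": Bound(0.0, 90.0), "ram_percent": Bound(0.0, 85.0)}

violations = out_of_bounds(readings, bounds)
print(violations)
```

In this example only the CPU reading exceeds its bound, so that is the metric an operations engineer would be alerted about.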
Applications and services output to logs. These logs are centralized, aggregated, and searchable.
Logs are particularly useful for drilling down on specific events (assuming they are being logged) and configuring anomaly detection.
Traditionally, logs were used both to notify engineers of errors and to debug them. However, this isn’t optimal for triaging and remediating errors/exceptions/crashes, as logs neither contain in-application context nor group/aggregate similar errors. Instead of fixing the issue and moving on, developers spend a lot of time querying log data to try to figure out how to recreate the error and determine the root cause.
See Sentry vs Logging for more information on log management.
APM shows your application’s behavior within the system. It displays a range of data/information/graphs and notifies you when performance and overall health fluctuate. Helpful information includes response times, Apdex score, request rate, latency, and other related metrics. It’s up to you and your team to dig deeper and suss out any problems in the code.
Such metrics are typically oriented toward and consumed by operations teams.
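As an illustration of one of these metrics, the Apdex score can be computed from a list of response times and a target threshold T: requests at or under T count as satisfied, requests between T and 4T count as tolerating (and are weighted by half), and anything slower counts as frustrated. The sketch below assumes response times in seconds; the function name, the sample values, and the 0.5 s threshold are made up for the example.

```python
def apdex(response_times: list[float], threshold: float) -> float:
    """Apdex = (satisfied + tolerating / 2) / total samples."""
    if not response_times:
        raise ValueError("need at least one sample")
    satisfied = sum(1 for t in response_times if t <= threshold)
    # Tolerating requests are slower than T but no slower than 4T.
    tolerating = sum(1 for t in response_times if threshold < t <= 4 * threshold)
    return (satisfied + tolerating / 2) / len(response_times)


# Illustrative response times in seconds, with a hypothetical T of 0.5 s.
samples = [0.2, 0.4, 0.9, 1.5, 3.0]
score = apdex(samples, threshold=0.5)
print(score)
```

A score near 1.0 means most requests met the target, and a falling score signals degraded performance. It does not say which line of code is responsible, which is why the digging is left to you and your team.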
Error monitoring surfaces runtime errors and crashes within your application.
In-application context is sent alongside the error, giving developers all necessary information to determine the cause, place, and impact of any given error, as well as any noteworthy patterns.
Developers working on a product or service use error monitoring for up-to-date notifications when code breaks, and the necessary information and context to fix it.
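The difference from plain logging can be sketched in a few lines: each error is captured together with its in-application context, and similar errors are grouped under one key. This is a simplified illustration, not Sentry’s actual grouping algorithm or SDK; `fingerprint`, `capture`, `groups`, and the `checkout` function are hypothetical names invented for the example.

```python
import traceback
from collections import defaultdict


def fingerprint(exc: BaseException) -> tuple[str, str, str]:
    """Group key: exception type plus the file and function where it was raised."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1] if frames else None
    return (
        type(exc).__name__,
        last.filename if last else "<unknown>",
        last.name if last else "<unknown>",
    )


# Maps each group key to the list of events captured for that group.
groups: dict[tuple[str, str, str], list[dict]] = defaultdict(list)


def capture(exc: BaseException, context: dict) -> None:
    """Record the error under its group, alongside in-application context."""
    groups[fingerprint(exc)].append({"message": str(exc), "context": context})


def checkout(cart: dict) -> float:
    return cart["total"] / cart["items"]  # raises ZeroDivisionError when items == 0


# Two users hit the same bug; both events should land in one group.
for user_id, cart in [(1, {"total": 10.0, "items": 0}), (2, {"total": 5.0, "items": 0})]:
    try:
        checkout(cart)
    except ZeroDivisionError as exc:
        capture(exc, {"user_id": user_id, "cart": cart})

for key, events in groups.items():
    print(key[0], key[2], len(events))
```

Both failures end up in a single group keyed on the exception type and the failing function, and each event carries the user and cart that triggered it. That is the information a developer would otherwise have to reconstruct by querying logs.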
Yes. Each one has distinct purposes, use cases, and intended audiences.
Folks working in data centers will likely use systems monitoring. In contrast, operations teams may use APM to understand overall health, and developers use error monitoring to check the recently deployed code for application errors.
You may need to refer back to past logs to understand the entire application trail.
Though physical hardware may be out of your control (as your provider manages that), it’s still your team’s responsibility to monitor resource utilization and application behavior within the infrastructure. Some parts of systems monitoring may be abstracted away, but the others are still just as relevant. For example, you most definitely have to be aware of resource utilization and the impact it has on running your service.
Serverless means you don’t have to worry about monitoring the server (as that has been abstracted). You do, however, have to monitor throughput, latency, and application errors. You also have to make sure that your application does not exceed any of the bounds/limitations/quotas defined by the serverless platform.
In a serverless architecture, the infrastructure is hidden from you with shiny clouds, and services are distributed. This means you must rely on application monitoring for any and all visibility and to understand if your code is not functioning as expected within the abstracted infrastructure. Read about How Droplr uses Sentry to debug Serverless applications.
Sentry is focused on letting developers know that something isn’t right with their application code, and on providing all the relevant context so they can understand what’s going on in the code, gauge the impact, and take the next step.
Aside from application monitoring and in-application context, Sentry aims to help answer the next question as well, which is “what caused this?”.
As we move into architectures where services are distributed and reliant on each other, we need to know the health of each of them and be able to trace an issue in one or more of those services to the root cause and commit.
Additionally, infrastructure is increasingly being abstracted and distanced from developers, as seen with serverless, which means it is even more critical that application/code health is captured and tracked. Without control of and access to the infrastructure, engineers need to understand whether the application is behaving as expected with regard to customer experience, performance, and errors.
Application developers need application health visibility. That’s what we provide.