Introduction to Splunk Observability
Today we are giving you an overview of Splunk Observability.
Difference between IT Monitoring and Observability
People often use the terms monitoring and observability interchangeably, but there are key differences.
Monitoring refers to gathering and analyzing data from applications and infrastructure to track the performance of IT systems against a predefined set of metrics and logs, so that issues can be identified, mitigated, and resolved. In simple terms, monitoring is a continuous process of checking the output of the system: for example, whether a system process is alive and sending a heartbeat, or whether latency satisfies the service level agreement (SLA). Monitoring uses dashboards to capture and display predetermined data that helps IT teams detect potential problems and long-term performance trends.
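The kind of predefined check described above can be sketched in a few lines of Python. This is only an illustration: the `HealthSample` type, the `check_sample` function, and both threshold values are assumptions made for this example and are not part of any Splunk product.

```python
from dataclasses import dataclass


@dataclass
class HealthSample:
    """One observation of a system's external output (hypothetical shape)."""
    seconds_since_heartbeat: float
    latency_ms: float


def check_sample(sample, heartbeat_timeout_s=30.0, sla_latency_ms=200.0):
    """Evaluate a sample against predefined thresholds and return alert messages.

    The default thresholds are illustrative assumptions, not recommended values.
    """
    alerts = []
    # Heartbeat check: has the process reported recently enough?
    if sample.seconds_since_heartbeat > heartbeat_timeout_s:
        alerts.append("heartbeat missing")
    # SLA check: is the measured latency within the agreed limit?
    if sample.latency_ms > sla_latency_ms:
        alerts.append("latency above SLA")
    return alerts
```

A healthy sample such as `HealthSample(5.0, 120.0)` produces no alerts, while a sample with a stale heartbeat and slow responses produces both. Note that the check only looks at outputs the system already emits; nothing in the monitored system has to change.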
Observability is the ability to understand and get visibility into the internal state of your systems and infrastructure based on external outputs such as logs, metrics, events, and traces generated by the application or system. Observability provides context and a deep understanding of the interdependencies between the different applications across your IT environment. This allows problems to be recognised, alleviated and resolved.
Finally, monitoring does not attempt to change the system that is being monitored. It just observes what is happening, without anything special being done on that system. For a system to be truly observable, however, it needs to do something extra to expose its internal state: the developer of the application has to integrate observability code into the application. For example, the OpenTracing API helps developers instrument tracing in their code base.
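To make the idea of "integrating observability code" concrete, here is a minimal sketch that uses only the Python standard library. It is not the OpenTracing or OpenTelemetry API; the `span` helper, the `FINISHED_SPANS` list, and the `handle_checkout` function are hypothetical names invented for this illustration.

```python
import time
import uuid
from contextlib import contextmanager

# Hypothetical in-memory "exporter"; a real tracer would ship spans to a backend.
FINISHED_SPANS = []


@contextmanager
def span(name, trace_id=None, **attributes):
    """Record the duration, attributes, and any error of a block of work."""
    record = {
        "trace_id": trace_id or uuid.uuid4().hex,
        "name": name,
        "attributes": dict(attributes),
        "error": None,
    }
    start = time.perf_counter()
    try:
        yield record
    except Exception as exc:
        # Expose the failure as part of the span, then let it propagate.
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000.0
        FINISHED_SPANS.append(record)


def handle_checkout(cart_items):
    """Application code instrumented by its developer (illustrative only)."""
    with span("handle_checkout", item_count=len(cart_items)) as parent:
        # The child span reuses the parent's trace_id so both can be linked later.
        with span("compute_total", trace_id=parent["trace_id"]):
            total = sum(price for _, price in cart_items)
        parent["attributes"]["total"] = total
        return total
```

Calling `handle_checkout([("book", 12.5), ("pen", 1.5)])` leaves two records in `FINISHED_SPANS` that share one trace ID. The point of the sketch is that the application itself was changed to expose its internal state, which is the extra step that monitoring alone does not require.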
To conclude, if we have just a single process running on a single machine, monitoring is enough. But if we have multiple processes running on many machines, then even though metrics and logs are coming out of each system, we need something that combines them into a coherent view; hence we require observability.
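As a rough sketch of what "combining" outputs from several machines can mean, the Python below groups log records by a shared request identifier and orders them by timestamp. The record fields (`request_id`, `ts`, `host`, `message`) and the `correlate` function are assumptions for this example; real platforms use richer correlation than this.

```python
from collections import defaultdict


def correlate(records):
    """Group records from many hosts by request_id, ordered by timestamp."""
    by_request = defaultdict(list)
    for record in records:
        by_request[record["request_id"]].append(record)
    # Sort each group so the events read as one end-to-end timeline.
    return {
        request_id: sorted(events, key=lambda event: event["ts"])
        for request_id, events in by_request.items()
    }


# Hypothetical log lines emitted independently by three machines.
SAMPLE_RECORDS = [
    {"request_id": "r1", "ts": 3, "host": "db-1", "message": "query took 900 ms"},
    {"request_id": "r2", "ts": 2, "host": "web-1", "message": "request received"},
    {"request_id": "r1", "ts": 1, "host": "web-1", "message": "request received"},
    {"request_id": "r1", "ts": 2, "host": "api-1", "message": "calling database"},
]

timeline = correlate(SAMPLE_RECORDS)
```

Here `timeline["r1"]` follows one request from `web-1` to `api-1` to `db-1`, which is a view that no single machine's logs could provide on their own.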
Data used for Observability
Telemetry data such as logs, metrics and traces generated by applications or systems can be used for observability. In general, telemetry data is generated using a collection of resources, including SDKs and APIs (such as OpenTelemetry). There are different types of telemetry metrics:
- Business Layer Metrics: such as A/B testing results, number of new users, average session durations
- Application Layer Metrics: such as application response times, transaction durations
- Infrastructure Layer Metrics: such as server traffic, disk I/O operations, network I/O operations, RAM, CPU and disk usage
- Client Layer Metrics: such as client application response times and client errors on web, mobile, JavaScript and other client applications
- Deployment Pipeline Layer Metrics: such as check-ins, deployment lead times, frequencies, status of environments
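One way to picture the layers above is as a tag attached to every metric data point, so that points can later be filtered or grouped by layer. The sketch below is purely illustrative: the `Layer` enum, the `MetricPoint` type, the metric names, and the values are all made up for this example.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Layer(Enum):
    BUSINESS = "business"
    APPLICATION = "application"
    INFRASTRUCTURE = "infrastructure"
    CLIENT = "client"
    DEPLOYMENT_PIPELINE = "deployment_pipeline"


@dataclass(frozen=True)
class MetricPoint:
    """A single measurement tagged with the layer it describes (hypothetical)."""
    name: str
    value: float
    layer: Layer


def count_by_layer(points):
    """Count how many data points were reported for each layer."""
    return Counter(point.layer for point in points)


# Made-up sample values, one or two per layer.
SAMPLE_POINTS = [
    MetricPoint("new_users", 42, Layer.BUSINESS),
    MetricPoint("response_time_ms", 180.0, Layer.APPLICATION),
    MetricPoint("cpu_usage_percent", 71.5, Layer.INFRASTRUCTURE),
    MetricPoint("disk_io_ops", 1300, Layer.INFRASTRUCTURE),
    MetricPoint("client_errors", 3, Layer.CLIENT),
    MetricPoint("deployment_lead_time_min", 25, Layer.DEPLOYMENT_PIPELINE),
]

counts = count_by_layer(SAMPLE_POINTS)
```

With the sample data, `counts[Layer.INFRASTRUCTURE]` is 2 and every other layer has one point.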
Splunk products belonging to Observability
Currently, Splunk provides the following Observability products:
- Splunk Application Performance Monitoring
- Splunk Infrastructure Monitoring
- Splunk IT Service Intelligence
- Splunk Log Observer Connect
- Splunk Real User Monitoring
- Splunk Synthetic Monitoring
- Splunk On-Call