The Observability Tooling Market

Welcome to the wild world of observability tooling: 20+ legitimate, mature vendor options, split between open source and closed source.

Once a startup reaches any modicum of success, observability becomes a must-have tool for its engineering organization (and most will adopt it before then anyway). Downtime is basically the worst possible thing that can happen to an engineering team, and they will do, and pay for, pretty much anything that helps them proactively avoid it. This, among other dynamics, is why Datadog can charge millions of dollars and observability can make up 20-30% of a company’s entire infrastructure bill.

That being said…companies in this category are relatively mature, and there aren’t necessarily giant differences in their feature sets that matter to buyers. A lot of a buyer’s choice comes down to open source vs. closed source. All in all, you're looking at 20+ legitimate, mature vendor options here. Welcome to the wild world of observability.

So far in this category, we’ve covered Datadog, Splunk, Elastic, and New Relic.

Logs vs. metrics vs. APM

Broadly, developers think of observability in terms of three types of use cases or goals: logs, metrics, and APM. Knowing the differences between them will help you understand why a company might go for Datadog instead of the Elastic stack, or use New Relic and the Grafana stack together.

Logs

The log is the basic unit of observability, which we covered in the post about Splunk. To quote myself:

In a really ideal world, all of the data we want to analyze would magically sit in perfectly manicured tables with nice column names and no missing data. Unfortunately, reality isn’t so rosy; data is only as good and clean as what generates it, and our systems can get pretty dirty. Data is really (usually) just a record of what happened, and the most popular way of generating and storing that today is a log. A log is just a line that says what happened, where, and when, and includes any necessary other information.

There are logs for everything: server logs, access and authorization logs, event logs, availability logs, resource logs…the list goes on.

A log might get generated when a user makes a request, when someone enters the wrong password, or when one server fails to connect to another server. Pretty much anything that happens in code generates a log. These things are the currency of observability.
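To make that concrete, here is a minimal sketch of code generating logs, using Python's built-in logging module. The service name (auth-service) and the login function are made up for illustration; the point is only that each line records what happened, where, and when.

```python
import io
import logging

# Capture log output in memory so the example is self-contained.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# Each line records when (asctime), where (logger name), severity, and what happened.
handler.setFormatter(
    logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
)

logger = logging.getLogger("auth-service")  # hypothetical service name
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False  # keep output out of the root logger


def login(user: str, password_ok: bool) -> bool:
    """Hypothetical login handler that logs each outcome."""
    if not password_ok:
        logger.warning("login failed: wrong password user=%s", user)
        return False
    logger.info("login succeeded user=%s", user)
    return True


login("alice", password_ok=False)
login("bob", password_ok=True)

log_lines = stream.getvalue().splitlines()
print("\n".join(log_lines))
```

Running this prints two lines, one per login attempt, each starting with a timestamp. Multiply that by every request, every service, and every server, and you can see where the volume comes from.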

But a log is a very ba...