The core New Relic product: observability
Every app that you use on the internet is running on a server somewhere. Developers need to understand what’s going on with their apps and servers so that things run smoothly: they’re checking for how fast things run, what errors they run into, spikes in traffic, and stuff like that.
New Relic offers a comprehensive set of tools for doing all of this stuff, from application to infrastructure and beyond (Datadog is a useful comparison). You start by installing New Relic agents (like little cameras) on your infrastructure, and gathering data. Then you can build interactive visualizations, set up alerts, and dig deeper into specific patterns you’re seeing.
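To make the "gather data, then set up alerts" idea a bit more concrete, here is a minimal sketch of what an alert rule boils down to. It is not New Relic's API: the `Sample` shape, the `should_alert` function, the metric name, and the readings are all made up for illustration. The assumed rule is "fire if the metric stayed above a threshold for the last few samples in a row."

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One data point an agent might report (hypothetical shape)."""
    metric: str      # e.g. "response_time_ms"
    value: float
    timestamp: int   # seconds since the first reading


def should_alert(samples, metric, threshold, consecutive=3):
    """Return True if `metric` was above `threshold` for the last `consecutive` samples."""
    values = [s.value for s in samples if s.metric == metric]
    recent = values[-consecutive:]
    # Require a full window, so one early spike cannot fire the alert alone.
    return len(recent) == consecutive and all(v > threshold for v in recent)


# Made-up readings: response time creeps up over five samples.
readings = [
    Sample("response_time_ms", 120, 0),
    Sample("response_time_ms", 135, 10),
    Sample("response_time_ms", 510, 20),
    Sample("response_time_ms", 620, 30),
    Sample("response_time_ms", 700, 40),
]

print(should_alert(readings, "response_time_ms", threshold=500))  # True
```

Requiring several bad samples in a row is one common way to avoid getting paged for a single blip; real tools offer many more knobs than this.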
A brief, brief history of deployment
Every app that you use on the internet is running on a server somewhere. Until fairly recently, that literally meant one server (a giant computer), so you had whatever computing power you had, and you had one place to look if you wanted to know why things were broken – or worse, why things were slow. Linux, the standard server operating system, had basic utilities for monitoring a lot of this stuff, like the htop command (which is still used a lot, mind you), but this was mostly a reactive process. Something would go wrong, and you’d check why.
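For a feel of what "checking why" looked like on a single machine, here is a rough sketch that grabs a few of the same kinds of numbers a tool like htop shows. It is not how htop works internally. It only uses Python's standard library, the `server_snapshot` name is made up, and `os.getloadavg` is Unix-only, so this assumes a Linux or macOS box.

```python
import os
import shutil


def server_snapshot(path="/"):
    """Collect a few basic health numbers for the machine this runs on."""
    # Average number of processes waiting to run over 1, 5, and 15 minutes.
    load_1m, load_5m, load_15m = os.getloadavg()  # Unix-only
    disk = shutil.disk_usage(path)
    return {
        "cpu_count": os.cpu_count(),
        "load_1m": load_1m,
        "load_5m": load_5m,
        "load_15m": load_15m,
        "disk_used_pct": round(100 * disk.used / disk.total, 1),
    }


if __name__ == "__main__":
    for name, value in server_snapshot().items():
        print(f"{name}: {value}")
```

Notice that this only tells you about right now, on this one machine, and only when someone bothers to run it. That is the reactive part.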
Then a few things changed:
Infrastructure got easier, but more complicated
The one-app-one-server paradigm doesn’t really hold anymore. Most major apps now run as distributed systems: sets of multiple, interconnected servers. It is obviously harder to debug many servers than it is to debug one server; but that’s only part of the problem.
With so much now running on Docker, the concept of “servers” got a lot more complicated, because there was a thick layer of abstraction between your code and what infrastructure it was running on. Docker creates a standalone container on your servers; now, something can be wrong with your server or that container, or even the relationship between the two.
And in addition to that increased surface area, it also meant more niche things to worry about, like your Kubernetes cluster having a hard time restarting pods. New approaches to infrastructure – as well as new layers for managing that infrastructure – mean monitoring is much more complex than it used to be.
The internet got bigger
As the internet became widely available, apps came to be used by, like, billions of people. When 2 billion people are loading Facebook.com every hour as opposed to 200, a lot more things can go wrong, both in terms of surface area (bigger products) and in terms of having just so many people hitting your servers. Supporting an app that’s trying to handle hundreds of millions of requests per second is very different from supporting one that’s handling thousands. On top of that, it’s more and more important to fix things quickly, since there are always a lot of people trying to use what’s broken.
Everything moved to the cloud
The last important change to keep in mind: most apps you access today run in the cloud and send data across the web, instead of running in your company’s local data center. The cloud is great! Companies can get set up way faster, for much cheaper (initially), and get access to the fastest and best infrastructure around. But it also means that every single request you make from your browser – be it to fetch your emails or send a tweet – has to travel over the public web, instead of across little internal wires. That makes performance much more unpredictable, and doubly important to measure.
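One way to see why "unpredictable" matters for measurement: averages hide the slow requests. The sketch below uses made-up latency numbers and a simple nearest-rank percentile (one of several ways to define a percentile, chosen here because it is easy to follow). The `percentile` helper and the data are illustrative assumptions, not anything from New Relic.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value with at least pct% of values at or below it."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


# Made-up request latencies in milliseconds: most are fast, a couple are slow.
latencies_ms = [80, 85, 90, 95, 100, 105, 110, 120, 450, 900]

print("mean:", sum(latencies_ms) / len(latencies_ms))  # 213.5
print("median:", percentile(latencies_ms, 50))         # 100
print("p90:", percentile(latencies_ms, 90))            # 450
```

The typical request here takes about 100 ms, but one in ten takes 450 ms or more, and the mean describes neither group well. That is why monitoring tools tend to report percentiles alongside averages.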
So in summary, developers were faced with more complex infrastructure and more pressure to understand, monitor, and keep that infrastructure running smoothly. This is part of why DevOps (development and operations) started to become its own discipline – companies were employing teams of developers just to deploy and monitor infrastructure.
What teams are actually monitoring
To understand New Relic, you need to understand what monitoring is. Building your app is far from the finish line: once you get it out to your users, there’s an entire series of workflows around making sure it continues working, and that it’s fast.
There are two big pieces to modern application / infrastructure monitoring: your application and your infrastructure (i.e. both sides of the slash).