The Overlooked Culprit Behind 70% of SaaS Outages

Identifying the source of an outage is the name of the game when it comes to incident response and observability. But what if you were blind to the source of up to 70% of those outages? That’s the case for most companies because it’s difficult to get visibility into a fundamental part of today’s apps: […]

My Predictions For the Future of Observability

Observability is an exciting, emerging field. My co-founder, Ryan, and I have been here since the early days (including being some of the first hires at Server Density and New Relic, back when we just did monitoring). We’ve seen the field grow in significant ways, and we’re here to help it evolve even more. But […]

Going Beyond Incident Response With Cloud Observability

When it comes to observability for cloud dependencies, we often think about how we can use that data for incident response. However, that data can go beyond incident response and has a number of other important applications. In this article, we’ll discuss how cloud dependency data can help you improve resiliency, get early warnings, […]

The Three Reasons You Need Observability

When people talk about observability, it’s usually in the context of obtaining data (metrics, logs, and traces) for the purpose of resolving incidents. But what if that data could be used for more than emergency situations? And what if expanding our understanding of what observability is could help us better resolve incidents – and maybe […]

What Happened at AWS re:Invent? Highlights From the 2022 Conference

What happened at this year’s AWS re:Invent? We attended and can confirm it was a great event. (Shout-out if you saw Jeff or Ryan there!) If you weren’t one of the 50,000 people who attended in person or the 300,000 who joined online, here’s a recap of the most important updates! Observability […]

Introducing a Real AWS Status Page!

Why is the official AWS status page so hated? Sometimes it appears to be useless. AWS is one of the biggest and most important products in the world, so what could possibly make its status page so unreliable? There are a variety of reasons why the AWS status page is not reliable – or even a […]

Recap of DevGuild: Incident Response 2022

We had a great time at DevGuild: Incident Response and learned a lot from some amazing speakers who shared their experiences from Spotify, Zendesk, Salesforce, Honeycomb, Snyk, and more. I wanted to recap the experience and provide some takeaways from the conference. And don’t just take my word for it: you can watch the replay […]

Here’s How Chicago Trading Company’s Luke Rotta Engineers Resilient Systems

Just like any tech company, a fintech firm relies on third-party cloud apps. If those apps malfunction, though, a high-stakes business is put at risk. Luke Rotta is the SRE & Observability Manager at Chicago Trading Company (CTC) and has worked in the fintech industry for over 20 years. CTC is a privately held company with offices […]