What Went Down: Week Ending April 24, 2023

Newsworthy: Southwest Airlines had a number of delayed flights due to an undisclosed cloud dependency outage – or “data connection issues resulting from a firewall failure.” Flo by Moen, a “smart water monitor and shutoff” system, was down, and customers were left without water due to a cloud outage. Notable Metrist-Reported Downtime While […]
Slack Said It Had 100% Uptime. Did It Really?

Not too long ago, Gergely Orosz pointed out on Twitter that Slack reported 100% uptime since 2022, but he didn’t think that was the case. So who was right, Orosz or Slack? We functionally monitor SaaS products like Slack to detect downtime in real time. So we set out to see whether Slack really […]
What Went Down: Week Ending April 10, 2023

Notable Metrist-Reported Downtime While these outages didn’t make the news, these issues caught by Metrist may have affected your company’s app and operations. The most significant of these were: Several Azure services (including Blob Storage, Virtual Machines, Cosmos DB, SQL, AKS) experienced multiple outages ranging from 6 minutes to 8 hours and 4 minutes. Hotjar […]
March Updates: Important — Out-of-the-Box Data is Going Away

We work hard every day to increase the value of Metrist, and we have two big changes to share with you this month. You don’t want to miss these so read on! Product Updates Important: Out-of-the-box data is going away Effective today, you will no longer see the Metrist view of service health as […]
What Went Down: Week Ending April 3, 2023

Notable Metrist-Reported Downtime While these outages didn’t make the news, these issues caught by Metrist may have affected your company’s app and operations. Want to know if or how you were impacted? Set up Metrist to monitor your own cloud resources by scheduling a demo. Azure Azure SQL had a number of 5-25 […]
What Went Down: Week Ending March 13, 2023

Newsworthy: Datadog had a major outage for 24 hours and 20 minutes on 8th March. You can learn more about the outage, how we monitored it, and how Metrist can be the Datadog for your Datadog here. And you can check out a demo of our Datadog monitor here. Notable Metrist-Reported Downtime While […]
Who is the Datadog for Datadog?

Let’s get a little trippy. In light of the major Datadog outage on Wednesday, people were asking, “Who is the Datadog for Datadog?” When an important tool like Datadog is down – how can you tell if it’s down or back up again without constantly checking status pages or social media? Well That’s Where […]
What Went Down: Week Ending March 6th, 2023

Notable Metrist-Reported Downtime While these outages didn’t make the news, these issues caught by Metrist may have affected your company’s app and operations. AWS AWS RDS (MySQL). On 27th February, CreateInstance was not responding from US East 1 for 40 minutes. Azure Azure AKS. On 28th February, CreateCluster was not responding from Azure […]
What is AI Ops and How is AI Ops Useful for Incident Response?

When it comes to incident response, AIOps is an up-and-coming field. Reliability is critical to companies, but in today’s complex, interdependent software environment, observability and incident response are becoming more and more complex. So, it’s useful to use AI to improve incident response, observability, and the reliability of our systems – which is where AIOps […]
What Went Down: Week Ending February 27, 2023

Notable Metrist-Reported Downtime While these outages didn’t make the news, these issues caught by Metrist may have affected your company’s app and operations. Azure Azure CDN’s PurgeFile was not responding from Azure Central US for 25 minutes on the 24th. GCP GCP Compute Engine’s DeleteInstance is not responding from GCP US Central 1 […]