Notable Metrist-Reported Downtime
AWS
- AWS RDS was unable to create new instances in AWS US-East-1 for 44 minutes on the 20th; AWS only provided a vague error message about the service being unavailable. In a separate incident that same day, it was unable to create new instances in AWS US-East-1 for 20 minutes, with Metrist's attempts each timing out after 900 seconds. Then on the 22nd, the platform was unable to create new instances in AWS US-East-1 for 5 hours 5 minutes. (A sketch of this kind of instance-creation probe appears after this list.)
- AWS Route53’s DNS record creation took 1698% longer than usual in AWS US-East-1 for 10 minutes on the 19th.
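The RDS failures above were caught by synthetic instance-creation checks. Below is a minimal sketch of such a probe using boto3, with the waiter capped at 900 seconds to match the timeouts reported above; the instance identifier, class, and credentials are illustrative assumptions, not Metrist's actual configuration.

```python
import time
import boto3
from botocore.exceptions import WaiterError

# Hypothetical probe: create a small RDS instance in us-east-1 and fail if it
# isn't available within 900 seconds.
rds = boto3.client("rds", region_name="us-east-1")

db_id = f"probe-{int(time.time())}"  # illustrative identifier
rds.create_db_instance(
    DBInstanceIdentifier=db_id,
    DBInstanceClass="db.t3.micro",
    Engine="postgres",
    AllocatedStorage=20,
    MasterUsername="probe",
    MasterUserPassword="not-a-real-password",
)

waiter = rds.get_waiter("db_instance_available")
try:
    # 60 attempts x 15-second delay = 900-second ceiling.
    waiter.wait(DBInstanceIdentifier=db_id,
                WaiterConfig={"Delay": 15, "MaxAttempts": 60})
    print("instance creation succeeded")
except WaiterError:
    print("instance creation timed out after 900 seconds")
finally:
    # Clean up the probe instance regardless of outcome.
    rds.delete_db_instance(DBInstanceIdentifier=db_id, SkipFinalSnapshot=True)
```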
Azure
- Azure VMs was unable to create new VMs in Azure East US for 30 minutes on the 17th. Then on the 19th, it was unable to reach instances running in Azure West US for 30 minutes.
- Azure AKS was unable to create new clusters in Azure East US 2 for 26 minutes on the 26th.
- Azure Cosmos DB was unable to create Cosmos DB accounts in Azure East US for 25 minutes on the 21st. Then on the 22nd, the platform was unable to create Cosmos DB accounts in Azure East US for 4 hours 11 minutes. And on the 23rd, there were two more outages in Azure East US in which it was unable to create Cosmos DB accounts: one lasted 40 minutes and the other 20 minutes, for a total of an hour.
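A minimal sketch of an account-creation check along the lines of the Cosmos DB findings above, using the azure-mgmt-cosmosdb SDK; the subscription, resource group, account name, and region parameters are illustrative assumptions rather than Metrist's actual setup.

```python
import time
from azure.identity import DefaultAzureCredential
from azure.mgmt.cosmosdb import CosmosDBManagementClient
from azure.mgmt.cosmosdb.models import (
    DatabaseAccountCreateUpdateParameters,
    Location,
)

SUBSCRIPTION_ID = "<subscription-id>"               # placeholder
RESOURCE_GROUP = "probe-rg"                         # illustrative resource group
ACCOUNT_NAME = f"probe-{int(time.time())}"          # account names must be globally unique

client = CosmosDBManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Kick off account creation in East US and wait on the long-running operation.
params = DatabaseAccountCreateUpdateParameters(
    location="eastus",
    locations=[Location(location_name="eastus", failover_priority=0)],
)
poller = client.database_accounts.begin_create_or_update(
    RESOURCE_GROUP, ACCOUNT_NAME, params
)

start = time.monotonic()
poller.result()  # raises if the service reports a failure
print(f"account created in {time.monotonic() - start:.0f}s")

# Clean up the probe account.
client.database_accounts.begin_delete(RESOURCE_GROUP, ACCOUNT_NAME).result()
```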
GCP
- GCP App Engine was unable to deploy new versions in GCP US East 1 for 44 minutes on the 17th. Further, the platform returned intermittent 500 errors when migrating traffic and autoscaling at the same time.
- GCP Compute Engine was unable to create new instances in GCP US East 4 for 4 hours 20 minutes on the 18th; the error message indicated that GCP did not have enough resources available. Then on the 19th, it was unable to create new instances in GCP US East 4 for 6 hours, with the same not-enough-resources message from GCP. An identical issue occurred for 3 hours 20 minutes on the 20th.
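A minimal sketch of an instance-creation check like the one behind these Compute Engine findings, using the google-cloud-compute client library; the project, zone, image, and machine type are illustrative assumptions, not Metrist's actual configuration.

```python
import time
from google.cloud import compute_v1

PROJECT = "my-project"   # placeholder project ID
ZONE = "us-east4-a"      # a zone in the affected US East 4 region

client = compute_v1.InstancesClient()

instance = compute_v1.Instance(
    name=f"probe-{int(time.time())}",
    machine_type=f"zones/{ZONE}/machineTypes/e2-micro",
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12",
            ),
        )
    ],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

operation = client.insert(project=PROJECT, zone=ZONE, instance_resource=instance)
try:
    operation.result(timeout=900)  # wait up to 15 minutes for the create to finish
    print("instance creation succeeded")
    # Clean up the probe instance on success.
    client.delete(project=PROJECT, zone=ZONE, instance=instance.name).result()
except Exception as exc:
    # Capacity stockouts like the ones reported above typically surface here
    # (e.g. a ZONE_RESOURCE_POOL_EXHAUSTED error).
    print(f"instance creation failed: {exc}")
```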
SaaS/Other
- Trello’s Card Creation was extremely slow for some West Coast users for 15 minutes on the 19th, according to Metrist. The platform updated their status page lightning-fast to indicate these problems.
- NuGet experienced extreme latency when Listing Versions and Downloading Packages from Canada for 1 hour 26 minutes on the 19th.
- Braintree’s sandbox environment was down across all of North America for 42 minutes on the 20th. They never updated their status page, despite having one specifically for the sandbox environment. Metrist’s in-app monitoring showed that production environments were functional.
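For context on the Braintree entry, a sandbox check typically runs a test transaction against the sandbox gateway while a parallel check watches production. Here is a minimal sketch using the braintree Python SDK; the credentials are placeholders, and this is not Metrist's actual check.

```python
import braintree

# Hypothetical sandbox health check: run a simple test transaction against the
# sandbox gateway (Environment.Sandbox); a production check would use
# Environment.Production with live credentials instead.
gateway = braintree.BraintreeGateway(
    braintree.Configuration(
        environment=braintree.Environment.Sandbox,
        merchant_id="your_merchant_id",
        public_key="your_public_key",
        private_key="your_private_key",
    )
)

# "fake-valid-nonce" is Braintree's documented test payment method nonce.
result = gateway.transaction.sale({
    "amount": "1.00",
    "payment_method_nonce": "fake-valid-nonce",
})

if result.is_success:
    print("sandbox transaction succeeded")
else:
    print(f"sandbox transaction failed: {result.message}")
```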