We are currently investigating System errors on EU-CENTRAL-1.
Incident Report for Wasabi Technologies
Postmortem

Exec Summary

At 04:40 UTC on September 23, 2021, the Wasabi database service in the EU-Central-1 region was impacted for about 57 minutes, during which customers experienced connection errors.

At 05:37 UTC on September 23, 2021, the system was returned to fully operational service. 

Outage Details

Wasabi runs an internal audit process in the background that validates metadata from an accounting perspective. It queries the database to collect the information needed to reconcile data. The disruption was caused by an inefficient query that put one of the database servers into a highly stressed state. Wasabi uses multiple “shards” for its database operations, and the problem occurred on one of them. While the majority of operations worked, buckets on the affected shard experienced read and write failures.
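To illustrate why a problem on a single shard affects only some buckets, here is a minimal sketch in Python. It assumes buckets are assigned to shards by a stable hash of the bucket name. The shard count, the function names (`shard_for`, `affected_buckets`) and the hashing scheme are hypothetical and chosen for illustration only; they do not describe Wasabi's actual topology or routing.

```python
import hashlib
from typing import Iterable, List, Set

# Hypothetical shard count, for illustration only (not Wasabi's real topology).
NUM_SHARDS = 4


def shard_for(bucket: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a bucket name to a shard index using a stable hash (assumed scheme)."""
    digest = hashlib.sha256(bucket.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def affected_buckets(buckets: Iterable[str], unhealthy: Set[int]) -> List[str]:
    """Return the buckets whose metadata lives on an unhealthy shard."""
    return [b for b in buckets if shard_for(b) in unhealthy]


if __name__ == "__main__":
    buckets = [f"bucket-{i}" for i in range(20)]
    stressed_shard = shard_for(buckets[0])  # pretend this shard is overloaded
    impacted = affected_buckets(buckets, {stressed_shard})
    print(f"{len(impacted)} of {len(buckets)} buckets are on the stressed shard")
```

Under these assumptions, requests for buckets on the other shards continue to succeed, which matches the behaviour described above: most operations worked, while buckets on the affected shard saw read and write failures.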

The database on the impacted shard was restarted as part of the recovery. The audit process has been disabled until the inefficient query can be rewritten.

Action Items:

  • Disable the audit process until the software is modified - Done
  • Analyze the database tuning options to determine how the audit process may be re-enabled in the interim
  • Upgrade the audit process software to improve the inefficient query - Targeted for an upcoming release
Posted Sep 28, 2021 - 14:14 UTC

Resolved
The System errors issue has been resolved and service is restored to normal operating levels. Thank you for your patience while this was investigated. Please reach out to support@wasabi.com if you continue to see any errors.
Posted Sep 23, 2021 - 06:15 UTC
Monitoring
Service is currently restored in EU-Central-1. We remain in degraded status while we continue to monitor the system.
Posted Sep 23, 2021 - 06:00 UTC
Update
Our Operations team is continuing to investigate this issue.
Posted Sep 23, 2021 - 05:34 UTC
Investigating
We are currently investigating System errors on EU-CENTRAL-1.
Posted Sep 23, 2021 - 05:17 UTC
This incident affected: EU-Central-1.