From 2024-09-16 04:30 UTC to 2024-09-16 12:30 UTC, we experienced an issue in our US-EAST-1 region that caused customers to receive 5XX errors and reduced their ability to ingest data into customer buckets within the region. The system's user-servers reached capacity with logs, which caused our streaming service to fail. Because the user-servers were busy writing logs, they had reduced capacity to handle requests. Additionally, messages that could not be published were written to disk, further increasing I/O operations on the system.
By 12:30 UTC, our Operations team had taken corrective action by emptying the streaming queue and restarting the user-server services. After these actions were performed, ingest to our US-EAST-1 region was fully restored.