Latency Issues with 1.0 Platform

Resolved

January 27, 2026 at 5:55 PM UTC

RCA – Root Cause Analysis: Latency Issues with 1.0 Platform
Status: Resolved
Date: January 27, 2026
Incident Window: 9:06 AM – 10:15 AM MST
Summary
Between 9:06 AM and 10:15 AM MST, the 1.0 platform experienced elevated latency and intermittent API timeouts. The root cause was identified as storage exhaustion within the Redis cache cluster. This saturation forced the platform to fall back to primary database queries, leading to system congestion and slower response times.
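The fallback described above can be illustrated with a minimal read-through cache sketch. Everything in it is hypothetical: `BoundedCache`, `CountingDatabase`, and `read_through` are in-memory stand-ins, not the platform's code, and the sketch assumes a full cache rejects new writes. The report does not say how the Redis cluster was configured to behave at capacity.

```python
class BoundedCache:
    """Hypothetical in-memory stand-in for a cache that rejects writes once full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.store = {}

    def get(self, key):
        return self.store.get(key)

    def set(self, key, value):
        if key not in self.store and len(self.store) >= self.capacity:
            return False  # cache is full, so the write is rejected
        self.store[key] = value
        return True


class CountingDatabase:
    """Hypothetical primary database that counts how many queries it serves."""

    def __init__(self, rows):
        self.rows = rows
        self.queries = 0

    def query(self, key):
        self.queries += 1
        return self.rows[key]


def read_through(cache, db, key):
    """Serve from the cache when possible; otherwise query the database."""
    value = cache.get(key)
    if value is not None:
        return value
    value = db.query(key)
    cache.set(key, value)  # silently fails when the cache is full
    return value


def count_queries(capacity, passes=3):
    """Read the same four keys repeatedly and report the database load."""
    db = CountingDatabase({f"k{i}": i for i in range(4)})
    cache = BoundedCache(capacity)
    for _ in range(passes):
        for key in ("k0", "k1", "k2", "k3"):
            read_through(cache, db, key)
    return db.queries


print(count_queries(capacity=4))  # cache holds every key
print(count_queries(capacity=2))  # cache is full after two keys
```

With enough capacity, each key reaches the database once. With the cache full, the keys that could not be stored reach the database on every read, which is the kind of extra load the summary describes.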
Timeline

9:06 AM: Incident start; Redis storage reaches capacity.
9:26 AM: System alerts received by IT and Operations teams.
9:33 AM: Formal incident ticket submitted and investigation intensified.
9:50 AM: Root cause confirmed as Redis cluster saturation.
9:55 AM: Cluster upgrade initiated to increase storage capacity.
10:06 AM: Upgrade complete; cache layer fully operational.
10:15 AM: Incident resolved; API latency returned to baseline.
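The intervals between these timestamps can be worked out with a short script. This is only a convenience sketch: the `minutes_between` helper and the interval labels are illustrative, and the times are copied from the timeline above.

```python
from datetime import datetime

TIME_FORMAT = "%I:%M %p"  # e.g. "9:06 AM"; all times fall on the same day


def minutes_between(start, end):
    """Return the whole minutes elapsed between two same-day clock times."""
    delta = datetime.strptime(end, TIME_FORMAT) - datetime.strptime(start, TIME_FORMAT)
    return int(delta.total_seconds() // 60)


# Interval labels are illustrative; the timestamps come from the timeline.
intervals = {
    "incident start to first alert": ("9:06 AM", "9:26 AM"),
    "first alert to root cause confirmed": ("9:26 AM", "9:50 AM"),
    "cluster upgrade duration": ("9:55 AM", "10:06 AM"),
    "total incident window": ("9:06 AM", "10:15 AM"),
}

for label, (start, end) in intervals.items():
    print(f"{label}: {minutes_between(start, end)} min")
```

The first interval, 20 minutes between saturation and the first alert, is the gap that earlier alerting would aim to shrink.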

Resolution & Prevention
To resolve the incident, the Redis cluster was vertically scaled to increase available storage. To prevent a recurrence, we are recalibrating our monitoring and alerting thresholds so that alerts trigger earlier in the utilization cycle, allowing our team to intervene before the cluster reaches saturation.
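As a rough illustration of alerting earlier in the utilization cycle, the sketch below classifies cache utilization against two thresholds. The function name and the 75% and 90% values are placeholders chosen for the example, not the thresholds the team is adopting. In a Redis deployment the two inputs would typically come from the `used_memory` and `maxmemory` fields of `INFO memory`, but that is an assumption about this setup.

```python
def classify_utilization(used_bytes, max_bytes, warn_at=0.75, critical_at=0.90):
    """Map cache utilization to an alert level.

    The default thresholds are illustrative placeholders, not values
    taken from the incident report.
    """
    if max_bytes <= 0:
        # A non-positive limit means utilization cannot be computed
        # (for example, when no memory limit is configured).
        raise ValueError("max_bytes must be positive")
    ratio = used_bytes / max_bytes
    if ratio >= critical_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"


# Example readings against a hypothetical 100-unit limit.
for used in (50, 80, 95):
    print(used, classify_utilization(used, 100))
```

A warning level that fires well below full capacity gives responders time to scale the cluster before writes start failing, rather than learning of the problem after saturation.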

Monitoring

January 27, 2026 at 5:15 PM UTC

We have identified and resolved the issue, and we will continue to monitor the platform.

Investigating

January 27, 2026 at 5:05 PM UTC

We’re aware of an issue affecting the legacy (1.0) platform and are actively investigating. The team is fully engaged and working to resolve this as quickly as possible. We’ll share updates as they become available.

GUIDEcx - Latency Issues with 1.0 Platform – Incident details
