Jul 05, 2023 Trading service failure report
1. Impact and timelines
Between 8:20:00am and 10:00:00am UTC on July 5, 2023, OKX trading systems were partially or fully unavailable for some users.
The timeline for this incident is detailed below:
8:20:00am UTC: Local exceptions were thrown in the OKX trading system. Monitoring systems, alarms, and customer feedback immediately alerted the IT and engineering teams. After an urgent investigation, the teams located the system exceptions and discussed repair options;
9:20:00am UTC: The repair plan was finalized, focusing on optimizing the system configuration. The optimization gradually reduced the number of local exceptions;
10:00:00am UTC: Local exceptions were no longer thrown and the OKX trading system returned to normal.
2. Why did this downtime happen?
Servers underlying a core infrastructure component experienced an unexpectedly high transient load due to a restart, which caused the component to fail. As a result, some downstream trading systems were unable to service some requests. The issue was resolved by optimizing the system configuration.
3. What steps are we taking to avoid this in future?
1). Add modules for fault injection testing to identify potential system pitfalls and fix them in a timely manner;
2). Optimize system parameter configuration by simulating system performance under high transient load.
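As an illustration of the fault injection testing mentioned in step 1, the following is a minimal sketch and not a description of OKX's actual tooling. It wraps a stand-in service call so that a configurable fraction of calls fail, then counts how many requests still fail after a simple retry. All names here (FaultInjector, InjectedFault, call_with_retries, place_order, run_experiment) are hypothetical and chosen only for this example.

```python
import random


class InjectedFault(Exception):
    """Raised by the injector in place of a real component failure."""


class FaultInjector:
    """Wraps a callable and makes a configurable fraction of calls fail."""

    def __init__(self, target, failure_rate, seed=0):
        if not 0.0 <= failure_rate <= 1.0:
            raise ValueError("failure_rate must be between 0 and 1")
        self.target = target
        self.failure_rate = failure_rate
        self.rng = random.Random(seed)  # seeded so test runs are repeatable
        self.injected = 0               # number of faults injected so far

    def __call__(self, *args, **kwargs):
        if self.rng.random() < self.failure_rate:
            self.injected += 1
            raise InjectedFault("simulated component failure")
        return self.target(*args, **kwargs)


def call_with_retries(func, attempts, *args, **kwargs):
    """Call func up to `attempts` times, re-raising the last injected fault."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func(*args, **kwargs)
        except InjectedFault as exc:
            last_error = exc
    raise last_error


def place_order(order_id):
    # Hypothetical stand-in for a downstream trading request.
    return f"accepted:{order_id}"


def run_experiment(failure_rate, attempts, requests=1000):
    """Return (requests that failed after retries, total faults injected)."""
    flaky = FaultInjector(place_order, failure_rate)
    failed = 0
    for order_id in range(requests):
        try:
            call_with_retries(flaky, attempts, order_id)
        except InjectedFault:
            failed += 1
    return failed, flaky.injected
```

Comparing run_experiment(0.5, 1) with run_experiment(0.5, 3) shows how many user-visible failures a given retry setting absorbs at a given fault rate. This is the kind of question such tests are meant to answer before a real component failure occurs.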
4. Our commitment to you
OKX is dedicated to providing an ultra-reliable, high-performance, and feature-rich platform for our valued customers. To this end, we continually improve system performance, stability, and features. Given the complexity and challenges of operating high-performance systems 24 hours a day, 365 days a year, unexpected issues may occasionally arise.
We know that timely communication is critically important to our clients, and we believe that transparency is integral to building trust. In the event of any issue, we will notify clients as quickly as possible through Telegram community channels, the Status API, and the Status page.
July 7, 2023