Massive Cloudflare outage was triggered by file that suddenly doubled in size
SMRTR summary
A corrupted machine learning configuration file that suddenly doubled in size, growing past a 200-feature limit, triggered Cloudflare's worst outage since 2019 and caused widespread internet disruptions. A database system that was in the middle of an update regenerated the file every five minutes, and each run could produce either a good or a bad version, so the network failed and recovered in unpredictable cycles. Cloudflare resolved the crisis by manually inserting a working file and restarting its systems, and it is implementing new safeguards.
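As a rough illustration of the kind of safeguard described, here is a minimal sketch of a loader that checks a regenerated feature file against a size limit and keeps the last known good version when the check fails. This is not Cloudflare's code: only the 200-feature figure comes from the summary, and the names `FeatureStore`, `validate_features`, and `FeatureFileError` are hypothetical.

```python
from typing import List, Optional

# The 200-feature figure comes from the summary; everything else is an assumption.
MAX_FEATURES = 200


class FeatureFileError(ValueError):
    """Raised when a candidate feature list fails validation."""


def validate_features(features: List[str], limit: int = MAX_FEATURES) -> List[str]:
    """Return the features unchanged if they fit within the limit, else raise."""
    if len(features) > limit:
        raise FeatureFileError(
            f"{len(features)} features exceeds the limit of {limit}"
        )
    return features


class FeatureStore:
    """Holds the active feature list and rejects oversized updates."""

    def __init__(self, initial: List[str]) -> None:
        self.active = validate_features(initial)
        self.last_error: Optional[str] = None

    def try_update(self, candidate: List[str]) -> bool:
        """Swap in the candidate if valid; otherwise keep the last known good list."""
        try:
            self.active = validate_features(candidate)
        except FeatureFileError as exc:
            # Record the problem and keep serving with the previous list.
            self.last_error = str(exc)
            return False
        self.last_error = None
        return True
```

In this sketch, a periodic job would call `try_update` with each regenerated list. An oversized file is recorded in `last_error` and ignored, so the previously loaded list stays active instead of the process failing.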
SMRTR provides this summary for quick context. The original article belongs to Ars Technica.