Fastly Outage: How one single customer caused a global internet outage

A blog post published last night by Nick Rockwell, Fastly’s SVP of engineering and infrastructure, gives a more detailed account of what went wrong.


After a significant chunk of the internet took an hour-long nap on Tuesday, all the focus has been on Fastly, an American content delivery network (CDN) provider. The company has now detailed the cause of the outage: apparently, it took a single customer and a host of ‘favourable circumstances’ to bring it about.

The outage, which had the world talking, was traced back to Fastly, which is one of the world’s top four CDN providers alongside Cloudflare, Akamai and Amazon’s CloudFront.

By design, CDNs are expected to keep systems running even if a server fails or goes down, but Fastly’s initial response was that the issue was caused by an error in ‘configuration settings.’

But a blog post published last night by Nick Rockwell, the company’s SVP of engineering and infrastructure, has given a more detailed insight into what went wrong.

Rockwell said that the bug crept into the system through code that was introduced on May 12. The bug then lay dormant, as it could only be triggered by a ‘specific customer configuration under specific circumstances.’

“The outage was caused on June 8 after a customer pushed a valid configuration change that included the specific circumstances that triggered the bug, which caused 85% of our network to return errors,” Rockwell said in his blog post.

“The disruption was detected within a minute. It was then identified and we isolated the cause and disabled the configuration. Within 49 minutes, 95% of our network was operating as normal,” he added.
Rockwell also apologized to customers and thanked the community for their support.

Some customers, such as the gov.uk website, have backup contracts with other CDN providers so that they can switch to those networks in case of an outage.

The monetary ramifications of the outage aren’t clear yet, though some reports peg the cost of an hour’s downtime at $250,000 (approximately Rs 1.8 crore). Research by SEO agency Reboot estimated that a similar outage would have cost Amazon almost Rs 5 lakh per second until its systems were back online.

Why are CDNs used?

CDNs provide several servers across the world that act as mirrors of the original server and hold cached copies of its data. So, when a user wants to visit a site, the request is sent to the server nearest to the user and not to the main server. This helps in two ways: first, it reduces the time taken for the webpage to load, and second, it avoids overloading the main server, as the traffic is distributed across multiple servers.
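
For readers who want to see the idea in code, here is a minimal Python sketch of the behaviour described above. It is an illustration only and not how Fastly or any other CDN is actually built: the names `Origin`, `Edge` and `nearest_edge`, the city coordinates and the page contents are all made up for this example, and real CDNs typically pick a server using network-level routing rather than map distance.

```python
import math
from dataclasses import dataclass, field


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points, in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * math.asin(math.sqrt(h))


class Origin:
    """Hypothetical 'main server' that counts how often it is contacted."""

    def __init__(self, pages):
        self.pages = pages
        self.hits = 0

    def fetch(self, path):
        self.hits += 1
        return self.pages[path]


@dataclass
class Edge:
    """Hypothetical CDN server that keeps cached copies of pages."""
    name: str
    location: tuple  # (latitude, longitude), illustrative values only
    cache: dict = field(default_factory=dict)

    def serve(self, path, origin):
        # Contact the main server only if this page is not cached yet.
        if path not in self.cache:
            self.cache[path] = origin.fetch(path)
        return self.cache[path]


def nearest_edge(edges, user_location):
    """Pick the edge server geographically closest to the user."""
    return min(edges, key=lambda e: distance_km(e.location, user_location))


origin = Origin({"/index.html": "<h1>Hello</h1>"})
edges = [
    Edge("london", (51.5, -0.1)),
    Edge("mumbai", (19.1, 72.9)),
    Edge("new-york", (40.7, -74.0)),
]

delhi_user = (28.6, 77.2)
edge = nearest_edge(edges, delhi_user)

# Three visits from the same region reach the main server only once.
for _ in range(3):
    page = edge.serve("/index.html", origin)
```

In this sketch, the user in Delhi is routed to the Mumbai server, and only the first of the three requests reaches the main server. The remaining two are answered from the cached copy, which is the load-spreading effect described above.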