Quicktake: Why did AWS crash again and will there be more cuts?

Seattle-based company competes with the likes of Microsoft Azure, Google Cloud, Oracle Cloud and others

An expo hall at AWS's re:Invent conference in Las Vegas on December 1. AP

Amazon Web Services, the backbone of many websites and apps, suffered a major disruption in the US and some other parts of the world on Wednesday, affecting millions of online platforms.

The latest failure hit AWS just a week after the company suffered a major outage that knocked hundreds of websites offline in one of the worst breakdowns the company has faced.

The National looks at the possible snags causing the frequent blackouts and explores a potential solution.

What is AWS?

AWS is one of the world’s most comprehensive and broadly adopted cloud platforms, offering more than 200 services from data centres globally. It stores its customers’ data, runs their online operations and helps them to lower overall business costs, become more agile and innovate faster.

AWS – which competes with the likes of Microsoft Azure, Google Cloud, Oracle Cloud and others – has 84 data centres in 26 locations around the world. It plans to launch 24 more data centres in Australia, India, Canada, Israel, Spain, Switzerland, New Zealand and the UAE in the coming months.

What caused the outage?

The disruption, which lasted more than 90 minutes, was a result of a technical error that sent huge loads of data to the core network. It is still not clear whether it was a human mistake or a technology glitch.

The issue was caused by “network congestion” between parts of the AWS platform and internet service providers.

“Traffic engineering incorrectly moved more traffic than expected to parts of the AWS backbone that affected connectivity to a subset of internet destinations,” AWS said.

The company said the issue has been resolved and it does not expect a recurrence.

What led to AWS failing?

In the past, industry experts cautioned against the presence of only a handful of companies in the burgeoning cloud market.

The latest AWS crash is a “prime example” of the threat of centralised network infrastructure, said Sean O’Brien, a visiting lecturer in cyber security at Yale Law School.

Businesses need to adopt multi-cloud strategies – using services across different cloud computing providers – and reduce their reliance on a single service provider, so that one outage does not shut down all their operations at once.

Forrester analyst Brent Ellis said a multi-cloud strategy will help companies side-step big web outages.

“It’s a decision large enterprises have to make or they’ll inevitably be in a situation where they’re down for several hours,” he told Bloomberg.
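The idea behind a multi-cloud strategy can be illustrated with a minimal sketch. It assumes a business can serve the same content from two independent providers. The provider names and the functions `primary_cloud`, `secondary_cloud` and `fetch_with_failover` are hypothetical and are not part of any AWS or other vendor's software. The outage is simulated, so no network access is involved.

```python
from typing import Callable, List, Sequence, Tuple


class AllProvidersDown(Exception):
    """Raised when every configured provider fails."""


def fetch_with_failover(
    providers: Sequence[Tuple[str, Callable[[], str]]]
) -> Tuple[str, str]:
    """Try each provider in order and return (provider name, content)
    from the first one that responds without raising an error."""
    errors: List[str] = []
    for name, fetch in providers:
        try:
            return name, fetch()
        except Exception as exc:  # any failure moves us on to the next provider
            errors.append(f"{name}: {exc}")
    raise AllProvidersDown("; ".join(errors))


def primary_cloud() -> str:
    # Hypothetical stand-in for a provider that is suffering an outage.
    raise ConnectionError("simulated outage")


def secondary_cloud() -> str:
    # Hypothetical stand-in for a second, independent provider.
    return "page content"


if __name__ == "__main__":
    served_by, body = fetch_with_failover(
        [("primary", primary_cloud), ("secondary", secondary_cloud)]
    )
    print(served_by, body)
```

In this sketch the request falls through to the second provider when the first is unavailable. Real deployments would also have to keep data and configuration in sync across providers, which is where much of the cost of a multi-cloud approach lies.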

How big is AWS?

AWS posted more than 38 per cent annual growth in revenue in the third quarter that ended on September 30. It generated $16.1bn, accounting for almost 15 per cent of parent company Amazon’s overall sales in the three-month period.

In the last financial year, the cloud computing platform generated sales worth $45.4bn, almost 12 per cent of the company's total revenue of more than $386bn.
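As a rough check on these shares, the short calculation below uses only the figures quoted above. The helper names `share_percent` and `implied_total` are illustrative. Because the article gives the totals as "more than $386bn" and "almost 15 per cent", the results are approximations rather than exact values.

```python
def share_percent(part_bn: float, total_bn: float) -> float:
    """Return part as a percentage of total."""
    return part_bn / total_bn * 100


def implied_total(part_bn: float, share_pct: float) -> float:
    """Return the total implied by a part and its percentage share."""
    return part_bn / (share_pct / 100)


# Full year: $45.4bn of AWS sales against roughly $386bn for Amazon overall.
annual_share = share_percent(45.4, 386)

# Third quarter: $16.1bn at roughly 15 per cent implies Amazon's quarterly sales.
quarterly_total = implied_total(16.1, 15)

print(round(annual_share, 1))     # about 11.8 per cent, i.e. "almost 12 per cent"
print(round(quarterly_total, 1))  # about 107.3, in billions of dollars
```

Since the quarterly share is slightly below 15 per cent, Amazon's actual sales for the quarter would be somewhat higher than this implied figure.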

It captured about one third of the $152bn cloud services market, securing a bigger cut than Microsoft and Google combined, according to a report from the Synergy Research Group.

Has AWS crashed before?

A major interruption hit Amazon’s cloud services on December 7, temporarily knocking out streaming platforms Netflix and Disney+, trading app Robinhood, a wide range of other apps and Amazon’s e-commerce website. The company said it was probably caused by issues related to the application programming interface, a set of protocols for building and integrating application software.

The Seattle-based company also faced a multi-hour blackout in November last year that affected a large portion of the internet.

There have been outages involving other technology companies as well.

In October, Meta, previously known as Facebook, suffered a nearly six-hour breakdown along with its associated services WhatsApp and Instagram.

In June, American cloud computing services provider Fastly went down, which affected websites including Amazon, Reddit, The New York Times and the UK government website.

Updated: December 17, 2021, 12:04 AM