Rich Miller: The Aftermath of Amazon’s Cloud Outage

More than five days after its outage began, Amazon Web Services has finally restored virtually all of its services. What remains is cleanup for a small number of customer accounts with “stuck” data in its Elastic Block Store (EBS) service. “EBS is now operating normally for all APIs and recovered EBS volumes,” Amazon reports on its status dashboard. “The vast majority of affected volumes have now been recovered. We’re in the process of contacting a limited number of customers who have EBS volumes that have not yet recovered and will continue to work hard on restoring these remaining volumes.” The company promises that a detailed incident report will follow.

What are the lessons and implications of the outage? Discussion continued over the weekend. Here’s a look at some notable links with analysis and commentary:

More of the Data Center Knowledge article from Rich Miller

Alex Carroll

Managing Member at Lifeline Data Centers
Alex, a co-owner, is responsible for all real estate, construction and mission-critical facilities: hardened buildings, power systems, cooling systems, fire suppression, and environmental systems. Alex also manages relationships with the telecommunications providers and has an extensive background in IT infrastructure support, database administration, and software design and development. Alex architected Lifeline’s proprietary GRCA system and is hands-on every day in the data center.