Amazon Cloud Outage a Result of Network Error?

Amazon's rent-a-cloud application platform suffered outages that have since been repaired but that illustrate one of cloud computing's weaknesses: when the provider goes down, so do its customers.


Some of Amazon's U.S. cloud computing customers, apparently those on the East Coast, experienced an outage that lasted several hours on Thursday.

According to published reports, the problems began with an outage at an Amazon (NASDAQ: AMZN) cloud computing facility in northern Virginia that hosts both the Elastic Compute Cloud (EC2) and the Amazon Relational Database Service (RDS).

Among the problems Amazon's customers experienced were instance connectivity failures, system latency issues, and high error rates, with downtime running more than ten hours in some cases.

Amazon's EC2 and related services are a cloud computing platform for hosting third-party websites.

The Amazon Web Services (AWS) Elastic Beanstalk cloud management service also saw some issues, including increased error rates, according to the AWS Service Health Dashboard Thursday afternoon.

Although the outage affected some customers for several hours, Amazon engineers had repaired most, if not all, of the problems by later in the afternoon, the dashboard showed.

In the meantime, though, the outage hit several popular websites, including Quora and Reddit as well as Foursquare, published reports said.

In fact, according to at least one published report, a post to the AWS dashboard by Amazon blamed the problem on a network error that caused a lot of unexpected storage mirroring.

"A networking event early this morning triggered a large amount of re-mirroring of EBS [Elastic Block Store] volumes ... This re-mirroring created a shortage of capacity ... which impacted new EBS volume creation as well as the pace with which we could re-mirror and recover affected EBS volumes," the report said.

The services were mostly restored by mid-afternoon Pacific time, according to posts to the AWS dashboard.

However, not every customer had recovered by then.

"Reddit is in 'emergency read-only mode' right now because Amazon is experiencing a degradation. They are working on it but we are still waiting for them to get to our volumes," said a statement on Reddit's site Thursday afternoon.

Amazon is not the only cloud platform provider that has experienced unplanned outages. For instance, last August, some of Microsoft's cloud computing customers ran into an outage that affected them for two hours or so.

Two calls to an Amazon spokesperson requesting comment were not returned at publication time.

Stuart J. Johnston is a contributing editor at InternetNews.com, the news service of Internet.com, the network for technology professionals. Follow him on Twitter @stuartj1000.
