No Reason to Panic Over Periodic Cloud Outages

Amazon.com became the fourth major site or Internet service to go dark in the past week. The sudden outage, which lasted 15 minutes, meant millions of online consumers couldn’t order “Fifty Shades of Grey” or the latest John Mayer CD.

More importantly, though, this string of service outages is drawing attention to the fragility of the Internet and cloud-based services. While cloud computing is still evolving, it has become an indispensable part of our daily work and personal lives.

Consider what’s happened in the past week.

  • Microsoft’s Outlook.com – the recently rebranded cloud email service – was dark for many users for days. Microsoft has issued an apology to users and has restored service. However, the outage comes as Microsoft is touting the high uptime for Office 365 and other cloud services.
  • The New York Times – the gray old lady and bastion of traditional journalism – was offline for several hours last Wednesday due to technical difficulties. The Washington Post described the scene as people “surging out of their offices in a blind panic” because they couldn’t catch up on the latest news trends.
  • Google – the gateway to all things online – disappeared for about 5 minutes last Friday, taking with it about 40 percent of all Internet traffic. And it wasn’t just search that vanished; so too did most of Google’s Web services, including Google Drive and Google Apps.

Internet and cloud service outages are nothing new, nor will they end anytime soon. If anything, cloud service providers are doing a good job of incrementally improving the quality and reliability of service delivery as they reach for the fabled five-nines (99.999%) reliability rating. If they achieve that, service should be available for all but about 5 minutes per year – roughly the amount of time Google was offline last week.

Here’s the rub: five-nines is not easy. Microsoft is publishing the uptime for Office 365, its cloud-based productivity suite, which stands at 99.97% for the last year. That’s pretty good; it translates into roughly two and a half hours of downtime per year. Amazon, which Gartner has dubbed as having the world’s largest cloud infrastructure, holds a similar uptime rating.
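The arithmetic behind these uptime figures is simple enough to check. Here is a minimal sketch in Python, assuming a 365-day year and treating the percentage as an average over the whole year; the function name is only illustrative.

```python
# Minutes in a non-leap year (assumption: 365 days).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime, in minutes, implied by an availability percentage."""
    return (1 - availability_percent / 100) * MINUTES_PER_YEAR

# The two figures mentioned in the article: five-nines and Office 365's 99.97%.
for pct in (99.999, 99.97):
    minutes = downtime_minutes_per_year(pct)
    print(f"{pct}% uptime allows about {minutes:.1f} minutes "
          f"({minutes / 60:.2f} hours) of downtime per year")
```

Under these assumptions, 99.999% works out to a little over 5 minutes per year, while 99.97% works out to about 2.6 hours.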

And even if a service’s general uptime is high, local experience may differ. Cloud providers such as Amazon, Google, Microsoft and Salesforce.com do not own the transport network that delivers service to the end user. The quality, speed and reliability of the data service providers can always affect user experience on a local and regional level. What someone sees in Miami may be different from what someone sees in San Francisco.
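One way to picture this: when a user reaches a cloud service through a separate network provider, both have to be up at the same time. A rough sketch in Python, assuming the two fail independently and using a made-up 99.9% figure for the local network (the article gives no such number):

```python
def end_to_end_availability(*component_percents: float) -> float:
    """Combine availabilities of components that must all be up at once.

    Assumes failures are independent, which is a simplification.
    """
    result = 1.0
    for pct in component_percents:
        result *= pct / 100
    return result * 100

cloud_service = 99.97   # Office 365 figure cited in the article
local_network = 99.9    # hypothetical figure, for illustration only

combined = end_to_end_availability(cloud_service, local_network)
print(f"End-to-end availability: {combined:.3f}%")
```

With these illustrative inputs, the combined figure is lower than either component alone, which is why two users of the same service can see different reliability.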

In 2011, Google did a study that found only one company had achieved five-nines reliability, and that was AT&T, which built and operated the old PSTN telephone network. The sign of service availability was the dial tone you heard when picking up a phone. No other company has achieved that level of service. That’s not to say it won’t happen; it’s simply a work in progress.

So, yes, outages like those at Amazon.com, Microsoft and Google do disturb consumers and commercial users alike. While cloud computing adoption continues to grow, IT decision-makers harbor concerns about reliability and accessibility. And they’re right to do so. It’s up to vendors and solution providers not to hide behind terms of service and service-level agreements, but to explain the realities of cloud and Internet mechanics.

Besides, what happens when the Internet is down or the cloud isn’t available? Just like the days of old when the lights went out and phone service was disrupted, go read a book (a real paper one).