Talk to anyone invested in the cloud, though, and it doesn’t take long to understand that outages are just one of the costs of doing business in the cloud, and . . . well, so what?
Outages happen with pretty much every service we consume. Apple is enjoying record profits, even as the iPhone 4 drops calls at an alarming rate. Where are the stories questioning the viability of smartphones or the iPhone or Apple?
Outages happen in on-premises data centers everywhere. Where are the stories questioning the viability of in-house IT? (Actually, those stories are out there, but they all ask if cloud computing is making traditional IT obsolete.)
When was the last time your power went out? Did you question the viability of utility-provided electricity?
There’s only so much you can do in an outage – backup generators (or, in the case of the cloud, backed-up data) help, but they don’t solve the problem. Outages are the service provider’s problem, not yours.
With other common failures, however, the customer takes a much more active role in determining success or failure. Here are some of the most common mistakes organizations make as they embrace the cloud.
1. Failing to define “success.”
Too many organizations regard cloud computing as a modern-day cure-all. Having problems with the bottom line? Turn to the cloud. Having trouble keeping remote workers productive? Trust the cloud. Are more of your employees working from home? Hey, maybe the cloud can help.
“Setting unrealistic expectations is the number one reason organizations have trouble with cloud computing,” said Robert Stroud, international VP of ISACA (Information Systems Audit and Control Association), a non-profit IT governance organization, and VP of service management and governance at CA.
“Too many organizations believe that they can put in a request to a cloud provider, and, magically, everything will be working perfectly overnight.”
If you were setting up a new application in house, would you be that naïve? If you don’t set concrete, realistic goals, don’t be surprised when the cloud doesn’t meet your expectations.
2. Failing to update computing concepts.
Early this year, startup Heroku was blindsided by an Amazon EC2 outage. Heroku provides a cloud development platform for Ruby on Rails that is hosted by Amazon. When weather caused an outage, Heroku saw its entire infrastructure disappear, along with the 40,000+ applications running on its platform.
The company had done everything it was supposed to in terms of failover and redundancy. What it hadn’t realized, though, was that everything resided in a single Amazon “availability zone.”
Amazon worked with Heroku to get its platform back online quickly, but this incident shows how out-of-date computing concepts can undermine cloud efforts. Failover, backups and redundancy were easier to visualize in the on-premises computing world. If you backed up off-site, you were in good shape.
If everything is off-site, though, how do you know what level of failover capability you actually have? The whole concept of data being in a specific place is challenged by cloud computing.
“One of the things we’ve learned is that stability in the cloud is complicated,” said Byron Sebastian, CEO of Heroku. “One of the myths about cloud computing is that cloud infrastructure is a complete solution. It’s not. You need add-ons in the cloud as with any other IT system.”
As a result, Heroku has expanded its own platform to offer its customers such services as advanced failover, load balancing and redundancy, all tailored for cloud-hosted applications.
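The single-zone blind spot described above is the kind of thing a simple inventory check can surface. Here is a minimal sketch in Python, assuming you can export a list of your instances and the zone each one runs in. The instance names and zone labels are invented for illustration and do not come from any provider's API.

```python
from collections import Counter

# Hypothetical inventory: instance name -> availability zone.
# These names and zone labels are made up for illustration.
INSTANCES = {
    "web-1": "zone-a",
    "web-2": "zone-a",
    "db-primary": "zone-a",
    "db-replica": "zone-a",
}


def zone_report(instances):
    """Count instances per zone and flag a deployment confined to one zone."""
    counts = Counter(instances.values())
    return {
        "zones": dict(counts),
        "single_zone": len(counts) <= 1,
    }


report = zone_report(INSTANCES)
if report["single_zone"]:
    print("Warning: every instance sits in one zone:", report["zones"])
else:
    print("Instances are spread across zones:", report["zones"])
```

In this made-up inventory, the replica exists, but it lives in the same zone as the primary, so it offers no protection against a zone-wide outage.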
3. Failing to hold service providers accountable.
Heroku was lucky. Amazon immediately reached out to them and helped them solve the problem. Others haven’t been so lucky. Visit the user forums of any major cloud computing platform and you’ll see plenty of venting.
“X provider lost all of my data and won’t do anything about it!” is how these complaints often go (they’re usually in all caps and with many more exclamation points). Some of the rants are obviously from people who screwed up and are looking for someone else to blame. Some are the rants of unbalanced lunatics. Others have the ring of legitimacy.
I’ve talked to plenty of people off the record who complained about service providers, but few will discuss on the record the struggles they’ve had with customer service. (This isn’t unusual for any story, so don’t start imagining a broad cloud conspiracy.) Anecdotally, though, the scales are weighted in the service providers’ favor.
Michele Hudnall, solution marketing manager for BSM at Novell, emailed me to emphasize the importance of well-defined SLAs. According to Hudnall, things you should watch out for are a lack of SLAs, vague SLAs and poor overall service management.
Organizations can easily lose 1-2% of revenues when mission-critical services go down even for a short amount of time. When that happens, it’s important to hold the service provider accountable. This may mean renegotiating your contract to include SLA penalties or seeking remediation.
Gartner recently drafted a list of customer rights that cloud vendors should honor. These included the right to SLAs that address liabilities, remediation and business outcomes; the right to notification and choice about changes that will affect the service consumers’ business processes; and the right to understand the technical limitations of the system up front.
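When you evaluate an SLA, it helps to translate an uptime percentage into the downtime it actually permits. The sketch below is simple arithmetic, not any provider's formula. The uptime targets are example values, and it assumes the SLA is measured over a full 365-day year.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; ignores leap years


def allowed_downtime_hours(uptime_percent):
    """Hours per year a service can be down and still meet the stated uptime."""
    return HOURS_PER_YEAR * (1 - uptime_percent / 100)


# Example targets only; check what your own contract promises.
for target in (99.0, 99.9, 99.99):
    hours = allowed_downtime_hours(target)
    print(f"{target}% uptime allows about {hours:.2f} hours of downtime per year")
```

A 99.9% commitment, for example, still leaves room for roughly 8.76 hours of downtime a year, which is worth weighing against what an hour offline costs your business.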
4. Failing to hold yourself accountable.
Even if you have a solid SLA that has provisions for remediation, that doesn’t mean you are off the hook if something goes wrong.
For instance, what happens if you store sensitive customer data in the cloud and someone breaches it? Do you really think it matters what your SLA says? Who will your customers hold responsible?
You, that’s who.
Earlier this month a security breach at AT&T exposed the email addresses of more than 100,000 iPad users. Most customers blamed Apple, but the problem was with AT&T’s cloud service.
The breach was a minor one. After all, most people’s email addresses have been farmed by spammers many times over. However, had the leaked information been credit card or personal information, Apple would have had a problem that made the iPhone 4 antenna problems seem trivial.
“You can never abdicate responsibility to a service provider,” Stroud said. “The cloud provider may be a custodian of your information, but the reality is that it is your reputation that will suffer if something goes wrong.”
5. Failing to scrutinize vendors.
Pretty much every service provider, hosting company and ISP is rebranding itself as a “cloud provider.” However, not all cloud providers are created equal. While it’s a pretty safe bet that Google, Amazon and IBM will be around in the years to come, you can’t say the same about numerous cloud computing startups.
What happens if your cloud provider fails? The collapse of cloud startup Coghead last year shows just how dangerous skimping on due diligence can be. Coghead wooed customers with low prices. Then, when it ran out of money and failed to raise additional venture capital, it gave its customers a few short weeks to get their data off its systems.
It could have been worse. What happens if your cloud provider shuts down with no notice? What happens if disgruntled employees smuggle servers out the back door after they get their pink slips? What happens if the local sheriff chains up the building under order from a bankruptcy judge?
If you don’t do your due diligence, you might find out.
6. Failing to understand the service supply chain.
Even if your cloud provider is stable, do you know how stable their service providers are? Cloud providers are increasingly outsourcing components of their services to third parties. It’s important to understand the entire service supply chain in order to accurately judge the viability of the service you are signing up for.
If you’re dealing with a large, established cloud provider, at least you have a single neck to choke if something goes wrong, and you can bet that bad press will motivate them to solve the problem. With smaller vendors, you might be on your own.
7. Failing to manage and monitor applications.
Many organizations have made the mistake of believing that management and performance problems disappear when they move to the cloud. “With traditional applications, eighty percent of your time and resources are spent on management and monitoring,” Sebastian of Heroku said. “The cloud puts a big dent in that, but it doesn’t go away entirely.”
If your application performs poorly, your customers won’t blame the cloud provider; they will blame you. “There will be mistakes in your application. There always are,” Sebastian said. “With the proper performance management and monitoring tools in place, you’ll have a better chance of catching those mistakes before they become a disaster.”
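Monitoring does not have to start with an elaborate toolset. As a rough sketch, assuming you already collect response times from some kind of health check, a few lines of Python can flag slow requests before customers complain. The sample timings and the 500-millisecond threshold are invented for illustration.

```python
import statistics

# Hypothetical response times in milliseconds from an application health check.
SAMPLES_MS = [120, 135, 128, 140, 900, 132, 125, 138, 131, 127]

# Assumed alert threshold; a real one should come from your own performance goals.
THRESHOLD_MS = 500


def summarize(samples, threshold):
    """Summarize response times and note whether any exceeded the threshold."""
    slow = [s for s in samples if s > threshold]
    return {
        "median_ms": statistics.median(samples),
        "max_ms": max(samples),
        "slow_count": len(slow),
        "alert": bool(slow),
    }


summary = summarize(SAMPLES_MS, THRESHOLD_MS)
print(summary)
```

The median here looks healthy, which is exactly why the sketch also tracks the worst case: one very slow request is easy to miss if you only watch averages.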
8. Failing to understand financial realities.
Many organizations embrace the cloud because it is sold as being cheaper than in-house IT. That’s often true, but even when cloud services are cheaper, organizations may perceive them as being more expensive.
“We have so little visibility into what we’re paying for various technologies today that it’s easy to get sticker shock,” Stroud of ISACA and CA said. “That’s not the cloud providers’ fault.”
It’s not necessarily your fault either. Financial visibility into IT systems is a tricky matter. Many costs are opaque. Who consumes what? Who pays for what? Who gets to consume how much? For many IT departments, the answer to those questions is fuzzy at best. With the cloud, though, those answers become painfully clear.
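To see why cloud billing makes those answers so clear, consider what a usage-based bill lets you do. The following is a minimal sketch, assuming a bill that itemizes hours by team and a flat hourly rate. The team names, hours and rate are hypothetical, and real pricing is rarely this simple.

```python
from collections import defaultdict

# Hypothetical usage records: (team, instance hours consumed).
USAGE = [
    ("marketing", 120),
    ("engineering", 900),
    ("marketing", 80),
    ("finance", 40),
]

# Assumed flat rate, kept in whole cents to avoid floating-point rounding.
RATE_CENTS_PER_HOUR = 10


def cost_by_team(usage, rate_cents):
    """Total each team's cost in cents from its itemized usage."""
    totals = defaultdict(int)
    for team, hours in usage:
        totals[team] += hours * rate_cents
    return dict(totals)


costs = cost_by_team(USAGE, RATE_CENTS_PER_HOUR)
for team, cents in sorted(costs.items()):
    print(f"{team}: ${cents / 100:.2f}")
```

With in-house systems, producing even this crude a breakdown often means guesswork. With metered services, the raw numbers arrive on the invoice.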
9. Failing to understand the legal complexities of the cloud.
When you outsource computing resources, your business, no matter how small, may have opened itself up to the legal risks of a much bigger company. You may have to comply with laws from different jurisdictions, and you may face different liabilities, depending on where your data resides.
According to Gartner, “Service providers have not done a good job of explaining which jurisdictions they put data in and what legal requirements the service consumer must, therefore, meet. The service consumer needs reassurance that the provider does not violate any country’s rules for which the consumer may be held accountable.”
Complying with industry regulations is also more troublesome. Even if cloud services limit your risk and technically make you more compliant, you may have a more difficult time proving that.
10. Failing to get off the sidelines.
Finally, the biggest reason cloud deployments fail is that they never get started in the first place. Too many organizations fret about issues that are not all that different from the ones they have in their own data centers. Outages, security breaches and compliance are all general IT challenges, not cloud-specific ones.
The vast majority of people I corresponded with for this story advocated cloud computing. Several emailed to say that they’ve seen few, if any, cloud failures.
The truth is that the cloud solves more problems than it creates. The cloud eases your IT management and maintenance headaches and lets you turn your attention away from IT and back to your core business. Failing to understand that is a huge mistake.