Stern said that IT will need to change how it thinks of assets in order to manage the coming multi-cloud infrastructure. Building on a recent blog post and speech, Stern said that infrastructure is now addressed through URLs and APIs and not through read and write commands.
He said it takes a surprising weight of manuals to learn how to use non-cloud infrastructure to create a Hello world program, a simple piece of software that prints the words "Hello, world" to a printer or display. "As complexity increases, interest declines."
In contrast, cloud applications can be built quickly. As a result, sysadmins will need to abandon their favorite equation for measuring reliability.
"In the nineties, we measured reliability as: MTBF/(MTBF + MTTR) where MTBF is mean time between failure and MTTR is mean time to restart," Stern said.
He said that the equation usually yielded a number with a lot of nines in it. But as administrators looked at the equation, they worried about what would happen if all the downtime came at once, or about how the result would be affected if a server went down and stayed down for 50 minutes even just once during the year.
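The arithmetic behind that worry can be worked through in a few lines. The following is a minimal sketch, not anything Stern presented: the function names are hypothetical, and it assumes a 365-day year with exactly one 50-minute outage.

```python
# Illustrative sketch only; names and the one-outage-per-year scenario are assumptions.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def availability(mtbf: float, mttr: float) -> float:
    """Availability as MTBF / (MTBF + MTTR); both values in the same time unit."""
    return mtbf / (mtbf + mttr)


def availability_from_downtime(downtime_minutes: float) -> float:
    """Fraction of the year a service was up, given total downtime in minutes."""
    return (MINUTES_PER_YEAR - downtime_minutes) / MINUTES_PER_YEAR


# One failure in the year that takes 50 minutes to restart:
mttr = 50.0
mtbf = MINUTES_PER_YEAR - mttr  # uptime between failures, in minutes

print(f"Availability: {availability(mtbf, mttr):.4%}")

# Downtime budget allowed by "four nines" (99.99%) over the same year:
four_nines_budget = MINUTES_PER_YEAR * (1 - 0.9999)
print(f"Four-nines budget: {four_nines_budget:.2f} minutes per year")
```

Under these assumptions, a single 50-minute outage consumes almost the entire yearly budget of roughly 52.6 minutes that four nines allows, which is why one long restart weighed so heavily on the number.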
"So we bought hardware to lower the mean time to restart. We bought RAID and SAN and clusters," he said. "But software and deployment are also affecting reliability."
Sysadmins need to measure service performance by tracking a series of KPIs Stern called PIPE. "Sysadmins need to measure predictability, integrity, productivity, and efficiency," he said.
He added that sysadmins will measure the efficiency of the datacenter in terms of throughput per dollar and work per watt.
Sysadmins will not track hardware performance in the cloud. "We do not see the underlying hardware even in the private cloud and we may not know what the hardware is in the public cloud."
Sysadmins no longer worry about replacing hardware. "If you have 10,000 servers, something's always failing," said Stern. "But there's also often a software failure."
He said that instead of focusing on the mean time to reboot a server, sysadmins will focus on the time taken to recognize a problem. They will purge old instances of an application and add more instances as needed.
He said that in the cloud, a poor response time is virtually the same as failure.
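The "recognize and restart" idea, with slow responses treated the same as failures, might be sketched as follows. This is an illustration under stated assumptions, not a description of any tool Stern mentioned: the `Instance` type, the 500 ms threshold, and the function names are all hypothetical.

```python
# Illustrative sketch only; the types, names, and threshold are assumptions.
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    healthy: bool       # did the instance answer its health check at all?
    response_ms: float  # most recent observed response time


def needs_replacement(inst: Instance, slow_threshold_ms: float = 500.0) -> bool:
    """Treat a slow instance exactly like a failed one."""
    return (not inst.healthy) or inst.response_ms > slow_threshold_ms


def plan_restarts(instances, slow_threshold_ms: float = 500.0):
    """Split a fleet into names to purge (and replace) and names to keep."""
    purge = [i.name for i in instances if needs_replacement(i, slow_threshold_ms)]
    keep = [i.name for i in instances if not needs_replacement(i, slow_threshold_ms)]
    return purge, keep


fleet = [
    Instance("app-1", healthy=True, response_ms=120.0),
    Instance("app-2", healthy=True, response_ms=2300.0),  # up, but too slow
    Instance("app-3", healthy=False, response_ms=0.0),    # not responding
]

purge, keep = plan_restarts(fleet)
print("purge and replace:", purge)
print("keep:", keep)
```

In this sketch the slow instance and the dead instance land in the same list, so the replacement decision does not depend on which kind of problem was recognized.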
He added that although this change in job function won't be easy, there's a massive ecosystem of competing cloud providers who are eager to help.
"Remember: think 'recognize and restart' and not nines in terms of reliability," he concluded.
Article courtesy of InternetNews.com.