In this article, we’ll explore cloud computing from the computing and storage angle, and review some of the most important offerings in the various cloud models.
Let’s begin with a definition of cloud computing, and then explore how it’s being applied today.
Cloud computing is really nothing more than an abstraction of resources: it separates them from the user so that they can be shared dynamically and scalably.
Users pay for what they use, when they use it, making it a very cost-effective service offering. Cloud computing’s actual efficiency depends upon the use model, but for many usages, it’s difficult to compete with the dynamic and scalable capabilities of the cloud. So at its core, the cloud is about delivery of IT as a service. This concept is fundamentally changing the economics of data centers.
Let’s now explore cloud computing in depth: what it means, how it’s deployed, and what issues exist for this new trend.
Cloud Computing: Why now?
While clouds could have existed in the past, a confluence of technologies has arrived that makes the cloud not only viable, but efficient and productive.
Two technologies in particular make the cloud possible: virtualization (for IaaS and PaaS) and commodity hardware (for all of the models). Additionally, inexpensive hardware that includes efficient support for virtualization makes this even more attractive.
But the real key for clouds today is the hypervisor as a commodity and its ability to support commodity hardware. While IBM pioneered virtualization on big iron hardware, VMware and open-source hypervisors such as Xen and KVM have commoditized the hypervisor such that it can support low-end servers.
These innovations have created additional capabilities that make the cloud ecosystem (virtualized infrastructure) even more compelling. At the highest level is the hypervisor itself, which is fundamentally an operating system. The hypervisor carves a server up and transparently shares it among a number of operating systems, each of which sees its own instance of the server (with fewer and abstracted resources).
The next capability is the virtual machine, or VM, which bundles an operating system and application set together (as a file, from the perspective of the hypervisor), making it easy to manage and provision new OS/application instances.
Other capabilities include virtual networking, which provides the means to efficiently tie VMs together without physical networking, and live VM migration, which makes it possible to move an operating system and its applications between servers for reliability, availability, and serviceability (RAS) and for load balancing. Figure 1 provides a high-level illustration of these concepts.
Figure 1: Innovations that Make the Cloud Possible
So a cloud is nothing more than a highly virtualized infrastructure (commonly with commodity hardware and storage), along with a collection of tools that enable simple administration and metering of infrastructure use, so that it can be shared efficiently among diverse users with varying SLAs.
Being amorphous, cloud computing supports a variety of deployment models over a variety of resources. By deployment model, I mean how and where the service is deployed, and by resource, I mean what is being delivered to you as a cloud-based resource.
In this section, we’ll explore both of these to understand the how and what of cloud computing.
The cloud deployment model focuses on where the “cloud” exists, but from the user’s perspective, it’s still a remote resource for all intents and purposes. There are two primary deployment models, with two additional variations on those two themes. We’ll explore some of the players for these models in a later section.
The first is the public model (see Figure 2), which was the most prevalent in the cloud’s resurgence. In this model, a third party provides the cloud for remote users over the Internet. A user specifies the scope of the resources needed, and the cloud provider meters and bills the user based upon their actual usage.
The public model has numerous advantages, primarily in terms of cost. The cloud provider purchases the resources from vendors and then leases them to users over time, which translates to capital expense by the cloud provider and operational expense to the user.
The user is also not required to manage the resources, but instead relies on the cloud provider for this function. This can be an advantage or a disadvantage, but it is a key element of the decision to use the public model.
Figure 2: The Public and Private Cloud Models.
While the public model has numerous advantages, particularly in initial expense, it has disadvantages in terms of privacy. In the public model (see Figure 2), your applications and data reside on third-party hardware, likely side by side with applications and data from other users.
To alleviate this concern (and others), the private model arose. In the private model, the user purchases the resources to manage internally (privately). These resources are carved up in the same fashion as in the public model, to support resource sharing among a number of private (internal) users.
So in addition to the capital expense of the hardware, the private model must also be privately managed. This has a downside, but also an upside as private resources can be managed to SLAs required of the resource (instead of those advertised by a public cloud provider).
To exploit the advantages of both the public and private cloud models, the hybrid model was created (see Figure 3). This variation applies both the public and private models to take advantage of the privacy and additional security of the private model, while exploiting the potential cost savings of the public model for data and applications that have lesser security concerns.
Another perspective on this model views hybrid as an overflow model. Here, private resources are used first, and when they are found to be insufficient (which can be a transient occurrence), public resources are exploited as a reserve. This model may not be applicable to all use cases, but depending upon the application, it can provide distinct advantages. From the perspective of the user, the hybrid is a transparent merging of private and public resources, with their use defined by user SLAs.
Figure 3: The Hybrid Cloud Computing Model.
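To make the overflow view of the hybrid model concrete, here is a minimal Python sketch. It assumes capacity can be counted in identical units (VM slots, say); the Pool class, the place function, and the capacities are hypothetical names and numbers chosen for illustration, not part of any real cloud API.

```python
from dataclasses import dataclass


@dataclass
class Pool:
    """A pool of identical capacity units (for example, VM slots)."""
    name: str
    capacity: int
    used: int = 0

    def free(self) -> int:
        return self.capacity - self.used


def place(demand: int, private: Pool, public: Pool) -> dict:
    """Place `demand` units, filling private capacity before overflowing to public."""
    to_private = min(demand, private.free())
    to_public = min(demand - to_private, public.free())
    private.used += to_private
    public.used += to_public
    return {
        "private": to_private,
        "public": to_public,
        "unplaced": demand - to_private - to_public,
    }


# Hypothetical capacities: a small private cloud backed by a larger public one.
private_pool = Pool("private", capacity=10)
public_pool = Pool("public", capacity=100)

print(place(8, private_pool, public_pool))  # fits entirely in the private pool
print(place(5, private_pool, public_pool))  # 2 units stay private, 3 overflow
```

A real placement policy would also weigh the SLA and security requirements of each workload; the sketch shows only the capacity-driven overflow.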
Finally, to resolve issues in privacy for public cloud models, the community model was created (see Figure 4). This variation restricts usage of resources to a set of users (who may have strategic relationships and therefore fewer issues with data sharing).
Recall that in a cloud computing infrastructure, a single virtualized server may support numerous virtual machines owned by different users. Therefore, competitors may unknowingly share resources, which may not work for the desired SLA. The community model restricts resources to a set of users, and therefore removes competitive sharing amongst those resources.
Figure 4: The Community Cloud Model.
Other variations are of course possible. One example is a community-based hybrid cloud, in which the private virtualized resources of a community of users are shared publicly as they become available. This variation is ideal, for example, when the community spans different time zones, which creates new opportunities for resource sharing.
Cloud Computing Services (Resources)
In addition to various models for resource management, there are various perspectives on what resource means in this context. The “cloud” has traditionally been defined by stacks which provide some level of capability to the user, though this classification doesn’t necessarily cover all use models in play today. But for the purposes of consistency, the stack model is used here.
Software as a Service
Software as a service (SaaS) is a popular model for software management, in some ways hijacked under the cloud umbrella. A number of companies built SaaS before the cloud was popularized (under the name “Application Service Provider”), in particular, salesforce.com.
SaaS provides a way to deploy software in a pay-as-you-use model. Rather than charging for software on a purchase basis, SaaS distributes software to end users as it’s needed and therefore provides a dynamic model for software management.
In many cases, licensing software in this way can be very cost-efficient, particularly high-end applications, but this certainly doesn’t apply to all software. But where it does apply, it has enabled smaller companies to use software that was commonly restricted to very large companies. Therefore, the deployment model has enabled wider application usage, which benefits both users and application providers.
Platform as a Service
Platform as a Service implements specialized stacks for applications. The key value behind this service is that not only is the hardware independently managed, but so also is the platform on which applications are developed.
Consider the LAMP stack as one example. LAMP requires a significant amount of configuration outside of the simplest cases. Using PaaS, the stack is preconfigured on a server (whether physical or virtualized) and the user adds their value in the specific application (see Figure 5).
Figure 5: Platform as a Service.
The real power in PaaS comes with the more specialized stacks that permit simple integration of user code to develop solid Web computing applications.
Infrastructure as a Service
IaaS is about the lease of infrastructure within a shared virtualized data center (see Figure 6). This can be as simple as a single virtual server (logical CPU within a physical server), or as complex as servers, storage capacity and bandwidth, networking bandwidth, and overall network services combined to support the infrastructure (this is sometimes referred to as Data Center as a Service).
Recall that virtualization and related virtualization technologies permit this style of service, and make it simple enough to manage with a limited team. Compare the alternative, where physical equipment is required to be configured and cabled to support this type of service.
Figure 6: Infrastructure as a Service.
This service provides the base hardware layer for one or more applications, and therefore serves as the base onto which the user applies operating system and application sets.
As we discussed before, virtualization makes this process both dynamic and simple with the concept of the virtual machine. In fact, many IaaS vendors provide VM templates which can be used within their architectures, some of which are generic, while others focus on a particular application (web server, database, etc.).
Storage as a Service
Storage as a service, while not a new concept, is finding wide use in its second iteration as a technology.
Storage as a service, or Cloud Computing Storage, is a service that extends persistent storage to users over the Internet at a reasonable cost. What is actually being purchased is both physical storage (within a larger virtualized storage infrastructure) and networking bandwidth to both write and read the stored data. For this reason, cloud storage is commonly priced in terms of $/GB/month (use of physical storage over time) as well as the networking bandwidth (and transactions) required to satisfy the user requests. Users can also optionally pay for SLAs which define the protection given to storage in the cloud.
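The pricing structure described above can be sketched as a small calculation. The function below itemizes a monthly bill from the three components mentioned: capacity held over time, bandwidth, and transactions. The function name and every price in it are hypothetical, chosen only to make the arithmetic easy to follow; real providers publish their own rates and tiers.

```python
def monthly_storage_cost(stored_gb, transfer_gb, requests, prices):
    """Return an itemized monthly bill in dollars, rounded to cents."""
    bill = {
        "capacity": stored_gb * prices["per_gb_month"],
        "bandwidth": transfer_gb * prices["per_gb_transferred"],
        "transactions": requests / 1000 * prices["per_1000_requests"],
    }
    bill["total"] = sum(bill.values())
    return {item: round(amount, 2) for item, amount in bill.items()}


# Hypothetical prices (assumptions, not any provider's actual rates).
EXAMPLE_PRICES = {
    "per_gb_month": 0.10,        # $/GB/month of stored data
    "per_gb_transferred": 0.15,  # $/GB read or written over the network
    "per_1000_requests": 0.01,   # $ per 1,000 read/write transactions
}

bill = monthly_storage_cost(
    stored_gb=500, transfer_gb=50, requests=200_000, prices=EXAMPLE_PRICES
)
for item, amount in bill.items():
    print(f"{item:>12}: ${amount:.2f}")
```

With these made-up prices, 500 GB stored, 50 GB transferred, and 200,000 requests come to $50.00 + $7.50 + $2.00 = $59.50 for the month. Any optional SLA charges would be added on top.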
What’s not immediately apparent in cloud storage is that it also has the advantage of geographic distribution of data. That’s not to say that what’s being purchased is disaster recovery; rather, much of cloud storage is devoted to personal backup data or copies of enterprise data.
Whether talking about managing a cloud compute environment, or reading and writing data into a cloud storage instance, a common protocol model exists. While these examples can be classified as web services, the ReST model is a common way to build the management and I/O protocols. ReST stands for Representational State Transfer, and is an architecture for distributed communication built over HTTP. The ReST architecture transforms HTTP into a powerful and scalable protocol for moving data and managing resources.
ReST is used in a number of cloud storage protocols, and also for managing cloud computing infrastructures. You’ll also find SOAP (or Simple Object Access Protocol) as a management protocol to implement remote procedure calls over the Internet.
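As a rough illustration of the ReST style, the sketch below maps the four basic management operations on a resource (create, read, update, delete) onto HTTP methods and URLs. It only builds the requests with Python’s standard library and never sends them. The endpoint https://cloud.example.com/api, the /servers path, and the JSON fields are all hypothetical and do not belong to any real cloud API.

```python
import json
import urllib.request

BASE_URL = "https://cloud.example.com/api"  # hypothetical endpoint


def build_request(method, path, body=None):
    """Build (but do not send) an HTTP request for a ReST-style resource."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return urllib.request.Request(
        BASE_URL + path, data=data, headers=headers, method=method
    )


# The resource is named by its URL; the HTTP method says what to do with it.
examples = [
    build_request("POST", "/servers", {"name": "web-1", "size": "small"}),  # create
    build_request("GET", "/servers/web-1"),                                 # read
    build_request("PUT", "/servers/web-1", {"size": "large"}),              # update
    build_request("DELETE", "/servers/web-1"),                              # delete
]

for req in examples:
    print(req.get_method(), req.full_url)
```

The same pattern covers storage I/O: an object is written with PUT, read with GET, and removed with DELETE against the URL that names it.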
A number of efforts are underway to define standard APIs for clouds. While not open, VMware’s vCloud API appears promising for IaaS. vCloud supports hybrid clouds, through which VMs can transparently migrate between sites containing virtualized infrastructure.
Much of what we’ve covered thus far has defined some of the theory around cloud computing and storage. Let’s now explore some examples of the various cloud deployment models that exist today.
SaaS (Software as a Service)
Software as a service, once known as the application service provider model, exists in a number of forms. Two of the leaders in this space are Salesforce and Oracle. Salesforce (force.com) offers a SaaS environment with a large number of business applications as well as the ability to develop and host custom apps within its data center infrastructure. Force.com provides a software stack to simplify the development process, including point-and-click development, in addition to support for more complex applications. Oracle is also well established within the SaaS space, providing both cloud and on-premise solutions.
PaaS (Platform as a Service)
Platform as a Service provides a more specialized environment that may be restricted by language or development environment. One prime example in the PaaS space is Google’s App Engine, which permits the development and hosting of web-based applications in the Python or Java languages. App Engine provides a number of APIs to simplify complex tasks, along with automatic scaling and load balancing to support the dynamic use models that are possible with the Web (such as the ‘Slashdot effect’).
Another useful PaaS solution is Engine Yard. Engine Yard focuses on applications developed using the Ruby on Rails application environment, and permits hosting within Engine Yard’s private cloud or within Amazon’s EC2 (discussed shortly). Ruby on Rails is an open source web framework developed with the Ruby language.
IaaS (Infrastructure as a Service)
Infrastructure as a Service is one of the most interesting and includes a number of alternatives. The leader in this space is Amazon with their Elastic Compute Cloud (EC2). EC2 is Amazon’s shared compute infrastructure and provides a simple interface to their cloud. Within EC2, a user can host VMs that can scale to the desired capacity over a configurable set of resources (low end to high end CPUs, scalable memory, scalable network, and storage). Amazon even supports cluster compute instances, which enable HPC applications in the cloud.
Rackspace Cloud Servers is another IaaS solution which competes with EC2. Rackspace focuses on simplicity, making it very easy to get a VM up and running within their infrastructure. As with Amazon, you can easily scale your infrastructure over a set of servers that vary in size and performance (using either Linux or Windows VMs).
Each of these IaaS solutions operates in a pay-as-you-go model (both examples here rely on the Xen hypervisor). When your servers are running, you’re charged an hourly fee. If desired, your compute capacity can scale with demand, making it very simple, and in many cases, very cost efficient.
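The cost effect of hourly billing combined with scaling can be shown with a little arithmetic. The sketch below compares paying only for the servers needed in each hour against keeping enough servers running all day to cover the peak. The load profile, the per-server capacity, and the hourly rate are hypothetical figures, not any vendor’s actual pricing.

```python
def servers_needed(load, capacity_per_server):
    """Smallest number of servers that can carry `load` (at least one)."""
    return max(1, -(-load // capacity_per_server))  # ceiling division


def elastic_cost_cents(hourly_load, capacity_per_server, rate_cents):
    """Pay for exactly the servers running in each hour."""
    hours = sum(servers_needed(load, capacity_per_server) for load in hourly_load)
    return hours * rate_cents


def peak_cost_cents(hourly_load, capacity_per_server, rate_cents):
    """Keep enough servers for the busiest hour running the whole time."""
    peak = max(servers_needed(load, capacity_per_server) for load in hourly_load)
    return peak * len(hourly_load) * rate_cents


# Hypothetical day: 8 quiet hours, 8 moderate hours, 8 busy hours (requests/s).
hourly_load = [100] * 8 + [400] * 8 + [900] * 8
CAPACITY_PER_SERVER = 250  # requests/s one server can handle (assumed)
RATE_CENTS = 10            # cents per server-hour (assumed)

elastic = elastic_cost_cents(hourly_load, CAPACITY_PER_SERVER, RATE_CENTS)
fixed = peak_cost_cents(hourly_load, CAPACITY_PER_SERVER, RATE_CENTS)
print(f"scaled with demand: ${elastic / 100:.2f}")
print(f"sized for the peak: ${fixed / 100:.2f}")
```

For this made-up day, scaling with demand uses 56 server-hours ($5.60), while sizing for the peak uses 96 ($9.60). The gap shrinks as load becomes steadier, which is one reason the savings depend on the use model.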
It’s also possible to build a private compute cloud that implements Amazon’s EC2 interface. Eucalyptus is an open source project that allows you to build private clouds, and also hybrid clouds that interface to Amazon for the public portion.
DSaaS (Data Storage as a Service)
Data Storage as a Service is the last cloud based service model that we’ll explore here. There exist a number of options in this storage space, each with their own advantages.
Mozy is one of the most popular in the consumer space. Rather than focus on generic storage, Mozy provides remote storage with a backup application that automatically copies data into their public cloud for storage. Within their infrastructure, Mozy implements encryption to protect your data.
One of the best-known cloud storage providers is Amazon, with their Simple Storage Service offering (called Amazon S3). Amazon provides an API in a variety of languages that permits programmatic access to their public storage cloud (using a ReST or SOAP API). A unique aspect of their storage is that access occurs through HTTP, such that stored objects can be accessed as HTTP URLs like any other web file (assuming a user is authorized to view the files).
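Because stored objects are addressed as URLs, locating one is largely a matter of string construction. The sketch below builds a URL in the bucket-as-hostname form that S3 has used, with the object key as the path. The bucket and key are hypothetical, and the default host is an assumption to check against the provider’s current documentation, since endpoints can vary by region.

```python
from urllib.parse import quote


def object_url(bucket, key, host="s3.amazonaws.com"):
    """Build an HTTP URL for a stored object (bucket as subdomain, key as path)."""
    # quote() percent-encodes characters such as spaces but keeps "/" intact,
    # so keys that look like directory paths stay readable.
    return f"https://{bucket}.{host}/{quote(key)}"


# Hypothetical bucket and key, for illustration only.
url = object_url("example-bucket", "reports/2010 summary.pdf")
print(url)  # https://example-bucket.s3.amazonaws.com/reports/2010%20summary.pdf
```

An authorized client could then fetch the object with an ordinary HTTP GET on that URL, just as it would any other web file.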
The Nirvanix Storage Delivery Network is another example of a public storage cloud. Nirvanix implements an interesting set of features. In addition to implementing a public storage cloud, Nirvanix also provides a hybrid solution, permitting a fully managed private and public cloud storage solution.
Finally, EMC’s Atmos is a deployable cloud storage solution which may be used as private cloud storage, or to develop a public or hybrid cloud storage offering. Atmos was designed for massive scalability on a global scale, while providing cost effective storage for unstructured data.
Cloud Computing Summary
As the cloud grows, so do the options available to enable scalable compute and storage over the range of deployment models. But the cloud remains a solution for a specific domain of problems. While consumers have embraced the cloud, enterprise adoption is still restricted to the SaaS and PaaS models. Reluctance to put one’s compute infrastructure or private data in the cloud remains its Achilles heel. But as with any problem, if it’s worth solving, it will be solved, opening the gate to a broader set of applications in both the consumer and enterprise space. While momentum began building only in the last few years, the cloud is now viewed in its various embodiments as a powerful platform for scalable computing and storage.
Cloud Computing Resources
Representational State Transfer
VMware’s vCloud API
Salesforce.com Software as a Service
Google App Engine
Engine Yard Platform as a Service
Ruby on Rails Web Application Environment
Amazon Elastic Compute Cloud (Infrastructure as a Service)
Rackspace Cloud Server (Infrastructure as a Service)
Eucalyptus Private Cloud Software (Enterprise Edition)
Eucalyptus Open Source Cloud Platform
Mozy Online Backup
Amazon Simple Storage Service
Nirvanix Storage Delivery Network
EMC Atmos Cloud Storage
About the Author
M. Tim Jones is a firmware and product architect and the author of Artificial Intelligence: A Systems Approach, GNU/Linux Application Programming (now in its second edition), AI Application Programming (in its second edition), and BSD Sockets Programming from a Multilanguage Perspective. His background ranges from the development of software for geosynchronous satellites to the architecture and development of storage and virtualization solutions.