Storage virtualization may seem too big or expensive for small and medium-sized infrastructures. In fact, companies of many sizes can benefit from it without breaking the bank, using either commodity hardware or traditional virtualized storage engines.
In this piece, I’ll explain what storage virtualization is and how to determine whether it is right for you.
What is Storage Virtualization?
Simply put, virtualized storage abstracts data from the disk it lives on. In traditional storage deployments, we fixate on a particular drive letter (on Windows systems) or logical unit number (LUN), sized to a set capacity and assigned a specific RAID level on a specific tier of disk.
The first instance of virtualizing storage may have come from the initial migration to a virtualized server environment. In most situations, this involved implementing some form of shared storage, usually a storage area network (SAN) connected via Fibre Channel or iSCSI.
In this configuration the individual servers are abstracted from the hardware typically associated with a server infrastructure. In terms of the storage, there may or may not be a full abstraction of data from disk. Virtualized storage provides an abstraction between the host and the disk.
The connected system, be it a VMware ESXi host or a Windows Server system, will not know whether the underlying disk is RAID 5, RAID 6 or something else, nor can it interact with that disk directly. The storage processor serves as the storage virtualization engine, coordinating the I/O between the actual disk and the host system.
Storage is virtualized to add features and to allow transparent storage scaling. Among the most coveted features is thin provisioning, which consumes drive space only for the data actually written to a volume. Another feature that storage administrators may want to use is deduplication.
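To make thin provisioning concrete, here is a minimal Python sketch. The `ThinVolume` class, its method names and the 4 KB block size are all assumptions made for illustration; no real storage product exposes this interface. The volume advertises its full size to the host but allocates a block only when that block is first written.

```python
class ThinVolume:
    """Hypothetical thin-provisioned volume: space is consumed only on write."""

    def __init__(self, advertised_blocks, block_size=4096):
        self.advertised_blocks = advertised_blocks
        self.block_size = block_size
        self._blocks = {}  # logical block address -> data, filled lazily

    def _check(self, lba):
        if not 0 <= lba < self.advertised_blocks:
            raise IndexError("block address outside the advertised volume")

    def write(self, lba, data):
        self._check(lba)
        if len(data) > self.block_size:
            raise ValueError("data does not fit in one block")
        # Pad short writes with zero bytes so every stored block is full size.
        self._blocks[lba] = data.ljust(self.block_size, b"\0")

    def read(self, lba):
        self._check(lba)
        # Blocks that were never written read back as zeros and cost nothing.
        return self._blocks.get(lba, bytes(self.block_size))

    def advertised_bytes(self):
        return self.advertised_blocks * self.block_size

    def consumed_bytes(self):
        return len(self._blocks) * self.block_size
```

A volume created with `ThinVolume(1_000_000)` reports about 4 GB to the host, yet after two single-block writes it consumes only two blocks on the back end.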
Deduplication, when applied at the block level, goes through a logical area’s disk usage and looks for identical blocks. Those blocks are linked back to a first instance, and the space held by the duplicates is reclaimed by the storage system.
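Block-level deduplication can be sketched in a few lines of Python. This is an illustration only, assuming blocks are compared by SHA-256 digest; the `deduplicate` function is hypothetical, and real storage systems differ in how they detect and verify duplicates.

```python
import hashlib


def deduplicate(blocks):
    """Store each distinct block once and keep a pointer per logical block."""
    store = {}      # digest -> the first instance of that block
    pointers = []   # one digest per logical block, in original order
    for block in blocks:
        digest = hashlib.sha256(block).digest()
        # setdefault keeps the first instance and ignores later duplicates.
        store.setdefault(digest, block)
        pointers.append(digest)
    return store, pointers


def rebuild(store, pointers):
    """Reconstruct the original block sequence from the deduplicated form."""
    return [store[digest] for digest in pointers]
```

Given four logical blocks of which three are identical, `store` holds two blocks while `pointers` still has four entries, so the volume reads back unchanged.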
A final set of features that may drive administrators toward virtualized storage is volume management: replication, snapshots and migration.
Replication of a volume or LUN from one storage system to another can be a boon to disaster recovery. In fact, solutions such as VMware Site Recovery Manager depend on this replication technology to enable a coordinated failover to another site. Snapshots of a LUN can be very useful as well. A LUN snapshot works much like a virtual machine snapshot, in that an entire data set can be quickly reverted to a designated point in time.
Finally, migration functionality can also save the day for the infrastructure administrator. Within a server virtualization platform, technologies such as VMware’s Storage vMotion can migrate data from one storage system to another. But that does not help much with the non-virtualized segments of storage. SAN-based migration can move a volume from one storage system behind the storage processor to another (such as a different drive tray) so that data can be moved off equipment that needs to be removed.
A principal example of this feature is evacuating an older drive array that uses Ultra-320 SCSI disks in favor of a newer drive array using serial attached SCSI (SAS) drives. This can improve performance and make use of current storage systems. With a virtualized storage environment, the LUN may be able to be migrated from one storage system to the other without any involvement from the connected system. This is because the VMware ESXi host or Windows Server system is connected to the storage processor and not to the underlying storage, hence the level of abstraction.
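The snapshot-and-revert behavior can be illustrated with a small Python sketch. The `SnapshotVolume` class is hypothetical, and it assumes the simplest possible design in which a snapshot is a saved copy of the block map; real arrays use more space-efficient techniques.

```python
class SnapshotVolume:
    """Hypothetical volume whose snapshots are saved copies of the block map."""

    def __init__(self):
        self._live = {}        # logical block address -> data
        self._snapshots = {}   # snapshot name -> frozen copy of the block map

    def write(self, lba, data):
        self._live[lba] = data

    def read(self, lba):
        return self._live.get(lba)

    def snapshot(self, name):
        # Copies only the map; the immutable block data itself is shared.
        self._snapshots[name] = dict(self._live)

    def revert(self, name):
        # Every block returns to its state at snapshot time in one step.
        self._live = dict(self._snapshots[name])
```

Writes made after `snapshot("before-upgrade")` disappear when `revert("before-upgrade")` is called, which is the point-in-time behavior described above.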
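The reason the host never notices the move can be shown in a short Python sketch. The `Array` and `StorageProcessor` classes are hypothetical, and the sketch ignores what a real system must handle, such as I/O that arrives while the copy is in progress.

```python
class Array:
    """Hypothetical back-end drive array holding raw blocks."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # (lun_id, lba) -> data


class StorageProcessor:
    """Hosts address LUNs by ID only; placement is an internal detail."""

    def __init__(self):
        self._placement = {}  # lun_id -> Array currently holding that LUN

    def create_lun(self, lun_id, array):
        self._placement[lun_id] = array

    def write(self, lun_id, lba, data):
        self._placement[lun_id].blocks[(lun_id, lba)] = data

    def read(self, lun_id, lba):
        return self._placement[lun_id].blocks.get((lun_id, lba))

    def migrate(self, lun_id, target):
        source = self._placement[lun_id]
        if source is target:
            return
        # Move every block of this LUN, then repoint the LUN at the new array.
        for key in [k for k in source.blocks if k[0] == lun_id]:
            target.blocks[key] = source.blocks.pop(key)
        self._placement[lun_id] = target
```

The host keeps calling `read(7, 0)` before and after `migrate(7, new_array)`. Only the processor’s placement entry changes, so the old array can be emptied and removed.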
A hidden benefit of virtualized storage is that administrators may be able to address data protection requirements for unstructured data. Consider a few terabytes of storage, which is not that bad at face value. But let’s also say that this collection of data consists of 1 KB files. You will quickly see that this much data is painful, if not nearly impossible, to manage in most file systems.
This can make backing up this type of data a feat of monumental proportions. Virtualized storage can attack this at block level and replicate or snap the volume to another storage system, which may address data protection requirements. Any time a storage system can work underneath the contents of the LUN at block level, it avoids handling the data file by file and will generally perform much better.
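To put the small-file problem in numbers, here is a short Python calculation. The 3 TiB total is an assumption standing in for "a few terabytes", and the per-second scan rate passed to `hours_to_walk` is a hypothetical parameter to be replaced with a measured value.

```python
def file_count(total_bytes, file_bytes):
    """Number of files needed to fill total_bytes with files of file_bytes each."""
    return total_bytes // file_bytes


def hours_to_walk(files, files_per_second):
    """Hours a file-by-file backup spends just visiting each file once."""
    return files / files_per_second / 3600


TIB = 1024 ** 4
files = file_count(3 * TIB, 1024)   # 3 TiB of 1 KB files
print(f"{files:,} files")           # 3,221,225,472 files
```

More than three billion files must be opened and tracked one at a time by a file-level backup, whereas a block-level replica or snapshot treats the same data as a single volume.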
How does virtualized storage work under the hood?
There are a few primary ways that virtualized storage can be delivered to consumers, which can be VMware ESX hosts, Microsoft Hyper-V hosts, Citrix XenServer hosts, Windows servers or Linux servers.
There are a few requirements to identify virtualized storage, according to Hu Yoshida, Vice President and Chief Technology Officer of Hitachi Data Systems (HDS). During a recent presentation, Hu provided the following criteria to determine if a product is a virtualized storage platform:
• Application, server and network independent management of storage infrastructure
• Enhance existing storage assets with the latest enterprise storage functions
• Safe multi-tenancy to leverage shared storage resources across multiple applications
• Transparency to provide applications with the ability to track their service level objectives
• Scalability to meet growing peak demands
These criteria can then be applied to one of three storage architecture configurations for a virtualized storage solution. The first of these configurations has each storage controller passing through to each disk resource via a separate set of Fibre Channel interfaces. The figure below shows this configuration:
In the case of the presentation from Hu, this applies to a number of HDS products, such as the Universal Storage Platform. Visually, this architecture is rather straightforward compared to the other virtualized storage configurations.
The second virtualized storage configuration places a storage processor between the storage consumers and the disk resources. This storage processor virtualizes the storage resources with a mapping table. Products in this space include the IBM SAN Volume Controller. This architecture is shown in this figure:
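The idea of a mapping table can be sketched in Python as a lookup from virtual block addresses to back-end locations. The `Extent` and `MappingTable` names, the extent layout and the array names in the usage note are all assumptions for illustration; this does not describe the internals of any particular product.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """A contiguous run of virtual blocks placed on one back-end array."""
    virtual_start: int
    length: int
    backend: str
    backend_start: int


class MappingTable:
    """Hypothetical table translating virtual block addresses to back-end ones."""

    def __init__(self, extents):
        self._extents = sorted(extents, key=lambda e: e.virtual_start)
        self._starts = [e.virtual_start for e in self._extents]

    def resolve(self, virtual_lba):
        # Find the last extent that starts at or before the requested block.
        i = bisect_right(self._starts, virtual_lba) - 1
        if i >= 0:
            extent = self._extents[i]
            offset = virtual_lba - extent.virtual_start
            if offset < extent.length:
                return extent.backend, extent.backend_start + offset
        raise LookupError(f"virtual block {virtual_lba} is not mapped")
```

With extents `Extent(0, 1000, "array-a", 5000)` and `Extent(1000, 500, "array-b", 0)`, the host sees one 1,500-block volume while `resolve(1000)` lands on the first block of the second array.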
The third configuration introduces a split-path approach to presenting storage resources to consumers. The key is application integration to the switches to manage the path assignment (sometimes referred to as controlled path) from the switches to the storage resources.
Products using this approach include EMC’s Invista storage virtualization. The figure below shows this architecture:
In each of the architectures, there may be multiple switches shown. This is to show the path flow rather than the pieces and parts in place. For most switches, zones can be assigned to provide a logical area for the path to flow without adding an entirely separate switch infrastructure, if the security requirements would permit that type of separation.
What options are available for virtualized storage beyond traditional storage?
Beyond the three principal types of storage identified above, there are still other products available for administrators to select from. These include software-only options that use commodity hardware instead of the generally higher-cost options from the traditional storage companies.
Rick Vanover, vExpert, VCP, MCITP, MCTS, MCSA, is an IT Infrastructure Manager for Alliance Data in Columbus, Ohio. He is an IT veteran specializing in virtualization, server hardware, operating system support and technology management. Follow Rick on Twitter at @RickVanover.