Data Migration: EMC, Compellent, 3PAR, FalconStor


Automated Data Migration

As with many complex technologies, automated data migration can be implemented in a variety of ways. One common theme is storage virtualization, which creates an abstraction between the user's view of the storage (a LUN and its logical block addresses, or LBAs) and the actual mapping of that storage on disk (physical block addresses, or PBAs).

The ability to automatically and transparently migrate data within a storage system relies on this mapping, which allows the data to be reconstructed for the user. The mapping is embodied in metadata that specifies how data is distributed across the various storage subsystems.
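
To make this abstraction concrete, here is a minimal sketch in Python (purely illustrative; the VirtualLUN class and its methods are hypothetical names, not any vendor's actual format) that models a thin mapping layer from a LUN's logical block addresses to physical locations. Migration then amounts to copying data and updating this metadata while the user's view of the LUN stays unchanged.

# Illustrative sketch of LBA-to-PBA mapping metadata (not any vendor's actual format).

class VirtualLUN:
    """Maps logical block addresses (LBAs) to (device, PBA) locations."""

    def __init__(self, size_blocks, default_device):
        # One metadata entry per block: (device name, physical block address).
        self.map = {lba: (default_device, lba) for lba in range(size_blocks)}

    def resolve(self, lba):
        """Translate the user's LBA into its current physical location."""
        return self.map[lba]

    def migrate_block(self, lba, new_device, new_pba):
        """Move a block: copy the data (elided here), then update the metadata."""
        self.map[lba] = (new_device, new_pba)


lun = VirtualLUN(size_blocks=8, default_device="sata_pool")
print(lun.resolve(3))             # ('sata_pool', 3)
lun.migrate_block(3, "ssd_pool", 0)
print(lun.resolve(3))             # ('ssd_pool', 0) -- same LBA, new physical home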

In addition to the various implementation styles (which we'll explore next), there are a number of trade-offs in the granularity of the data to be migrated (see Figure 2). Each granularity comes with its own advantages and disadvantages. For example, some vendors implement LUN-level migration, which is conceptually simple but means that all content within a LUN is treated the same way.

Other vendors implement sub-LUN-level migration, which can operate on large chunks of data or, in the extreme case, on individual blocks. Sub-LUN migration has clear advantages: frequently accessed data can be migrated to faster tiers, while the rest of the data in the LUN remains on less expensive tiers of storage.

Sub-LUN migration also has a cost, as metadata must be managed for the individual chunks of data (and the smaller the chunk size, the more metadata must be maintained, making the scheme less efficient). On the other hand, if the migrated chunks are larger than a block, performance gains may be realized through read-ahead (for example, when the blocks within a chunk are logically related).
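
As a rough illustration of that metadata trade-off, the back-of-the-envelope sketch below assumes a fixed 32-byte map entry per chunk (my own assumption; real systems vary) and estimates how many extent-map entries a given chunk size implies for a 1 TiB LUN, showing why smaller chunks mean more metadata to manage.

# Rough estimate of sub-LUN metadata overhead versus chunk (extent) size.
# Assumes a fixed-size metadata entry per chunk; the numbers are illustrative only.

def metadata_overhead(lun_size_gib, chunk_size_mib, entry_bytes=32):
    chunks = (lun_size_gib * 1024) // chunk_size_mib
    return chunks, chunks * entry_bytes

for chunk_mib in (512, 64, 4):
    chunks, overhead = metadata_overhead(lun_size_gib=1024, chunk_size_mib=chunk_mib)
    print(f"{chunk_mib:>4} MiB chunks -> {chunks:>8} entries, ~{overhead / 1024:.0f} KiB of metadata")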

An important characteristic of a solution that incorporates data migration is efficiency: the solution should minimize any impact on storage performance. Other trade-offs include the method by which data is classified, the frequency with which classification and migration are performed, and the initial placement of data (whether new data is assumed to be hot or cold).

Some implementations, for example, perform data migration as a background process (a nightly activity), while others perform it in real time. Although it can introduce latency, real-time migration provides the ability to react dynamically to users' changing data-access patterns.
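
The contrast between the two scheduling approaches can be sketched as two hypothetical policy hooks, assuming a simple per-extent access counter and an arbitrary hotness threshold: a batch sweep that reclassifies everything once per cycle, and an in-path check that can promote an extent immediately at the cost of extra work in the I/O path.

# Illustrative contrast between background (batch) and real-time migration policies.
# Thresholds and tier names are arbitrary assumptions for the sketch.

HOT_THRESHOLD = 100  # accesses per observation window

def nightly_sweep(access_counts, placement):
    """Background policy: reclassify every extent once per cycle (e.g. nightly)."""
    for extent, count in access_counts.items():
        placement[extent] = "ssd_tier" if count >= HOT_THRESHOLD else "sata_tier"

def on_io(extent, access_counts, placement):
    """Real-time policy: decide in the I/O path, reacting immediately but adding latency."""
    access_counts[extent] = access_counts.get(extent, 0) + 1
    if access_counts[extent] >= HOT_THRESHOLD and placement.get(extent) != "ssd_tier":
        placement[extent] = "ssd_tier"   # would trigger an immediate extent move

counts, placement = {"ext0": 250, "ext1": 12}, {}
nightly_sweep(counts, placement)
print(placement)   # {'ext0': 'ssd_tier', 'ext1': 'sata_tier'}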


Figure 2: Levels of Data Migration.

Implementation Styles

Data migration can be implemented in a number of ways, but the approaches fall into three fundamental architectures: host-based, network-based, and target-based. Let's begin with a short introduction to these three styles and then explore some implementations that fall into each category. Figure 3 provides a graphical view of these styles.

Host-based implementations integrate the tiering and migration logic into the host servers. While this might seem restrictive, limiting the benefit to a single host's storage, server virtualization has changed that by extending the approach to multi-user (multi-VM) configurations.

Operating systems, for example, can integrate this type of functionality into their logical volume managers (such as Linux's LVM), and hypervisors can incorporate it into their storage stacks. VMware implements this under the product name Storage vMotion, which permits the migration of live (active) virtual machine disks between storage media. It works efficiently by using changed block tracking to migrate the virtual machine disk in the background and then suspending the VM for a short time to move any remaining blocks to the destination datastore.
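
The changed-block-tracking idea can be modeled as an iterative pre-copy loop. The sketch below is a simplified illustration of that general technique, not VMware's implementation: copy everything once in the background, re-copy the blocks that were dirtied in the meantime, and only briefly pause the VM to transfer whatever remains.

# Simplified model of changed-block-tracking migration: iterative pre-copy,
# then a short pause to move the final set of dirty blocks. Not VMware's code.

def migrate_with_cbt(source, dest, get_dirty_blocks, max_passes=5, final_threshold=8):
    dirty = set(range(len(source)))              # first pass copies every block
    for _ in range(max_passes):
        for blk in dirty:
            dest[blk] = source[blk]              # background copy while the VM keeps running
        dirty = get_dirty_blocks()               # blocks written since the pass started
        if len(dirty) <= final_threshold:
            break
    # "Suspend" the VM briefly and copy the small remaining set.
    for blk in dirty:
        dest[blk] = source[blk]


src = list(range(16))
dst = [None] * 16
passes = iter([{2, 5, 9}, set()])                # pretend fewer blocks are dirtied each pass
migrate_with_cbt(src, dst, lambda: next(passes))
print(dst == src)                                # True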

Network-based implementations place an intermediary in the network between the storage users and the physical storage. This offloads the functionality from the host, but it also permits a vendor-agnostic storage backend (storage from multiple vendors). Examples of network-based implementations (for data migration as well as numerous other features) include IBM's SAN Volume Controller (SVC), HP's SAN Virtualization Storage Platform (SVSP), and FalconStor's Network Storage Server (NSS).
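
Conceptually, an in-band network virtualizer fans a single virtual volume out across back-end arrays that may come from different vendors. The sketch below uses hypothetical names and bears no relation to SVC, SVSP, or NSS internals; it simply shows the extent-level indirection such an appliance maintains.

# Conceptual sketch of an in-band network virtualizer: the host addresses one
# virtual volume; its extents may live on back-end arrays from different vendors.

class NetworkVirtualizer:
    def __init__(self, extent_size):
        self.extent_size = extent_size
        self.extent_map = {}                 # extent index -> (backend array, backend extent)

    def provision(self, extent_index, backend, backend_extent):
        self.extent_map[extent_index] = (backend, backend_extent)

    def route(self, lba):
        """Route a host LBA to the back-end array that currently holds it."""
        extent, offset = divmod(lba, self.extent_size)
        backend, backend_extent = self.extent_map[extent]
        return backend, backend_extent * self.extent_size + offset


virt = NetworkVirtualizer(extent_size=1024)
virt.provision(0, "vendor_a_array", 7)       # extents can come from different vendors
virt.provision(1, "vendor_b_array", 2)
print(virt.route(1500))                      # ('vendor_b_array', 2524)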

Finally, target-based implementations pull the required logic into the storage array itself. As with network-based implementations, the overhead of virtualizing the data is offloaded from the host, with the abstraction created at the target. Once this abstraction is in place, other advanced features can be implemented, such as data reduction (since the physical placement and format of the data are hidden from the host). Many examples of target-based implementations exist, including EMC's FAST, Compellent's Data Progression, and 3PAR's Dynamic Optimization.
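
Inside the array, target-based tiering generally reduces to tracking per-extent access frequency and periodically re-ranking extents across tiers. The following sketch illustrates that idea in the abstract; it is an assumption-laden toy, not how FAST, Data Progression, or Dynamic Optimization actually work.

# Abstract sketch of target-based tiering: rank extents by recent access count
# and place the hottest ones on the fastest tier, within its capacity.

def retier(access_counts, tiers):
    """tiers: list of (tier_name, capacity_in_extents), fastest tier first."""
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    placement, cursor = {}, 0
    for tier_name, capacity in tiers:
        for extent in ranked[cursor:cursor + capacity]:
            placement[extent] = tier_name
        cursor += capacity
    return placement


counts = {"e0": 900, "e1": 40, "e2": 310, "e3": 5}
print(retier(counts, tiers=[("ssd", 1), ("sas", 2), ("sata", 10)]))
# {'e0': 'ssd', 'e2': 'sas', 'e1': 'sas', 'e3': 'sata'}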


Figure 3: Implementation Styles.

Data Migration Resources

Automated Storage Tiers

Data Migration

Will Solid State Kill Hard Drives

Storage Networking Basics: Understanding Fibre Channel

Storage Virtualization: Overview and Options

SAS and SATA

VMware Storage vMotion

Hierarchical Storage Management

Information Lifecycle Management

RAM Clouds

EMC FAST

Compellent Data Progression

Compellent Data Placement Optimization (Fast Track)

3PAR Autonomic Storage Tiering

FalconStor Data Migration

Algorithms for Data Migration with Cloning

Automated Lookahead Data Migration in SSD-enabled Multi-tiered Storage Systems

Efficient Data Migration in Self-Managing Storage Systems

Adaptive Data Migration in Multi-tiered Storage Based Cloud Environment

Aqueduct: Online Data Migration with Performance Guarantees

BASIL: Automated IO Load Balancing Across Storage Devices

About the Author

M. Tim Jones is a firmware and product architect and the author of Artificial Intelligence: A Systems Approach, GNU/Linux Application Programming (now in its second edition), AI Application Programming (in its second edition), and BSD Sockets Programming from a Multilanguage Perspective. His background ranges from the development of software for geosynchronous satellites to the architecture and development of storage and virtualization solutions.

