Table 1 summarizes these differences, showing which drive type leads in each category (among Solid-State Disks, Fibre Channel, Serial Attached SCSI, and Serial ATA). For example, if the focus is low-cost capacity storage, then SATA is the way to go. On the other hand, if speed is the goal and cost is not an issue, then SSDs are the right solution.
Table 1: Characteristics of Common Drive Types.
The idea here is that storage media have different characteristics that can be exploited depending on the data being stored. Let's explore this topic further using the concept of storage tiering.
Note: Beyond the traditional disk interfaces, there are also storage protocols and interconnects that have no native drive interface. Examples include InfiniBand, FC over Ethernet (FCoE), and iSCSI (an IP-based storage protocol). As these protocols and interfaces evolve, the solution landscape changes with them (note that InfiniBand and 10GbE create more decision points for storage systems).
The concept of storage tiers is not new; it has existed for some time under the name Hierarchical Storage Management (or HSM). HSM is defined as a storage technique that provides the capability to move data between high-cost storage elements (such as FC drives contained within enclosures) and low-cost storage elements (such as optical disks).
IBM first implemented the concept in their mainframe computers, and continued to evolve HSM within their AIX operating system.
While the concept may not be new, storage technologies have evolved to make this concept even more important. Recall from Table 1 that current drive technologies and storage protocols and buses are segmenting the storage landscape and providing the means to alter the cost and speed of access to data.
Arranging drive types by performance versus cost yields a tiered storage architecture (a common approach among vendors, as shown in Figure 1).
Figure 1: Tiered Storage Architecture.
Ideally, we would take all of our data and place it on the fastest storage available (one example of this is the RAMClouds architecture proposed by Stanford University). But cost is a factor: a 1MB file sitting in solid-state storage costs significantly more than the same file sitting on a consumer SATA drive. Next we need to factor in the temperature of the data. If the file is one that we use frequently and require fast access to, then it's justified to have this file on an SSD. If the file represents old data that we rarely use, then having that file on the cheaper SATA drive is ideal.
The goal then is to place hot data on SSDs and cold data on less expensive storage to optimize the overall cost of the data (to find an equilibrium of data temperature and $/GB). To meet that goal, we must first identify the temperature of the data.
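To make the trade-off between data temperature and $/GB concrete, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than measured data: the per-GB prices, the data set names and sizes, the use of accesses per day as a stand-in for temperature, and the hot threshold are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical prices in $/GB, chosen only for illustration.
SSD_COST_PER_GB = 2.00
SATA_COST_PER_GB = 0.10


@dataclass
class DataSet:
    name: str
    size_gb: float
    accesses_per_day: float  # assumed stand-in for data "temperature"


def place(datasets, hot_threshold):
    """Assign each data set to 'ssd' if it is hot, otherwise to 'sata'."""
    return {
        d.name: "ssd" if d.accesses_per_day >= hot_threshold else "sata"
        for d in datasets
    }


def total_cost(datasets, placement):
    """Sum the storage cost of every data set on its assigned tier."""
    price = {"ssd": SSD_COST_PER_GB, "sata": SATA_COST_PER_GB}
    return sum(d.size_gb * price[placement[d.name]] for d in datasets)


# Hypothetical workload: one small hot data set, two large cold ones.
datasets = [
    DataSet("orders_db", 50, 10_000),
    DataSet("web_logs", 400, 20),
    DataSet("archive_2005", 550, 0.1),
]

tiered = place(datasets, hot_threshold=1_000)
all_ssd = {d.name: "ssd" for d in datasets}

tiered_cost = total_cost(datasets, tiered)
all_ssd_cost = total_cost(datasets, all_ssd)
total_gb = sum(d.size_gb for d in datasets)

print(f"tiered:  ${tiered_cost:.2f} (${tiered_cost / total_gb:.3f}/GB)")
print(f"all SSD: ${all_ssd_cost:.2f} (${all_ssd_cost / total_gb:.3f}/GB)")
```

With these assumed numbers, only the small, frequently accessed data set lands on the SSD tier, so the blended cost per GB falls far below the all-SSD figure while the hot data keeps its fast access. Raising or lowering the threshold shifts that balance, which is why the temperature of the data has to be known first.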