A few weeks ago I had a dream that I was the COB, CEO, and CTO of a major storage company, with the opportunity to architect and develop any product I wanted. Basically, I got to be the storage king in this dream, but of course, being a king comes with responsibilities to your subjects (the company stockholders and employees) and your lineage (ensuring that you are successful in the market so your company has a future). Also, as a king you are periodically required to take over other lands (buy companies), make treaties with others (joint marketing and/or development agreements), or declare war and eliminate the enemy (beat them in the market to render them a non-factor).
As a mere storage consultant, I figured dreams could not get any better than this, and the best part was that I remembered the dream in the morning. The next morning, I began thinking about the reality of what's missing in the market and which requirements are not being met by the major vendors' current product offerings.
The old adage “build it and they will come” may well apply to mundane evolutionary products, but what about revolutionary products? What market requirements are not currently being met, and if the market truly is ready for something revolutionary in terms of large storage configurations, what would the product look like and why would customers consider buying it?
What Market Requirements Aren’t Being Met
Again, if I were king, I would first have my marketing requirements people confirm my speculation, but I personally believe there are three very important factors currently missing from the market. First, though, let me define the market.
I like to differentiate between storage and data. Storage for the most part has become a commodity market. RAID, for example, is now sold by dollars per gigabyte ($10-$20 is often quoted), while back in 1996 I remember RAID costs of over $1 per megabyte, which works out to more than $1,000 per gigabyte, a price drop of roughly two orders of magnitude.
With storage now delivered and marketed as a commodity, what about the critical information you put on the storage — your data? To me, that’s where the real value is. People in general really do not care all that much about storage, but data is a completely different story. I believe that in the future data will become a more important requirement of the storage architecture, and the focus might even change from that of storage architecture to data architecture. (Well, that’s my hope at least, both as a consultant and as storage king in my dream.)
So the bottom line is that as king I want to define the market for my company as data: not the storage itself, but how you access, protect, maintain, and migrate what appears as files on the computer systems in use. In most cases this includes the file system(s) used on top of the storage. And while raw devices are sometimes used for databases, for all intents and purposes the database manages a raw device the same way a file system manages a raw device, which is why I contend the database is a file system.
With all of this in mind, let's take a closer look at what requirements are specifically missing from the market today:
- High performance and predictive scaling
- End-to-end security
- Simplified management
High Performance and Predictive Scaling
Some newer NAS products do scale reasonably well, but you are currently limited to 1 Gbit connections (some new 10 Gbit host cards are out, but even a PCI-X 133 slot tops out at roughly 1 GB/sec, so they cannot be used efficiently). Most sites requiring multiple gigabytes per second of performance solve the high performance problem by using Fibre Channel-attached storage. That is not possible with NAS: a 1 Gbit link is only about 125 MB/sec before protocol overhead, and given the TCP/IP and NFS overhead, sustaining even 100 MB/sec from a single host is nearly impossible.
For the most part, file systems do not scale linearly. There are many reasons for this lack of scaling, including:
- Sometimes the cause is the file system itself, as the internal algorithms for free space lookup, metadata lookup, file and directory names, and other areas do not scale linearly
- Sometimes the cause is the applications using the file system, which incur significant system overhead (see this article for more information)
Each of these areas can be mitigated by tuning the file system and the applications, but what about the RAID? The RAID device is a block device (at least for now) that reads ahead and writes behind based on sequential block addresses. The RAID and the file system have no communication about the topology of the data you are using; all the RAID knows are simple block counts.
If the file system does not place data in sequential block order on the RAID, the RAID cannot know how to operate efficiently. The SCSI protocol provides no way of passing the data topology to the RAID, so if data is neither allocated nor read sequentially, the RAID operates inefficiently, which means that scaling with the hardware is not really possible.
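To see how little can be communicated even on the host's own side of the wire, consider the coarse hints POSIX does provide. Here is a minimal sketch, assuming a system with posix_fadvise() from POSIX.1-2001 (the file path is purely illustrative); note that the hint stops at the host's page cache and never reaches the RAID:

```c
#define _XOPEN_SOURCE 600   /* for posix_fadvise() */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/data/bigfile", O_RDONLY);  /* illustrative path */
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Advise the host OS that access will be sequential so its page
       cache can read ahead aggressively. This advice never crosses
       the SCSI boundary, so the RAID controller still sees nothing
       but block numbers. */
    int err = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (err != 0)
        fprintf(stderr, "posix_fadvise failed: error %d\n", err);

    /* ... read the file sequentially ... */
    close(fd);
    return 0;
}
```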
Even if the addresses are not allocated sequentially, most RAID devices still try to read ahead, but this adds overhead: you are reading data that you will not use, which of course reduces RAID performance. A new, object-based device allocation method, now in the process of being standardized, will be developed over the next few years. This development should help, but the file system will still need to communicate with the object device, and work on that end is far in the future at best.
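To make the block-level limitation concrete, here is a minimal sketch of the only kind of readahead heuristic a controller restricted to block counts can implement. Every name and threshold below is an assumption for illustration, not any vendor's firmware:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define READAHEAD_BLOCKS 64   /* hypothetical prefetch window */

struct stream_state {
    uint64_t last_lba;        /* last logical block address requested */
    unsigned seq_hits;        /* consecutive sequential requests seen */
};

/* Returns how many blocks to prefetch after a read at `lba`. The
   controller can only define "sequential" as "the next LBA follows
   the last one"; it knows nothing about files or access patterns. */
static unsigned readahead_hint(struct stream_state *s, uint64_t lba)
{
    bool sequential = (lba == s->last_lba + 1);
    s->last_lba = lba;
    s->seq_hits = sequential ? s->seq_hits + 1 : 0;

    /* Only prefetch after a run of sequential addresses. If the file
       system scattered the file's blocks, this never triggers, or
       worse, prefetches data that is never used. */
    return (s->seq_hits >= 4) ? READAHEAD_BLOCKS : 0;
}

int main(void)
{
    struct stream_state s = { 0, 0 };
    uint64_t pattern[] = { 10, 11, 12, 13, 14, 15, 900, 901 };
    for (size_t i = 0; i < sizeof pattern / sizeof pattern[0]; i++)
        printf("lba %llu: prefetch %u blocks\n",
               (unsigned long long)pattern[i],
               readahead_hint(&s, pattern[i]));
    return 0;
}
```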
End-to-End Security
Most local file systems provide standard security such as ACLs (access control lists), UNIX groups, and permissions. Some file systems, such as Microsoft NTFS, support encryption on a file or folder basis, but encryption is very CPU intensive, and key management gets more difficult as we all get older and forget our many passwords more and more often. Nor has the issue of end-to-end local file system security been efficiently solved from the host to the RAID. (Please review this article for a closer look at this issue.)
Now, add to this the requirements for multi-level security, or MLS, that many vendors are moving toward for authentication and tracking file access. The U.S. Government has some new requirements in this area that are interesting for both operating system security and encryption, but even with these requirements, true end-to-end security still comes up short.
In addition, as you may have read in past articles, I have been involved with shared file systems for a long time, and enforcing a security policy across multiple vendors' operating systems with shared file systems is virtually impossible. Part of the problem is that file systems distributed across heterogeneous operating systems have no common, and often no public, interface for security; encryption at the HBA, switch, RAID, tape, and SAN/WAN levels has not been adequately addressed either.
Simplified Management
Wouldn’t it be nice to have a tool that:
- Manages your shared file system(s) on multiple platforms
- Manages and tracks security policies for the file system, HBAs, switches, RAIDs, tapes, and libraries
- Allows replication of data for use by others and for disaster planning and recovery
- Manages all of your storage infrastructure, including configuration, performance analysis, and error reporting
- Conducts performance analysis of data through the file system, to the HBA, to the switch, to the RAID, to the HSM, and/or to backup software, and out to the SAN/WAN
I’m sure I’m missing a few things, but even all of the above would be the Holy Grail for management. Unfortunately, though, we’re nowhere close to having a tool that does all of this. A number of vendors are working on tools — VERITAS, McDATA, and EMC, just to name a few — that will help somewhat, but we won’t be arriving at the Holy Grail anytime soon, I’m afraid.
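Just to make the wish list concrete, here is a purely hypothetical sketch of the kind of unified interface such a tool might expose. None of these names correspond to any shipping product from VERITAS, McDATA, EMC, or anyone else; they simply restate the list above as one vendor-neutral API:

```c
#include <stddef.h>

/* One handle for every managed element: host, file system, HBA,
   switch, RAID, tape drive, or library. */
struct mgmt_target {
    const char *name;
    const char *kind;   /* e.g., "hba", "switch", "raid" */
};

/* The operations the dream tool would run against any target,
   regardless of which vendor built it. */
struct mgmt_ops {
    int (*apply_security_policy)(struct mgmt_target *t, const char *policy);
    int (*replicate)(const char *src_path, const char *dst_site);
    int (*collect_perf)(struct mgmt_target *t, char *buf, size_t len);
    int (*report_errors)(struct mgmt_target *t, char *buf, size_t len);
};
```

The point of the sketch is the single struct: today each of those function pointers is a different tool from a different vendor, which is exactly why the Holy Grail remains out of reach.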
What This Product Would Solve
Assuming that the market analysis is valid and that these pain points are real enough for customers to consider purchasing a solution, the product I would create would be a SAN/NAS hybrid that combines the best of both worlds and adds significant new features.
Many NAS limitations are based on TCP/IP overhead, and NAS does not allow for centralized control. The only way to centrally control a heterogeneous shared file system is to move most of the functionality to a single unit, as you cannot control an end-to-end security policy from one host in a pool of heterogeneous machines.
So, for the data-centric world I think is coming, the only way to manage the data is to create a single machine speaking a new DMA-based protocol: one that looks like NFS in that user applications need no changes, but scales more like locally attached RAID because it communicates without TCP/IP. This new protocol would have to support the following (see the sketch after the list):
- Authentication
- Encryption
- High performance and scalability (i.e. low overhead)
- DMA communication of the data to the host
- No application changes (POSIX standards and read/write/open system calls)
- WAN and SAN access
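To illustrate the "no application changes" requirement, here is a sketch of an ordinary POSIX application (the file path is illustrative). Under the proposed protocol this code would not change at all, because only the transport underneath open/read/write is swapped from NFS over TCP/IP to the DMA-based protocol:

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char buf[4096];
    ssize_t n;
    long long total = 0;

    /* Standard POSIX open(); the application cannot tell whether
       this path is local, NFS-mounted, or served by the new box. */
    int fd = open("/shared/data/results.dat", O_RDONLY);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Standard POSIX read(); the bytes could arrive via TCP/IP or
       DMA, and the application neither knows nor cares. */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;

    printf("read %lld bytes\n", total);
    close(fd);
    return 0;
}
```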
Since my data is now centralized, security, replication, data encryption, HSM, backup, and disaster recovery policies can be implemented more easily. Another advantage is that I would be free from having to write and maintain tools for each operating system, OS release, and vendor.
The new box would have a tight coupling between the file system and the reliable storage. I might have RAID 1-like functions for small, randomly accessed files and RAID 5-like functions for larger, sequentially accessed files. The file system could understand the topology of the file in question and read ahead based on observed access patterns, such as reading the file backward, even though the file might not be sequentially allocated. Tight coupling between the cache and the data would improve scaling and reduce latency and costs.
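Here is a minimal sketch of the layout decision just described, as the tightly coupled file system might make it. The 1 MB cutoff and every name are assumptions for illustration only:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

enum layout {
    LAYOUT_MIRROR,   /* RAID 1-like: mirrored copies */
    LAYOUT_PARITY    /* RAID 5-like: striping with parity */
};

#define SMALL_FILE_BYTES (1024 * 1024)   /* hypothetical 1 MB cutoff */

/* Pick a layout per file: mirroring avoids the RAID 5
   read-modify-write penalty on small or random writes, while parity
   striping gives full-stripe writes and better capacity efficiency
   for large sequential files. */
static enum layout choose_layout(uint64_t file_size, bool mostly_sequential)
{
    if (file_size < SMALL_FILE_BYTES || !mostly_sequential)
        return LAYOUT_MIRROR;
    return LAYOUT_PARITY;
}

int main(void)
{
    printf("64 KB random file: %s\n",
           choose_layout(64 * 1024, false) == LAYOUT_MIRROR
               ? "mirror" : "parity");
    printf("2 GB sequential file: %s\n",
           choose_layout(2ULL << 30, true) == LAYOUT_MIRROR
               ? "mirror" : "parity");
    return 0;
}
```

This is exactly the decision a block-only RAID cannot make today, because it never sees file sizes or access patterns.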
Ah, cost — that's the key. What would the return on investment (ROI) be for this new data-centric device? Well, that's where my dream ended. We may never know if this box would work, what the ROI would be, and whether or not people would actually buy it, but I do believe it meets the requirements of the market.
Can it be built? I think it can. Will it be built? I don’t know, but it sure would solve a bunch of problems if done correctly.
Please feel free to send any comments, feedback, and/or suggestions for future articles to Henry Newman.