If you want to learn about SANs, NAS devices, iSCSI and InfiniBand, you could do far worse than reading the second edition of Marc Farley’s tome, Building Storage Networks (McGraw-Hill/Osborne Media). In fact, the Storage Networking Industry Association recommends Farley’s book as a “must-read for anyone considering designing and carrying out storage networks.”
Free of vendor references, Farley’s concisely written and illustrated book explains everything from piecing together the I/O path to wiring storage networks and clusters with InfiniBand.
As an independent consultant in the network storage industry, Farley shares with an interviewer his views about storage trends and how to avoid potential potholes in a network storage roadmap. Here’s what he had to say about what’s hot and what’s not.
Q. Are organizations sorting through the variety of SAN topologies or just keeping things simple?
Most organizations are building small SANs – a single server connected to a single storage subsystem via a switch. Because of their customers’ needs for around-the-clock availability of storage, the financial services players have gotten a little more sophisticated about cross-connected SAN topology. At the very minimum, you’ll find a SAN that has two servers, two storage subsystems, and two SAN switches. Each server and subsystem has two SAN ports. Each switch connects to each server and each subsystem. You’ll also see a few SANs that have large storage subsystems with many ports.
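The minimal cross-connected SAN described above can be checked with a few lines of Python. This is only an illustrative sketch: the device names and the helper functions `build_links`, `reachable` and `survives_single_switch_failure` are hypothetical, and the model assumes that a server reaches a subsystem only through a single shared switch.

```python
from itertools import product

# Hypothetical device names for the minimal cross-connected SAN.
SERVERS = ["server1", "server2"]
SUBSYSTEMS = ["storage1", "storage2"]
SWITCHES = ["switchA", "switchB"]


def build_links(servers, subsystems, switches):
    """Connect each switch to each server and each storage subsystem."""
    return {(sw, dev) for sw, dev in product(switches, servers + subsystems)}


def reachable(links, server, subsystem, failed_switches=()):
    """True if some surviving switch is linked to both the server and the subsystem."""
    switches = {sw for sw, _ in links}
    return any(
        (sw, server) in links and (sw, subsystem) in links
        for sw in switches
        if sw not in failed_switches
    )


def survives_single_switch_failure(links, servers, subsystems, switches):
    """True if every server still reaches every subsystem after any one switch fails."""
    return all(
        reachable(links, srv, sub, failed_switches=(sw,))
        for sw in switches
        for srv, sub in product(servers, subsystems)
    )


if __name__ == "__main__":
    links = build_links(SERVERS, SUBSYSTEMS, SWITCHES)
    # Each device uses two ports, one per switch.
    ports = {dev: sum(1 for _, d in links if d == dev) for dev in SERVERS + SUBSYSTEMS}
    print(ports)
    print(survives_single_switch_failure(links, SERVERS, SUBSYSTEMS, SWITCHES))
```

Under the same assumptions, the small single-switch SAN fails the check, because losing its one switch cuts the only path between server and storage.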
Q. So when do you think organizations are going to be building the type of SAN topologies the vendors have put down on paper?
Brocade will tell you customers are doing it today. I think those customers are few and far between. I would probably say that would be one to two years out.
The beauty of all the cross-connections is that they provide redundancy and availability on the path between servers and storage. You can also maintain and upgrade both servers and storage without losing access to the data. Bus topologies just don’t work for these things.
Q. There are more than 50 SAN virtualization vendors and still no standard API. What advice would you give someone who is considering SAN virtualization?
SAN virtualization isn’t a panacea or a solution in and of itself. The critical issue here is the persistence of the virtualization metadata and whether or not it can be recovered in the event of a disaster. My fear about installing some vendor’s SAN virtualization is that you could have a disaster and then have no way to re-create the virtualization you had before. This limitation would make data recovery impossible, and it’s one of the real downsides of being careless with virtualization.
To this end, you want to work with vendors that know how to recover and make sure the information can be backed up. Veritas knows virtualization better than any other vendor. Veritas does virtualization by writing to the private areas on the hard disk. So, you have data that knows where it belongs and how it works.
Q. What do you think of doing SAN virtualization with a separate appliance?
You use SAN virtualization because you have a capacity issue and you want to lump some volumes together. What happens when you hit the wall the next time? Do you put another layer of virtualization over that? If you do, then how do you know you can recover? You’ve made it more difficult to recover. Why? If a disaster strikes, you’ll have to be calm enough to know how to put all of the layers back together. I don’t think so. You’ll probably opt for a backup and recovery, only to find that it is broken. So if you build a huge, complex storage repository, the chances of your backing it up and recovering it don’t look good.
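To make the layering concern concrete, here is a rough Python sketch of how a block address might be resolved through stacked virtualization layers. It is not any vendor’s implementation: the `resolve` function, the volume names and the extent sizes are all hypothetical, and each layer is assumed to be a simple table that concatenates lower-level volumes.

```python
def resolve(layers, volume, block):
    """Translate (volume, block) through each mapping layer, top layer first.

    Each layer is a dict mapping a virtual volume name to an ordered list of
    (lower_volume, size_in_blocks) extents. A layer whose metadata has been
    lost is represented here as None (an assumption made for illustration).
    """
    for layer in layers:
        if layer is None:
            raise LookupError("mapping metadata lost; address cannot be resolved")
        extents = layer.get(volume)
        if extents is None:
            # This volume is not virtualized at this layer; pass it through.
            continue
        for lower, size in extents:
            if block < size:
                volume = lower
                break
            block -= size
        else:
            raise IndexError("block lies beyond the end of the volume")
    return volume, block


# Hypothetical example: a second layer stacked on top of a first one.
layer1 = {"vol_a": [("diskA", 100), ("diskB", 100)]}
layer2 = {"vol_big": [("vol_a", 200), ("diskC", 100)]}

if __name__ == "__main__":
    print(resolve([layer2, layer1], "vol_big", 150))
    print(resolve([layer2, layer1], "vol_big", 250))
```

In this toy model, every layer’s table is needed to find the data. If any one table is missing after a disaster, the lookup fails even though the blocks themselves are still on disk.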
Q. Your book has several chapters about InfiniBand and iSCSI. When will these things be more than a beacon on the horizon?
InfiniBand is several years away; it’ll happen, but it’s terribly complex. Storage over IP is a big deal, and it will happen. On the other hand, the selection of TCP/IP as a transport protocol for iSCSI is a mistake. The Internet Engineering Task Force requested TCP/IP because you’d be able to leverage old technology. Everyone has warm fuzzies about TCP/IP, except storage folks. We don’t like to have our I/Os cut up into little pieces and sprinkled around whatever path is available. It may have been faster to develop something new than to try to fit storage into some old framework.
Q. What storage trends turn you on right now?
I’m excited about distributed file systems, which separate the data representation functions from the data structure functions, which manage where all the data goes. If you put the data structure function in a storage subsystem, you get an intelligent storage subsystem that allows you to manage data. Tricord is doing it; other companies are developing file systems that’ll do it.
I’m also interested in something that requires a lot of work: changing the routing and topology used for storage networks. So far, we’ve borrowed things from Ethernet and IP. I’d like to see a new topology algorithm created, and new ways to create topology databases and distribute them.