The ZFS Story: Clearing Up the Confusion

The ZFS file system has much to recommend it, yet a good deal of confusion exists about it as well.
(Page 1 of 3)

It's pretty common in IT circles to develop a certain cult-like or "fanboy" mentality. What causes this reaction to technologies and products I am not quite sure, but that it happens is undeniable.

One area where I never expected to see this happen is filesystems, one of the most "under the hood" system components and one that, until recently, received almost no attention even in decently technical circles.

Let's face it: confusion over whether something comes from Active Directory or from NTFS is nearly ubiquitous. Filesystems are, quite simply, ignored. Ever since Windows NT 4 was released and NTFS was the only viable option, the idea that a filesystem is not an intrinsic component of an operating system and that there might be other options for file storage has all but faded away.

That is, until recently.

The one community where, to some small degree, this did not happen was the Linux community. But even there Ext2 and its descendants won mindshare so completely that alternative filesystems, though widely available, were sidelined. Historically only XFS received any attention, and even it received very little.

Where some truly strange behavior has occurred, more recently, is around Oracle's ZFS filesystem, originally developed for the Solaris operating system and the X4500 "Thumper" open storage platform (originally under the auspices of Sun prior to the Oracle acquisition).

At the time (nine years ago) when ZFS was released, competing filesystems were mostly ill prepared to handle the large disk arrays that were expected to be built over the coming years. ZFS was designed to handle them and ushered in the age of large-scale filesystems. Like most filesystems at that time, ZFS was limited to a single operating system.

The Why of ZFS

ZFS was, truly, a groundbreaking filesystem and I’m a great proponent of it. But it is very important to understand why ZFS did what it did, what its goals are, why those goals were important and how it applies to us today. The complexity of ZFS has led to much confusion and misunderstanding about how the filesystem works and when it is appropriate to use.

The principal goal of ZFS was to make a filesystem capable of scaling well to very large disk arrays. At the time of its introduction, the scale of which ZFS was capable was unheard of in other filesystems. But at that point there was no real-world need for a filesystem to be able to grow that large.

By the time that the need arose, many other filesystems – such as NTFS, XFS, Ext3 and others – had scaled to accommodate it. ZFS certainly led the charge to larger filesystem handling but was joined by many others soon thereafter.

Because ZFS originated in the Solaris world where, like all big iron UNIX systems, there is no hardware RAID, software RAID had to be used. Solaris had always had software RAID available as its own subsystem. The decision was made to build a new software RAID implementation directly into ZFS. This would allow for simplified management via a single tool set for both the RAID layer and the filesystem. It did not introduce any significant change or advantage to ZFS, as is often believed. It simply shifted the interface for the software RAID layer from being its own command set to being part of the ZFS command set.

The real "innovation" that ZFS inadvertently made was that, rather than simply implementing the usual RAID levels of 1, 5, 6 and 10 under those names, its developers "branded" these levels with their own naming conventions. RAID 5 is known as RAIDZ. RAID 6 is known as RAIDZ2. RAID 1 is just known as mirroring. And so on. This was widely considered silly at the time and pointlessly confusing. But, as it turned out, that confusion became the cornerstone of ZFS's revival many years later.

It needs to be noted that ZFS later added the industry's first production implementation of a RAID 7 (aka RAID 7.3) triple parity RAID system and branded it RAIDZ3. This later addition is an important innovation for large scale arrays that need the utmost in capacity while remaining extremely safe, but are willing to sacrifice performance in order to do so. This remains a unique feature of ZFS but one that is rarely used.
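The naming described above can be summarized in a small lookup table. The following Python sketch is purely illustrative: the dictionary and the `usable_disks` helper are hypothetical names invented for this example, not part of any ZFS tool, and the capacity figure is a rough approximation that ignores metadata and padding overhead.

```python
# Illustrative sketch only: these names are assumptions for this example,
# not part of ZFS or any ZFS command-line tool.

# ZFS name -> (conventional RAID level as described in the text, parity disks)
ZFS_TO_RAID = {
    "mirror": ("RAID 1", None),  # mirroring keeps full copies, not parity
    "raidz": ("RAID 5", 1),
    "raidz2": ("RAID 6", 2),
    "raidz3": ("RAID 7 (triple parity)", 3),
}


def usable_disks(zfs_name: str, disks: int) -> int:
    """Approximate number of disks' worth of usable capacity in one array."""
    _, parity = ZFS_TO_RAID[zfs_name]
    if parity is None:
        if disks < 2:
            raise ValueError("a mirror needs at least two disks")
        return 1  # every disk in a mirror holds the same data
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    return disks - parity


if __name__ == "__main__":
    for name in ("raidz", "raidz2", "raidz3"):
        level, _ = ZFS_TO_RAID[name]
        print(f"{name} ~ {level}: 12 disks -> {usable_disks(name, 12)} usable")
```

Seen this way, each additional parity level trades one more disk of capacity for the ability to survive one more disk failure, which is the trade-off the triple parity option pushes furthest.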

In the spirit of collapsing the storage stack and using a single command set to manage all aspects of storage, the logical volume management functions were rolled into ZFS as well. In certain circles it is often mistakenly believed that ZFS introduced logical volume management, but nearly all enterprise platforms, including AIX, Linux, Windows and even Solaris itself, had already had logical volume management for many years. ZFS was not doing this to introduce a new paradigm but simply to consolidate management and wrap all three key storage layers (RAID, logical volume management and filesystem) into a single entity that would be easier to manage and could provide inherent communications up and down the stack. There are pros and cons to this method, and an industry consensus remains unformed nearly a decade later.

ZFS’s Confusing Merger

One of the most important consequences of this consolidation of three systems into one is that we now have a very confusing product to discuss. ZFS is a filesystem, yes, but it is not only a filesystem. It is a logical volume manager, but not only a logical volume manager. People refer to ZFS as a filesystem, which is its primary function, but the fact that it is so much more than a filesystem can be very confusing and makes comparisons against other storage systems difficult. I believe that this confusion was not foreseen at the time.

What has resulted from this confusing merger is that ZFS is often compared to other filesystems, such as XFS or Ext4. But this comparison is misleading, as ZFS is a complete stack and XFS is only one layer of a stack.

ZFS would be better compared to MD (Linux Software RAID) / LVM / XFS or to SmartArray (HP Hardware RAID) / LVM / XFS than to XFS alone. Otherwise it appears that ZFS is full of features that XFS lacks but, in reality, it is only a semantic victory. Most of the features often touted by ZFS advocates did not originate with ZFS and were commonly available with the alternative filesystems long before ZFS existed. But it is hard to answer "does your filesystem do that?" because the answer is "no... my RAID or my logical volume manager does that." And truly, it is not ZFS the filesystem providing RAIDZ; it is ZFS the software RAID subsystem that is doing so.
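The layer-by-layer comparison can be made concrete with a short Python sketch. It is only a model of the argument above: the stack labels, the feature list and the `provider` function are hypothetical names chosen for illustration, and the three features are simple examples of what each layer typically handles.

```python
# Illustrative model of the comparison above; all names here are assumptions
# made for this sketch, not an API of ZFS, MD, LVM or XFS.

# Which component fills each of the three storage layers in each stack.
STACKS = {
    "ZFS": {"raid": "ZFS", "volume_manager": "ZFS", "filesystem": "ZFS"},
    "MD / LVM / XFS": {"raid": "MD", "volume_manager": "LVM", "filesystem": "XFS"},
    "SmartArray / LVM / XFS": {
        "raid": "SmartArray",
        "volume_manager": "LVM",
        "filesystem": "XFS",
    },
}

# Example features and the layer that is responsible for each one.
FEATURE_LAYER = {
    "parity RAID": "raid",
    "volume snapshots": "volume_manager",
    "files and directories": "filesystem",
}


def provider(stack: str, feature: str) -> str:
    """Return the component in the given stack that provides the feature."""
    return STACKS[stack][FEATURE_LAYER[feature]]


if __name__ == "__main__":
    for stack in STACKS:
        for feature in FEATURE_LAYER:
            print(f"{stack}: {feature} is provided by {provider(stack, feature)}")
```

In this model every question asked of the ZFS stack returns "ZFS", while the same question asked of a conventional stack returns MD, SmartArray, LVM or XFS depending on the layer. Comparing ZFS against XFS alone therefore only looks at the last row.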





Tags: networking, ZFS, filesystem

