Friday, April 16, 2021

Data Storage vs. the Laws of Physics

Data storage hardware has improved, yet still strains at the limits. Data storage expert Henry Newman looks at the underlying science and discusses trends in storage capacity over a multi-year timespan.


This is an update of one of my first articles for Enterprise Storage Forum from way back in 2002. The article was well received, so I figured it was time to see what has changed – if anything – in our industry over the last 8 years. What I found is that things have only gotten worse when it comes to managing storage. Simple physics defines the limitations under which you have to work. The movement of data from applications to hardware devices is limited by physical constraints within the computer and its storage hardware.

First, let’s compare the fastest computers and the fastest disk storage devices from 1976, 2002, and today, to get a better understanding of the changes we’ve seen over the last 34 years.

| Year | CPU | Performance * | Disk Drive Type | Disk Drive Size | Disk Seek Plus Latency | Transfer Rate |
|---|---|---|---|---|---|---|
| 1976 | CDC 7600 | 25 MFLOPS ** | Cyber 819 | 80 MB | 24 ms | 3 MBps half duplex |
| 2002 | NEC Earth Simulator | 40 TFLOPS *** | Seagate Cheetah 10.6K RPM | 146 GB | 7.94 ms **** | 200 MBps full duplex ***** |
| 2010 | Oak Ridge Lab Jaguar | 1.75 PFLOPS ****** | Many vendors | 2 TB 3.5-inch 10.7K RPM SATA; 146 GB 2.5-inch 15K RPM SAS; 600 GB 2.5-inch 10K RPM SAS | ~13.6 ms write; ~5.3 ms write; ~7.5 ms write (Flash SSD average access time is 20 to 120 microseconds) | 800 MBps full duplex ******* |

* Though this might not be the best measure of throughput, it is a good comparison.
** Million floating-point operations per second
*** Trillion floating-point operations per second
**** Average seek and latency for read and write
***** Using FC RAID with 2 Gb interfaces and RAID-5 8+1
****** According to www.top500.org, 6/2010 (a new machine from China is expected to be faster and will be announced in mid-November)
******* Using RAID-5/6 8+1 or 8+2
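
To make the gap between compute and storage concrete, the improvement factors implied by the table can be worked out directly. The following is a minimal sketch rather than anything from the original article: the names `SYSTEMS`, `improvement`, and `summarize` are hypothetical, decimal units are assumed (1 GB = 1,000 MB), and the 2010 disk row is represented by its largest capacity (2 TB) and its lowest listed rotating-disk latency (~5.3 ms).

```python
# Figures taken from the comparison table above.
# Assumptions (not from the article): decimal units (1 GB = 1,000 MB), and the
# 2010 disk row is represented by its largest capacity (2 TB) and its lowest
# listed rotating-disk latency (~5.3 ms).
SYSTEMS = {
    1976: {"flops": 25e6, "capacity_mb": 80,
           "latency_ms": 24.0, "transfer_mbps": 3},
    2002: {"flops": 40e12, "capacity_mb": 146_000,
           "latency_ms": 7.94, "transfer_mbps": 200},
    2010: {"flops": 1.75e15, "capacity_mb": 2_000_000,
           "latency_ms": 5.3, "transfer_mbps": 800},
}

# Metrics where a smaller number means better performance.
LOWER_IS_BETTER = {"latency_ms"}


def improvement(metric, start_year, end_year):
    """Return how many times better `metric` became between the two years."""
    old = SYSTEMS[start_year][metric]
    new = SYSTEMS[end_year][metric]
    return old / new if metric in LOWER_IS_BETTER else new / old


def summarize(start_year, end_year):
    """Return the improvement factor for every metric in the table."""
    return {
        metric: improvement(metric, start_year, end_year)
        for metric in SYSTEMS[start_year]
    }


if __name__ == "__main__":
    for start_year, end_year in [(1976, 2002), (2002, 2010), (1976, 2010)]:
        print(f"{start_year} -> {end_year}")
        for metric, factor in summarize(start_year, end_year).items():
            print(f"  {metric:<14} {factor:,.1f}x")
```

Under these assumptions, CPU performance improves by a factor of about 70 million between 1976 and 2010, while disk capacity improves by a factor of 25,000, transfer rate by less than 300, and seek plus latency by less than five.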

Read the rest at Enterprise Storage Forum.
