Tuesday, April 13, 2021

Cache Management: Beware the Dangers

Data storage expert Henry Newman provides background information on caching issues, including what works well on the Web. Key point: the central issue is data access patterns for both the Web and local storage – what is being accessed and when.


More and more vendors are making wild claims about their appliances with Flash cache. Most RAID controller vendors and NAS providers are planning to add Flash to their product designs, which seems like a good idea in all cases, since Flash offers significantly more cache capacity than is available using standard DRAM. Because the storage stack latency for NAS is often greater than for SAN, a cache with a bit more latency usually does not impact performance.

All of this reminds me of something a late friend of mine, Larry Schermer, used to say back in the 1980s when he was working on some of the first Cray solid state disks (SSDs). He said, “Cache is good if you are reusing data or if it is large enough to handle the data being written to reduce latency. Otherwise, cache management will eat your lunch.”

The point is clear: if your data does not fit in cache, you are not doing small writes that can be coalesced, and you are not reusing data, then cache is not going to help much; it might actually hurt. The issues surrounding caching Web pages as opposed to caching other data are pretty interesting when viewed through Larry’s analysis framework. Over the last year, an increasing number of vendors have told me that their cache-based appliances work for all applications and dramatically increase I/O performance. I know that this is just not true for all applications.
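Larry’s rule is easy to check with a toy model. The following is a minimal sketch, not a model of any vendor’s appliance: it assumes a plain LRU replacement policy, and the function name, block counts, and workloads are all hypothetical values chosen for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, cache_blocks):
    """Return the fraction of accesses served from an LRU cache.

    `accesses` is a sequence of block numbers; `cache_blocks` is the
    cache capacity in blocks. Both are illustrative assumptions.
    """
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(accesses)


CACHE_BLOCKS = 500  # hypothetical cache size

# Hypothetical workloads (block numbers):
reuse_fits = list(range(100)) * 10       # small working set, reread 10 times
reuse_too_big = list(range(1000)) * 10   # working set twice the cache size
streaming = list(range(10_000))          # every block touched exactly once

for name, workload in [("reuse, fits", reuse_fits),
                       ("reuse, too big", reuse_too_big),
                       ("streaming", streaming)]:
    print(f"{name:15s} hit rate = {lru_hit_rate(workload, CACHE_BLOCKS):.2f}")
```

Under these assumptions, only the workload that both reuses data and fits in cache gets any hits. The cyclic scan over a working set larger than the cache misses on every access, because each block is evicted before it is read again, so the cache does the bookkeeping work and delivers nothing in return.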

The latest crop of storage appliances has been designed with more bandwidth between the cache and the hosts than between the cache and back-end storage. This means that if you are doing a streaming write from an application (or multiple applications) that does not fit in cache, the performance will be limited to the bandwidth between the cache and the back-end storage. All of this is pretty obvious to me, but clearly not to the vendors that are making these claims.
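The bandwidth argument above can be put into rough numbers. This is a back-of-the-envelope sketch under simplifying assumptions: the cache starts empty, it drains to the back end while it fills, and nothing else competes for bandwidth. The function name and all sizes and rates are hypothetical and do not come from any real product.

```python
def streaming_write_throughput(data_gb, cache_gb, front_gbps, back_gbps):
    """Effective host-visible throughput (GB/s) for one streaming write.

    Simple fluid model (an assumption, not a measured behavior): the cache
    fills at front_gbps while draining at back_gbps; once it is full, the
    host can only write as fast as the back end drains.
    """
    if front_gbps <= back_gbps:
        return front_gbps                   # cache never fills
    fill_time = cache_gb / (front_gbps - back_gbps)
    absorbed = front_gbps * fill_time       # data accepted before cache is full
    if data_gb <= absorbed:
        return front_gbps                   # whole write lands at host speed
    total_time = fill_time + (data_gb - absorbed) / back_gbps
    return data_gb / total_time


# Hypothetical appliance: 64 GB cache, 8 GB/s to hosts, 2 GB/s to back end.
for size_gb in (50, 500, 10_000):
    rate = streaming_write_throughput(size_gb, 64, 8.0, 2.0)
    print(f"{size_gb:>6} GB write -> {rate:.2f} GB/s effective")
```

In this model a write that fits in cache sees the full host-side bandwidth, while a write much larger than the cache converges on the back-end rate, no matter how fast the link between the hosts and the cache is.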

Read the rest at Enterprise Storage Forum.
