Wednesday, December 4, 2024

Cache Management: Beware the Dangers


Data storage expert Henry Newman provides background information on caching issues, including what works well on the Web. Key point: the central issue is data access patterns for both the Web and local storage – what is being accessed and when.


More and more vendors are making wild claims about their appliances with Flash cache. Most RAID controller vendors and NAS providers are planning to add Flash to their product designs, which seems like a good idea in all cases because Flash offers significantly more cache capacity than is available with standard DRAM. Since the storage stack latency for NAS is often greater than for SAN, having cache with a bit more latency usually does not impact performance.

All of this reminds me of something a late friend of mine, Larry Schermer, used to say back in the 1980s when he was working on some of the first Cray solid state disks (SSD). He said, “Cache is good if you are reusing data or if it is large enough to handle the data being written to reduce latency. Otherwise, cache management will eat your lunch.”

The point is clear: if your data does not fit in cache, you are not doing small writes that can be coalesced, and you are not reusing data, then cache is not going to help much, and it might actually hurt. The issues surrounding caching Web pages as opposed to caching other data are pretty interesting when viewed through Larry’s analysis framework. Over the last year, an increasing number of vendors have told me that their cache-based appliances work for all applications and dramatically increase I/O performance. I know that this is just not true for all applications.
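Larry's rule can be illustrated with a small simulation. The sketch below is not taken from any vendor's product; it assumes a plain least-recently-used (LRU) replacement policy and a hypothetical helper named `lru_hit_rate`, and it compares a workload that reuses a small working set against workloads that stream or loop over more data than the cache can hold.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Return the fraction of accesses served from an LRU cache.

    `accesses` is a sequence of block IDs; `capacity` is the number of
    blocks the cache can hold. Both are illustrative assumptions.
    """
    if not accesses:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used
    return hits / len(accesses)


CACHE_BLOCKS = 100

# Reuse: a 50-block working set read 20 times fits in cache.
reuse = list(range(50)) * 20
# Streaming: 1,000 distinct blocks, each touched once.
streaming = list(range(1000))
# Looping over 150 blocks: slightly too big for a 100-block cache.
oversized_loop = list(range(150)) * 10

print("reuse:         ", lru_hit_rate(reuse, CACHE_BLOCKS))
print("streaming:     ", lru_hit_rate(streaming, CACHE_BLOCKS))
print("oversized loop:", lru_hit_rate(oversized_loop, CACHE_BLOCKS))
```

Under these assumptions the reused working set is served almost entirely from cache, while the streaming and oversized-loop patterns get no hits at all: every access pays the cost of cache management without any benefit. Real controllers use more elaborate policies, so the numbers here are only a rough picture of the effect.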

The latest crop of storage appliances has been designed in a way that puts more bandwidth between the cache and the hosts than between the cache and back-end storage. This means that if you are doing a streaming write from an application (or multiple applications) that does not fit in cache, sustained performance will be limited to the bandwidth between the cache and the back-end storage. All of this is pretty obvious to me, but clearly not to the vendors that are making these claims.
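The bandwidth argument above can be put into rough numbers. The following is a simplified model, not a description of any specific appliance: it assumes a write-back cache that drains to back-end storage at a constant rate while the host writes at a constant rate, and the function name and the example figures (100 GB of cache, 10 GB/s host side, 2 GB/s back end) are made up for illustration.

```python
def streaming_write_time(total_gb, cache_gb, front_gbps, back_gbps):
    """Seconds until the host finishes writing `total_gb` of data.

    Simplified model (all parameters are assumptions):
    - the host writes into cache at `front_gbps` (GB/s),
    - the cache drains to back-end storage at `back_gbps` (GB/s),
    - once the cache is full, the host is throttled to `back_gbps`.
    """
    if front_gbps <= back_gbps:
        # The back end keeps up, so the cache never fills.
        return total_gb / front_gbps
    # The cache fills at the difference between inflow and outflow.
    fill_time = cache_gb / (front_gbps - back_gbps)
    absorbed_gb = front_gbps * fill_time  # data accepted at full speed
    if total_gb <= absorbed_gb:
        return total_gb / front_gbps
    # The remainder goes at the back-end rate.
    return fill_time + (total_gb - absorbed_gb) / back_gbps


for size_gb in (50, 1_000, 10_000):
    seconds = streaming_write_time(size_gb, cache_gb=100,
                                   front_gbps=10, back_gbps=2)
    print(f"{size_gb:>6} GB: {size_gb / seconds:.2f} GB/s effective")
```

In this model a write small enough to be absorbed runs at the host-side rate, but as the stream grows well beyond the cache size the effective throughput approaches the back-end rate, regardless of how fast the path between host and cache is.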

Read the rest at Enterprise Storage Forum.
