Software has stolen the spotlight from hardware over the last few years in the world of storage.
Hardware, to some degree, has become more commoditized, certainly at the lower end. It is the software that supplies the secret sauce in the form of added functionality and features.
Here are some of the top trends in the storage software market:
1. Data access requirements have changed
The industry focus for many years has been on how to store rapidly growing mountains of unstructured data.
Now organizations are looking at how to monetize that data. That means finding ways to cost-effectively store much more data without bogging it down in difficult-to-access databases and repositories.
“Monetizing unstructured data means it needs to be liberated from data silos for retention, operationalized with actionable metadata, and made available to applications and users wherever they reside,” said Floyd Christofferson, VP of product marketing, Hammerspace.
2. Addressing incompatibility
Incompatibility has been an ongoing challenge in storage.
Data is stored under many different operating systems (OSs). Storage vendors have also issued their own storage operating systems and platforms, each using different standards, protocols, and languages. It has been far from easy to tie the pieces together.
“The emerging trend is software that can bridge the gaps between incompatible file systems and vendor silos to enable a holistic access to and management of data via shared and extensible metadata across any on-premises or cloud storage type,” said Christofferson with Hammerspace.
“In this way, data becomes a more easily manageable and exploitable resource.”
3. AI and analytics
Greater compatibility and data access have opened many doors in the fields of machine learning (ML), artificial intelligence (AI), and analytics.
No longer must analytics teams wrestle with data formats, attempt to shift data around into business intelligence (BI) applications, and otherwise waste most of their time trying to corral the data they wish to explore and gain insight from.
“AI/ML applications can now leverage global cross-silo access without the need to consolidate data into a single repository,” said Christofferson with Hammerspace.
“Data protection and other services can be globalized across silos as well, reducing operational expenses, and increasing user/application productivity with persistent access.”
4. Higher-performance storage
The wider use of AI is having an impact on storage software.
Users want to achieve the best possible performance, and they are looking to gain that from both the hardware side and the software side.
“AI workloads in the cloud require cloud storage to optimize for high performance and throughput,” said Bin Fan, VP of open source and founding engineer, Alluxio.
Gartner predicts that by 2024, 70% of enterprises will use cloud and cloud-based AI infrastructure to operationalize AI.
However, although cloud-based AI platforms are being adopted, building optimal storage systems for them is challenging, Fan said. AI workloads are usually data- and metadata-intensive, requiring storage systems to have high performance, high concurrency, and high data throughput.
“While designing AI platform architecture in the cloud, organizations should consider these storage requirements upfront and choose software-defined storage solutions for scalability and cost-effectiveness,” Fan said.
5. Container-native storage
Container-native storage is becoming increasingly popular among enterprises, according to Gou Rao, CTO, Portworx by Pure Storage.
There are three primary database or data state use cases for container-native storage in an enterprise application:
- Transactional databases, such as Oracle and SQL Server
- Large data lakes or data warehouses for analytics and post-processing
- Near-line data sources, such as message queues (Kafka), SQL/NoSQL (Postgres, Mongo), file and object, and searchable databases (Elastic)
Rao recommended that the first two be kept outside of any container platform, since they really don’t have an agile and dynamic management requirement. However, the third use case, near-line data sources, is more closely tied with the front-end application stack and is subject to more frequent updates, schema changes, and duplication of deployments.
“The truth is, there are a lot of database deployments at the near-line tier with frequent enough changes that require constant control directly by the development team,” Rao said.
“Since the container-based (cloud-native or stateless) applications, which are managed by Kubernetes, directly depend on this tier of databases, it makes more sense to have these resources then directly controlled by a common control plane, the Kubernetes control plane. Over the next few years, expect to see Kubernetes offer a richer and more native database-as-a-service experience.”