Data compression is the process of encoding information to reduce bit rate and shrink data size.
As companies and organizations increasingly archive old databases instead of deleting them, data compression lets them store more data in less space.
Read below to learn more about the data compression market.
The Market for Data Compression
The global data compression market was valued at $3.01 billion in 2020. With a compound annual growth rate of 5.2%, it’s expected to reach $4.51 billion by the year 2026.
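The compound-growth arithmetic behind a projection like this can be sketched in a few lines. This is a minimal illustration, not the forecaster's model: the function names are hypothetical, and the six-year horizon (2020 to 2026) is an assumption, since the forecast window is not stated here. The output therefore need not reproduce the quoted figures exactly.

```python
def project_value(start_value: float, annual_rate: float, years: int) -> float:
    """Compound a starting value forward at a fixed annual growth rate."""
    return start_value * (1 + annual_rate) ** years


def implied_cagr(start_value: float, end_value: float, years: int) -> float:
    """Annual growth rate that takes start_value to end_value over `years`."""
    return (end_value / start_value) ** (1 / years) - 1


if __name__ == "__main__":
    # Market figures quoted in the article, in billions of dollars.
    # Assumption: the forecast spans six full years (2020 to 2026).
    years = 6
    print(f"Projected value: ${project_value(3.01, 0.052, years):.2f} billion")
    print(f"Implied CAGR: {implied_cagr(3.01, 4.51, years):.1%}")
```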
The leading countries in the data compression market are the U.S., Germany, the U.K., France, and Norway.
Industries that generate and store massive amounts of user data and online activity are also the biggest buyers of data compression services:
- Finance and banking
- Media and entertainment
- Health care
- Social media
- Communications
- Retail and e-commerce
Data Compression Features
Data compression re-encodes files so they take up fewer bits, making them smaller and more manageable. This is done with a variety of compression algorithms, formats, and tools, such as 7-Zip, JPEG, ZIP, and bzip2.
While the use of compression algorithms differs depending on the application, data type, and speed and access requirements, data compression can be split into two main categories: lossy compression and lossless compression.
Lossy Compression
Most commonly used on videos, images, and audio files, lossy compression reduces a file’s size by permanently discarding information judged unnecessary, such as detail a viewer or listener is unlikely to notice.
Lossy compression, however, falls short at high compression levels, where the quality of the data is noticeably affected. It also can’t be used to compress files with critical data such as spreadsheets, where every data point is important.
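To make the trade-off concrete, here is a minimal sketch of one lossy step: quantization, which keeps only the most significant bits of each 8-bit sample. It is an illustration only, not how JPEG or any particular codec works, and the function names are hypothetical. The point is that the discarded bits cannot be recovered afterward.

```python
def quantize(samples: bytes, keep_bits: int) -> bytes:
    """Keep only the top `keep_bits` bits of each 8-bit sample (lossy)."""
    if not 1 <= keep_bits <= 8:
        raise ValueError("keep_bits must be between 1 and 8")
    # Build a mask such as 0xF0 for keep_bits=4; the low bits are zeroed.
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(sample & mask for sample in samples)


def max_error(original: bytes, degraded: bytes) -> int:
    """Largest per-sample difference introduced by the lossy step."""
    return max((abs(a - b) for a, b in zip(original, degraded)), default=0)


if __name__ == "__main__":
    original = bytes(range(256))  # every possible 8-bit value once
    degraded = quantize(original, keep_bits=4)
    print("distinct values before:", len(set(original)))
    print("distinct values after:", len(set(degraded)))
    print("largest per-sample error:", max_error(original, degraded))
```

Fewer distinct values means less information to store, which is where the size savings come from. It is also why this approach is unsuitable for a spreadsheet, where a changed digit is an error and not a barely visible artifact.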
Lossless Compression
Instead of discarding data from the file, lossless compression encodes redundant information more compactly, so the original can be reconstructed exactly. It’s used in formats such as PNG images, FLAC audio, and ZIP files.
The downside of lossless compression is that it typically reduces a file’s size less than lossy compression does. The resulting files use more storage and take longer to load than their lossy counterparts.
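A short round trip shows what lossless means in practice. This sketch uses Python's standard `zlib` module (a DEFLATE implementation, the same family of compression commonly used inside ZIP files). The sample data and the helper name `lossless_round_trip` are made up for illustration.

```python
import zlib


def lossless_round_trip(data: bytes):
    """Compress with zlib, then decompress, returning both results."""
    compressed = zlib.compress(data, 9)  # 9 = highest compression level
    restored = zlib.decompress(compressed)
    return compressed, restored


if __name__ == "__main__":
    # Hypothetical, highly repetitive spreadsheet-style data.
    data = b"quarter,region,revenue\n" + b"Q1,EMEA,1000\n" * 500
    compressed, restored = lossless_round_trip(data)

    # Every byte comes back unchanged; only the stored size differs.
    assert restored == data
    print("original bytes:", len(data))
    print("compressed bytes:", len(compressed))
```

How much the data shrinks depends on how redundant it is. Repetitive input like this compresses well, while data that is already compressed or random may barely shrink at all.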
Benefits of Data Compression
The primary benefit of data compression is reducing file and database sizes for more efficient storage in data warehouses, data lakes, and servers.
It lets organizations hold more information in the same storage without raising costs or scaling up infrastructure.
Other data compression benefits include:
- Reducing required storage hardware capacity
- Reducing transmission time
- Faster data transfer
- Reducing required communication bandwidth
- Faster loading
- Faster file writing and reading
- Faster analysis and insight extraction
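The transmission benefits in the list above follow from simple arithmetic: transfer time is roughly file size divided by bandwidth. The sketch below is a back-of-the-envelope estimate only. The file size, link speed, and 50% size reduction are hypothetical, and it ignores protocol overhead and the time spent compressing and decompressing.

```python
def transfer_seconds(size_bytes: float, bandwidth_bits_per_second: float) -> float:
    """Idealized time to send `size_bytes` over a link of the given speed."""
    if bandwidth_bits_per_second <= 0:
        raise ValueError("bandwidth must be positive")
    return size_bytes * 8 / bandwidth_bits_per_second


if __name__ == "__main__":
    # Hypothetical figures: a 10 GB file on a 100 Mbit/s link,
    # compressed to half its original size.
    original_size = 10e9
    link_speed = 100e6
    compressed_size = original_size * 0.5

    print("uncompressed:", transfer_seconds(original_size, link_speed), "seconds")
    print("compressed:", transfer_seconds(compressed_size, link_speed), "seconds")
```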
That said, file size is more of an operational concern than an analytical one. “From the standpoint of extracting insights from large datasets, the file sizes of those datasets are utterly and entirely irrelevant,” said Kalev Leetaru, internet entrepreneur in AI and big data. “From an operational standpoint it is important to know the compressed and uncompressed sizes of a dataset for network and storage planning and resource allocation, but for the actual analysis itself, what matters is how much relevant data that dataset actually contains.”
Data Compression Use Cases
Theoretical work on data and file compression began in the late 1940s. As the techniques grew more complex over the decades, companies started using third-party services and tools to compress their data into manageable sizes.
Medicat
Medicat designs and provides healthcare information technology (HIT) to universities and colleges. Operating in 46 states and three countries, Medicat handles reminders, reporting, and counseling for students’ health.
With a rapidly growing clientele, Medicat needed a way to adapt to the demand before it reached the limit of its IT infrastructure. Teaming up with IBM Business Partner Dynamix Group to deploy VersaStack, Medicat was able to use built-in automation and data compression capabilities to optimize performance and reduce infrastructure requirements.
“VersaStack stood out from competing options as it combines proven technology from two vendors with unparalleled track records,” said Jonathan Cox, Director of Technology Service at Medicat. “If we surpass our growth predictions, we can easily scale the solution both horizontally and vertically, assuring that we can meet unknown future challenges with ease.”
AstraZeneca
AstraZeneca is a British-Swedish pharmaceutical and biotechnology company. It focuses on researching, discovering, and developing prescription medicines in oncology, as well as studying and mapping genomes.
Working with genetic sequences and complex chemical compounds results in massive amounts of data that AstraZeneca needs to keep track of. Partnering with PetaGene, AstraZeneca used PetaSuite to compress the genomics datasets for its Center for Genomics Research (CGR), resulting in an average data size reduction of 76%.
“AstraZeneca’s Center for Genomics Research has the bold ambition to analyze up to two million genomes by 2026,” said Slavé Petrovski, vice president and head of genome analytics and bioinformatics, discovery sciences, R&D at AstraZeneca. “Minimizing the storage footprint and transfer time of genome data while maximizing data access and compute processing is a necessity to enable us to achieve our ambition.”
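For a sense of scale, a 76% average size reduction can be converted into a compression ratio and a storage footprint. This is a minimal sketch with hypothetical function names. The 100 TB starting size is an invented example, not a figure from AstraZeneca or PetaGene.

```python
def compressed_size(original_size: float, reduction: float) -> float:
    """Size remaining after a fractional reduction (0.76 means 76% smaller)."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return original_size * (1 - reduction)


def compression_ratio(reduction: float) -> float:
    """Original-to-compressed ratio implied by a fractional reduction."""
    if not 0 <= reduction < 1:
        raise ValueError("reduction must be in the range [0, 1)")
    return 1 / (1 - reduction)


if __name__ == "__main__":
    reduction = 0.76  # average reduction reported in the case study
    hypothetical_tb = 100  # assumed dataset size, for illustration only
    print(f"ratio: {compression_ratio(reduction):.2f}:1")
    print(f"footprint: {compressed_size(hypothetical_tb, reduction):.1f} TB")
```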
Data Compression Providers
A few of the leading players in the data compression market include:
- UtopiaCompression
- PetaGene
- IBM
- Exasol
- Opera Software
- Mentis Sciences
- VividQ
- Hydrolix
- Wandera
- Dotphoton SA