Comparing big data solutions with an eye toward finding the best one for your business has become critical: in recent years, big data solutions have gone from hot emerging technology to an essential part of everyday business.
In 2015, Gartner dropped big data from its Hype Cycle report, and explained the decision by observing, "Big data isn't obsolete. It's normal."
According to the NewVantage Partners "Big Data Executive Survey 2016", 62.5 percent of organizations now have at least one big data solution running in production. That's more than twice as many as in 2013. And today 69.9 percent of organizations say that big data is very important or critical to their companies' growth.
While big data solutions have become popular, they remain complex. And because vendors are likely to slap the "big data" label on almost any data-related product, enterprises and small businesses need to be careful to ensure that the solutions they purchase will actually meet their needs. With data scientists in short supply, organizations also need to ensure that their existing staff has the skills necessary to make use of any software they buy.
So how should organizations compare big data solutions for their needs? Experts offer nine key tips:
Some companies have fallen into the trap of adopting a big data solution before they know how they are going to use it or why they are going to use it. To avoid this mistake, Sean Anderson, product marketing, Cloudera, recommends that companies "define and organize around a business case." He says, "Technology can be very compelling but useless ultimately if it is not solving a business problem."
Once everyone involved understands the use case driving the decision to purchase a solution, it becomes much easier to select the product that will be the best fit for the organization's needs. Instead of just looking for a product that checks a lot of boxes, companies can focus on selecting a vendor that can help them accomplish their objectives.
"Many customers get into a feature and functionality bake-off, when in reality you need to think about how you are going to partner with a vendor to ensure your success in bringing an analytics offering to market," explains Roman Stanek, CEO and founder at GoodData. He adds, "As opposed to thinking strictly about individual features, consider the wealth of expertise and knowledge a vendor can bring to your partnership."
Stanek says that the most important question a company can ask their big data vendor is "How are you going to help me or allow me to create value from my data assets?" In addition, he advises, "Consider how you are going to productize the analytics solution to turn it into a profit center for your business. Work backwards as you would with any new product or feature you are going to introduce to your product portfolio."
The fundamental issue that launched the "big data" trend is the sheer amount of data that most organizations must store and manage. According to IDC's Digital Universe study, the volume of data stored in all of Earth's digital systems is increasing 40 percent per year. And rapidly growing companies may experience even faster data growth.
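To see how quickly that rate compounds, here is a quick back-of-the-envelope projection. The 100 TB starting point is a hypothetical example, not a figure from the study:

```python
# Projecting storage needs at IDC's cited ~40 percent annual growth rate.
# The 100 TB starting volume is an invented example for illustration.
volume_tb = 100.0
growth_rate = 0.40

for year in range(1, 6):
    volume_tb *= 1 + growth_rate
    print(f"Year {year}: {volume_tb:,.1f} TB")
```

At that rate, a 100 TB data store grows more than fivefold in five years, which is why scalability questions belong at the front of any evaluation.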
With this reality in mind, Jeff Healey, director of product marketing for Vertica at Hewlett Packard Enterprise, says that it's "important to choose a technology that can scale as you grow and position you for success – as your business changes and data insight moves beyond just experimental."
Importantly, organizations need to make sure that the solution they choose will continue to offer the levels of performance they need as data stores get larger. "Most of the big data problems have their roots in scalability," notes Kiran Kamreddy, senior product manager at Teradata. "The 'big' volumes create performance issues. A big data solution [must] scale well as the data volumes increase rapidly and deliver acceptable levels of performance."
When discussing big data, people often refer to the "three Vs": volume, velocity and variety. Of those three, the variety of data is often the most difficult for enterprises to handle. In the NewVantage Partners report, 40 percent of those surveyed said that data variety is the primary technical driver for their big data investments, compared to just 14.5 percent who said the same about volume and 3.6 percent who selected velocity as their primary issue.
Kamreddy explains, "Big data means a large variety of data and analytical paradigms, so a big data solution must not be too rigid and be open to handle a lot of variety." That means the solution should support both structured and unstructured data in a variety of formats, and it should support Hadoop and other common big data tools.
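As a minimal sketch of what "handling variety" means in practice, the snippet below normalizes a structured CSV source and a semi-structured JSON source into one common record shape. The data, field names, and record shape are invented for illustration; a production pipeline would more likely use tools such as Spark or Hive than the standard library:

```python
# Normalize structured (CSV) and semi-structured (JSON) records into one
# common shape, so downstream analytics sees a single schema.
# All sample data and field names below are hypothetical.
import csv
import io
import json

csv_source = "user_id,amount\n1,9.99\n2,24.50\n"                      # structured
json_source = '[{"user_id": 3, "amount": 5.00, "tags": ["promo"]}]'   # semi-structured

records = []
for row in csv.DictReader(io.StringIO(csv_source)):
    # CSV rows lack the optional "tags" field, so default it to an empty list.
    records.append({"user_id": int(row["user_id"]),
                    "amount": float(row["amount"]),
                    "tags": []})
for row in json.loads(json_source):
    records.append({"user_id": row["user_id"],
                    "amount": row["amount"],
                    "tags": row.get("tags", [])})

print(f"{len(records)} unified records")
```

The point of the sketch is the design choice, not the tooling: a solution that is "open to handle a lot of variety" lets you map disparate formats onto a shared analytical view rather than forcing every source into one rigid schema up front.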
Just because you are investing in a big data solution doesn't mean that you will be getting rid of your existing storage, data management and analytics tools. "I think some organizations view big data as something that is radically and completely new, and they need to rip and replace their existing data investments," says Kamreddy. "That is not completely true. It is important to evaluate their current solutions and why/how they are falling short of their big data needs and requirements and how the new solutions augment the existing ones. They should also think about the integration and configuration features of the big data solutions and how easy or difficult it is for them to integrate with existing solutions and data types."
Companies should also consider the burden that the new technology will place on IT staff. "Is your big data solution going to force your analysts and BI professionals to learn new tools or limit the tools they can use?" asks Anderson. "This is a surefire way to bolster low buy-in for big data projects. Be sure to choose a vendor that works with popular tooling for ETL, data visualization, data management, analytics, BI, etc. on premise and in the public cloud."
Leveraging the resources you already have can also help to keep expenses low.
Cost is always a factor in any technology purchase, and many big data initiatives are driven in part by a need to lower the cost of storing, managing, maintaining and analyzing data stores. However, determining the total bill can be a tricky process that involves estimating ongoing operational expenses as well as a careful analysis of the hard costs.
Anderson cautions organizations to make sure that they are calculating the complete cost of the project. "This includes the technology but also the skills, administration, and professional services/support costs."
Kamreddy agrees, noting that overall total cost of ownership (TCO) is one of the most important considerations for selecting a big data solution. He recommends that organizations evaluate "the ROI/TCO implications of each option, in the light of solving the business case and value delivered."
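A toy comparison illustrates the kind of TCO arithmetic Kamreddy describes. Every figure below is an invented assumption for the sake of the example, not real vendor pricing, and a real evaluation would also weigh the value delivered against the business case:

```python
# Toy three-year TCO comparison for two hypothetical big data options.
# All cost figures are illustrative assumptions, not real vendor pricing.

def three_year_tco(license_per_year, infrastructure, staffing_per_year,
                   services_one_time, years=3):
    """Sum hard costs and ongoing operational costs over the window."""
    return (license_per_year * years
            + infrastructure
            + staffing_per_year * years
            + services_one_time)

on_prem = three_year_tco(license_per_year=120_000, infrastructure=300_000,
                         staffing_per_year=180_000, services_one_time=50_000)
cloud = three_year_tco(license_per_year=200_000, infrastructure=0,
                       staffing_per_year=120_000, services_one_time=20_000)

print(f"On-premises 3-year TCO: ${on_prem:,}")
print(f"Cloud 3-year TCO:       ${cloud:,}")
```

Note that the cheaper headline license is not the cheaper option here once staffing, infrastructure and services are included, which is exactly Anderson's point about calculating the complete cost of the project.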
Many of the most popular big data solutions, such as Apache Hadoop and its related projects, are available under open source licenses. So organizations frequently go looking for commercially supported options that are based on these open source projects.
"Open source solutions are great because they give the ecosystem great velocity to meet new customer demands, but they are all adopted and supported to varying degrees," notes Anderson.
Choosing a solution that is based on open standards can give organizations greater agility and the freedom to switch vendors as needs change. Healey advises organizations to look for solutions that are "based on open SQL standards with in-database analytical functions for machine learning, IoT sensor data analytics, pattern matching, and more." He also notes, "The ability to natively integrate with open source and complementary technology helps to avoid vendor lock-in and affords ultimate flexibility."
As big data solutions have matured, most have improved their security features. According to IDG, "Confidence in security solutions and products for company data rises, increasing from 49% in 2014 to 66% [in 2015]."
Still, organizations should make sure that any solution they purchase meets their security and compliance requirements. Anderson warns, "Most solutions address security in their own unique way but it’s very rare that a solution covers security from end to end, ensuring data is being protected during ingest, analysis, and when served via online applications."
Experts say that big data becomes most useful when a wide range of people within the organization can access big data insights. Look for tools that don't require users to be experts in data science in order to use them.
"Enabling a data-driven culture across the organization means opening up these systems to self-service access and discovery of data — making data and analytics usage ubiquitous to all users," says Ritika Gunnar, vice president of big data and analytics solutions, IBM Analytics.
"The focus is really about how to make sense and interpret value from all data - how to turn the 'ha-dump' of data into something meaningful and of interest," adds Gunnar. "Organizations need to look for solutions that essentially make big data simple."
Finally, when selecting a big data solution, it never hurts to get some customer references from the vendors you are considering. Healey recommends, "Ask vendors specifically for examples of how reference customers have started small, grown without pain, incorporated key complementary technologies, and have succeeded across multiple analytical use cases."
If you contact other customers, they may also be able to provide advice and tips gleaned from their experience deploying a big data solution. That, in turn, can help your vendor selection and solution deployment process run more smoothly.