How to Select a Big Data Application

To be sure, big data solutions are in great demand. Today, enterprise leaders know that their big data is one of their most valuable resources — and one they can’t afford to ignore. As a result, they are looking for hardware and software that can help them store, manage and analyze their big data.

According to IDC, enterprises will likely spend $150.8 billion on big data and analytics in 2017, 12.4 percent more than in 2016. IDC expects that spending to increase at 11.9 percent per year through 2020, when revenues will likely top $210 billion.
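As a rough sanity check on those figures, the short sketch below compounds the 2017 estimate forward at the stated annual rate. It assumes the 11.9 percent rate applies uniformly to each of the three years from 2017 to 2020; the function name is illustrative and not part of any IDC methodology.

```python
def project_spending(base_billions: float, annual_growth: float, years: int) -> float:
    """Compound a base figure forward by a fixed annual growth rate."""
    return base_billions * (1 + annual_growth) ** years


# Figures quoted above: $150.8 billion in 2017, growing 11.9 percent per year.
projected_2020 = project_spending(150.8, 0.119, 2020 - 2017)
print(f"Projected 2020 spending: ${projected_2020:.1f} billion")
```

Under that assumption the projection comes out at roughly $211 billion, which is consistent with the forecast that revenues will top $210 billion.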

Much of that revenue is going toward big data applications. IDC forecasts that spending on software alone could exceed $70 billion in 2020. Spending is increasing particularly rapidly on non-relational analytic data stores (like NoSQL databases), which will likely grow 38.6 percent per year, and cognitive software platforms (like analytics tools with artificial intelligence and machine learning capabilities), which will likely grow 23.3 percent per year.

In order to capitalize on all that big data spending, vendors have slapped the “big data” label on a dizzying array of different products and services. That product proliferation can make it difficult for organizations to find the right big data applications to meet their needs. Experts suggest that a good way to start the process of selecting a big data application is to determine exactly what kind of application (or applications) you need.

Types of Big Data Applications

Enterprise software vendors offer a wide array of different types of big data applications. The kind of big data application that is right for you will depend on your goals.

For example, if you just want to expand your existing financial reporting capabilities with greater detail and depth, a data warehouse and business intelligence solution might be sufficient for your needs. If your sales and marketing teams want to use your big data to uncover new opportunities for increasing your revenue and margins, you might consider creating a data lake and/or investing in a data mining solution. If you want to create a data-driven culture where everyone in your organization uses data to guide decision-making, you might want a data lake, predictive analytics, an in-memory database and possibly streaming analytics as well.

Things get more complicated because the lines between the different types of tools can be fuzzy. Some business intelligence tools have data mining and predictive analytics capabilities, and some predictive analytics tools include streaming capabilities.

Your best approach is to define your goals clearly at the outset and then go looking for products that will help you reach those goals. The chart below offers an overview of some of the most common types of big data applications and how they can be useful in the enterprise.

[Chart: big data solutions]

Key Decisions When Selecting a Big Data Application

No matter which type of big data application you select, you’ll need to make some key decisions that will help you narrow down your options. Here are a few of the most important of these considerations:

On-premises vs cloud-based big data applications

The first big decision you’ll need to make is whether to host your big data software in your own data center or use a cloud-based solution.

Currently, more organizations seem to be opting for the cloud. “Global spending on big data solutions via cloud subscriptions will grow almost 7.5 times faster than on-premise subscriptions,” Brian Hopkins, Forrester vice president and principal analyst, wrote in an August 2017 blog post. “Furthermore, public cloud was the number one technology priority for big data according to our 2016 and 2017 surveys of data analytics professionals.”

Cloud-based big data applications are popular for several reasons, including scalability and ease of management. The major cloud vendors are also leading the way with artificial intelligence and machine learning research, which is allowing them to add advanced features to their solutions.

However, cloud isn’t always the best option. Organizations with high compliance or security requirements sometimes find that they need to keep sensitive data on premises. In addition, some organizations already have investments in existing on-premises data solutions, and they find it more cost effective to continue running their big data applications locally or to use a hybrid approach.

Proprietary vs open source big data applications

Some of the most popular big data tools, including the Hadoop ecosystem, are available under open source licenses. Forrester has estimated, “Firms will spend $800 million in Hadoop software and related services in 2017.”

One of the big appeals of Hadoop and other open source software is the low total cost of ownership. While proprietary solutions have hefty license fees and may require expensive specialized hardware, Hadoop has no licensing fees and can run on industry-standard hardware.

However, enterprises sometimes find it difficult to get the open source solutions up and running and configured for their needs. They may need to purchase support or consulting services, and organizations need to consider those expenses when figuring out total cost of ownership.
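To make that comparison concrete, here is a minimal sketch of a total-cost-of-ownership calculation that includes support and staffing alongside license fees. The cost categories, class name and every number are hypothetical and chosen only for illustration; real figures will vary widely by organization and vendor.

```python
from dataclasses import dataclass


@dataclass
class CostProfile:
    """Hypothetical cost inputs, all in the same currency unit."""
    upfront: float         # one-time hardware and setup costs
    annual_license: float  # license or subscription fees per year
    annual_support: float  # paid support or consulting per year
    annual_staff: float    # employee time spent running the tool per year


def total_cost_of_ownership(profile: CostProfile, years: int) -> float:
    """Sum one-time costs plus recurring costs over the evaluation period."""
    recurring = profile.annual_license + profile.annual_support + profile.annual_staff
    return profile.upfront + recurring * years


# Made-up numbers purely for illustration.
proprietary = CostProfile(upfront=200_000, annual_license=150_000,
                          annual_support=20_000, annual_staff=100_000)
open_source = CostProfile(upfront=80_000, annual_license=0,
                          annual_support=90_000, annual_staff=160_000)

for name, profile in [("proprietary", proprietary), ("open source", open_source)]:
    print(f"{name}: {total_cost_of_ownership(profile, years=3):,.0f}")
```

With these invented inputs the open source option still costs less over three years, but the gap is much smaller than the license fees alone would suggest, which is the point of counting support and staff time.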

Batch vs streaming big data applications

The earliest big data solutions, like Hadoop, processed batch data only, but enterprises increasingly find that they want to analyze data in real time. That has generated more interest in streaming solutions such as Spark, Storm and Samza.

Many analysts say that even if organizations don’t think they need to process streaming data today, streaming capabilities are likely to become standard operating procedure in the not-too-distant future. For that reason, many organizations are moving toward Lambda architecture, a data processing architecture that can handle both real-time and batch data.
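The idea behind a Lambda architecture can be sketched in a few lines: a batch layer periodically recomputes a view from the full event log, a speed layer keeps incremental results for events that arrived since the last batch run, and a serving layer merges the two at query time. The toy class below counts events per key in memory; the class and method names are hypothetical, and a real deployment would use distributed storage and processing engines rather than Python lists and counters.

```python
from collections import Counter


class LambdaCounter:
    """Toy Lambda-style pipeline that counts events per key."""

    def __init__(self) -> None:
        self.master_log: list[str] = []  # append-only record of all events
        self.batch_view: Counter = Counter()  # rebuilt periodically from the full log
        self.speed_view: Counter = Counter()  # incremental counts since the last batch run

    def ingest(self, key: str) -> None:
        """Send each new event to the log (for batch) and the speed layer (for low latency)."""
        self.master_log.append(key)
        self.speed_view[key] += 1

    def run_batch(self) -> None:
        """Batch layer: rebuild the view from the whole log, then reset the speed layer."""
        self.batch_view = Counter(self.master_log)
        self.speed_view.clear()

    def query(self, key: str) -> int:
        """Serving layer: merge the batch view with the real-time view."""
        return self.batch_view[key] + self.speed_view[key]


pipeline = LambdaCounter()
for page in ["home", "pricing", "home"]:
    pipeline.ingest(page)
pipeline.run_batch()     # batch view now covers the first three events
pipeline.ingest("home")  # arrives after the batch run, held only in the speed layer
print(pipeline.query("home"))
```

The final query combines two "home" events from the batch view with one from the speed layer, so recent data is visible without waiting for the next batch run.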

Characteristics to Look for in a Big Data Application

Once you have narrowed down your options, you’ll need to evaluate the big data applications you are considering. The criteria below include some of the most important factors to examine.

  • Integration with Legacy Technology – Most organizations already have existing investments in data management and analytics technology. Replacing that technology completely can be expensive and disruptive, so organizations often choose to look for solutions that can be used alongside their current tools or that can augment their existing software.
  • Performance – A 2017 Talend study found that real-time analytics capabilities were one of business leaders’ top IT priorities. Executives and managers need to be able to access insights in a timely manner if they are going to profit from those insights. That means investing in technology that can provide the speed they need.
  • Scalability – Big data stores get larger every day. Organizations need big data applications that not only perform quickly right now but also continue to perform quickly as data stores grow exponentially. This need for scalability is one of the key reasons why cloud-based big data applications have become very popular.
  • Usability – Organizations should also consider the “learning curve” for any big data applications that they intend to purchase. Tools with easy deployment, easy configuration, intuitive interfaces and/or similarity or integration with tools the organization already uses can provide tremendous value.
  • Visualization – As one source puts it, “Visualization and explorative data analysis for business users (known as data discovery) have evolved into the hottest business intelligence and analytics topic in today’s market.” Presenting data in charts and graphs makes it easier for human brains to spot trends and outliers, speeding up the process of identifying actionable insights.
  • Flexibility – The big data needs you have today are likely very different from the needs you will have in another year or two. That’s why many enterprises choose to look for tools that can serve a variety of different goals rather than tools that perform a single function very well.
  • Security – Much of the data included in those big data stores is sensitive information that would be highly valuable to competitors, nation-states or hackers. Organizations need to ensure that their big data has adequate protection to prevent the sorts of large data breaches that have recently been dominating headlines. That means looking either for tools that have security features like encryption and strong authentication built in or tools that integrate with your existing security solutions.
  • Support – Even experienced IT professionals sometimes find it difficult to deploy, maintain and use complex big data applications. Don’t forget to consider the quality and cost of the support available from the various vendors.
  • Ecosystem – Most organizations need a number of different applications to meet all of their big data needs. That means looking for a big data platform that integrates with a lot of other popular tools and a vendor with strong partnerships with other providers.
  • Self-Service Capabilities – The Harvey Nash KPMG CIO Survey 2017 found that sixty percent of CIOs consistently report talent shortages, with big data and analytics being the most in-demand skillset. Because there aren’t enough qualified data scientists to go around, organizations are looking for tools that other business professionals can use on their own. A recent Gartner blog post noted that in an average organization, about 32 percent of employees are using BI and analytics.
  • Total Cost of Ownership – The upfront costs of a big data application are only a small part of the picture. Organizations need to make sure they consider related hardware costs, ongoing license or subscription fees, employee time, support costs and any expenses related to the physical space for on-premises applications. Also keep in mind that cloud computing costs generally decrease over time.
  • Estimated Time to Value – Another important financial consideration is how quickly you’ll be able to get up and running with a particular solution. Most companies would prefer to see benefit from their big data projects within days or weeks rather than months or years.
  • Artificial Intelligence and Machine Learning – Finally, consider how innovative the various big data application vendors are. AI and machine learning research are advancing at an incredible rate and becoming a mainstream part of big data analytics. Forrester has predicted, “In 2017, investments in AI will triple as firms work to convert customer data into personalized experiences.” If you choose a vendor that isn’t on the cutting edge of this research, you may find yourself falling behind the competition.
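One common way to apply criteria like these is a weighted scoring matrix: rate each candidate on every criterion, weight the criteria by how much they matter to you, and compare the totals. The sketch below is one possible approach, not a prescribed method; the tool names, the 1-to-5 ratings and the weights are all invented for illustration, and only a handful of the criteria above are included.

```python
def weighted_score(ratings: dict[str, float], weights: dict[str, float]) -> float:
    """Return the weight-normalized average of per-criterion ratings."""
    total_weight = sum(weights.values())
    return sum(ratings[criterion] * weight for criterion, weight in weights.items()) / total_weight


# Hypothetical weights: higher means the criterion matters more to this organization.
weights = {"performance": 3, "scalability": 3, "security": 2, "usability": 1, "support": 1}

# Hypothetical ratings on a 1 (poor) to 5 (excellent) scale.
candidates = {
    "Tool A": {"performance": 4, "scalability": 5, "security": 3, "usability": 2, "support": 3},
    "Tool B": {"performance": 3, "scalability": 3, "security": 5, "usability": 5, "support": 4},
}

ranked = sorted(candidates, key=lambda name: weighted_score(candidates[name], weights), reverse=True)
for name in ranked:
    print(f"{name}: {weighted_score(candidates[name], weights):.2f}")
```

Because the weights encode priorities, changing them can change the ranking; an organization that weights usability and security more heavily would see Tool B come out ahead with these same ratings.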

Tips for Selecting a Big Data Application

Clearly, choosing the right big data application is a complicated process that involves a myriad of factors. Experts and organizations that have successfully deployed big data software offer the following advice:

  • Understand your goals – As previously mentioned, knowing what you want to accomplish is of paramount importance when choosing a big data application. If you aren’t sure why you are investing in a particular technology, your project is unlikely to succeed.
  • Start small – If you can demonstrate success with a small-scale big data analytics project, that will generate interest in using the tool throughout the company.
  • Take a holistic approach – While a small-scale project can help you gain experience and expertise with your technology, it’s important to choose an application that can ultimately be used throughout the business. Gartner advises, “To support a ‘data and analytics everywhere’ world, IT professionals need to create a new end-to-end architecture built for agility, scale and experimentation. Today, disciplines are merging and approaches to data and analytics are becoming more holistic and encompassing the entire business.”
  • Work together – That same blog post also notes, “Gartner recommends data and analytics leaders work proactively to spread analytics throughout their organization, to get the largest possible benefit from enabling data to drive business actions.” Many organizations are attempting to build a data-driven culture, and that requires a great deal of cooperation among business and IT leaders.
  • Go viral – Those previously mentioned self-service capabilities can also help with the creation of a data-driven culture. Gartner advises, “Enable analytics to truly go viral, within and outside the enterprise. Empower more business users to perform analytics by fostering a pragmatic approach to self-service and by embedding analytic capabilities at the point of data ingestion within interactions and processes.”
