A colleague was recently trying to convey to his client the best way to build an analytics strategy. The company had already begun the hard work of creating a data warehouse, data lake, data lab and data team, and it was preparing itself technically for the complexities and challenges that arise when an enterprise begins trying to extract new insights and information from its data.
When the team began to test its new capabilities, the conversation turned to the questions it needed to ask. What should the strategy be? How should we allocate our scarce resources – i.e., data scientists – to solve our problems?
It became clear the company didn’t understand what it would be doing with its data. This is why the terms “big data” and “analytics” have to go, in my opinion.
The trouble is, these terms are too broad and have been so widely misused that they have become catch-alls that have lost their meaning. Huge swaths of problems, from data reporting, processing and warehousing to distributed file systems, have been lumped under the category “big data,” while seemingly anything that involves math more complicated than counting is lumped under “analytics.” From a historical perspective, big data picks up where traditional databases left off, and analytics picks up where business intelligence (BI) left off.
When I think about the problems being solved in today’s businesses and with today’s compute techniques, I like to break it down into two categories: 1) math, and 2) systems that do math.
For the most part, the math we’re talking about here is statistics, and the hierarchy of complexity that we’re talking about goes like this:
counting → descriptive statistics → predictive statistics → prognostics and prescriptive statistics
These are the general (and not perfectly defined) branches of statistics that get lumped into analytics. These categories also imply a hierarchy of purpose, from simple accounting and reporting questions like “How many sales did we have last quarter?” to predictive questions like “How many sales will we have this quarter?” From “Are our sales affected by weather?” to “What will the weather be?” The increasing complexity of the questions requires increasingly complex statistics to answer.
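To make the hierarchy concrete, here is a minimal Python sketch that climbs the first three rungs, counting, describing and predicting, against a toy sales table. The figures and column names are invented purely for illustration; any spreadsheet-sized dataset would do.

```python
import numpy as np
import pandas as pd

# Invented quarterly sales figures, purely for illustration.
sales = pd.DataFrame({
    "quarter": [1, 2, 3, 4, 5, 6, 7, 8],
    "units":   [120, 135, 128, 150, 162, 158, 171, 185],
})

# Counting: "How many sales did we have last quarter?"
print("Last quarter:", sales["units"].iloc[-1])

# Descriptive statistics: summarize what has already happened.
print("Mean:", sales["units"].mean())
print("Quartiles:")
print(sales["units"].quantile([0.25, 0.5, 0.75]))

# Predictive statistics: fit a linear trend and extrapolate one quarter out.
slope, intercept = np.polyfit(sales["quarter"], sales["units"], deg=1)
print("Forecast for quarter 9:", slope * 9 + intercept)
```

Prescriptive statistics would take the last step further, turning the forecast into a recommended action, such as how much inventory to order.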
There are generalized categories of systems that map back to this hierarchy, too. BI tools that primarily report on data focus on counting and describing it with measures like averages and quartiles. Analysis packages deliver more advanced tests like regressions, logistic models and ANOVA. And data science packages can be used to build even more advanced models that incorporate machine learning: clustering, neural networks, natural language processing and the beginnings of artificial intelligence (AI). That progression of tools starts out looking a lot like using an app but ends up looking a lot more like software development as complexity grows. Note, too, that the analytics hierarchy above ends in prognostics and prescriptive statistics and does not include machine learning. Despite the hype they are receiving, machine learning and neural networks are methods of prediction: a means to an end, not the end itself.
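As a rough sketch of those upper rungs, the snippet below uses scikit-learn, one of many data science packages; the customer features and churn labels are entirely made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

# Invented customer features: [monthly_spend, visits_per_month].
X = np.array([[20, 1], [25, 2], [200, 8], [220, 10], [90, 4], [95, 5]])

# "Analysis package" rung: a logistic model predicting churn (labels invented).
churned = np.array([1, 1, 0, 0, 0, 1])
model = LogisticRegression().fit(X, churned)
print("Churn probabilities:", model.predict_proba(X)[:, 1].round(2))

# "Data science package" rung: unsupervised clustering to segment customers.
segments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print("Customer segments:", segments)
```

In both cases the model is a means of answering a business question, who is likely to churn, which customers behave alike, rather than an end in itself.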
The ambiguity of the term “big data” should be obvious from the get-go. Are we talking about the files? The systems? The problems? Yes, and more.
Big data has come to mean the collection of systems, techniques, statistical methods and technologies that are used to handle the gathering, storage and processing of information represented by data (my definition). In the last six years, I’ve seen it refer to all of those things. The important takeaway from that definition is that it is always focused on how to do things, and not why to do them or even what to do.
Whether we call it “big data” or not, the various components of this category always come after the problem has been defined and the strategy has been set to solve it. Big data is a tactical category, an arsenal that businesses now have to work with, but it won’t tell you what to do.
The technology for collecting and processing data has matured over the last half decade and is still changing almost daily. The advances in neural networks alone show a great deal of promise for handling a wide variety of information that, in the past, would have taken enormous effort to sift through. But in those places where the rubber meets the road – in the data centers, data labs and boardrooms – we need to abandon the catch-all terms and talk in ways that will help us figure out what we need to know, why we want to know it and what we plan to do with the answers.
Don’t talk about analytics and big data; talk about what the business needs to know and how it is going to find it out.
Alex Bakker is a principal analyst at ISG, a global technology research and advisory firm.