Open source. As Big Data gathers momentum, the focus is on open-source tools that help break down and analyze data. Hadoop and NoSQL databases are the winners here, while any proprietary technologies are frowned upon. This seems almost like a forced move in chess; after all, how can you justify creating a platform that unlocks data from various proprietary data siloes only to lock it back up again?
Market segmentation. Plenty of general-purpose Big Data analytics platforms have hit the market, but expect even more to emerge that focus on specific niches, such as drug discovery, CRM, app performance monitoring and hiring. Were the market more mature, it would make sense to build vertical-specific apps on top of general analytics platforms. That's not how it's happening, unless you consider the underlying data technology (Hadoop, NoSQL) as the general-purpose platform.
Expect more vertical-specific tools to emerge that target analytic challenges common to businesses in sectors such as shipping, marketing and online shopping, along with tasks such as social media sentiment analysis.
Predictive analytics. Modeling, machine learning, statistical analysis and Big Data are often thrown together in the hopes of predicting future events and behaviors. Some things are pretty easy to predict, such as how bad weather can suppress voter turnout, while other predictions are fairly hard to pin down, such as the point when swing voters get alienated rather than influenced by push polls.
However, as data accumulates, we gain the ability to run large-scale experiments on a continuous basis. Online retailers redesign shopping carts to find out which design produces the most sales. Doctors are able to estimate future disease risks based on factors such as diet, family history and the amount of exercise you get each day.
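The shopping-cart experiment described above is often evaluated with a simple significance test. The following is a minimal sketch, assuming a two-proportion z-test is an acceptable way to compare two cart designs; the function name and the visitor and sales counts are hypothetical and are not taken from any retailer.

```python
import math


def two_proportion_z_test(sales_a, visitors_a, sales_b, visitors_b):
    """Compare the conversion rates of two cart designs.

    Returns (z, p_value), where p_value is the two-sided p-value
    under the normal approximation.
    """
    rate_a = sales_a / visitors_a
    rate_b = sales_b / visitors_b
    # Pooled rate under the assumption that both designs perform the same.
    pooled = (sales_a + sales_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical numbers: design A is the old cart, design B the redesign.
z, p = two_proportion_z_test(sales_a=500, visitors_a=10_000,
                             sales_b=560, visitors_b=10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference between the designs is unlikely to be chance alone. It says nothing about why one design sells more, which is where human judgment comes back in.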
We've been making these sorts of predictions since the dawn of human history, of course. However, in the past, many predictions were based on gut feelings, incomplete data sets or common sense. (Common sense, after all, is what tells us that the world is flat.)
Of course, just because you have plenty of data to base predictions on doesn't mean they'll be correct. Plenty of hedge fund managers and Wall Street traders analyzed market data in 2007 and 2008 and thought that the housing bubble would never burst. Historical data predicted that the bubble would burst, but many analysts wanted to believe things were different this time.
On the other hand, predictive analytics has been catching on in such areas as fraud detection (you know, those calls you get when you use your credit card out of state), risk management for insurance companies and customer retention.
Refocusing on human decision-making? As machine learning improves and becomes a table-stakes feature in analytics suites, don't be surprised if the human element initially gets downplayed, before coming back into vogue.
Business owners always try to limit "human error." Talk to any security professional, and they'll talk at length about how most security vulnerabilities are due to people making mistakes—relying on weak passwords, falling for phishing attacks or clicking on links they shouldn't.
However, even as machine learning improves, the machines will only ask the questions we tell them to ask. There will be limits to how much we can learn just by relying on machines (although to hear some Big Data vendors talk, you could be excused for coming away worried about a Terminator-like future).
You don't have to look too hard at how Big Data is emerging to see just how important the human element is, however.
Two of the most famous Big Data prognosticators/pioneers are Billy Beane and Nate Silver. Beane popularized the idea of correlating various statistics with under-valued player traits in order to field an A's baseball team on the cheap that could compete with deep-pocketed teams like the Yankees.
Meanwhile, Nate Silver's effect was so strong that people who didn't want to believe his predictions created all sorts of analysis-free zones, such as Unskewed Polls (which, ironically, were ridiculously skewed). Many think of Silver as a polling expert, but Silver is also a master at Big Data analysis.
In each case, what mattered most was not the machinery that gathered the data and formed the initial analysis, but the human on top working out what it all means. People can look at polling data and pretty much treat them as Rorschach tests. Silver, on the other hand, pores over reams of data, looks at how various polls have performed historically, factors in things that could influence the margin of error (such as the fact that younger voters are often under-counted since they don't have landline phones) and emerges with incredibly accurate predictions.
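To make the idea of weighing polls by their track record concrete, here is a minimal sketch of a weighted polling average. It is not Silver's actual model. The `Poll` fields, the 0-to-1 `pollster_rating` score and the choice to weight by rating times the square root of sample size are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Poll:
    candidate_share: float  # percent support for one candidate
    sample_size: int        # number of respondents
    pollster_rating: float  # hypothetical 0..1 score for historical accuracy


def weighted_average(polls):
    """Average poll results, trusting larger and historically better polls more."""
    # Assumed weighting scheme: rating times square root of sample size.
    weights = [p.pollster_rating * math.sqrt(p.sample_size) for p in polls]
    total = sum(weights)
    if total == 0:
        raise ValueError("no polls with a positive weight")
    return sum(w * p.candidate_share for w, p in zip(weights, polls)) / total


# Hypothetical polls: the third pollster has a poor track record,
# so its outlying result moves the average only slightly.
polls = [
    Poll(candidate_share=50.0, sample_size=400, pollster_rating=1.0),
    Poll(candidate_share=54.0, sample_size=400, pollster_rating=1.0),
    Poll(candidate_share=40.0, sample_size=900, pollster_rating=0.1),
]
print(round(weighted_average(polls), 2))
```

Choosing the ratings and deciding which corrections to apply (for example, for under-counted younger voters) is exactly the human judgment the passage above describes. The arithmetic is the easy part.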
Similarly, every baseball GM now values on-base percentage and other advanced stats, but few are able to compete as consistently on as little money as Beane's A's teams can. There's more to finding under-valued players than crunching numbers. You also need to know how to push the right buttons in order to negotiate trades with other GMs, and you need to find players who will fit into your system.
As Big Data analytics becomes mainstream, it will be like many earlier technologies. Big Data analytics will be just another tool. What you do with it, though, will be what matters.