It's time for a Big Data reality check. The hype about how new databases, servers, networks and other ingredients in the Big Data stew can rapidly process and present massive amounts of data has risen to the peak of inflated expectations made famous by the Gartner hype cycle. After conducting a variety of surveys about the reality of Big Data implementations this year, and asking leading consultants and vendors what they and their clients have learned, I think it's time to deflate the balloon just slightly.
While it is too early to declare the arrival of the next phase of the hype cycle—the inevitable trough of disillusionment—early adopters have learned lessons that should be shared with the rest of us. Here are nine Big Data lessons learned that I've collected:
1. Focus on data management. The IT department, specifically its data architects, needs to determine where the data and apps will reside: in one on-premises system, or together in a cloud implementation? The traditional Business Intelligence era approach of 10 years ago, trying to have everything in one data warehouse, frequently failed in the wake of numerous data marts developed by maverick departments like finance. Thomas Davenport, co-author of the best-selling book Competing on Analytics and the upcoming Big Data at Work, warns that "while it is good to have options, multiple Big Data implementations leads to a more complex set of IT management decisions."
Michael Driscoll, CEO of Metamarkets and a longtime observer of the analytics scene, says he's seen too many large companies attempt to put all of the data, and the processors, in one place. He warns against pursuing a "one-platform" solution foisted on the organization by the CIO. "Unified data platforms are a false promise of hope," he contends. They are too big, too complex and will inevitably frustrate one or more departments or units. "A federation of services approach is best," he explains. In these arrangements, marketing, finance and other departments can each have their own Big Data implementation.
Most of the value of Big Data comes from co-locating it with knowledgeable end users, at the edges of the organization, where they can tinker with and glean insights from their own data.
2. Don't underestimate the data integration challenges. Deriving value from Big Data usually depends on processing unstructured information: video feeds from shop floors, telematics sensors in vehicles, GPS sensors in mobile devices, speech-to-text files and a host of other bits and pieces of information that are not readily processed. "Most organizations do not have experience cleaning these types of data," notes Davenport.
IBM and others promise that their semantic analytics tools can not only parse these unstructured data types, but do it fast enough to support real-time decision making. Anjul Bhambhri, IBM's vice president of Big Data within its software group, advises keeping all of the incoming data in its raw state, to preserve information that may be useful later when processed by semantic analytics.
"One of the implicit benefits of a Big Data platform is that you can preserve the raw fidelity of the data and apply multiple types of semantic analytics tools that will filter out the appropriate noise for the specific types of analysis being performed," Bhambhri explains. "This allows the same set of raw data to be applied to multiple applications and domains, without having to model the raw data upfront."
3. Start with the basics. "Many of us love to wax poetic about a utopian future where you stroll into a BestBuy, and your smartphone buzzes with a coupon for the new Microsoft Surface," comments Driscoll. "The deal is offered because it is back-to-school week and BestBuy has access to and processed information about your household, including past Microsoft purchases." Another example of utopia: "We analytics folks love to tout our ability to predict the perfect song for your current mood or movie for your weekend. However, we need to first focus on the basics," he adds. "Big Data should first answer questions like 'How much money did my company make yesterday?' Or, 'Why did our revenues spike 10 percent last Thursday?'"
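The "basics" Driscoll describes amount to simple arithmetic over transaction records. As a rough sketch, assuming orders are available as (date, amount) pairs, the Python below totals revenue per day and computes the percent change between two days. The order data, dates and function names are made up for illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical order records: (order date, amount in dollars).
ORDERS = [
    (date(2020, 1, 1), 1200.0),
    (date(2020, 1, 1), 800.0),
    (date(2020, 1, 2), 1500.0),
    (date(2020, 1, 2), 700.0),
]


def revenue_by_day(orders):
    """Sum order amounts for each calendar day."""
    totals = defaultdict(float)
    for day, amount in orders:
        totals[day] += amount
    return dict(totals)


def percent_change(totals, earlier, later):
    """Percent change in revenue from the earlier day to the later day."""
    return (totals[later] - totals[earlier]) * 100.0 / totals[earlier]


if __name__ == "__main__":
    totals = revenue_by_day(ORDERS)
    print(totals[date(2020, 1, 2)])
    print(percent_change(totals, date(2020, 1, 1), date(2020, 1, 2)))
```

Answering "why" revenue spiked takes more than this, but a company that cannot produce the daily totals reliably is not ready for the coupon-on-your-smartphone scenario.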