Saturday, June 15, 2024

Choosing a Big Data Solution: Seven Steps

Choosing a Big Data solution isn’t easy. As companies of all sizes try to extract more from their existing data stores, Big Data vendors are rushing in with a range of solutions that span everything from database technology to visualization tools. With such a diverse selection to choose from, buyers must carefully define their goals in order to find the right tools to meet them.

Before finding the right tools, however, organizations must first ask themselves what business problems they’re trying to solve, and why.

“Too many big data projects don’t start with problems to solve, but rather start with exploratory analytics,” said Chris Selland, VP of marketing and business development for HP Vertica. “That’s okay to a point, but eventually these questions need to be asked and answered. Companies have a lot of Big Data and many questions, but that doesn’t result in the CIO or CFO simply handing you a large amount of money to work with.”

Choosing a Big Data Solution: Step by Step

To be sure, choosing a Big Data solution is a complex process that requires balancing what may be conflicting input: the C-suite may view the process differently from, for instance, the sales reps. The matter of costs (licensing? start-up?) is particularly confusing, given that many vendors’ solutions don’t compare on an apples-to-apples basis. Still, these seven steps should help you select the best Big Data solution for your business.

Step One: Define the Problem — in the Right Way

The first step in choosing a Big Data solution, then, is to figure out what business problem you want to solve, but as you do, it’s important to frame that problem in a way that will get the project financed.

Christer Johnson, Principal, Analytics at EY, agrees with this first step but adds that it’s important to narrow your focus. “Some companies make the mistake of trying to do too much. Rather than seeking to extract value from all of your data, it’s important to start with the end in mind, allowing your business needs to drive the process,” he said.

He also recommends that you don’t overlook data silos that seem resistant to analysis. “Very soon, I think companies will find the biggest value in unstructured data,” Johnson said. When most people think of unstructured data, though, many go no further than, say, digging into social media sentiment. “There are other types of unstructured data that will deliver more value over time. For instance, look at call center data. Voice analysis software is getting good enough that you can analyze those calls for all sorts of things.”

Most businesses, though, only analyze five percent or less of those calls. When you start posing Big Data questions, keep an open mind. The biggest payoffs may come from unlikely sources. Will the Big Data tool you choose be able to adapt as your needs evolve?

Step Two: Ask What Will Be Gained by Solving this Particular Problem

By now, you’re probably aware of the three Vs of Big Data: volume, velocity, and variety. A fourth V, value, is often added to the list, and businesses frequently have only a fuzzy idea about it. Just because you can theoretically extract value from some data store or another doesn’t mean it’ll be worth your trouble, at least for now.

There is so much low hanging fruit that businesses must be smart about prioritizing potential projects, zeroing in on the ones with the biggest, most immediate payoff. Timing is a key component of any successful endeavor, after all.

Data never creates value all by itself. Rather, it simply drives action. Will your organization be ready to act on data-driven insights? Will you need to acquire other tools to do so?

It’s important for project leaders to clearly define what business problems they hope to solve, when they expect to solve them, and why now is the right time to solve any particular problem.

They should also clarify why they think Big Data tools are the right ones to solve it.

Step Three: Rank Expected Benefits

A spokesperson for Jaspersoft, an embedded BI reporting and analytics provider, emailed me to recommend ranking expected benefits. The process of ranking expected benefits (improved decision making, reduced costs, increased efficiency, etc.) should help you narrow your search.

If your top priorities are in sales and marketing, for instance, you’ll be looking at an entirely different set of tools than those that will help you streamline hiring. That may seem obvious, but with so many general-purpose tools available, the act of ranking will help you determine whether a broad platform is the way to go, or whether you’re better off finding something more specialized.

Ranking expected benefits can also help your organization refine its focus as it explores various data stores.
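
One way to make the ranking concrete is a simple weighted scoring sheet. The sketch below is only an illustration: the benefit names echo the examples above, but the weights, the two candidate categories, and every rating are hypothetical placeholders that you would replace with your own assessments.

```python
# Hypothetical weights: how much each expected benefit matters to you
# (higher means more important). These numbers are illustrative only.
benefit_weights = {
    "improved_decision_making": 5,
    "reduced_costs": 3,
    "increased_efficiency": 2,
}

# Hypothetical 0-10 ratings of how well each candidate delivers each benefit.
candidate_ratings = {
    "broad_platform": {
        "improved_decision_making": 6,
        "reduced_costs": 6,
        "increased_efficiency": 7,
    },
    "specialized_tool": {
        "improved_decision_making": 9,
        "reduced_costs": 4,
        "increased_efficiency": 5,
    },
}


def weighted_score(ratings, weights):
    """Return the weighted average rating; a missing rating counts as 0."""
    total_weight = sum(weights.values())
    return sum(w * ratings.get(b, 0) for b, w in weights.items()) / total_weight


def rank_candidates(candidates, weights):
    """Return (name, score) pairs sorted from best to worst score."""
    scored = [(name, weighted_score(r, weights)) for name, r in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_candidates(candidate_ratings, benefit_weights)
for name, score in ranking:
    print(f"{name}: {score:.1f}")
```

With these made-up numbers the specialized tool edges out the broad platform because decision making carries the most weight. Changing the weights is a quick way to see how sensitive the choice is to your priorities.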

Step Four: Do a Self-Assessment

Before shopping around for a shiny new Big Data solution, it’s important to accurately assess where you are now. “Companies are starting to realize they need to start learning more about Hadoop and MapReduce,” Johnson of EY said. “But before you go too far down that path, you need to look at the overall architecture of the software you already have. For instance, there are a whole slew of visualization solutions, but your existing architecture may work better with some than others.”

How are you currently addressing the problem? Do you have any existing reports, dashboards, or spreadsheets that you can review with Big Data vendors to help them better understand your needs?

Ask yourself, where am I now? What tools do I already have? And be sure not to overlook foundational technologies, like a robust data warehouse, that you may have previously skimped on.

Step Five: Figure out How You Will Measure Progress

If you want to drive from LA to New York, it’s not enough to simply locate New York on a map and point your car in the right direction. You need to follow signs along the way, and if you get off course, you need to map your way back to your route.

Measuring progress will be important to your users, as well. Let’s say you just arrived in the U.S. from a small European country and have no sense of scale. After a day’s drive, you could very well get discouraged since you’re not yet in New York. Being able to measure progress is important for any project, just to keep your employees sane. It’s easy to be excited early in a project, but harder to stay motivated later on once the going gets tough — especially if you have no idea how far along you are.

How will the Big Data solution you choose help you map progress and correct your course if you aren’t making progress? In some cases, this may require visualization features. In others, this may require that the solution has APIs that will allow you to connect it to existing software suites, such as CRM.

While some business goals may be fairly straightforward, such as customer acquisition, others like customer “engagement” are more nebulous. Even so, you should keep those secondary, hard-to-measure benefits in mind because along the way you may find some proxy that can help you estimate things that are hard to quantify.

“One of my favorite examples of proxy variables is churn in the telecom business,” Johnson said. EY was consulting with a mobile provider that found it could predict with 80 percent accuracy which customers would leave within 15 days, based on variables such as customer service calls. “The trouble is 15 days is not enough advance notice to do anything about it.”

Rather than obsessing about data that couldn’t provide enough early warning, EY and the mobile provider began looking for a proxy that could help them expand their window of opportunity. They found that by assessing operational, rather than customer, data (dropped calls, 3G vs. 4G availability, data usage), they could get 60 days’ advance notice.

“With a 60-day window, you have so many more options. You can do things like provide better QoS in the background for customers who may be thinking about leaving. With 15 days, the customer wouldn’t even notice, and if they did, it would still be too late,” he said.
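
The reasoning in this example, that a signal is only useful if it arrives early enough to act on, can be sketched in a few lines. The 15-day and 60-day lead times come from the example above; the signal names, the `ChurnSignal` type, and the 30-day threshold for acting are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChurnSignal:
    """A source of churn warnings (hypothetical type for this sketch)."""
    name: str
    lead_days: int  # days of warning before the customer is expected to leave


def actionable_signals(signals, min_days_to_act):
    """Keep only the signals that give at least min_days_to_act of warning."""
    return [s for s in signals if s.lead_days >= min_days_to_act]


signals = [
    ChurnSignal("customer_service_calls", 15),  # lead time from the example
    ChurnSignal("operational_data", 60),        # lead time from the example
]

# Assumption: the business needs about 30 days to run a retention effort.
MIN_DAYS_TO_ACT = 30

usable = actionable_signals(signals, MIN_DAYS_TO_ACT)
print([s.name for s in usable])
```

Under that assumed threshold only the operational data qualifies, which mirrors the point of the story: accuracy matters less if the warning comes too late to use.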

Will the Big Data solution you’re considering be flexible enough to shift to a very different set or type of data (for example, from structured to unstructured) if the first data set can’t deliver the results you seek?

Step Six: Consider Your End Users

Before buying, it’s important to determine who will actually use the tool. If you have zero data scientists in house, you should probably limit yourself to Big Data as a service, or you should find analytics plug-ins for the software your team already uses and understands, such as your marketing automation platform.

Many of the Big Data solutions hitting the market today are designed to abstract complexity, so pretty much any business unit leader can pose questions and get actionable info from their data. Others are far more complex and will require you to have Hadoop experts in-house, or will only reveal their Big Data insights to trained data scientists.

Whatever the case, you should test drive your top two or three solutions to make sure that the usability actually matches your end users’ skills.

Step Seven: Figure out How to Move from Projects to Processes

Many early Big Data projects will be more exploratory in nature than anything. Think of it as data wildcatting. There’s nothing wrong with exploration, per se, but while you’re exploring your data for possible areas of value, keep track of how you decided which data sets to explore, how you extracted value, how the data drove actions, and how you measured what you did.

After a few projects stack up, you should be able to draw lessons from those projects and, perhaps, define processes. Better yet, with the right Big Data platform, you may be able to automate some of those processes along the way.
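
The record-keeping described in this step could be as lightweight as one structured entry per project. The sketch below is a minimal illustration, not a prescribed format: the `ProjectRecord` fields mirror the four things the step suggests tracking, and the two sample projects and their contents are invented placeholders.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class ProjectRecord:
    """One exploratory project; field names are hypothetical."""
    name: str
    why_this_data: str            # how you decided which data set to explore
    how_value_was_extracted: str  # the analysis or technique used
    actions_driven: str           # what the business did with the insight
    how_measured: str             # how you measured the result


def recurring_practices(records, field_name, min_count=2):
    """Return the values of one field that show up in at least min_count projects."""
    counts = Counter(getattr(r, field_name) for r in records)
    return [value for value, n in counts.items() if n >= min_count]


# Invented sample entries, for illustration only.
records = [
    ProjectRecord("call center pilot", "largest unanalyzed store",
                  "voice analysis", "retrained agents", "monthly churn rate"),
    ProjectRecord("network quality pilot", "operational data already collected",
                  "churn prediction", "targeted QoS upgrades", "monthly churn rate"),
]

print(recurring_practices(records, "how_measured"))
```

Anything that recurs across several projects, such as a shared way of measuring results, is a natural first candidate for turning into a standard process.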

As you assess tools, ask how hard it will be to turn your progress into a process. Are there features that will help you identify processes you can automate?

And I’ll offer one final piece of advice: be patient — but not too patient. It can take time to gain the insights you seek, but even for big projects that could take years to complete, if you break them down and have goals for much shorter timeframes, your likelihood of following through will be much higher.

It’s not wrong to have a big vision that will take time, but if you can reach goals or milestones along the way, you’ll get more buy-in from other parts of your organization.
