Friday, May 17, 2024

Hadoop and Big Data: Ready to Cross the Chasm?


It’s rare for a technology and concept to take off with the speed that Hadoop and Big Data have, all the more so for one this big and complex. Hadoop was an internal project at Yahoo just five years ago. Now, companies are betting their business on Big Data.

The challenge, though, is for the products to mature. And much like those silly cheese commercials on TV, Hadoop needs to mature as well. It’s one thing for techies with Java skills to download the source code from the Apache Foundation and conduct internal experiments. But if that’s all there is, Hadoop will never get beyond the propellerheads.

For the union of Hadoop and Big Data to grow and thrive, it has to do what every other concept or product must do and “cross the chasm,” as defined in Geoffrey Moore’s definitive 1991 book on marketing. Crossing the chasm means making the leap from a few early adopters, almost always very technically advanced users who can work with the product on their own, to the masses on the other side of the chasm. If you make it, you’re Linux. If you fall short, you end up like CORBA.

As far as Gartner is concerned, Hadoop and Big Data are at the point beyond early adopters and ready to make the leap to the other side, but not just yet. In a survey conducted last month of almost 300 enterprise IT decision makers, 54% said they had no plans to invest in Hadoop technologies in the coming year, but that means 46% are ready to pull the trigger or already have. All things considered, said Merv Adrian, the Gartner research VP who conducted the survey, that’s not bad.

“I’m not surprised, because in terms of maturity of the market, Gartner believes we are just finishing the early adopter stage. And there is a natural pause reflected in a lifecycle when mainstream adopters start to look at what early adopters have already taken in. All kinds of questions come up,” he said.

The hype cycle is about the trigger – some disruptive technology coming into the market. In the early days, the fact that a given technology is rough around the edges and needs new skills is accepted by early adopters willing to put up with it to take advantage of being first. But it needs more to make it to the next level.

Hadoop and Big Data: Rejected?

One reason the lack of interest in Hadoop and Big Data is so high is that some companies have tried it and decided they didn’t need it. It’s not uncommon for early adopters to get into a technology because it’s the bright shiny object, play around with it to see what it’s good for, and then find they have no use for it. Then it becomes a technology in search of a problem.

“When we ask people why they are not interested, the reason we heard the most was we haven’t identified something we need to use this for. We don’t need it or we have a technology that’s adequate. Or they say no one has presented us with a business case,” said Adrian.

Randall Barnes, senior cloud architect at 2nd Watch, an AWS managed services partner, said that about 25% of his clients tried Hadoop and/or Big Data and didn’t want it, a rejection rate he said is higher than for most other new technologies.

“Hadoop has a much higher rate of customers deciding not to pursue it than others, but the vast majority of customers we work with are pleased and do commit to move into the next phase,” he said.

Aside from the business case, there was also the cost, Barnes added. In looking at what it would cost to retool their developer teams and developer processes, match their existing IP with ETL and other data requirements, some customers felt the cost of a migration shift was higher than the benefits.

Finally, one of the biggest obstacles he’s encountered is simply building familiarity: getting a broader audience to take part in the evaluation and understand how the new workload or app stack will benefit them. In other words, Big Data still doesn’t have a perfect pitch. “There is this barrier to getting this [concept] into a form the IT department or decision makers can get their hands around it,” he said.

Hadoop Product Maturity

Adrian said that while 54% of the people he surveyed have not made a decision on Hadoop and don’t plan to invest in the next two years, he does not expect that number to hold over that period. “As you move into the trough, you move into the mainstream where people have different assessments and have expectations that early adopters don’t have. They want more fit and finish and easier to use and security,” he said.

What’s happening now is that buyers are looking for more than just the software download. They are looking for signs that the Hadoop stack and its offerings are maturing. He made it clear that Gartner is not going to predict that change; it is going to report it.

And change is coming. If you look at the stuff announced at the recent Hadoop Summit, there were deployment utilities, deployment services, security services, monitoring services and more. “These are the kinds of things mature enterprises want. They don’t want to hear a bunch of guys in the Silicon Valley are writing this code and it will be available next week,” said Adrian.

Building the Business of Big Data

Prakash Nanduri, co-founder and CEO of Paxata, a Big Data development and deployment firm, agreed with Adrian’s contention that Big Data and Hadoop are at the chasm-crossing phase, but thinks the crossing will happen faster than Gartner expects.

“Hadoop is taking on some very strong footholds in the enterprise,” he said. “Lots of people test out free stuff. Where Hadoop is right now is moving away from experimenters to guys saying ‘we’re going to use this, we’re going to use it in an effective manner and it will deliver value to the business.’”

He compares Hadoop with RDBMS, which came out in the 1970s but didn’t really take off until the 1980s when Oracle made a massive push to support developers. “When RDBMSes came out, there was a huge push to get people to become DBAs. They learned SQL. Well, we’re in the early days of Hadoop. Right now it’s high pressure but in a few years, the ecosystem has to train people. The majority of efforts from Cloudera and Hortonworks are spending a lot of time on education. Will it be overnight? No.”

And he agrees with Adrian’s contention that for Hadoop to grow in use it has to be accessible to more than just Java programmers, since Hadoop is written in Java. “You need the skillsets, the tooling, the value-added apps. The only way you can make this accessible to the mere mortals is to hide the complexity,” said Nanduri. “If they are well managed we will see continued adoption.”

Finding the Hadoop and Big Data Talent

The biggest challenge, though, is the oft-documented problem of finding the talent to get the job done. There is a huge shortage of people with Big Data skills, and they can command top dollar. “The top challenges we hear from organizations using Hadoop today, number one is obtaining skills and capabilities, number two is figuring out how to get value out of it,” said Adrian.

In fact, he argues that the technology is way ahead of the skills. “The truth is today, the Hadoop stack is very competent and capable of doing things people want to do with it. The tech is ready for the kinds of things most people are thinking of doing. Some researchers say skills are not a problem. They are dead wrong. Skills are definitely a problem,” said Adrian.

Enterprise companies with large services divisions like IBM and HP and pure services firms like CSC and Infosys are training people as fast as they can, but you don’t create a data scientist overnight. “People can’t write the algorithms, never mind implement them. Some large systems integrators are turning down jobs because they can’t provide the resources,” said Adrian.

That has been the experience at 2nd Watch, which has had to turn down some projects due to a shortage of talent, according to Barnes. Most of its Hadoop or Big Data-type projects have a longer lead time because the few individuals with the appropriate skillsets are booked further out than on a typical project. So the company is hiring bright people and training them.

“[Training them is] not easy at all. We’re talking to bright people and training them but finding people with those skillsets [already] is almost impossible. I speak with architects and engineers all the time, there’s a great deal of interest in training candidates but I’m not seeing that translate into a net surplus of candidates available. So it could be demand is absorbing them faster than folks are being trained,” Barnes said.

Hadoop Going Forward

Adrian believes Hadoop will eventually live up to its hype. “It’s a more cost-effective approach to hitherto unexploited information than is available. It represents an opportunity we haven’t exploited. As it becomes more available and more manageable for the enterprise class it will be adopted,” he said.

But he added Hadoop will change, too. “Two to three years from now, people could be talking about their Spark stack, that’s entirely possible. These things continue to change. The answer to what is Hadoop is different today than it was two years ago and will be different two years from now. I think it has legs and will be here for quite some time,” said Adrian.

Barnes said interest is growing slowly but steadily. “It has got the mindshare. A lot of the conversations we have are with key decision makers all the way up to executive level are very aware. They’re familiar with it and the promise. We’re having interesting conversations for the possibilities of it. It’s not just mining usage logs or parsing for fraud detection anymore,” he said.

