The open source Hadoop project is all about providing the ability to manage and understand large datasets. Yahoo, which uses Hadoop to manage 120 terabytes of data per day, released a new version of its edition of Hadoop this week, but it wasn’t the only one with a new Hadoop release.
Commercial Hadoop vendor Cloudera this week announced Cloudera’s Distribution for Hadoop (CDH) version 3, including some technologies that were previously closed source. In addition to the new version of CDH, Cloudera is announcing a new Enterprise version of its Hadoop distribution, providing additional usability and management features for enterprise users.
CDH is a version of the Apache Hadoop project that bundles additional projects and technologies to make Hadoop more usable for enterprises. CDH includes the Yahoo-developed open source Oozie workflow engine as well as projects originated by Cloudera. Among the Cloudera-originated projects is one called HUE (Hadoop User Experience), which began its life as the closed source Cloudera Desktop.
“Cloudera Desktop was a desktop based user interface for people building apps for Hadoop,” Cloudera CEO Mike Olson told InternetNews.com. “That was always available for free, but it wasn’t open source. We believe that the platform has got to be open source in order to succeed.”
Olson added that Cloudera has rebranded the desktop product as HUE and it has now also evolved. He explained that HUE has become a collection of APIs
Additionally, Olson noted that Cloudera developed the open source Flume project. The Flume project, which is included as part of CDH, is all about getting various data sources into a Hadoop cluster in a continual, reliable and fault-tolerant way. Flume is a complement to the Sqoop project, also developed and open-sourced by Cloudera, which is a tool for importing database tables into Hadoop.
With the HBase project included in CDH, Cloudera is also aiming to expand beyond just SQL types of database inputs.
“HBase is a NoSQL layer on top of HDFS (Hadoop’s filesystem),” Olson said.
To date, Cloudera has built its business around offering services for Hadoop, but with Cloudera Enterprise, the company is now aiming to monetize software as well. Cloudera Enterprise includes deployment management tools as well as support and legal indemnification.
As to where Cloudera draws the line between what is an open source feature for CDH versus what is an Enterprise feature for paying customers, it’s all about the platform.
“If it is a platform feature, it belongs in the open source platform,” Olson said. “Platform features include ways to store data reliably — basically any of the plumbing that is required to make data storage and analysis work well.”
Olson explained that the enterprise features are the tools required to integrate Hadoop clusters with existing infrastructure and the dashboards that IT staff need to manage thousands of nodes in a cluster.
While Yahoo is a big contributor to and backer of Hadoop, Olson doesn’t see Yahoo’s version of Hadoop as being competitive with Cloudera’s corporate efforts. Olson noted that Cloudera benefits from the work that is done in the open source Hadoop community, including Yahoo’s contributions. That said, in his view the Yahoo version of Hadoop isn’t necessarily the right fit for enterprise deployments.
“Yahoo has built a Hadoop distro that runs well on its own infrastructure,” Olson said. “Not all enterprises have the same compute infrastructure as Yahoo does and Yahoo does not provide support for that software.”
Sean Michael Kerner is a senior editor at InternetNews.com, the news service of Internet.com, the network for technology professionals.