Seven Hot Hadoop Startups that Will Tame Big Data

These businesses have big plans for helping companies extract valuable insights from their Big Data.



(Page 1 of 3)

Everyone knows that data volumes are growing exponentially. What’s not so clear is how to unlock the value all of that data holds. Enterprises are struggling to figure out how to store, manage and derive any real business value from Big Data.

Part of the problem is that traditional databases just aren't suited for mining Big Data insights. Legacy systems were designed decades ago, long before Big Data was a trend.

Enter Apache Hadoop, an open-source framework that enables the processing of large data sets in a distributed environment. With Hadoop, applications can run on systems composed of thousands of nodes holding thousands of terabytes of data.
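The processing model behind this can be sketched in a few lines. What follows is a minimal, single-process illustration of the map, shuffle and reduce phases that Hadoop's MapReduce engine spreads across many nodes. It is plain Python rather than Hadoop's own API, and the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made up for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    # On a real cluster, different nodes would map different blocks of input.
    # Here everything runs in one process purely for illustration.
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["big data big plans", "data at scale"]  # hypothetical input
    print(word_count(sample))
```

Because each call to map_phase and reduce_phase depends only on its own input, the work can in principle be split across as many machines as there are input blocks and keys, which is what lets the model scale out.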

Gartner estimates the current Hadoop ecosystem market to be worth around $77 million. They expect that it will grow to $813 million by 2016. However, despite a few big-name backers, Hadoop is still relatively unproven in enterprise settings. Critics argue that while Hadoop works great as a processing platform, it's not all that good with queries. The add-ons Hive and Pig both help with this, but Hadoop still isn't quite a fully mature platform.

These startups intend to change that.

1. Alpine Data Labs

What they do: Provide data science solutions for Hadoop and Big Data

Headquarters: San Mateo, CA

CEO: Joe Otto, who previously ran Worldwide Sales for Greenplum, which is now part of EMC.

Founded: 2010

Funding: Alpine Data Labs is backed by a $7.5 million Series A round of funding from Sierra Ventures and Mission Ventures, along with EMC and Sumitomo Bank. The company is in the process of closing out a Series B round, which is expected to raise between $10 and $13 million.

Why they're on this list: While there are a ton of Big Data tools entering the market, many companies still struggle to gain actionable insight from their mountains of data.

According to Alpine Data, part of the problem is that it's much too difficult to get real insights out of Hadoop and other parallel platforms. Most companies don't know what to do with massive datasets, and few have gotten any further with Hadoop than batch processing and basic querying.

Alpine Data set out to simplify machine-learning methods and make them available on petabyte-scale datasets. Their tools make these methods available in a lightweight web application with a code-free, drag-and-drop interface.

Alpine Data leverages the parallel processing power of Hadoop and MPP databases and implements data mining algorithms in MapReduce and SQL. Users interact with their data directly where it already sits and design analytics workflows without worrying about data movement or complex code. All this is done in a web browser, and Alpine Data then translates these visual workflows into a sequence of in-database or MapReduce tasks.
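To illustrate the general idea of turning a visual workflow into a sequence of tasks, here is a rough sketch. It is not Alpine Data's implementation: the Step class, the compile_workflow function, the "database" and "hadoop" source labels and the sample workflow are all hypothetical, chosen only to show how each step could be routed to an in-database (SQL) task or a MapReduce task depending on where its data already sits.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One node in a hypothetical drag-and-drop workflow."""
    name: str
    operation: str  # e.g. "filter" or "train_model" (illustrative labels)
    source: str     # "database" or "hadoop": where the data already sits


def compile_workflow(steps):
    """Turn an ordered list of steps into an ordered list of task descriptions.

    Steps over database-resident data become SQL tasks; steps over
    Hadoop-resident data become MapReduce tasks. No data is moved.
    """
    tasks = []
    for step in steps:
        if step.source == "database":
            engine = "sql"
        elif step.source == "hadoop":
            engine = "mapreduce"
        else:
            raise ValueError(f"unknown source for step {step.name!r}: {step.source}")
        tasks.append({"step": step.name, "engine": engine, "operation": step.operation})
    return tasks


if __name__ == "__main__":
    workflow = [  # hypothetical two-step workflow
        Step("clean_customers", "filter", "database"),
        Step("score_clicks", "train_model", "hadoop"),
    ]
    for task in compile_workflow(workflow):
        print(task)
```

The point of the sketch is the routing decision: the user describes what should happen, and the translation layer decides which engine runs each step.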

Alpine Data's visual environment helps teams collaborate and quickly create and deploy analytics workflows and predictive models.

Customers include AT Kearney, Havas Digital, Zion Bank, Kaiser Permanente and CareCore.

Competitors: SAS dominates this market, but other startups are moving into this space too, including Platfora, Skytree, Revolution Analytics and Rapid-I.

2. Cloudera

What they do: Provide a Hadoop-based Big Data Platform

Headquarters: Palo Alto, CA

CEO: Mike Olson, who was formerly CEO of Sleepycat Software, an embedded database company that was acquired by Oracle in 2006. After the acquisition, Olson spent two years at Oracle as VP for Embedded Technologies.

Founded: 2008

Funding: Cloudera has raised $140 million in venture capital to date. Its investors include Accel Partners, Greylock Partners, Ignition Partners, In-Q-Tel and Meritech Capital Partners.

Why they're on this list: Big Data is hot, and Cloudera is the pioneer that first developed a Hadoop-based platform for Big Data. Moreover, they're sitting on a mountain of VC cash and have a solid management team.

Cloudera lets users query all of their structured and unstructured data, giving them a view beyond what relational databases offer. Cloudera recently released Impala, an open-source query engine for Hadoop that enables interactive, real-time querying of massive data sets.

Customers include CBS Interactive, eBay, Expedia, Monsanto and Samsung.

Competitors: EMC Pivotal, Hortonworks, MapR. Intel recently joined the market as well, but it's too early to tell how serious they are about this space.


Tags: Hadoop, NoSQL, SQL, big data, startup
