For Big Data companies, this is a critical period for competitive jockeying. These are the early days of Big Data, which means there is still a plethora of companies, a mix of new firms and old-guard Silicon Valley firms, looking to stay current. Like everything else, the Big Data market will mature and consolidate. In five years, you can bet that many of the Big Data companies on this list will be gone, either out of business or merged with or acquired by a larger player.
This isn't meant as a Big Data buyer's guide. Instead, it's an overview of 30 big and small companies in the field of Big Data analytics. We're not looking at hardware players unless they have a software story to go with the hardware (and some do). The one thing these companies have in common is analytics.
In addition to its big iron, IBM offers DB2, Informix and InfoSphere database software, Cognos and SPSS analytics applications, and of course its well-known Global Services division. IBM also supports the Hadoop analytics platform.
HP is a major hardware vendor and services provider, but its big analytics platform is Vertica, which it acquired in 2011. The Vertica Analytics Platform is designed to manage large, fast-growing volumes of structured data and to provide very fast query performance and petabyte scalability on commodity enterprise servers. HP also has the Autonomy unit with its HAVEn software for analyzing and finding meaning in petabytes of structured and unstructured information.
EMC specializes in storage, and its Big Data analytics are built around that. It has a Big Data group that covers hardware and software and a number of verticals, such as high-performance computing, enterprise, and oil and gas exploration. EMC also has a Marketing Science Lab to help companies use Big Data analytics in their marketing departments.
Teradata's Aster platform offers a mix of analytics components: the Discovery Platform; a database; a discovery portfolio with pre-built functions for a broad set of Big Data applications; the Aster SQL-GR next-generation graph analytics engine; the SNAP Framework for integration and a unified SQL interface across multiple analytic engines and data sources; and its own MapReduce implementation.
Oracle has its Big Data Appliance, which combines an Intel server with a number of Oracle software products. They include Oracle NoSQL Database; Apache Hadoop; Oracle Data Integrator with Application Adapter for Hadoop; Oracle Loader for Hadoop; the Oracle R Enterprise tool, which uses the R programming language and software environment for statistical computing and publication-quality graphics; Oracle Linux; and the Oracle Java HotSpot Virtual Machine.
SAP's best Big Data tool is its HANA in-memory database, which the company says can run analytics on 80 terabytes of data, integrate with Hadoop, search text content, harness the power of real-time predictive analytics, and more.
Microsoft is probably not the first company you would think of, but its Big Data strategy is fairly broad. It has a partnership with Hortonworks and offers the HDInsight tool, based on the Hortonworks Data Platform, for analyzing structured and unstructured data. Microsoft also offers the iTrend platform for dynamic reporting on campaigns, brands and individual products.
Amazon has a number of enterprise Big Data platforms, including the Hadoop-based Elastic MapReduce, the DynamoDB Big Data database, and the Redshift massively parallel data warehouse. All of these services work within its broader Amazon Web Services offering.
VMware is best known for its virtualization hypervisor, but it's building on that platform to offer Big Data software, such as its recent VMware vSphere Big Data Extensions, which lets vSphere control Hadoop deployments and makes it easier for enterprises to launch Big Data projects.
Google is more of a cloud services company, but it is making a push into Big Data analytics by offering BigQuery, a cloud-based Big Data analytics platform for quickly analyzing very large datasets. Unlike with most services, you send data up to BigQuery rather than store it in the cloud.
Splunk Enterprise was originally a log analysis tool, but after partnering with Tableau Software to use Tableau's visual analytics package, Splunk has been reborn as a machine data analytics company. It can monitor online end-to-end transactions; study customer experience, behavior and usage of services in real time; and spot trends and perform sentiment analysis on social platforms.
MemSQL develops an in-memory relational database that can run mixed workloads and analytics at the same time. The product is a highly scalable, in-memory transactional database management system with an increased focus on historical analysis.
If you can get past the creepy factor, CIA-funded Palantir has two Big Data analytics products: Palantir Gotham, which integrates structured and unstructured data for search and discovery; and Palantir Metropolis, which handles data integration, information management and quantitative analytics. The software connects to a variety of public data sets, discovers trends, relationships and anomalies, and supports predictive analytics.
Trifacta bridges the gap between collecting data and transforming it into something usable, usually a two-step process. Trifacta's data transformation software automates the process of transforming data from sources like Hadoop into something that can be used by visualization and business intelligence tools.
Datameer claims its Datameer Analytics Solution (DAS) is the only end-to-end Hadoop solution for analytics. DAS is a business integration platform for Hadoop that includes data source integration, an analytics engine with a spreadsheet-like interface and more than 200 analytic functions, and visualization functions.