5 Open Source Big Data File Systems and Programming Languages
A roundup of open source file systems and programming languages that are the core of today's Big Data tool set in the enterprise.
Gluster
Sponsored by Red Hat, Gluster offers unified file and object storage for very large datasets. Because it can scale to 72 brontobytes, it can be used to extend the capabilities of Hadoop beyond the limitations of HDFS (see below). Operating System: Linux.
Hadoop Distributed File System
Also known as HDFS, this is the primary storage system for Hadoop. It quickly replicates data onto several nodes in a cluster in order to provide reliable, fast performance. Operating System: Windows, Linux, OS X.
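To make the replication idea concrete, here is a minimal sketch in plain Python. It is not HDFS code and does not use any Hadoop API: the function names, the node names and the simple round-robin placement are all assumptions made for illustration, and real HDFS uses its own rack-aware placement policy. The only detail borrowed from HDFS is the replication factor of 3, which is its default.

```python
def place_blocks(num_blocks, nodes, replication=3):
    """Assign each block to `replication` distinct nodes (toy round-robin policy)."""
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for block in range(num_blocks):
        # Start each block at a different node so replicas spread across the cluster.
        placement[block] = [
            nodes[(block + i) % len(nodes)] for i in range(replication)
        ]
    return placement


def readable_blocks(placement, failed):
    """Return the blocks that still have at least one replica on a live node."""
    return [
        block
        for block, replicas in placement.items()
        if any(node not in failed for node in replicas)
    ]


# Hypothetical four-node cluster storing a file split into six blocks.
nodes = ["node-a", "node-b", "node-c", "node-d"]
placement = place_blocks(num_blocks=6, nodes=nodes)

# Even with two nodes down, every block keeps a live replica.
survivors = readable_blocks(placement, failed={"node-a", "node-b"})
print(placement[0])
print(len(survivors), "of", len(placement), "blocks still readable")
```

With three copies of every block spread over four nodes, losing any two nodes still leaves at least one copy of each block, which is the reliability benefit the description refers to.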
Pig
Another Apache Big Data project, Pig is a data analysis platform that uses a textual language called Pig Latin and produces sequences of MapReduce programs. It makes it easier to write, understand and maintain programs that conduct data analysis tasks in parallel. Operating System: OS Independent.
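For readers unfamiliar with the map-reduce pattern that Pig generates, the following is a rough sketch of its three phases as a word count. It is plain, single-process Python, not Pig Latin and not the Hadoop API, and the function names are hypothetical; on a real cluster the framework runs these phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group emitted values by key, the step that sits between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


lines = ["big data tools", "open source big data"]
counts = reduce_phase(shuffle_phase(map_phase(lines)))
print(counts)
```

Writing even this small job by hand takes three functions, which hints at why a higher-level language that produces such programs automatically is attractive.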
R
R is a programming language and an environment for statistical computing and graphics, similar to the S language developed at Bell Laboratories. The environment includes a set of tools that make it easier to manipulate data, perform calculations and generate charts and graphs. Operating System: Windows, Linux, OS X.
ECL
ECL ("Enterprise Control Language") is the language for working with HPCC. A complete set of tools, including an IDE and a debugger, is included in HPCC, and documentation is available on the HPCC site. Operating System: Linux.
The file system is, in many ways, the very center of the Big Data universe. The tools provided by the file system give a data set its overall structure and help turn it from a vast pool of information into something that can be held and mined for insights. And if there’s a file system that is clearly the star of the show in the Big Data world, it’s HDFS, the key to Hadoop, the open source platform that, for many users, is all but synonymous with Big Data itself. Hadoop is one of the greatest success stories from the open source community. But as this roundup shows, other file systems and languages central to the Big Data world are also open source. In fact, this list of file systems and programming languages demonstrates the importance of open source to today’s rapidly evolving Big Data toolset.