In a recent Gartner survey, CIOs picked business intelligence and analytics as their top technology priority for 2012. The market research firm predicts that enterprises will spend more than $12 billion on business intelligence (BI), analytics and performance management software this year alone.
As the market for business intelligence solutions continues to grow, the open source community is responding with a growing number of applications designed to help companies store and analyze key business data. In fact, many of the best tools in the field are available under an open source license. And enterprises that need commercial support or other services will find many options available.
This month, we’ve put together a list of 50 of the top open source business intelligence tools that can replace proprietary solutions. It includes complete business intelligence platforms, data warehouses and databases, data mining and reporting tools, ERP suites with built-in BI capabilities and even spreadsheets. If we’ve overlooked any tools that you feel should be on the list, please feel free to note them in the comments section below.
Business Intelligence Platforms
Jaspersoft claims the title of “the world’s most widely used business intelligence software.” It offers Web-based reports, dashboards and analysis, and it supports cloud computing, mobility and big data. In addition to the free Community versions of its software, the company also offers paid Express, Professional and Enterprise editions. Operating System: OS Independent.
The Palo BI Suite brings together four open-source business intelligence tools: Palo OLAP Server, Palo Web, Palo ETL Server and Palo for Excel. Commercial solutions based on Palo can be purchased through Jedox. Operating System: OS Independent.
Claiming more than 10,000 users, Pentaho offers tools that provide data integration and analytics capabilities. The link above provides information about the commercial versions of the software; the open source downloads can be found at the Pentaho Community Wiki. Operating System: Windows, Linux, OS X.
SpagoBI boasts that it is “the only 100% open source, complete and flexible BI suite.” It includes engines for reporting, multidimensional analysis (OLAP), charts, KPI, location intelligence, data mining, real-time dashboards, mobile, master data management, ETL and more. Operating System: OS Independent.
OpenI is “all about monetizing your data… in the quickest and most economic way.” It offers open source dashboards, interactive reporting, ETL, predictive modeling and more. Operating System: OS Independent.
6. ERP BI
Designed for enterprises that use open source ERP systems including PostBooks and XTuple ERP, this project offers business intelligence tools that extend the capabilities of those ERP solutions. It’s based on the community edition of Pentaho. Operating System: Windows, Linux, OS X.
Apatar focuses on integrating data from cloud-based and on-premise applications, with special features for CRM applications, including Salesforce.com, SugarCRM and Goldmine CRM. The company offers paid support, consulting, training, and other services, as well as an on-demand version of the software. Operating System: Windows, Linux.
8. Clover ETL
This data integration platform aims to “keep data in your business systems valuable, organized, and meaningful.” Its developers offer a range of commercial products based on the open source CloverETL Engine, which is available for download from SourceForge. Operating System: Windows, Linux, OS X.
Based on Kettle/Pentaho Data Integration, GeoKettle incorporates geospatial capabilities from a variety of other open source tools. It is owned by Spatialytics, which offers commercial versions of the tools. Operating System: Windows, Linux, OS X.
Java-based KETL is a portable, scalable ETL tool with support for many popular security and data management tools. Commercial support is available through project owner Kinetic Networks. Operating System: OS Independent.
Java-based Scriptella offers a simple tool for performing ETL tasks and executing scripts. Note that this tool isn’t quite as polished as some of the others on our list, and no enterprise version or commercial support is available. Operating System: Windows, Linux, OS X.
Talend offers a range of open source data integration, data quality, MDM and big data solutions. In addition to the open source editions, it also offers products with a commercial subscription, as well as paid support, training and consulting. Operating System: Windows, Linux, Unix.
The “premier open source data quality solution,” DataCleaner is useful for data profiling and DQ analysis, data cleansing, detecting and merging duplicates, and lightweight ETL tasks. It’s also available in a commercial edition. Operating System: OS Independent.
Used by Netflix, Twitter, Reddit, Cisco, Digg, eBay and many other companies with large, active data sets, Cassandra was developed by Facebook and is now managed by the Apache Software Foundation. Highly scalable, Cassandra has been known to hold more than 300TB of data in a cluster spread across more than 400 machines. Commercial support and services are available through third-party vendors. Operating System: OS Independent.
Designed to support humongous data sets, MongoDB offers document-oriented storage with full index support. Project owner 10gen offers paid subscriptions with commercial support and additional features. Operating System: Windows, Linux, OS X, Solaris.
Developed by Twitter, FlockDB is a distributed graph database that excels at storing social networking data. It was designed to handle massive horizontal scaling and high throughput. Operating System: OS Independent.
Based on Google’s Bigtable design, Hypertable offers NoSQL scalability that requires less hardware and less power than competing solutions. Commercial support, training, and licenses are available. Operating System: Linux, OS X.
Built for the Web, this Apache project offers document-based storage that you can access via HTTP with a browser. Key features include distributed scaling, eventual consistency, high availability, on-the-fly document transformation and real-time change notifications. Operating System: Windows, Linux, OS X, Android.
OrientDB combines features of document databases and graph databases. Exceptionally fast, it can store 150,000 documents per second on standard hardware. Operating System: OS Independent.
Replaces Microsoft SQL Server Standard
With a huge list of customers that includes Yahoo, LinkedIn, Alcatel-Lucent, Google, Nokia, YouTube, Craigslist, Sears and Zappos.com, Oracle-owned MySQL claims to be “the world’s most popular open source database.” It’s available in multiple commercial editions as well as the free community version. Operating System: Windows, Linux, Unix, OS X.
Replaces Microsoft SQL Server Standard
PostgreSQL doesn’t claim to be the most popular open source database, but it does claim to be the “most advanced.” It’s been around for more than 15 years and includes features like Multi-Version Concurrency Control (MVCC), point-in-time recovery, support for international characters, asynchronous replication, nested transactions (savepoints), online/hot backups, write ahead logging and more. Operating System: Windows, Linux, Unix, OS X.
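The nested transactions (savepoints) mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module as a stand-in so that it runs without a database server, and relies on the `SAVEPOINT`, `ROLLBACK TO SAVEPOINT` and `RELEASE SAVEPOINT` statements, which PostgreSQL also accepts. The table name `orders` and the savepoint name `risky_step` are hypothetical.

```python
import sqlite3

# Stand-in database: SQLite in memory, not PostgreSQL (assumption for portability).
# isolation_level=None selects autocommit mode, so the transaction
# statements below are sent exactly as written.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO orders (item) VALUES ('keyboard')")

# Open a nested transaction, do some work, then undo only that work.
conn.execute("SAVEPOINT risky_step")
conn.execute("INSERT INTO orders (item) VALUES ('typo-entry')")
conn.execute("ROLLBACK TO SAVEPOINT risky_step")
conn.execute("RELEASE SAVEPOINT risky_step")

# The outer transaction is still open and can continue.
conn.execute("INSERT INTO orders (item) VALUES ('monitor')")
conn.execute("COMMIT")

items = [row[0] for row in conn.execute("SELECT item FROM orders ORDER BY id")]
print(items)
conn.close()
```

Only the row inserted between the savepoint and the rollback is discarded; the rows written before and after it are committed together.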
Replaces Microsoft SQL Server Standard
Now more than 30 years old, Firebird is an RDBMS with many SQL standard features. Key capabilities include excellent scalability, multiple platform support, multi-generation architecture, and logging and monitoring. Operating System: Windows, Linux, Unix, OS X, Solaris.
LucidDB claims to be “the first and only open-source RDBMS purpose-built entirely for data warehousing and business intelligence.” It offers column-store tables, intelligent indexing, star join optimization, SQL support, intelligent prefetch and much more. Operating System: Windows, Linux.
Managed by Apache, HBase is the distributed database used by Hadoop. Modeled after Google’s Bigtable, it scales to billions of rows and millions of columns. Operating System: OS Independent.
Neo4j calls itself the “world’s leading graph database” and claims to offer 1,000 times better performance than relational databases. The link above connects to downloads of the open source Community version; Advanced and Enterprise versions are available through Neo Technology. Operating System: Windows, Linux.
Popular with telecom providers in Asia and Europe, Hibari is a distributed, linearly scalable, highly available, key-value, open source big data store. For commercial support, see Gemini Mobile. Operating System: OS Independent.
Riak claims thousands of users, including Comcast, Yammer, Voxer, Boeing, Joyent, DotCloud, GitHub and the Danish Government. The website boasts “Riak is the most powerful open-source, distributed database you’ll ever put into production.” New users should check out the Riak Fast Track for an introduction and quick installation guide. Operating System: Linux, OS X.
Capable of scaling from a single server to thousands of machines, BigData is a high-performance RDF database with high availability and high concurrency. Commercial support and licenses are available. Operating System: OS Independent.
Formerly a Hadoop sub-project, Hive is a data warehouse designed for easy data summarization, ad-hoc queries, and the analysis of large datasets. It uses a SQL-like language known as HiveQL. Operating System: OS Independent.
Replaces Microsoft SQL Server Standard
This public-domain software claims to be “the most used database engine in the world” because it is included on every Android, iOS, Mac, and Windows 10 device, as well as being integrated into popular applications like Firefox, Chrome, Skype, iTunes, Dropbox, TurboTax, QuickBooks and others. Its development is sponsored by a consortium of companies that includes Bloomberg, Mozilla, Expensify and others. Operating System: Windows, Linux, OS X, Android.
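Assuming the entry above refers to SQLite (the description matches it, though the tool's name is missing here), an embedded engine of this kind can be tried without installing a server. The minimal sketch below uses the sqlite3 module from Python's standard library; the `sales` table and its values are made up for illustration.

```python
import sqlite3

# An in-memory database: nothing is written to disk and no server is needed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Hypothetical sample rows, inserted with parameter placeholders.
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("East", 120.0), ("West", 80.0), ("East", 30.0)],
)

# A simple reporting query: total sales per region.
totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)
print(totals)
conn.close()
```

The same pattern works against a file on disk by passing a file path instead of `":memory:"`.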
Java-based JMagallanes offers OLAP and dynamic reporting from a variety of data sources, including SQL, Excel, XML, and others. Commercial support is available on a per incident basis, and you can also purchase two related products–JMagallanes Datawarehouse and JMagallanes Web. Operating System: OS Independent.
Business Intelligence and Reporting Tools, a.k.a. BIRT, can add reporting features to any Java/Java EE application. Actuate is the company that leads development of BIRT and also offers commercial products based on the open source project. Operating System: OS Independent.
This Web-based reporting interface works with a variety of reporting engines, including JasperReports, JFreeReport, JXLS, and BIRT. The paid professional version adds OLAP support, dashboards, conditional scheduling and some other advanced features. Operating System: OS Independent.
Data Mining and Analytics
Short for “Konstanz Information Miner,” KNIME describes itself as “a user-friendly and comprehensive open-source data integration, processing, analysis, and exploration platform.” Gartner named KNIME a “Cool Vendor” in analytics, business intelligence and performance management in 2010. The Desktop version is open source; the Professional, Team Space, Server and Cluster Execution editions require a paid subscription. Operating System: Windows, Linux, OS X.
Owned by Rapid-I, RapidMiner is the self-proclaimed “world-leading open-source system for data and text mining.” It’s available as a standalone solution, as a data mining engine for use with other applications, or as part of the RapidAnalytics server suite. Paid enterprise versions of the software are available. Operating System: OS Independent.
This “fruitful and fun” project aims to offer data visualization and analysis capabilities that can be used by both experienced professionals and novices. Add-ons are available for bioinformatics and text mining. Operating System: Windows, Linux, OS X.
Java-based jHepWork, or jWork, is a platform for analysis of large volumes of numbers, data mining, statistical analysis and mathematics. It includes libraries for creating data visualizations, as well as libraries for data structures and data manipulation. Operating System: OS Independent.
Also Java-based, SPMF began life as a sequential pattern mining framework (hence the SPMF acronym). Since then, the project has expanded in scope and also includes association rule mining, sequential rule mining and frequent itemset mining algorithms. Operating System: OS Independent.
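For readers unfamiliar with frequent itemset mining, the idea is to find groups of items that appear together in at least a minimum number of transactions. The sketch below is a brute-force illustration in Python, not SPMF's API or one of its algorithms (SPMF itself is Java-based); the function name `frequent_itemsets` and the shopping baskets are hypothetical.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Count every itemset in every transaction and keep the frequent ones.

    Brute force: fine for tiny examples, far too slow for real data sets.
    """
    counts = Counter()
    for basket in transactions:
        items = sorted(set(basket))  # sort so each itemset has one canonical form
        for size in range(1, len(items) + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


baskets = [
    ["bread", "milk"],
    ["bread", "butter", "milk"],
    ["butter", "milk"],
    ["bread", "butter"],
]

# Keep itemsets that occur in at least two of the four baskets.
result = frequent_itemsets(baskets, min_support=2)
for itemset, support in sorted(result.items()):
    print(itemset, support)
```

Here every single item and every pair meets the threshold, while the three-item set appears in only one basket and is dropped. Dedicated mining algorithms avoid enumerating all combinations.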
Short for “R Analytical Tool To Learn Easily,” Rattle offers a graphical interface to perform data mining tasks using the R programming language. It can summarize data statistically or visually, build models, score datasets and more. Operating System: Windows, Linux, OS X.
ERP Solutions with BI Features
40. Openbravo ERP
A complete enterprise resource management solution, Openbravo also includes business intelligence capabilities like reporting, multidimensional analysis (OLAP) and balanced scorecards. The paid basic and professional editions add support and features not found in the open source community edition. Operating System: Windows, Linux, OS X.
This ERP suite incorporates the BI capabilities of the Pentaho suite. Although plenty of documentation and community support is available, this community-run project does not offer paid support. Operating System: Windows, Linux, OS X.
The Compiere ERP and CRM solution includes several BI functions, including dashboards and reporting. It comes in community and enterprise editions that can be deployed on premise or in Amazon’s cloud. Operating System: Windows, Linux, OS X.
JFire describes itself as both a full-featured Java-based ERP suite and a platform for building customized business applications. The ERP suite includes BIRT integration for BI and reporting. Support and services are available through third parties or through project owner NightLabs. Operating System: OS Independent.
Used by enterprises like Toyota and Honeywell, Opentaps is a full-featured ERP suite with ecommerce, customer relationship management, warehouse and inventory management, supply chain management, financial management and business intelligence capabilities. It’s also available in a professional edition that can be deployed on premise or on Amazon Web Services. Operating System: Windows, Linux.
Downloaded more than 150,000 times, project-open aims to bridge the gap between ERP and project management. The core (free) software includes basic reporting, but more advanced business intelligence and data warehouse modules are available for a fee. Operating System: Windows, Linux.
46. OpenOffice Calc
Replaces Microsoft Excel
Apache’s OpenOffice project offers word processing, spreadsheets, presentations, graphics and database software that is compatible with the Microsoft Office equivalents. The latest version of OpenOffice has been downloaded more than 3 million times since its release in early May. Those who have used Excel will find Calc very similar. Operating System: Windows, Linux, OS X.
47. LibreOffice Calc
Replaces Microsoft Excel
A community fork of OpenOffice, LibreOffice’s version of Calc also offers a look and feel similar to Excel with support for Excel file formats. It also includes several advanced features for BI professionals, including Advanced DataPilot technology, natural language formulas, scenario manager and more. Operating System: Windows, Linux, OS X.
48. KOffice KCells
Replaces Microsoft Excel
KDE’s office productivity suite also includes a spreadsheet application. Although the interface is quite a bit different than Excel, OpenOffice and LibreOffice, users generally report that it’s fairly intuitive. Key features include built-in templates and a comprehensive formula list. Operating System: Windows, Linux, OS X.
Replaces Microsoft Excel
Gnumeric isn’t part of a comprehensive office productivity suite–it’s a standalone spreadsheet. Although the interface is similar to Excel’s and it imports Excel files, it’s not meant to be an Excel clone. It offers all of Excel’s functions, plus 153 more, and reports have praised Gnumeric as more accurate than commercial spreadsheet software. Operating System: Windows, Linux.
Replaces Microsoft Excel’s standard graphics
Sparklines doesn’t replace Excel–it extends its functionality with new functions that create simple, intense graphics known as Sparklines. See the user manuals for examples of the type of charts and graphs it can create. Operating System: Windows.
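To give a sense of what a sparkline is, the short Python sketch below draws one in plain text using Unicode block characters. It is only an illustration of the concept and does not use or imitate the Excel add-in's functions; the `sparkline` helper and the sample values are hypothetical.

```python
# Eight block characters of increasing height, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Map each value to a block character scaled between the min and max."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # All values are equal: draw a flat line.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    return "".join(BARS[round((v - lo) * top / span)] for v in values)


line = sparkline([1, 5, 2, 8, 3])
print(line)
```

The result is a word-sized chart that can sit inside a sentence or a table cell, which is the role sparklines play in a spreadsheet.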