Wednesday, June 12, 2024

Open Source Artificial Intelligence: Leading Projects


Open source artificial intelligence projects don’t always get a lot of publicity, but they play a vital role in the development of artificial intelligence. Because these open source projects are often pursued as passion projects by developers (sometimes in colleges and universities), the advances are creative and particularly forward-looking.

Typically freed from the constraints of a corporate setting (though some are supported by companies), these open source AI projects can dream big – and often deliver ground-breaking machine learning (ML) and AI advances.

Also important: the advances from these leading open source AI projects fuel the larger AI sector. That is, a new idea from this month’s AI project can end up next year (or even next month) in a high-end AI solution sold by a company.

Remember, if you know of additional top open source AI tools that should be on this list, please include them in the comments section below.

Open Source AI Projects


PyTorch

PyTorch has all the elements you’d expect from a leading open source AI project. It focuses on machine learning, arguably the most popular use of AI at this stage of the emerging technology’s growth. Even more important, developers and AI engineers can set PyTorch up on the top cloud computing platforms: it runs on AWS and Azure as well as Google Cloud and Alibaba. PyTorch provides the building blocks for neural networks, a foundational element of AI development.
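To make the idea of a neural network concrete, here is a minimal, dependency-free sketch of a forward pass through two fully connected layers. It deliberately does not use PyTorch's API: the function names (`dense`, `forward`, `random_layer`) and the layer sizes are illustrative assumptions, and a real framework adds automatic differentiation, GPU support and training on top of this.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases):
    """One fully connected layer: each output is sigmoid(w . x + b)."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def forward(inputs, layers):
    """Pass the inputs through every (weights, biases) layer in order."""
    activations = inputs
    for weights, biases in layers:
        activations = dense(activations, weights, biases)
    return activations


def random_layer(n_in, n_out, rng):
    """Create an untrained layer with small random weights and zero biases."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


rng = random.Random(0)  # fixed seed so the sketch is repeatable
network = [random_layer(3, 4, rng), random_layer(4, 1, rng)]
output = forward([0.5, -0.2, 0.1], network)
```

The network here is untrained, so `output` is just a single number between 0 and 1. Training would adjust the weights and biases so that number becomes a useful prediction.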

Open Neural Network Exchange

Developed by Microsoft and Facebook, Open Neural Network Exchange (ONNX) offers some very powerful tools, most notably the ability to reuse fully trained neural network models (which may have taken many hours to train) in other frameworks and systems. In essence, ONNX greatly extends the usefulness of existing models by enabling this porting. Expect ONNX to grow ever more popular in the years ahead.

IBM’s AI Fairness 360

Bias in artificial intelligence algorithms is a growing concern, and AI Fairness 360 is an open source toolkit built to address it. The tool provides algorithms that let a developer scan an ML model for potential bias, an essential part of fighting bias – and certainly a complex task. Importantly, AI Fairness 360 allows AI engineers to apply these algorithms throughout the development lifecycle, and the checks can be set to run automatically. Built into the tool’s foundation is an architecture that checks for correlations and asks: do the correlations create a prediction that suggests a harmful stereotype?
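As a rough illustration of what scanning a model for bias can mean in practice, the sketch below computes two common group fairness measures from a model's predictions: the statistical parity difference and the disparate impact ratio. This is plain Python rather than the AI Fairness 360 API, and the function names, group labels and sample data are assumptions made for the example.

```python
def selection_rate(predictions, groups, group):
    """Share of members of `group` that received the favorable outcome (1)."""
    outcomes = [p for p, g in zip(predictions, groups) if g == group]
    if not outcomes:
        raise ValueError(f"no samples for group {group!r}")
    return sum(outcomes) / len(outcomes)


def statistical_parity_difference(predictions, groups, unprivileged, privileged):
    """Unprivileged selection rate minus privileged selection rate (0 is parity)."""
    return (selection_rate(predictions, groups, unprivileged)
            - selection_rate(predictions, groups, privileged))


def disparate_impact(predictions, groups, unprivileged, privileged):
    """Ratio of unprivileged to privileged selection rates (1 is parity)."""
    privileged_rate = selection_rate(predictions, groups, privileged)
    if privileged_rate == 0:
        raise ValueError("privileged group has no favorable outcomes")
    return selection_rate(predictions, groups, unprivileged) / privileged_rate


# Hypothetical model outputs: 1 = favorable decision, 0 = unfavorable.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

spd = statistical_parity_difference(predictions, groups, unprivileged="b", privileged="a")
di = disparate_impact(predictions, groups, unprivileged="b", privileged="a")
```

In this made-up data, group "a" receives the favorable outcome three times out of four and group "b" only once out of four, so both measures flag a gap worth investigating.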


Keras

Keras is a rarity in the world of AI open source projects: it promotes itself as “an API designed for human beings, not machines.” A Python deep-learning API, Keras interoperates with high-profile AI projects like Theano and Microsoft Cognitive Toolkit. Developers and AI engineers use it as an ML library to build prototypes with comparative ease. Also aiding its ease of deployment, Keras can run on a mix of processor hardware.


Accord.NET

As the name suggests, Accord.NET uses the .NET framework. It’s a .NET machine learning framework that offers image and audio libraries coded in C#. It’s forward-looking, in that it offers a platform for developing commercial-level applications, including apps geared for signal processing, audio-visual toolsets and statistics apps. If you’re just getting your feet wet, Accord.NET also includes template apps so you can start building faster.


GPT-2

Certainly an open source AI technology that’s generating buzz, Generative Pre-Trained Transformer 2 (GPT-2) was released by OpenAI in 2019. GPT-2 leverages a deep neural network, in which many stacked layers process the input text. It is broadly known for handling text, from translation to creating text that, at its best, can be remarkably similar to that written by humans. Moreover, it’s a powerful learning tool that can synthesize and adapt to data with significant accuracy.

Cheatsheets AI 

This project is useful if you’re an ML or AI developer who could use a helping hand with open source ML/AI projects. More of a learning tool than a project, Cheatsheets AI helps you get up to speed with AI/ML projects, from Keras to SciPy to PySpark to Dask. The instruction offered is in-depth and necessarily complex. While Cheatsheets AI is designed for “AI newbies,” in fact you will need some prior training to use this resource.


TensorFlow

Is there a developer who doesn’t know TensorFlow? It’s practically a household name. Developed by the Google Brain team for internal use at Google, TensorFlow is now one of the best-known open source machine learning platforms. Google is also making a cloud-based version of TensorFlow available for free to researchers.


Caffe

Originally created by the bright minds at UC Berkeley, Caffe has become a very popular deep learning framework. Its claims to fame include expressive architecture, extensible code and speed.


H2O

With a huge user base, H2O claims to be “the world’s leading open source deep learning platform.” In addition to the open source version, the company also offers a Premium edition with paid support.

Microsoft Cognitive Toolkit

Clearly, Microsoft has moved into the world of open source. Formerly known as CNTK, the Microsoft Cognitive Toolkit promises to train deep-learning algorithms to think like the human brain. It boasts speed, scalability, commercial-grade quality and compatibility with C++ and Python. Microsoft uses it to power the AI features in Skype, Cortana and Bing.

DeepMind Lab

DeepMind is another very big name in AI and ML. Intended for use in AI research, DeepMind Lab is a 3D game environment. It was created by the DeepMind group at Google and is said to be especially good for deep reinforcement learning research.


ACT-R

Developed at Carnegie Mellon University, ACT-R is the name of both a theory of human cognition and software based on that theory. The software is written in Lisp, and extensive documentation is available. Operating Systems: Windows, Linux, macOS.

StarCraft II API Library

You didn’t think AI was all work, did you? Google’s DeepMind and Blizzard Entertainment are collaborating on a project that makes it possible to use the StarCraft II video game as an AI research platform. It’s a cross-platform C++ library for building scripted bots.


Numenta

The Numenta organization offers numerous open source projects related to hierarchical temporal memory. Essentially, these projects attempt to create machine intelligence based on the current biological understanding of the human neocortex.

OpenCog

A big ambition, to be sure: instead of focusing on a narrow aspect of AI such as deep learning or neural networks, OpenCog aims to create beneficial artificial general intelligence (AGI). The project is working toward creating systems and robots with the capacity for human-like intelligence.

Stanford CoreNLP

This Java-based natural language processing software can identify the base forms of words, their parts of speech and whether they are names of companies or people, and it can normalize dates and times. It also marks up the structure of sentences in terms of phrases and syntactic dependencies, indicates which noun phrases refer to the same entities, identifies sentiment, extracts particular or open-class relations between entity mentions and attributes quotes. It was designed for English but also supports a number of other languages.


Prophet

Developed and used by Facebook – yes, they have deep resources – Prophet forecasts time series data. It’s available in R and Python and is designed to be fully automatic, accurate, fast and tunable.


SystemML

Originally an IBM Research project, SystemML is now a top-level Apache project. It describes itself as “an optimal workplace for machine learning using big data,” and it integrates with Spark.


Theano

Deep learning can be thought of as the furthest edge of AI. Theano, geared for deep learning, describes itself as “a Python library that allows you to define, optimize and evaluate mathematical expressions involving multi-dimensional arrays efficiently.” Key features include GPU support, integration with NumPy, efficient symbolic differentiation, dynamic C code generation and more.
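Symbolic differentiation, one of the features listed above, means deriving a new expression for the derivative rather than estimating it numerically. The toy sketch below shows the idea on scalar expressions built from nested tuples. It is not Theano code: the helpers (`const`, `var`, `add`, `mul`, `diff`, `evaluate`) are hypothetical names for this example, and it supports only addition and multiplication.

```python
def const(value):
    return ("const", value)


def var(name):
    return ("var", name)


def add(a, b):
    return ("add", a, b)


def mul(a, b):
    return ("mul", a, b)


def evaluate(expr, env):
    """Compute the numeric value of an expression given variable values."""
    kind = expr[0]
    if kind == "const":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    left, right = evaluate(expr[1], env), evaluate(expr[2], env)
    if kind == "add":
        return left + right
    if kind == "mul":
        return left * right
    raise ValueError(f"unknown expression kind: {kind!r}")


def diff(expr, wrt):
    """Return a new expression for the derivative of `expr` with respect to `wrt`."""
    kind = expr[0]
    if kind == "const":
        return const(0)
    if kind == "var":
        return const(1 if expr[1] == wrt else 0)
    a, b = expr[1], expr[2]
    if kind == "add":
        return add(diff(a, wrt), diff(b, wrt))       # sum rule
    if kind == "mul":
        return add(mul(diff(a, wrt), b), mul(a, diff(b, wrt)))  # product rule
    raise ValueError(f"unknown expression kind: {kind!r}")


x = var("x")
f = add(mul(x, x), mul(const(3), x))  # f(x) = x*x + 3*x
df = diff(f, "x")                     # an expression equivalent to 2*x + 3
value = evaluate(df, {"x": 2.0})      # evaluates the derivative at x = 2
```

A production library would also simplify the resulting expression and compile it to fast code, which is where features like dynamic C code generation come in.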


MALLET

Short for “Machine Learning for Language Toolkit,” MALLET includes Java-based tools for statistical natural language processing, document classification, clustering, topic modeling, information extraction and more. It was first created in 2002 by faculty and graduate students at the University of Massachusetts Amherst and the University of Pennsylvania.


DeepDetect

An example of cross-collaboration in the open source AI sector, DeepDetect has been used by organizations like Airbus and Microsoft. DeepDetect is an open source deep learning server based on Caffe, TensorFlow and XGBoost. It offers an easy-to-use API for image classification, object detection, and text and numerical data analysis.
