15 Top Open Source Artificial Intelligence Tools

Posted September 12, 2016 by Cynthia Harvey

    Open Source Artificial Intelligence

    These open source AI applications are on the cutting edge of artificial intelligence research.

    1. Caffe

    The brainchild of a UC Berkeley PhD candidate, Caffe is a deep learning framework based on expressive architecture and extensible code. Its claim to fame is its speed, which makes it popular with both researchers and enterprise users. According to its website, it can process more than 60 million images in a single day using just one NVIDIA K40 GPU. It is managed by the Berkeley Vision and Learning Center (BVLC), and companies like NVIDIA and Amazon have made grants to support its development.
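
    To put the 60-million-images figure in perspective, the short calculation below converts it into a per-second rate and a per-image time. It is only a back-of-the-envelope sketch: it assumes the GPU stays busy for the full 24 hours, which the quoted figure does not spell out.

```python
# Back-of-the-envelope check of the "60 million images a day" figure.
# Assumption: the GPU processes images continuously for all 24 hours.
IMAGES_PER_DAY = 60_000_000      # figure quoted from the Caffe website
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds

images_per_second = IMAGES_PER_DAY / SECONDS_PER_DAY
ms_per_image = 1000 * SECONDS_PER_DAY / IMAGES_PER_DAY

print(f"{images_per_second:.0f} images per second")
print(f"{ms_per_image:.2f} ms per image")
```

    Under that assumption, the rate works out to roughly 694 images per second, or about 1.44 milliseconds per image.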

    2. CNTK

    Short for Computational Network Toolkit, CNTK is one of Microsoft's open source artificial intelligence tools. It boasts outstanding performance whether it is running on a system with only CPUs, a single GPU, multiple GPUs or multiple machines with multiple GPUs. Microsoft has primarily utilized it for research into speech recognition, but it is also useful for applications like machine translation, image recognition, image captioning, text processing, language understanding and language modeling.

    3. Deeplearning4j

    Deeplearning4j is an open source deep learning library for the Java Virtual Machine (JVM). It runs in distributed environments and integrates with both Hadoop and Apache Spark. It makes it possible to configure deep neural networks, and it's compatible with Java, Scala and other JVM languages.

    The project is managed by a commercial company called Skymind, which offers paid support, training and an enterprise distribution of Deeplearning4j.

    4. Distributed Machine Learning Toolkit

    Like CNTK, the Distributed Machine Learning Toolkit (DMTK) is one of Microsoft's open source artificial intelligence tools. Designed for use in big data applications, it aims to make it faster to train AI systems. It consists of three key components: the DMTK framework, the LightLDA topic model algorithm, and the Distributed (Multisense) Word Embedding algorithm. As proof of DMTK's speed, Microsoft says that on an eight-machine cluster, it can "train a topic model with 1 million topics and a 10-million-word vocabulary (for a total of 10 trillion parameters), on a document collection with over 100-billion tokens," a feat that is unparalleled by other tools.
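
    The "10 trillion parameters" in Microsoft's claim follows directly from multiplying the number of topics by the vocabulary size, as the sketch below shows. The storage estimate at the end is purely illustrative: it assumes every parameter is stored densely as a 4-byte value, which is our assumption and not a description of how LightLDA actually stores its model.

```python
# Where the "10 trillion parameters" figure comes from.
topics = 1_000_000          # 1 million topics (from Microsoft's claim)
vocabulary = 10_000_000     # 10-million-word vocabulary (from Microsoft's claim)

# A topic model holds one weight per (topic, word) pair.
parameters = topics * vocabulary
print(f"{parameters:,} parameters")

# Illustrative assumption only: dense storage at 4 bytes per parameter.
BYTES_PER_PARAMETER = 4
dense_terabytes = parameters * BYTES_PER_PARAMETER / 10**12
print(f"{dense_terabytes:.0f} TB if stored densely")
```

    Even under this rough assumption, a model of that size would be far too large for a single machine's memory, which is why distributing the training matters.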

    5. H2O

    Focused more on enterprise uses for AI than on research, H2O has large companies like Capital One, Cisco, Nielsen Catalina, PayPal and Transamerica among its users. It claims to make it possible for anyone to use the power of machine learning and predictive analytics to solve business problems. It can be used for predictive modeling, risk and fraud analysis, insurance analytics, advertising technology, healthcare and customer intelligence.

    It comes in two open source versions: standard H2O and Sparkling Water, which is integrated with Apache Spark. Paid enterprise support is also available.

    6. Mahout

    An Apache Foundation project, Mahout is an open source machine learning framework. According to its website, it offers three major features: a programming environment for building scalable algorithms, premade algorithms for tools like Spark and H2O, and a vector-math experimentation environment called Samsara. Companies using Mahout include Adobe, Accenture, Foursquare, Intel, LinkedIn, Twitter, Yahoo and many others. Professional support is available through third parties listed on the website.

    7. MLlib

    Known for its speed, Apache Spark has become one of the most popular tools for big data processing. MLlib is Spark's scalable machine learning library. It integrates with Hadoop and interoperates with both NumPy and R. It includes a host of machine learning algorithms for classification, regression, decision trees, recommendation, clustering, topic modeling, feature transformations, model evaluation, ML pipeline construction, ML persistence, survival analysis, frequent itemset and sequential pattern mining, distributed linear algebra and statistics.

    8. NuPIC

    Managed by a company called Numenta, NuPIC is an open source artificial intelligence project based on a theory called Hierarchical Temporal Memory, or HTM. Essentially, HTM is an attempt to create a computer system modeled after the human neocortex. The goal is to create machines that "approach or exceed human level performance for many cognitive tasks."

    In addition to the open source license, Numenta offers NuPIC under a commercial license, and it also licenses the patents that underlie the technology.

    9. OpenNN

    Designed for researchers and developers with an advanced understanding of artificial intelligence, OpenNN is a C++ programming library for implementing neural networks. Its key features include deep architectures and fast performance. Extensive documentation is available on the website, including an introductory tutorial that explains the basics of neural networks. Paid support for OpenNN is available through Artelnics, a Spain-based firm that specializes in predictive analytics.

    10. OpenCyc

    Developed by a company called Cycorp, OpenCyc provides access to the Cyc knowledge base and commonsense reasoning engine. It includes more than 239,000 terms, about 2,093,000 triples, and about 69,000 owl:sameAs links to external semantic data namespaces. It is useful for rich domain modeling, semantic data integration, text understanding, domain-specific expert systems and game AIs. The company also offers two other versions of Cyc: one for researchers that is free but not open source and one for enterprise use that requires a fee.
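
    For readers unfamiliar with the term, a triple is a (subject, predicate, object) statement, and an owl:sameAs link says that a term in one namespace refers to the same thing as a term in another. The sketch below is a toy in-memory store that shows the idea. All identifiers in it are made up for illustration; they are not actual OpenCyc content, and this is not how the Cyc reasoning engine is implemented.

```python
from collections import defaultdict

# Hypothetical triples; the "ex:" and "other:" terms are illustrative only.
triples = [
    ("ex:Dog", "rdfs:subClassOf", "ex:Mammal"),
    ("ex:Mammal", "rdfs:subClassOf", "ex:Animal"),
    ("ex:Dog", "owl:sameAs", "other:Dog"),
]

def build_index(triples):
    """Index objects by (subject, predicate) for quick lookup."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[(subject, predicate)].add(obj)
    return index

def ancestors(index, term):
    """Collect every class reachable by following rdfs:subClassOf links."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in index.get((current, "rdfs:subClassOf"), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

index = build_index(triples)
dog_ancestors = sorted(ancestors(index, "ex:Dog"))
dog_aliases = sorted(index[("ex:Dog", "owl:sameAs")])
print(dog_ancestors)
print(dog_aliases)
```

    Following the subclass links is a very simple form of inference: the store never states directly that a dog is an animal, but the chain of triples implies it.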

    11. Oryx 2

    Built on top of Apache Spark and Kafka, Oryx 2 is a specialized application development framework for large-scale machine learning. It utilizes a unique lambda architecture with three tiers. Developers can use Oryx 2 to create new applications, and it also includes some pre-built applications for common big data tasks like collaborative filtering, classification, regression and clustering. The big data tool vendor Cloudera created the original Oryx 1 project and has been heavily involved in continuing development.
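
    In a lambda architecture, the three tiers are conventionally a batch layer, a speed layer and a serving layer. The toy sketch below illustrates that division of labor with a simple event counter. It is a generic illustration under our own assumptions: the class and method names are hypothetical, and it does not use Oryx 2's API, Spark or Kafka.

```python
from collections import Counter

class BatchLayer:
    """Periodically rebuilds the model from the full event history."""
    def build_model(self, history):
        return Counter(history)

class SpeedLayer:
    """Keeps incremental counts for events that arrived after the last batch run."""
    def __init__(self):
        self.recent = Counter()

    def update(self, event):
        self.recent[event] += 1

class ServingLayer:
    """Answers queries by combining the batch model with the speed layer's updates."""
    def __init__(self, batch_model, speed_layer):
        self.batch_model = batch_model
        self.speed_layer = speed_layer

    def count(self, item):
        return self.batch_model[item] + self.speed_layer.recent[item]

# Historical data goes through the batch layer.
history = ["a", "b", "a"]
batch_model = BatchLayer().build_model(history)

# New events reach the speed layer before the next batch run.
speed = SpeedLayer()
speed.update("a")
speed.update("c")

serving = ServingLayer(batch_model, speed)
print(serving.count("a"))
print(serving.count("c"))
```

    The point of the split is that queries see fresh data immediately through the speed layer, while the slower batch layer periodically recomputes an accurate model from everything seen so far.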

    12. PredictionIO

    In February 2016, Salesforce bought PredictionIO, and then in July, it contributed the platform and its trademark to the Apache Foundation, which accepted it as an incubator project. So while Salesforce is using PredictionIO technology to advance its own machine learning capabilities, work will also continue on the open source version. It helps users create predictive engines with machine learning capabilities that can be used to deploy Web services that respond to dynamic queries in real time.

    13. SystemML

    First developed by IBM, SystemML is now an Apache big data project. It offers a highly scalable platform that can implement high-level math and algorithms written in R or a Python-like syntax. Enterprises are already using it to track customer service on auto repairs, to direct airport traffic and to link social media data with banking customers. It can run on top of Spark or Hadoop.

    14. TensorFlow

    TensorFlow is one of Google's open source artificial intelligence tools. It offers a library for numerical computation using data flow graphs. It can run on a wide variety of systems with one or more CPUs and GPUs, and it even runs on mobile devices. It boasts deep flexibility, true portability, automatic differentiation capabilities and support for Python and C++. The website includes a very extensive list of tutorials and how-tos for developers or researchers interested in using or extending its capabilities.
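
    The two ideas named here, data flow graphs and automatic differentiation, can be illustrated in a few lines of plain Python. The sketch below builds a tiny graph for z = x*y + x and then walks it backward to compute gradients. It is a conceptual toy, not TensorFlow's API: the names Node, variable, add, mul and backward are all hypothetical.

```python
class Node:
    """One node in a data flow graph: a value plus links to its input nodes."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents   # tuple of (parent_node, local_gradient) pairs
        self.grad = 0.0

def variable(value):
    return Node(float(value))

def add(a, b):
    # d(a+b)/da = 1 and d(a+b)/db = 1
    return Node(a.value + b.value, ((a, 1.0), (b, 1.0)))

def mul(a, b):
    # d(a*b)/da = b and d(a*b)/db = a
    return Node(a.value * b.value, ((a, b.value), (b, a.value)))

def backward(output):
    """Reverse-mode differentiation: push gradients from the output to the inputs."""
    order, seen = [], set()

    def visit(node):
        # Depth-first walk that lists every node after the nodes it depends on.
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)

    visit(output)
    output.grad = 1.0
    # Process consumers before producers so each gradient is complete when used.
    for node in reversed(order):
        for parent, local_grad in node.parents:
            parent.grad += node.grad * local_grad

x = variable(3.0)
y = variable(4.0)
z = add(mul(x, y), x)   # z = x*y + x
backward(z)
print(z.value, x.grad, y.grad)
```

    With x = 3 and y = 4, the graph evaluates to 15, and the backward pass gives dz/dx = y + 1 = 5 and dz/dy = x = 3, matching what you would get by differentiating by hand.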

    15. Torch

    Torch describes itself as "a scientific computing framework with wide support for machine learning algorithms that puts GPUs first." The emphasis here is on flexibility and speed. In addition, it's fairly easy to use, with packages for machine learning, computer vision, signal processing, parallel processing, image, video, audio and networking. It relies on LuaJIT, a just-in-time compiler for the Lua scripting language.

Artificial Intelligence (AI) is one of the hottest areas of technology research. Companies like IBM, Google, Microsoft, Facebook and Amazon are investing heavily in their own R&D, as well as buying up startups that have made progress in areas like machine learning, neural networks, natural language and image processing. Given the level of interest, it should come as no surprise that a recent artificial intelligence report from experts at Stanford University concluded that "increasingly useful applications of AI, with potentially profound positive impacts on our society and economy are likely to emerge between now and 2030."

In a recent article, we provided an overview of 45 AI projects that seem particularly promising or interesting. In this slideshow, we're focusing in on open source artificial intelligence tools, with a closer look at fifteen of the best-known open source AI projects.

Image source: Microsoft.com

Photo courtesy of Shutterstock.


