I had an update from IBM’s AI Research unit spokesperson Lisa Amini this week, and she walked me through some of their work on automating AI creation. This technology is still under development and doesn’t yet have an announced release date, but it could have a massive impact on our ability to research large-scale problems like pandemics.
You see, we have a huge language barrier between those who need this information (medical professionals) and those who build AIs (data scientists). The result is that building AI solutions at scale has been far more difficult than most imagined. There isn’t just a shortage of data scientists; there is a massive shortage of data scientists who understand the various environments they are trying to model.
With Covid-19, we also have the significant problem that many different systems, geographies, and people are attempting to capture the data surrounding this virus, let alone analyze it. The result is that inefficiencies in the system are delaying the identification of mitigating medications and making a cure unlikely in the near term.
Let’s talk about how this advanced automated AI creation tool, once it becomes available, could significantly shorten related AI machine learning efforts, like the Covid-19 analysis now being run on supercomputers like the IBM Summit machine, and increase their accuracy.
The Data Scientist Problem
We have a severe shortage of qualified data scientists, and data scientists who are also operational experts in the areas they are asked to analyze are scarce. The result is that a lot of AI projects use the wrong models, are biased, and generate results that can’t be trusted.
The people operating a company don’t understand how to create an AI, and the people tasked with creating the AI have no clue about the operational needs of the company. This problem is a Tower of Babel kind of communication problem that significantly impedes advancement in this space.
Also, nearly every AI effort is a custom one-off, which makes data collection problematic. In the case of the Covid-19 virus, we are collecting massive amounts of data. Still, because nearly every system is different, our ability to combine and analyze these massive repositories is limited, and the outcomes are biased and unreliable.
And this is on top of the fact that problems like Covid-19 are extremely convoluted, making it very difficult to build AI systems that can produce meaningful and actionable results.
AIs Building AIs
What makes this future capability unusual is that it applies AI technology to the creation of the AI itself. When you put a data set into the tool, it attempts to figure out what you might want and then makes configuration choices based on its experience. The user can then change parameters until they get the AI that comes closest to what they are trying to accomplish.
This isn’t a fail-and-start-over process; it is a fail, modify, and rerun process, potentially cutting down significantly the time it takes to go from data to actionable information. The result is the creation of AI systems that are more flexible in use and more adjustable after the fact. You can reach the intended result through rapid iteration.
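To make the fail, modify, and rerun idea concrete, here is a minimal sketch in plain Python. It is not IBM’s tool or its API. The `Config` class, the toy nearest-neighbor model, and the data are all invented for illustration, on the assumption that "configuration choices" means trying several candidate settings and keeping the one that scores best on held-out data.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class Config:
    """Hypothetical tunable settings for a toy nearest-neighbor model."""
    k: int           # how many neighbors vote
    weighted: bool   # weight votes by inverse distance


def predict(train, x, cfg):
    """Classify x from (feature, label) pairs using k nearest neighbors."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:cfg.k]
    votes = {}
    for feature, label in nearest:
        weight = 1.0 / (abs(feature - x) + 1e-9) if cfg.weighted else 1.0
        votes[label] = votes.get(label, 0.0) + weight
    return max(votes, key=votes.get)


def score(train, valid, cfg):
    """Fraction of validation points classified correctly."""
    hits = sum(predict(train, x, cfg) == y for x, y in valid)
    return hits / len(valid)


def auto_search(train, valid, ks=(1, 3, 5), weightings=(False, True)):
    """Try every candidate configuration and return the best scorer."""
    candidates = [Config(k, w) for k, w in product(ks, weightings)]
    return max(candidates, key=lambda cfg: score(train, valid, cfg))


# Toy data, invented for illustration: one numeric feature, two labels.
train = [(x, "low" if x < 5 else "high") for x in range(10)]
valid = [(1.5, "low"), (4.4, "low"), (5.6, "high"), (7.5, "high")]

best = auto_search(train, valid)      # the tool's first guess
tweaked = replace(best, k=7)          # the user changes one parameter
print(best, score(train, valid, best))
print(tweaked, score(train, valid, tweaked))  # rerun, nothing rebuilt
```

The point of the sketch is the last few lines: the user adjusts one setting and reruns the evaluation instead of rebuilding the whole pipeline from scratch.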
The process also helps the user break down complex problems into simpler components that can then be layered to reach the desired solution. Also, rather than taking raw data from various incompatible systems and then mis-analyzing the resulting mess, which is a common problem, the system looks at the results from those different systems and integrates those results into the final product.
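One common way to integrate results from separate systems, rather than merging their raw data, is to combine each system’s output with a weighted average. The sketch below illustrates that general idea only. It is an assumption about what such integration could look like, not a description of IBM’s method, and the system names, probabilities, and sample sizes are hypothetical.

```python
def fuse_results(results, weights=None):
    """Weighted average of per-system probability estimates.

    results: dict mapping system name -> probability in [0, 1]
    weights: optional dict mapping system name -> non-negative weight
             (for example, each system's sample size); equal if omitted.
    """
    if not results:
        raise ValueError("need at least one system result")
    if weights is None:
        weights = {name: 1.0 for name in results}
    total = sum(weights[name] for name in results)
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(results[name] * weights[name] for name in results) / total


# Hypothetical outputs from three incompatible record systems.
system_outputs = {"hospital_a": 0.80, "clinic_b": 0.60, "lab_c": 0.70}
sample_sizes = {"hospital_a": 300, "clinic_b": 100, "lab_c": 100}

equal = fuse_results(system_outputs)                   # every system counts the same
by_size = fuse_results(system_outputs, sample_sizes)   # larger systems count more
print(round(equal, 2), round(by_size, 2))
```

Because each system is analyzed on its own terms first, nothing here requires the underlying records to share a schema; only the outputs need a common scale.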
Other benefits are faster searches with greater accuracy and less unintended bias, far more success with unusual data sets like those that come from IoT devices, and a far more successful path to integrating multi-modal data sets. Oh, and a far easier path to integrating the resulting system with Business KPIs (Key Performance Indicators).
Making AIs Faster, Better, and Cheaper
As we approach new and critical problems like those surrounding the Covid-19 virus, having the ability to deploy our most powerful analytics tools quickly will not only result in a better and faster response, it will save lives. AutoML is one of the more exciting things being worked on in IBM’s labs, and it is on the path to a future of AIs that can create AIs without the need for data scientists. Given the threats we face as a race, this capability can’t come soon enough.