By John P. Desmond, AI Trends Editor
Machine learning has been used successfully in many disciplines that increasingly depend on it. However, that success relies on human machine learning experts to perform many tasks by hand, according to an account on AutoML.org, a website of the AutoML community.
These tasks include preprocessing and cleaning the data; selecting and constructing appropriate features; selecting an appropriate model family; optimizing model hyperparameters; post-processing machine learning models; and critically analyzing the results.
The growth of machine learning applications has created a demand for off-the-shelf machine learning methods that can be used more easily and without necessarily requiring expert knowledge. The goal is to progressively automate these manual tasks in what is being called AutoML.
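To make the idea concrete, here is a minimal sketch of two of the tasks listed above, selecting a model family and optimizing its hyperparameters, reduced to an automated loop. It is not how any particular AutoML product works; the synthetic data, the two toy model families, and names such as `SEARCH_SPACE` and `auto_select` are assumptions made for illustration.

```python
import random

def make_data(n, seed):
    """Synthetic 1-D regression data, y = 2x + noise (illustrative only)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))
    return data

def fit_mean(train, params):
    """Baseline model family: always predict the mean training target."""
    m = sum(y for _, y in train) / len(train)
    return lambda x: m

def fit_knn(train, params):
    """k-nearest-neighbour regression; k is the hyperparameter."""
    k = params["k"]
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def mse(model, data):
    """Mean squared error of a fitted model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical search space: each model family with candidate hyperparameters.
SEARCH_SPACE = [
    ("mean_baseline", fit_mean, [{}]),
    ("knn", fit_knn, [{"k": k} for k in (1, 3, 5, 9, 15)]),
]

def auto_select(train, valid):
    """Try every (family, hyperparameters) pair; keep the lowest validation error."""
    best = None
    for name, fit, grid in SEARCH_SPACE:
        for params in grid:
            score = mse(fit(train, params), valid)
            if best is None or score < best[0]:
                best = (score, name, params)
    return best

data = make_data(200, seed=0)
train, valid = data[:150], data[150:]
best_score, best_name, best_params = auto_select(train, valid)
print(best_name, best_params, round(best_score, 3))
```

Real tools search far larger spaces and use smarter strategies than this exhaustive loop, but the division of labor is the same: the person supplies data and a metric, and the system tries the combinations.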
Most companies offering AutoML solutions are positioning them as tools to increase the productivity of data scientists and to simplify the process so that it is more accessible to new AI developers, according to an account in Towards Data Science written by Justin Tennenbaum, a data scientist.
The machine learning field is trying to move away from “black box” models and is instead trying simpler models that are easier to interpret. However, AutoML has the potential to make it harder to tell whether a model is introducing bias, because it hides the mathematics of the model and performs so many tasks in the background.
“If there is bias in the real world it can easily be increased by not accounting for it properly in the model,” Tennenbaum stated. “AutoML might cause more harm than good if we use it to fully automate the process. However, full automation is only the goal of a handful of researchers of companies and there is another side which might benefit the community greatly.”
That other side includes tools that give a quick overview of the data and suggest which models and techniques are most effective, thus helping the data scientist save time. He cited TPOT, created by the Epistasis Lab, an AutoML library in Python built on top of scikit-learn, a popular machine learning library. It is designed to run a quick analysis of the subject data and let the data scientist know which models, features, and parameters are more effective than others.
“TPOT should be thought of as a place to begin or a model comparison and not as a final product,” Tennenbaum stated.
Microsoft Cites “Over-Fitting” Risk
A recent posting on the Microsoft Azure training site addresses some risks of AutoML as it guides developers in how to use it. For example, “over-fitting” in machine learning occurs when a model fits the training data too well and, as a result, may not accurately predict unseen test data. “The model has simply memorized specific patterns and noise in the training data, but is not flexible enough to make predictions on real data,” the article states.
To address the issue, machine learning best practice calls for using more training data and for simplifying the model with fewer features. “Using more data is the simplest and best possible way to prevent over-fitting,” the article states. Of course, that might be easier said than done if more data is not available.
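The memorization effect is easy to reproduce on a small scale. The sketch below is not taken from the Microsoft article; it assumes synthetic noisy data and a toy nearest-neighbor model, and all function names are illustrative. With `k=1` the model memorizes every training point, noise included, while `k=9` stands in for a simpler, smoother model.

```python
import math
import random

def make_data(n, rng, noise=0.3):
    """Synthetic data: y = sin(x) plus Gaussian noise (illustrative only)."""
    points = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0 * math.pi)
        points.append((x, math.sin(x) + rng.gauss(0.0, noise)))
    return points

def knn_model(train, k):
    """Predict by averaging the targets of the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def mse(model, data):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)

rng = random.Random(42)
train = make_data(100, rng)   # data the model is fitted on
test = make_data(500, rng)    # unseen data from the same process

results = {}
for k in (1, 9):
    model = knn_model(train, k)
    results[k] = (mse(model, train), mse(model, test))
    print(f"k={k}: train MSE={results[k][0]:.3f}, test MSE={results[k][1]:.3f}")
```

The memorizing model scores a perfect zero error on its own training data yet does worse on the unseen set, which is the gap the article warns about. The smoother model shows a much smaller gap between the two numbers.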
A list of AutoML tools recently published by AIMultiple describes three types of AutoML solution providers: open source, startups and tech giants.
Because the practice in AI is to publish research findings, many competitive open source AutoML tools are available. Most need an active development environment in Python or R and require the developer to write some code to initiate the machine learning process.
Many startups are launching and raising money to try to capture this opportunity. Most try to offer tools that a non-technical user can run. Some offer visualization to explain their findings and the resulting model.
Among tech giants, Google’s Cloud AutoML was one of the first offerings to launch. IBM’s SPSS, among the most popular analytics software packages, has been offering various AutoML tools such as an auto classifier.
The site listed some 18 suppliers of AutoML tools in its coverage.
One startup not listed in that account is Aible, which offers an AutoML platform the company describes on its website as enabling “citizen data scientists to rapidly build AI models.” Several of its founders worked at BeyondCore, a supplier of AI-enabled business analytics acquired by Salesforce in 2016.
In response to a few queries from AI Trends, sent during the first week of coronavirus social distancing, Aible CEO and co-founder Arijit Sengupta offered these responses:
“According to a 2019 survey by MIT-Sloan and the Boston Consulting Group, 65% of companies report getting no business value from AI. Why?
“The fact is, without business adoption, you can’t get value from AI. Period. Aible bridges the adoption gap by meeting users where they are with tools tailored to their existing skills and needs.”
He discussed how the platform offers a portfolio of predictive models that take into account a range of potential business realities. This is said to enable team members to contribute their individual business insights to AI projects.
“Needless to say, the world has changed overnight. It’s already clear that businesses can expect an extended period of economic uncertainty in the weeks and months to come. AI can be an invaluable tool to help businesses navigate an economic downturn, at a time when making the right decisions can mean the difference between managing a recession and being overwhelmed by it.”
Spoken like a true entrepreneur.