Wednesday, July 24, 2024

Keeping Machine Learning Algorithms Humble and Honest in the ‘Ethics-First’ Era

Datamation content and product recommendations are editorially independent. We may make money when you click on links to our partners. Learn More.

By Davide Zilli, Client Services Director at Mind Foundry

Today in so many industries, from manufacturing and life sciences to financial services and retail, we rely on algorithms to conduct large-scale machine learning analysis. They are hugely effective for problem-solving and beneficial for augmenting human expertise within an organization. But they are now under the spotlight for many reasons – and regulation is on the horizon, with Gartner projecting four of the G7 countries will establish dedicated associations to oversee AI and ML design by 2023. It remains vital that we understand their reasoning and decision-making process at every step.

Algorithms need to be fully transparent in their decisions, easily validated and monitored by a human expert. Machine learning tools must introduce this full accountability to evolve beyond unexplainable ‘black box’ solutions and eliminate the easy excuse of “the algorithm made me do it!”

The Need to Put Bias in its Place

Bias can be introduced into the machine learning process as early as the initial data upload and review stages. There are hundreds of parameters to take into consideration during data preparation, so it can often be difficult to strike a balance between removing bias and retaining useful data.

Gender for example might be a useful parameter when looking to identify specific disease risks or health threats, but using gender in many other scenarios is completely unacceptable if it risks introducing bias and, in turn, discrimination. Machine learning models will inevitably exploit any parameters – such as gender – in data sets they have access to, so it is vital for users to understand the steps taken for a model to reach a specific conclusion.

Lifting the Curtain on Machine Learning

Removing the complexity of the data science procedure will help users discover and address bias faster – and better understand the expected accuracy and outcomes of deploying a particular model.

Machine learning tools with built-in explainability allow users to demonstrate the reasoning behind applying ML to a tackle a specific problem, and ultimately justify the outcome. First steps towards this explainability would be features in the ML tool to enable the visual inspection of data – with the platform alerting users to potential bias during preparation – and metrics on model accuracy and health, including the ability to visualize what the model is doing.

Beyond this, ML platforms can take transparency further by introducing full user visibility, tracking each step through a consistent audit trail. This records how and when data sets have been imported, prepared and manipulated during the data science process. It also helps ensure compliance with national and industry regulations – such as the European Union’s GDPR ‘right to explanation’ clause – and helps effectively demonstrate transparency to consumers.

Providing Humans with the Tools to make Breakthrough Discoveries

There is a further advantage here of allowing users to quickly replicate the same preparation and deployment steps, guaranteeing the same results from the same data – particularly vital for achieving time efficiencies on repetitive tasks. We find for example in the Life Sciences sector, users are particularly keen on replicability and visibility for ML where it becomes an important facility in areas such as clinical trials and drug discovery.

Models Need to be held Accountable…

There are so many different model types that it can be a challenge to select and deploy the best model for a task. Deep neural network models, for example, are inherently less transparent than probabilistic methods, which typically operate in a more ‘honest’ and transparent manner.

Here’s where many machine learning tools fall short. They’re fully automated with no opportunity to review and select the most appropriate model. This may help users rapidly prepare data and deploy a machine learning model, but it provides little to no prospect of visual inspection to identify data and model issues.

An effective ML platform must be able to help identify and advise on resolving possible bias in a model during the preparation stage, and provide support through to creation – where it will visualize what the chosen model is doing and provide accuracy metrics – and then on to deployment, where it will evaluate model certainty and provide alerts when a model requires retraining.

…and Subject to Testing Procedures

Building greater visibility into data preparation and model deployment, we should look towards ML platforms that incorporate testing features, where users can test a new data set and receive best scores of the model performance. This helps identify bias and make changes to the model accordingly.

During model deployment, the most effective platforms will also extract extra features from data that are otherwise difficult to identify and help the user understand what is going on with the data at a granular level, beyond the most obvious insights.

The end goal is to put power directly into the hands of the users, enabling them to actively explore, visualize and manipulate data at each step, rather than simply delegating to an ML tool and risking the introduction of bias.

Industry Leaders can Drive the Ethics Debate Forward

The introduction of explainability and enhanced governance into ML platforms is an important step towards ethical machine learning deployments, but we can and should go further.

Researchers and solution vendors hold a responsibility as ML educators to inform users of the use and abuses of bias in machine learning. We need to encourage businesses in this field to set up dedicated education programs on machine learning including specific modules that cover ethics and bias, explaining how users can identify and in turn tackle or outright avoid the dangers.

Raising awareness in this manner will be a key step towards establishing trust for AI and ML in sensitive deployments such as medical diagnoses, financial decision-making and criminal sentencing.

Time to Break Open the Black Boxes

AI and machine learning offer truly limitless potential to transform the way we work, learn and tackle problems across a range of industries—but ensuring these operations are conducted in an open and unbiased manner is paramount to winning and retaining both consumer and corporate trust in these applications.

The end goal is truly humble, honest algorithms that work for us and enable us to make unbiased, categorical predictions and consistently provide context, explainability and accuracy insights.

Recent research shows that 84% of CEOs agree that AI-based decisions must be explainable in order to be trusted. The time is ripe to embrace AI and ML solutions with baked in transparency.

About the author: 

Davide Zilli, Client Services Director at Mind Foundry

Subscribe to Data Insider

Learn the latest news and best practices about data science, big data analytics, artificial intelligence, data security, and more.

Similar articles

Get the Free Newsletter!

Subscribe to Data Insider for top news, trends & analysis

Latest Articles