CMO

Data scientist: Black box algorithms should not be applied to human outcomes

Algorithms must be transparent, accountable, and interpretable, says University of Sydney data science lecturer and expert

Where human outcomes are the goal, black box algorithms should not be allowed, a data scientist from the University of Sydney believes.

Speaking to CMO ahead of the University’s Ethics of Data Science conference next week, Dr Roman Marchant, lecturer and data scientist at the University of Sydney, said that while it is common practice for big companies to use black box algorithms to come up with an output, they should not be applied to the lives of humans.

This is because you cannot ‘open’ the black box and discover how the algorithm is coming up with the outcomes it does, so it is not transparent, accountable, or interpretable, he said.

“It is common for companies to use black box algorithms, otherwise known as deep learning or neural networks, and spit out an output,” Dr Marchant said. “We do use them for some problems, but it’s very tricky to use them on human data, and we are completely against using them for when the result affects the lives of humans, because you cannot open the box and uncover what the algorithm is doing in the background.

“The type of problems where we do apply these models is in applications where no humans can be affected, such as for automated machinery and so on. But, when it comes to humans, we believe these models shouldn’t be allowed.”

While analysing data to improve the customer journey and personalisation is now key to marketing, it is far from the silver bullet many marketers believe it to be, as most data contains inherent bias.

According to the University of Sydney, algorithms are a fundamental tool in everyday machine learning and artificial intelligence, but experts have identified a number of ethical problems. Models built with biased and inaccurate data can have serious implications and dangerous consequences, ranging from the legal and safety implications of self-driving cars and incorrect criminal sentencing, to the use of automated weapons in war.

Before data is used, Dr Marchant advised it be evaluated by an inter-disciplinary team to ensure it is not full of errors, both to avoid stereotyping and to improve the customer experience.

“The models that are built are very generic and could be used for whatever outcome you want. All the models are fairly similar, except for the response variable. But there is a concern around biased data, and most companies don’t know what the existing bias is in the data,” he explained.

“For example, we all have the right to be offered the same product. Usually you wouldn’t discriminate and only offer a product to a certain subset of the population. However, if a data set has an internal bias that males are more likely to buy a product than females, you may end up only marketing that product to men, even though perhaps all your marketing to date has been aimed at men, which then means only men are buying it, which affects the data.”

Context around the data is very important, Dr Marchant continued. “Companies need to be able to quantify and correct that bias, which is tricky to do,” he said. “To do so, we have to consider the decisions that have been made in the past and take them into account when building the models.

“Companies must start taking into account other explanatory variables, and they need to be constantly revising models and thinking about what they are doing. Causal effects need to be uncovered, not just correlations and predictions with black box models.”

This is why transparency is vital, so that a third party can assess which algorithm a company is using with personal data. “Interpretability means the algorithm needs to be understandable, to enable someone to understand why the algorithm came up with the prediction it did. It means you can make the algorithm accountable for existing problems, like bias,” Dr Marchant said.

“This is not only for the protection of a customer, it also means you can open the box and understand how an algorithm is coming up with predictions and therefore understand your customer in a better way.”

The University of Sydney also recommends companies use one of the many institutes and data centres, such as The Gradient Institute, or an internal team of research engineers who can consult with third parties, to analyse and study the algorithms used and the way they make decisions, and to extract patterns from the data that indicate whether it is fair.

“Ultimately, if a big company has a lot of data and is using it to make decisions, they should have an internal team that does that, plus an external data team to evaluate that everything is transparent and done according to law,” Dr Marchant said.

The government also has a responsibility to work on laws that protect both companies and users and help resolve conflicts around the use of data, Dr Marchant said. As a step forward, he noted the Australian Human Rights Commission has already assembled a group to examine how AI affects human rights.
