New research explores how to filter out unfairness from machine learning

Businesses and institutions use machine learning techniques to make sensitive decisions and predictions about people, such as whether to grant someone a low-interest loan or how likely a person with a criminal conviction is to reoffend.

A team of researchers from the Alan Turing Institute, Matt Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva, have published new research proposing methods to analyse the fairness of such algorithms and see whether they contain hidden discrimination.

The team combined two branches of research in machine learning, using techniques from causal inference to analyse fairness. They call this approach “counterfactual fairness.”

Fairness has drawn a surge of recent interest from researchers, who have proposed a variety of mathematical definitions and endeavoured to modify algorithms to satisfy them. However, it is not always clear whether a particular definition works in practice or makes sense in a given setting.

In the paper, Turing researchers argue that algorithmic fairness must address the underlying causal factors behind historical unfairness in the data. This is lacking from previous research in this field and in the way many algorithms have been crafted so far.

Joshua Loftus, an author on the paper, said:

“One central fact of statistics is that ‘correlation’ is not ‘causation’. For example, there is a correlation between the number of Nobel prizes won by people in a given country and the level of chocolate consumption. Such correlations can often be explained by some other factor, in this case something like economic wealth. Wealth is required to fund the great educational institutions that conduct Nobel prize winning research and to purchase luxury goods like chocolate.”

Definitions of fairness that only address correlations may lead to nonsensical results and decisions, like increasing chocolate consumption in order to achieve better educational outcomes. In the worst case, algorithms based on such definitions may end up increasing unfairness in the long run.
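The chocolate example can be made concrete with a small simulation. This is a minimal sketch with invented numbers, not data or code from the paper: the variable names, the unit-variance noise and the linear relationships are all assumptions chosen for illustration. Wealth drives both chocolate consumption and Nobel prizes, and neither affects the other, yet the two still appear correlated (in theory about 0.5 under these assumptions) until wealth is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def residuals(ys, xs):
    """What is left of ys after a least-squares line on xs is subtracted."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [y - my - slope * (x - mx) for x, y in zip(xs, ys)]


# Hypothetical data-generating process (an assumption, not real data):
# wealth is the common cause; chocolate and Nobel prizes depend on
# wealth plus independent noise, and never on each other.
rng = random.Random(0)
wealth = [rng.gauss(0, 1) for _ in range(5000)]
chocolate = [w + rng.gauss(0, 1) for w in wealth]
nobel = [w + rng.gauss(0, 1) for w in wealth]

# Correlation seen when wealth is ignored.
raw_corr = pearson(chocolate, nobel)

# Correlation after removing the part of each variable explained by wealth.
adjusted_corr = pearson(residuals(chocolate, wealth), residuals(nobel, wealth))

print(f"ignoring wealth:      {raw_corr:.2f}")
print(f"adjusting for wealth: {adjusted_corr:.2f}")
```

Because Nobel prizes in this toy model are generated from wealth alone, raising chocolate consumption would change nothing about them, which is the sense in which a purely correlational definition can recommend a nonsensical intervention.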

In the United States, previous research has shown that black and Hispanic minorities are significantly more likely to be arrested for possession of marijuana than the white population despite nearly identical rates of marijuana usage between these groups. This is due, at least partially, to “broken windows” policing where neighbourhoods with ethnic minorities and poorer residents are more heavily patrolled. A definition of fairness that does not account for this underlying causal relationship may perpetuate unfairness rather than help address it.

The counterfactual fairness framework aims to address fairness through causal relationships rather than relying entirely on correlations, like the spurious one between race and arrest for marijuana possession. To illustrate this, the team analysed a similar policing example in the paper using stop and frisk data from New York City.


Figure: Understanding criminality. The maps show the decomposition of stop and search data in New York into factors based on perceived criminality (a race-dependent variable) and latent criminality (a race-neutral measure).
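The distinction between a race-dependent and a race-neutral factor can be illustrated with a toy counterfactual check. The sketch below is not the model used in the paper: the single structural equation X = ALPHA * A + U, the value of ALPHA, and the names Person, infer_latent and the two predictors are all hypothetical. It asks what a prediction would have been for the same individual had their protected attribute been different, and compares a predictor built on the observed feature with one built on the latent factor.

```python
from dataclasses import dataclass

# Hypothetical strength of the protected attribute's effect on the
# observed feature (an assumption for illustration only).
ALPHA = 2.0


@dataclass
class Person:
    a: int    # protected attribute, coded 0 or 1
    x: float  # observed feature, which is influenced by a


def observed_feature(a, u):
    """Assumed structural equation: X = ALPHA * A + U."""
    return ALPHA * a + u


def infer_latent(person):
    """Recover the latent factor U implied by the observed A and X."""
    return person.x - ALPHA * person.a


def counterfactual(person, new_a):
    """Same individual (same U), with the protected attribute set to new_a."""
    u = infer_latent(person)
    return Person(a=new_a, x=observed_feature(new_a, u))


def predictor_on_observed(person):
    """Uses the observed feature directly, so it inherits A's influence."""
    return 0.5 * person.x


def predictor_on_latent(person):
    """Uses only the latent factor, which does not depend on A."""
    return 0.5 * infer_latent(person)


person = Person(a=1, x=observed_feature(1, 0.3))
flipped = counterfactual(person, new_a=0)

# How much each prediction changes when only the protected attribute changes.
gap_observed = abs(predictor_on_observed(person) - predictor_on_observed(flipped))
gap_latent = abs(predictor_on_latent(person) - predictor_on_latent(flipped))

print("change in prediction using observed feature:", gap_observed)
print("change in prediction using latent factor:   ", gap_latent)
```

In this toy setting the predictor built on the observed feature gives a different answer in the counterfactual world, while the one built on the latent factor does not. In practice the latent factor cannot be read off exactly and has to be estimated under a causal model whose assumptions can themselves be questioned.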

One of the authors, Turing Faculty Fellow Ricardo Silva, added:

“Firms may have specific legal obligations regarding fairness and they could potentially use our framework not just as a way to achieve fairness, but also to explain how and why they are making those decisions. We would like to explore further how to relate our work to actual legal and regulatory requirements, and to collaborate with Turing partners or other organisations to apply this to real-world problems.”

This new publication links to previous research at The Alan Turing Institute querying algorithmic accountability.

For more information please read the full article or contact the authors directly: Matt Kusner, Joshua Loftus, Chris Russell, and Ricardo Silva.

To request a media interview please contact the communications team at The Alan Turing Institute.