LLMs could transform the financial sector, but first we need to make sure they’re fair

A new tool is aimed at helping businesses to tackle harmful bias throughout AI development

Monday 27 Jan 2025

The adoption of AI technologies in the financial sector is gathering pace. A recent survey by the Bank of England and Financial Conduct Authority found that 75% of UK financial services firms are using AI, with a further 10% planning to use it over the next three years. 

Large language models (LLMs) hold particular promise: this AI technology has been making waves with its ability to analyse text, generate high-quality responses, and make predictions. In the financial sector, LLMs could vastly improve efficiency and safety by, for example, detecting fraud, generating insights from financial data, and automating customer interactions. Financial institutions are already exploring the opportunities offered by these tools.

But LLMs also raise critical questions about fairness and accountability. A major concern is that LLMs could unintentionally perpetuate or even amplify bias, leading to discriminatory outcomes. For example, an LLM trained on historical lending data might reproduce biases against certain demographics, such as disproportionately rejecting loan applications from minority groups. 

A Turing project that I am involved in, commissioned by the Department for Science, Innovation and Technology and funded by Innovate UK, is working on ways to help businesses, particularly in the financial sector, to develop AI methods and tools where fairness is embedded from the beginning. 

Unchecked bias can disadvantage customers, undermine organisational trust and risk violating anti-discrimination regulations. Addressing bias in AI systems before they are widely deployed is therefore critical. 

Tackling bias in LLMs

Our goal is to help businesses – especially small- and medium-sized enterprises (SMEs) with limited technical capacity – to monitor and mitigate bias at all stages of AI development. We call this ‘end-to-end fairness-by-design’. 

Our methodology can be summarised as follows: 

1. Log potential sources of bias continuously using system documentation
From data collection to model deployment, AI development processes produce numerous logs and other artefacts that can be used to track potential fairness issues. Development teams should use these artefacts to monitor, automatically and continuously, how a model performs across different population groups, and, where relevant, share the results with colleagues in other areas of the business (a brief illustrative sketch follows this list).

2. Consider societal perspectives as well as technical ones
AI fairness is a sociotechnical problem, so it is important to align technical solutions with societal values and policies. In other words, addressing fairness requires choosing the fairness notions, metrics, data processing techniques, model design considerations and deployment practices that suit the societal context. A proactive way to do this is to start the project by reviewing any previous relevant incidents (e.g. using the AI Incident Database), creating use-case-specific fairness criteria, or ‘claims’, and then actively monitoring the codebase against those claims, while also attending to technical AI safety characteristics such as robustness, security and privacy (see the second sketch after this list). 

3. Integrate a fairness-by-design approach at the system design level
In software engineering, developers use well-documented best practices, or ‘patterns’, to create reliable and secure code by design – these techniques are a standard part of the software engineering curriculum. Similarly, AI developers should use design patterns, such as data unit tests, to share best practices for mitigating model bias (the final sketch after this list shows one such test). 
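
To make step 1 more concrete, the sketch below shows one way a development team might log per-group performance as a machine-readable artefact, using the open-source Fairlearn package. It is purely illustrative and is not taken from our tool: the column names, metrics and log format are assumptions chosen for the example.

```python
# A minimal sketch (not our tool) of per-group monitoring logged as a
# development artefact. Column names, metrics and the log format are
# illustrative assumptions.
import json
from datetime import datetime, timezone

import pandas as pd
from sklearn.metrics import accuracy_score
from fairlearn.metrics import MetricFrame, selection_rate


def log_fairness_snapshot(y_true, y_pred, sensitive, path="fairness_log.jsonl"):
    """Compute per-group metrics and append them to a JSON-lines log."""
    frame = MetricFrame(
        metrics={"accuracy": accuracy_score, "selection_rate": selection_rate},
        y_true=y_true,
        y_pred=y_pred,
        sensitive_features=sensitive,
    )
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "overall": {m: float(v) for m, v in frame.overall.items()},
        "by_group": {
            str(group): {m: float(v) for m, v in row.items()}
            for group, row in frame.by_group.iterrows()
        },
        # Largest between-group gap in approval (selection) rate.
        "selection_rate_gap": float(frame.difference()["selection_rate"]),
    }
    with open(path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return record


# Hypothetical loan-approval predictions with a sensitive attribute.
data = pd.DataFrame({
    "approved_true": [1, 0, 1, 1, 0, 1, 0, 0],
    "approved_pred": [1, 0, 1, 0, 0, 1, 1, 0],
    "group":         ["A", "A", "A", "A", "B", "B", "B", "B"],
})
log_fairness_snapshot(data["approved_true"], data["approved_pred"], data["group"])
```

Because the log here is an append-only JSON-lines file, it can be versioned alongside the codebase and reviewed like any other development artefact.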
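
Continuing the illustration for step 2, a use-case-specific fairness ‘claim’ could be written down as a declarative threshold and checked against the latest snapshot produced by the previous sketch. Again, the claim name, threshold and log format are hypothetical choices for this example, not a schema from our project.

```python
# A minimal sketch of a fairness 'claim' check against the log produced above.
# The claim name and threshold are hypothetical, chosen only for illustration.
import json

# Assumed claim for a loan-approval use case: the approval-rate gap between
# demographic groups should stay below 10 percentage points.
FAIRNESS_CLAIMS = {"selection_rate_gap": 0.10}


def check_claims(log_path="fairness_log.jsonl"):
    """Compare the most recent monitoring snapshot against the declared claims."""
    with open(log_path) as f:
        latest = json.loads(f.readlines()[-1])
    failures = {
        name: latest[name]
        for name, threshold in FAIRNESS_CLAIMS.items()
        if latest[name] > threshold
    }
    if failures:
        raise AssertionError(f"Fairness claims violated: {failures}")
    print("All fairness claims hold for the latest snapshot.")


check_claims()
```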
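
Finally, for step 3, one familiar pattern is the ‘data unit test’: a small check that runs automatically whenever the training data changes. The pytest sketch below is a minimal illustration; the thresholds, column names and representation policy are assumptions rather than recommendations.

```python
# A minimal sketch of a 'data unit test' pattern using pytest. Thresholds and
# column names are illustrative assumptions, not recommendations.
import pandas as pd
import pytest

MIN_GROUP_FRACTION = 0.05    # assumed policy: every group supplies >= 5% of rows
MAX_POSITIVE_RATE_GAP = 0.2  # assumed policy: max gap in positive-label rates


@pytest.fixture
def training_data():
    # In a real pipeline this would load the current training data snapshot.
    return pd.DataFrame({
        "approved": [1, 0, 1, 1, 0, 1, 0, 0, 1, 1],
        "group":    ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
    })


def test_groups_are_represented(training_data):
    shares = training_data["group"].value_counts(normalize=True)
    assert (shares >= MIN_GROUP_FRACTION).all(), f"Under-represented groups:\n{shares}"


def test_label_rates_are_comparable(training_data):
    rates = training_data.groupby("group")["approved"].mean()
    gap = rates.max() - rates.min()
    assert gap <= MAX_POSITIVE_RATE_GAP, f"Positive-label rate gap {gap:.2f}:\n{rates}"
```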

Our fairness monitoring and logging tool

Guided by this methodology, we recently released the first version of a monitoring and logging tool that helps developers draw on existing system documentation for fairness monitoring and follow a structured evaluation approach, with the aim of enabling a quick-start, fairness-oriented AI development journey. The tool also includes selected evaluation and mitigation techniques for natural language processing use cases. 

We focused on usability: the tool lowers barriers to adopting fairness practices, ensuring fairness is not an afterthought but a core design principle. To the same end, we are aiming for the tool to work seamlessly with popular existing AI fairness packages and frameworks such as Fairlearn and the AI Safety Institute’s Inspect. We have also released design patterns and ‘recipes’ for integrating fairness monitoring into development pipelines, with the aim of improving the developer experience of our tool. 

While our work is particularly relevant to the financial sector, its implications extend to other domains where AI bias is a major concern, including healthcare, education and recruitment. For example, an independent review into equity in medical devices in 2024 found that bias can arise in AI-enabled medical devices in a number of ways, including how the underlying algorithms are developed and tested. 

The tools and techniques that we are developing could help businesses across many sectors to ensure that their AI models and devices are equitable and free from harmful bias. 

What’s next?

Fairness in AI is not just a technical challenge: it’s a societal imperative. With the right tools and practices, we can harness the transformative power of LLMs in the financial sector and beyond, ensuring that their benefits are shared equitably. 

We are now planning workshops at the Turing on 5 February and 6 March 2025 to demonstrate our tool to businesses and gather feedback so that we can refine the tool and ensure that it meets real-world needs. 

If you are interested in finding out more or testing the tool within your organisation, we invite you to join us by filling out this expression of interest form.

 

Top image: RZ / Adobe Stock