Distributing false information in an organised manner to manipulate public opinion is a practice as old as civilisation, recorded as early as 44BC in campaigns against Roman leader Mark Antony.
However, the wide adoption of social media, combined with new generative language models, means that it has never been easier to spread misinformation.
Misinformation at scale
Traditionally, large misinformation campaigns have required significant resources and so were thought to be the preserve of states and state-backed groups, while smaller actors could only push their agendas through more targeted operations.
But as methods for replicating and sharing content become more accessible, these operations are becoming both easier to run and more common.
The wide adoption of social media has given anyone the ability to create content and disseminate it to large audiences. Manipulating social media can create the illusion of public opinion, through networks of ‘fake’ or ‘bot’ accounts controlled by a single entity. In this way, malicious actors can push narratives by manufacturing the appearance of large-scale interest and discussion on the social networks we use every day.
The traditional toolkit of someone running a malicious campaign typically involves two stages:
- The creation phase entails generating false news articles, social media posts and other misleading content.
- Dissemination involves setting up networks of websites and social media accounts to reshare the content, making it appear credible and popular.
Being able to create a variety of content is important for these strategies, as duplicate social media posts are easily spotted by users and platforms. While small-scale campaigns may rely on human-written content, scaling up requires automated content generation.
Before generative AI, this might have involved using templates to produce variations on the same basic sentence fragments. Now such content can be created much more easily, and it looks far more convincing.
How has generative AI changed the playing field?
Developments in generative AI over the last decade have produced models that can generate coherent and convincing text, image and audio content.
The launch of large language models (LLMs) such as OpenAI’s GPT-3, announced in 2020, and the many models that have followed, including Google’s Gemini and Anthropic’s Claude, has raised questions about the potential negative impacts of these technologies.
And we’re now seeing the real-world impact of their use by malicious actors, in the form of networks of AI-generated news websites and ‘deepfakes’ of politicians that attempt to sway the public.
Tools that generate realistic, nuanced natural language in response to a simple prompt are now available to anyone with an internet connection, enabling those who wish to spread misinformation to do so at scale for relatively little cost.
This is particularly concerning in a year of elections across the globe, raising fears over how information integrity, electoral processes and public trust could be disrupted and eroded.
What can we do to address these problems?
Model providers are working to ensure their models are aligned with human values, but there are limits to how far these safety efforts can go. For example, an ostensibly benign prompt such as “write a social media post” is unlikely to be refused by a model, even if the result is intended for malicious ends.
There are also known issues with how models handle truth, with their makers acknowledging that they ‘hallucinate’ information. And since sowing chaos is often the aim of malicious actors, exactly what the content claims frequently matters less than its ‘human-ness’.
How we evaluate AI models is a hot topic of debate as these systems continue to improve, and existing methods of judging whether a model is ‘safe’ often rely on measuring the extent to which it will engage in certain behaviours.
For the misuse cases described here, the extent to which a model’s output could pass as human-written is just as important as whether it will comply with the request, if not more so.
At the Turing we are working with partners such as the AI Safety Institute to investigate the degree to which widely available generative language models can be used to produce effective election misinformation campaigns.
This work includes systematic evaluation against datasets aligned with the stages of a typical misinformation campaign, outlined earlier, to measure how often models comply with these kinds of requests. It also includes experiments exploring how well human participants can distinguish between human- and AI-generated content, to understand the potential impact of material generated by these models.
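As a rough illustration of the kind of aggregate measures such a study might report, here is a minimal sketch in Python. The record formats, field names and toy numbers are assumptions for illustration only, not the actual datasets or evaluation pipeline; the sketch simply shows how a per-stage compliance rate and a human detection accuracy could be computed.

```python
# Minimal sketch of two aggregate measures such an evaluation might report.
# The record formats and toy data below are illustrative assumptions only.

from dataclasses import dataclass

@dataclass
class ComplianceRecord:
    stage: str       # campaign stage, e.g. "creation" or "dissemination"
    complied: bool   # did the model fulfil the request rather than refuse?

@dataclass
class Judgement:
    is_ai: bool      # ground truth: was the text AI-generated?
    judged_ai: bool  # did the human participant label it as AI-generated?

def compliance_rate(records, stage):
    """Share of requests at a given campaign stage that the model complied with."""
    subset = [r for r in records if r.stage == stage]
    return sum(r.complied for r in subset) / len(subset)

def detection_accuracy(judgements):
    """How often human participants correctly classified the text's origin."""
    return sum(j.is_ai == j.judged_ai for j in judgements) / len(judgements)

# Toy data purely for illustration
records = [
    ComplianceRecord("creation", True),
    ComplianceRecord("creation", False),
    ComplianceRecord("dissemination", True),
]
judgements = [
    Judgement(is_ai=True, judged_ai=False),   # AI text mistaken for human
    Judgement(is_ai=False, judged_ai=False),  # human text correctly identified
    Judgement(is_ai=True, judged_ai=True),    # AI text correctly identified
]

print(f"Creation-stage compliance: {compliance_rate(records, 'creation'):.0%}")
print(f"Human detection accuracy:  {detection_accuracy(judgements):.0%}")
```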
AI will join the fight against misinformation
It’s important to remember that every new technology has two sides.
While LLMs pose risks in their ability to generate natural language, they also hold potential for better systems for detecting harmful content such as misinformation. AI-based tools can analyse text on social media at pace and scale, detecting signals that distinguish computer-generated content from human-produced text, which can then be fact-checked by humans.
Other tools rely on human fact-checkers to identify misinformation first, before using AI to track and counter the false claims across the web.
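As a rough sketch of the first kind of signal, the example below scores posts by the perplexity of a small language model, on the heuristic that unusually predictable text is one weak indicator of machine generation. It assumes the Hugging Face transformers and PyTorch libraries, and the threshold is an illustrative assumption that would need tuning on labelled data.

```python
# Minimal sketch of a perplexity-based signal for flagging possibly
# machine-generated posts for human fact-checking. Assumes the Hugging Face
# `transformers` and `torch` libraries; the threshold is illustrative only.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Score how 'predictable' the text is to a small language model."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs, labels=inputs["input_ids"])
    return torch.exp(outputs.loss).item()

PERPLEXITY_THRESHOLD = 40.0  # illustrative assumption, not a validated cut-off

def flag_for_review(text: str) -> bool:
    """Unusually low perplexity is one weak signal of possible machine generation."""
    return perplexity(text) < PERPLEXITY_THRESHOLD

posts = [
    "Officials confirmed today that the new policy will take effect next week.",
    "lol can't believe that match last night, the ref was having a nightmare!!",
]
for post in posts:
    print(f"flag={flag_for_review(post)}  text={post!r}")
```

In practice, detection systems combine many such signals with trained classifiers and account metadata, and anything flagged is passed to human fact-checkers, as described above.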
As AI researchers, we recognise the huge benefits that generative AI and LLMs can offer to society, which makes it all the more important to manage the risks through ongoing research and the development of both technical and societal interventions.