Introduction

The Alan Turing Institute and its university partners are pleased to invite you to ‘Presenting the Turing Fellow Research Projects’, an event series taking place across the Institute’s university partner network from April 2021.

In 2018 over 300 Turing Fellows were appointed at the Institute following an open call. Some of these received additional funding to deliver research projects that have had substantial impact in the areas of data science and artificial intelligence. The events will showcase the breadth of research and demonstrate the impact of these research projects.

At each event Turing Fellows will share details of their research projects, their successes and impact, as well as their collaborations and next steps. Each event, coordinated by the respective university liaison team at each Turing university partner, will consist of two or three presentations followed by a Q&A session. Please click on the links below to book your place at a session.

Upcoming events

9 November, University of Warwick

10:00–12:30

Presentation title

Presenter

Description

Bayesian Decision Support for Frustrating Violent Criminals

Jim Smith

In an era of radicalisation, an increasing number of individuals are determined to undermine our democratic institutions through deadly attacks on the general public. Policing to mitigate this threat in a proportionate but effective way is challenging.

This talk will outline work at the Alan Turing Institute where Bayesian methodologies have been harnessed to support the frustration of such endeavours. The talk will begin with a description of the development of a model to help evaluate, in real time, the threat currently posed by triaged suspects, before moving on to present more recent models of violent gangs.

Professor Jim Smith will finish the talk by briefly discussing spin-off work concerning models to protect computer systems from the criminal exfiltration of sensitive documents and models to help catch modern-day slavers.
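
The talk's models are far richer, but a minimal sketch of real-time Bayesian threat evaluation might look as follows: a discrete prior over hypothetical threat states is updated by Bayes' rule as reports arrive. The states, likelihoods and observations below are all invented for illustration.

```python
import numpy as np

# Hypothetical threat states for a triaged suspect, and a prior over them.
states = ["dormant", "preparing", "mobilised"]
prior = np.array([0.90, 0.08, 0.02])

# Invented likelihoods P(observation | state); entries follow the state order above.
likelihood = {
    "acquires_materials": np.array([0.01, 0.30, 0.60]),
    "routine_activity":   np.array([0.80, 0.50, 0.20]),
}

def update(belief, observation):
    """One step of Bayes' rule: posterior is proportional to prior times likelihood."""
    unnormalised = belief * likelihood[observation]
    return unnormalised / unnormalised.sum()

# A stream of (invented) intelligence reports, processed as they arrive.
belief = prior
for obs in ["routine_activity", "acquires_materials", "acquires_materials"]:
    belief = update(belief, obs)
    print(dict(zip(states, belief.round(3))))
```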

Confidential Fusion (Confusion)

Murray Pollock

A surprisingly challenging problem in computational statistics is how to unify distributed statistical analyses and inferences into a single coherent inference.

This problem arises in many settings (for instance, combining experts in expert elicitation, incorporating disparate inference in multi-view learning, and recombining in distributed big data problems), but a general framework for conducting such unification has only recently been addressed.

A particularly compelling application is in statistical cryptography. Consider the setting in which multiple (potentially untrusted) parties wish to share distributional information (for instance in insurance, banking and social media settings), but wish to ensure information theoretic security. Joint work with Louis Aslett, Hongsheng Dai and Gareth Roberts.
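
Fusion methods unify sub-posteriors exactly via Monte Carlo; the sketch below illustrates only the unification target, in the simplest Gaussian special case where the combined posterior (the renormalised product of the sub-posteriors) has a closed form. All numbers are invented.

```python
import numpy as np

# Sub-posteriors held by C parties, each summarised as a Gaussian N(mu_c, var_c).
mus = np.array([1.2, 0.8, 1.5])
variances = np.array([0.5, 0.3, 0.8])

# The unified density is proportional to the product of the sub-posteriors.
# For Gaussians: precisions add, and means combine precision-weighted.
precisions = 1.0 / variances
fused_var = 1.0 / precisions.sum()
fused_mean = fused_var * (precisions * mus).sum()

print(f"fused posterior: N({fused_mean:.3f}, {fused_var:.3f})")
```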

Sign up

10 November, University of Warwick

10:00–11:05

Presentation title

Presenter

Description

Spatiotemporal Machine Learning Foundations for Urban Digital Twins

Theo Damoulas

Real-world processes unfold over space and through time in a complex, non-stationary fashion, with non-trivial dynamics and interdependencies.

We also partially observe these through multiple, heterogeneous sensing modalities that have varying signal-to-noise ratios and sampling periods. Furthermore, we constantly act and intervene on these systems, nudging them towards different regimes.

Dr Damoulas will describe past and ongoing work in this area, carried out as part of his Turing Fellowship, aiming to tackle some of these key challenges and set the foundations for digital twinning, with an application focus on urban and environmental settings.

NLP for Mental Health and Social Media

Maria Liakata

Collecting together microblogs representing opinions about the same topics within the same timeframe is useful for a number of different tasks and practitioners. A major question is how to evaluate the quality of such thematic clusters.

Maria will discuss the task of evaluating thematic coherence, drawing on a corpus created from three different domains and distinct time windows. She will introduce different metrics for the task and present the results of our experiments, which show the clear benefits of text generation metrics.

She will also discuss the effect of this work on downstream tasks such as automated multi-document opinion summarisation.

Sign up

18 November, Newcastle University

11:00–12:00

Presentation title

Presenter

Description

Automating Data Visualisation

Nick Holliman

We have sought to demonstrate how novel approaches to data visualisation can be enabled using cloud computing at scale, and how automatic metrics can be used to assess data visualisations; latterly, a key strand has become the visualisation of uncertainty for decision makers.

Streaming data modelling for real-time monitoring and forecasting

Darren Wilkinson

We seek to address a key challenge of modern data science: the development of scalable algorithms for extracting useful information from large, complex, heterogeneous and ever-growing data sets in (near) real-time.

The project combines sequential statistical modelling with computational frameworks for streaming data processing, making significant use of functional programming approaches.

The techniques will be illustrated using live streaming data from Newcastle’s Urban Observatory – one of the largest public sources of smart-city data in the world. Data from environmental sensors which are both multivariate and spatially distributed provide a challenging use case for on-line statistical modelling.
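
As a minimal illustration of the streaming, functional style the project adopts, the sketch below maintains a running mean and variance of a sensor stream with Welford's algorithm, carrying an immutable summary state forward one observation at a time. It is a toy, not the project's sequential models, and the readings are simulated.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunningStats:
    """Immutable summary state, updated functionally one reading at a time."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the running mean

    def update(self, x: float) -> "RunningStats":
        n = self.n + 1
        delta = x - self.mean
        mean = self.mean + delta / n
        m2 = self.m2 + delta * (x - mean)  # Welford's update
        return RunningStats(n, mean, m2)

    @property
    def variance(self) -> float:
        return self.m2 / (self.n - 1) if self.n > 1 else float("nan")

# Simulated stream of sensor readings; state is replaced, never mutated.
stats = RunningStats()
for reading in [21.3, 21.7, 22.1, 21.9, 35.0]:  # last value a possible anomaly
    stats = stats.update(reading)
print(stats.n, round(stats.mean, 2), round(stats.variance, 2))
```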

Sign up

18 November, University College London

14:00–15:30

Presentation title

Presenter

Description

Uncertainty quantification of multi-scale and multi-physics computer models

Serge Guillas

What possible tsunamis will hit the west coast of India? What are the uncertainties in future predictions of global warming over Europe? 'Uncertainty Quantification' (UQ) can help answer these questions by analysing the propagation of uncertainties in complex simulators, such as climate or tsunami computer models that run on supercomputers and require multiple scales and physical processes to be combined.

Typically UQ makes use of surrogate models that are much faster to run than simulators, in order to sample uncertainties efficiently. These are often 'Gaussian processes' that need to be fitted using a smart design of computer experiments. Building a UQ workflow that integrates heterogeneous models (both in scale and in nature) is a challenge, and the corresponding designs also have to be investigated.

This talk describes the range of statistical and computational advances, as well as the impact and collaborations, that resulted from this project, along with future steps.
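
As a minimal illustration of the surrogate idea, the sketch below fits a Gaussian process emulator to a handful of runs of a toy "simulator" (a simple analytic function standing in for a real tsunami or climate model), assuming scikit-learn is available. It is a sketch of the general technique, not the project's own workflow.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Stand-in for an expensive simulator: here a toy function of one input.
def simulator(x):
    return np.sin(3 * x) + 0.5 * x

# A small design of computer experiments: a handful of simulator runs.
X_design = np.linspace(0, 2, 8).reshape(-1, 1)
y_design = simulator(X_design).ravel()

# Fit a Gaussian process surrogate to the runs.
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5), normalize_y=True)
gp.fit(X_design, y_design)

# The surrogate is cheap to evaluate, so uncertainty can be sampled densely.
X_new = np.linspace(0, 2, 100).reshape(-1, 1)
mean, std = gp.predict(X_new, return_std=True)
print(f"max predictive std across inputs: {std.max():.3f}")
```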

Large-scale embeddings from human behaviour

Brad Love

How do people think about everyday objects, whether they be products in the supermarket or natural images? We find that people organise everyday objects around goals (e.g., items needed to cook a stir-fry). Moreover, we derive an image embedding (size 50k) of the ImageNet validation set and find that recent deep learning models organise the images in a manner at odds with human judgment.

nAvIgate: Understanding urban navigation through Deep Reinforcement Learning

Steven Gray

nAvIgate aims to further our understanding of urban navigation by training agents to navigate simulated cities via deep reinforcement learning. As well as being helpful in understanding human navigation, the development of agents that can robustly navigate urban spaces is useful in a number of different areas, such as agent-based models for urban simulation, robotics and video games. In this presentation we share our findings from the nAvIgate project and the challenges we encountered in applying deep reinforcement learning to urban navigation. We will also talk about how we used existing tools to streamline the creation of our environment and our agent, and the next steps of the project.

Sign up

23 November, University of Southampton, University of Manchester

14:00–15:30

Presentation title

Presenter

Description

Anonymisation and Provenance: Expressing Data Environments With PROV

Adriane Chapman, Mark Elliot

The Anonymisation Decision-Making Framework (ADF) is a comprehensive practice designed for assessing and controlling the risks of sharing and disseminating data. This project examines how to use provenance to support anonymisation decision-making. To enable this, we analyse the mapping of concepts between ADF and PROV. We have operationalised provenance into the framework and analysed its suitability via real use cases, and we have created prototype tool support ranging from simulators to reasoners.

A Multidisciplinary Study of Predictive Artificial Intelligence Technologies in the Criminal Justice System

Pamela Ugwudike

The project explored a classic predictive policing algorithm to investigate conduits of bias. Whilst many studies on real data have shown that predictive policing algorithms can create biased feedback loops, few studies have systematically explored whether this is the result of legacy data or the algorithmic model itself. To advance the empirical literature, this project designed a framework for testing predictive models for biases. With the framework, the project created and tested: (1) a computational model that replicates the published version of a predictive policing algorithm, and (2) statistically representative, biased and unbiased synthetic crime datasets, which were used to run large-scale tests of the computational model. The study found evidence of self-reinforcing properties.

Topology and neural network generalisations

Jacek Brodzki

Neural networks are at the centre of many remarkable applications of AI. These powerful classification tools are great when they work well, but have demonstrated weaknesses where they fail at surprisingly easy tasks. This talk will summarise the results of our pilot project devoted to the study of the geometry of the decision boundaries of neural networks as a predictor for their performance.

Sign up

Past events

3 November, University of Southampton

13:15–14:30

Presentation title

Presenter

Description

Data science approaches to applied mathematical modelling

Marika Taylor

In this talk Marika Taylor will describe new relationships between tessellations and codes used for quantum error correction, focussing on tessellations of negatively curved (hyperbolic) spaces. The motivations for constructing such codes will be explored - these range from fundamental physics to understanding the geometry underlying quantum machine learning.

Jazz as Social Machine

Tom Irvine

Making jazz with machine learning agents turns out to be complicated. Using insights from Web Science, Science and Technology Studies and musicological jazz studies, I survey the techniques currently in use, and explore what it is about jazz's data that makes machine learning jazz more of a "social" problem than other challenges in the growing field of Music Information Retrieval.

Sign up

1 November, University of Exeter

15:00–16:00

Presentation title

Presenter

Description

Towards a resilience sensing system for the biosphere

Tim Lenton

I will focus on the development of a resilience sensing and tipping point early warning system for the biosphere. This will include results for the changing resilience of patterned vegetation systems in the Sahel. I will also touch on progress using deep learning to improve early warning methods for tipping points. 

Sign up

21 October, University of Cambridge

10:00–12:30

Presentation title

Presenter

Description

AI-guided solutions for early detection of dementia

Zoe Kourtzi

Alzheimer’s disease (AD) is characterised by a dynamic process of neurocognitive changes from normal cognition to mild cognitive impairment (MCI) and progression to dementia. However, not all individuals with MCI develop dementia. Predicting whether individuals with MCI will decline (i.e. progressive MCI) or remain stable (i.e. stable MCI) is impeded by patient heterogeneity due to comorbidities that may lead to MCI diagnosis without progression to AD. Despite the importance of early diagnosis of AD for prognosis and personalised interventions, we still lack robust tools for predicting individual progression to dementia. Here, we propose a novel trajectory modelling approach based on metric learning that mines multimodal data from MCI patients to derive individualised prognostic scores of cognitive decline due to AD. Our approach affords the generation of a predictive and interpretable marker of individual variability in progression to dementia due to AD based on cognitive data alone. Including non-invasively measured biological data (grey matter density, APOE 4) enhances predictive power and clinical relevance. Our trajectory modelling approach has strong potential to facilitate effective stratification of individuals based on prognostic disease trajectories, reducing MCI patient misclassification with important implications for clinical practice and discovery of personalised interventions.

The cooked and the raw; extracting and exploring structured and unstructured clinical data from patient electronic health records

Paul Schofield

Electronic health records (EHRs) contain information critical to the realisation of the promise of personalised medicine, but also data essential for the discovery of the molecular basis of disease. Clinical information systems and EHRs were not developed for the discovery, integration and export of information, most being based on the concept of paper records going back to the 1990s. Consequently, we find in EHRs information contained in administrative, diagnostic and procedure codes, which are highly structured and standardised (pre-cooked); the results of investigative tests, ranging from blood chemistry to images, which might be regarded as partially structured information (lukewarm?); and finally narrative reports of clinical encounters and discharge letters, which are rich sources of information but completely unstructured – raw data. Reliably extracting and integrating these types of information is a huge challenge, but the ability to retrieve coded and quantitative data into a common symbolic framework opens up the possibility of connecting these data together with the large amounts of background knowledge now available, to begin to make semantic sense of our whole 'menu'.

I will discuss three approaches to extracting and using EHR information: the first uses the Komenti platform, which is designed to extract information from free text into semantically formalised ontological annotations; the second combines quantitative data into that same semantic framework; and the third, a new resource, axiomatises ICD-10 terms using the Human Phenotype Ontology for integration with existing knowledge and, for example, patient classification. The promise of these multi-pronged approaches will be discussed.

Sign up

18 October, University of Southampton, Newcastle University

10:30–12:00

Presentation title

Presenter

Description

Mapping biology from mouse to human using transfer learning

Ben MacArthur

In this talk Ben MacArthur will outline how tools from machine learning can be combined with experiments to better understand how biology can be mapped between species, and thereby improve the biomedical research and development pipeline. As an example, he will show how transfer learning can be used to determine when biology learnt from one organism (the mouse) can be effectively transferred to another (the human) and when it cannot.

Decision support algorithms for Emergency Departments

Neil White

In this talk, Neil White and Chris Duckworth will describe the outcomes of the TriagED project. Emergency departments (EDs) are facing unprecedented levels of overcrowding, which delays and impacts patient care. By analysing data collected from EDs, we can use machine learning models to predict patient outcomes (e.g. whether a patient was discharged or admitted to hospital). These models predict the patient outcome as early as possible in the hospital visit, with the aim of improving the efficiency of EDs and helping to allocate resources in downstream care. Clinical settings are, however, dynamic environments, and the reasons for attending the ED and their severity can change with time (i.e. data drift). This can have serious ramifications for any machine learning model implemented. We demonstrate how explainable machine learning can be used to monitor data drift for a predictive model deployed within a hospital ED. We use the COVID-19 pandemic as an extreme case of data drift, which has brought a severe change in operational circumstances. Furthermore, we show how emergent health risks can be identified by using the relative importance of model features.
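
The talk's drift monitoring uses explainable machine learning; a simpler, standard drift statistic, the population stability index (PSI), illustrates the underlying idea of comparing a feature's distribution at training time against live data. The sketch below uses simulated data, not data from the project.

```python
import numpy as np

def psi(expected, observed, bins=10):
    """Population stability index between a training-time sample and a live
    sample of one feature; larger values indicate stronger drift."""
    edges = np.histogram_bin_edges(expected, bins=bins)
    e_frac = np.histogram(expected, bins=edges)[0] / len(expected)
    o_frac = np.histogram(observed, bins=edges)[0] / len(observed)
    # Avoid log(0) in nearly empty bins.
    e_frac = np.clip(e_frac, 1e-6, None)
    o_frac = np.clip(o_frac, 1e-6, None)
    return float(np.sum((o_frac - e_frac) * np.log(o_frac / e_frac)))

rng = np.random.default_rng(1)
pre_pandemic = rng.normal(50, 10, size=5000)  # e.g. patient age mix in training data
during_covid = rng.normal(58, 14, size=5000)  # shifted mix after data drift
print(f"PSI = {psi(pre_pandemic, during_covid):.2f}")  # values above ~0.25 flag major drift
```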

4P Healthcare

Paolo Missier

Chronic, metabolic and neurodegenerative diseases are now the leading cause of death and disability. Current data acquisition technology provides the means to fully characterise an entire population of individuals in terms of a broad diversity of quantitative datasets, ranging from periodic but low-rate multi-omics data (genomics, proteomics, metabolomics and more) to continuous, high-rate self-monitoring data from wearable sensors. This project addresses some of the key data science challenges associated with the development of the new preventive, predictive and personalised models that define the data-driven future of healthcare.

Sign up

15 October, University of Southampton

11:00–12:00

Presentation title

Presenter

Description

Machine learning of seismicity induced by hydraulic fracturing

Tom Gernon

In this talk, Tom Gernon will describe how machine learning can be applied to forecast earthquakes triggered by underground fluid injection, and thereby improve real-time regulation practices in fracking and wastewater disposal regions. As an example, he will show how Bayesian networks can be used to model joint conditional dependencies between both natural (e.g. geology, seismicity) and operational (e.g. injection volumes, rates, and depth) parameters. This approach is key to unlocking spatial complexity and is applicable to geothermal and carbon capture and storage projects including those in the UK.
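
As an illustration of the kind of joint conditional dependencies a Bayesian network encodes, the toy sketch below enumerates a two-parent network linking invented geological and operational variables to a felt seismic event. The probabilities are made up and far simpler than the project's models.

```python
from itertools import product

# Invented toy network: geology and injection rate are parents of a binary
# "felt seismic event" node. All probabilities are illustrative only.
p_geology = {"faulted": 0.3, "stable": 0.7}
p_rate = {"high": 0.4, "low": 0.6}
p_event = {  # P(event = True | geology, rate)
    ("faulted", "high"): 0.30,
    ("faulted", "low"): 0.08,
    ("stable", "high"): 0.05,
    ("stable", "low"): 0.01,
}

# Query P(geology = faulted | felt event observed) by enumerating the joint.
num = den = 0.0
for g, r in product(p_geology, p_rate):
    joint = p_geology[g] * p_rate[r] * p_event[(g, r)]
    den += joint
    if g == "faulted":
        num += joint
print(f"P(faulted geology | felt event) = {num / den:.3f}")
```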

Open-source Private Data Integration (Enhancement Project)

George Konstantinidis

In this talk George Konstantinidis will present the latest developments in the new area of collaborative data privacy. In these scenarios the service provider is considered a friend, not an adversary, to the data owner; privacy enforcement is therefore collaborative and does not rely on encryption or distortion of data. Instead, this area investigates and develops mechanisms for users to encode their custom requirements, data consent, privacy preferences and data policies in a machine-processable language, forming data usage contracts that can be automatically (or algorithmically) respected. George will discuss the formal foundations of the area, connections to data privacy, and algorithms and open-source implementations for supporting these automated agreements in data management. He will present results on real and synthetic datasets and discuss extensions ranging from blockchains to clinical research, AI reasoning and knowledge graphs.

Sign up

11 October, University of Exeter

15:00–15:50

Presentation title

Presenter

Description

The Evidence Base of Artificial Intelligence: Supporting Data Fairness to Enhance Research Quality

Sabina Leonelli

Efforts to ensure responsible and ethical AI are intertwined with efforts to develop high-quality reference data and trustworthy, reliable and accessible data sources. The importance of effective strategies for data management and sharing has been widely recognised: a key challenge remains to find ways of fostering fair and responsible data use. This talk reviews and compares lessons learnt from the qualitative study of responsible data linkage in the agricultural/plant sciences as well as the biomedical/health domain (including research grounded in social media analysis and data integration set up in response to the COVID-19 pandemic).

Sign up

8 October, University of Bristol

10:30–11:30

Presentation title

Presenter

Description

Non-Destructive Evaluation (NDE) Data Science for Industry 4.0

Paul Wilcox

This project is about applying data science to exploit the full potential of quantitative Non-Destructive Evaluation (NDE) measurements of engineering assets. Currently, the majority of NDE data is irreversibly condensed after a measurement is made; in doing so, a wealth of useful information that could bear on the future safety and economic utility of an asset is lost forever. This project aims to take advantage of new ML techniques and data storage capabilities to make use of NDE data.

Boosting Manufacturing Productivity through AI

Nick Wright (Newcastle Fellow)

This project primarily addresses the efficiency and effectiveness of visual inspection, which is the most important form of quality measurement in industries as diverse as car manufacturing, food and luxury goods. The use of AI in visual inspection is still in its infancy, but AI has the potential to make an enormous impact and to contribute to a significant improvement in UK manufacturing productivity. This project utilises AI deep learning techniques to perform visual inspection on products whilst they are still in the manufacturing line, developing techniques to identify faults quickly and reliably, often in circumstances not ideal for computer systems – particularly the harsh environment of a typical factory.

Sign up

7 October, University of Bristol, Newcastle University, UCL

10:30–11:30

Presentation title

Presenter

Description

Consistent urban areas for global urban polimetrics

Levi Wolf, Sean Fox

Formal urban boundaries drawn by the government often do not reflect the effective territory of the city. This problem, urban political fractionalisation, is difficult to study. This project aims to develop and apply a new method for bounding urban areas using satellite imagery, and to apply these data to the analysis of political fractionalisation.

Spatial Inequality and the Smart City

Rachel Franklin (Newcastle Fellow)

Much attention is given to fairness and equity in the smart city, whether algorithmic bias, surveillance or socio-economic inclusion. This talk directs attention to an explicitly spatial component of the smart city apparatus: sensor networks and the emergence of coverage gaps—or sensor deserts. How are cities and other stakeholders to make decisions about placement and where does coverage of vulnerable groups and places fit in? The talk provides a conceptual overview of the sensor location-spatial inequality dilemma, gives a case study example from Newcastle upon Tyne in the United Kingdom, introduces a decision support tool prototype, and concludes with some thoughts for both researchers and those on the ground working with smart city sensor networks, including local governments, policymakers, and community groups.

The attachments of ‘autonomous’ vehicles - A social science approach

Jack Stilgoe (UCL Fellow)

The ideal of the self-driving car replaces an error-prone human with an infallible, artificially intelligent driver. This narrative of autonomy promises liberation from the downsides of automobility, even if that means taking control away from autonomous, free-moving individuals. We can look behind this narrative to understand the attachments that so-called ‘autonomous’ vehicles (AVs) are likely to have to the world. Drawing on 50 interviews with AV developers, researchers and other stakeholders, we explore the social and technological attachments that stakeholders see inside the vehicle, on the road and with the wider world. These range from software and hardware to the behaviours of other road users and the material, social and economic infrastructure that supports driving and self-driving. We describe how innovators understand, engage with, or seek to escape from these attachments in three categories: “brute force”, “solve the world one place at a time” and “reduce the complexity of the space”.

Sign up

6 October, University of Bristol

10:30–11:30

Presentation title

Presenter

Description

Applying machine learning techniques to large Cryo-EM data sets. 
PROOF: Software for the identification of accessory PROteins On Filaments 

Danielle Paul

This project aims to open the field of high-resolution Cryo-Electron Microscopy (Cryo-EM) to a large number of users. The recent Cryo-EM revolution has produced a massive increase in the amount of image data. This project aims to develop software to facilitate single particle analysis of Cryo-EM images of filamentous proteins.

Sign up

5 October, University of Bristol

10:30–12:00

Presentation title

Presenter

Description

Towards a Measurement Theory for Data Science and Artificial Intelligence

Peter Flach

Measurement concepts are underdeveloped in data science and AI, in at least the following senses: (i) a widespread under-appreciation of the importance and effects of measurement scales; and (ii) the fact that in most cases the quantity of interest is latent, i.e. not directly observable. This project seeks to make fundamental advances in our understanding of the capabilities and skills of models and algorithms in data science and AI, and how to measure those capabilities and skills.
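
One classical template for measuring a latent quantity through observable outcomes is item response theory. The sketch below uses the Rasch model to place invented model capabilities and task difficulties on a single latent scale; it illustrates the measurement idea only and is not the project's proposal.

```python
import math

def p_success(ability: float, difficulty: float) -> float:
    """Rasch model: probability that a system with latent `ability` solves an
    item of latent `difficulty`. Both live on the same latent scale."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Invented latent scores for two models across benchmark items.
items = {"easy parsing": -1.0, "commonsense QA": 0.5, "multi-hop reasoning": 2.0}
for model, ability in [("model A", 0.8), ("model B", 1.6)]:
    probs = {item: round(p_success(ability, d), 2) for item, d in items.items()}
    print(model, probs)
```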

Simulating boundary examples

Song Liu

This project aims to develop a sampling algorithm that generates atypical/boundary examples for a given dataset. The simulated boundary example can help us quantify the uncertainty of an underlying system in many engineering and science subjects.

Analysis of jointly embedded cyber-security graphs

Patrick Rubin-Delanchy

There is more information in graphs and other complex discrete data structures than is currently exploited in data-driven approaches to cyber-security. This project aims to set out a principled and tractable framework, at the intersection of mathematics, statistics and computer science, to allow more effective use of multi-modal cyber-security data, and to work with industry and government partners to see prototype algorithms and software deployed.

Sign up

4 October, University of Bristol

10:30–12:00

Presentation title

Presenter

Description

UK Birth Cohorts as a Platform for Ground Truth in Mental Health Data Science

Oliver Davis, Claire Haworth 

There have been recent and startling rises in reported mental health difficulties, particularly in young adults. This means it is more crucial than ever to understand the origins of mental health and wellbeing to inform public health interventions. This project aims to develop a software framework for UK birth cohorts to act as a platform for the validation of algorithms when studying mental health, considering the ethical and legal challenges.

Data Donation: Personal Data for Health Research and Policy Making

Anya Skatova, presented by Kate Shiells

This project builds on knowledge in psychology, public policy, ethics and data science to study the public acceptability of using commercial transactional data in public health research. We focus on two research questions: (1) what are the most widely held attitudes towards using transactional data for public health research? (2) what are the publicly acceptable and ethical pathways of using transactional data for public health research?

Creating an open research data platform from the world's most intensively monitored dairy farm for tackling One Health grand challenges

Andrew Dowsey, working with Tilo Burghardt

In this project we have developed new deep metric learning methodology to identify and track individuals, which will underpin the creation of the world's first permanent 24/7, non-intrusive, vision-centred cattle herd tracking platform for a working farm at scale. This will in turn support our goal of creating and harnessing a longitudinal cohort resource for grand 'One Health' research challenges in sustainable food security and health and welfare monitoring, and underpin fundamental behaviour research.

Sign up

16 September, Queen Mary University of London

10:00–12:00

Presentation title

Presenter

Description

Realising the Potential for Learning from Electronic Health Records using Synthetic Data

William Marsh

The clinical data held in Electronic Health Records (EHR) offers excellent opportunities for researchers to improve the treatment and management of patients. However, so far it has been difficult to realise this potential. There is a need to provide more information about the meaning of the data to the data analyst. We analyse this problem, explaining why the database schema is not very informative. We explain the data journey, which is the series of transformations from the original data to the data suitable for analysis, and why this needs to be explicit. We also suggest that synthetic data could be used to accelerate access to data, and we report on the performance achieved using Bayesian networks to create synthetic data.
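
A Bayesian network generates synthetic records by ancestral sampling: each variable is drawn from its conditional distribution given its already-sampled parents. The toy sketch below does this for two invented EHR-style fields; a real network would be learned from data rather than specified by hand.

```python
import random

random.seed(0)

# Toy network over two EHR-style fields: age band -> admission.
# All probabilities are invented for illustration.
p_age = {"<50": 0.55, "50-70": 0.30, ">70": 0.15}
p_admitted = {"<50": 0.10, "50-70": 0.25, ">70": 0.45}  # P(admitted | age band)

def draw(dist):
    return random.choices(list(dist), weights=dist.values())[0]

def synthetic_record():
    """Ancestral sampling: sample parents first, then children given parents."""
    age = draw(p_age)
    admitted = random.random() < p_admitted[age]
    return {"age_band": age, "admitted": admitted}

synthetic_cohort = [synthetic_record() for _ in range(5)]
print(synthetic_cohort)
```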

Robot-human tool handover in an intelligent framework of tactile interaction 

Kaspar Althoefer

Purposefully handling objects comes very naturally to humans – we instinctively understand how to pick up a tool and rearrange it in our hands so that we can use the tool to conduct a specific task or pass it on to someone else to conduct a task themselves. We are also adept at anticipating when and how a tool will be received when passed to us by someone. Our project aims to create intelligent methods for natural and intuitive human-robot interaction. Robots are to be equipped with the required intelligence to anticipate the handing over of tools and to actively support the human completing it as part of activities performed in different work environments, including surgery, manufacturing and nuclear waste decommissioning. 

An important aspect of the work is to capture tactile information during the handover action. We hypothesise that principal motion parameters are recognised by humans through their tactile sensors within their fingers when an object moves from one hand to another - essential for increasing the chances of a successful handover.

In my presentation, I will provide an overview of our research in tactile and force sensing. I will highlight our advancements concerning the integration of tactile sensors with robot hands for improved manipulation capabilities and research efforts towards thin-layer tactile sensors to be worn by humans allowing us to study close-up the interaction dynamics that human hands experience during tool handovers. Going beyond the biological role model, we also develop proximity sensors that when integrated with robot hands allow us to estimate useful distance information when approaching the object to be handled as well as the object’s stiffness during handling. 

Can large biomedical datasets be interpreted automatically?

Conrad Bessant

Modern biomedical studies often involve the measurement of tens of thousands of molecular variables from relatively few (often <100) samples. The dimensionality, heterogeneity and missingness of the resulting datasets make them ill-suited to traditional statistical techniques and machine learning. Furthermore, pulling out novel discoveries from these data is only possible in the context of a large body of prior knowledge. Using high throughput phosphoproteomics data as an example, this talk will explore the potential of logic programming for the automated interpretation of biomedical data. We will demonstrate the extent to which this approach can automatically explain observed results and generate novel hypotheses suitable for laboratory validation.

Making sense of cancer evolution with mathematical models and machine learning

Trevor Graham

Cancer research is full of Big Data, foremost genome sequencing data. Typically these datasets have been analysed with sophisticated statistical methods that find patterns in the data, with biological interpretation of the patterns added post hoc. We have pursued an alternative approach: we propose mathematical models of cancer evolutionary dynamics up front, derived from biological first principles, and then use statistical inference to match the models to data. This approach gives us mechanistic insight into how cancers grow, and offers a direct route to predicting the future disease course.

Sign up

15 September, University of Edinburgh

12:25–13:45

Presentation title

Presenter

Description

Managing Uncertainty in Government Modelling

Chris Dent

This talk will give an overview of the Managing Uncertainty in Government Modelling project, whose focus is on translating specialist methodology for uncertainty management into wider use by analysts in government and related sectors. It will then describe specific applications in heat network planning, criminal justice, digital twins and more.

Global uncertainty risk factor qualification with data-driven risk modelling and AI-powered predictions

Tiejun Ma

In this talk, I will present some of my research group's recent studies, challenges and findings on adopting AI and modelling techniques to understand human risk-taking behaviour, particularly in financial decisions.

In the first study, we developed a deep learning model for predicting whether individual investors are likely to secure profits from future trades. This embodies typical modelling challenges faced in risk and behaviour forecasting: using hierarchical distributed representations of investors' risk-taking behaviour, it uncovers generative features that determine the target (e.g. a trader's profitability) and is more robust to change (e.g. dynamic market conditions). The results of employing a deep network for risk forecasting confirm the feature-learning capability of deep learning.

To understand the underlying factors that influence investors' decisions, we investigated the role news plays in individual sequential investment behaviour by analysing a ten-year dataset that includes 20 million news items and 8.5 million trades made by 29,434 individual investors.

In addition, we investigated the impact of smartphones on the nature and quality of individual decisions. Our results show significant performance differentials and differences in the nature of the investment decisions of smartphone users and non-users. Those who use smartphones achieve higher risk-adjusted returns but exercise less investment discipline (measured by the disposition effect).

Finally, we examined the relationship between investors' risk-taking behaviour and their survival in markets. The least profitable investors tended to cease trading sooner than others, and a V-shaped relationship was found between an investor's Sharpe ratio and their likelihood of ceasing to trade (relative to the average investor).
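
The disposition effect mentioned above has a standard operationalisation (Odean's comparison of the proportion of gains realised with the proportion of losses realised); the sketch below computes it on a few invented positions, purely to illustrate the measure.

```python
# Odean-style disposition effect: compare the proportion of gains realised
# (PGR) with the proportion of losses realised (PLR). A positive PGR - PLR
# means the investor sells winners too readily and holds on to losers.
# The positions below are invented illustrative data.
positions = [
    # (realised or still a paper position, gain or loss)
    ("realised", "gain"), ("realised", "gain"), ("paper", "gain"),
    ("realised", "loss"), ("paper", "loss"), ("paper", "loss"),
]

def count(status, sign):
    return sum(1 for s, g in positions if s == status and g == sign)

pgr = count("realised", "gain") / (count("realised", "gain") + count("paper", "gain"))
plr = count("realised", "loss") / (count("realised", "loss") + count("paper", "loss"))
print(f"PGR={pgr:.2f}, PLR={plr:.2f}, disposition effect={pgr - plr:+.2f}")
```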

Sign up

29 June, University of Manchester

14:00–15:30

Presentation title

Presenter

Advancing methodology for predictive healthcare 

Niels Peek

Explanatory Tools for Probabilistic Graphical Models

Nadia Papamichail

Sign up

23 June, University of Cambridge

13:25–14:25

Presentation title

Presenter

Description

Data science and the reconstruction of past behaviour: capturing the stone tool technology of prehistoric people

Robert Foley (with Jason Gellis and Camila Rangel)

For most of human history, stone was the primary raw material for much of the technological basis of human adaptation. The flaking of stone to create sharp edges and particular shapes and sizes of tools represents one of our major evolutionary advances. Once the skill was acquired, stone tools were made and discarded in prolific quantities, and changed in ways that mapped developments in cognition, behaviour and ecology. Archaeologists and anthropologists have, over more than one hundred years, collected vast numbers of stone tools, and developed intensive methods of analysis. The result is that there is a major resource in archived photographs and drawings of lithics. The Turing-funded project PALAEOANALYTICS aims to develop AI/machine learning approaches to automate the retrieval of this information and to expand the potential data collected. In this talk we will present the progress we have made in developing computer vision techniques to collect key morphometric information from drawings of stone tools, focussing on those that indicate the technological processes used by prehistoric people to produce them.

Assessing psychosis risk using quantitative markers of transcribed speech

Sarah Morgan

There is a pressing clinical demand for tools to predict individual patients' disease trajectories for schizophrenia and other conditions involving psychosis; however, to date, such tools have proved elusive. Behaviourally and cognitively, psychosis expresses itself through subtle alterations in language. Recent work has suggested that Natural Language Processing (NLP) markers of transcribed speech might be powerful predictors of later psychosis (Mota et al 2017, Corcoran et al 2018). For example, Corcoran et al (2018) used quantitative markers of semantic coherence, collected at baseline from individuals at clinical high risk for psychosis, to predict transition to psychosis with 79% accuracy. However, it remains unclear which NLP measures are most likely to be predictive, how different NLP measures relate to each other, and how best to collect speech data from patients. In this talk, I will discuss our research tackling these questions, as well as the wider challenges of translating this type of approach to the clinic. Ultimately, computational markers of speech have the potential to transform healthcare for mental health conditions such as schizophrenia, since they are relatively easy to collect and could be measured longitudinally to quickly identify changes in patients' disease trajectories.
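
One simple family of coherence markers scores the similarity of consecutive utterances in embedding space. The sketch below computes such a score with stand-in random embeddings; a real pipeline would use a sentence encoder, and the exact measures in the cited studies differ.

```python
import numpy as np

def coherence(sentence_vectors):
    """Mean cosine similarity between consecutive sentence embeddings,
    one simple proxy for the semantic coherence of transcribed speech."""
    sims = []
    for a, b in zip(sentence_vectors, sentence_vectors[1:]):
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return float(np.mean(sims))

# Stand-in embeddings; in practice these would come from a sentence encoder.
rng = np.random.default_rng(0)
on_topic = [rng.normal(size=50) + 5.0 for _ in range(6)]  # similar directions
tangential = [rng.normal(size=50) for _ in range(6)]      # unrelated directions
print(f"coherent speech:   {coherence(on_topic):.2f}")
print(f"incoherent speech: {coherence(tangential):.2f}")
```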

Sign up

16 June, University of Edinburgh

12:15–13:45

Presentation title

Presenter

Description

Safe AI for Surgical Assistance

Subramanian Ramamoorthy

In this talk, I will describe key results arising from my Turing Fellowship project, sAIfer surgery. Motivated by practical problems arising in this space, I will describe novel approaches to learning from demonstration, incorporating safety constraints into learned models, and structuring latent variable models in a disentangled manner to facilitate human-robot interactions.

Developing Methods for Text Mining Scottish Brain Imaging Reports

Beatrice Alex

This talk will present ongoing work at the Edinburgh Clinical NLP Group on text mining brain imaging reports and the practical challenges involved. I will demonstrate how rule-based and neural network information extraction systems compare on such data and will contrast their strengths and weaknesses. I will also summarise the main findings of our systematic review on applying Natural Language Processing to radiology data.

Sign up

15 June, University of Manchester

14:00–15:30

Presentation title

Presenter

PICO extraction for improved systematic review screening

Sophia Ananiadou

Link prediction in sparse bipartite graphs: the UK procurement network 

Julia Handl

Sign up

9 June, University of Southampton

14:00–15:00

Presentation title

Presenter

Description

Machine learning algorithms for automated event detection in Space Physics

Caitriona Jackman

This talk will give an overview of the use of machine learning to classify signatures of physical processes in magnetic fields and plasmas in space, using data from spacecraft such as Cassini at Saturn.

A smart algorithm for optimising hypertension management strategies

Francesco Shankar

In this presentation, Francesco Shankar will discuss the results of a multi-disciplinary project based on an ongoing successful collaboration between the Southampton Astronomy group and Clinical Pharmacologists at King’s College London, in which state-of-the-art techniques used in extra-galactic astronomy have been transferred to medical science.

Sign up

8 June, University of Manchester

14:00–15:30

Presentation title

Presenter

Analysis methods for epidemiological data on subgroups with correlated outcomes

Thomas House

Symbolic time series forecasting using LSTM networks: A search space odyssey

Stefan Güttel

Randomised multilevel MCMC and Digital Fingerprinting of Materials Microstructure

Kody Law

Sign up

28 May, University of Southampton

10:30–12:00

Presentation title

Presenter

Description

Strategic influence in dynamic opinion formation: Theory and data

Markus Brede

In this talk, Dr Brede will explain how the focus of his pilot project was on theoretical advances in influence maximisation for dynamic models of opinion formation, with the aim of paving the way for a fuller understanding of influence and opinion dynamics in real-world settings.

Towards flexible autonomy for swarms in dynamic and uncertain environments

Sarvapali Ramchurn

In the near future, robots will need to be deployed in large numbers, coordinated across multiple agencies and with fewer operators, if we are to make the best use of their capabilities. In this presentation Professor Ramchurn will describe the project's focus on the design of AI algorithms and interfaces which will ensure that human operators aren't overloaded and that they understand the automated actions taken by the swarms.

AI and Inclusion

Mike Wald

How AI can overcome barriers to inclusion: Of the nine protected characteristics identified by the Equality Act 2010, disability is the least homogeneous, and so AI needs to work fairly for these 'edge cases/outliers', while the design and deployment of digital accessibility and assistive technology can benefit all members of society.

Sign up

24 May, University of Exeter

15:00–15:50

Presentation title

Presenter

Description

Uncertainty Quantification (UQ) for Black-Box Computational Models with Application to Machine Decisions

Peter Challenor

Uncertainty quantification of complex numerical models is a large subject with the potential to help address many important issues facing the world today, from epidemics to the climate emergency and the biodiversity crisis. We will discuss a number of problems arising from the theory and mathematics of uncertainty quantification, including model calibration (inverse modelling), the sequential design of experiments and dynamic emulation of numerical models.

Sign up

24 May, University of Birmingham

11:00–13:00

Presentation title

Presenter

Description

A Working Group in Machine Learning for Cancer

Andrew Beggs

Professor Andrew Beggs will explain how his Turing Fellowship has helped contribute to the UK's COVID-19 fight in his talk "From molecular diagnostics to image AI: How a Turing Fellow got repurposed".

Data Intensive Life Sciences

Jean-Baptiste Cazier

Professor Jean-Baptiste Cazier will describe how his fellowship project became part of the UK Coronavirus Cancer Monitoring Project, which is assessing the impact of COVID-19 on cancer patients.

Data Trusts Initiative

Sylvie Delacroix

Professor Sylvie Delacroix, whose talk on "Bottom-up data trusts: disturbing the one-size-fits-all approach to data governance" will explain how data trusts can enable data to be shared for personal or public benefit without exposing individuals, communities and society to harms resulting from data misuse.

Sign up

26 April, Queen Mary University of London

15:00–16:30

Presentation title

Presenter

Description

ElasticSketch: Towards network traffic measurements at scale

Steve Uhlig

Network measurements provide indispensable information, even at the best of times, for network operations, quality of service, network billing and anomaly detection in data centres and backbone networks. However, measurements are all the more important when a network is undergoing problems (congestion, scan attacks, or DDoS attacks). During such times, traffic characteristics vary drastically, significantly degrading the performance of most measurement tasks. In this talk, I will present our recent efforts to design data structures capable of adapting to changing network traffic conditions, to keep network measurement tasks going.
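
ElasticSketch itself is a more elaborate, adaptive structure; the count-min sketch below is the classic compact building block in this family, and shows how per-flow traffic counts can be estimated in small, fixed memory. The flow keys and packet sizes are invented.

```python
import hashlib

class CountMinSketch:
    """Compact frequency estimator for streams such as per-flow byte counts.
    Estimates never undercount; width/depth trade memory for accuracy."""
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, key: str):
        # One independent-ish hash per row, derived by salting blake2b.
        for row in range(self.depth):
            h = hashlib.blake2b(key.encode(), salt=bytes([row]) * 16).digest()
            yield row, int.from_bytes(h[:8], "big") % self.width

    def add(self, key: str, count: int = 1):
        for row, col in self._indexes(key):
            self.table[row][col] += count

    def estimate(self, key: str) -> int:
        # Take the minimum across rows to limit collision inflation.
        return min(self.table[row][col] for row, col in self._indexes(key))

# Per-flow byte counts on a simulated packet stream.
cms = CountMinSketch()
stream = [("10.0.0.1->10.0.0.2", 1500)] * 100 + [("10.0.0.3->10.0.0.4", 60)]
for flow, nbytes in stream:
    cms.add(flow, nbytes)
print(cms.estimate("10.0.0.1->10.0.0.2"))  # about 150000; may slightly overcount
```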

Protecting personal information in image, audio and motion data

Andrea Cavallaro

The images, sounds and motion data we share on social media networks, and through services like voice interfaces and health apps, reveal information about our behaviours, personal choices and preferences, which can be inferred by machine-learning classifiers. To prevent privacy violations, I will discuss how to protect personal information from unwanted automatic inferences, both by learning feature representations that disentangle sensitive from non-sensitive attributes and by crafting perturbations that protect selected attributes. I will show examples and discuss application scenarios for each data type.

Less Is More: Deep Learning with User Ownership at the Edge

Sean Gong

Visual search of unseen objects assumes the availability of a query image of the search target, which is limiting when only a brief text description is available. Deep learning has been hugely successful in computer vision because of shared, centralised, large training datasets. However, privacy concerns and a need for user ownership of localised data pose new challenges to the conventional wisdom of centralised deep learning on big data. In this talk, I will highlight challenges and recent progress on deep learning for language-guided, user-interactive visual search at the edge, and on decentralised learning at the edge from non-shared, distributed small data, each with different learning tasks (non-shared labels).

Sign up

21 April, University of Edinburgh

12:15–13:45

Presentation title

Presenter

Description

Tracking and Leveraging Online Slang from Urban Dictionary for NLP Applications

Walid Magdy

As an online, crowd-sourced, open English-language slang dictionary, the Urban Dictionary (UD) platform contains a wealth of opinions, jokes, and definitions of terms, phrases, acronyms, and more. In this research, we study the temporal activity trends on UD and provide the first analysis of how this activity relates to content being discussed on a major social network: Twitter. We explore the connections between the words and phrases that are defined and searched for on UD and the content that is talked about on Twitter. In addition, we show how using word embeddings built from UD in various NLP applications can outperform word embeddings built from collections that are orders of magnitude larger than UD, which demonstrates the uniqueness of this resource for language.
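
A minimal version of the embedding pipeline, assuming gensim is installed: train skip-gram word2vec embeddings over tokenised definitions. The toy definitions below are invented stand-ins for the millions of real UD entries.

```python
from gensim.models import Word2Vec

# Toy stand-ins for tokenised Urban Dictionary definitions.
definitions = [
    ["ghosting", "when", "someone", "stops", "replying", "to", "messages"],
    ["salty", "being", "upset", "or", "bitter", "about", "something", "small"],
    ["lit", "something", "exciting", "or", "excellent"],
    ["ghosting", "ending", "contact", "without", "explanation"],
]

# Train small skip-gram embeddings over the slang vocabulary.
model = Word2Vec(definitions, vector_size=32, window=3, min_count=1, sg=1, epochs=50)
print(model.wv.most_similar("ghosting", topn=3))
```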

Artificial Intelligence for Data Analytics

Chris Williams

The practical work of deploying a machine learning system is dominated by issues outside of training a model: data preparation, data cleaning, understanding the data set, debugging models, and so on. The goal of the Artificial Intelligence for Data Analytics project at The Alan Turing Institute is to help to automate the whole data analytics process by drawing on advances in AI and machine learning.

We will describe tools to address such tasks, including identifying syntactic and semantic data types, data integration, and identifying and repairing missing and anomalous data. Joint work with the AIDA team: Taha Ceritli, James Geddes, Ernesto Jimenez-Ruiz, Ian Horrocks, Alfredo Nazabal, Tomas Petricek, Charles Sutton, Gerrit Van Den Burg.
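
As a toy illustration of one such task, the sketch below infers a column's syntactic data type by voting over regular expressions. AIDA's actual tools treat type inference probabilistically, so this is only a caricature of the idea.

```python
import re

# Heuristic syntactic type detection for a column of raw strings.
PATTERNS = {
    "integer": re.compile(r"-?\d+$"),
    "float":   re.compile(r"-?\d+\.\d+$"),
    "date":    re.compile(r"\d{4}-\d{2}-\d{2}$"),
    "missing": re.compile(r"(NA|N/A|null|-|)$", re.IGNORECASE),
}

def infer_type(column):
    """Vote each cell into the first matching type; return the majority type."""
    votes = {name: 0 for name in PATTERNS}
    for cell in column:
        for name, pattern in PATTERNS.items():
            if pattern.match(cell.strip()):
                votes[name] += 1
                break
    return max(votes, key=votes.get), votes

column = ["2021-04-26", "2021-05-24", "N/A", "2021-06-09"]
print(infer_type(column))  # ('date', {'integer': 0, 'float': 0, 'date': 3, 'missing': 1})
```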

Sign up