About the event

Our next Data Study Group will be held Monday 8 April - Friday 12 April 2019 at The Alan Turing Institute in London.

Apply now

Application deadline: Monday 4 March 12 noon GMT.

What are Data Study Groups?

  • Intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government and the third sector with talented multi-disciplinary researchers from academia
  • Organisations act as Data Study Group 'Challenge Owners', providing real-world problems and data sets to be tackled by small groups of highly talented, carefully selected researchers
  • Researchers brainstorm and engineer data science solutions, presenting their work at the end of the week

Why apply?

The Turing Data Study Groups are popular and productive collaborative events and a fantastic opportunity to rapidly develop and test your data science skills with real-world data. The event also offers participants the chance to forge new networks for future research projects, and build links with The Alan Turing Institute – the UK’s national institute for data science and artificial intelligence.

It’s hard work, a crucible for innovation and excellent science, and a space to develop new friendships.

Reports from a previous Data Study Group are available in the Outcomes section of the April 2018 Data Study Group pages.


Our challenges and data sets are provided by partner organisations for researchers to work on over the week. The organisations and their challenges for the Data Study Group this April are:

  • Roche - Personalised lung cancer treatment modelling using electronic health records and genomics
  • Great Ormond Street Hospital - Augmenting clinical decision making in intensive care (ACaDeMIC)
  • NATS UK Air Traffic Services Provider - Using real-world data to advance air traffic control
  • Spend Network - Automated matching of businesses to government contract opportunities
  • British Antarctic Survey - Seals from space: automated Antarctic ecosystem monitoring via high-resolution satellite imagery

Please see below for further details on each challenge.

The skills that we think are particularly relevant to the challenges for this Data Study Group are listed under each challenge description below.  Please note, the lists are not exhaustive and we are open to creative interpretation of the challenges listed. Diversity of disciplines is encouraged, and we warmly invite applications from a range of academic backgrounds and specialisms.


How to apply

Apply now 

Applicants will be contacted regarding the outcome of their application by 15 March 2019.

If you have questions about your application please contact [email protected].

Applicants from industry participate as individuals and not as representatives of an organisation.


The Alan Turing Institute will cover travel costs in line with our expenses policy. We will also provide accommodation for researchers who are not normally based in London. Accommodation may also be available for researchers at a London university or research institute who travel from outside London to work. Expenses for international applicants are capped at £200, which includes any visa costs. Lunch and dinner are provided for participants during the week.

Challenge descriptions


Roche

Personalised lung cancer treatment modelling using electronic health records and genomics

Roche, Foundation Medicine and Flatiron are providing a recently collected, systematic and representative dataset comprising the electronic health records of tens of thousands of US lung cancer patients, including detailed omics data. Participants are invited to investigate whether modern data science and AI can help predict individual responses to different treatments, and how (or whether) these predictions can be leveraged for therapy recommendations.

Useful skills: generic applied data science, predictive/supervised modelling, survival modelling, event modelling, electronic health records, multi-modal data, omics data, feature selection or feature engineering, causal inference

Great Ormond Street Hospital

Augmenting ClinicAl DEcision Making in Intensive Care (ACaDeMIC)

When children are on life support in intensive care units, their vital signs (heart rate, oxygen saturation, blood pressure and others) are monitored continuously. However, standard charting systems only record these data hourly. We have a high-resolution vital sign dataset covering 3 years of data from 5500 patients, sampled every 5 seconds. Clinical staff typically make treatment decisions informed by combinations of the absolute values of these vital signs, with little or no adjustment for the context of age or trend. Our challenge is to interrogate this uniquely rich dataset to develop predictive algorithms that outperform clinical decision-making for one high-risk decision faced by every patient: the optimal timing for a trial of breathing without life support.

Useful skills: exploratory analysis of highly correlated and/or longitudinal data, time series modelling, mixed (fixed and random effects) modelling, time-to-event predictive modelling, causal inference, Bayesian inference, practical data analysis with messy data


NATS

Using real-world data to advance air traffic control

Building on our previous Data Study Group with the Turing, this challenge will further investigate methods by which large volumes of historical data can be used to improve real-time aircraft predictions, helping us to modernise our systems to meet the future demands on UK airspace. The main goal of the challenge is to predict an aircraft's behaviour given its physics model, current position, instructions received and weather information.

NATS will provide an extensive, well-organised dataset and software libraries, and its domain experts will support and collaborate with attendees in investigating data-driven refinements to the models provided. The outcomes will drive the future models of aircraft behaviour and influence the safe and resilient evolution of air traffic systems in the UK.

Useful skills: filtering and tracking algorithms, time series prediction, probabilistic prediction

Spend Network

Automated matching of businesses to government contract opportunities

Government publishes thousands of opportunities per month across hundreds of different websites, making it hard for businesses to find the right opportunity. Spend Network will provide access to hundreds of thousands of tender notices from the UK and overseas in order to explore the use of AI in matching UK companies' capabilities to specific opportunities.

Useful skills: text mining, natural language processing, recommender systems, modelling with structured data records, semi-supervised learning, dashboard visualization, familiarity with EU or UK tenders

British Antarctic Survey

Seals from space: automated Antarctic ecosystem monitoring via high-resolution satellite imagery

Antarctic seal populations are potential indicators for the Antarctic ecosystem’s health. Very High Resolution (VHR) satellite imagery provides an opportunity to monitor these seals with greatly reduced cost and effort. British Antarctic Survey would like to investigate whether modern data science technology can be used for automatic counting of seals, or for constituent classification of the sea ice where seals breed and rest.

Useful skills: image classification, object detection, GIS, remote sensing, deep learning, spatial modelling, satellite images


Find out more

How to get involved as a researcher

How to write a great Data Study Group application

Queries can be directed to Data Study Group