About the event
Our next Data Study Group will be held Monday 8 April - Friday 12 April 2019 at The Alan Turing Institute in London.
Applications are now closed. Applicants will be told whether they have been successful before Wednesday 20 March.
The next Data Study Group will be in early September. If you would like to be notified when applications open for the next Data Study Group, or for other similar events, you can register for the DSG updates here by selecting Receive Data Study Group Updates.
What are Data Study Groups?
- Intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government, and the third sector with talented multi-disciplinary researchers from academia
- Organisations act as Data Study Group 'Challenge Owners', providing real-world problems and data sets to be tackled by small groups of highly talented, carefully selected researchers
- Researchers brainstorm and engineer data science solutions, presenting their work at the end of the week
The Turing Data Study Groups are popular and productive collaborative events and a fantastic opportunity to rapidly develop and test your data science skills with real-world data. The event also offers participants the chance to forge new networks for future research projects, and build links with The Alan Turing Institute – the UK’s national institute for data science and artificial intelligence.
It’s hard work, a crucible for innovation and excellent science, and a space to develop new friendships.
Reports from a previous Data Study Group are available on the Outcomes section of the April 2018 Data Study Group pages.
Our challenges and data sets are provided by partner organisations for researchers to work on over the week. The organisations and challenges leading the Data Study Group this April are:
- Roche - Personalised lung cancer treatment modelling using electronic health records and genomics
- Great Ormond Street Hospital - Augmenting clinical decision making in intensive care (ACaDeMIC)
- NATS UK Air Traffic Services Provider - Using real-world data to advance air traffic control
- Spend Network - Automated matching of businesses to government contract opportunities
- British Antarctic Survey - Seals from space: automated Antarctic ecosystem monitoring via high-resolution satellite imagery
(Above): Prem Gill, British Antarctic Survey researcher, provides an industry perspective on the April 2019 Data Study Group.
The skills that we think are particularly relevant to the challenges for this Data Study Group are listed under each challenge description below. Please note, the lists are not exhaustive and we are open to creative interpretation of the challenges listed. Diversity of disciplines is encouraged, and we warmly invite applications from a range of academic backgrounds and specialisms.
The Alan Turing Institute will cover travel costs in line with our expenses policy. We will also provide accommodation for researchers who are not normally London-based. Accommodation may also be available for researchers at a London university or research institute who travel from outside London to work. Expenses for international applicants are capped at £200, which includes any visa costs. Lunch and dinner are provided for participants during the week.
Personalised lung cancer treatment modelling using electronic health records and genomics
Roche, Foundation Medicine and Flatiron are providing a recently collected, systematic and representative dataset comprising tens of thousands of US lung cancer patients’ electronic health records, including detailed omics. Participants are invited to investigate whether modern data science and AI can help predict individual responses to different treatments, and how (or whether) these predictions can be leveraged for therapy recommendations.
Useful skills: generic applied data science, predictive/supervised modelling, survival modelling, event modelling, electronic health records, multi-modal data, omics data, feature selection or feature engineering, causal inference
Augmenting ClinicAl DEcision Making in Intensive Care (ACaDeMIC)
When children are on life-support in intensive care units, their vital signs (heart rate, oxygen level [saturation], blood pressure and others) are monitored continuously. However, standard charting systems only record these data hourly. We have a high-resolution vital sign dataset covering 3 years of data from 5500 patients, sampled every 5 seconds. Clinical staff typically make treatment decisions informed by combinations of absolute values of these vital signs, with little or no adjustment for the context of age or trend. Our challenge is to interrogate this uniquely rich dataset to develop predictive algorithms that out-perform clinical decision-making for one high-risk decision faced by every patient: optimal timing for a trial of breathing without life-support.
Useful skills: Exploratory analysis of highly correlated and/or longitudinal data, Time series modelling, Mixed (Fixed and Random) modelling, Time-to-event predictive modelling, Causal inference, Bayesian inference & Practical data analysis with messy data.
This work is supported by Wave 1 of The UKRI Strategic Priorities Fund under the EPSRC Grant EP/T001569/1, particularly the “Health” theme within that grant and The Alan Turing Institute.
Using real-world data to advance air traffic control
Building on our previous Data Study Group with the Turing, this challenge will further investigate methods by which large volumes of historical data can be used to improve real-time aircraft predictions, helping us to modernise our systems to meet the future demands on UK airspace. The main goal of the challenge is to predict aircraft behaviour given its physics model, current position, instructions received and also weather information.
NATS will provide an extensive and well organised dataset alongside domain experts and software libraries to support and collaborate with the attendees to investigate data-driven refinements to the models provided. The outcomes will drive the future models of aircraft behaviour and influence the safe and resilient evolution of air traffic systems in the UK.
Useful skills: filtering and tracking algorithms, time series prediction, probabilistic prediction
Automated matching of businesses to government contract opportunities
Government publishes thousands of opportunities per month across hundreds of different websites, making it hard for businesses to find the right opportunity. Spend Network will provide access to hundreds of thousands of tender notices from the UK and overseas in order to explore the use of AI in matching UK companies’ capabilities to specific opportunities.
Useful skills: text mining, natural language processing, recommender systems, modelling with structured data records, semi-supervised learning, dashboard visualization, familiarity with EU or UK tenders
Seals from space: automated Antarctic ecosystem monitoring via high-resolution satellite imagery
Antarctic seal populations are potential indicators for the Antarctic ecosystem’s health. Very High Resolution (VHR) satellite imagery provides an opportunity to monitor these seals with greatly reduced cost and effort. British Antarctic Survey would like to investigate whether modern data science technology can be used for automatic counting of seals, or for constituent classification of the sea ice where seals breed and rest.
Useful skills: image classification, object detection, GIS, remote sensing, deep learning, spatial modelling, satellite images
This work is supported by Wave 1 of The UKRI Strategic Priorities Fund under the EPSRC Grant EP/T001569/1, particularly the “AI for Science” theme within that grant and The Alan Turing Institute.
Find out more
Queries can be directed to Data Study Group