The Reinforcement Learning Study Group invites participants to tackle a single reinforcement learning challenge provided by the Defence Science and Technology Laboratory (Dstl). Please see the challenge section below for full project details.

Researchers will brainstorm and engineer data science solutions, presenting their work at the end of the event. 

The event offers participants the chance to forge new networks for future research projects, hone their reinforcement learning skills and build links with The Alan Turing Institute – the UK’s national institute for data science and artificial intelligence.



Dstl's Defence and Security Analysis Division exists to provide evidence to help Defence make effective decisions. Example studies include assessing different scenarios to see which would best allow Defence to deliver government policy and assessing the cost-benefit of investing in future capabilities.

Dstl undertakes planning exercises to explore different approaches to particular campaigns, for example humanitarian aid or logistics, given a set of specific assets and capabilities. One method is to play games, either tabletop or virtual, to understand how to optimise planning efforts. Currently, experienced defence staff play such games to learn how best to manage virtual assets in these simulations.

Recent advances in reinforcement learning have demonstrated its effectiveness for creating competitive virtual agents to play games such as StarCraft and Go, which have a fixed, static rule set. In realistic planning problems, new situations are typically unlike those seen before, which can be thought of as the “rules of the game” changing. Consequently, a requirement has been identified to understand how viable reinforcement learning approaches are when the rules of a game change.


This Reinforcement Learning Study Group will look at how to apply reinforcement learning solutions to a virtualised game environment. The following two closely linked research questions will be addressed.

Firstly, to what extent can reinforcement learning systems be trained to win a game with one set of rules and then maintain that level of performance on a game with a modified set of rules? The challenge here is to learn a more abstract set of success criteria that is invariant to the specific rules of one particular game. The rule changes are entirely numerical, for example the size of the playing area, the detection range and the number of assets available.

Secondly, and related to the above, it is advantageous if knowledge of previous games can be used to inform the understanding of a new game. How well can reinforcement learning systems be bootstrapped on a specific set of rules? Does training on one specific rule set mean less training is required on a new rule set? Can the model be trained more efficiently by incorporating earlier learning? Participants will be given a set of 'allowed' numerical values for the rules. Trained models will then be tested on games with rule sets outside the training range, to understand how best to structure the initial training so that subsequent work is minimised.

The environment, known as ‘PLARK’, simulates a game in which a submarine must get from one side of a board to the other whilst avoiding detection by an aircraft. Whilst the strategies resulting from this specific game are not applicable in real-world contexts, the virtualisation provides a useful environment for testing these methodological questions.

The game can be found on GitHub


Understanding how to optimally disperse a limited set of assets and capabilities will help our partners at Dstl to undertake their missions, including keeping the UK safe and secure, and providing humanitarian aid and disaster relief in times of need. This mission aligns closely with the Turing's goal of advancing world-class research and applying it to real-world problems, to change the world for the better.

Useful skills

Essential: Basics of reinforcement learning, proficiency with Python.

Desirable: familiarity with Stable reinforcement learning baselines, TensorFlow, Jupyter Notebooks, and tools such as ssh, Docker, pip and git.

Beyond a familiarity with the basics of reinforcement learning, which is essential, a familiarity with the following particularly relevant sub-areas of the reinforcement learning literature would be useful: 

Transfer learning, adversarial reinforcement learning, curriculum learning, domain adaptation, multi-agent reinforcement learning, deep reinforcement learning, off-policy reinforcement learning. 

Remote format

Due to COVID-19, the Reinforcement Learning Study Group will run remotely over three weeks, divided into two stages.

Stage 1: The Precursor Stage (part-time)

  • The precursor stage will last one week in the run-up to the 'event stage' (22 February – 26 February).

  • Maximum time commitment two hours a day (around lunchtime and/or after 6pm UK time).

  • Online workshops / presentations / team building in order to prepare for the 'event stage'.

Stage 2: The Event Stage (full-time)

  • The 'event stage' will run over two weeks (1 – 12 March).

  • The core working hours will be 09:00 – 17:00 GMT, although we will be flexible for those participating from other time zones.

  • Group work begins and continues throughout.

Applicants should be able to commit to the full duration of the event and will require access to a laptop, a webcam and a home landline Wi-fi connection (or equivalent quality) with at least 10 Mbps download and 1-5 Mbps upload speed.

The Alan Turing Institute is committed to supporting individual circumstances. Please do not hesitate to email [email protected] to discuss any reasonable adjustments or Wi-fi concerns (we may be able to help).


Researchers are typically PhD-level or early-career academics from statistics, computer science, engineering, mathematics and computational social science, as well as wider disciplines where data science and AI skills are becoming increasingly relevant. A basic understanding of reinforcement learning is essential for participation.

In general, we look for the following qualities in a reasonable application:

  • Familiarity with data munging/preparation/visualisation in Python.
  • Ability to translate real-world problems into a mathematical framework.
  • Sound knowledge of reinforcement learning, or a demonstrable interest in acquiring reinforcement learning skills.
  • Desire to work, or experience of working, collaboratively with a group of researchers.
  • Ability to write a data science report.
  • A genuine interest in, and basic understanding of, reinforcement learning.

How to apply

Applications are now closed. 

Join our Data Study Group mailing list to be notified when the next call for applications opens.

The Alan Turing Institute recognises the under-representation that exists within data science. We are committed to increasing the representation of female, black and minority ethnic, LGBTQ+, disabled and neurodiverse researchers in data science and especially welcome these applications. We believe the best solutions to challenges result when a diverse team work together to share and benefit from the different facets of their experience. You can review our equality, diversity and inclusion (EDI) statement online.

About the event

What are Data Study Groups?

  • Intensive five-day 'collaborative hackathons' hosted at the Turing, which bring together organisations from industry, government and the third sector with talented multi-disciplinary researchers from academia. (Please note that this format is currently different due to COVID-19.)
  • Organisations act as Data Study Group 'Challenge Owners', providing real-world problems and datasets to be tackled by small groups of highly talented, carefully selected researchers.
  • Researchers brainstorm and engineer data science solutions, presenting their work at the end of the week.

Why apply?

The Turing Data Study Groups are popular and productive collaborative events and a fantastic opportunity to rapidly develop and test your data science skills with real-world data. The event also offers participants the chance to forge new networks for future research projects, and build links with The Alan Turing Institute – the UK’s national institute for data science and artificial intelligence.

It’s hard work, a crucible for innovation and a space to develop new ways of thinking.

Reports from previous Data Study Groups are available here.


Read our FAQs for Data Study Group applicants.


Find out more

Learn more about being a DSG participant including FAQs

How to write a great Data Study Group application

Queries can be directed to the Data Study Group Team