Introduction

Reproducible research is work that can be independently verified. In practice, it means sharing the data and code that were used to generate published results - yet this is often easier said than done. 'The Turing Way' is a guide to reproducible data science that will support students and academics as they develop their code, with the aim of helping them produce work that will be regarded as gold-standard examples of trustworthy and reusable research.

Project aims

In the ideal case, all published results should be independently verifiable and suitable for other researchers to build upon. For this to happen, the data and code that support the publication need to be made available in an easy-to-use and open format.

Sharing these research outputs means understanding data management, library sciences, software development and continuous integration techniques: skills that are not widely taught or expected of academic researchers and data scientists.

'The Turing Way' is a handbook to support students, their supervisors, funders and journal editors in ensuring that reproducible data science is 'too easy not to do'. It will include training material on version control, analysis testing and open and transparent communication with future users, and build on Turing Institute case studies and workshops.

Applications

'The Turing Way' will support everybody involved in data science research: the developers of the code (research engineers, postdocs and doctoral students), their supervisors and the business team members who coordinate these projects. The format will be easy for the reader to dip in and out of, depending on their level of experience in the various topics. The project will help to answer questions that researchers don't always ask: "How do I ensure that my code's existing functionality doesn't change as I extend the codebase?", "How do I make my project easy for someone else to run?", and many more.
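One common way to approach the first of these questions is a regression test: record what an analysis function returns for a small, fixed input, then check that later versions of the code still reproduce it. The sketch below is illustrative only. The function `summarise` and its test values are hypothetical and are not taken from the handbook.

```python
import math


def summarise(values):
    """Hypothetical analysis step: return the mean and range of a sample."""
    if not values:
        raise ValueError("values must not be empty")
    mean = sum(values) / len(values)
    return {"mean": mean, "range": max(values) - min(values)}


def test_known_input_gives_known_output():
    # Pin down the current behaviour on a small, hand-checked input.
    result = summarise([1.0, 2.0, 3.0, 6.0])
    assert math.isclose(result["mean"], 3.0)
    assert math.isclose(result["range"], 5.0)


def test_empty_input_is_rejected():
    # Error handling is part of the existing behaviour worth protecting.
    try:
        summarise([])
    except ValueError:
        return
    raise AssertionError("summarise([]) should raise ValueError")
```

Test functions written this way can be collected by a test runner such as pytest and re-run every time the codebase is extended, so an accidental change in behaviour shows up as a failing test.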

Senior team members - Turing fellows, program directors and managers - will also be catered for: each topic will highlight key points tailored towards managing reproducible research projects. The project will build and curate checklists of steps that can be taken to ensure all project outputs are reproducible. A chapter on Binder will be of interest to supervisors who want to regularly review their students' code; it will also include the technical details of setting up a BinderHub, which will be useful for research software engineers.

Recent updates

January 2019

'The Turing Way' team will host three workshops in March 2019, all focused on Binder, an easy-to-use service that runs version-controlled computational environments.

Workshop: Boost your research reproducibility with Binder

During this free workshop we will discuss reproducible computing environments, show examples of others’ projects in Binder and help you learn how to prepare a Binder-ready project. At the end of the workshop you will be able to take some of your own content (in an R or Jupyter notebook, or scripts that can be run in the terminal) and prepare it so that it can be used by others on mybinder.org.
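As a rough illustration of what sharing on mybinder.org looks like, a public GitHub repository that describes its environment (for example with a requirements.txt or environment.yml file) can be launched through a link of the form mybinder.org/v2/gh/owner/repository/ref. The sketch below builds such a link. The helper name `binder_launch_url` and the repository `example-user/example-analysis` are hypothetical and are not part of the workshop material.

```python
from urllib.parse import quote


def binder_launch_url(owner, repo, ref="HEAD"):
    """Build a mybinder.org launch link for a public GitHub repository.

    `ref` can be a branch, tag or commit; it is percent-encoded so that
    branch names containing slashes stay a single URL path segment.
    """
    parts = [quote(part, safe="") for part in (owner, repo, ref)]
    return "https://mybinder.org/v2/gh/" + "/".join(parts)


# Hypothetical repository, used only to show the shape of the link.
print(binder_launch_url("example-user", "example-analysis"))
# prints https://mybinder.org/v2/gh/example-user/example-analysis/HEAD
```

Pointing the link at a tag or commit rather than a moving branch means that everyone who follows it gets the same version of the project.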

This workshop is for people who are:

  • Interested in reproducibility, containers, Docker or continuous integration
  • Already familiar with R Markdown or Jupyter notebooks
  • Looking to communicate their research more effectively

Eventbrite links:

Friday 1st March, University of Manchester
Tuesday 12th March, The Alan Turing Institute

Workshop: Build a BinderHub

During this free workshop we will demonstrate how to build your own BinderHub on Microsoft Azure cloud computing resources. We will help you get started with building a BinderHub on your institution's computing platform and discuss the challenges of maintaining a BinderHub. At the end of the workshop you will know why this would be a useful resource for your team, and will know where to look for help and support building your institution's BinderHub.

This workshop is for Research Software Engineers and IT staff who are:

  • Interested in reproducibility, containers, Docker or continuous integration
  • Already familiar with Binder and R Markdown or Python for data science
  • Interested in setting up their own local BinderHub

Eventbrite link:

Monday 18th March, University of Sheffield

Contact info

[email protected]

This project is openly developed; any and all questions, comments and recommendations are welcome at the GitHub repository.

To hear about events and monthly project updates, sign up to the newsletter.