Introduction
As part of the Turing-Roche Partnership Community Scholar Scheme, the Tech-Talk series aims to facilitate the exchange of technical knowledge and skills between scientists in academia and industry.
About the event
Reproducibility has become common practice in computation-based research. We use version control to keep track of our code and write analysis notebooks to document decisions, commands, scripts, and parameters. As a research project progresses, both data and code are often updated, and it can be hard to keep track of the interdependencies between each set of results and to know whether they are all up to date. Replicability, in turn, requires that our analyses can also be run on different datasets.
Scientific workflow systems can help fill this gap. Workflows organise and encode all the steps required to get from raw data to final results, while also handling the busywork of managing whether any analyses need to be rerun and the computational resources required to make that happen.
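To make the idea of "managing whether any analyses need to be rerun" concrete, here is a minimal sketch of one common approach: comparing file modification times, in the spirit of make. The function name `needs_rerun` and the file names in the usage note are hypothetical, chosen only for illustration. This is not how Nextflow or Snakemake are implemented; real workflow systems use richer criteria than timestamps alone.

```python
from pathlib import Path


def needs_rerun(inputs, outputs):
    """Decide whether a step should run again (hypothetical helper).

    A step is considered stale if any of its outputs is missing, or if
    any input was modified more recently than the oldest output.
    """
    inputs = [Path(p) for p in inputs]
    outputs = [Path(p) for p in outputs]

    # A missing output always means the step has to run.
    if any(not p.exists() for p in outputs):
        return True

    # No outputs declared: there is nothing to compare against, so run.
    if not outputs:
        return True

    oldest_output = min(p.stat().st_mtime for p in outputs)
    # With no inputs, default to -inf so existing outputs count as current.
    newest_input = max(
        (p.stat().st_mtime for p in inputs), default=float("-inf")
    )
    return newest_input > oldest_output
```

For example, `needs_rerun(["raw.csv"], ["results.csv"])` would return True if `results.csv` does not exist yet or if `raw.csv` has been modified since `results.csv` was written. A workflow system applies this kind of check to every step in the dependency graph, so only the affected analyses are repeated.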
In this talk, Mark will cover the philosophy and principles behind workflow systems, discuss two systems in detail (Nextflow and Snakemake), and share some examples of how workflows have enabled his research.
Watch now
You can watch a recording of this event here.