Dr Catalina Vallejos


Turing Fellow


Catalina is a Chancellor's Fellow at the MRC Human Genetics Unit, where she leads the Biomedical Data Science research group.

Before moving to Edinburgh, Catalina was part of the first cohort of Turing Research Fellows. As part of her Fellowship, she was also a Group Leader within the Lloyd's Register Foundation-Turing Programme on Data-Centric Engineering.

Between 2014 and 2016, Catalina was a Postdoctoral Fellow in a joint appointment between the MRC Biostatistics Unit (MRC-BSU) and the EMBL European Bioinformatics Institute (EMBL-EBI), both located in Cambridge (UK). In this position, she was a member of the Statistical Genomics research group (MRC-BSU) and the Marioni group (EMBL-EBI), led by Professor Sylvia Richardson and Dr John Marioni, respectively.


Catalina completed a PhD in Statistics at the Department of Statistics of the University of Warwick, under the supervision of Professor Mark Steel. Her PhD thesis covered theoretical and practical aspects of Bayesian inference and survival analysis. She completed her undergraduate and MSc studies in Chile: a BSc in Mathematics (Statistics track) and an MSc in Statistics at the Faculty of Mathematics of the Pontificia Universidad Católica de Chile. During her BSc studies, Catalina also completed a Certificate in Economics. Her MSc dissertation project was in the area of long-memory time series, under the supervision of Dr Wilfredo Palma.

Research interests

Catalina's main area of research is Bayesian statistical methodology, mostly driven by applications in biomedicine. An important part of her research programme is translating the methods she develops into open-source analysis tools that can reach the wider community. Currently, Catalina's group focuses on two areas of application: single-cell genomics and electronic health records.

In terms of methodology, her main interests include:

  • High-dimensional Bayesian hierarchical models
  • Integration of heterogeneous datasets
  • Statistical models for count-based datasets
  • Statistical models for time-to-event data
  • Statistical genomics
  • Scalable Bayesian methodology