Geneticists almost never know which DNA letter changes cause complex disease, and molecular biologists almost never know which molecular changes cause disease. By combining their skills, geneticists and molecular biologists can determine which DNA changes alter molecules that, in turn, alter the risk of disease. This project combines population and molecular data to pinpoint disease-causing DNA changes.

Explaining the science

Population genetics studies can narrow down where in the human genome the DNA changes that influence disease risk lie, yet they cannot immediately show which molecular mechanism has gone wrong in disease. This study proposes to consider each molecular mechanism in turn and to model whether genetic variants implicate that mechanism in each of many traits and diseases. The approach that has been developed explicitly tests whether other mechanisms are confounders, that is, variables other than the one being assessed that could themselves account for the observed association.

In a pilot study, funded by the Medical Research Council (MRC), the project's researchers inferred dozens of likely causal DNA variants that explain both a change in a protein's binding to DNA and, by virtue of this variable binding, a change in a trait or disease risk. This project expands on this pilot study by generalising to all proteins for which appropriate data currently exist.

Project aims

Population genetics has been very successful at identifying places in the human genome where DNA predicts a trait (e.g. height) or a disease (e.g. diabetes). At the same time, experimentalists have discovered how proteins alter gene activity. However, genetics alone can rarely pinpoint which specific DNA change causes the change in trait or disease risk, and experiments alone cannot pinpoint which molecules take part in altering a trait or disease risk. What is needed is a 'best-of-both-worlds' approach that takes advantage of both population genetics and molecular experiments.

This project provides such an approach by extending a popular method called 'two-sample Mendelian Randomisation'. The extension considers large numbers of molecules and all traits measured in the UK Biobank.

To achieve this, three steps are required:

  1. Identify places that differ between a person’s maternal and paternal DNA and that differentially bind a particular protein.
  2. Find DNA changes that predict the quantity of this protein.
  3. Build a mathematical model of these DNA changes and, using the model, test whether the DNA changes predict a person’s disease risk.
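
As a rough illustration of the standard two-sample Mendelian Randomisation idea that these steps build on, the sketch below combines per-variant effects on an exposure (here, a molecular measurement such as protein binding) with per-variant effects on an outcome (a trait or disease). It is a minimal textbook-style example, not the project's extended model: the `Variant` class, the function names and the numbers are all hypothetical, and the standard error uses a simple first-order approximation that ignores uncertainty in the exposure effect.

```python
import math
from dataclasses import dataclass


@dataclass
class Variant:
    """Summary statistics for one DNA variant (hypothetical structure)."""
    beta_exposure: float  # effect of the variant on the molecular exposure
    beta_outcome: float   # effect of the variant on the trait or disease
    se_outcome: float     # standard error of beta_outcome


def wald_ratio(v: Variant) -> tuple[float, float]:
    """Per-variant causal estimate: outcome effect divided by exposure effect."""
    ratio = v.beta_outcome / v.beta_exposure
    # First-order approximation: treats beta_exposure as known without error.
    se = v.se_outcome / abs(v.beta_exposure)
    return ratio, se


def ivw_estimate(variants: list[Variant]) -> tuple[float, float]:
    """Inverse-variance-weighted average of the per-variant Wald ratios."""
    weighted_sum = 0.0
    total_weight = 0.0
    for v in variants:
        ratio, se = wald_ratio(v)
        weight = 1.0 / se ** 2
        weighted_sum += weight * ratio
        total_weight += weight
    return weighted_sum / total_weight, math.sqrt(1.0 / total_weight)


# Made-up numbers, for illustration only.
example_variants = [
    Variant(beta_exposure=0.5, beta_outcome=0.10, se_outcome=0.02),
    Variant(beta_exposure=0.4, beta_outcome=0.08, se_outcome=0.02),
]
estimate, std_error = ivw_estimate(example_variants)
print(f"causal estimate = {estimate:.3f}, standard error = {std_error:.3f}")
```

The two samples in 'two-sample' are the separate studies from which the exposure effects and the outcome effects are taken. Weighting by inverse variance lets the variants that are measured more precisely contribute more to the combined estimate.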

The project aims to generate a database of inferred DNA changes that cause both molecular and trait/disease variation for hundreds of proteins in dozens of cell types, for hundreds of traits or diseases. Such an outcome would be transformational in impact because few causal DNA changes are currently defined robustly, and few molecular mechanisms explaining trait change are currently known.


Selecting genetically supported drug targets doubles their success rate in clinical development. Genome-wide association studies do not commonly illuminate disease aetiology (cause) or mechanism directly. It is expected that this research will improve on the genetic support they provide by indicating both the molecular mechanism and the causal variant. Consequently, it is expected that the pharmaceutical industry will benefit from this research.

