Introduction
The cprd-data-wrangle repository is a resource created and maintained by the AI for Multiple Long Term Conditions Research Support Facility (AIM-RSF). It is intended for researchers working with the Clinical Practice Research Datalink (CPRD), and its development was made possible by the RSF's access to the medium-fidelity synthetic versions of CPRD's datasets.
Researchers who need to understand the database tables, and then query and filter them to create a research cohort, may find our pre-processing pipeline and interactive notebooks a helpful starting point. The overarching goal of this work is to streamline that process for researchers using CPRD datasets by providing clear documentation, efficient data management strategies and analytical pipelines.