Between 2013 and 2015, the UK Biobank (UKBB) collected accelerometer traces (AXT) from wrist-worn triaxial accelerometers for 103,712 volunteers aged between 40 and 69, for one week each. This dataset has previously been used to verify that individuals with chronic diseases exhibit reduced activity levels compared to healthy populations [1]. However, the dataset is likely to be noisy, as the devices were allocated to participants without a specific set of inclusion criteria, and the traces reflect uncontrolled free-living conditions.


The objective is to determine the extent to which AXT traces can distinguish individuals with Type-2 Diabetes (T2D) from normoglycaemic controls, and to quantify their limitations.


Physical activity features were first extracted from the raw AXT dataset for each participant, using an algorithm that extends the previously developed Biobank Accelerometry Analysis toolkit from Oxford University [1]. These features were complemented by a selected collection of socio-demographic and lifestyle (SDL) features available from UKBB. Clustering was used to determine whether activity features would naturally partition participants, and the SDL features were projected onto the resulting clusters for a more meaningful interpretation. Supervised machine learning classifiers were then trained using the different sets of features, to separate T2D-positive individuals from normoglycaemic controls. Multiple criteria, based on a combination of self-assessment Biobank variables and primary care health records linked to the participants in Biobank, were used to identify 3,103 individuals in this population who have T2D. The remaining non-diabetic participants were further scored on the severity of their physical activity impairment, based on other conditions found in their primary care data, and those likely to have been physically impaired at the time were excluded.
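The cohort construction described above can be sketched as follows. This is only an illustration under stated assumptions: the field names, the "either source" rule for positives, and the impairment threshold are all hypothetical, since the actual criteria used in the study are not given here.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    pid: int
    self_reported_t2d: bool   # hypothetical flag from self-assessment variables
    primary_care_t2d: bool    # hypothetical flag from linked primary care records
    impairment_score: int     # hypothetical severity score from other conditions


def build_cohort(participants, max_control_impairment=0):
    """Split participants into positives, retained controls, and excluded.

    Assumed rule (not taken from the study): a participant counts as
    T2D positive if either source indicates T2D. A non-diabetic
    participant is kept as a control only if the impairment score does
    not exceed the threshold; otherwise the participant is excluded.
    """
    positives, controls, excluded = [], [], []
    for p in participants:
        if p.self_reported_t2d or p.primary_care_t2d:
            positives.append(p.pid)
        elif p.impairment_score <= max_control_impairment:
            controls.append(p.pid)
        else:
            excluded.append(p.pid)
    return positives, controls, excluded
```

Raising `max_control_impairment` in this sketch would admit more impaired controls into the training set, which is the kind of comparison the study uses to test whether non-diabetes conditions reduce classifier performance.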


Three types of classifiers were tested, with AUROC close to .86 for all three, and F1 scores in the range [.80,.82] for T2D positives and [.73,.74] for controls. Results obtained using non-physically impaired controls were compared to those obtained using highly physically impaired controls, to test the hypothesis that non-diabetes conditions reduce classifier performance. Models built using a training set that includes controls with other conditions performed worse: AUROC in the range [.75,.77] and F1 in the range [.76,.77] (positives) and [.63,.65] (controls). Clusters generated using k-means and hierarchical methods showed limited quality (Silhouette scores: 0.105 and 0.207, respectively); however, a 2-dimensional visual rendering obtained using t-SNE reveals well-defined clusters. Importantly, one of the 3 hierarchical clusters contains almost exclusively (close to 100%) T2D participants.
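For readers less familiar with the metrics reported above, the following minimal sketch shows how a per-class F1 score and AUROC can be computed from labels and classifier scores. It uses plain Python and toy data invented for illustration; it is not the evaluation code used in the study.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (invented): 1 = T2D positive, 0 = control.
y_true = [1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1_positives = f1_for_class(y_true, y_pred, 1)
f1_controls = f1_for_class(y_true, y_pred, 0)
area = auroc(y_true, scores)
```

Reporting F1 separately for positives and controls, as the abstract does, makes visible when a classifier performs unevenly across the two classes, which a single pooled score would hide.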


The study demonstrates the potential, and the limitations, of the UKBB AXT traces when used to discriminate between T2D individuals and normoglycaemic controls. The use of primary care EHRs is essential both to identify positives correctly and to identify controls that should be excluded to reduce noise in the training set.

Citation information

Lam B, Catt M, Cassidy S, Bacardit J, Darke P, Butterfield S, Alshabrawy O, Trenell M, Missier P
