AIMS/HYPOTHESIS: This study aimed to explore the added value of subgroups that categorise individuals with type 2 diabetes by k-means clustering in two primary care registries (from the Netherlands and Scotland), inspired by Ahlqvist's novel diabetes subgroups and previously analysed by Slieker et al.

METHODS: We used two diabetes cohorts, one Dutch and one Scottish (N=3054 and 6145; median follow-up=11.2 and 12.3 years, respectively), and defined five subgroups by k-means clustering with age at baseline, BMI, HbA1c, HDL-cholesterol and C-peptide. We investigated differences between subgroups by trajectories of risk factor values (random intercept models), time to diabetes-related complications (logrank tests and Cox models) and medication patterns (multinomial logistic models). We also compared using the clustering indicators directly as predictors of progression with using the discrete k-means subgroups. Cluster consistency over follow-up was assessed.

RESULTS: Subgroups' risk factors were significantly different, and these differences remained generally consistent over follow-up. Among all subgroups, individuals with severe insulin resistance faced a significantly higher risk of myocardial infarction both before (HR 1.65; 95% CI 1.40, 1.94) and after adjusting for the effect of age (HR 1.72; 95% CI 1.46, 2.02) compared with mild diabetes with high HDL-cholesterol. Individuals with severe insulin-deficient diabetes were most intensively treated, with more than 25% prescribed insulin at 10 years after diagnosis. For severe insulin-deficient diabetes relative to mild diabetes, the relative risk of using insulin, as opposed to no common treatment, would be expected to increase by a factor of 3.07 (95% CI 2.73, 3.44), holding other factors constant. The clustering indicators were better predictors of variation in progression than the subgroups, but prediction accuracy may improve when both are combined. Clusters were consistent over 8 years with an accuracy ranging from 59% to 72%.

CONCLUSIONS/INTERPRETATION: Data-driven subgroup allocations were generally consistent over follow-up and captured significant differences in risk factor trajectories, medication patterns and complication risks. Subgroups serve better as a complement rather than as a basis for compressing clustering indicators.
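
As an illustration only, the clustering step described in the methods can be sketched in plain Python. This is not the authors' code: the standardisation step, the function names, the random seed, the synthetic data and the nearest-centroid rule used to reassign follow-up measurements are all assumptions made for this example. It shows k-means with five clusters on the five indicators, followed by one simple way of measuring agreement between baseline and follow-up allocations.

```python
import math
import random

# The five clustering indicators named in the abstract (column order is an assumption).
INDICATORS = ["age", "bmi", "hba1c", "hdl", "c_peptide"]


def standardise(rows, means=None, sds=None):
    """Scale each column to mean 0 and SD 1 (or reuse supplied means and SDs)."""
    cols = list(zip(*rows))
    if means is None:
        means = [sum(c) / len(c) for c in cols]
        sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
               for c, m in zip(cols, means)]
    scaled = [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]
    return scaled, means, sds


def nearest(point, centroids):
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: math.dist(point, centroids[j]))


def kmeans(points, k=5, n_iter=100, seed=0):
    """Basic Lloyd's algorithm: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_iter):
        new_labels = [nearest(p, centroids) for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster empties
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return labels, centroids


def agreement(labels_a, labels_b):
    """Share of individuals allocated to the same cluster in both label lists."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)


# Synthetic values for illustration only; these are NOT study data.
rng = random.Random(42)
baseline = [[rng.gauss(62, 10), rng.gauss(30, 5), rng.gauss(55, 12),
             rng.gauss(1.2, 0.3), rng.gauss(1.0, 0.4)] for _ in range(200)]
# Hypothetical follow-up: the same individuals with some measurement drift.
followup = [[x + rng.gauss(0, 0.1 * abs(x)) for x in row] for row in baseline]

scaled, means, sds = standardise(baseline)
labels, centroids = kmeans(scaled, k=5)

# Reassign follow-up values to the nearest baseline centroid, on the baseline scale.
scaled_followup, _, _ = standardise(followup, means, sds)
followup_labels = [nearest(p, centroids) for p in scaled_followup]
consistency = agreement(labels, followup_labels)
```

Standardising first keeps an indicator measured on a wide scale, such as age, from dominating the distance over one measured on a narrow scale, such as HDL-cholesterol. The `consistency` value here depends entirely on the synthetic drift and says nothing about the 59% to 72% reported in the abstract.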

Journal article



Keywords: Data-driven subgroups, Longitudinal analysis, Real-world data, Routine care, Stratification of diabetes