Aims/hypothesis The aims of this study were to evaluate systematically how well comprehensive metabolomics profiles predict the future risk of type 2 diabetes, and to identify a panel of the most predictive metabolic markers.

Methods We applied an unbiased systems medicine approach to mine metabolite combinations that provide added value in predicting the future incidence of type 2 diabetes beyond known risk factors. We performed mass spectrometry-based targeted, as well as global untargeted, metabolomics, measuring a total of 568 metabolites, in a Finnish cohort of 543 non-diabetic individuals from the Botnia Prospective Study, which included 146 individuals who progressed to type 2 diabetes by the end of a 10 year follow-up period. Multivariate logistic regression was used to assess statistical associations, and regularised least-squares modelling was used to perform machine learning-based risk classification and marker selection. The predictive performance of the machine learning models and marker panels was evaluated using repeated nested cross-validation, and replicated in an independent French cohort of 1044 individuals, including 231 participants who progressed to type 2 diabetes during a 9 year follow-up period, in the DESIR (Data from an Epidemiological Study on the Insulin Resistance Syndrome) study.

Results Nine metabolites were negatively associated (potentially protective) and 25 were positively associated with progression to type 2 diabetes. Machine learning models based on the entire metabolome predicted progression to type 2 diabetes (area under the receiver operating characteristic curve, AUC = 0.77) significantly better than the reference model based on clinical risk factors alone (AUC = 0.68; DeLong’s p = 0.0009).
The panel of metabolic markers selected by the machine learning-based feature selection also significantly improved the predictive performance over the reference model (AUC = 0.78; p = 0.00019; integrated discrimination improvement, IDI = 66.7%). This approach identified novel predictive biomarkers, such as α-tocopherol, bradykinin hydroxyproline, X-12063 and X-13435, which showed added value in predicting progression to type 2 diabetes when combined with known biomarkers, such as glucose, mannose and α-hydroxybutyrate, and routinely used clinical risk factors.

Conclusions/interpretation This study provides a panel of novel metabolic markers for future efforts aimed at the prevention of type 2 diabetes.
- multivariate models
- Kallikrein-kinin system
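
The Results compare models by AUC and by integrated discrimination improvement (IDI). As a rough illustration of what those two quantities measure, the following minimal Python sketch computes them for a reference model and an extended model. It is not the study's analysis code: the function names are hypothetical, the toy labels and probabilities are invented for illustration only, and the abstract does not state whether the reported IDI is absolute or relative, so both variants are shown as assumptions.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the probability that a progressor outscores a non-progressor.

    Uses the pairwise (Mann-Whitney) formulation; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one progressor and one non-progressor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def discrimination_slope(labels, probs):
    """Mean predicted risk in progressors minus mean predicted risk in non-progressors."""
    pos = [p for y, p in zip(labels, probs) if y == 1]
    neg = [p for y, p in zip(labels, probs) if y == 0]
    return mean(pos) - mean(neg)


def idi(labels, probs_ref, probs_new):
    """Absolute IDI: gain in discrimination slope of the new model over the reference."""
    return discrimination_slope(labels, probs_new) - discrimination_slope(labels, probs_ref)


def relative_idi(labels, probs_ref, probs_new):
    """Relative IDI: proportional gain in discrimination slope (assumed variant)."""
    return discrimination_slope(labels, probs_new) / discrimination_slope(labels, probs_ref) - 1.0


# Toy data, invented for illustration: 1 = progressed to type 2 diabetes, 0 = did not.
labels = [1, 1, 1, 0, 0, 0, 0, 0]
probs_ref = [0.6, 0.4, 0.2, 0.5, 0.3, 0.3, 0.2, 0.1]  # hypothetical clinical-only model
probs_new = [0.8, 0.6, 0.3, 0.4, 0.3, 0.2, 0.2, 0.1]  # hypothetical clinical + metabolite model

print("AUC reference:", round(auc(labels, probs_ref), 3))
print("AUC extended: ", round(auc(labels, probs_new), 3))
print("IDI (absolute):", round(idi(labels, probs_ref, probs_new), 3))
print("IDI (relative):", round(relative_idi(labels, probs_ref, probs_new), 3))
```

In a setting like the one described in the Methods, the predicted probabilities would come from held-out folds of the nested cross-validation rather than from the data used to fit the models; the sketch leaves that step out.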