Extensive phenotype data and machine learning in prediction of mortality in acute coronary syndrome – the MADDEC study

Jussi A. Hernesniemi*, Shadi Mahdiani, Juho A. Tynkkynen, Leo-Pekka Lyytikäinen, Pashupati P. Mishra, Terho Lehtimäki, Markku Eskola, Kjell Nikus, Kari Antila, Niku Oksala

*Corresponding author for this work

    Research output: Contribution to journal, Article, Scientific, peer-review

    47 Citations (Scopus)

    Abstract

    Objective: Investigation of the clinical potential of extensive phenotype data and machine learning (ML) in the prediction of mortality in acute coronary syndrome (ACS).

    Methods: The value of ML and extensive clinical data was analyzed in a retrospective registry study of 9066 consecutive ACS patients (January 2007 to October 2017). The main outcome was six-month mortality. Prediction models were developed using two ML methods, logistic regression and extreme gradient boosting (xgboost). The models were fitted in a training set of patients treated in 2007–2014 and 2017 (81%, n = 7344) and validated in a separate validation set of patients treated in 2015–2016 with full GRACE score data available for comparison of model accuracy (19%, n = 1722).

    Results: Overall, six-month mortality was 7.3% (n = 660). Several variables were found to be significantly associated with six-month mortality by both ML methods. Xgboost achieved the best performance: AUC 0.890 (0.864–0.916). The AUC values for logistic regression and GRACE score were 0.867 (0.837–0.897) and 0.822 (0.785–0.859), respectively. The AUC value of xgboost was better than that of logistic regression (p = .012) and GRACE score (p < .00001).

    Conclusions: The use of extensive phenotype data and novel machine learning improves prediction of mortality in ACS over the traditional GRACE score.

    Key messages

    • The collection of extensive cardiovascular phenotype data from electronic health records as well as from data recorded by physicians can be used highly effectively in prediction of mortality after acute coronary syndrome.
    • Supervised machine learning methods such as logistic regression and extreme gradient boosting using extensive phenotype data significantly outperform conventional risk assessment by the current gold standard GRACE score.
    • Integration of electronic health records and the use of supervised machine learning methods can be easily applied at the single-centre level to model the risk of mortality.
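
    The temporal split and the AUC metric named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the record fields (`year`, `died`, `risk`), the function names, and the toy data are hypothetical, and the sketch leaves out the abstract's additional restriction of the validation set to patients with full GRACE score data. The AUC is computed here in its rank-based (Mann-Whitney) form, which is one standard way to obtain it.

```python
from typing import Iterable, Sequence

# Per the abstract: 2015-2016 patients form the validation set,
# all other years (2007-2014 and 2017) form the training set.
VALIDATION_YEARS = {2015, 2016}


def temporal_split(records: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Split records into (training, validation) by treatment year."""
    training, validation = [], []
    for record in records:
        target = validation if record["year"] in VALIDATION_YEARS else training
        target.append(record)
    return training, validation


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case outranks a
    random negative case; ties count as one half (Mann-Whitney form)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy records: 'died' is six-month mortality (1 = died),
# 'risk' is a model's predicted probability. Not study data.
toy = [
    {"year": 2008, "died": 0, "risk": 0.10},
    {"year": 2014, "died": 1, "risk": 0.70},
    {"year": 2015, "died": 0, "risk": 0.20},
    {"year": 2015, "died": 1, "risk": 0.90},
    {"year": 2016, "died": 0, "risk": 0.40},
    {"year": 2016, "died": 1, "risk": 0.30},
    {"year": 2017, "died": 0, "risk": 0.05},
]

train, valid = temporal_split(toy)
valid_auc = auc([r["died"] for r in valid], [r["risk"] for r in valid])
print(len(train), len(valid), valid_auc)
```

    In practice the risk scores would come from a fitted model (for example logistic regression or gradient boosting) trained only on the training portion, so that the validation AUC reflects performance on patients the model has not seen.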

    Original language: English
    Pages (from-to): 156-163
    Journal: Annals of Medicine
    Volume: 51
    Issue number: 2
    DOIs
    Publication status: Published - Mar 2019
    MoE publication type: A1 Journal article-refereed

    Funding

    This study was supported by Business Finland research funding [Grant no. 4197/31/2015] as a part of collaboration between Tays Heart Hospital, University of Tampere, VTT Technical Research Center Finland Ltd, GE Healthcare Finland Ltd, Fimlab laboratories Ltd, Bittium Ltd and Politecnico di Milano. This study was also supported with grants from the Competitive Research Funding of the Tampere University Hospital [Grant no. X5001 for Professor Lehtimäki], the Emil Aaltonen Foundation (for Professor Lehtimäki), the Academy of Finland [Grant no. 286284 for Professor Lehtimäki], the Finnish Foundation for Cardiovascular Research, the Tampere Tuberculosis Foundation (for Professor Lehtimäki), the Yrjö Jahnsson Foundation, and EU Horizon 2020 [grant 755320 for TAXINOMISIS].

    Keywords

    • machine learning
    • risk factors
    • mortality
    • acute coronary syndrome
    • Risk Assessment
    • Comorbidity
    • Humans
    • Middle Aged
    • Acute Coronary Syndrome/mortality
    • Logistic Models
    • Male
    • Coronary Angiography/statistics & numerical data
    • Machine Learning
    • Phenotype
    • Female
    • Registries
    • Aged
    • Retrospective Studies
    • Electronic Health Records/statistics & numerical data
