Abstract
Previous studies have identified risk factors associated with an increased likelihood of colorectal cancer (CRC) and with a poor prognosis. This study aimed to build an efficient model to predict mortality caused by CRC in Lleida, Spain. To this end, three machine learning algorithms, Random Forest (RF), Neural Network (NN) and Extreme Learning Machine (ELM), were trained at different augmentation rates on a real dataset. The dataset contained gender, age group, and risk factors such as body mass index (BMI), smoking, alcohol consumption and tumour staging. The study included 179 patients diagnosed with CRC, of whom 16 died. To balance the dataset, the Synthetic Minority Oversampling Technique (SMOTE) was used. The results show that RF obtained an accuracy of 90% on the balanced dataset. ELM achieved a similar accuracy (around 90%), while NN performed worse, with an accuracy of 80%. Regarding precision, recall and F1-score, RF and ELM achieved similar outcomes. These results suggest that these models perform strongly and that oversampling is useful for balancing the dataset. They could be considered well-suited algorithms for building a predictive model.
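The balancing step mentioned in the abstract can be illustrated with a minimal sketch of the SMOTE idea: new minority samples are created by interpolating between an existing minority sample and one of its nearest minority neighbours. This is not the authors' implementation. The function `smote`, the feature count and the random placeholder features are assumptions made for illustration; only the class sizes (163 survivors, 16 deaths) come from the abstract.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours (SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base` (squared Euclidean
        # distance), excluding `base` itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


# Placeholder features, NOT the study's data. Only the class sizes
# (179 patients, 16 deaths) are taken from the abstract.
rng = random.Random(42)
n_features = 6  # assumed: gender, age group, BMI, smoking, alcohol, stage
survived = [[rng.random() for _ in range(n_features)] for _ in range(163)]
died = [[rng.random() for _ in range(n_features)] for _ in range(16)]

# Oversample the minority class until both classes have the same size.
new_died = smote(died, n_new=len(survived) - len(died))
balanced_died = died + new_died
print(len(survived), len(balanced_died))  # 163 163
```

Plain interpolation treats every feature as numeric, which is a simplification for categorical variables such as gender or tumour stage. In practice a maintained implementation, for example the `SMOTE` class in the imbalanced-learn package, would likely be preferable to this sketch.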
Original language | English |
---|---|
Title of host publication | Proceedings of ELM 2022 |
Subtitle of host publication | Theory, Algorithms and Applications |
Publisher | Springer |
Pages | 70-79 |
ISBN (Electronic) | 978-3-031-55056-0 |
ISBN (Print) | 978-3-031-55055-3, 978-3-031-55058-4 |
Publication status | Published - 2024 |
MoE publication type | A4 Article in a conference publication |