An Ensemble Machine Learning and Data Mining Approach to Enhance Stroke Prediction

Richard Wijaya (Corresponding / Lead Author), Faisal Saeed (Corresponding / Lead Author), Parnia Samimi, Abdullah M. Albarrak, Sultan Noman Qasem

    Research output: Contribution to journal › Article › peer-review

    Abstract

    Stroke poses a significant health threat, affecting millions annually. Early and precise prediction is crucial to providing effective preventive healthcare interventions. This study applied an ensemble machine learning and data mining approach to enhance the effectiveness of stroke prediction. Following the cross-industry standard process for data mining (CRISP-DM) methodology, several techniques, including random forest, ExtraTrees, XGBoost, artificial neural network (ANN), and genetic algorithm with ANN (GANN), were applied to two benchmark datasets to predict stroke from parameters such as gender, age, various diseases, smoking status, BMI, HighCol, physical activity, hypertension, heart disease, lifestyle, and others. Because the datasets were imbalanced, the Synthetic Minority Oversampling Technique (SMOTE) was applied to them. The models were tuned using grid search and randomized search cross-validation. The evaluation metrics were accuracy, precision, recall, F1-score, and area under the curve (AUC). The experimental results show that the ensemble ExtraTrees classifier achieved the highest accuracy (98.24%) and AUC (98.24%). Random forest also performed well, achieving 98.03% in both accuracy and AUC. Comparisons with state-of-the-art stroke prediction methods showed that the proposed approach performs better, indicating its potential as a promising method for stroke prediction that could offer substantial benefits to healthcare.
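The oversampling step mentioned in the abstract can be illustrated with a minimal, from-scratch sketch of the SMOTE idea: each synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its k nearest minority neighbours. This is not the authors' code, and the abstract does not say which implementation was used. The function names `smote` and `balance`, the parameter defaults, and the toy `[age, BMI]` rows are all assumptions made up for illustration.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic samples by interpolating between minority rows.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest minority neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def balance(X, y, minority_label=1, k=5, seed=0):
    """Return new lists in which the minority class matches the majority count."""
    minority = [x for x, label in zip(X, y) if label == minority_label]
    n_majority = len(y) - len(minority)
    n_new = max(0, n_majority - len(minority))
    new_rows = smote(minority, n_new, k=k, seed=seed)
    return X + new_rows, y + [minority_label] * len(new_rows)


# Toy data with made-up [age, BMI] rows: six "no stroke" rows, two "stroke" rows.
X_toy = [[34.0, 22.1], [45.0, 27.3], [29.0, 24.0], [51.0, 30.2],
         [38.0, 26.5], [42.0, 23.8], [71.0, 31.0], [66.0, 28.4]]
y_toy = [0, 0, 0, 0, 0, 0, 1, 1]
X_bal, y_bal = balance(X_toy, y_toy, minority_label=1, k=1, seed=42)
```

In practice, oversampling of this kind is normally applied only to the training split, so that the held-out data used to compute accuracy, recall, and AUC contains no synthetic rows. Whether the study followed that order is not stated in the abstract.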
    Original language: English
    Pages (from-to): 1-21
    Number of pages: 21
    Journal: Bioengineering
    Volume: 11
    Issue number: 7
    Publication status: Published (VoR) - 2 Jul 2024
