A-Tuning Ensemble Machine Learning Technique for Cerebral Stroke Prediction
Abstract
A cerebral stroke is a medical condition that occurs when the blood flow to a section
of the brain is suddenly cut off, causing damage to the brain. Brain cells gradually die when
the supply of blood and nutrients to the brain is interrupted, resulting in disabilities that depend
on the affected region. Early recognition and detection of symptoms can aid in the rapid treatment
of strokes and improve outcomes by reducing the severity of a stroke episode. In this paper, the
Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Light Gradient-Boosting Machine
(LightGBM) were used as machine learning (ML) algorithms to predict the likelihood of a
cerebral stroke from an open-access stroke prediction dataset. The stroke prediction dataset
was pre-processed by handling missing values using the KNN imputer technique, eliminating
outliers, applying the one-hot encoding method, and normalizing the features with different ranges
of values. After data splitting, synthetic minority oversampling (SMO) was applied to balance the
stroke and no-stroke classes. Furthermore, to fine-tune the hyper-parameters of the ML
algorithms, we employed a random search technique to find the best parameter values.
After the tuning process, we stacked the tuned algorithms into a tuning ensemble, RXLM, which
was analyzed and compared with traditional classifiers. After hyper-parameter tuning, all ML
algorithms achieved promising results on the performance metrics.