Comparison of Bagging and Adaboost Methods on C4.5 Algorithm for Stroke Prediction

Nur Diana Saputri, Khalid Khalid, Dwi Rolliawati

Abstract


Stroke is a dangerous non-communicable disease in which brain function is impaired because blood circulation to the brain is blocked. It is classified as a cerebrovascular disease and requires treatment within 24 hours; if it is not treated quickly, it can cause death. This research aims to address the problem by building a machine learning-based prediction model that supports medical experts in managing the disease and helps reduce the risk of death. The study applies the C4.5 classification algorithm together with two ensemble learning methods, bagging and Adaboost. The stroke data are preprocessed in two stages: data cleaning and data transformation. Three models are compared: the C4.5 algorithm alone, bagging with C4.5, and Adaboost with C4.5. They are evaluated using a confusion matrix, k-fold cross-validation, and a validation test, based on the values of TP, TN, FP, FN, recall, precision, F1-score, and accuracy. In the classification test using the confusion matrix and k-fold cross-validation, the C4.5 algorithm alone achieved an accuracy of 92.87%. Accuracy rose to 95.02% when C4.5 was combined with bagging and to 94.63% when it was combined with Adaboost. These results indicate that combining a single classifier, here the C4.5 algorithm, with the bagging or Adaboost method improves classification performance.
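The evaluation measures named in the abstract all derive from the four confusion-matrix counts. The following is a minimal sketch of that arithmetic, not the authors' implementation; the function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Derive accuracy, precision, recall and F1 from confusion-matrix counts.

    tp/tn/fp/fn are the numbers of true positives, true negatives,
    false positives and false negatives. Ratios with a zero denominator
    are reported as 0.0 (a convention chosen for this sketch).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical counts for illustration only; they do not come from the paper.
example = classification_metrics(tp=40, tn=50, fp=10, fn=0)
for name, value in example.items():
    print(f"{name}: {value:.4f}")
```

Under k-fold cross-validation, the same counts would typically be accumulated or averaged over the folds before the three models are compared.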





DOI: https://doi.org/10.32520/stmsi.v11i3.1684


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.