Comparative Analysis of Oversampling and SMOTEENN Techniques in Machine Learning Algorithms for Breast Cancer Prediction
DOI: https://doi.org/10.32520/stmsi.v14i3.5146

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.