Detection of Graduation Potential in Prospective Students using the Random Forest Algorithm

Puguh Hasta Gunawan, Irving Vitra Paputungan

Abstract


Graduation potential is commonly assessed by evaluating academic and non-academic factors. This study develops a model that predicts graduation outcomes at the start of a student’s university studies, using high school academic data (grades, attendance, and study hours) together with demographic and social factors. The goal is to help universities identify students at risk of delayed graduation early, so that they can design targeted interventions such as tutoring, counseling, or other academic support. The dataset comprised 396 student records, which were preprocessed by removing irrelevant data and encoding categorical variables. The model was built with the Random Forest algorithm, with max_depth = 15 and random_state = 42, and evaluated using accuracy, recall, F1-score, and the ROC curve. It achieved an accuracy of 89%; the Pass class had a recall of 87% and an F1-score of 91%, and the Fail class had a recall of 92% and an F1-score of 84%. The Area Under the Curve (AUC) of 0.94 indicates that the model distinguishes well between students likely to graduate and those at risk of not graduating.
These results indicate that the model can classify graduation outcomes effectively from early academic data. Future work should include additional variables such as psychological factors, learning motivation, and socioeconomic conditions. Tuning further hyperparameters, such as n_estimators, min_samples_split, and max_features, is also suggested to improve the model’s accuracy and generalizability.
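
The per-class recall and F1-score figures reported in the abstract, along with overall accuracy, all derive from a confusion matrix. The following is a minimal sketch of that arithmetic in Python. The counts are hypothetical and chosen only for illustration; they are not the study's confusion matrix, and the names `ClassMetrics` and `class_metrics` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    precision: float
    recall: float
    f1: float


def class_metrics(tp: int, fp: int, fn: int) -> ClassMetrics:
    """Precision, recall, and F1 for one class from its confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return ClassMetrics(precision, recall, f1)


# HYPOTHETICAL counts for illustration only (not taken from the study).
# First word is the true class, second word is the predicted class.
pass_as_pass, pass_as_fail = 40, 10
fail_as_pass, fail_as_fail = 5, 45

total = pass_as_pass + pass_as_fail + fail_as_pass + fail_as_fail
accuracy = (pass_as_pass + fail_as_fail) / total

# For the Pass class, a Fail student predicted as Pass is a false positive
# and a Pass student predicted as Fail is a false negative; the Fail class
# swaps those two roles.
pass_metrics = class_metrics(tp=pass_as_pass, fp=fail_as_pass, fn=pass_as_fail)
fail_metrics = class_metrics(tp=fail_as_fail, fp=pass_as_fail, fn=fail_as_pass)

print(f"accuracy    = {accuracy:.2f}")
print(f"Pass recall = {pass_metrics.recall:.2f}, F1 = {pass_metrics.f1:.2f}")
print(f"Fail recall = {fail_metrics.recall:.2f}, F1 = {fail_metrics.f1:.2f}")
```

In practice these values would more likely be obtained from a library; scikit-learn, for example, provides `classification_report` and `roc_auc_score` in `sklearn.metrics`. Whether the study used those functions is not stated in the abstract.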

Keywords


Random Forest; Confusion Matrix; Graduation Detection





Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.