Weather Classification in West Java using Ensemble Learning on Meteorological Data

Cynthia Nur Azzahra, Yulison Herry Chrisnanto, Gunawan Abdillah

Abstract


Weather classification in West Java is challenging because of class imbalance in the dataset and the complexity of meteorological variables. This study aims to improve classification accuracy with a stacking classifier that combines Support Vector Machine (SVM) and Random Forest as base learners, with Logistic Regression as the meta-classifier. The Synthetic Minority Oversampling Technique (SMOTE) was applied to address the class imbalance, and hyperparameters were tuned using GridSearchCV. The study used weather data for December 2024 from the Indonesian Meteorological, Climatological, and Geophysical Agency (BMKG), preprocessed through transformation, normalization, and outlier handling. The dataset was then split into training and testing sets at ratios of 70:30, 80:20, and 90:10. The stacking classifier without SMOTE achieved the highest accuracy, 86.73%, but it overfit, as indicated by a 13.27% gap between training and validation accuracy. Applying SMOTE raised recall for the minority classes to 76.3% and reduced overfitting, narrowing the accuracy gap to less than 1%. The most stable performance came from the 80:20 train-test split, where the model with SMOTE and optimized hyperparameters reached an accuracy of 85.97% and an F1-score of 68.99%, with a statistically significant t-test result (p < 0.001). These findings indicate that combining a stacking classifier, SMOTE, and hyperparameter tuning mitigates class bias and improves generalization, outperforming single-model classifiers on imbalanced weather data.
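As a rough illustration of the oversampling step, SMOTE creates synthetic minority-class samples by interpolating between an existing minority sample and one of its nearest minority neighbours. The sketch below is a minimal, standard-library-only version of that idea and is not the authors' implementation; the function name smote_oversample, the parameter names, and the feature values are hypothetical and chosen only for illustration.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority
    samples and their k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist(base, minority[j]),
        )[:k]
        neighbour = minority[rng.choice(neighbours)]
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Made-up, already normalised feature vectors for a rare weather class
# (illustrative values only, not taken from the BMKG data).
rare_class = [(0.10, 0.80), (0.15, 0.85), (0.20, 0.75), (0.12, 0.90)]
new_points = smote_oversample(rare_class, n_new=6, k=3)
print(len(rare_class) + len(new_points))  # 10 samples after oversampling
```

Because every synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. In practice the oversampling would normally be applied to the training split only, so that the test set keeps the original class distribution.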

Keywords


Weather Classification; Ensemble Learning; Stacking Classifier; SMOTE; Hyperparameter Optimization


DOI: https://doi.org/10.32520/stmsi.v14i5.5343

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.