Comparison of Support Vector Machine and K-Nearest Neighbor Algorithms on the Effectiveness of a Free Lunch Program

Frenda Farahdinna, Pipit Prabawati

Abstract


The Free Lunch Program is a government initiative aimed at ensuring adequate nutrition for the public. This study examines public perceptions of the program through sentiment analysis and compares the effectiveness of Support Vector Machine (SVM) and K-Nearest Neighbor (K-NN) models. A total of 6,532 public comments were collected from Twitter, YouTube, and TikTok. Preprocessing, which included normalization, stopword removal, and stemming, left 5,992 clean data points, from which features were extracted using Term Frequency–Inverse Document Frequency (TF-IDF). The dataset was split into 80% training and 20% testing sets, and both models were trained with hyperparameter tuning using 3-fold GridSearchCV. The results indicate that negative sentiment was the largest class, at 42.7% of comments. In the model comparison, SVM with a linear kernel clearly outperformed K-NN, achieving an accuracy of 72%, while K-NN (k=3) reached only 48%. These findings suggest that SVM is more effective than K-NN at classifying public opinion sentiment in high-dimensional data.
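The workflow described in the abstract (TF-IDF features, an 80/20 split, and 3-fold GridSearchCV over SVM and K-NN) can be sketched with scikit-learn as follows. This is a minimal illustration, not the authors' code: the toy English corpus, the three class labels, the parameter grids, the stratified split, and the random seed are all assumptions made for the example, and the accuracies it prints say nothing about the figures reported in the study.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus standing in for the preprocessed comments
# (the real dataset has 5,992 cleaned comments and is not reproduced here).
vocab = {
    "positive": ["good", "great", "helpful", "healthy", "support"],
    "negative": ["bad", "waste", "corrupt", "poor", "reject"],
    "neutral": ["news", "schedule", "budget", "menu", "report"],
}
texts, labels = [], []
for label, words in vocab.items():
    for i in range(10):
        first = words[i % 5]
        second = words[(i + 1 + i // 5) % 5]
        texts.append(f"lunch program {first} {second}")
        labels.append(label)

# 80% training, 20% testing; stratification is an assumption of this sketch.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42
)


def tune(classifier, param_grid):
    """Fit a TF-IDF + classifier pipeline with 3-fold grid search."""
    pipeline = Pipeline([("tfidf", TfidfVectorizer()), ("clf", classifier)])
    search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
    search.fit(X_train, y_train)
    return search


# The grids below are illustrative; the study does not list the values searched.
models = {
    "svm": tune(SVC(kernel="linear"), {"clf__C": [0.1, 1, 10]}),
    "knn": tune(KNeighborsClassifier(), {"clf__n_neighbors": [1, 3, 5]}),
}

# Evaluate each tuned model on the held-out 20%.
results = {
    name: accuracy_score(y_test, search.predict(X_test))
    for name, search in models.items()
}
for name, accuracy in results.items():
    print(name, models[name].best_params_, f"test accuracy = {accuracy:.2f}")
```

Placing the vectorizer inside the pipeline means TF-IDF weights are refit on each training fold during the grid search, so the validation folds do not leak vocabulary or document-frequency statistics into tuning.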

Keywords


sentiment evaluation; support vector machine; k-nearest neighbor; free lunch






DOI: https://doi.org/10.32520/stmsi.v15i2.5999



Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.