Data Balancing Approach Using Combine Sampling on Sentiment Analysis With K-Nearest Neighbor
DOI: https://doi.org/10.32520/stmsi.v13i5.4013
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.