Sentiment Analysis Of Ijen Crater Reviews Using Naïve Bayes Classification And Oversampling Optimization

Fadhel Akhmad Hizham, Hasyim Asy'ari, Maysas Yafi Urrochman

Abstract


Sentiment analysis applies text mining to classify each sentence or document as positive, negative, or neutral in polarity. This research analyses the sentiment of user reviews of the Ijen Crater tourist attraction posted on the Google Maps platform. The research is conducted in three main stages: first, data collection and preprocessing, using samples of Ijen Crater reviews from Google Maps; second, optimisation and classification, in which minority class samples are randomly duplicated until the minority class is almost equal in size to the majority class; and third, measurement of classification performance using a confusion matrix. Testing compares the performance of Naïve Bayes Classification (NBC) without optimisation against NBC optimised with SMOTE and with ADASYN. The results show that SMOTE-optimised NBC provides the best improvement in accuracy, 6.74%, compared with plain NBC and NBC with ADASYN.
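
The three stages in the abstract above can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy reviews, the function names, and the choice of a multinomial model with add-one smoothing are all assumptions made for illustration. The oversampling step uses plain random duplication, as the abstract words it; SMOTE and ADASYN instead generate synthetic minority samples by interpolating between neighbours, which is not shown here.

```python
import math
import random
from collections import Counter, defaultdict


def oversample(docs, labels, seed=0):
    """Randomly duplicate minority-class documents until all classes
    are as large as the biggest class (random oversampling, not SMOTE)."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for doc, label in zip(docs, labels):
        by_class[label].append(doc)
    target = max(len(items) for items in by_class.values())
    out_docs, out_labels = [], []
    for label, items in by_class.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_docs.extend(items + extra)
        out_labels.extend([label] * target)
    return out_docs, out_labels


def train_nb(docs, labels):
    """Collect the counts needed by a multinomial Naive Bayes classifier."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return class_counts, word_counts, vocab


def predict_nb(model, doc):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        score = math.log(class_counts[label] / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in doc.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def confusion_matrix(true_labels, predicted_labels):
    """Count (true, predicted) label pairs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Share of predictions on the diagonal of the confusion matrix."""
    correct = sum(n for (true, pred), n in matrix.items() if true == pred)
    return correct / sum(matrix.values())


# Hypothetical toy reviews, invented for this sketch (not the paper's data).
train_docs = [
    "beautiful blue fire view",
    "amazing sunrise hike",
    "great guide beautiful crater",
    "stunning view worth it",
    "sulfur smell terrible",
]
train_labels = ["positive", "positive", "positive", "positive", "negative"]

balanced_docs, balanced_labels = oversample(train_docs, train_labels)
model = train_nb(balanced_docs, balanced_labels)

test_docs = ["terrible smell", "beautiful view"]
test_labels = ["negative", "positive"]
predictions = [predict_nb(model, doc) for doc in test_docs]
matrix = confusion_matrix(test_labels, predictions)
print(Counter(balanced_labels), predictions, accuracy(matrix))
```

Comparing the accuracy of a model trained on `train_docs` directly with one trained on `balanced_docs` mirrors, in miniature, the with-and-without-optimisation comparison the abstract reports.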

DOI: https://doi.org/10.32520/stmsi.v13i5.4490

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.