APPLICATION OF WORD N-GRAM FOR SENTIMENT ANALYSIS OF REVIEWS USING THE SUPPORT VECTOR MACHINE METHOD (CASE STUDY: SAMBARA APPLICATION)

Fitriyani Fitriyani, Toni Arifin

ABSTRACT

The Sambara application is an innovation from Bapenda West Java for motor vehicle tax services. It is expected to improve the efficiency, effectiveness, and quality of those services. The success of the application can be assessed through sentiment analysis of its reviews. Sentiment analysis uses text mining to detect the polarity of a text, that is, whether it expresses a negative or a positive opinion. At the text processing stage, the Word N-Gram feature is added as a word identification approach, and classification uses the Support Vector Machine (SVM) method. This study aims to determine how Word N-Gram is applied, what accuracy the SVM method achieves, and how much the application of Word N-Gram affects that accuracy. The highest accuracy in this study was 89.00%, with an AUC value of 0.944 (excellent classification), on the data set of 900 items, but applying Bi-gram and Tri-gram in that case decreased accuracy. The largest increase in accuracy came from applying Tri-gram to the data set of 1,200 items: accuracy rose by 0.92% compared with Uni-gram, to 88.59%, with an AUC value of 0.954.

Keywords: sentiment analysis, text mining, word n-gram, support vector machine (SVM)
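
The abstract does not show how word n-grams are formed, so the following is a minimal sketch in Python using only the standard library. It illustrates the general idea of contiguous word n-grams (Uni-gram, Bi-gram, Tri-gram) and is not the authors' implementation. The whitespace tokenizer, the function names, the sample review text, and the choice to keep lower-order n-grams alongside higher-order ones are all assumptions made for illustration.

```python
from collections import Counter
from typing import List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def word_ngrams(tokens: List[str], n: int) -> List[Tuple[str, ...]]:
    """Return all contiguous word n-grams of length n, in order."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text: str, max_n: int) -> Counter:
    """Count every word n-gram of length 1 up to max_n (assumed setup)."""
    tokens = tokenize(text)
    counts: Counter = Counter()
    for n in range(1, max_n + 1):
        counts.update(" ".join(gram) for gram in word_ngrams(tokens, n))
    return counts


# Hypothetical review text, not taken from the study's data.
review = "aplikasi sangat membantu bayar pajak"
print(word_ngrams(tokenize(review), 2))  # Bi-grams
print(ngram_features(review, 3))         # Uni-, Bi-, and Tri-gram counts
```

Counts such as these would typically be weighted and passed to a classifier such as an SVM. That step is omitted here because the abstract does not specify the tooling or parameters used.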

ABSTRACT (translated from Indonesian)

The Sambara application is an innovation from Bapenda Jabar for motor vehicle tax services. It is expected to deliver efficiency, effectiveness, and improved service. The success of the application can be assessed by conducting sentiment analysis of its reviews. Sentiment analysis aims to detect the polarity of a text, in the form of negative or positive opinion, using text mining. At the text processing stage, the Word N-Gram feature is added as a word identification approach, and classification uses the Support Vector Machine (SVM) method. This study aims to determine how Word N-Gram is applied, the accuracy obtained with the SVM method, and how much the application of Word N-Gram affects the accuracy. The highest accuracy in this study was 89.00%, with an AUC value of 0.944 (excellent classification), on the data set of 900 items, but applying Bi-gram and Tri-gram resulted in lower accuracy. The largest increase in accuracy came from applying Tri-gram to the data set of 1,200 items. Accuracy rose by 0.92% compared with Uni-gram, to 88.59%, with an AUC value of 0.954.

Keywords: sentiment analysis, text mining, word n-gram, support vector machine (SVM)







DOI: https://doi.org/10.32520/stmsi.v9i3.954

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.