Fake News Detection using the Random Forest Algorithm

Rahmat Dipo Setyadin, Reza Handaru Winasis, Gandung Triyono

Abstract


Detecting fake news has become increasingly important in the digital era, where false information spreads rapidly and can significantly influence public opinion. The dissemination of fake news can lead to public distrust in the media, economic losses, and even social conflict. This study aims to develop an effective fake news detection system using the Random Forest algorithm. The dataset was collected from the official Kominfo website and includes the attributes title, description, author, date, category, page, news URL, and image URL. Text preprocessing consists of tokenization, stop word removal, and text normalization, followed by feature extraction using Term Frequency-Inverse Document Frequency (TF-IDF) to produce numerical representations of the text. The Random Forest model was evaluated using accuracy, precision, recall, and F1-score to assess its effectiveness in detecting fake news. The results show that the model performed well: k-fold cross-validation (k=5) yielded a high average accuracy, with Random Forest achieving an accuracy of 0.9890.
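The preprocessing and evaluation steps named in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the function names (`tokenize`, `tf_idf`, `k_fold_indices`, `binary_metrics`) are hypothetical, the TF-IDF variant (relative term frequency times log(N/df)) is an assumption because the abstract does not state which weighting was used, and the Random Forest classifier itself is omitted.

```python
import math
from collections import Counter


def tokenize(text, stop_words=frozenset()):
    """Lowercase, split on whitespace, strip edge punctuation, drop stop words."""
    tokens = (t.strip(".,!?;:\"'()") for t in text.lower().split())
    return [t for t in tokens if t and t not in stop_words]


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / len(doc), idf = log(N / df).
    Smoothed or normalized variants would give different numbers.
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

In a full pipeline, each fold from `k_fold_indices` would train a classifier on the TF-IDF vectors of the training indices, predict on the test indices, and pass the predictions to `binary_metrics`; the per-fold scores are then averaged. Shuffling or stratifying the folds is a common refinement that this sketch leaves out.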

Keywords


Fake News Detection; Random Forest; TF-IDF; K-Fold Cross-Validation

DOI: https://doi.org/10.32520/stmsi.v14i3.4995

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.