Comparative Analysis of T5 Model Performance for Indonesian Abstractive Text Summarization

Mohammad Wahyu Bagus Dwi Satya, Ardytha Luthfiarta, Mohammad Noval Althoff

Abstract


The rapid growth of digital content has created significant challenges in information processing, particularly in languages like Indonesian, where automatic summarization remains complex. This study evaluates the performance of different T5 (Text-to-Text Transfer Transformer) model variants in generating abstractive summaries for Indonesian texts. The research aims to identify the most effective model variant for Indonesian language summarization by comparing the T5-Base, FLAN-T5-Base, and mT5-Base models. Using the INDOSUM dataset containing 19,000 Indonesian news article-summary pairs, we implemented a 5-Fold Cross-Validation approach and applied ROUGE metrics for evaluation. Results show that T5-Base achieves the highest ROUGE-1, ROUGE-2, and ROUGE-L scores of 73.52%, 64.50%, and 69.55%, respectively, followed by FLAN-T5-Base, while mT5-Base performs the worst. However, qualitative analysis reveals various summarization errors: T5-Base exhibits redundancy and inconsistent formatting, FLAN-T5-Base suffers from truncation issues, and mT5-Base often generates factually incorrect summaries due to misinterpretation of context. Additionally, we assessed computational performance through training time, inference speed, and resource consumption. The results indicate that mT5-Base has the shortest training time and fastest inference speed, but at the cost of lower summarization accuracy. Conversely, T5-Base, while achieving the highest accuracy, requires significantly longer training time and greater computational resources. These findings highlight the trade-offs between accuracy, error tendencies, and computational efficiency, providing valuable insights for developing more effective Indonesian language summarization systems and emphasizing the importance of model selection for specific language tasks.
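
For readers who want to see what the evaluation setup described above looks like in practice, the sketch below shows how one of the compared checkpoints can be loaded, used to summarize an Indonesian article, and scored with ROUGE. It is a minimal illustration only, assuming the Hugging Face transformers and evaluate libraries and the public t5-base, google/flan-t5-base, and google/mt5-base checkpoints; the article text, the "summarize:" task prefix, and the generation settings are placeholders, not the authors' actual fine-tuning configuration or hyperparameters.

# Minimal sketch (assumption): load a pretrained T5-family checkpoint, generate a
# summary for an Indonesian article, and score it with ROUGE-1/2/L. This is not
# the paper's training code; in the study each model was first fine-tuned on the
# INDOSUM dataset with 5-Fold Cross-Validation before evaluation.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import evaluate

model_name = "t5-base"  # alternatives compared in the paper: "google/flan-t5-base", "google/mt5-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "Teks artikel berita berbahasa Indonesia ..."   # placeholder input article
reference = "Ringkasan acuan dari dataset INDOSUM ..."    # placeholder gold summary

# T5-style task prefix; truncation length, beam count, and summary length are illustrative defaults.
inputs = tokenizer("summarize: " + article, return_tensors="pt",
                   truncation=True, max_length=512)
summary_ids = model.generate(**inputs, max_length=128, num_beams=4)
prediction = tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# ROUGE-1, ROUGE-2, and ROUGE-L, the metrics reported in the abstract.
rouge = evaluate.load("rouge")
scores = rouge.compute(predictions=[prediction], references=[reference])
print(scores)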

Keywords


natural language processing; text summarization; transformers; T5; ROUGE






DOI: https://doi.org/10.32520/stmsi.v14i3.4884




This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.