A Robust Gender Recognition System using Convolutional Neural Network on Indonesian Speaker

I Nyoman Switrayana; Sirojul Hadi; Neny Sulistianingsih

doi:10.32520/stmsi.v13i3.3698

Abstract

Voice is a biometric trait: a person can be recognized from the sounds produced by their vocal cords and vocal tract. One application of voice is gender recognition. Despite extensive research, gender recognition using machine learning remains unsatisfactory because voice features are complex and conventional algorithms are limited. This research applies deep learning to voice-based gender recognition, using a Convolutional Neural Network (CNN) as the model. The CNN takes as input the features extracted with the Mel-Frequency Cepstral Coefficients (MFCC) method. MFCC produces Mel-spectrograms, which are important features of sound. The dataset consists of Indonesian speech. The model is evaluated under imbalanced and balanced dataset scenarios; the balanced dataset is produced by random undersampling of the majority class. In addition, the effect of splitting the data into training and testing sets at ratios of 70:30, 80:20, and 90:10 is observed. The results show that the model reaches 100% accuracy in all imbalanced dataset scenarios, while the highest accuracy in the balanced scenario is 99.65%, obtained with the 70:30 split. In summary, the CNN performs very well overall in identifying gender from voice features, although its performance decreases when random undersampling is applied to the dataset.
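
The balancing and splitting steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names `random_undersample` and `stratified_split`, the toy class sizes (700 versus 300), the 13-dimensional random features standing in for MFCC-derived input, and the choice to stratify the split by class are all assumptions made for illustration.

```python
import numpy as np


def random_undersample(X, y, seed=0):
    """Randomly drop samples from larger classes until every class
    has as many samples as the smallest one (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    n_min = counts.min()
    keep = np.concatenate([
        rng.choice(np.flatnonzero(y == c), size=n_min, replace=False)
        for c in classes
    ])
    keep.sort()  # preserve the original sample order
    return X[keep], y[keep]


def stratified_split(X, y, test_fraction, seed=0):
    """Split into train/test sets, keeping the class ratio in both parts.
    Stratification is an assumption; the abstract does not specify it."""
    rng = np.random.default_rng(seed)
    test_idx = []
    for c in np.unique(y):
        idx = rng.permutation(np.flatnonzero(y == c))
        n_test = int(round(test_fraction * len(idx)))
        test_idx.append(idx[:n_test])
    mask = np.zeros(len(y), dtype=bool)
    mask[np.concatenate(test_idx)] = True
    return X[~mask], X[mask], y[~mask], y[mask]


# Toy stand-in for extracted voice features (assumed shapes and class sizes).
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 13))
y = np.array([0] * 700 + [1] * 300)  # imbalanced: class 0 is the majority

# Balanced scenario: undersample the majority class first.
X_bal, y_bal = random_undersample(X, y)

# Train/test compositions of 70:30, 80:20, and 90:10.
splits = {}
for test_fraction in (0.30, 0.20, 0.10):
    X_tr, X_te, y_tr, y_te = stratified_split(X_bal, y_bal, test_fraction)
    splits[test_fraction] = (len(y_tr), len(y_te))
```

Undersampling discards majority-class recordings, so the balanced scenario trains on less data than the imbalanced one, which is one plausible reason for the lower accuracy the abstract reports there.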

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
