Integrating GAN to MFCC-Assisted RNN-LSTM Algorithms for Identifying Types of Tribes Based on English Pronunciation

Agung Mulyo Widodo, Hojjat Baghban, Mosiur Rahaman, Erwan Baharudin, Muhamad Bahrul Ulum, Diah Aryani

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Many Natural Language Processing (NLP) studies focus on the results of speech recognition on devices but do not discuss the recognition process that runs before a device acts on a user's voice command. One factor that affects the success of classification is how imbalanced the data are: class imbalance can degrade the performance of the deep learning architecture used to build a classifier model. This paper therefore proposes integrating the Generative Adversarial Network (GAN) algorithm with the Mel Frequency Cepstral Coefficient (MFCC) algorithm and RNN-LSTM for speech recognition of English as pronounced by non-native speakers. The proposed model identifies a speaker's ethnic group from their pronunciation. The study used 2,722 voice samples from four indigenous tribes in Indonesia: the Ambonese, Sundanese, Javanese, and Betawi. In testing, the classifier model obtained with the proposed method reached an accuracy of 78.12%.
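The imbalance argument in the abstract can be made concrete with a small sketch. It only counts how many extra samples each class would need to match the largest class, which is the gap a generator such as a GAN would be asked to fill; it does not implement a GAN. The function name `synthetic_samples_needed`, the "balance up to the largest class" rule, and the toy labels are assumptions for illustration, since the abstract reports only the total of 2,722 samples and not the per-tribe counts.

```python
from collections import Counter


def synthetic_samples_needed(labels):
    """Return, per class, how many extra samples would match the largest class.

    Assumption: balancing means topping every class up to the size of the
    largest one. The paper may use a different target.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


# Hypothetical toy labels, NOT the paper's per-tribe counts.
toy_labels = ["A"] * 5 + ["B"] * 3 + ["C"] * 5 + ["D"] * 1
deficits = synthetic_samples_needed(toy_labels)
print(deficits)
```

Classes already at the largest size get a deficit of zero, so only the under-represented classes would receive generated samples.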
Original language: English
Title of host publication: Proceedings - ICE3IS 2024
Subtitle of host publication: 4th International Conference on Electronic and Electrical Engineering and Intelligent System: Leading-Edge Technologies for Sustainable Societies
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 385-390
Number of pages: 6
ISBN (Electronic): 9798350378368
State: Published - 11 Dec 2024
Event: 4th International Conference on Electronic and Electrical Engineering and Intelligent System, ICE3IS 2024 - Hybrid, Yogyakarta, Indonesia
Duration: 7 Aug 2024 → 8 Aug 2024

Publication series

Name: Proceedings - ICE3IS 2024: 4th International Conference on Electronic and Electrical Engineering and Intelligent System: Leading-Edge Technologies for Sustainable Societies

Conference

Conference: 4th International Conference on Electronic and Electrical Engineering and Intelligent System, ICE3IS 2024
Country/Territory: Indonesia
City: Hybrid, Yogyakarta
Period: 07/08/24 → 08/08/24

Bibliographical note

Publisher Copyright:
© 2024 IEEE.

Keywords

  • GAN
  • LSTM
  • MFCC
  • pronunciation
  • recognition
  • speech
