Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression

Tirus Muya Maina

Section: Research Paper, Product Type: Journal-Paper
Vol.12, Issue.1, pp.16-25, Mar-2025

Online published on Mar 31, 2025

Copyright © Tirus Muya Maina. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Tirus Muya Maina, “Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression,” World Academics Journal of Engineering Sciences, Vol.12, Issue.1, pp.16-25, 2025.

MLA Style Citation: Tirus Muya Maina. "Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression." World Academics Journal of Engineering Sciences 12.1 (2025): 16-25.

APA Style Citation: Tirus Muya Maina (2025). Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression. World Academics Journal of Engineering Sciences, 12(1), 16-25.

BibTex Style Citation:
@article{Maina_2025,
author = {Tirus Muya Maina},
title = {Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression},
journal = {World Academics Journal of Engineering Sciences},
issue_date = {3 2025},
volume = {12},
Issue = {1},
month = {3},
year = {2025},
issn = {2347-2693},
pages = {16-25},
url = {https://www.isroset.org/journal/WAJES/full_paper_view.php?paper_id=3821},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/WAJES/full_paper_view.php?paper_id=3821
TI - Optimizing Phonetic Recognition and Computational Efficiency in Swahili Digraphs Using Feature Reduction Model with Multinomial Logistic Regression
T2 - World Academics Journal of Engineering Sciences
AU - Tirus Muya Maina
PY - 2025
DA - 2025/03/31
PB - IJCSE, Indore, INDIA
SP - 16-25
IS - 1
VL - 12
SN - 2347-2693
ER -

Abstract:
Automatic Speech Recognition (ASR) systems commonly rely on spectral acoustic features such as Linear Predictive Coding (LPC), Perceptual Linear Prediction (PLP), and Mel-Frequency Cepstral Coefficients (MFCC). While these features capture essential spectral information, they often fail to convey fine phonetic distinctions, especially for languages with complex phonological structures such as Swahili. This paper introduces a novel approach to Swahili digraph recognition that transforms high-dimensional MFCC feature vectors into a reduced set of probability-based features using Multinomial Logistic Regression (MLR), termed Feature Reduction by Multinomial Logistic Regression (FRMLR). FRMLR reduces the feature dimensionality from 39 to 5, significantly decreasing computational complexity while preserving critical phonetic information. The proposed method achieves a recognition accuracy of 92.5% and improves computational efficiency, reducing training time from 45 minutes to 10 minutes and memory usage by 70%. The findings show that FRMLR captures the phonetic nuances of Swahili digraphs effectively, leading to higher recognition accuracy and greater robustness to variability and noise. The adaptability of FRMLR to other languages and its potential use in a range of ASR systems point to its scalability and versatility. By addressing the limitations of traditional spectral features, FRMLR represents a significant advancement in ASR technology, particularly for languages with intricate phonological characteristics. The method holds promise for ASR in multilingual environments and contributes to more inclusive and effective speech recognition technologies.
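
The reduction described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes that the five reduced features are the class posterior probabilities of a five-class multinomial logistic regression, uses synthetic 39-dimensional vectors in place of real MFCC frames, and fits the model with plain batch gradient descent. The function names, the per-dimension standardisation, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_mlr(X, y, n_classes, lr=0.1, epochs=300, l2=1e-3):
    """Fit multinomial logistic regression by batch gradient descent.

    Hyperparameters are illustrative assumptions, not values from the paper.
    """
    n, d = X.shape
    W = np.zeros((d, n_classes))
    b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]               # one-hot targets
    for _ in range(epochs):
        P = softmax(X @ W + b)             # current class posteriors
        G = P - Y                          # gradient of cross-entropy w.r.t. logits
        W -= lr * (X.T @ G / n + l2 * W)   # L2-regularised weight update
        b -= lr * G.mean(axis=0)
    return W, b

def reduce_features(X, W, b):
    """Map each input vector to its vector of class posterior probabilities."""
    return softmax(X @ W + b)

# Synthetic stand-in for 39-dimensional MFCC vectors (assumption: 5 classes).
rng = np.random.default_rng(0)
n_classes, n_dims, n_per_class = 5, 39, 200
centers = rng.normal(scale=2.0, size=(n_classes, n_dims))
y = np.repeat(np.arange(n_classes), n_per_class)
X = centers[y] + rng.normal(size=(y.size, n_dims))

# Per-dimension standardisation (an assumed preprocessing step).
X = (X - X.mean(axis=0)) / X.std(axis=0)

W, b = fit_mlr(X, y, n_classes)
Z = reduce_features(X, W, b)               # 39 features -> 5 probability features
print(X.shape, "->", Z.shape)              # (1000, 39) -> (1000, 5)
```

Each row of `Z` sums to 1 and could be passed to a downstream recogniser in place of the original 39-dimensional vector. Whether the paper's five features correspond to class posteriors in exactly this way is an assumption of the sketch.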

Key-Words / Index Terms:
Automatic Speech Recognition (ASR), Feature Extraction, Multinomial Logistic Regression (MLR), Swahili Digraphs, Dimensionality Reduction, Computational Efficiency, Mel-Frequency Cepstral Coefficients (MFCC).
