Comparison Studies of Speaker Modeling Techniques in Speaker Verification System

K. Sarmah

Department of Computer Science, Pandit Deendayal Upadhyaya Adarsha Mahavidyalaya, Goalpara, India.

Correspondence should be addressed to: kshirodsarmah@gmail.com.

Section: Research Paper, Product Type: Isroset-Journal
Vol.5, Issue.5, pp.75-82, Oct-2017

CrossRef-DOI: https://doi.org/10.26438/ijsrcse/v5i5.7582

Online published on Oct 30, 2017

Copyright © K. Sarmah. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: K. Sarmah, “Comparison Studies of Speaker Modeling Techniques in Speaker Verification System,” International Journal of Scientific Research in Computer Science and Engineering, Vol.5, Issue.5, pp.75-82, 2017.

MLA Style Citation: K. Sarmah "Comparison Studies of Speaker Modeling Techniques in Speaker Verification System." International Journal of Scientific Research in Computer Science and Engineering 5.5 (2017): 75-82.

APA Style Citation: K. Sarmah, (2017). Comparison Studies of Speaker Modeling Techniques in Speaker Verification System. International Journal of Scientific Research in Computer Science and Engineering, 5(5), 75-82.

BibTex Style Citation:
@article{Sarmah_2017,
author = {K. Sarmah},
title = {Comparison Studies of Speaker Modeling Techniques in Speaker Verification System},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {10 2017},
volume = {5},
Issue = {5},
month = {10},
year = {2017},
issn = {2347-2693},
pages = {75-82},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=488},
doi = {https://doi.org/10.26438/ijsrcse/v5i5.7582},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
DO - https://doi.org/10.26438/ijsrcse/v5i5.7582
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=488
TI - Comparison Studies of Speaker Modeling Techniques in Speaker Verification System
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - K. Sarmah
PY - 2017
DA - 2017/10/30
PB - IJCSE, Indore, INDIA
SP - 75-82
IS - 5
VL - 5
SN - 2347-2693
ER -

Abstract :
This paper presents a brief comparative study of the performance of different speaker modeling techniques in a robust and reliable speaker verification (SV) system. For text-independent speaker verification, many state-of-the-art speaker modeling techniques have been developed for different scenarios to improve performance. The performance of an SV system depends not only on the fusion of different feature vectors but also, to a large extent, on the fusion of various speaker modeling techniques. In this work, an automatic SV system has been developed using Mel-Frequency Cepstral Coefficients (MFCC) combined with prosodic feature vectors. The baseline SV system has been trained with the following speaker modeling techniques, both separately and in fusion, to analyze their performance: Vector Quantization (VQ), Gaussian Mixture Model (GMM), GMM-Universal Background Model (GMM-UBM), Support Vector Machine (SVM) and Joint Factor Analysis (JFA). The results reported here have been evaluated on a multilingual speech database, the Arunachali Language Speech Database (ALS-DB). Experimentally, the best performance is obtained by JFA combined with GMM-UBM modeling, with an equal error rate (EER) of 4.76% and a minimum detection cost function (MinDCF) value of 0.0872. VQ performs worst among the techniques compared, with an EER of 11.08% and a MinDCF of 0.2010. SVM shows an improvement of approximately 2.8% in verification rate over GMM-UBM. We conclude that fusing generative and discriminative models substantially improves the performance of the SV system.
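
The two metrics quoted in the abstract, EER and MinDCF, can be illustrated with a minimal sketch. It assumes that each verification trial yields one score, that higher scores favor acceptance, and that the detection cost uses the parameters p_target = 0.01, c_miss = 10 and c_fa = 1. These parameter values, the function names and the unnormalized form of the cost are assumptions made for illustration; the abstract does not state the settings used in the paper.

```python
def error_rates(target_scores, impostor_scores):
    """Return (threshold, miss rate, false-alarm rate) for each candidate threshold."""
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    # Extra threshold above every score so that "reject all trials" is covered.
    thresholds.append(thresholds[-1] + 1.0)
    rates = []
    for t in thresholds:
        # A trial is accepted when its score is at least the threshold.
        p_miss = sum(s < t for s in target_scores) / len(target_scores)
        p_fa = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        rates.append((t, p_miss, p_fa))
    return rates


def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER at the threshold where miss and false-alarm rates are closest."""
    rates = error_rates(target_scores, impostor_scores)
    _, p_miss, p_fa = min(rates, key=lambda r: abs(r[1] - r[2]))
    return (p_miss + p_fa) / 2


def min_dcf(target_scores, impostor_scores, p_target=0.01, c_miss=10.0, c_fa=1.0):
    """Minimum (unnormalized) detection cost over all thresholds.

    The default cost parameters are illustrative assumptions, not values
    taken from the paper.
    """
    rates = error_rates(target_scores, impostor_scores)
    return min(
        c_miss * p_target * p_miss + c_fa * (1.0 - p_target) * p_fa
        for _, p_miss, p_fa in rates
    )


if __name__ == "__main__":
    # Hypothetical scores, not data from ALS-DB.
    targets = [2.0, 1.5, 0.2, 1.8]
    impostors = [0.1, 0.3, -0.5, 1.6]
    print("EER:", equal_error_rate(targets, impostors))
    print("MinDCF:", min_dcf(targets, impostors))
```

A lower value is better for both metrics. The EER weights misses and false alarms equally, while MinDCF weights them by the chosen costs and target prior, so the two can rank systems differently.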

Key-Words / Index Term :
Speaker Verification, MFCC, Prosodic, GMM-UBM, SVM, JFA
