Examining the Best Speech-to-Text Method for Audio Files in Podcasting

Siya Naik, Avina Almeida, Shubham Lotliker

Section: Research Paper, Product Type: Journal-Paper
Vol.9, Issue.5, pp.25-29, Oct-2021

Online published on Oct 31, 2021

Copyright © Siya Naik, Avina Almeida, Shubham Lotliker. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to Cite this Paper

IEEE Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker, “Examining the Best Speech-to-Text Method for Audio Files in Podcasting,” International Journal of Scientific Research in Computer Science and Engineering, Vol.9, Issue.5, pp.25-29, 2021.

MLA Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker. "Examining the Best Speech-to-Text Method for Audio Files in Podcasting." International Journal of Scientific Research in Computer Science and Engineering 9.5 (2021): 25-29.

APA Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker (2021). Examining the Best Speech-to-Text Method for Audio Files in Podcasting. International Journal of Scientific Research in Computer Science and Engineering, 9(5), 25-29.

BibTeX Style Citation:
@article{Naik_2021,
author = {Siya Naik and Avina Almeida and Shubham Lotliker},
title = {Examining the Best Speech-to-Text Method for Audio Files in Podcasting},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {10 2021},
volume = {9},
number = {5},
month = {10},
year = {2021},
issn = {2347-2693},
pages = {25-29},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2553},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2553
TI - Examining the Best Speech-to-Text Method for Audio Files in Podcasting
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Naik, Siya
AU - Almeida, Avina
AU - Lotliker, Shubham
PY - 2021
DA - 2021/10/31
PB - IJCSE, Indore, INDIA
SP - 25
EP - 29
IS - 5
VL - 9
SN - 2347-2693
ER -

Abstract:
Podcasting is an effective way to share insights or opinions on a topic with an audience. Recording a podcast conventionally requires both parties to be physically present at the same location. The pandemic made this difficult, so recording is now carried out on online platforms. The drawbacks of recording online are noise in the audio files and miscommunication between the parties. The Spectral Gating method is used to remove the noise. This paper compares algorithms for converting audio to text using several pretrained speech-to-text models. We ran an experiment on various audio files, and the SpeechRecognition pretrained model achieved the best accuracy.
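
The index terms list Word Error Rate and Accuracy as the evaluation measures, but this page does not state how they were computed. The following is a minimal sketch assuming the standard definition, WER = (substitutions + deletions + insertions) / number of reference words, computed with a word-level edit distance. The function names `word_error_rate` and `word_accuracy` are hypothetical, and treating accuracy as 1 - WER is an assumption rather than the authors' stated formula.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumption: case-insensitive comparison, whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed accuracy measure: 1 - WER, floored at zero."""
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))


if __name__ == "__main__":
    # Illustrative strings only, not data from the paper.
    truth = "the quick brown fox jumps"
    transcript = "the quick brown dog jumps"
    print(f"WER: {word_error_rate(truth, transcript):.2f}")
```

Under this definition, each model's transcript of the same audio file would be scored against one manually prepared reference transcript, and the model with the lowest WER would be reported as the most accurate. Note that WER can exceed 1 when the hypothesis contains many inserted words, which is why the accuracy helper floors its result at zero.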

Keywords / Index Terms:
Podcast; Subtitles; Spectral Gating; Speech-to-Text; Silero; Vosk; Mozilla DeepSpeech; SpeechRecognition; Word Error Rate; Accuracy.
