
Examining the Best Speech-to-Text Method for Audio Files in Podcasting

Siya Naik, Avina Almeida, Shubham Lotliker

Section: Research Paper, Product Type: Journal Paper
Vol. 9, Issue 5, pp. 25-29, Oct 2021


Published online on Oct 31, 2021


Copyright © Siya Naik, Avina Almeida, Shubham Lotliker. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
 

How to Cite this Paper

IEEE Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker, “Examining the Best Speech-to-Text Method for Audio Files in Podcasting,” International Journal of Scientific Research in Computer Science and Engineering, Vol.9, Issue.5, pp.25-29, 2021.

MLA Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker. "Examining the Best Speech-to-Text Method for Audio Files in Podcasting." International Journal of Scientific Research in Computer Science and Engineering 9.5 (2021): 25-29.

APA Style Citation: Siya Naik, Avina Almeida, Shubham Lotliker (2021). Examining the Best Speech-to-Text Method for Audio Files in Podcasting. International Journal of Scientific Research in Computer Science and Engineering, 9(5), 25-29.

BibTex Style Citation:
@article{Naik_2021,
author = {Siya Naik and Avina Almeida and Shubham Lotliker},
title = {Examining the Best Speech-to-Text Method for Audio Files in Podcasting},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {10 2021},
volume = {9},
number = {5},
month = {10},
year = {2021},
issn = {2347-2693},
pages = {25-29},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2553},
publisher = {IJCSE, Indore, INDIA},
}

RIS Style Citation:
TY - JOUR
UR - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=2553
TI - Examining the Best Speech-to-Text Method for Audio Files in Podcasting
T2 - International Journal of Scientific Research in Computer Science and Engineering
AU - Siya Naik
AU - Avina Almeida
AU - Shubham Lotliker
PY - 2021
DA - 2021/10/31
PB - IJCSE, Indore, INDIA
SP - 25
EP - 29
IS - 5
VL - 9
SN - 2347-2693
ER -

Abstract:
Podcasting is an effective way to share insights or opinions on any topic with an audience. Recording a podcast has conventionally required both parties to be physically present at the same location. The pandemic made this difficult, so podcasts are now recorded on online platforms. The drawbacks of online recording are noise in the audio files and miscommunication between the parties. We use the Spectral Gating method to remove the noise. This paper then compares several pretrained speech-to-text models for converting the audio to text. In an experiment on various audio files, the SpeechRecognition pretrained model achieved the best accuracy.
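
The abstract names Spectral Gating as the noise-removal step but this page does not describe the implementation. The following is a minimal sketch of the general idea only, not the authors' code: it estimates a per-frequency threshold from a noise-only clip and zeroes every time-frequency bin of the recording that falls below it. The function name spectral_gate and the parameters n_fft, hop and n_std are assumptions chosen for illustration, and a hard on/off mask is used where practical tools usually smooth the mask.

```python
import numpy as np


def spectral_gate(signal, noise_clip, n_fft=512, hop=128, n_std=1.5):
    """Zero time-frequency bins below a threshold learned from a noise clip.

    Hypothetical helper for illustration; parameter defaults are assumptions.
    """
    if n_fft % hop != 0:
        raise ValueError("n_fft must be a multiple of hop")
    if len(noise_clip) < n_fft:
        raise ValueError("noise_clip must contain at least n_fft samples")
    window = np.hanning(n_fft)

    def stft(x):
        # Slice x into overlapping windowed frames and take the FFT of each.
        n_frames = 1 + (len(x) - n_fft) // hop
        idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
        return np.fft.rfft(x[idx] * window, axis=1)

    # Per-frequency threshold: mean plus n_std standard deviations
    # of the noise magnitude spectrum.
    noise_mag = np.abs(stft(np.asarray(noise_clip, dtype=float)))
    threshold = noise_mag.mean(axis=0) + n_std * noise_mag.std(axis=0)

    # Zero-pad so that every input sample is covered by several frames.
    signal = np.asarray(signal, dtype=float)
    pad_end = n_fft + (-len(signal)) % hop
    padded = np.concatenate([np.zeros(n_fft), signal, np.zeros(pad_end)])

    # Gate: keep only bins whose magnitude exceeds the noise threshold.
    spec = stft(padded)
    gated = spec * (np.abs(spec) > threshold)

    # Inverse FFT and weighted overlap-add back to a waveform.
    frames = np.fft.irfft(gated, n=n_fft, axis=1) * window
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))
    for i, frame in enumerate(frames):
        out[i * hop:i * hop + n_fft] += frame
        norm[i * hop:i * hop + n_fft] += window ** 2
    out /= np.maximum(norm, 1e-8)
    return out[n_fft:n_fft + len(signal)]


if __name__ == "__main__":
    # Synthetic demo: a 440 Hz tone with added white noise (made-up data).
    rng = np.random.default_rng(0)
    t = np.arange(16000) / 16000
    clean = np.sin(2 * np.pi * 440 * t)
    noisy = clean + 0.05 * rng.standard_normal(t.size)
    noise_only = 0.05 * rng.standard_normal(8000)
    denoised = spectral_gate(noisy, noise_only)
    print("MSE before:", np.mean((noisy - clean) ** 2))
    print("MSE after: ", np.mean((denoised - clean) ** 2))
```

In this sketch the noise clip would be a stretch of the recording in which nobody is speaking. Raising n_std removes more noise at the risk of cutting quiet speech.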

Keywords / Index Terms:
Podcast; Subtitles; Spectral Gating; Speech-to-Text; Silero; Vosk; Mozilla DeepSpeech; SpeechRecognition; Word Error Rate; Accuracy.
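
The keywords list Word Error Rate and Accuracy as the evaluation measures, but this page does not give the formulas used. As a sketch under the standard definition, WER is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and the model output, divided by the number of reference words. The function names below are hypothetical, and treating accuracy as 1 - WER is an assumption rather than something the abstract states.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must not be empty")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def accuracy(reference, hypothesis):
    # Assumption: accuracy is reported as 1 - WER, floored at zero.
    return max(0.0, 1.0 - word_error_rate(reference, hypothesis))
```

To compare models this way, each model's transcript of the same audio file would be scored against one human-written reference transcript, and the model with the lowest WER would rank best.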
