Conference papers

A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect

Abstract : This study presents a large-scale benchmark of four cloud-based Speech-To-Text (STT) systems: Google Cloud Speech-To-Text, Microsoft Azure Cognitive Services, Amazon Transcribe, and IBM Watson Speech to Text. For each system, 40,158 clean and noisy speech files (about 101 hours) were tested. The effect of background noise on STT quality was also evaluated at 5 different signal-to-noise ratios, from 40 dB to 0 dB. The results showed that Microsoft Azure gave the lowest transcription error rate, 9.09% on clean speech, with high robustness to noisy environments. Google Cloud and Amazon Transcribe performed similarly, but the latter is very limited for time-constrained usage. Although IBM Watson worked correctly in quiet conditions, it is highly sensitive to noisy speech, which could strongly limit its application in real-life situations.
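
The abstract does not say how the noisy files were produced or exactly how the transcription error rate was computed. As a rough illustration only, the sketch below assumes noise is scaled by average power to reach a target signal-to-noise ratio and that the error rate is a word-level edit distance divided by the reference length. The function names `mix_at_snr` and `word_error_rate` are hypothetical and do not come from the paper.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add `noise` to `speech` after scaling it to the requested SNR (in dB).

    Assumption: SNR is defined on average power over the whole signal,
    SNR_dB = 10 * log10(P_speech / P_noise). The paper may use another definition.
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    if p_speech == 0 or p_noise == 0:
        raise ValueError("both signals must have non-zero power")
    # Gain that brings the noise power down (or up) to P_speech / 10^(SNR/10).
    gain = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # substitution or match
        prev = cur
    return prev[-1] / len(ref)
```

Under these assumptions, a benchmark loop would call `mix_at_snr` once per target ratio between 40 dB and 0 dB, send each mixed file to a service, and average `word_error_rate` over the returned transcripts.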
Document type : Conference papers

https://hal.mines-ales.fr/hal-03277773
Contributor : Administrateur Imt - Mines Alès
Submitted on : Monday, July 5, 2021 - 9:06:47 AM
Last modification on : Friday, October 22, 2021 - 2:38:02 PM

Identifiers

  • HAL Id : hal-03277773, version 1
  • ARXIV : 2105.03409

Citation

Binbin Xu, Chongyang Tao, Zidu Feng, Youssef Raqui, Sylvie Ranwez. A Benchmarking on Cloud based Speech-To-Text Services for French Speech and Background Noise Effect. APIA 2021 - Conférence Nationale sur les Applications Pratiques de l’Intelligence Artificielle (event affiliated with PFIA 2021), Jun 2021, Bordeaux, France. pp. 102-107. ⟨hal-03277773⟩
