Deep Speech: Scaling up end-to-end speech recognition. We present a state-of-the-art speech recognition system developed using end-to-end deep learning. Our architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, our system does not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learns a function that is robust to such effects. We do not need a phoneme dictionary, nor even the concept of a “phoneme.” Key to our approach is a well-optimized RNN training system that uses multiple GPUs, as well as a set of novel data synthesis techniques that allow us to efficiently obtain a large amount of varied data for training. Our system, called Deep Speech, outperforms previously published results on the widely studied Switchboard Hub5’00, achieving 16.0% error on the full test set. Deep Speech also handles challenging noisy environments better than widely used, state-of-the-art commercial speech systems.
References in zbMATH (referenced in 6 articles)
- Ramezani-Kebrya, Ali; Faghri, Fartash; Markov, Ilya; Aksenov, Vitalii; Alistarh, Dan; Roy, Daniel M.: NUQSGD: provably communication-efficient data-parallel SGD via nonuniform quantization (2021)
- Yang, Y.-Y.; Hira, M.; Ni, Z.; Chourdia, A.; Astafurov, A.; Chen, C.; Yeh, C.-F.; Puhrsch, C.; Pollack, D.; Genzel, D.; Greenberg, D.; Yang, E. Z.; Lian, J.; Mahadeokar, J.; Hwang, J.; Chen, J.; Goldsborough, P.; Roy, P.; Narenthiran, S.; Watanabe, S.; Chintala, S.; Quenneville-Bélair, V.; Shi, Y.: TorchAudio: Building Blocks for Audio and Speech Processing (2021) arXiv
- Dukov, Nikolay; Ganchev, Todor: Comparative evaluation of various activation functions in the recurrent neurons of the LRPNN (2020)
- Liu, Minliang; Liang, Liang; Sun, Wei: A generic physics-informed neural network-based constitutive model for soft biological tissues (2020)
- Sun, Luning; Gao, Han; Pan, Shaowu; Wang, Jian-Xun: Surrogate modeling for fluid flows based on physics-constrained deep learning without simulation data (2020)
- Liu, Minliang; Liang, Liang; Sun, Wei: Estimation of in vivo constitutive parameters of the aortic wall using a machine learning approach (2019)