Speech Recognition Over Digital Channels: Robustness and Standards
Antonio Peinado, Jose Segura
Automatic speech recognition (ASR) is a very attractive means for human-machine interaction. The degree of maturity reached by speech recognition technologies in recent years allows the development of applications that use them. In particular, ASR shows enormous potential in mobile environments, where devices such as mobile phones or PDAs are used, and for Internet Protocol (IP) applications.
Speech Recognition Over Digital Channels is the first book of its kind to offer a complete understanding of the system, addressing distributed and network-based speech recognition issues and standards, the concepts of speech processing and transmission, and system architectures and robustness.
- Describes the different client/server architectures for remote speech recognition systems, by means of which the client transmits speech parameters through a digital channel to a remote recognition server
- Focuses on robustness against both adverse acoustic environments (in the front-end) and bit errors/packet loss
- Discusses four ETSI standards for distributed speech recognition, explaining the standards and the technologies behind them
- Provides the necessary background for the comprehension of remote speech recognition technologies
This book will appeal to a wide-ranging audience: engineers using speech recognition systems, researchers involved in ASR systems, and those interested in processing and transmitting speech, such as the signal processing and communications communities. It will also be of interest to technical experts requiring an understanding of recognition over mobile and IP networks, and to postgraduate students working on robust speech processing.
Table of Contents
1.2 RSR over Digital Channels.
1.3 Organization of the Book.
2 Speech Recognition with HMMs.
2.2 Some General Issues.
2.3 Analysis of Speech Signals.
2.4 Vector Quantization.
2.5 Approaches to ASR.
2.6 Hidden Markov Models.
2.7 Application of HMMs to Speech Recognition.
2.8 Model Adaptation.
2.9 Dealing with Uncertainty.
3 Networks and Degradation.
3.2 Mobile and Wireless Networks.
3.3 IP Networks.
3.4 The Acoustic Environment.
4 Speech Compression and Architectures for RSR.
4.2 Speech Coding.
4.3 Recognition from Decoded Speech.
4.4 Recognition from Codec Parameters.
4.5 Distributed Speech Recognition.
4.6 Comparison between NSR and DSR.
5 Robustness Against Transmission Channel Errors.
5.2 Channel Coding Techniques.
5.3 Error Concealment (EC).
6 Front-end Processing for Robust Feature Extraction.
6.2 Noise Reduction Techniques.
6.3 Voice Activity Detection.
6.4 Feature Normalization.
7 Standards for Distributed Speech Recognition.
7.2 Signal Preprocessing.
7.3 Feature Extraction.
7.4 Feature Compression and Encoding.
7.5 Feature Decoding and Postprocessing.
A Alternative Representations of the LPC Coefficients.
B Basic Digital Modulation Concepts.
C Review of Channel Coding Techniques.
C.1 Media-independent FEC.
List of Acronyms.