Digital Speech Transmission: Enhancement, Coding and Error Concealment (Hardcover)

Peter Vary, Rainer Martin

  • Publisher: Wiley
  • Publication date: 2006-03-03
  • List price: $1,380
  • VIP price: $1,311 (5% off)
  • Language: English
  • Pages: 644
  • Binding: Hardcover
  • ISBN: 0471560189
  • ISBN-13: 9780471560180
  • Ordered from the supplier once your order is placed (approx. 5 to 7 days)





The enormous advances in digital signal processing (DSP) technology have contributed to the wide dissemination and success of speech communication devices – be it GSM and UMTS mobile telephones, digital hearing aids, or human-machine interfaces. Digital speech transmission techniques play an important role in these applications, all the more because high-quality speech transmission remains essential in all current and next-generation communication networks.

Enhancement, coding and error concealment techniques improve the transmitted speech signal at all stages of the transmission chain, from the acoustic front-end to the sound reproduction at the receiver. Advanced speech processing algorithms help to mitigate a number of physical and technological limitations such as background noise, bandwidth restrictions, shortage of radio frequencies, and transmission errors.

Digital Speech Transmission provides a single-source, comprehensive guide to the fundamental issues, algorithms, standards, and trends in speech signal processing and speech communication technology. The authors give a solid, accessible overview of

  • fundamentals of speech signal processing
  • speech coding, including new speech coders for GSM and UMTS
  • error concealment by soft decoding
  • artificial bandwidth extension of speech signals
  • single and multi-channel noise reduction
  • acoustic echo cancellation

This text is an invaluable resource for engineers, researchers, academics, and graduate students in the areas of communications, electrical engineering, and information technology.

Table of Contents

1 Introduction.

2 Models of Speech Production and Hearing.

2.1 Organs of Speech Production.

2.2 Characteristics of Speech Signals.

2.3 Model of Speech Production.

2.4 Anatomy of Hearing.

2.5 Performance of the Auditory Organs.


3 Spectral Transformations.

3.1 Fourier Transform of Continuous Signals.

3.2 Fourier Transform of Discrete Signals.

3.3 Linear Shift Invariant Systems.

3.4 The z-Transform.

3.5 The Discrete Fourier Transform.

3.6 Fast Convolution.

3.7 Cepstral Analysis.


4 Filter Banks for Spectral Analysis and Synthesis.

4.1 Spectral Analysis Using Narrow-Band Filters.

4.2 Polyphase Network Filter Banks.

4.3 Quadrature Mirror Filter Banks.


5 Stochastic Signals and Estimation.

5.1 Basic Concepts.

5.2 Expectations and Moments.

5.3 Bivariate Statistics.

5.4 Probability and Information.

5.5 Multivariate Statistics.

5.6 Stochastic Processes.

5.7 Estimation of Statistical Quantities by Time Averages.

5.8 Power Spectral Densities.

5.9 Estimation of the Power Spectral Density.

5.10 Statistical Properties of Speech Signals.

5.11 Statistical Properties of DFT Coefficients.

5.12 Optimal Estimation.


6 Linear Prediction.

6.1 Vocal Tract Models and Short-Term Prediction.

6.2 Optimal Prediction Coefficients for Stationary Signals.

6.3 Predictor Adaptation.

6.4 Long-Term Prediction.


7 Quantization.

7.1 Analog Samples and Digital Representation.

7.2 Uniform Quantization.

7.3 Non-uniform Quantization.

7.4 Optimal Quantization.

7.5 Adaptive Quantization.

7.6 Vector Quantization.

7.6.1 Principle.


8 Speech Coding.

8.1 Classification of Speech Coding Algorithms.

8.2 Model-Based Predictive Coding.

8.3 Differential Waveform Coding.

8.4 Parametric Coding.

8.5 Hybrid Coding.

8.6 Adaptive Postfiltering.


9 Error Concealment and Softbit Decoding.

9.1 Hardbit Source Decoding.

9.2 Conventional Error Concealment.

9.3 Softbits and L-Values.

9.4 Softbit Source Decoding (SD).

9.5 Application to Model Parameters.

9.6 Further Improvements.


10 Bandwidth Extension of Speech Signals (BWE).

10.1 Narrowband versus Wideband Telephony.

10.2 Speech Coding with Integrated BWE.

10.3 BWE without Auxiliary Transmission.


11 Single and Dual Channel Noise Reduction.

11.1 Introduction.

11.2 Linear MMSE Estimators.

11.3 Speech Enhancement in the DFT Domain.

11.4 Optimal Non-Linear Estimators.

11.5 Joint Optimum Detection and Estimation of Speech.

11.6 Computation of Likelihood Ratios.

11.7 Estimation of the A Priori Probability of Speech Presence.

11.8 VAD and Noise Estimation Techniques.

11.9 Dual-Channel Noise Reduction.


12 Multi-Channel Noise Reduction.

12.1 Introduction.

12.2 Spatial Sampling of Sound Fields.

12.3 Beamforming.

12.4 Performance Measures and Spatial Aliasing.

12.5 Design of Fixed Beamformers.

12.6 Adaptive Beamformers.


13 Acoustic Echo Control.

13.1 The Echo Control Problem.

13.2 Evaluation Criteria.

13.3 The Wiener Solution.

13.4 The LMS and NLMS Algorithm.

13.5 Convergence Analysis and Control of the LMS Algorithm.

13.6 Geometric Projection Interpretation of the NLMS Algorithm.

13.7 The Projection Algorithm.

13.8 Least-Squares and Recursive Least-Squares Algorithms.

13.9 Block Processing and Frequency-Domain Adaptive Filters.

13.9.1 Block LMS Algorithm.

13.10 Additional Measures for Echo Control.

13.11 Stereophonic Acoustic Echo Control.


A Codec Standards.

A.1 Evaluation Criteria.

A.2 ITU-T/G.726: Adaptive Differential Pulse-Code Modulation.

A.3 ITU-T/G.728: Low-Delay CELP Speech Coder.

A.4 ITU-T/G.729: Conjugate-Structure Algebraic CELP-Codec.

A.5 ITU-T/G.722: 7 kHz Audio Coding within 64 kbit/s.

A.6 ETSI-GSM 06.10: Full Rate Speech Transcoding.

A.7 ETSI-GSM 06.20: Half Rate Speech Transcoding.

A.8 ETSI-GSM 06.60: Enhanced Full Rate Speech Transcoding.

A.9 ETSI-GSM 06.90: Adaptive Multi-Rate (AMR) Codec.

A.10 ETSI/3GPP AMR Wideband Codec, AMR-WB.

A.11 ETSI/3GPP Extended AMR Wideband Codec, AMR-WB+.

A.12 TIA IS-96: Speech Service Option Standard for Wideband Spread-Spectrum Systems.

A.13 INMARSAT: Improved Multi-Band Excitation Codec (IMBE).

B Speech Quality Assessment.

B.1 Auditive Speech Quality Measures.

B.2 Instrumental Speech Quality Measures.