A Perspective on Single-Channel Frequency-Domain Speech Enhancement (Synthesis Lectures on Speech and Audio Processing)

Jacob Benesty, Yiteng Huang

Publisher: Morgan & Claypool
Publication date: 2011-03-01
List price: $1,570 TWD
Member price: 5% off, $1,491 TWD
Language: English
Pages: 110
Binding: Paperback
ISBN: 1608456986
ISBN-13: 9781608456987
Related categories: Speech Recognition

Overseas special-order book (requires separate checkout)

Product Description

This book focuses on a class of single-channel noise reduction methods that are performed in the frequency domain via the short-time Fourier transform (STFT). The simplicity and relative effectiveness of this class of approaches make them the dominant choice in practical systems. Even though many popular algorithms have been proposed through more than four decades of continuous research, there are a number of critical areas where our understanding and capabilities still remain quite rudimentary, especially with respect to the relationship between noise reduction and speech distortion. All existing frequency-domain algorithms, no matter how they are developed, have one feature in common: the solution is eventually expressed as a gain function applied to the STFT of the noisy signal only in the current frame. As a result, the narrowband signal-to-noise ratio (SNR) cannot be improved, and any gains achieved in noise reduction on the fullband basis come with a price to pay, which is speech distortion. In this book, we present a new perspective on the problem by exploiting the difference between speech and typical noise in circularity and interframe self-correlation, which were ignored in the past. By gathering the STFT of the microphone signal of the current frame, its complex conjugate, and the STFTs in the previous frames, we construct several new, multiple-observation signal models similar to a microphone array system: there are multiple noisy speech observations, and their speech components are correlated but not completely coherent while their noise components are presumably uncorrelated. Therefore, the multichannel Wiener filter and the minimum variance distortionless response (MVDR) filter that were usually associated with microphone arrays will be developed for single-channel noise reduction in this book. This might instigate a paradigm shift geared toward speech distortionless noise reduction techniques. 
Table of Contents: Introduction / Problem Formulation / Performance Measures / Linear and Widely Linear Models / Optimal Filters with Model 1 / Optimal Filters with Model 2 / Optimal Filters with Model 3 / Optimal Filters with Model 4 / Experimental Study
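
The description's central argument can be illustrated with a small numerical sketch. It is not taken from the book: all variances and the interframe correlation vector `rho` are hypothetical values chosen for illustration. The second part makes a strong simplifying assumption, namely that each stacked observation is just a scaled copy of the current-frame speech plus uncorrelated noise, which the book's four models may treat differently. Part 1 applies a single Wiener gain per frequency bin and shows that the narrowband SNR is unchanged while the fullband SNR rises. Part 2 applies an MVDR filter to three stacked observations in one bin and shows the narrowband SNR rising while the speech component passes with unit gain.

```python
import numpy as np

# --- Part 1: one gain per bin, current frame only -------------------------
# Hypothetical per-bin speech and noise variances (four frequency bins).
phi_x = np.array([4.0, 1.0, 0.25, 0.01])
phi_v = np.ones(4)

gain = phi_x / (phi_x + phi_v)          # classical Wiener gain per bin

# The same gain scales speech and noise within a bin, so the ratio cancels.
nb_snr_in = phi_x / phi_v
nb_snr_out = (gain**2 * phi_x) / (gain**2 * phi_v)

# Across bins, low-SNR bins are attenuated more, so the fullband SNR rises,
# at the cost of reshaping (distorting) the speech spectrum.
fb_snr_in = phi_x.sum() / phi_v.sum()
fb_snr_out = (gain**2 * phi_x).sum() / (gain**2 * phi_v).sum()

# --- Part 2: MVDR over stacked observations in a single bin ---------------
# Assumed model: y = rho * X + v, with rho[0] = 1 for the current frame.
# rho is a hypothetical normalized interframe correlation vector.
rho = np.array([1.0, 0.6 + 0.2j, 0.3 - 0.1j])
sigma_x2, sigma_v2 = 1.0, 1.0            # hypothetical variances
Phi_v = sigma_v2 * np.eye(3)             # noise assumed uncorrelated

# MVDR: minimize h^H Phi_v h subject to h^H rho = 1.
w = np.linalg.solve(Phi_v, rho)
h = w / np.vdot(rho, w)                  # np.vdot conjugates its first argument

speech_gain = np.vdot(h, rho)            # h^H rho, the distortionless constraint
mvdr_snr_in = sigma_x2 / sigma_v2
mvdr_snr_out = (sigma_x2 * abs(speech_gain) ** 2
                / np.real(np.vdot(h, Phi_v @ h)))

print("narrowband SNR in/out:", nb_snr_in, nb_snr_out)
print("fullband SNR in/out:  ", fb_snr_in, fb_snr_out)
print("MVDR speech gain:     ", speech_gain)
print("MVDR SNR in/out:      ", mvdr_snr_in, mvdr_snr_out)
```

Under these assumptions the MVDR output SNR equals the input SNR multiplied by the squared norm of `rho`, so the improvement depends entirely on how strongly speech is correlated across the stacked observations.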
