Cognitively Inspired Audiovisual Speech Filtering: Towards an Intelligent, Fuzzy Based, Multimodal, Two-Stage Speech Enhancement System (SpringerBriefs in Cognitive Computation)

Andrew Abel

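  • Publisher: Springer
  • Publication date: 2015-08-19
  • List price: $2,500
  • Member price: $2,375 (5% off)
  • Language: English
  • Pages: 140
  • Binding: Paperback
  • ISBN: 3319135082
  • ISBN-13: 9783319135083
  • Related categories: Speech recognition
  • Overseas special-order title (requires separate checkout)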

Product description

This book summarizes the cognitively inspired basis of multimodal speech enhancement, covering the relationship between the audio and visual modalities in speech as well as recent research into audiovisual speech correlation. It discusses a number of audiovisual speech filtering approaches that make use of this relationship. It then presents a novel multimodal speech enhancement system that uses both visual and audio information to filter speech, and explores extending this system with fuzzy logic to demonstrate an initial implementation of an autonomous, adaptive, and context-aware multimodal system. The book also discusses the challenges of testing such a system and the limitations of many current audiovisual speech corpora, and outlines a suitable approach to developing a corpus designed to test this novel, cognitively inspired speech filtering system.
