Text to Speech Synthesis: New Paradigms and Advances
Shrikanth Narayanan, Abeer Alwan
- Publisher: Prentice Hall PTR
- Publication date: 2004-08-03
- List price: $3,172
- VIP price: 5% off, $3,013
- Language: English
- Pages: 288
- Binding: Hardcover
- ISBN: 013145661X
- ISBN-13: 9780131456617
Recent advances in speech synthesis will enable the development of high-quality natural voice systems with broad application in education, business, entertainment, and medicine. Text to Speech Synthesis is the first book to comprehensively document these new research trends and paradigms, balancing coverage of research and applications. It brings together seminal research by leaders in the field, drawn from both academic and industrial laboratories worldwide.
The authors and editors offer broad coverage of several key areas, including new unit selection approaches, speech representations and modeling, data-driven synthesis schemes, and expressive speech synthesis.
- Unit Selection Methods: Reducing discontinuities at synthesis time in corpus-based speech processing, voice quality variation, and join costs
- Hidden Markov Model (HMM)-Based Synthesis: Advanced uses of speech recognition technology, HMM-based multilingual speech synthesis, and new prosody control techniques
- Expressive Speech Synthesis: Challenges, questions, and avenues of research, including diphone transplantation and minimization of pitch modification
- Speech Representation and Models: A new articulatory modeling paradigm for controlling synthesis quality
This is an essential resource for all researchers working in speech synthesis and related areas such as multimedia signal processing, linguistics, and spoken user interfaces. It will also be valuable to any engineer, developer, or manager who must evaluate the latest speech technologies or integrate them into practical applications.
Table of Contents:
1. Reducing Discontinuities at Synthesis Time for Corpus-Based Speech Synthesis.
Baris Bozkurt, Thierry Dutoit, Romain Prudon, Christophe D'Alessandro and Vincent Pagel.
Shift-Only F0 Smoothing.
Improving Quality of MBROLA Synthesis.
Discussions and Conclusion.
2. Voice Quality Variation in a Long-Term Recording of a Single Speaker Speech Corpus.
Hisashi Kawai and Minoru Tsuzaki.
Factors of Voice Quality Variation.
Candidates of Acoustic Correlates.
Prediction of Voice Quality Difference Scores.
3. Join Cost for Unit Selection Speech Synthesis.
Jithendra Vepa and Simon King.
Perceptual Listening Tests.
Results and Discussion.
4. Articulatory Modeling: A Role in Concatenative Text to Speech Synthesis.
M. Mohan Sondhi and Daniel J. Sinder.
Rule-Based Control of the Parameters.
Concatenative Articulatory Synthesis.
5. Minimizing The Amount of Pitch Modification in Speech Synthesis.
Esther Klabbers, Jan van Santen and Johan Wouters.
Speech Corpus Analysis.
Text Corpus Analysis.
6. The Use of Speech Recognition Technology in Speech Synthesis.
Mari Ostendorf and Ivan Bulyko.
ASR in Synthesis.
7. An HMM-Based Approach to Multilingual Speech Synthesis.
Keiichi Tokuda, Heiga Zen and Alan W. Black.
HMM-Based Speech Synthesis System.
F0 Pattern Modeling by HMM.
Speech-Parameter Generation from an HMM.
Implementation on Festival Architecture.
8. Prosody Control For HMM-Based Japanese TTS.
Koji Iwano, Masahiro Yamada, Taro Togawa and Sadaoki Furui.
Outline of HMM-Based TTS System.
Prosody Generation Using the Quantification Theory (Type 1).
Speech-Rate-Variable Synthesis Method.
9. Synthesizing Expressive Speech: Overview, Challenges, and Open Questions.
Murtaza Bulut, Shrikanth Narayanan and Lewis Johnson.
Theories of Emotion.
Dimensions of Emotional Space.
Speech Synthesis Methods.
Emotional Speech Data Collection.
Experimental Evaluation of Expressive Speech.
Presentation of Results From Case Studies.
Open Questions and Future Directions.
10. Unit Selection Synthesis of Prosody: Evaluation Using Diphone Transplantation.
Romain Prudon, Christophe D'Alessandro and Philippe Boula de Mareüil.
Computing Prosody by Selection.
11. Toward Expressive Synthetic Speech.
Ellen Eide, Raimo Bakis, Wael Hamza and John F. Pitrelli.
A Pilot Study For Generating Expressive Speech.
Generating Expressive Speech with Limited Resources.
Rule-Based Methods for Generating Expressive Speech.
Use of an Expressive TTS System.