Learning Machine Translation

Cyril Goutte, Nicola Cancedda, Marc Dymetman, George Foster

Product Description

The Internet gives us access to a wealth of information in languages we don't understand. The investigation of automated or semi-automated approaches to translation has become a thriving research field with enormous commercial potential. This volume investigates how machine learning techniques can improve statistical machine translation, currently at the forefront of research in the field.

The book looks first at enabling technologies—technologies that solve problems that are not machine translation proper but are linked closely to the development of a machine translation system. These include the acquisition of bilingual sentence-aligned data from comparable corpora, automatic construction of multilingual name dictionaries, and word alignment. The book then presents new or improved statistical machine translation techniques, including a discriminative training framework for leveraging syntactic information, the use of semi-supervised and kernel-based learning methods, and the combination of multiple machine translation outputs in order to improve overall translation quality.

Contributors: Srinivas Bangalore, Nicola Cancedda, Josep M. Crego, Marc Dymetman, Jakob Elming, George Foster, Jesús Giménez, Cyril Goutte, Nizar Habash, Gholamreza Haffari, Patrick Haffner, Hitoshi Isahara, Stephan Kanthak, Alexandre Klementiev, Gregor Leusch, Pierre Mahé, Lluís Màrquez, Evgeny Matusov, I. Dan Melamed, Ion Muslea, Hermann Ney, Bruno Pouliquen, Dan Roth, Anoop Sarkar, John Shawe-Taylor, Ralf Steinberger, Joseph Turian, Nicola Ueffing, Masao Utiyama, Zhuoran Wang, Benjamin Wellington, Kenji Yamada

Neural Information Processing series
