Cross-Lingual Word Embeddings
Provisional Chinese title: 跨語言詞嵌入

Sogaard, Anders; Vulic, Ivan; Ruder, Sebastian

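  • Publisher: Morgan & Claypool
  • Publication date: 2019-06-04
  • List price: $2,090
  • Member price: $1,986 (5% off)
  • Language: English
  • Pages: 132
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1681730634
  • ISBN-13: 9781681730639
  • Related categories: Natural Language Processing
  • Overseas special-order title (requires separate checkout)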

Product Description

The majority of natural language processing (NLP) is English language processing, and while there is good language technology support for (standard varieties of) English, support for Albanian, Burmese, or Cebuano--and most other languages--remains limited.

Bridging this digital divide is important for scientific and democratic reasons, but it also represents enormous growth potential. A key challenge in making this happen is learning to align the basic meaning-bearing units of different languages.

In this book, the authors survey and discuss recent and historical work on supervised and unsupervised learning of such alignments. Specifically, the book focuses on so-called cross-lingual word embeddings. The survey is intended to be systematic, using consistent notation and putting the available methods into a comparable form, which makes it easy to compare wildly different approaches. In doing so, the authors establish previously unreported relations between these methods and are able to present a fast-growing literature in a very compact way. Furthermore, the authors discuss how best to evaluate cross-lingual word embedding methods and survey the resources available for students and researchers interested in this topic.
