Advancing Information Access

Agichtein, Eugene

  • Publisher: Morgan & Claypool
  • Publication date: 2013-01-01
  • List price: $1,430
  • VIP price: $1,359 (5% off)
  • Language: English
  • Pages: 100
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1608454851
  • ISBN-13: 9781608454853
  • Overseas special-order title (requires separate checkout)

Product Description

Proliferation of the Internet and ubiquitous access to the Web enable millions of Web users to collaborate online on a variety of activities. Many of these activities result in the construction of large repositories of knowledge, either as their primary aim (e.g., Wikipedia) or as a by-product (e.g., Yahoo Answers). The unprecedented amounts of information in collaboratively generated content (CGC) enable new, knowledge-rich approaches to information access, which are significantly more powerful than the conventional word-based methods. Considerable progress has been made in this direction over the last few years. This lecture reviews influential examples of this line of research, including explicit manipulation of human-defined concepts and their use to augment the bag of words, using large-scale taxonomies of topics from Wikipedia or the Open Directory Project to construct additional class-based features, and identifying newly available word senses and examples of their usage for better word disambiguation. However, the quality and comprehensiveness of collaboratively created content vary drastically, and a significant amount of preprocessing, filtering, and organization is often necessary. Thus, not only can the content repositories be used to improve IR methods, but the reverse pollination is also possible, as better information extraction methods can be used for automatically collecting more knowledge or verifying the contributed content. This natural connection between modeling the generation process of CGC and effectively using the accumulated knowledge is further explored in this lecture.