Natural Language Processing for Social Media (Synthesis Lectures on Human Language Technologies)

Atefeh Farzindar, Diana Inkpen

Product Description

In recent years, online social networking has revolutionized interpersonal communication. Recent research on language analysis in social media has increasingly focused on its impact on our daily lives, at both the personal and the professional level. Natural language processing (NLP) is one of the most promising avenues for social media data processing. It is a scientific challenge to develop powerful methods and algorithms that extract relevant information from a large volume of data coming from multiple sources and languages, in various formats or in free form. We discuss the challenges in analyzing social media texts in contrast with traditional documents. Research methods in information extraction, automatic categorization and clustering, automatic summarization and indexing, and statistical machine translation need to be adapted to a new kind of data. This book reviews the current research on NLP tools and methods for processing the non-traditional information from social media data that is available in large amounts (big data), and shows how innovative NLP approaches can integrate appropriate linguistic information in various fields such as social media monitoring, healthcare, business intelligence, industry, marketing, and security and defence. We review the existing evaluation metrics for NLP and social media applications, and the new efforts in evaluation campaigns or shared tasks on new datasets collected from social media. Such tasks are organized by the Association for Computational Linguistics (such as SemEval tasks) or by the National Institute of Standards and Technology via the Text REtrieval Conference (TREC) and the Text Analysis Conference (TAC). In the concluding chapter, we discuss the importance of this dynamic discipline and its great potential for NLP in the coming decade, in the context of changes in mobile technology, cloud computing, virtual reality, and social networking.
In this second edition, we have added information about recent progress in the tasks and applications presented in the first edition. We discuss new methods and their results. The number of research projects and publications that use social media data is constantly increasing due to continuously growing amounts of social media data and the need to automatically process them. We have added 85 new references to the more than 300 references from the first edition. Besides updating each section, we have added a new application (digital marketing) to the section on media monitoring and we have augmented the section on healthcare applications with an extended discussion of recent research on detecting signs of mental illness from social media.
