Mining the Web: Discovering Knowledge from Hypertext Data

Soumen Chakrabarti


Product Description

Mining the Web: Discovering Knowledge from Hypertext Data is the first book devoted entirely to techniques for producing knowledge from the vast body of unstructured Web data. Building on an initial survey of infrastructural issues--including Web crawling and indexing--Chakrabarti examines low-level machine learning techniques as they relate specifically to the challenges of Web mining. He then devotes the final part of the book to applications that unite infrastructure and analysis to bring machine learning to bear on systematically acquired and stored data. Here the focus is on results: the strengths and weaknesses of these applications, along with their potential as foundations for further progress. From Chakrabarti's work--painstaking, critical, and forward-looking--readers will gain the theoretical and practical understanding they need to contribute to the Web mining effort.

Contents


Preface
Introduction
I. Infrastructure: Crawling the Web; Web search
II. Learning: Similarity and clustering; Supervised learning for text; Semi-supervised learning
III. Applications: Social network analysis; Resource discovery; The future of Web mining
