Practical Text Mining with Perl (Hardcover)

Roger Bilisoly

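  • Publisher: Wiley
  • Publication date: 2008-08-01
  • List price: $3,670
  • Sale price: $3,487 (5% off)
  • Language: English
  • Pages: 320
  • Binding: Hardcover
  • ISBN: 0470176431
  • ISBN-13: 9780470176436
  • Related categories: Perl programming language, Text mining
  • Ships immediately (stock: 1)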

Product description

Provides readers with the methods, algorithms, and means to perform text mining tasks

This book is devoted to the fundamentals of text mining using Perl, an open-source programming tool that is freely available via the Internet (www.perl.org). It covers mining ideas from several perspectives--statistics, data mining, linguistics, and information retrieval--and provides readers with the means to successfully complete text mining tasks on their own.

The book begins with an introduction to regular expressions, a text pattern methodology, and quantitative text summaries, all of which are fundamental tools for analyzing text. Then, it builds upon this foundation to explore:

  • Probability and texts, including the bag-of-words model
  • Information retrieval techniques such as the TF-IDF similarity measure
  • Concordance lines and corpus linguistics
  • Multivariate techniques such as correlation, principal components analysis, and clustering
  • Perl modules, German, and permutation tests

Each chapter is devoted to a single key topic, and the author carefully and thoughtfully introduces mathematical concepts as they arise, allowing readers to learn as they go without having to refer to additional books. The inclusion of numerous exercises and worked-out examples further complements the book's student-friendly format.

Practical Text Mining with Perl is ideal as a textbook for undergraduate and graduate courses in text mining and as a reference for a variety of professionals who are interested in extracting information from text documents.
