Perl & LWP

Sean M. Burke

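  • Publisher: O'Reilly
  • Publication date: 2002-07-30
  • List price: $1,540
  • VIP price: $1,463 (5% off)
  • Language: English
  • Pages: 262
  • Binding: Paperback
  • ISBN: 0596001789
  • ISBN-13: 9780596001780
  • Related category: Perl programming language
  • Overseas special-order title (requires separate checkout)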

Product Description

Perl soared to popularity as a language for creating and managing web content, but with LWP (Library for WWW in Perl), Perl is equally adept at consuming information on the Web. LWP is a suite of modules for fetching and processing web pages.

The Web is a vast data source that contains everything from stock prices to movie credits, and with LWP all that data is just a few lines of code away. Anything you do on the Web, whether buying or selling, reading or writing, uploading or downloading, from news to e-commerce, can be controlled with Perl and LWP. You can automate Web-based purchase orders as easily as you can set up a program to download MP3 files from a web site.

Perl & LWP covers:

  • Understanding LWP and its design
  • Fetching and analyzing URLs
  • Extracting information from HTML using regular expressions and tokens
  • Working with the structure of HTML documents using trees
  • Setting and inspecting HTTP headers and response codes
  • Managing cookies
  • Accessing information that requires authentication
  • Extracting links
  • Cooperating with proxy caches
  • Writing web spiders (also known as robots) in a safe fashion


Perl & LWP includes many step-by-step examples that show how to apply the various techniques. Programs to extract information from the web sites of BBC News, Altavista, ABEBooks.com, and the Weather Underground, to name just a few, are explained in detail, so that you understand how and why they work.

Perl programmers who want to automate and mine the web can pick up this book and be immediately productive. Written by a contributor to LWP, and with a foreword by one of LWP's creators, Perl & LWP is the authoritative guide to this powerful and popular toolkit.

Table of Contents

Foreword

Preface

1. Introduction to Web Automation

2. Web Basics

3. The LWP Class Model

4. URLs

5. Forms

6. Simple HTML Processing with Regular Expressions

7. HTML Processing with Tokens

8. Tokenizing Walkthrough

9. HTML Processing with Trees

10. Modifying HTML with Trees

11. Cookies, Authentication, and Advanced Requests

12. Spiders

A. LWP Modules

B. HTTP Status Codes

C. Common MIME Types

D. Language Tags

E. Common Content Encodings

F. ASCII Table

G. User's View of Object-Oriented Modules

Index
