Webbots, Spiders, and Screen Scrapers (Paperback)

Michael Schrenk

  • Publisher: No Starch Press
  • Publication date: 2007-03-30
  • List price: $1,320
  • Sale price: $299 (about 23% of list price)
  • Language: English
  • Pages: 306
  • Binding: Paperback
  • ISBN: 1593271204
  • ISBN-13: 9781593271206

Ships immediately (limited quantity; 1 in stock)

Description

The Internet is bigger and better than what a mere browser allows. Webbots, Spiders, and Screen Scrapers is for programmers and businesspeople who want to take full advantage of the vast resources available on the Web. There's no reason to let browsers limit your online experience, especially when you can easily automate online tasks to suit your individual needs.

Learn how to write webbots and spiders that do all this and more:

  • Programmatically download entire websites
  • Effectively parse data from web pages
  • Manage cookies
  • Decode encrypted files
  • Automate form submissions
  • Send and receive email
  • Send SMS alerts to your cell phone
  • Unlock password-protected websites
  • Automatically bid in online auctions
  • Exchange data with FTP and NNTP servers

Sample projects using standard code libraries reinforce these new skills. You'll learn how to create your own webbots and spiders that track online prices, aggregate different data sources into a single web page, and archive the online data you just can't live without. You'll learn inside information from an experienced webbot developer on how and when to write stealthy webbots that mimic human behavior, tips for developing fault-tolerant designs, and various methods for launching and scheduling webbots. You'll also get advice on how to write webbots and spiders that respect website owners' property rights, plus techniques for shielding websites from unwanted robots. Some tasks are just too tedious (or too important!) to leave to humans. Once you've automated your online life, you'll never let a browser limit the way you use the Internet again.

Table of Contents

Introduction

PART I: FUNDAMENTAL CONCEPTS AND TECHNIQUES
Chapter 1: What's in It for You?
Chapter 2: Ideas for Webbots
Chapter 3: Downloading Web Pages
Chapter 4: Parsing Techniques
Chapter 5: Automating Form Submission
Chapter 6: Managing Large Amounts of Data

PART II: PROJECTS
Chapter 7: Price-Monitoring Webbots
Chapter 8: Image-Capturing Webbots
Chapter 9: Link-Verification Webbots
Chapter 10: Anonymous Browsing Webbots
Chapter 11: Search-Ranking Webbots
Chapter 12: Aggregation Webbots
Chapter 13: FTP Webbots
Chapter 14: NNTP News Webbots
Chapter 15: Webbots That Read Email
Chapter 16: Webbots That Send Email
Chapter 17: Converting a Website into a Function

PART III: ADVANCED TECHNICAL CONSIDERATIONS
Chapter 18: Spiders
Chapter 19: Procurement Webbots and Snipers
Chapter 20: Webbots and Cryptography
Chapter 21: Authentication
Chapter 22: Advanced Cookie Management
Chapter 23: Scheduling Webbots and Spiders

PART IV: LARGER CONSIDERATIONS
Chapter 24: Designing Stealthy Webbots and Spiders
Chapter 25: Writing Fault-Tolerant Webbots
Chapter 26: Designing Webbot-Friendly Websites
Chapter 27: Killing Spiders
Chapter 28: Keeping Webbots out of Trouble
Appendix A: PHP/CURL Reference
Appendix B: Status Codes
Appendix C: SMS Email Addresses
Index