Automated Data Collection with R: A Practical Guide to Web Scraping and Text Mining (Hardcover)

Simon Munzert, Christian Rubba, Peter Meißner, Dominic Nyhuis



A hands-on guide to web scraping and text mining for both beginners and experienced users of R.

  • Introduces the fundamental concepts behind the architecture of the web and of databases, covering HTTP, HTML, XML, JSON, and SQL.
  • Provides basic techniques to query web documents and data sets (XPath and regular expressions).
  • Presents an extensive set of exercises to guide the reader through each technique.
  • Explores both supervised and unsupervised text mining techniques, as well as advanced techniques such as data scraping and text management.
  • Case studies are featured throughout along with examples for each technique presented.
  • R code and solutions to exercises featured in the book are provided on a supporting website.