Hadoop: The Definitive Guide, 2/e (Paperback)

Tom White

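  • Publisher: O'Reilly
  • Publication date: 2010-10-15
  • List price: $1,750
  • Sale price: $299 (17% of list price)
  • Language: English
  • Pages: 628
  • Binding: Paperback
  • ISBN: 1449389732
  • ISBN-13: 9781449389734
  • Related categories: Hadoop
  • Ships immediately (limited quantity) (stock = 1)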



Discover how Apache Hadoop can unleash the power of your data. This comprehensive resource shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework -- an open source implementation of MapReduce, the algorithm on which Google built its empire. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters.

This revised edition covers recent changes to Hadoop, including new features such as Hive, Sqoop, and Avro. It also provides illuminating case studies that illustrate how Hadoop is used to solve specific problems. Looking to get the most out of your data? This is your book.

  • Use the Hadoop Distributed File System (HDFS) for storing large datasets, then run distributed computations over those datasets with MapReduce
  • Become familiar with Hadoop’s data and I/O building blocks for compression, data integrity, serialization, and persistence
  • Discover common pitfalls and advanced features for writing real-world MapReduce programs
  • Design, build, and administer a dedicated Hadoop cluster, or run Hadoop in the cloud
  • Use Pig, a high-level query language for large-scale data processing
  • Analyze datasets with Hive, Hadoop’s data warehousing system
  • Take advantage of HBase, Hadoop’s database for structured and semi-structured data
  • Learn ZooKeeper, a toolkit of coordination primitives for building distributed systems

"Now you have the opportunity to learn about Hadoop from a master -- not only of the technology, but also of common sense and plain talk."

--Doug Cutting, Cloudera

