Hadoop 2 Quick-Start Guide: Learn the Essentials of Big Data Computing in the Apache Hadoop 2 Ecosystem (Paperback)

Douglas Eadline

  • Publisher: Addison Wesley
  • Publication date: 2015-11-05
  • List price: $1,280
  • VIP price: $1,216 (5% off)
  • Language: English
  • Pages: 304
  • Binding: Paperback
  • ISBN: 0134049942
  • ISBN-13: 9780134049946
  • Related categories: Hadoop, Big Data
  • Ships immediately (in stock: 1)




Get Started Fast with Apache Hadoop® 2, YARN, and Today’s Hadoop Ecosystem


With Hadoop 2.x and YARN, Hadoop moves beyond MapReduce to become practical for virtually any type of data processing. Hadoop 2.x and the Data Lake concept represent a radical shift away from conventional approaches to data usage and storage. Hadoop 2.x installations offer unmatched scalability and breakthrough extensibility that supports new and existing Big Data analytics processing methods and models.


Hadoop® 2 Quick-Start Guide is the first easy, accessible guide to Apache Hadoop 2.x, YARN, and the modern Hadoop ecosystem. Building on his unsurpassed experience teaching Hadoop and Big Data, author Douglas Eadline covers all the basics you need to know to install and use Hadoop 2 on personal computers or servers, and to navigate the powerful technologies that complement it.


Eadline concisely introduces and explains every key Hadoop 2 concept, tool, and service, illustrating each with a simple “beginning-to-end” example and identifying trustworthy, up-to-date resources for learning more.


This guide is ideal if you want to learn about Hadoop 2 without getting mired in technical details. Douglas Eadline will bring you up to speed quickly, whether you’re a user, admin, devops specialist, programmer, architect, analyst, or data scientist.


Coverage Includes

  • Understanding what Hadoop 2 and YARN do, and how they improve on Hadoop 1 with MapReduce
  • Understanding Hadoop-based Data Lakes versus RDBMS Data Warehouses
  • Installing Hadoop 2 and core services on Linux machines, virtualized sandboxes, or clusters
  • Exploring the Hadoop Distributed File System (HDFS)
  • Understanding the essentials of MapReduce and YARN application programming
  • Simplifying programming and data movement with Apache Pig, Hive, Sqoop, Flume, Oozie, and HBase
  • Observing application progress, controlling jobs, and managing workflows
  • Managing Hadoop efficiently with Apache Ambari, including recipes for the HDFS to NFSv3 gateway, HDFS snapshots, and YARN configuration
  • Learning basic Hadoop 2 troubleshooting, and installing Apache Hue and Apache Spark


