Hadoop Essentials - Tackling the Challenges of Big Data with Hadoop

Shiva Achari

  • Publisher: Packt Publishing
  • Publication date: 2015-04-30
  • List price: $1,400
  • VIP price: $1,330 (5% off)
  • Language: English
  • Pages: 172
  • Binding: Paperback
  • ISBN: 1784396680
  • ISBN-13: 9781784396688
  • Related categories: Hadoop, Big Data
  • Overseas special-order title (requires separate checkout)
    No stock available


Key Features

  • Get to grips with the most powerful tools in the Hadoop ecosystem, including Storm and Spark
  • Learn everything you need to take control of Big Data
  • A fast-paced journey through the key features of Hadoop

Book Description

This book jumps into the world of Hadoop and its tools, to help you learn how to use them effectively to optimize and improve the way you handle Big Data.

Starting with the fundamentals of Hadoop (YARN, MapReduce, HDFS, and other vital elements in the Hadoop ecosystem), you will soon move on to topics such as MapReduce patterns, data management, and real-time data analysis using Hadoop. You will also explore a number of the leading data processing tools, including Hive and Pig, and learn how to use Sqoop and Flume, two of the most powerful technologies used for data ingestion. With further guidance on data streaming and real-time analytics with Storm and Spark, Hadoop Essentials is a reliable and relevant resource for anyone who understands the difficulties, and the opportunities, presented by Big Data today.

With this guide, you'll develop your confidence with Hadoop, and be able to use the knowledge and skills you learn to successfully harness its unparalleled capabilities.

What you will learn

  • Get to grips with the fundamentals of Hadoop, and tools such as HDFS, MapReduce, and YARN
  • Learn how to use Hadoop for real-world Big Data projects
  • Improve the performance of your Big Data architecture
  • Find out how to get the most from data processing tools such as Hive and Pig
  • Learn how to unlock real-time Big Data analytics with Apache Spark

About the Author

Shiva Achari has more than 8 years of extensive industry experience and is currently working as a Big Data Architect consultant with companies such as Oracle and Teradata. Over the years, he has architected, designed, and developed multiple innovative and high-performance large-scale solutions, such as distributed systems, data centers, big data management tools, SaaS cloud applications, Internet applications, and Data Analytics solutions.

Table of Contents

  1. Introduction to Big Data and Hadoop
  2. Hadoop Ecosystem
  3. Pillars of Hadoop: HDFS, MapReduce, and YARN
  4. Data Access Components: Hive and Pig
  5. Storage Component: HBase
  6. Data Ingestion in Hadoop: Sqoop and Flume
  7. Streaming and Real-time Analysis: Storm and Spark


