Apache Hadoop 3 Quick Start Guide: Learn about big data processing and analytics
Hrishikesh Vijay Karambelkar
- Publisher: Packt Publishing
- Publication date: 2018-10-31
- Price: $1,290
- VIP price: 5% off, $1,226
- Language: English
- Pages: 220
- Binding: Paperback
- ISBN: 1788999835
- ISBN-13: 9781788999830
Product Description
A fast-paced guide that will help you learn about Apache Hadoop 3 and its ecosystem
Key Features
- Set up, configure and get started with Hadoop to get useful insights from large data sets
- Work with the different components of Hadoop such as MapReduce, HDFS and YARN
- Learn about the new features introduced in Hadoop 3
Book Description
Apache Hadoop is a widely used distributed data platform. It enables large datasets to be processed efficiently across a cluster of machines, instead of relying on one large computer to store and process the data. This book will get you started with the Hadoop ecosystem and introduce you to the main technical topics, including MapReduce, YARN, and HDFS.
The book begins with an overview of big data and Apache Hadoop. Then, you will set up a pseudo-distributed Hadoop development environment and a multi-node enterprise Hadoop cluster. You will see how a parallel programming paradigm such as MapReduce can solve many complex data processing problems.
The book also covers the important aspects of the big data software development lifecycle, including quality assurance and control, performance, administration, and monitoring.
You will then learn about the Hadoop ecosystem and tools such as Kafka, Sqoop, Flume, Pig, Hive, and HBase. Finally, you will look at advanced topics, including real-time streaming using Apache Storm and data analytics using Apache Spark.
By the end of the book, you will be well versed in the different configurations of a Hadoop 3 cluster.
What you will learn
- Store and analyze data at scale using HDFS, MapReduce and YARN
- Install and configure Hadoop 3 in different modes
- Use YARN effectively to run different applications on a Hadoop-based platform
- Understand and monitor how a Hadoop cluster is managed
- Consume streaming data using Storm, and then analyze it using Spark
- Explore Apache Hadoop ecosystem components, such as Flume, Sqoop, HBase, Hive, and Kafka
Who this book is for
Aspiring big data professionals who want to learn the essentials of Hadoop 3 will find this book useful. Existing Hadoop users who want to get up to speed with the new features introduced in Hadoop 3 will also benefit from it. Knowledge of Java programming will be an added advantage.
Table of Contents
- Hadoop 3.0 - Background and Introduction
- Planning and Setting Up Hadoop Clusters
- Deep Dive into the Hadoop Distributed File System
- Developing MapReduce Applications
- Building Rich YARN Applications
- Monitoring and Administration of a Hadoop Cluster
- Demystifying Hadoop Ecosystem Components
- Advanced Topics in Apache Hadoop