Mastering Hadoop 3: Big Data processing at scale to unlock unique business insights

Chanchal Singh;Manish Kumar;Dr. Timothy Wong

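  • Publisher: Packt Publishing
  • Publication date: 2019-02-28
  • List price: $2,120
  • Member price: $2,014 (5% off)
  • Language: English
  • Pages: 797
  • Binding: Paperback
  • ISBN: 1788620445
  • ISBN-13: 9781788620444
  • Category: Hadoop
  • Overseas-order title (requires separate checkout)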

Product Description

Your guide to mastering the most advanced concepts of Hadoop 3

Key Features

  • Master the newly introduced features and capabilities of Hadoop 3 - the world's most popular Big Data ecosystem
  • Crunch and process your data with ease using MapReduce, YARN and a whole host of other tools within the Hadoop ecosystem
  • A highly practical book with real-world case studies and easy-to-understand code to help you master Hadoop

Book Description

Apache Hadoop is one of the most popular Big Data solutions for distributed storage and processing of large volumes of data. With Hadoop 3, Apache promises to bring a higher-performance, more fault-tolerant, and more efficient Big Data processing platform, with a focus on better scalability.

This comprehensive guide covers advanced concepts of the Hadoop ecosystem tools. You will learn how Hadoop works internally, study advanced concepts of different ecosystem tools, work through solutions to real-world use cases, and see how to secure your cluster. The book then walks you through advanced concepts of HDFS, YARN, MapReduce, and Hadoop 3. It also addresses common challenges such as how to use Kafka efficiently, how to design low-latency, reliable message delivery systems with Kafka, how to handle high data volumes, how to address the top-level concerns of building an enterprise-grade messaging system, and how to use different stream processing systems alongside Kafka to meet enterprise goals.

By the end of this book, you will understand how components in the Hadoop ecosystem are integrated to implement a fast and reliable data pipeline, and how to tackle the real-world problems that arise in such a pipeline.

What you will learn

  • Get an in-depth understanding of distributed computing using Hadoop 3
  • Develop enterprise-grade applications using Apache Spark, Flink, and more
  • Build distributed, scalable, reliable, and high-performance Hadoop data pipelines with security, monitoring, and data governance in place
  • Learn best practices for enterprises using or planning to use Hadoop 3 as a data platform

Who This Book Is For

If you want to become a Big Data professional by mastering the advanced concepts in Hadoop, this book is for you. If you're a Hadoop professional looking to strengthen your knowledge of the Hadoop ecosystem, this book will also help you. A fundamental knowledge of the Java programming language and some basics of Hadoop are required to get started with this book.
