Hadoop Real-World Solutions Cookbook, Second Edition

Tanmay Deshpande

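  • Publisher: Packt Publishing
  • Publication date: 2016-03-29
  • List price: $2,250
  • VIP price: $2,138 (5% off)
  • Language: English
  • Pages: 290
  • Binding: Paperback
  • ISBN: 1784395501
  • ISBN-13: 9781784395506
  • Related categories: Hadoop
  • Ordered from the supplier once the order is placed (about 3 to 4 weeks)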

Product Description

Over 100 hands-on recipes to help you learn and master the intricacies of Apache Hadoop 2.X, YARN, Hive, Pig, Oozie, Flume, Sqoop, Apache Spark, and Mahout

About This Book

  • Implement machine learning use cases on your own analytics models and processes.
  • Solve common problems that arise when working with the Hadoop ecosystem.
  • Follow step-by-step implementations of end-to-end big data use cases.

Who This Book Is For

Readers who have a basic knowledge of big data systems and want to advance their knowledge with hands-on recipes.

What You Will Learn

  • Install and maintain a Hadoop 2.X cluster and its ecosystem.
  • Write advanced MapReduce programs and understand design patterns.
  • Perform advanced data analysis using Hive, Pig, and MapReduce programs.
  • Import and export data from various sources using Sqoop and Flume.
  • Store data in various file formats such as Text, SequenceFile, Parquet, ORC, and RCFile.
  • Apply machine learning principles with libraries such as Mahout.
  • Process batch and stream data using Apache Spark.

In Detail

Handling big data is now a requirement for most organizations, which produce huge amounts of data every day. With the arrival of Hadoop-like tools, it has become easier to solve big data problems efficiently and at low cost. A grasp of machine learning techniques will help you build predictive models and use this data to make the right decisions for your organization.

Hadoop Real-World Solutions Cookbook teaches big data through recipes. The book explains most of the big data tools on the market and provides best practices for using them. Its recipes are based on the latest versions of Apache Hadoop 2.X, YARN, Hive, Pig, Sqoop, Flume, Apache Spark, Mahout, and many other ecosystem tools, and you can apply them to your own everyday issues. Each chapter provides in-depth recipes that can be referenced easily, with detailed practices for the latest technologies such as YARN and Apache Spark. On completing this book, readers will be able to consider themselves big data experts.

This guide is an invaluable tutorial if you are planning to implement a big data warehouse for your business.
