Moving Hadoop to the Cloud: Harnessing Cloud Features and Flexibility for Hadoop Clusters

Bill Havanki

  • Publisher: O'Reilly
  • Publication date: 2017-08-22
  • List price: $1,550
  • VIP price: $1,473 (5% off)
  • Language: English
  • Pages: 338
  • Binding: Paperback
  • ISBN: 1491959630
  • ISBN-13: 9781491959633
  • Related categories: Hadoop
  • Overseas special-order title (requires separate checkout)

Product description

Until recently, Hadoop deployments existed on hardware owned and run by organizations. Now, of course, you can acquire the computing resources and network connectivity to run Hadoop clusters in the cloud. But there’s a lot more to deploying Hadoop to the public cloud than simply renting machines.

This hands-on guide shows developers and systems administrators familiar with Hadoop how to install, use, and manage cloud-born clusters efficiently. You’ll learn how to architect clusters that work with cloud-provider features—not just to avoid pitfalls, but also to take full advantage of these services. You’ll also compare the Amazon, Google, and Microsoft clouds, and learn how to set up clusters in each of them.

  • Learn how Hadoop clusters run in the cloud, the problems they can help you solve, and their potential drawbacks
  • Examine the common concepts of cloud providers, including compute capabilities, networking and security, and storage
  • Build a functional Hadoop cluster on cloud infrastructure, and learn what the major providers require
  • Explore use cases for high availability, relational data with Hive, and complex analytics with Spark
  • Get patterns and practices for running cloud clusters, from designing for price and security to dealing with maintenance
