Spark: Big Data Cluster Computing in Production (Paperback)

Ilya Ganelin

  • Publisher: Wiley
  • Publication date: 2016-03-21
  • List price: $1,650
  • Sale price: $1,568 (5% off)
  • Language: English
  • Pages: 216
  • Binding: Paperback
  • ISBN: 1119254019
  • ISBN-13: 9781119254010
  • Related categories: Spark, Big Data
  • Ships immediately (in stock: 1)


Product description

Production-targeted Spark guidance with real-world use cases

Spark: Big Data Cluster Computing in Production goes beyond general Spark overviews to provide targeted guidance toward using lightning-fast big-data clustering in production. Written by an expert team well known in the big data community, this book walks you through the challenges of moving from proof-of-concept or demo Spark applications to live Spark in production. Real use cases provide deep insight into common problems, limitations, challenges, and opportunities, while expert tips and tricks help you get the most out of Spark performance. Coverage includes Spark SQL, Tachyon, Kerberos, MLlib, YARN, and Mesos, with clear, actionable guidance on resource scheduling, database connectors, streaming, security, and much more.

Spark has become the tool of choice for many Big Data problems, with more active contributors than any other Apache Software Foundation project. General introductory books abound, but this book is the first to provide deep insight and real-world advice on using Spark in production. Specific guidance, expert tips, and invaluable foresight make this guide an incredibly useful resource for real production settings.

  • Review Spark hardware requirements and estimate cluster size
  • Gain insight from real-world production use cases
  • Tighten security, schedule resources, and fine-tune performance
  • Overcome common problems encountered using Spark in production

Spark works with other big data tools, including MapReduce and Hadoop, and uses languages you already know, such as Java, Scala, Python, and R. Lightning speed makes Spark too good to pass up, but understanding limitations and challenges in advance goes a long way toward easing actual production implementation. Spark: Big Data Cluster Computing in Production tells you everything you need to know, with real-world production insight and expert guidance, tips, and tricks.
