Learning Apache Spark 2

Muhammad Asif Abbasi

  • Publisher: Packt Publishing
  • Publication date: 2017-03-24
  • List price: $1,830
  • VIP price: $1,739 (5% off)
  • Language: English
  • Pages: 356
  • Binding: Paperback
  • ISBN: 1785885138
  • ISBN-13: 9781785885136
  • Related categories: Spark
  • Ordered from the supplier once you place your order (about 3 to 4 weeks)

Product Description

Key Features

  • An exclusive guide to getting up and running with fast data processing using Apache Spark
  • Explore what Apache Spark can do through the real-world use cases in this book
  • Want to process data efficiently in real time? This book aims to be your one-stop solution.

Book Description

The Spark juggernaut keeps rolling, gaining momentum each day. Spark provides key capabilities in the form of Spark SQL, Spark Streaming, Spark ML, and GraphX, all accessible via Java, Scala, Python, and R. Deploying these capabilities is crucial, whether Spark runs in standalone mode or as part of an existing Hadoop installation, along with configuring it for YARN and Mesos.

The next part of the journey after installation is using the key components and APIs: clustering, the machine learning APIs, data pipelines, and parallel programming. It is important to understand why each framework component matters, how widely it is used, how stable it is, and its pertinent use cases.

Once we understand the individual components, we will work through a couple of real-life advanced analytics examples, such as "Building a recommendation system" and "Predicting customer churn".

The objective of these real-life examples is to give the reader confidence in using Spark for real-world problems.

What you will learn

  • Get an overview of big data analytics and its importance for organizations and data professionals
  • Delve into Spark to see how it is different from existing processing platforms
  • Understand the intricacies of various file formats and how to process them with Apache Spark
  • Learn how to deploy Spark with YARN and Mesos
