Big Data Analytics with Spark and Hadoop

Price: 960 TWD
Availability: InStock
Author: Venkat Ankam
ISBN: 1785884697

Venkat Ankam

Publisher: Packt Publishing
Publication date: 2016-09-29
List price: $1,600
Sale price: $960 (40% off)
Language: English
Pages: 326
Binding: Paperback
ISBN: 1785884697
ISBN-13: 9781785884696
Related categories: Hadoop, Spark, Big Data, Data Science
Related translation: Spark與Hadoop大數據分析 (Big Data Analytics) (Simplified Chinese edition)

Ships immediately (stock = 1)

Product Description

Key Features

This book is based on the latest 2.0 version of Apache Spark and the 2.7 version of Hadoop, integrated with the most commonly used tools.
Learn all Spark stack components, including the latest topics such as DataFrames, Datasets, GraphFrames, Structured Streaming, DataFrame-based ML Pipelines, and SparkR.
Integrations with frameworks such as HDFS and YARN, and with tools such as Jupyter, Zeppelin, NiFi, Mahout, HBase Spark Connector, GraphFrames, H2O, and Hivemall.

Book Description

This book aims to provide the fundamentals of Apache Spark and Hadoop. All Spark components (Spark Core, Spark SQL, DataFrames, Datasets, conventional streaming, Structured Streaming, MLlib, and GraphX) and the Hadoop core components (HDFS, MapReduce, and YARN) are explored in depth, with implementation examples on Spark + Hadoop clusters.

Big data analytics is moving away from MapReduce toward Spark, so the advantages of Spark over MapReduce are explained in depth to help readers reap the benefits of in-memory speeds. The DataFrames API, the Data Sources API, and the new Dataset API are explained for building big data analytical applications. Real-time data analytics using Spark Streaming with Apache Kafka and HBase is covered to help readers build streaming applications. The new Structured Streaming concept is explained with an IoT (Internet of Things) use case. Machine learning techniques are covered using MLlib, ML Pipelines, and SparkR, and graph analytics is covered with the GraphX and GraphFrames components of Spark.

Readers will also get an opportunity to get started with web-based notebooks such as Jupyter and Apache Zeppelin, and with the dataflow tool Apache NiFi, to analyze and visualize data.

What you will learn

Learn and implement the tools and techniques of big data analytics using Spark on Hadoop clusters, along with a wide variety of tools used with Spark and Hadoop
Understand all the Hadoop and Spark ecosystem components
Get to know all the Spark components: Spark Core, Spark SQL, DataFrames, Datasets, conventional and Structured Streaming, MLlib, ML Pipelines, and GraphX
See batch and real-time data analytics using Spark Core, Spark SQL, and conventional and Structured Streaming
Get to grips with data science and machine learning using MLlib, ML Pipelines, H2O, Hivemall, GraphX, and SparkR

About the Author

Venkat Ankam has over 18 years of IT experience and over 5 years in big data technologies, working with customers to design and develop scalable big data applications. Having worked with multiple clients globally, he has tremendous experience in big data analytics using Hadoop and Spark.

He is a Cloudera Certified Hadoop Developer and Administrator and also a Databricks Certified Spark Developer. He is the founder and presenter of a few Hadoop and Spark meetup groups globally and loves to share knowledge with the community.

Venkat has delivered hundreds of trainings, presentations, and white papers in the big data sphere. While this is his first attempt at writing a book, many more books are in the pipeline.

Table of Contents

Big Data Analytics at a 10,000-Foot View
Getting Started with Apache Hadoop and Apache Spark
Deep Dive into Apache Spark
Big Data Analytics with Spark SQL, DataFrames, and Datasets
Real-Time Analytics with Spark Streaming and Structured Streaming
Notebooks and Dataflows with Spark and Hadoop
Machine Learning with Spark and Hadoop
Building Recommendation Systems with Spark and Mahout
Graph Analytics with GraphX
Interactive Analytics with SparkR
