Learning Spark: Lightning-Fast Big Data Analysis (Paperback)
Holden Karau, Andy Konwinski, Patrick Wendell, Matei Zaharia
- Publisher: O'Reilly
- Publication date: 2015-02-27
- List price: $1,450
- VIP price: 5% off, $1,378
- Language: English
- Pages: 276
- Binding: Paperback
- ISBN: 1449358624
- ISBN-13: 9781449358624
Related categories:
Spark, Big Data, Data Science
Related translation:
Spark 學習手冊 (Learning Spark: Lightning-Fast Big Data Analysis) (Traditional Chinese edition)
Other editions:
Learning Spark: Lightning-Fast Data Analytics
Sales ranking:
🥈 No. 2 on the English-language bestseller list, June 2016
Customers who bought this also bought:
- Python for Data Analysis (Paperback), $840
- Advanced Analytics with Spark: Patterns for Learning from Data at Scale (Paperback), $990
Product Description
The Web is getting faster, and the data it delivers is getting bigger. How can you handle everything efficiently? This book introduces Spark, an open source cluster computing system that makes data analytics fast to run and fast to write. You’ll learn how to run programs faster, using primitives for in-memory cluster computing. With Spark, your job can load data into memory and query it repeatedly much quicker than with disk-based systems like Hadoop MapReduce.
Written by the developers of Spark, this book will have you up and running in no time. You’ll learn how to express MapReduce jobs with just a few simple lines of Spark code, instead of spending extra time and effort working with Hadoop’s raw Java API.
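As a taste of that compactness, here is a minimal Scala word-count sketch against Spark's RDD API; the input and output paths, the local[*] master, and the WordCount object name are illustrative assumptions, not taken from the book.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    // Local mode for demonstration; a real job would point at a cluster master.
    val conf = new SparkConf().setAppName("WordCount").setMaster("local[*]")
    val sc   = new SparkContext(conf)

    sc.textFile("input.txt")            // hypothetical input path
      .flatMap(_.split("\\s+"))         // split each line into words
      .map(word => (word, 1))           // pair each word with a count of 1
      .reduceByKey(_ + _)               // sum the counts per word
      .saveAsTextFile("counts")         // hypothetical output directory

    sc.stop()
  }
}
```

The core pipeline is four method calls, where the equivalent job against Hadoop's raw Java API typically spans separate mapper, reducer, and driver classes.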
- Quickly dive into Spark capabilities such as collect, count, reduce, and save (see the actions sketch after this list)
- Use one programming paradigm instead of mixing and matching tools such as Hive, Hadoop, Mahout, and S4/Storm
- Learn how to run interactive, iterative, and incremental analyses
- Integrate with Scala to manipulate distributed datasets like local collections
- Tackle partitioning issues, data locality, default hash partitioning, user-defined partitioners, and custom serialization (a custom-partitioner sketch follows this list)
- Use other languages by means of pipe() to achieve the equivalent of Hadoop streaming (see the pipe() sketch below)
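On the first bullet: collect, count, reduce, and save are RDD actions, the operations that actually trigger computation. A small sketch, reusing the SparkContext sc from the word-count example; the numeric data is made up for illustration.

```scala
val nums = sc.parallelize(1 to 1000)       // distribute a local range as an RDD

val total = nums.reduce(_ + _)             // aggregate across partitions: 500500
val n     = nums.count()                   // number of elements: 1000
// collect() ships the whole dataset to the driver, so narrow it down first
val hundreds = nums.filter(_ % 100 == 0).collect()
nums.saveAsTextFile("nums-out")            // hypothetical output directory
```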
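On the partitioning bullet: a user-defined partitioner replaces default hash partitioning with an application-specific routing rule, which can cut shuffle traffic when related keys are processed together. A sketch, assuming pair-RDD keys are URL strings; the domain-based policy and the class name are illustrative, not the book's code.

```scala
import org.apache.spark.Partitioner

// Hypothetical policy: co-locate all pages from the same web domain.
class DomainPartitioner(parts: Int) extends Partitioner {
  def numPartitions: Int = parts
  def getPartition(key: Any): Int = {
    val code = new java.net.URL(key.toString).getHost.hashCode % numPartitions
    if (code < 0) code + numPartitions else code   // keep the index non-negative
  }
  // Partitioners need equality so Spark can tell when two RDDs share a layout.
  override def equals(other: Any): Boolean = other match {
    case p: DomainPartitioner => p.numPartitions == numPartitions
    case _                    => false
  }
  override def hashCode: Int = numPartitions
}

// Usage, given pairs: RDD[(String, V)] keyed by URL:
//   pairs.partitionBy(new DomainPartitioner(16)).persist()
```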
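And on the last bullet: pipe() streams each element of an RDD through an external process, one element per line on stdin, and reads the process's stdout back as an RDD[String], much as Hadoop streaming does for mappers and reducers. A sketch; tokenize.sh is a hypothetical executable that would need to exist on every worker node.

```scala
val lines = sc.parallelize(Seq("hello spark", "hello pipe"))

// Each partition launches the command; elements go in on stdin, and every
// line the script writes to stdout becomes an element of the result.
val piped = lines.pipe("./tokenize.sh")
piped.collect().foreach(println)
```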