Apache Spark 2.x Machine Learning Cookbook
Provisional Chinese title: Apache Spark 2.x 機器學習食譜

Name: Apache Spark 2.x Machine Learning Cookbook
Price: 2023 TWD
Availability: Online only
Author: Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, Shuen Mei
ISBN: 1783551607

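Publisher: Packt Publishing
Publication date: 2017-09-22
List price: $2,130
VIP price: 5% off, $2,023
Language: English
Pages: 666
Binding: Paperback
ISBN: 1783551607
ISBN-13: 9781783551606
Related categories: Spark, Machine Learning
Related translation: Spark機器學習實戰 (Simplified Chinese edition)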

Overseas special-order title (requires separate checkout)

Product Description

Simplify machine learning model implementations with Spark

About This Book

Solve the day-to-day problems of data science with Spark
This unique cookbook consists of exciting and intuitive numerical recipes
Optimize your work by acquiring, cleaning, analyzing, predicting, and visualizing your data

Who This Book Is For

This book is for Scala developers who have fairly good exposure to and understanding of machine learning techniques but lack practical experience implementing them with Spark. A solid knowledge of machine learning algorithms is assumed, as well as hands-on experience in implementing ML algorithms with Scala. However, you do not need to be acquainted with the Spark ML libraries and ecosystem.

What You Will Learn

Get to know how Scala and Spark go hand in hand for developers building ML systems
Build a recommendation engine that scales with Spark
Find out how to build unsupervised clustering systems to classify data in Spark
Build machine learning systems with decision tree and ensemble models in Spark
Deal with the curse of high-dimensionality in big data using Spark
Implement text analytics for search engines in Spark
Implement a streaming machine learning system using Spark

In Detail

Machine learning aims to extract knowledge from data, relying on fundamental concepts in computer science, statistics, probability, and optimization. Learning algorithms enable a wide range of applications, from everyday tasks such as product recommendations and spam filtering to cutting-edge applications such as self-driving cars and personalized medicine. You will gain hands-on experience in applying these principles using Apache Spark, a resilient cluster computing system well suited for large-scale machine learning tasks.

This book begins with a quick overview of setting up the IDEs needed to run the code examples covered in the various chapters. It also highlights some key issues developers face while working with machine learning algorithms on the Spark platform. We progress by uncovering the various Spark APIs and implementing ML algorithms to develop classification systems, recommendation engines, text analytics, clustering, and learning systems. Toward the final chapters, we'll focus on building high-end applications and explain various unsupervised methodologies and the challenges to tackle when implementing big data ML systems.

Style and approach

This book is packed with intuitive recipes supported by line-by-line explanations to help you understand how to optimize your workflow and resolve problems when working with complex data modeling tasks and predictive algorithms. This is a valuable resource for data scientists and those working on large-scale data projects.
