Mastering Scala Machine Learning (Paperback)

Alex Kozlov

Product Description

Advance your skills in efficient data analysis and data processing using the powerful tools of Scala, Spark, and Hadoop

About This Book

  • This is a primer on functional-programming-style techniques to help you efficiently process and analyze all of your data
  • Get acquainted with the best and newest tools available such as Scala, Spark, Parquet and MLlib for machine learning
  • Learn best practices for incorporating Big Data machine learning into your data-driven enterprise so that it remains scalable and maintainable

Who This Book Is For

Mastering Scala Machine Learning is intended for enthusiasts who want to plunge into the new pool of emerging techniques for machine learning. Some familiarity with standard statistical techniques is required.

What You Will Learn

  • Sharpen your functional programming skills in Scala using the REPL
  • Apply standard and advanced machine learning techniques using Scala
  • Get acquainted with Big Data technologies and grasp why we need a functional approach to Big Data
  • Discover new data structures, algorithms, approaches, and habits that will allow you to work effectively with large amounts of data
  • Understand the principles of supervised and unsupervised learning in machine learning
  • Work with unstructured data and serialize it using Kryo, Protobuf, Avro, and AvroParquet
  • Construct reliable and robust data pipelines and manage data in a data-driven enterprise
  • Implement scalable model monitoring and alerts with Scala

In Detail

Since the advent of object-oriented programming, new technologies related to Big Data have been constantly appearing on the market. One such technology is Scala, which many consider a successor to Java in the area of Big Data, much as Java was to C/C++ in the area of distributed programming.

This book aims to take your knowledge to the next level and help you apply that knowledge to build advanced applications such as social media mining, intelligent news portals, and more. After a quick refresher on functional programming concepts using the REPL, you will see some practical examples of setting up the development environment and tinkering with data. We will then explore working with Spark and MLlib using k-means and decision trees.

Most of the data that we produce today is unstructured and raw, and you will learn to tackle this type of data with advanced topics such as regression, classification, integration, and working with graph algorithms. Finally, you will discover how to use Scala to perform complex concept analysis, to monitor model performance, and to build a model repository. By the end of this book, you will have gained expertise in machine learning with Scala and will be able to build complex machine learning projects using Scala.

Style and approach

This hands-on guide dives straight into implementing Scala for machine learning without delving much into mathematical proofs or validations. There are ample code examples and tricks that will help you sail through using the standard techniques and libraries. This book provides practical examples from the field on how to correctly tackle data analysis problems, particularly for modern Big Data datasets.
