Data Algorithms: Recipes for Scaling Up with Hadoop and Spark (Paperback)

Mahmoud Parsian

Product Description

Learn the algorithms and tools you need to build MapReduce applications with Hadoop and Spark for processing gigabyte-, terabyte-, or petabyte-sized datasets on clusters of commodity hardware. With this practical book, author Mahmoud Parsian, head of the big data team at Illumina, takes you step by step through the design of machine-learning algorithms, such as Naive Bayes and Markov Chain, and shows you how to apply them to clinical and biological datasets, using MapReduce design patterns.

  • Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq
  • Use the most relevant regression/analytical algorithms for different biological data types
  • Apply t-test, joins, top-10, and correlation algorithms using MapReduce/Hadoop and Spark
