Big Data Analytics with R (Paperback)

Simon Walkowiak

Product Description

Key Features

  • Perform computational analyses on Big Data to generate meaningful results
  • Gain practical knowledge of the R programming language while working on Big Data platforms such as Hadoop, Spark, H2O, and SQL/NoSQL databases
  • Explore fast, streaming, and scalable data analysis with the most cutting-edge technologies in the market

Book Description

Big Data analytics is the process of examining large and complex data sets that often exceed available computational capabilities. R is a leading programming language for data science, offering powerful functions to tackle problems related to Big Data processing.

The book begins with a brief introduction to the Big Data world and its current industry standards. It then introduces the R language, covering its development, structure, real-world applications, and shortcomings, and progresses to a review of the major R functions for data management and transformation. Readers are introduced to cloud-based Big Data solutions (e.g. Amazon EC2 instances and Amazon RDS, Microsoft Azure and its HDInsight clusters) and given guidance on R connectivity with relational and non-relational databases such as MongoDB and HBase. The book then expands to Big Data tools such as the Apache Hadoop ecosystem, HDFS, and MapReduce frameworks, as well as other R-compatible tools such as Apache Spark, its machine learning library Spark MLlib, and H2O.

What you will learn

  • Learn about the current state of Big Data processing using the R programming language and its powerful statistical capabilities
  • Deploy Big Data analytics platforms with selected Big Data tools supported by R in a cost-effective and time-saving manner
  • Apply the R language to real-world Big Data problems on a multi-node Hadoop cluster, e.g. electricity consumption across various socio-demographic indicators and bike share scheme usage
  • Explore the compatibility of R with Hadoop, Spark, SQL and NoSQL databases, and the H2O platform
