Data Analysis with R Paperback – December 22, 2015

Tony Fischetti

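  • Publisher: Packt Publishing
  • Publication date: 2015-12-26
  • List price: $1,800
  • Sale price: $900 (50% off)
  • Language: English
  • Pages: 388
  • Binding: Paperback
  • ISBN: 1785288148
  • ISBN-13: 9781785288142
  • Related categories: R, Data Science
  • Ships immediately (limited quantity; stock = 1)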

Product Description

Key Features

  • Load, manipulate and analyze data from different sources
  • Gain a deeper understanding of fundamentals of applied statistics
  • A practical guide to performing data analysis

Book Description

Frequently the tool of choice for academics, R has spread deep into the private sector and can be found in the production pipelines at some of the most advanced and successful enterprises. The power and domain-specificity of R allow the user to express complex analytics easily, quickly, and succinctly. With over 7,000 user-contributed packages, it's easy to find support for the latest and greatest algorithms and techniques.

Starting with the basics of R and statistical reasoning, Data Analysis with R dives into advanced predictive analytics, showing how to apply those techniques to real-world data through real-world examples.

Packed with engaging problems and exercises, this book begins with a review of R and its syntax. From there, get to grips with the fundamentals of applied statistics and build on this knowledge to perform sophisticated and powerful analytics. Solve the difficulties relating to performing data analysis in practice and find solutions to working with “messy data”, large data, communicating results, and facilitating reproducibility.

This book is engineered to be an invaluable resource through many stages of anyone's career as a data analyst.

What you will learn

  • Navigate the R environment
  • Describe and visualize the behavior of data and relationships between data
  • Gain a thorough understanding of statistical reasoning and sampling
  • Employ hypothesis tests to draw inferences from your data
  • Learn Bayesian methods for estimating parameters
  • Perform regression to predict continuous variables
  • Apply powerful classification methods to predict categorical data
  • Handle missing data gracefully using multiple imputation
  • Identify and manage problematic data points
  • Employ parallelization and Rcpp to scale your analyses to larger data
  • Put best practices into effect to make your job easier and facilitate reproducibility

About the Author

Tony Fischetti is a data scientist at College Factual, where he gets to use R every day to build personalized rankings and recommender systems. He graduated in cognitive science from Rensselaer Polytechnic Institute, and his thesis was strongly focused on using statistics to study visual short-term memory.

Tony enjoys writing and contributing to open source software, blogging at http://www.onthelambda.com, writing about himself in third person, and sharing his knowledge using simple, approachable language and engaging examples.

The more traditionally exciting of his daily activities include listening to records, playing the guitar and bass (poorly), weight training, and helping others.

Table of Contents

  1. RefresheR
  2. The Shape of Data
  3. Describing Relationships
  4. Probability
  5. Using Data to Reason About the World
  6. Testing Hypotheses
  7. Bayesian Methods
  8. Predicting Continuous Variables
  9. Predicting Categorical Variables
  10. Sources of Data
  11. Dealing with Messy Data
  12. Dealing with Large Data
  13. Reproducibility and Best Practices
