Primer to Analysis of Genomic Data Using R (Use R!)

Cedric Gondro

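  • Publisher: Springer
  • Publication date: 2015-06-09
  • List price: $2,300
  • Member price: $2,185 (5% off)
  • Language: English
  • Pages: 288
  • Binding: Paperback
  • ISBN: 331914474X
  • ISBN-13: 9783319144740
  • Related category: R language
  • Ships immediately (fewer than 3 in stock)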

Product description

Through this book, researchers and students will learn to use R for analysis of large-scale genomic data and how to create routines to automate analytical steps. The philosophy behind the book is to start with real-world raw datasets and perform all the analytical steps needed to reach final results. Though theory plays an important role, this is a practical book for graduate and undergraduate courses in bioinformatics and genomic analysis or for use in lab sessions. It also teaches how to handle and manage high-throughput genomic data, create automated workflows and speed up analyses in R. A wide range of R packages useful for working with genomic data are illustrated with practical examples.

 

The key topics covered are association studies, genomic prediction, estimation of population genetic parameters and diversity, gene expression analysis, functional annotation of results using publicly available databases and how to work efficiently in R with large genomic datasets. Important principles are demonstrated and illustrated through engaging examples which invite the reader to work with the provided datasets. Some methods that are discussed in this volume include: signatures of selection, population parameters (LD, FST, FIS, etc.); use of a genomic relationship matrix for population diversity studies; use of SNP data for parentage testing; snpBLUP and gBLUP for genomic prediction. Step-by-step, all the R code required for a genome-wide association study is shown: starting from raw SNP data, how to build databases to handle and manage the data, quality control and filtering measures, association testing and evaluation of results, through to identification and functional annotation of candidate genes. Similarly, gene expression analyses are shown using microarray and RNAseq data.

 

At a time when genomic data is decidedly big, the skills from this book are critical. In recent years R has become the de facto tool for analysis of gene expression data, in addition to its prominent role in analysis of genomic data. Benefits of using R include the integrated development environment for analysis, flexibility and control of the analytic workflow. Included topics are core components of advanced undergraduate and graduate classes in bioinformatics, genomics and statistical genetics. This book is also designed to be used by students in computer science and statistics who want to learn the practical aspects of genomic analysis without delving into algorithmic details. The datasets used throughout the book may be downloaded from the publisher’s website.

 
