Statistical Inference and Machine Learning for Big Data

Alvo, Mayer

  • Publisher: Springer
  • Publication date: 2022-12-01
  • List price: $6,170
  • VIP price: $5,862 (5% off)
  • Language: English
  • Pages: 463
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 3031067835
  • ISBN-13: 9783031067839
  • Related categories: Big Data, Machine Learning
  • Overseas special-order title (requires separate checkout)

Product description

This book first presents a variety of advanced statistical methods at a level suitable for advanced undergraduate and graduate students, as well as for others interested in familiarizing themselves with this important subject. It then illustrates these methods in the context of real-life applications. The nonspecialist seldom gets to see the main focus of modern statistics. Through the presentation of several real-life applications in areas such as genetics and environmental problems, one begins to gain an appreciation of the challenges and the utility of statistics.
The book begins in Part I by outlining various data types and indicating how these are normally represented graphically and subsequently analyzed.
In Part II, Chapters 2 and 3 introduce the basic tools of probability and statistics. Here, we have retained the results most useful and relevant to this book. In Chapter 4, we proceed with an introduction to multivariate methods and to copula methods, and illustrate a number of applications with real-life examples. In Chapter 5 we introduce nonparametric methods, which are particularly useful in the analysis of big data, where the underlying distributions are often unknown. Some emphasis is placed on the use of ranking methods. We continue in Chapter 6 with a discussion of exponential tilting and its applications. There we also discuss empirical Bayes and its application to microarray data. In Chapter 7, we touch on counting data analysis and survival analysis. In Chapter 8, time series methods are briefly described from both the classical and the state space modeling approaches. Estimating equations and empirical likelihood are discussed in Chapter 9, where we present their application in nonparametric testing. Symbolic data analysis is a relatively new field that aims to reduce the dimension of the data through a process of aggregation. It forms the subject of Chapter 10, in which traditional statistical methods are applied to aggregated medical data.
In Part III we focus first on regression through the lens of machine learning. In Chapter 11 we describe regression methods from the machine learning point of view, along with support vector machines, which are often used to study interactions and classification. We then continue in Chapter 12 with the important topics of neural networks and text analytics.
We conclude with Part IV by presenting the computational aspects of big data, with special attention devoted to Markov chain Monte Carlo methods and to Bayesian nonparametric statistics.

This book was written for two key audiences. It can serve as a handy desk reference for statistical methods at the undergraduate and graduate levels. It can also be used in courses that aim to provide an overview of modern statistics and its applications.
