An Introduction to Data Analysis in R: Hands-On Coding, Data Mining, Visualization and Statistics from Scratch

Zamora Saiz, Alfonso; Quesada González, Carlos; Hurtado Gil, Lluís



This textbook offers an easy-to-follow, practical guide to modern data analysis using the programming language R. The chapters cover the fundamentals of programming in R; data collection and preprocessing, including web scraping; data visualization; and statistical methods, including multivariate analysis. Each section ends with exercises. The text requires only basic statistics skills, as it strikes a balance between statistical and mathematical understanding and implementation in R, with a special emphasis on reproducible examples and real-world applications. This textbook is primarily intended for undergraduate students of mathematics, statistics, physics, economics, finance and business who are pursuing a career in data analytics. It will be equally valuable for master's students of data science and industry professionals who want to conduct data analyses.


Alfonso Zamora Saiz is a professor at the School of Computer Science Engineering, Technical University of Madrid, Spain. Holding a PhD in algebraic geometry from the Complutense University of Madrid (2013), he has been a visiting PhD student at Cambridge University and Columbia University, a postdoc at the IST in Lisbon, a lecturer at California State University Channel Islands and a professor at CEU San Pablo University in Madrid. He has also worked as a quantitative analyst in industry. His research interests include algebra, geometry and topology in pure mathematics, as well as data analytics applications and mathematics education.

Carlos Quesada González holds a PhD in applied mathematics from the Complutense University of Madrid, Spain. He is a professor and Vice-Dean of the School of Business at CEU San Pablo University in Madrid, where he has helped to establish the Business Intelligence degree. He also teaches master's courses on big data for finance and collaborates as a statistical analyst with Grant Thornton.

Lluís Hurtado Gil is a professional data scientist at eDreams ODIGEO and holds a PhD in astrophysics from Valencia University (2016). He was a professor of statistics and econometrics for three years at CEU San Pablo University, where he also served as secretary of the Statistics and Applied Mathematics Department. He has published works on econometrics for undergraduate students and research papers on statistical applications in modern astrophysics with R code. Currently, he continues to collaborate with the International J-PAS Survey, investigating the physics of the accelerating universe. Professionally, he has specialized in the application of stochastic processes to digital marketing.

Diego Mondéjar Ruiz is a professor at the Department of Applied Mathematics and Statistics, CEU San Pablo University in Madrid, Spain. He obtained his PhD in mathematics from the Complutense University of Madrid in 2015 with a thesis on topological data analysis and computational topology. He has been a visiting PhD student at Stanford University and the University of Pennsylvania, and has taught mathematics, statistics and programming courses at several universities.