Big Data, Data Mining, and Machine Learning: Value Creation for Business Leaders and Practitioners
This book provides a comprehensive overview of the recent trend toward high-performance computing architectures, especially as it relates to analytics, data mining, and machine learning. Topics covered include big data (and its characteristics), high-performance computing for analytics, massively parallel processing (MPP) databases, algorithms for big data, in-memory databases, implementation of machine learning algorithms for big data platforms, and analytics environments. However, no other source gives a historical and comprehensive view of all these separate topics in a single document. By understanding these topics, corporations can create an analytic environment that is better suited to the challenges of today's analytics demands.
The book is organized in three parts:
- Part 1 introduces the concepts and vocabulary, educating the reader on the current buzz in the area, the tradeoffs and limitations of certain technologies, and the factors that should influence technology choices.
- Part 2 focuses on the techniques and methods that can be applied to a corporation's data to turn it into value.
- Part 3 presents a set of detailed case studies.
Updates to this edition include:
- Update the introduction.
- Add and update sections in Part 1 about cloud computing, virtualized technology (containers), functions as a service (FaaS), and DevOps methodology.
- Add a section on deep learning in Part 2. This section will cover convolutional neural networks (CNNs), which are generally used for computer vision applications, and recurrent neural networks (RNNs), which are used for text and other sequential data.
- Update Chapter 3 with major enhancements to R and Python, including my contributions to the integration of open source software with SAS.
- Update the coverage of recommendation systems in Chapter 9, including factorization machines.