Data-intensive Systems: Principles and Fundamentals using Hadoop and Spark (Advanced Information and Knowledge Processing)
Data-intensive systems are a technological building block supporting Big Data and Data Science applications.This book familiarizes readers with core concepts that they should be aware of before continuing with independent work and the more advanced technical reference literature that dominates the current landscape.
The material in the book is structured following a problem-based approach. This means that the content in the chapters is focused on developing solutions to simplified, but still realistic problems using data-intensive technologies and approaches. The reader follows one reference scenario through the whole book, that uses an open Apache dataset.
The origins of this volume are in lectures from a master’s course in Data-intensive Systems, given at the University of Stavanger. Some chapters were also a base for guest lectures at Purdue University and Lodz University of Technology.