Machine Learning for Data Streams: with Practical Examples in MOA (Adaptive Computation and Machine Learning series)

Albert Bifet, Ricard Gavaldà, Geoff Holmes, Bernhard Pfahringer


A hands-on approach to tasks and techniques in data stream mining and real-time analytics, with examples in MOA, a popular freely available open-source software framework.

Today many information sources -- including sensor networks, financial markets, social networks, and healthcare monitoring -- are so-called data streams, arriving sequentially and at high speed. Analysis must take place in real time, with partial data and without the capacity to store the entire data set. This book presents algorithms and techniques used in data stream mining and real-time analytics. Taking a hands-on approach, the book demonstrates the techniques using MOA (Massive Online Analysis), a popular, freely available open-source software framework, allowing readers to try out the techniques after reading the explanations.

The book first offers a brief introduction to the topic, covering big data mining, basic methodologies for mining data streams, and a simple example of MOA. More detailed discussions follow, with chapters on sketching techniques, change, classification, ensemble methods, regression, clustering, and frequent pattern mining. Most of these chapters include exercises, an MOA-based lab session, or both. Finally, the book discusses the MOA software, covering the MOA graphical user interface, the command line, use of its API, and the development of new methods within MOA. The book will be an essential reference for readers who want to use data stream mining as a tool, researchers in innovation or data stream mining, and programmers who want to create new algorithms for MOA.


