Big Data Analytics Beyond Hadoop: Real-Time Applications with Storm, Spark, and More Hadoop Alternatives (Hardcover)

Vijay Srinivas Agneeswaran



Master alternative Big Data technologies that can do what Hadoop can't: real-time analytics and iterative machine learning.


When most technical professionals think of Big Data analytics today, they think of Hadoop. But there are many cutting-edge applications that Hadoop isn't well suited for, especially real-time analytics and contexts requiring the use of iterative machine learning algorithms. Fortunately, several powerful new technologies have been developed specifically for use cases such as these. Big Data Analytics Beyond Hadoop is the first guide specifically designed to help you take the next steps beyond Hadoop. Dr. Vijay Srinivas Agneeswaran introduces the breakthrough Berkeley Data Analysis Stack (BDAS) in detail, including its motivation, design, architecture, Mesos cluster management, performance, and more. He presents realistic use cases and up-to-date example code for: 

  • Spark, the next-generation in-memory computing technology from UC Berkeley
  • Storm, the parallel real-time Big Data analytics technology from Twitter
  • GraphLab, the next-generation graph processing paradigm from CMU and the University of Washington (with comparisons to alternatives such as Pregel and Piccolo)

He also offers architectural and design guidance and code sketches for scaling machine learning algorithms to Big Data, and then realizing them in real time. He concludes by previewing emerging trends, including real-time video analytics, SDNs, and even Big Data governance, security, and privacy issues. He identifies intriguing startups and new research possibilities, including BDAS extensions and cutting-edge model-driven analytics.


Big Data Analytics Beyond Hadoop is an indispensable resource for everyone who wants to reach the cutting edge of Big Data analytics, and stay there: practitioners, architects, programmers, data scientists, researchers, startup entrepreneurs, and advanced students.


