Scalable Big Data Architecture: A Practitioner's Guide to Choosing Relevant Big Data Architecture

Bahaaldine Azarmi

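  • Publisher: Apress
  • Publication date: 2015-12-30
  • Price: $2,010
  • VIP price: $1,910 (5% off)
  • Language: English
  • Pages: 160
  • Binding: Paperback
  • ISBN: 1484213270
  • ISBN-13: 9781484213278
  • Related categories: JVM languages, Big Data
  • Overseas special-order book (requires separate checkout)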

Product description

This book highlights the different types of data architecture and illustrates the many possibilities hidden behind the term "Big Data", from the use of NoSQL databases to the deployment of stream analytics architectures, machine learning, and governance.

Scalable Big Data Architecture covers real-world, concrete industry use cases that leverage complex distributed applications, which involve web applications, RESTful APIs, and high-throughput handling of large amounts of data stored in highly scalable NoSQL data stores such as Couchbase and Elasticsearch. This book demonstrates how data processing can be done at scale, from the use of NoSQL data stores to their combination with Big Data distributions.

When the data processing is too complex and involves different processing topologies, such as long-running jobs, stream processing, correlation of multiple data sources, and machine learning, it’s often necessary to delegate the load to Hadoop or Spark and use the NoSQL data store to serve processed data in real time.

This book shows you how to choose a relevant combination of Big Data technologies available within the Hadoop ecosystem. It focuses on long-running job processing, architecture, stream data patterns, log analysis, and real-time analytics. Every pattern is illustrated with practical examples that use different open source projects such as Logstash, Spark, and Kafka.

Traditional data infrastructures are built for digesting large amounts of data and rendering syntheses and analytics from it. This book helps you understand why you should consider using machine learning algorithms early in the project, before being overwhelmed by the constraints imposed by the high throughput of Big Data.

Scalable Big Data Architecture is for developers, data architects, and data scientists looking for a better understanding of how to choose the most relevant pattern for a Big Data project and which tools to integrate into that pattern.
