Using Flume: Flexible, Scalable, and Reliable Data Streaming (Paperback)

Hari Shreedharan

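  • Publisher: O'Reilly
  • Publication date: 2014-10-28
  • List price: $1,300
  • Sale price: $299 (23% of list price)
  • Language: English
  • Pages: 238
  • Binding: Paperback
  • ISBN: 1449368301
  • ISBN-13: 9781449368302
  • Related categories: JVM languages
  • Ships immediately (limited quantity) (stock = 4)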

Product Description

How can you get your data from frontend servers to Hadoop in near real time? With this complete reference guide, you’ll learn Flume’s rich set of features for collecting, aggregating, and writing large amounts of streaming data to the Hadoop Distributed File System (HDFS), Apache HBase, SolrCloud, Elasticsearch, and other systems.

Using Flume shows operations engineers how to configure, deploy, and monitor a Flume cluster, and teaches developers how to write Flume plugins and custom components for their specific use cases. You’ll learn about Flume’s design and implementation, as well as the features that make it highly scalable, flexible, and reliable. Code examples and exercises are available on GitHub.

  • Learn how Flume provides a steady rate of flow by acting as a buffer between data producers and consumers
  • Dive into key Flume components, including sources that accept data and sinks that write and deliver it
  • Write custom plugins to customize the way Flume receives, modifies, formats, and writes data
  • Explore APIs for sending data to Flume agents from your own applications
  • Plan and deploy Flume in a scalable and flexible way, and monitor your cluster once it’s running
