Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know)

Name: Apache Flume: Distributed Log Collection for Hadoop (What You Need to Know)
Price: 1416 TWD
Availability: OnlineOnly
Author: Steve Hoffman
ISBN: 1782167919

Publisher: Packt Publishing
Publication date: 2013-07-04
List price: $1,490
VIP price: 5% off, $1,416
Language: English
Pages: 108
Binding: Paperback
ISBN: 1782167919
ISBN-13: 9781782167914
Related category: Hadoop

Overseas special-order book (must be checked out separately)

Product Description

If your role includes moving datasets into Hadoop, this book will help you do it more efficiently using Apache Flume. From installation to customization, it's a complete step-by-step guide on making the service work for you.

Overview

Integrate Flume with your data sources
Transcode your data en route in Flume
Route and separate your data using regular expression matching
Configure failover paths and load-balancing to remove single points of failure
Use Gzip compression for files written to HDFS

In Detail

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log data. Its main goal is to deliver data from applications to Apache Hadoop's HDFS. It has a simple and flexible architecture based on streaming data flows. It is robust and fault-tolerant, with many failover and recovery mechanisms.

Apache Flume: Distributed Log Collection for Hadoop covers the problems of combining HDFS with streaming data and logs, and how Flume can resolve them. The book explains the generalized architecture of Flume, including moving data to and from databases and NoSQL-style data stores, as well as optimizing performance. It also includes real-world scenarios of Flume implementation.

Apache Flume: Distributed Log Collection for Hadoop starts with an architectural overview of Flume and then discusses each component in detail. It guides you through the complete installation process and compilation of Flume.

It introduces channels and channel selectors. For each architectural component (Sources, Channels, Sinks, Channel Processors, Sink Groups, and so on), the various implementations are covered in detail along with their configuration options, so you can customize Flume to your specific needs. The book also gives pointers on writing your own custom implementations.
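As a rough illustration of how the source, channel, and sink roles named above fit together, here is a minimal Python sketch. It is a toy model only, not Flume's API and not an example from the book: the names `MemoryChannel`, `run_source`, `run_sink`, and `hdfs_stand_in` are hypothetical, and a plain list stands in for files on HDFS.

```python
from collections import deque


class MemoryChannel:
    """Toy bounded in-memory buffer playing the channel role (not Flume's API)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._events = deque()

    def put(self, event):
        # Refuse new events once the buffer is full instead of growing without bound.
        if len(self._events) >= self.capacity:
            raise OverflowError("channel full")
        self._events.append(event)

    def take_batch(self, max_events):
        # Remove and return up to max_events events in arrival order.
        batch = []
        while self._events and len(batch) < max_events:
            batch.append(self._events.popleft())
        return batch


def run_source(lines, channel):
    """Source role: wrap each input line as an event and put it on the channel."""
    for line in lines:
        channel.put({"headers": {}, "body": line.rstrip("\n")})


def run_sink(channel, destination, batch_size=2):
    """Sink role: drain the channel in batches and append bodies to the destination."""
    while True:
        batch = channel.take_batch(batch_size)
        if not batch:
            break
        destination.extend(event["body"] for event in batch)


channel = MemoryChannel(capacity=100)
hdfs_stand_in = []  # hypothetical stand-in for files written to HDFS
run_source(
    ["GET /index.html 200\n", "GET /missing 404\n", "POST /login 302\n"],
    channel,
)
run_sink(channel, hdfs_stand_in)
print(hdfs_stand_in)
# ['GET /index.html 200', 'GET /missing 404', 'POST /login 302']
```

The point of the sketch is the decoupling: the source only writes to the channel and the sink only reads from it, so either side can be swapped out or slowed down without the other knowing.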

By the end, you should be able to construct a series of Flume agents to transport your streaming data and logs from your systems into Hadoop in near real time.

What you will learn from this book

Understand the Flume architecture
Download and install open source Flume from Apache
Discover when to use a memory or file-backed channel
Understand and configure the Hadoop File System (HDFS) sink
Learn how to use sink groups to create redundant data flows
Configure and use various sources for ingesting data
Inspect data records and route to different or multiple destinations based on payload content
Transform data en route to Hadoop
Monitor your data flows

Approach

A starter guide that covers Apache Flume in detail.

Who this book is written for

Apache Flume: Distributed Log Collection for Hadoop is intended for people who are responsible for moving datasets into Hadoop in a timely and reliable manner, such as software engineers, database administrators, and data warehouse administrators.
