Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications

Fabian Hueske, Vasiliki Kalavri

Product Description

Get started with Apache Flink, the open source framework that enables you to process streaming data—such as user interactions, sensor data, and machine logs—as it arrives. With this practical guide, you’ll learn how to use Apache Flink’s stream processing APIs to implement, continuously run, and maintain real-world applications.

Authors Fabian Hueske, one of Flink’s creators, and Vasia Kalavri, a core contributor to Flink’s graph processing API (Gelly), explain the fundamental concepts of parallel stream processing and show you how streaming analytics differs from traditional batch data analysis. Software engineers, data engineers, and system administrators will learn the basics of Flink’s DataStream API, including the structure and components of a common Flink streaming application.

  • Solve real-world problems with Apache Flink’s DataStream API
  • Set up an environment for developing stream processing applications for Flink
  • Design streaming applications and migrate periodic batch workloads to continuous streaming workloads
  • Learn about windowed operations that process groups of records
  • Ingest data streams into a DataStream application and emit a result stream into different storage systems
  • Implement stateful and custom operators common in stream processing applications
  • Operate, maintain, and update continuously running Flink streaming applications
  • Explore several deployment options, including the setup of highly available installations

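This book will help you gain an in-depth understanding of Apache Flink and teach you how to use it to process streaming data. Whether you are a beginner or an experienced user, you will find practical knowledge and techniques in it.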

About the Authors

Fabian Hueske is a committer to and PMC member of the Apache Flink project and has been contributing to Flink since its earliest days. Fabian is cofounder, software engineer, and community evangelist at data Artisans (now Ververica), a Berlin-based startup that fosters Flink and its community. He holds a PhD in computer science from TU Berlin.

Vasiliki (Vasia) Kalavri is a postdoctoral fellow in the Systems Group at ETH Zurich, where she uses Apache Flink extensively for streaming systems research and teaching. Vasia is a PMC member of the Apache Flink project. An early contributor to Flink, she has worked on its graph processing library, Gelly, and on early versions of the Table API and streaming SQL.
