Data Engineering Design Patterns: Recipes for Solving the Most Common Data Engineering Problems

Konieczny, Bartosz

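  • Publisher: O'Reilly
  • Publication date: 2025-05-20
  • List price: $2,750
  • VIP price: $2,613 (5% off)
  • Language: English
  • Pages: 372
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1098165810
  • ISBN-13: 9781098165819
  • Related categories: Design Pattern
  • Overseas special-order book (requires separate checkout)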

Product description

Data projects are an intrinsic part of an organization's technical ecosystem, but data engineers in many companies are still trying to solve problems that others have already solved. This hands-on guide shows you how to provide valuable data by focusing on various aspects of data engineering, including data ingestion, data quality, idempotency, and more.

Author Bartosz Konieczny guides you through the process of building reliable end-to-end data engineering projects, from data ingestion to data observability, focusing on data engineering design patterns that solve common business problems in a secure and storage-optimized manner. Each pattern includes a user-facing description of the problem, solutions, and consequences that place the pattern into the context of real-life scenarios.

Throughout this journey, you'll use open source data tools and public cloud services to see how to put each pattern into practice. You'll learn:

  • Challenges data engineers face and their impact on data systems
  • How these challenges relate to data system components
  • What data engineering patterns are for
  • How to identify and fix issues with your current data components
  • Technology-agnostic solutions to new and existing data projects
  • How to implement patterns with Apache Airflow, Apache Spark, Apache Flink, and Delta Lake

Bartosz Konieczny is a freelance data engineer who's been coding for more than 15 years. He's held various senior hands-on positions that helped him work on many data engineering problems in batch and stream processing.
