Streaming Data Mesh: A Model for Optimizing Real-Time Data Services

Hubert Dulay, Stephen Mooney

  • Publisher: O'Reilly
  • Publication date: 2023-06-20
  • List price: $2,260
  • Sale price: $1,808 (20% off)
  • Language: English
  • Pages: 223
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1098130723
  • ISBN-13: 9781098130725
  • Related categories: Big Data, Databases, Data Science
  • Ships immediately

Product Description

Data lakes and warehouses have become increasingly fragile, costly, and difficult to maintain as data gets bigger and moves faster. Data meshes can help your organization decentralize data, giving ownership back to the engineers who produced it. This book provides a concise yet comprehensive overview of data mesh patterns for streaming and real-time data services.

Authors Hubert Dulay and Stephen Mooney examine the vast differences between streaming and batch data meshes. Data engineers, architects, data product owners, and those in DevOps and MLOps roles will learn steps for implementing a streaming data mesh, from defining a data domain to building a good data product. Through the course of the book, you'll create a complete self-service data platform and devise a data governance system that enables your mesh to work seamlessly.

With this book, you will:

  • Design a streaming data mesh using Kafka
  • Learn how to identify a domain
  • Build your first data product using self-service tools
  • Apply data governance to the data products you create
  • Learn the differences between synchronous and asynchronous data services
  • Implement self-services that support decentralized data