Simplifying Data Engineering and Analytics with Delta: Create analytics-ready data that fuels artificial intelligence and business intelligence

Mahapatra, Anindita

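  • Publisher: Packt Publishing
  • Publication date: 2022-07-29
  • List price: $1,810
  • VIP price: $1,720 (5% off)
  • Language: English
  • Pages: 334
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1801814864
  • ISBN-13: 9781801814867
  • Related category: Artificial Intelligence
  • Ordered from the supplier once purchased (about 3 to 4 weeks)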

Product Description

Explore how Delta brings reliability, performance, and governance to your data lake and all the AI and BI use cases built on top of it


Key Features:

  • Learn Delta's core concepts and features as well as what makes it a perfect match for data engineering and analysis
  • Solve business challenges of different industry verticals using a scenario-based approach
  • Make optimal choices by understanding the various tradeoffs provided by Delta


Book Description:

Delta helps you generate reliable insights at scale and simplifies the architecture around data pipelines, so you can focus on refining the use cases at hand. This matters all the more because existing architecture is frequently reused for new use cases.


In this book, you'll learn about the principles of distributed computing, data modeling techniques, and big data design patterns and templates that help solve end-to-end data flow problems for common scenarios and are reusable across use cases and industry verticals. You'll also learn how to recover from errors and the best practices around handling structured, semi-structured, and unstructured data using Delta. After that, you'll get to grips with features such as ACID transactions on big data, disciplined schema evolution, time travel to help rewind a dataset to a different time or version, and unified batch and streaming capabilities that will help you build agile and robust data products.


By the end of this Delta book, you'll be able to use Delta as the foundational block for creating analytics-ready data that fuels all AI/BI use cases.


What You Will Learn:

  • Explore the key challenges of traditional data lakes
  • Appreciate the unique features of Delta that come out of the box
  • Address reliability, performance, and governance concerns using Delta
  • Analyze the open data format for an extensible and pluggable architecture
  • Handle multiple use cases to support BI, AI, streaming, and data discovery
  • Discover how common data and machine learning design patterns are executed on Delta
  • Build and deploy data and machine learning pipelines at scale using Delta


Who this book is for:

Data engineers, data scientists, ML practitioners, BI analysts, or anyone in the data domain working with big data will be able to put their knowledge to work with this practical guide to executing pipelines and supporting diverse use cases using the Delta protocol. Basic knowledge of SQL, Python programming, and Spark is required to get the most out of this book.
