Simplifying Data Engineering and Analytics with Delta: Create analytics-ready data that fuels artificial intelligence and business intelligence

Mahapatra, Anindita

  • Publisher: Packt Publishing
  • Publication date: 2022-07-29
  • List price: $1,950
  • VIP price: $1,853 (5% off)
  • Language: English
  • Pages: 334
  • 裝訂: Quality Paper - also called trade paper
  • ISBN: 1801814864
  • ISBN-13: 9781801814867
  • Related categories: Artificial intelligence
  • Overseas purchase-on-behalf title (must be checked out separately)

Product description

Explore how Delta brings reliability, performance, and governance to your data lake and all the AI and BI use cases built on top of it


Key Features:

  • Learn Delta's core concepts and features as well as what makes it a perfect match for data engineering and analysis
  • Solve business challenges of different industry verticals using a scenario-based approach
  • Make optimal choices by understanding the trade-offs involved in using Delta


Book Description:

Delta helps you generate reliable insights at scale and simplifies architecture around data pipelines, allowing you to focus primarily on refining the use cases being worked on. This is especially important when you consider that existing architecture is frequently reused for new use cases.


In this book, you'll learn about the principles of distributed computing, data modeling techniques, and big data design patterns and templates that help solve end-to-end data flow problems for common scenarios and are reusable across use cases and industry verticals. You'll also learn how to recover from errors and the best practices around handling structured, semi-structured, and unstructured data using Delta. After that, you'll get to grips with features such as ACID transactions on big data, disciplined schema evolution, time travel to help rewind a dataset to a different time or version, and unified batch and streaming capabilities that will help you build agile and robust data products.


By the end of this Delta book, you'll be able to use Delta as the foundational block for creating analytics-ready data that fuels all AI/BI use cases.


What You Will Learn:

  • Explore the key challenges of traditional data lakes
  • Appreciate the unique features of Delta that come out of the box
  • Address reliability, performance, and governance concerns using Delta
  • Analyze the open data format for an extensible and pluggable architecture
  • Handle multiple use cases to support BI, AI, streaming, and data discovery
  • Discover how common data and machine learning design patterns are executed on Delta
  • Build and deploy data and machine learning pipelines at scale using Delta


Who this book is for:

Data engineers, data scientists, ML practitioners, BI analysts, or anyone in the data domain working with big data will be able to put their knowledge to work with this practical guide to executing pipelines and supporting diverse use cases using the Delta protocol. Basic knowledge of SQL, Python programming, and Spark is required to get the most out of this book.
