Mapping Data Flows in Azure Data Factory: Building Scalable ETL Projects in the Microsoft Cloud

Kromer, Mark

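  • Publisher: Apress
  • Publication date: 2022-08-26
  • List price: $2,210
  • VIP price: $2,100 (5% off)
  • Language: English
  • Pages: 194
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1484286111
  • ISBN-13: 9781484286111
  • Related categories: Microsoft Azure, JVM languages
  • Overseas special-order title (requires separate checkout)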

Product Description

Build scalable ETL data pipelines in the cloud using Azure Data Factory's Mapping Data Flows. Each chapter of this book addresses a different aspect of an end-to-end data pipeline and includes repeatable design patterns, based on best practices, that use ADF's code-free data transformation design tools. The book shows data engineers how to take raw business data at cloud scale and turn it into business value by organizing and transforming it for use in data science projects and analytics systems.
The book begins with an introduction to Azure Data Factory, followed by an introduction to its Mapping Data Flows feature set. Subsequent chapters show how to build your first pipeline and its corresponding data flow, implement common design patterns, and operationalize the result. By the end of the book, you will be able to apply what you've learned to your own complex data integration and ETL projects in Azure. These projects will enable cloud-scale big data analytics and apply best practices for loading and transforming data in data warehouses.

What You Will Learn

  • Build scalable ETL jobs in Azure without writing code
  • Transform big data for data quality and data modeling requirements
  • Understand the different aspects of Azure Data Factory ETL pipelines from datasets and Linked Services to Mapping Data Flows
  • Apply best practices for designing and managing complex ETL data pipelines in Azure Data Factory
  • Add cloud-based ETL patterns to your set of data engineering skills
  • Build repeatable code-free ETL design patterns

Who This Book Is For
Data engineers who are new to building complex data transformation pipelines in the cloud with Azure, and data engineers who need ETL solutions that scale to match swiftly growing volumes of data.

About the Author

Mark Kromer has been in the data analytics product space for over 20 years and is currently a Principal Program Manager for Microsoft's Azure data integration products. Mark often writes and speaks on big data and data analytics, and was an engineering architect and product manager for Oracle, Pentaho, AT&T, and Databricks prior to Microsoft Azure.
