Data Pipelines Pocket Reference: Moving and Processing Data for Analytics

Densmore, James

  • Publisher: O'Reilly
  • Publication date: 2021-03-16
  • List price: $1,150
  • Sale price: $1,035 (90% of list price)
  • Language: English
  • Pages: 276
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1492087831
  • ISBN-13: 9781492087830
  • Related categories: Big Data, Data Science, Machine Learning
  • Ships immediately (fewer than 4 in stock)

Product description

Data pipelines are the foundation for success in data analytics and machine learning. Moving data from many diverse sources and processing it to provide context is the difference between having data and actually gaining value from it. This pocket reference defines data pipelines and explains how they work in the modern data stack.

You'll learn common considerations and key decision points when implementing pipelines, such as data pipeline design patterns, data ingestion implementation, data transformation, the orchestration of pipelines, and build-versus-buy decision making. This book addresses the most common decisions made by data professionals and discusses foundational concepts that apply to open source frameworks, commercial products, and homegrown solutions.

You'll learn:

  • What a data pipeline is and how it works
  • How data is moved and processed on modern data infrastructure, including cloud platforms
  • Common tools and products used by data engineers to build pipelines
  • How pipelines support machine learning and analytics needs
  • Considerations for pipeline maintenance, testing, and alerting

About the author

James is the Director of Data Infrastructure at HubSpot as well as the founder and Principal Consultant at Data Liftoff. He has more than 10 years of experience leading data teams and building data infrastructure at Wayfair, O'Reilly Media, and Degreed. James has a BS in Computer Science from Northeastern University and an MBA from Boston College.
