Cost-Effective Data Pipelines: Balancing Trade-Offs When Developing Pipelines in the Cloud

Leonard, Sev

  • Publisher: O'Reilly
  • Publication date: 2023-08-22
  • List price: $2,250
  • Sale price: $2,138 (5% off)
  • VIP price: $2,025 (10% off)
  • Language: English
  • Pages: 286
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1492098647
  • ISBN-13: 9781492098645
  • Related categories: Big Data, Data Science, Cloud Computing
  • Ships immediately (stock < 4)

Product Description

The low cost of getting started with cloud services can easily evolve into a significant expense down the road. That's challenging for teams developing data pipelines, particularly when rapid changes in technology and workload require a constant cycle of redesign. How do you deliver scalable, highly available products while keeping costs in check?

With this practical guide, author Sev Leonard provides a holistic approach to designing scalable data pipelines in the cloud. Intermediate data engineers, software developers, and architects will learn how to navigate cost/performance trade-offs and how to choose and configure compute and storage. You'll also pick up best practices for code development, testing, and monitoring.

By focusing on the entire design process, you'll be able to deliver cost-effective, high-quality products. This book helps you:

  • Reduce cloud spend with lower-cost cloud service offerings and smart design strategies
  • Minimize waste without sacrificing performance by rightsizing compute resources
  • Drive pipeline evolution, head off performance issues, and quickly debug with effective monitoring
  • Set up development and test environments that minimize cloud service dependencies
  • Create data pipeline code bases that are testable and extensible, fostering rapid development and evolution
  • Improve data quality and pipeline operation through validation and testing

About the Author

With over 20 years of experience in the technology industry, Sev brings a breadth of experience spanning circuit design for Intel microprocessors, user-driven application development, and data platform development at both small and large scale. Throughout his career, Sev has been a writer, speaker, and teacher alongside his technical contributions, seeking to pass on what he has learned and make technology education accessible to all.

Sev's experience developing cloud data pipelines across multiple cloud service providers in large-scale batch and real-time environments, alongside his established record of writing and teaching, makes him uniquely qualified to write Cost-Effective Data Pipelines. His hands-on experience as a data engineer, coupled with his ability to synthesize ideas, gives him both the subject matter expertise to speak on the book's topics and the ability to explain these advanced concepts to readers. His focus on providing actionable, hands-on content in his classes, tutorials, and interactive sessions guarantees an approach that readers can quickly put into practice.
