Data Science on the Google Cloud Platform: Implementing End-To-End Real-Time Data Pipelines: From Ingest to Machine Learning, 2/e (Paperback)

Lakshmanan, Valliappa

  • Publisher: O'Reilly
  • Publication date: 2022-05-03
  • List price: $2,780
  • Sale price: $2,641 (5% off)
  • Language: English
  • Pages: 446
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 1098118952
  • ISBN-13: 9781098118952
  • Related categories: Google Cloud, Machine Learning, Data Science
  • Ships immediately (stock: 1)



Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline, using statistical and machine learning methods and tools on GCP.

Through the course of this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by implementing these statistical and machine learning solutions in your own project on GCP, and discover how this platform provides a transformative and more collaborative way of doing data science.

You'll learn how to:

  • Employ best practices in building highly scalable data and ML pipelines on Google Cloud
  • Automate and schedule data ingest using Cloud Run
  • Create and populate a dashboard in Data Studio
  • Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
  • Conduct interactive data exploration with BigQuery
  • Create a Bayesian model with Spark on Cloud Dataproc
  • Forecast time series and do anomaly detection with BigQuery ML
  • Aggregate within time windows with Dataflow
  • Train explainable machine learning models with Vertex AI
  • Operationalize ML with Vertex AI Pipelines







Valliappa (Lak) Lakshmanan is the director of analytics and AI solutions at Google Cloud, where he leads a team building cross-industry solutions to business problems. His mission is to democratize machine learning so that it can be done by anyone anywhere. Lak is the author or coauthor of Practical Machine Learning for Computer Vision; Machine Learning Design Patterns; Data Governance: The Definitive Guide; Google BigQuery: The Definitive Guide; and Data Science on the Google Cloud Platform.

