Kubeflow for Machine Learning: From Lab to Production

Trevor Grant, Holden Karau, Boris Lublinsky



If you're training a machine learning model but aren't sure how to put it into production, this book will get you there. Kubeflow provides a collection of cloud native tools for different stages of a model's lifecycle, from data exploration, feature preparation, and model training to model serving. This guide helps data scientists build production-grade machine learning implementations with Kubeflow and shows data engineers how to make models scalable and reliable.

Using examples throughout the book, authors Holden Karau, Trevor Grant, Ilan Filonenko, Richard Liu, and Boris Lublinsky explain how to use Kubeflow to train and serve your machine learning models on top of Kubernetes in the cloud or in a development environment on-premises.

  • Understand Kubeflow's design, core components, and the problems it solves
  • Learn how to set up Kubeflow on a cloud provider or on an in-house cluster
  • Train models using Kubeflow with popular tools including scikit-learn, TensorFlow, and Apache Spark
  • Learn how to add custom stages such as serving and prediction
  • Keep your model up-to-date with Kubeflow Pipelines
  • Understand how to validate machine learning pipelines



Trevor Grant is a member of the Apache Software Foundation, and is heavily involved in the Apache Mahout, Apache Streams, and Community Development projects. He often tinkers and occasionally documents his (mis)adventures at www.rawkintrevo.org. In the before time, he was an international speaker on technology, but now he focuses mainly on writing. Trevor wishes to thank IBM for their continued patronage of his artistic endeavors. He lives in Chicago because it's the best city on the planet, with world class food, parks, and culture, and because the skies are never orange.

Holden Karau is a queer transgender Canadian, Apache Spark committer, Apache Software Foundation member, and an active open source contributor. She also extends her passion for building community with industry projects including Scaling for Python for ML and teaching distributed computing to children. As a software engineer, she's worked on a variety of distributed compute, search, and classification problems at Google, IBM, Alpine, Databricks, Foursquare, and Amazon. She graduated from the University of Waterloo with a bachelor of mathematics in computer science. Outside of software she enjoys playing with fire, welding, riding scooters, eating poutine, and dancing.

Boris Lublinsky is a Principal Architect at Lightbend. Boris has over 25 years of experience in enterprise, technical architecture, and software engineering. He is an active member of the OASIS SOA RM committee, coauthor of Applied SOA: Service-Oriented Architecture and Design Strategies (Wiley), and author of numerous articles on architecture, programming, big data, SOA, and BPM.

Richard Liu is a Senior Software Engineer at Waymo, where he focuses on building a machine learning platform for self-driving cars. Previously, he worked at Microsoft Azure and Google Cloud. He is one of the primary maintainers of the Kubeflow project and has given several talks at KubeCon. He holds a master's degree in computer science from the University of California, San Diego.

Ilan Filonenko is a member of the Data Science Infrastructure team at Bloomberg, where he has designed and implemented distributed systems at both the application and infrastructure level. Previously, Ilan was an engineering consultant and technical lead in various startups and research divisions across multiple industry verticals, including medicine, hospitality, finance, and music. He actively contributes to open source, primarily Apache Spark and Kubeflow's KFServing. He is one of the principal contributors to Spark on Kubernetes, where he focuses primarily on remote shuffle and HDFS security, and to multimodel serving in KFServing. Ilan's research has been in algorithmic, software, and hardware techniques for high-performance machine learning, with a focus on optimizing stochastic algorithms and model management.

