Introducing MLOps: How to Scale Machine Learning in the Enterprise

Treveil, Mark; Omont, Nicolas; Stenac, Clément



More than half of the analytics and machine learning (ML) models created by organizations today never make it into production. Instead, many of these ML models do nothing more than provide static insights in a slideshow. If they aren't truly operational, these models can't possibly do what you've trained them to do.

This book introduces practical concepts to help data scientists and application engineers operationalize ML models to drive real business change. Through lessons based on numerous projects around the world, six experts in data analytics provide an applied four-step approach--Build, Manage, Deploy and Integrate, and Monitor--for creating ML-infused applications within your organization.

You'll learn how to:

  • Fulfill data science value by reducing friction throughout ML pipelines and workflows
  • Constantly refine ML models through retraining, periodic tuning, and even complete remodeling to ensure long-term accuracy
  • Design the MLOps lifecycle to ensure that people-facing models are unbiased, fair, and explainable
  • Operationalize ML models not only for pipeline deployment but also for external business systems that are more complex and less standardized
  • Put the four-step Build, Manage, Deploy and Integrate, and Monitor approach into action







Mark Treveil has designed products in fields as diverse as telecoms, banking, and online trading. His own startup led a revolution in governance in UK local government, where it still dominates. He is now part of the Dataiku Product Team based in Paris.

Nicolas Omont is VP of operations at Artelys, where he develops mathematical optimization solutions for energy and transport. He previously held the role of Dataiku Product Manager for ML and advanced analytics. He holds a PhD in computer science and has worked in operations research and statistics for the past 15 years, mainly in the telecommunications and energy utility sectors.

Clément Stenac is a passionate software engineer, CTO, and co-founder at Dataiku. He oversees the design and development of the Dataiku DSS Enterprise AI Platform. Clément was previously head of product development at Exalead, leading the design and implementation of web-scale search engine software. He also has extensive experience with open source software, as a former developer of the VideoLAN (VLC) and Debian projects.

Kenji Lefevre is VP Product at Dataiku. He oversees the product roadmap and the user experience of the Dataiku DSS Enterprise AI Platform. He holds a PhD in pure mathematics from the University of Paris VII, and he directed documentary movies before switching to data science and product management.

Du Phan is a machine learning engineer at Dataiku, where he works on democratizing data science. In the past few years, he has dealt with a variety of data problems, from geospatial analysis to deep learning. His work now focuses on different facets and challenges of MLOps.

Joachim Zentici is an Engineering Director at Dataiku. Joachim graduated in applied mathematics from Ecole Centrale Paris. Prior to joining Dataiku in 2014, he was a research engineer in computer vision at Siemens Molecular Imaging and INRIA. He has also been a teacher and a lecturer. At Dataiku, Joachim has made multiple contributions, including managing the engineers in charge of the core infrastructure, building the team for the plugins and ecosystem effort, and leading the global technology training program for customer-facing engineers.

Adrien Lavoillotte is an Engineering Director at Dataiku, where he leads the team responsible for machine learning and statistics features in the software. He studied at ECE Paris, a graduate school of engineering, and worked for several startups before joining Dataiku in 2015.

Makoto Miyazaki is a data scientist at Dataiku, responsible for delivering hands-on consulting services using Dataiku DSS for European and Japanese clients. Makoto holds a bachelor's degree in economics and a master's degree in data science. He is also a former financial journalist who covered a wide range of beats, including nuclear energy and economic recoveries from the tsunami.

Lynn Heidmann received her Bachelor of Arts in Journalism/Mass Communications and Anthropology from the University of Wisconsin-Madison in 2008 and decided to bring her passion for research and writing into the world of tech. She spent seven years in the San Francisco Bay Area writing and running operations with Google and subsequently Niantic before moving to Paris to head content initiatives at Dataiku. In her current role, Lynn follows and writes about technological trends and developments in the world of data and AI.

