Building Data Science Solutions with Anaconda: A comprehensive starter guide to building robust and complete models

Meador, Dan

  • Publisher: Packt Publishing
  • Publication date: 2022-05-27
  • Price: $1,620
  • VIP price: $1,539 (5% off)
  • Language: English
  • Pages: 330
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1800568789
  • ISBN-13: 9781800568785
  • Related categories: Data Science
  • Ships immediately (stock: 1)

Product Description

Key Features

  • Learn from an AI patent-holding engineering manager with deep experience in Anaconda tools and OSS
  • Get to grips with critical aspects of data science such as bias in datasets and interpretability of models
  • Gain a deeper understanding of the AI/ML landscape through real-world examples and practical analogies

Book Description

You might already know that there's a wealth of data science and machine learning resources on the market, but you might not know how much most of these resources leave out. This book covers the algorithm families you need to know and also builds your expertise in critical areas, from avoiding bias in data to model interpretability, which have now become must-have skills.

In this book, you'll learn how using Anaconda as the easy button can give you a complete view of the capabilities of tools such as conda, including how to specify new channels to pull in any package you want and how to discover new open source tools at your disposal. You'll also get a clear picture of how to evaluate which model to train and how to identify when a model has become unusable due to drift. Finally, you'll learn about the powerful yet simple techniques that you can use to explain how your model works.

By the end of this book, you'll feel confident using conda and Anaconda Navigator to manage dependencies and gain a thorough understanding of the end-to-end data science workflow.

What you will learn

  • Install packages and create virtual environments using conda
  • Understand the landscape of open source software and assess new tools
  • Use scikit-learn to train and evaluate model approaches
  • Detect types of bias in your data and learn what you can do to prevent them
  • Grow your skillset with tools such as NumPy, pandas, and Jupyter Notebooks
  • Solve common dataset issues, such as imbalanced and missing data
  • Use LIME and SHAP to interpret and explain black-box models
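
As a rough illustration of the "imbalanced and missing data" item above, here is a minimal pure-Python sketch. It is not taken from the book: the function names and the toy data are hypothetical, and in practice libraries such as pandas and scikit-learn offer ready-made equivalents. The sketch fills missing values with the column median and computes "balanced" class weights as n_samples / (n_classes * class_count), so that rare classes count for more during training.

```python
from collections import Counter
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


# Hypothetical toy data: one feature column with gaps, and skewed labels.
ages = [34, None, 29, 41, None, 38]
labels = ["ok", "ok", "ok", "ok", "ok", "fraud"]

filled_ages = impute_median(ages)
weights = balanced_class_weights(labels)
print(filled_ages)
print(weights)
```

The median is used here instead of the mean because it is less sensitive to outliers. Which fill strategy and weighting scheme suit a real dataset depends on why the values are missing and how costly each type of error is.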

Who this book is for

If you're a data analyst or data science professional looking to make the most of Anaconda's capabilities and deepen your understanding of data science workflows, then this book is for you. You don't need any prior experience with Anaconda, but a working knowledge of Python and data science basics is a must.

About the Author

Dan Meador is an Engineering Manager at Anaconda and is the creator of Conda as well as champion of open source at Anaconda. With a history of engineering and client-facing roles, he has the ability to jump into any position. He has a track record of delivering as a leader and a follower in companies from the Fortune 10 to startups.

Table of Contents

  1. Understanding the AI/ML Landscape
  2. Analyzing Open Source Software
  3. Using Anaconda Distribution to Manage Packages
  4. Working with Jupyter Notebooks and NumPy
  5. Cleaning and Visualizing Data
  6. Overcoming Bias in AI/ML
  7. Choosing the Best AI Algorithm
  8. Dealing with Common Data Problems
  9. Building a Regression Model with scikit-learn
  10. Explainable AI - Using LIME and SHAP
  11. Tuning Hyperparameters and Versioning Your Model
