Data Science Solutions with Python: Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn

Nokeri, Tshepo Chris

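  • Publisher: Apress
  • Publication date: 2021-10-26
  • Price: $1,520
  • VIP price: $1,444 (5% off)
  • Language: English
  • Pages: 136
  • Binding: Quality Paper - also called trade paper
  • ISBN: 1484277619
  • ISBN-13: 9781484277614
  • Related categories: Deep Learning, Python, Programming Languages, JVM Languages, Spark, Data Science
  • Overseas special-order book (requires separate checkout)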

Product Description

Chapter 1: Understanding Machine Learning and Deep Learning.

Chapter goal: It presents supervised and unsupervised ML and DL models and their real-world applications.

  • Understanding Machine Learning.

  • Supervised Learning.

    • The Parametric Method.

    • The Non-parametric Method.

    • Ensemble Methods.

  • Unsupervised Learning.

    • Cluster Analysis.

    • Dimension Reduction.

  • Exploring Deep Learning.

  • Conclusion.

Chapter 2: Big Data Frameworks and ML and DL Frameworks.

Chapter goal: It explains the big data framework PySpark, the machine learning frameworks SciKit-Learn, XGBoost, and H2O, and the deep learning framework Keras.

  • Big Data Frameworks and ML and DL Frameworks.

  • Big Data.

    • Characteristics of Big Data.

  • Impact of Big Data on Business and People.

    • Better Customer Relationships.

    • Refined Product Development.

    • Improved Decision-Making.

  • Big Data Warehousing.

    • Big Data ETL.

  • Big Data Frameworks.

    • Apache Spark.

      • Resilient Distributed Datasets.

      • Spark Configuration.

      • Spark Frameworks.

  • ML Frameworks.

    • SciKit-Learn.

    • H2O.

    • XGBoost.

  • DL Frameworks.

    • Keras.

  • Conclusion.

Chapter 3: The Parametric Method - Linear Regression.

Chapter goal: It considers the most popular parametric model - the Generalized Linear Model.

  • Regression Analysis.

  • Regression in Practice.

    • SciKit-Learn in action.

    • Spark MLlib in action.

    • H2O in action.

  • Conclusion.

Chapter 4: Survival Regression Analysis.

Chapter goal: It covers two main survival regression analysis models, the Cox Proportional Hazards model and the Accelerated Failure Time model.

  • Cox Proportional Hazards.

    • Lifelines in action.

  • Accelerated Failure Time (AFT) model.

    • Spark MLlib in Action.

  • Conclusion.

Chapter 5: The Non-Parametric Method - Classification.

Chapter goal: It covers a binary classification model, Logistic Regression, using SciKit-Learn, Keras, PySpark MLlib, and H2O.

  • Logistic Regression.

  • Logistic Regression in Practice.

    • SciKit-Learn in action.

    • Spark MLlib in Action.

    • H2O in action.

  • Conclusion.

Chapter 6: Tree-based Modelling and Gradient Boosting.

Chapter goal: It covers two main ensemble methods, the decision tree model and the gradient boost model.

  • Decision Tree.

    • SciKit-Learn in action.

  • Gradient Boosting.

    • XGBoost in action.

    • Spark MLlib in Action.

    • H2O in action.

  • Conclusion.

Chapter 7: Artificial Neural Networks.

Chapter goal: It covers deep learning and its application in the real world. It shows ways of designing, building, and testing an MLP classifier using the SciKit-Learn framework and an artificial neural network using the Keras framework.

  • Deep Learning.

    • Restricted Boltzmann Machine.

  • Multi-Layer Perceptron Neural Network.

    • SciKit-Learn in action.

    • Deep Belief Networks.

    • Keras in action.

    • H2O in action.

  • Conclusion.

Chapter 8: Cluster Analysis using K-Means.

Chapter goal: It covers a technique of finding k, modelling and evaluating a cluster model known as K-Means using framework