The Data Science Workshop - Second Edition: Learn how you can build machine learning models and create your own real-world data science projects
So, Anthony; Joseph, Thomas V.; John, Robert Thas
- Gain a full understanding of the model production and deployment process
- Build your first machine learning model in just five minutes and get a hands-on machine learning experience
- Understand how to deal with common challenges in data science projects
Where there's data, there's insight. With so much data being generated, there is immense scope to extract meaningful information that'll boost business productivity and profitability. By learning to convert raw data into game-changing insights, you'll open new career paths and opportunities.
The Data Science Workshop begins by introducing different types of projects and showing you how to incorporate machine learning algorithms in them. You'll learn to select a relevant metric and even assess the performance of your model. To tune the hyperparameters of an algorithm and improve its accuracy, you'll get hands-on with approaches such as grid search and random search.
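To make the difference between the two tuning approaches concrete, here is a minimal sketch that is not taken from the book. It uses only the Python standard library. The search space `PARAM_GRID` and the `validation_score` function are hypothetical stand-ins: in a real project the score would come from training a model and evaluating it on held-out data. Grid search evaluates every combination, while random search evaluates only a fixed number of randomly drawn ones.

```python
import itertools
import random

# Hypothetical search space; the names and values are illustrative only.
PARAM_GRID = {
    "max_depth": [2, 4, 6, 8],
    "n_estimators": [10, 50, 100],
}


def validation_score(params):
    """Toy stand-in for a cross-validated accuracy score (an assumption).

    A real implementation would train a model with `params` and score it
    on held-out data. This one simply peaks at max_depth=6, n_estimators=50.
    """
    depth_penalty = abs(params["max_depth"] - 6) * 0.02
    trees_penalty = abs(params["n_estimators"] - 50) * 0.001
    return 0.9 - depth_penalty - trees_penalty


def grid_search(param_grid, score_fn):
    """Evaluate every combination in the grid and return the best one."""
    names = list(param_grid)
    candidates = [
        dict(zip(names, values))
        for values in itertools.product(*(param_grid[n] for n in names))
    ]
    return max(candidates, key=score_fn), len(candidates)


def random_search(param_grid, score_fn, n_iter, seed=0):
    """Evaluate n_iter randomly drawn combinations and return the best one."""
    rng = random.Random(seed)
    candidates = [
        {name: rng.choice(values) for name, values in param_grid.items()}
        for _ in range(n_iter)
    ]
    return max(candidates, key=score_fn), len(candidates)


best_grid, n_grid = grid_search(PARAM_GRID, validation_score)
best_random, n_random = random_search(PARAM_GRID, validation_score, n_iter=5)

print("grid search:  ", best_grid, "after", n_grid, "evaluations")
print("random search:", best_random, "after", n_random, "evaluations")
```

Grid search is guaranteed to find the best point in the grid but its cost grows with every value added, whereas random search caps the number of evaluations at the price of possibly missing the optimum. In scikit-learn, the same two strategies are provided by `GridSearchCV` and `RandomizedSearchCV`.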
Next, you'll learn dimensionality reduction techniques for handling many variables at once, before exploring how model ensembling and newly created features can enhance model performance. To help you create such features automatically, the book demonstrates how to use an automated feature engineering tool. You'll also learn how to use an orchestration and scheduling workflow to deploy machine learning models in batch.
By the end of this book, you'll have the skills to start working on data science projects confidently.
What you will learn
- Explore the key differences between supervised learning and unsupervised learning
- Manipulate and analyze data using scikit-learn and pandas libraries
- Understand key concepts such as regression, classification, and clustering
- Discover advanced techniques to improve the accuracy of your model
- Understand how to speed up the process of adding new features
- Simplify your machine learning workflow for production
Who this book is for
This is one of the most useful data science books for aspiring data analysts, data scientists, database engineers, and business analysts. It is aimed at those who want to kick-start their careers in data science by quickly learning data science techniques without going through all the mathematics behind machine learning algorithms. Basic knowledge of the Python programming language will help you easily grasp the concepts explained in this book.
Table of Contents
- Introduction to Data Science in Python
- Binary Classification
- Multiclass Classification with RandomForest
- Performing Your First Cluster Analysis
- How to Assess Performance
- The Generalization of Machine Learning Models
- Hyperparameter Tuning
- Interpreting a Machine Learning Model
- Analyzing a Dataset
- Data Preparation
- Feature Engineering
- Imbalanced Datasets
- Dimensionality Reduction
- Ensemble Learning