Data Science with Jupyter: Master Data Science skills with easy-to-follow Python examples

Gupta, Prateek

Product Description

Step-by-step guide to practising data science techniques with Jupyter notebooks

Description
Modern businesses are awash with data, making data-driven decision-making increasingly complex. As a result, such tasks require relevant technical expertise and analytical skills. This book aims to equip you with just enough knowledge of Python, together with the skills to use powerful tools such as Jupyter Notebook, to succeed in the role of a data scientist.

The book starts with a brief introduction to the world of data science and the opportunities you may come across, along with an overview of the key topics covered in the book. You will learn how to set up an Anaconda installation, which comes with Jupyter and preinstalled Python packages. Before diving into several supervised, unsupervised and other machine learning techniques, you'll learn how to use the basic data structures, functions, libraries and packages required to import, clean, visualize and process data. Several machine learning techniques, such as regression, classification, clustering and time-series analysis, are explained through practical examples and by comparing the performance of various models.
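As a rough illustration of the import, clean and process workflow mentioned above, here is a minimal sketch that uses only the Python standard library so it runs without any extra packages. It is not taken from the book, which works with libraries such as NumPy and pandas; the sample data and the function names (`load_rows`, `clean_rows`, `mean_price_by_city`) are hypothetical.

```python
import csv
import io
import statistics

# Hypothetical sample data; the blank price marks a missing value.
RAW_CSV = """city,price
Pune,120
Delhi,
Mumbai,200
Pune,100
"""

def load_rows(text):
    """Import: parse CSV text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows):
    """Clean: drop rows with a missing price and convert the rest to float."""
    return [
        {"city": row["city"], "price": float(row["price"])}
        for row in rows
        if row["price"].strip()
    ]

def mean_price_by_city(rows):
    """Process: group prices by city and average each group."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row["city"], []).append(row["price"])
    return {city: statistics.mean(prices) for city, prices in grouped.items()}

rows = clean_rows(load_rows(RAW_CSV))
summary = mean_price_by_city(rows)
print(summary)
```

In a Jupyter notebook, each of these steps would typically sit in its own cell so the intermediate results can be inspected before moving on.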

Towards the end of the book, you will work through a few case studies that put your knowledge into practice and solve real-life business problems, such as building a movie recommendation engine, classifying spam messages, predicting a borrower's ability to repay a loan on time, and forecasting housing prices as a time series. Remember to practice the additional examples provided in the book's code bundle to master these techniques.

Audience
The book is intended for anyone looking for a career in data science, aspiring data scientists who want to learn the most powerful programming language in machine learning, and working professionals who want to switch careers to data science. While no prior knowledge of data science or related technologies is assumed, some programming experience will be helpful.

Key Features
  • Acquire Python skills to do independent data science projects
  • Learn the basics of linear algebra and statistical science the Python way
  • Understand how and when they're used in data science
  • Build predictive models, tune their parameters and analyze performance in a few steps
  • Cluster, transform, visualize, and extract insights from unlabelled datasets
  • Learn how to use matplotlib and seaborn for data visualization
  • Implement and save machine learning models for real-world business scenarios
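The build, tune, evaluate and save cycle named in these features can be sketched as follows. This is a minimal, standard-library-only illustration under stated assumptions, not the book's own code: the toy data is invented, the k-nearest-neighbours classifier is written by hand, and `predict` and `accuracy` are hypothetical helper names. In practice a machine learning library would replace the hand-written parts.

```python
import pickle

# Hypothetical toy data: (feature, label) pairs, not taken from the book.
# The point (2.5, 1) is deliberately noisy so that the choice of k matters.
train = [(1.0, 0), (1.5, 0), (2.0, 0), (2.5, 1), (6.0, 1), (6.5, 1), (7.0, 1)]
valid = [(1.2, 0), (2.4, 0), (5.8, 1), (6.8, 1)]

def predict(train_data, k, x):
    """Predict a label by majority vote among the k nearest training points."""
    nearest = sorted(train_data, key=lambda pair: abs(pair[0] - x))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0

def accuracy(train_data, k, data):
    """Analyze performance: fraction of correct predictions on `data`."""
    hits = sum(predict(train_data, k, x) == y for x, y in data)
    return hits / len(data)

# Tune: try several values of k and keep the first one with the best
# validation accuracy.
scores = {k: accuracy(train, k, valid) for k in (1, 3, 5)}
best_k = max(scores, key=scores.get)

# Save and reload the "model" (for k-NN: the training data plus the chosen k).
blob = pickle.dumps({"k": best_k, "train": train})
model = pickle.loads(blob)
print(scores, best_k, predict(model["train"], model["k"], 6.2))
```

Here the serialized bytes stay in memory; writing `blob` to a file would persist the model between notebook sessions.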
Table of Contents
  1. Data Science Fundamentals
  2. Installing Software and Setting up
  3. Lists and Dictionaries
  4. Function and Packages
  5. NumPy Foundation
  6. Pandas and Dataframe
  7. Interacting with Databases
  8. Thinking Statistically in Data Science
  9. How to import data in Python?
  10. Cleaning of imported data
  11. Data Visualization
  12. Data Pre-processing
  13. Supervised Machine Learning
  14. Unsupervised Machine Learning
  15. Handling Time-Series Data
  16. Time-Series Methods
  17. Case Study - 1
  18. Case Study - 2
  19. Case Study - 3
  20. Case Study - 4
About the Author
Prateek is a data enthusiast who loves data-driven technologies. He has a total of 7 years of experience and currently works as a Data Scientist at an MNC. He has worked with finance and retail clients and has developed machine learning and deep learning solutions for their businesses. His areas of keen interest are natural language processing and computer vision. In his leisure time, he writes posts about data science with Python on his blog.
