Practical Data Analysis using Jupyter Notebook

Marc Wintjen, Andrew Vlahutin

  • Publisher: Packt Publishing
  • Publication date: 2020-06-19
  • List price: $1,360
  • Sale price: $1,224 (10% off)
  • Language: English
  • Pages: 322
  • Binding: Paperback
  • ISBN: 1838826033
  • ISBN-13: 9781838826031
  • Related categories: Data Science
  • Ships immediately (stock < 3)

Product Description

Key Features

  • Find out how to use Python code to extract insights from data using real-world examples
  • Work with structured data and free text sources to answer questions and add value using data
  • Perform data analysis from scratch with the help of clear explanations for cleaning, transforming, and visualizing data

Book Description

Data literacy is the ability to read, analyze, work with, and argue using data. Data analysis is the process of cleaning and modeling your data to discover useful information. This book combines these two concepts by sharing proven techniques and hands-on examples so that you can learn how to communicate effectively using data.

After introducing you to the basics of data analysis using Jupyter Notebook and Python, the book will take you through the fundamentals of data. Packed with practical examples, this guide will teach you how to clean, wrangle, analyze, and visualize data to gain useful insights, and you'll discover how to answer questions using data with easy-to-follow steps.

Later chapters teach you about storytelling with data using charts, such as histograms and scatter plots. As you advance, you'll understand how to work with unstructured data using natural language processing (NLP) techniques to perform sentiment analysis. All the knowledge you gain will help you discover key patterns and trends in data using real-world examples. In addition to this, you will learn how to handle data of varying complexity to perform efficient data analysis using modern Python libraries.

By the end of this book, you'll have gained the practical skills you need to analyze data with confidence.

What you will learn

  • Understand the importance of data literacy and how to communicate effectively using data
  • Find out how to use Python packages such as NumPy, pandas, Matplotlib, and the Natural Language Toolkit (NLTK) for data analysis
  • Wrangle data and create DataFrames using pandas
  • Produce charts and data visualizations using time-series datasets
  • Discover relationships and how to join data together using SQL
  • Use NLP techniques to work with unstructured data to create sentiment analysis models
  • Discover patterns in real-world datasets that provide accurate insights

Who this book is for

This book is for aspiring data analysts and data scientists looking for hands-on tutorials and real-world examples to understand data analysis concepts using SQL, Python, and Jupyter Notebook. Anyone looking to evolve their skills to become data-driven personally and professionally will also find this book useful. No prior knowledge of data analysis or programming is required to get started with this book.

About the Author

Marc Wintjen is a Risk Analytics Architect at Bloomberg LP with over 20 years of professional experience. An evangelist for data literacy, he's known as the Data Mensch for helping others make data-driven decisions. His passion for all things data has evolved from SQL and Data Warehousing to Big Data Analytics and Data Visualizations.

Table of Contents

  1. Fundamentals of data analysis
  2. Overview of Python and Installation of Jupyter Notebook
  3. Getting Started with NumPy
  4. Creating your first Pandas DataFrame
  5. Gathering and Loading Data in Python
  6. Visualizing and working with time series data
  7. Exploring Cleaning, Refining and Blending Datasets
  8. Understanding Joins, Relationships and Data Aggregates
  9. Plotting, Visualization and Storytelling
  10. Exploring Text Data and Unstructured Data
  11. Practical Sentiment Analysis
  12. Discovering Patterns in Data and providing insights
