Data Wrangling with SQL: A hands-on guide to manipulating, wrangling, and engineering data using SQL

Kandarpa, Raghav, Saxena, Shivangi

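  • Publisher: Packt Publishing
  • Publication date: 2023-07-31
  • List price: $1,490
  • VIP price: $1,416 (5% off)
  • Language: English
  • Pages: 350
  • Binding: Quality Paper (also called trade paper)
  • ISBN: 183763002X
  • ISBN-13: 9781837630028
  • Related category: SQL
  • Ordered from the supplier once your order is placed (about 3 to 4 weeks)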

Product Description

Become a data wrangling expert and make well-informed decisions by effectively utilizing and analyzing raw unstructured data in a systematic manner

Purchase of the print or Kindle book includes a free PDF eBook

Key Features

  • Implement query optimization during data wrangling using the SQL language with practical use cases
  • Master data cleaning, handle date functions and NULL values, and write subqueries and window functions
  • Practice self-assessment questions for SQL-based interviews and real-world case study rounds
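
As a rough illustration of the NULL handling and window functions mentioned above, here is a minimal sketch using Python's built-in sqlite3 module. The `orders` table, its columns, and the sample rows are hypothetical and not taken from the book; it also assumes the bundled SQLite is version 3.25 or later, which is when window functions were added.

```python
import sqlite3

# Hypothetical in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_date TEXT, amount REAL);
    INSERT INTO orders VALUES
        ('2023-01-01', 100.0),
        ('2023-01-02', NULL),
        ('2023-01-03', 50.0);
""")

# COALESCE replaces a missing amount with 0.0, and SUM(...) OVER (ORDER BY ...)
# is a window function that yields a running total across the ordered rows.
rows = conn.execute("""
    SELECT order_date,
           COALESCE(amount, 0.0) AS amount_clean,
           SUM(COALESCE(amount, 0.0)) OVER (ORDER BY order_date) AS running_total
    FROM orders
    ORDER BY order_date
""").fetchall()

for row in rows:
    print(row)
```

Whether a missing amount should be treated as zero, dropped, or imputed depends on the dataset; zero is used here only to keep the sketch short.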

Book Description

The amount of data generated continues to grow rapidly, making it increasingly important for businesses to be able to wrangle this data and understand it quickly and efficiently. Although data wrangling can be challenging, with the right tools and techniques you can efficiently handle enormous amounts of unstructured data.

The book starts by introducing you to the basics of SQL, focusing on the core principles and techniques of data wrangling. You’ll then explore advanced SQL concepts like aggregate functions, window functions, CTEs, and subqueries that are very popular in the business world. The next set of chapters will walk you through the functions within SQL queries that slow down data transformation and help you tell a good query from a bad one. You’ll also learn how data wrangling and data science go hand in hand. The book is filled with datasets and practical examples to help you understand the concepts thoroughly, along with best practices to guide you at every stage of data wrangling.

By the end of this book, you’ll be equipped with essential techniques and best practices for data wrangling, and will know how to use clean, standardized data models to make informed decisions, helping businesses avoid costly mistakes.

What you will learn

  • Build time series models using data wrangling
  • Discover data wrangling best practices as well as tips and tricks
  • Find out how to use subqueries, window functions, CTEs, and aggregate functions
  • Handle missing data, data types, date formats, and redundant data
  • Build clean and efficient data models using data wrangling techniques
  • Remove outliers and calculate standard deviation to gauge the spread of data
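
One possible way to combine a CTE, aggregate functions, and a standard-deviation-based outlier filter is sketched below with Python's built-in sqlite3 module. The `readings` table and its values are hypothetical, and the two-standard-deviation cutoff is an assumption chosen for illustration, not the book's method. SQLite has no built-in standard deviation aggregate, so the sketch derives the population variance from averages and compares squared distances, which avoids needing a square root function.

```python
import sqlite3

# Hypothetical sample data: eight ordinary readings and one extreme value.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE readings (id INTEGER PRIMARY KEY, value REAL);
    INSERT INTO readings (value)
    VALUES (10), (12), (11), (13), (9), (11), (10), (12), (100);
""")

# The CTE computes the mean and the population variance (E[x^2] - E[x]^2).
# A row is kept when its squared distance from the mean is at most
# 4 * variance, i.e. when it lies within 2 standard deviations of the mean.
QUERY = """
WITH stats AS (
    SELECT AVG(value) AS mean,
           AVG(value * value) - AVG(value) * AVG(value) AS variance
    FROM readings
)
SELECT r.id, r.value
FROM readings AS r
CROSS JOIN stats AS s
WHERE (r.value - s.mean) * (r.value - s.mean) <= 4 * s.variance
ORDER BY r.id
"""

kept = conn.execute(QUERY).fetchall()
print(kept)
```

With small samples a single extreme value inflates both the mean and the variance, so a stricter cutoff such as three standard deviations may fail to flag it; robust alternatives like the interquartile range are often preferred in practice.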

Who this book is for

This book is for data analysts looking for effective hands-on methods to manage and analyze large volumes of data using SQL. The book will also benefit data scientists, product managers, and anyone whose role involves gathering data insights and developing business strategies using SQL. If you are new to or have basic knowledge of SQL and databases and an understanding of data cleaning practices, this book will give you further insights into how you can apply SQL concepts to build clean, standardized data models for accurate analysis.

Table of Contents

  1. Database Introduction
  2. Data Profiling and Preparation before Data Wrangling
  3. Data Wrangling on String Data Types
  4. Data Wrangling on the DATE Data Type
  5. Handling NULL values
  6. Pivoting Data Using SQL
  7. Subqueries and CTEs
  8. Aggregate Functions
  9. SQL Window Functions
  10. Optimizing Query Performance
  11. Descriptive Statistics with SQL
  12. Time Series with SQL
  13. Outlier Detection
