Introduction to Big Data (MOOC Edition) (大數據概論(慕課版))

He Ning (賀寧), Ding Hui (丁慧)

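  • Publisher: Posts & Telecom Press (人民郵電)
  • Publication date: 2026-01-01
  • Price: $299
  • Language: Simplified Chinese
  • ISBN: 7115664358
  • ISBN-13: 9787115664358
  • Related categories: Big Data
  • Stocked to order after purchase (approx. 4 to 6 weeks)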

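Product Description

Summary

Aimed at the development needs of students in big data, cloud computing, software technology, information management, and related majors, this book gives a systematic and comprehensive introduction to the basic knowledge and skills of data science and big data technology. It covers the foundations of data science, an overview of big data, industry applications of big data, basic storage methods for big data, the fundamentals of big data technology, the Hadoop distributed platform, big data analysis, data visualization, and the value of big data. Through easy-to-follow case studies, the book simplifies dry technical and mathematical material to a level suited to students at higher vocational colleges, so that readers gain a sense of achievement as they learn, which in turn stimulates their interest in the subject.

The book covers both the basics of data science and big data and a first look at each stage of big data technology. Each chapter also includes concrete applications of big data theory and technology in representative industries, helping readers form an initial big data mindset and lay a solid foundation for further study of big data technology.

The book is intended for first- and second-year students in big data related majors at higher vocational colleges, as well as students with a strong interest in big data.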

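About the Author

He Ning (賀寧) teaches and conducts research in the Big Data Technology and Application program. He Ning's work includes: leading the establishment of one of China's first Big Data Technology and Application programs; leading the writing of the beginner-level textbook for the Alibaba Big Data Analysis and Application 1+X certificate (in press); leading one General Program project of the Natural Science Foundation of the Jiangsu Higher Education Institutions; leading one project in the eighth round of the Changzhou Longcheng Talent (龍城英才) program; taking part in building two courses for the national resource library; taking part in the national 13th Five-Year Plan research project "R&D of Key Products and Equipment for Upgrading Community Parking Facilities"; serving as chief editor of the published book Big Data Visualization Technology (《大數據可視化技術》); personally applying for 13 software copyrights; and supervising students in applying for 2 utility model patents and 15 software copyrights.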

Table of Contents

Chapter 1 Introduction and Data Science
Case description: Which class has the better exam results?
1.1 A Brief Introduction to Data Science
1.1.1 Origins of Data Science
1.1.2 Basic Content of Data Science
1.2 Application Areas of Data Science
1.2.1 In Computer Programming
1.2.2 In Databases
1.2.3 Data Processing Workflow
1.2.4 Industries Where Data Science Is Applied
1.3 Data Science and Statistics
1.3.1 Statistics
1.3.2 Probability Theory
1.3.3 Data
Advanced case: Olympic Games data analysis
Chapter summary
Practice exercises

Chapter 2 Overview of Big Data
Case description: Data governance practice at State Grid Corporation of China
2.1 Getting to Know Big Data
2.2 The Concept of Big Data
2.3 Characteristics of Big Data
2.3.1 Volume
2.3.2 Variety
2.3.3 Velocity
2.3.4 Value
2.3.5 Veracity
2.4 Big Data Storage
2.4.1 The File Interaction Period
2.4.2 Scalability Solutions
2.4.3 Breakthroughs in Capacity and Performance
2.4.4 The Rise of Cloud File Systems
2.5 Data Types
2.5.1 Structured Data
2.5.2 Semi-structured Data
2.5.3 Unstructured Data
2.6 Big Data Technologies and Applications
2.6.1 Big Data Technologies
2.6.2 Applications of Big Data
2.6.3 Challenges of Big Data
2.7 The Value of Big Data
Advanced case: Smart agriculture big data
Chapter summary
Practice exercises

Chapter 3 Big Data Collection and Preprocessing
Case description: How should aircraft armor plating be reinforced to raise soldiers' survival rates?
3.1 Sources of Big Data
3.1.1 Transaction Data
3.1.2 Mobile Communication Data
3.1.3 Human-Generated Data
3.1.4 Machine and Sensor Data
3.1.5 Open Internet Data
3.1.6 Common Data Platforms
3.2 Big Data Collection
3.2.1 Data Collection Devices
3.2.2 Log Collection and User Behavior Path Analysis
3.2.3 Big Data Collection Technologies
3.3 Overview of Data Preprocessing
3.3.1 Data Cleaning
3.3.2 Data Integration
3.3.3 Data Reduction
Advanced case: Data collection with web crawlers
Chapter summary
Practice exercises

Chapter 4 Big Data Storage
Case description: HBase in practice at Alibaba Group
4.1 Traditional Storage Technologies
4.1.1 Concept and Role of Storage
4.1.2 Storage Architecture
4.1.3 Classification of Storage Solutions
4.2 Database Technology
4.2.1 The Concept of a Database
4.2.2 Development of Database Technology
4.2.3 Database Classification
4.2.4 Database Architecture
4.3 Cloud Storage
4.3.1 Concept and Characteristics of Cloud Storage
4.3.2 Structural Model of Cloud Storage
4.3.3 Application Models of Cloud Storage
4.4 Emerging Data Storage Technologies
4.4.1 Emerging Database Technologies
4.4.2 Future Trends in Databases
4.4.3 Big Data Storage
4.4.4 Data Centers and Data Warehouses
Advanced case: Analysis of personal cloud storage products in China and abroad
Chapter summary
Practice exercises

Chapter 5 Big Data Computing Platforms
Case description: Amazon's big data computing platform
5.1 Cloud Computing Basics
5.1.1 Definition and Concepts of Cloud Computing
5.1.2 Types of Cloud Computing Platforms
5.1.3 Cloud Computing Infrastructure
5.1.4 Service Types of Cloud Platforms
5.1.5 Open-Source Projects and Commercial Cloud Platforms
5.2 Big Data Storage and Management Technologies
5.2.1 Diversity of Big Data Storage
5.2.2 Big Data Management Technologies
5.2.3 Key Technologies for Big Data Processing
5.3 The Hadoop Distributed Platform
5.3.1 History of Hadoop
5.3.2 The Hadoop Ecosystem
5.3.3 HDFS
5.3.4 MapReduce
5.3.5 Other Hadoop Components
5.3.6 Setting Up a Hadoop Platform
5.4 Spark
5.4.1 Spark Platform Architecture
5.4.2 Advantages of Spark
Advanced case: Computing temperature changes in a region with a big data cluster
Chapter summary
Practice exercises

Chapter 6 Big Data Analysis and Mining
Case description: Job-search website data analysis using Excel
6.1 Concepts of Data Analysis and Data Mining
6.1.1 Data Analysis
6.1.2 Data Mining
6.2 Big Data Analysis Methods
6.2.1 Big Data Collection Technologies
6.2.2 Big Data Preprocessing
6.2.3 Big Data Storage and Management Technologies
6.2.4 Big Data Analysis and Mining Technologies
6.2.5 Big Data Visualization Technologies
6.3 Big Data Analysis Tools
6.3.1 Traditional Statistical Analysis Tools
6.3.2 Newer Practical Tools for Big Data Analysis
6.4 Data Analysis with pandas
6.4.1 Data Objects
6.4.2 Reading Files
6.4.3 Saving Files
6.4.4 Grouping and Aggregation
6.5 Data Mining
6.5.1 Classification
6.5.2 Clustering
6.5.3 Association Rules
6.6 Machine Learning Based on Big Data
Advanced case: Airline customer analysis using the K-Means clustering algorithm
Chapter summary
Practice exercises

Chapter 7 Data Visualization
Case description: Data visualization with Excel
7.1 The Beauty of Data Visualization
7.2 The Role of Data Visualization
7.2.1 Types of Data Visualization
7.2.2 The Data Visualization Process
7.2.3 Principles of Data Visualization
7.2.4 The Role of Data Visualization
7.3 Classic Charts for Data Visualization
7.3.1 Line Charts and Bar Charts
7.3.2 Pie Charts and Donut Charts
7.3.3 Radar Charts and Bubble Charts
7.3.4 Word Clouds and Maps
7.4 A Collection of Practical Data Visualization Tools
7.4.1 Excel
7.4.2 Tableau
7.4.3 ECharts
7.4.4 Data Visualization Programming Tools
7.5 Implementing Data Visualization: Matplotlib as an Example
7.5.1 Drawing Bar Charts with Matplotlib
7.5.2 Drawing Line Charts with Matplotlib
7.5.3 Drawing Pie Charts with Matplotlib
7.5.4 Drawing Scatter Plots with Matplotlib
7.5.5 Drawing Subplots with Matplotlib
Advanced case: Analysis of 2020 gross domestic product (GDP)
Chapter summary
Practice exercises
