Introduction to Data Mining (English Edition)
By Pang-Ning Tan, Michael Steinbach, and Vipin Kumar (USA)
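- Publisher: China Machine Press (機械工業)
- Publication date: 2010-09-01
- List price: $354
- Sale price: $280 (21% off)
- Language: English
- Pages: 769
- ISBN: 7111316703
- ISBN-13: 9787111316701
- Related categories: Data-mining
- Availability: ships immediately (stock < 4)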
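Product Description
This book offers a comprehensive introduction to the theory and methods of data mining, with an emphasis on applying data mining to practical problems. It draws on many disciplines and is broadly applicable. The book covers five topics: data, classification, association analysis, clustering, and anomaly detection. Except for anomaly detection, each topic is covered in two chapters: the first presents basic concepts, representative algorithms, and evaluation techniques, and the second discusses advanced concepts and algorithms in greater depth. The aim is to give readers a thorough grounding in the foundations of data mining while also introducing important advanced topics. The book includes numerous figures and tables, integrated examples, and a rich set of exercises.
- No database background is required, and only minimal background in statistics or mathematics is needed.
- Extensive supporting materials are available online, including PPT slides, solutions to exercises, and data sets.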
Table of Contents
Preface
1 Introduction
1.1 What Is Data Mining?
1.2 Motivating Challenges
1.3 The Origins of Data Mining
1.4 Data Mining Tasks
1.5 Scope and Organization of the Book
1.6 Bibliographic Notes
1.7 Exercises
2 Data
2.1 Types of Data
2.1.1 Attributes and Measurement
2.1.2 Types of Data Sets
2.2 Data Quality
2.2.1 Measurement and Data Collection Issues
2.2.2 Issues Related to Applications
2.3 Data Preprocessing
2.3.1 Aggregation
2.3.2 Sampling
2.3.3 Dimensionality Reduction
2.3.4 Feature Subset Selection
2.3.5 Feature Creation
2.3.6 Discretization and Binarization
2.3.7 Variable Transformation
2.4 Measures of Similarity and Dissimilarity
2.4.1 Basics
2.4.2 Similarity and Dissimilarity between Simple Attributes
2.4.3 Dissimilarities between Data Objects
2.4.4 Similarities between Data Objects
2.4.5 Examples of Proximity Measures
2.4.6 Issues in Proximity Calculation
2.4.7 Selecting the Right Proximity Measure
2.5 Bibliographic Notes
2.6 Exercises
3 Exploring Data
3.1 The Iris Data Set
3.2 Summary Statistics
3.2.1 Frequencies and the Mode
3.2.2 Percentiles
3.2.3 Measures of Location: Mean and Median
3.2.4 Measures of Spread: Range and Variance
3.2.5 Multivariate Summary Statistics
3.2.6 Other Ways to Summarize the Data
3.3 Visualization
3.3.1 Motivations for Visualization
3.3.2 General Concepts
3.3.3 Techniques
3.3.4 Visualizing Higher-Dimensional Data
3.3.5 Do's and Don'ts
3.4 OLAP and Multidimensional Data Analysis
3.4.1 Representing Iris Data as a Multidimensional Array
3.4.2 Multidimensional Data: The General Case
3.4.3 Analyzing Multidimensional Data
3.4.4 Final Comments on Multidimensional Data Analysis
3.5 Bibliographic Notes
3.6 Exercises
4 Classification: Basic Concepts, Decision Trees, and Model Evaluation
4.1 Preliminaries
4.2 General Approach to Solving a Classification Problem
4.3 Decision Tree Induction
4.3.1 How a Decision Tree Works
4.3.2 How to Build a Decision Tree
4.3.3 Methods for Expressing Attribute Test Conditions
4.3.4 Measures for Selecting the Best Split
4.3.5 Algorithm for Decision Tree Induction
4.3.6 An Example: Web Robot Detection
4.3.7 Characteristics of Decision Tree Induction
4.4 Model Overfitting
4.4.1 Overfitting Due to Presence of Noise
4.4.2 Overfitting Due to Lack of Representative Samples
4.4.3 Overfitting and the Multiple Comparison Procedure
4.4.4 Estimation of Generalization Errors
4.4.5 Handling Overfitting in Decision Tree Induction
4.5 Evaluating the Performance of a Classifier
4.5.1 Holdout Method
4.5.2 Random Subsampling
4.5.3 Cross-Validation
4.5.4 Bootstrap
4.6 Methods for Comparing Classifiers
4.6.1 Estimating a Confidence Interval for Accuracy
4.6.2 Comparing the Performance of Two Models
4.6.3 Comparing the Performance of Two Classifiers
4.7 Bibliographic Notes
4.8 Exercises
5 Classification: Alternative Techniques
6 Association Analysis: Basic Concepts and Algorithms
7 Association Analysis: Advanced Concepts
8 Cluster Analysis: Basic Concepts and Algorithms
9 Cluster Analysis: Additional Issues and Algorithms
10 Anomaly Detection
Appendix A Linear Algebra
Appendix B Dimensionality Reduction
Appendix C Probability and Statistics
Appendix D Regression
Appendix E Optimization
Author Index
Subject Index
Copyright Permissions
