Semi-Supervised Learning (Hardcover)

Olivier Chapelle, Bernhard Schölkopf, Alexander Zien

  • Publisher: MIT
  • Publication Date: 2006-09-22
  • List Price: $760
  • Member Price: $722 (95% of list)
  • Language: English
  • Pages: 528
  • Binding: Hardcover
  • ISBN: 0262033585
  • ISBN-13: 9780262033589
  • Related Categories: Artificial Intelligence, Data Science, Machine Learning
  • Currently unavailable for order

Product Description

Description

In the field of machine learning, semi-supervised learning (SSL) occupies the middle ground between supervised learning (in which all training examples are labeled) and unsupervised learning (in which no labeled data are given). Interest in SSL has increased in recent years, particularly because of application domains in which unlabeled data are plentiful, such as images, text, and bioinformatics. This first comprehensive overview of SSL presents state-of-the-art algorithms, a taxonomy of the field, selected applications, benchmark experiments, and perspectives on ongoing and future research.

Semi-Supervised Learning first presents the key assumptions and ideas underlying the field: smoothness, cluster or low-density separation, manifold structure, and transduction. The core of the book is the presentation of SSL methods, organized according to algorithmic strategies. After an examination of generative models, the book describes algorithms that implement the low-density separation assumption, graph-based methods, and algorithms that perform two-step learning. The book then discusses SSL applications and offers guidelines for SSL practitioners by analyzing the results of extensive benchmark experiments. Finally, the book looks at interesting directions for SSL research. The book closes with a discussion of the relationship between semi-supervised learning and transduction.
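The graph-based strategy described above can be illustrated with a short sketch (not taken from the book; the point layout, `sigma`, and iteration count are illustrative assumptions): labeled points repeatedly diffuse their labels to unlabeled neighbors through a similarity graph, while the known labels stay clamped.

```python
# Illustrative sketch of graph-based label propagation, one of the SSL
# strategies the book surveys. All names and parameters here are assumptions
# for the example, not the book's notation.
import numpy as np

def propagate_labels(X, y, sigma=1.0, n_iter=100):
    """X: (n, d) points; y: (n,) integer labels, with -1 marking unlabeled."""
    # RBF similarity between every pair of points.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    P = W / W.sum(axis=1, keepdims=True)   # row-normalized transition matrix

    classes = np.unique(y[y >= 0])
    labeled = y >= 0
    F = np.zeros((len(X), len(classes)))   # soft label matrix, one row per point
    F[labeled] = np.eye(len(classes))[y[labeled]]
    for _ in range(n_iter):
        F = P @ F                          # diffuse labels along the graph
        F[labeled] = np.eye(len(classes))[y[labeled]]  # clamp known labels
    return classes[F.argmax(axis=1)]

# Two clusters, one labeled seed each; the rest are unlabeled (-1).
X = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
              [3.0, 3.0], [3.1, 2.9], [2.9, 3.1]])
y = np.array([0, -1, -1, 1, -1, -1])
print(propagate_labels(X, y))  # each unlabeled point inherits its cluster's seed label
```

The clamping step is what distinguishes this from plain clustering: the labeled examples anchor the diffusion, which is the smoothness assumption in action.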

Olivier Chapelle and Alexander Zien are Research Scientists and Bernhard Schölkopf is Professor and Director at the Max Planck Institute for Biological Cybernetics in Tübingen. Schölkopf is coauthor of Learning with Kernels (MIT Press, 2002) and is a coeditor of Advances in Kernel Methods: Support Vector Learning (1998), Advances in Large-Margin Classifiers (2000), and Kernel Methods in Computational Biology (2004), all published by The MIT Press.

Schölkopf was also Program Chair of the 2005 NIPS Conference.

 

Table of Contents

 Series Foreword xi
 
 Preface xiii
 
1. Introduction to Semi-Supervised Learning
 
1.1 Supervised, Unsupervised, and Semi-Supervised Learning 1
 
1.2 When Can Semi-Supervised Learning Work? 4
 
1.3 Classes of Algorithms and Organization of This Book 8
 
I. Generative Models 13
 
2. A Taxonomy for Semi-Supervised Learning Methods
Matthias Seeger 15
 
2.1 The Semi-Supervised Learning Problem 15
 
2.2 Paradigms for Semi-Supervised Learning 17
 
2.3 Examples 22
 
2.4 Conclusions 31
 
3. Semi-Supervised Text Classification Using EM
Kamal Nigam, Andrew McCallum and Tom Mitchell 33
 
3.1 Introduction 33
 
3.2 A Generative Model for Text 35
 
3.3 Experimental Results with Basic EM 41
 
3.4 Using a More Expressive Generative Model 43
 
3.5 Overcoming the Challenges of Local Maxima 49
 
3.6 Conclusions and Summary 54
 
4. Risks of Semi-Supervised Learning
Fabio Cozman and Ira Cohen 57
 
4.1 Do Unlabeled Data Improve or Degrade Classification Performance? 57
 
4.2 Understanding Unlabeled Data: Asymptotic Bias 59
 
4.3 The Asymptotic Analysis of Generative Semi-Supervised Learning 63
 
4.4 The Value of Labeled and Unlabeled Data 67
 
4.5 Finite Sample Effects 69
 
4.6 Model Search and Robustness 70
 
4.7 Conclusion 71
 
5. Probabilistic Semi-Supervised Clustering with Constraints
Sugato Basu, Mikhail Bilenko, Arindam Banerjee and Raymond J. Mooney 73
 
5.1 Introduction 74
 
5.2 HMRF Model for Semi-Supervised Clustering 75
 
5.3 HMRF-KMeans Algorithm 81
 
5.4 Active Learning for Constraint Acquisition 93
 
5.5 Experimental Results 96
 
5.6 Related Work 100
 
5.7 Conclusions 101
 
II. Low-Density Separation 103
 
6. Transductive Support Vector Machines
Thorsten Joachims 105
 
6.1 Introduction 105
 
6.2 Transductive Support Vector Machines 108
 
6.3 Why Use Margin on the Test Set? 111
 
6.4 Experiments and Applications of TSVMs 112
 
6.5 Solving the TSVM Optimization Problem 114
 
6.6 Connection to Related Approaches 116
 
6.7 Summary and Conclusions 116
 
7. Semi-Supervised Learning Using Semi-Definite Programming
Tijl De Bie and Nello Cristianini 119
 
7.1 Relaxing SVM transduction 119
 
7.2 An Approximation for Speedup 126
 
7.3 General Semi-Supervised Learning Settings 128
 
7.4 Empirical Results 129
 
7.5 Summary and Outlook 133
 
 Appendix:
The Extended Schur Complement Lemma 134
 
8. Gaussian Processes and the Null-Category Noise Model
Neil D. Lawrence and Michael I. Jordan 137
 
8.1 Introduction 137
 
8.2 The Noise Model 141
 
8.3 Process Model and the Effect of the Null-Category 143
 
8.4 Posterior Inference and Prediction 145
 
8.5 Results 147
 
8.6 Discussion 149
 
9. Entropy Regularization
Yves Grandvalet and Yoshua Bengio 151
 
9.1 Introduction 151
 
9.2 Derivation of the Criterion 152
 
9.3 Optimization Algorithms 155
 
9.4 Related Methods 158
 
9.5 Experiments 160
 
9.6 Conclusion 166
 
 Appendix
Proof of Theorem 9.1 166
 
10. Data-Dependent Regularization
Adrian Corduneanu and Tommi S. Jaakkola 169
 
10.1 Introduction 169
 
10.2 Information Regularization on Metric Spaces 174
 
10.3 Information Regularization and Relational Data 182
 
10.4 Discussion 189
 
III. Graph-Based Models 191
 
11. Label Propagation and Quadratic Criterion
Yoshua Bengio, Olivier Delalleau and Nicolas Le Roux 193
 
11.1 Introduction 193
 
11.2 Label Propagation on a Similarity Graph 194
 
11.3 Quadratic Cost Criterion 198
 
11.4 From Transduction to Induction 205
 
11.5 Incorporating Class Prior Knowledge 205
 
11.6 Curse of Dimensionality for Semi-Supervised Learning 206
 
11.7 Discussion 215
 
12. The Geometric Basis of Semi-Supervised Learning
Vikas Sindhwani, Misha Belkin and Partha Niyogi 217
 
12.1 Introduction 217
 
12.2 Incorporating Geometry in Regularization 220
 
12.3 Algorithms 224
 
12.4 Data-Dependent Kernels for Semi-Supervised Learning 229
 
12.5 Linear Methods for Large-Scale Semi-Supervised Learning 231
 
12.6 Connections to Other Algorithms and Related Work 232
 
12.7 Future Directions 234
 
13. Discrete Regularization
Dengyong Zhou and Bernhard Schölkopf 237
 
13.1 Introduction 237
 
13.2 Discrete Analysis 239
 
13.3 Discrete Regularization 245
 
13.4 Conclusion 249
 
14. Semi-Supervised Learning with Conditional Harmonic Mixing
Christopher J. C. Burges and John C. Platt 251
 
14.1 Introduction 251
 
14.2 Conditional Harmonic Mixing 255
 
14.3 Learning in CHM Models 256
 
14.4 Incorporating Prior Knowledge 261
 
14.5 Learning the Conditionals 261
 
14.6 Model Averaging 262
 
14.7 Experiments 263
 
14.8 Conclusions 273
 
IV. Change of Representation 275
 
15. Graph Kernels by Spectral Transforms
Xiaojin Zhu, Jaz Kandola, John Lafferty and Zoubin Ghahramani 277
 
15.1 The Graph Laplacian 278
 
15.2 Kernels by Spectral Transforms 280
 
15.3 Kernel Alignment 281
 
15.4 Optimizing Alignment Using QCQP for Semi-Supervised Learning 282
 
15.5 Semi-Supervised Kernels with Order Constraints 283
 
15.6 Experimental Results 285
 
15.7 Conclusion 289
 
16. Spectral Methods for Dimensionality Reduction
Lawrence K. Saul, Kilian Weinberger, Fei Sha and Jihun Ham 293
 
16.1 Introduction 293
 
16.2 Linear Methods 295
 
16.3 Graph-Based Methods 297
 
16.4 Kernel Methods 303
 
16.5 Discussion 306
 
17. Modifying Distances
Alon Orlitsky and Sajama 309
 
17.1 Introduction 309
 
17.2 Estimating DBD Metrics 312
 
17.3 Computing DBD Metrics 321
 
17.4 Semi-Supervised Learning Using Density-Based Metrics 327
 
17.5 Conclusions and Future Work 329
 
V. Semi-Supervised Learning in Practice 331
 
18. Large-Scale Algorithms
Olivier Delalleau, Yoshua Bengio and Nicolas Le Roux 333
 
18.1 Introduction 333
 
18.2 Cost Approximations 334
 
18.3 Subset Selection 337
 
18.4 Discussion 340
 
19. Semi-Supervised Protein Classification Using Cluster Kernels
Jason Weston, Christina Leslie, Eugene Ie and William S. Noble 343
 
19.1 Introduction 343
 
19.2 Representation and Kernels for Protein Sequences 345
 
19.3 Semi-Supervised Kernels for Protein Sequences 348
 
19.4 Experiments 352
 
19.5 Discussion 358
 
20. Prediction of Protein Function from Networks
Hyunjung Shin and Koji Tsuda 361
 
20.1 Introduction 361
 
20.2 Graph-Based Semi-Supervised Learning 364
 
20.3 Combining Multiple Graphs 366
 
20.4 Experiments on Function Prediction of Proteins 369
 
20.5 Conclusion and Outlook 374
 
21. Analysis of Benchmarks 377
 
21.1 The Benchmark 377
 
21.2 Application of SSL Methods 383
 
21.3 Results and Discussion 390
 
VI. Perspectives 395
 
22. An Augmented PAC Model for Semi-Supervised Learning
Maria-Florina Balcan and Avrim Blum 397
 
22.1 Introduction 398
 
22.2 A Formal Framework 400
 
22.3 Sample Complexity Results 403
 
22.4 Algorithmic Results 412
 
22.5 Related Models and Discussion 416
 
23. Metric-Based Approaches for Semi-Supervised Regression and Classification
Dale Schuurmans, Finnegan Southey, Dana Wilkinson and Yuhong Guo 421
 
23.1 Introduction 421
 
23.2 Metric Structure of Supervised Learning 423
 
23.3 Model Selection 426
 
23.4 Regularization 436
 
23.5 Classification 445
 
23.6 Conclusion 449
 
24. Transductive Inference and Semi-Supervised Learning
Vladimir Vapnik 453
 
24.1 Problem Settings 453
 
24.2 Problem of Generalization in Inductive and Transductive Inference 455
 
24.3 Structure of the VC Bounds and Transductive Inference 457
 
24.4 The Symmetrization Lemma and Transductive Inference 458
 
24.5 Bounds for Transductive Inference 459
 
24.6 The Structural Risk Minimization Principle for Induction and Transduction 460
 
24.7 Combinatorics in Transductive Inference 462
 
24.8 Measures of Size of Equivalence Classes 463
 
24.9 Algorithms for Inductive and Transductive SVMs 465
 
24.10 Semi-Supervised Learning 470
 
24.11 Conclusion: Transductive Inference and the New Problems of Inference 470
 
24.12 Beyond Transduction: Selective Inference 471
 
25. A Discussion of Semi-Supervised Learning and Transduction 473
 
 References 479
 
 Notation and Symbols 499
 
 Contributors 503
 
 Index 509
 
