Extreme Value Theory-Based Methods for Visual Recognition (Synthesis Lectures on Computer Vision)
Walter J. Scheirer
A common feature of many approaches to modeling sensory statistics is an emphasis on capturing the "average." From early representations in the brain to highly abstracted class categories in machine learning for classification tasks, central-tendency models based on the Gaussian distribution are a seemingly natural and obvious choice for modeling sensory data. However, insights from neuroscience, psychology, and computer vision suggest an alternative strategy: preferentially focusing representational resources on the extremes of the distribution of sensory inputs. The notion of treating extrema near a decision boundary as features is not necessarily new, but a comprehensive statistical theory of recognition based on extrema is only now emerging in the computer vision literature. This book begins by introducing statistical Extreme Value Theory (EVT) for visual recognition. In contrast to central-tendency modeling, it is hypothesized that distributions near decision boundaries form a more powerful model for recognition tasks by focusing coding resources on the data that are arguably the most diagnostic. EVT has several important properties: strong statistical grounding, better modeling accuracy near decision boundaries than Gaussian modeling, the ability to model asymmetric decision boundaries, and accurate prediction of the probability of an event beyond our experience. The second part of the book uses the theory to describe a new class of machine learning algorithms for decision making that are a measurable advance beyond the state of the art. These include methods for post-recognition score analysis, information fusion, multi-attribute spaces, and calibration of supervised machine learning algorithms.