Data Exploration Using Example-Based Methods (Synthesis Lectures on Data Management)

Matteo Lissandrini, Davide Mottin, Themis Palpanas, Yannis Velegrakis

  • Publisher: Morgan & Claypool
  • Publication date: 2018-11-27
  • List price: $2,830
  • VIP price: $2,689 (5% off)
  • Language: English
  • Pages: 164
  • Binding: Hardcover
  • ISBN: 1681734575
  • ISBN-13: 9781681734576
  • Related category: Data-mining
  • Overseas special-order title (requires separate checkout)

Product description

Data usually comes in a plethora of formats and dimensions, which makes information extraction and exploration challenging. Thus, being able to perform exploratory analyses of the data with the intent of having an immediate glimpse of some of the data properties is becoming crucial. Exploratory analyses should be simple enough to avoid complicated declarative languages (such as SQL) and mechanisms, while at the same time retaining the flexibility and expressiveness of such languages. Recently, we have witnessed a rediscovery of the so-called example-based methods, in which the user, or analyst, circumvents query languages by using examples as input. An example is a representative of the intended results or, in other words, an item from the result set. Example-based methods exploit inherent characteristics of the data to infer the results that the user has in mind but may not be able to (easily) express. They can be useful in cases where a user is looking for information in an unfamiliar dataset, when they are performing a particularly challenging task like finding duplicate items, or when they are simply exploring the data. In this book, we present an overview of the main methods for exploratory analysis, with a particular focus on example-based methods. We show how different data types require different techniques and present algorithms that are specifically designed for relational, textual, and graph data. The book also presents the challenges and new frontiers of machine learning in online settings that have recently attracted the attention of the database community. The book concludes with a vision for further research and applications in this area.
