Data Mining in Large Sets of Complex Data (Paperback)

Robson Leonardo Ferreira Cordeiro, Christos Faloutsos, Caetano Traina Júnior

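  • Publisher: Springer
  • Publication date: 2013-01-11
  • List price: $2,360
  • VIP price: $2,242 (5% off)
  • Language: English
  • Pages: 116
  • Binding: Paperback
  • ISBN: 1447148894
  • ISBN-13: 9781447148890
  • Related categories: Data-mining
  • Overseas special-order title (must be checked out separately)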

Product description

The amount and complexity of the data gathered by today's enterprises are increasing at an exponential rate. Consequently, the analysis of Big Data is now a central challenge in Computer Science, especially for complex data. For example, given a satellite image database containing tens of Terabytes, how can we find regions in order to identify native rainforests, deforestation or reforestation? Can it be done automatically? Based on the work discussed in this book, the answer to both questions is a sound “yes”, and the results can be obtained in just minutes. In fact, results that used to require days or weeks of hard work from human specialists can now be obtained in minutes with high precision.

Data Mining in Large Sets of Complex Data discusses new algorithms that take steps forward from traditional data mining (especially for clustering) by considering large, complex datasets. Other works usually focus on one aspect, either data size or complexity. This work considers both: it enables mining complex data from high-impact applications, such as breast cancer diagnosis, region classification in satellite images, assistance with climate change forecasting, and recommendation systems for the Web and social networks; the data are large, on the Terabyte scale rather than the usual Gigabyte scale; and very accurate results are found in just minutes. Thus, it provides a crucial and well-timed contribution toward the creation of real-time applications that deal with Big Data of high complexity, in which mining on the fly can make an immeasurable difference, such as supporting cancer diagnosis or detecting deforestation.
