Big Data Preprocessing: Enabling Smart Data
Tentative Chinese title: 大數據預處理:啟用智慧數據

Luengo, Julián, García-Gil, Diego, Ramírez-Gallego, Sergio

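  • Publisher: Springer
  • Publication date: 2021-03-17
  • Price: $2,360
  • VIP price: $2,242 (5% off)
  • Language: English
  • Pages: 186
  • Binding: Quality Paper - also called trade paper
  • ISBN: 3030391078
  • ISBN-13: 9783030391072
  • Related category: Big Data
  • Overseas special-order title (must be checked out separately)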

Product description

This book offers a comprehensible overview of Big Data Preprocessing, including a formal description of each problem. It also focuses on the most relevant proposed solutions and illustrates actual implementations of algorithms that help the reader deal with these problems.

This book stresses the gap between big, raw data and the quality data that businesses demand. Such quality data is called Smart Data, and preprocessing is a key step in achieving it: this is where imperfections are handled, integration tasks are performed, and other processes are carried out to eliminate superfluous information. The authors present the concept of Smart Data through data preprocessing in Big Data scenarios and connect it with the emerging paradigms of IoT and edge computing, where the end points generate Smart Data without relying completely on the cloud.

Finally, this book presents some novel areas of study that are attracting growing attention in Big Data preprocessing. Specifically, it considers the relation with Deep Learning (a technique that also relies on large volumes of data), the difficulty of selecting and chaining the appropriate preprocessing techniques, and some other open problems.

Practitioners and data scientists who work in this field and want an introduction to preprocessing in large data volume scenarios will want to purchase this book. Researchers in this field who want to know which algorithms are currently implemented to support their investigations may also be interested in this book.

About the authors

Julián Luengo received the M.S. degree in computer science and the Ph.D. from the University of Granada, Granada, Spain, in 2006 and 2011, respectively. He is currently an Assistant Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada, Spain. His research interests include machine learning and data mining, data preparation in knowledge discovery and data mining, missing values, noisy data, data complexity and fuzzy systems. Dr. Luengo has received several awards and honors for his personal work and for his conference publications, such as the IFSA-EUSFLAT 2009 Best Student Paper Award. He is on the list of Highly Cited Researchers in the area of Computer Sciences (2015-2018) (Clarivate Analytics).
Diego García-Gil received the M.Sc. degree in computer science from the University of Granada, Granada, Spain, in 2015. He is currently pursuing the Ph.D. degree with the Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain. His current research interests include machine learning, data mining, data preprocessing and Big Data.
Sergio Ramírez-Gallego received the M.Sc. degree in computer science from the University of Jaén, Jaén, Spain, in 2012. He obtained the Ph.D. degree from the Department of Computer Science and Artificial Intelligence, University of Granada, Spain, in 2018. His current research interests include data mining, data preprocessing, big data, and cloud computing.
Salvador García received the B.S. and Ph.D. degrees in Computer Science from the University of Granada, Granada, Spain, in 2004 and 2008, respectively. He is currently an Associate Professor in the Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain. Dr. García has published more than 80 papers in international journals (more than 60 in Q1) and over 60 papers in international conference proceedings, with an h-index of 43 (data from Web of Science). He has organized several special sessions and workshops related to data preprocessing and evolutionary learning at conferences such as "Hybrid Intelligent Systems", "Intelligent Systems Design and Applications" and "International Joint-Conference of Neural Networks". He has been associated with the international program committees and organizing committees of several regular international conferences, including IEEE CEC, ICPR, ICDM and IJCAI. As for editorial activities, he has co-edited two special issues in international journals, is an associate editor of the journals "Information Fusion" (Elsevier), "Swarm and Evolutionary Computation" (Elsevier) and "AI Communications" (IOS Press), and is co-Editor in Chief of the international journal "Progress in Artificial Intelligence" (Springer). He is a co-author of the books "Data Preprocessing in Data Mining" and "Learning from Imbalanced Data Sets", both published by Springer. His research interests include data science, data preprocessing, Big Data, evolutionary learning, Deep Learning, metaheuristics and biometrics.

Francisco Herrera (SM'15) received his M.Sc. in Mathematics in 1988 and Ph.D. in Mathematics in 1991, both from the University of Granada, Spain. He is currently a Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada and Director of the DaSCI Institute (Andalusian Research Institute in Data Science and Computational Intelligence). He has supervised 44 Ph.D. students. He has published more than 400 journal papers, receiving more than 66000 citations (Google Scholar, h-index 132). He is co-author of the books "Genetic Fuzzy Systems" (World Scientific, 2001), "Data Preprocessing in Data Mining" (Springer, 2015), "The 2-tuple Linguistic Model. Computing with Words in Decision Making" (Springer, 2015), "Multilabel Classification. Problem analysis, metrics and techniques" (Springer, 2016), "Multiple Instance Learning. Foundations and Algorithms" (Springer, 2016) and "Learning from Imbalanced Data Sets" (Springer, 2018). He currently acts as Editor in Chief of the international journals "Information Fusion" (Elsevier) and "Progress in Artificial Intelligence" (Springer). He is an editorial board member of a dozen journals.
