Product Description
This SpringerBrief introduces FasTensor, a powerful parallel data programming model developed for big data applications. The book also provides a user's guide for installing and using FasTensor. FasTensor enables users to easily express many data analysis operations, which may come from neural networks, scientific computing, or queries in traditional database management systems (DBMS). FasTensor frees users from the tedious underlying data management tasks, such as data partitioning, communication, and parallel execution.
This SpringerBrief gives a high-level overview of the state of the art in parallel data programming models and the motivation for the design of FasTensor. It illustrates the FasTensor application programming interface (API) with an abundance of examples and two real use cases from cutting-edge scientific applications. FasTensor can achieve multiple orders of magnitude speedup over Spark and other peer systems in executing big data analysis operations. FasTensor makes programming data analysis operations at large scale on supercomputers as productive and efficient as possible. The book also serves as a complete reference for FasTensor, covering its theoretical foundations, C++ implementation, and usage in applications.
Scientists in domains such as the physical sciences and geosciences who analyze large amounts of data will want to purchase this SpringerBrief. Data engineers who design and develop data analysis software, and data scientists who use Spark or TensorFlow to perform data analyses such as training a deep neural network, will also find this SpringerBrief useful as a reference tool.
About the Authors
Dr. Bin Dong is a Research Scientist at Lawrence Berkeley National Laboratory in Berkeley, California, USA. He holds a Ph.D. in computing science and technology. His research interests include big scientific data analysis, parallel computing, parallel I/O, and machine learning. He has co-authored more than 62 technical publications.
Dr. Kesheng Wu is a Senior Scientist at Lawrence Berkeley National Laboratory. He works extensively on data management, data analysis, and scientific computing. He is the developer of a number of widely used algorithms, including FastBit bitmap indexes for querying large scientific datasets, the Thick-Restart Lanczos (TRLan) algorithm for solving eigenvalue problems, and IDEALEM for statistical data reduction and feature extraction. He has co-authored more than 200 technical publications.
Dr. Suren Byna is a Computer Scientist in the Scientific Data Management (SDM) Group at Lawrence Berkeley National Laboratory in Berkeley, California, USA. His research interests are in scalable scientific data management. More specifically, he works on optimizing parallel I/O and on developing systems for managing scientific data. He leads the ExaIO project in the Exascale Computing Project (ECP), which contributes advanced I/O features to HDF5 and develops a new file system called UnifyFS. He also leads efforts to develop object-centric data management systems (Proactive Data Containers - PDC) and experimental and observational data (EOD) management strategies. He has co-authored more than 150 technical publications.