Learn CUDA Programming: A beginner's guide to GPU programming and parallel computing with CUDA 10.x and C/C++

Jaegeun Han, Bharatkumar Sharma

  • Publisher: Packt Publishing
  • Publication date: 2019-09-27
  • Price: $1,540
  • VIP price: $1,463 (5% off)
  • Language: English
  • Pages: 508
  • Binding: Paperback
  • ISBN: 1788996240
  • ISBN-13: 9781788996242
  • Related categories: CUDA, GPU
  • Not available for order

Product Description

Key Features

  • Learn parallel programming principles and practices and performance analysis in GPU computing
  • Get to grips with distributed multi-GPU programming and other approaches to GPU programming
  • Understand how GPU acceleration in deep learning models can improve their performance

Book Description

Compute Unified Device Architecture (CUDA) is NVIDIA's GPU computing platform and application programming interface. It's designed to work with programming languages such as C, C++, and Python. With CUDA, you can leverage a GPU's parallel computing power for a range of high-performance computing applications in the fields of science, healthcare, and deep learning.

Learn CUDA Programming will help you learn GPU parallel programming and understand its modern applications. In this book, you'll discover CUDA programming approaches for modern GPU architectures. You'll not only be guided through GPU features, tools, and APIs, but you'll also learn how to analyze performance with sample parallel programming algorithms. This book will help you optimize the performance of your apps by giving insights into CUDA programming platforms with various libraries, compiler directives (OpenACC), and other languages. As you progress, you'll learn how additional computing power can be generated using multiple GPUs in a single machine or across multiple machines. Finally, you'll explore how CUDA accelerates deep learning algorithms, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs).

By the end of this CUDA book, you'll be equipped with the skills you need to integrate the power of GPU computing in your applications.

What you will learn

  • Understand general GPU operations and programming patterns in CUDA
  • Uncover the difference between GPU programming and CPU programming
  • Analyze GPU application performance and implement optimization strategies
  • Explore GPU programming, profiling, and debugging tools
  • Grasp parallel programming algorithms and how to implement them
  • Scale GPU-accelerated applications across multiple GPUs and multiple nodes
  • Delve into GPU programming platforms with accelerated libraries, Python, and OpenACC
  • Gain insights into deep learning accelerators in CNNs and RNNs using GPUs

Who this book is for

This beginner-level book is for programmers who want to delve into parallel computing, become part of the high-performance computing community, and build modern applications. Basic C and C++ programming experience is assumed. For deep learning enthusiasts, this book covers Python interoperability, DL libraries, and practical examples of performance estimation.

About the Authors

Jaegeun Han is currently working as a solutions architect at NVIDIA, Korea. He has around 9 years of experience and supports consumer internet companies in deep learning. Before joining NVIDIA, he worked on system software and parallel computing development, as well as application development in the medical and surgical robotics fields. He obtained a master's degree in CSE from Seoul National University.

Bharatkumar Sharma obtained a master's degree in information technology from the Indian Institute of Information Technology, Bangalore. He has around 10 years of development and research experience in the domains of software architecture and distributed and parallel computing. He is currently working with NVIDIA as a senior solutions architect, South Asia.

Table of Contents

  1. Introduction to CUDA Programming
  2. CUDA Memory Management
  3. CUDA Thread Programming: Performance Indicators and Optimization Strategies
  4. CUDA Kernel Execution Model and Optimization Strategies
  5. CUDA Application Monitoring and Debugging
  6. Scalable Multi-GPU Programming
  7. Parallel Programming Patterns in CUDA
  8. GPU-Accelerated Libraries and Popular Programming Languages
  9. GPU Programming Using OpenACC
  10. Deep Learning Acceleration with CUDA
  11. Appendix
