AI Systems Performance Engineering: Optimizing Model Training and Inference Workloads with GPUs, CUDA, and PyTorch

Fregly, Chris

  • Publisher: O'Reilly
  • Publication date: 2025-12-16
  • List price: $3,600
  • VIP price: $3,420 (5% off)
  • Language: English
  • Pages: 1058
  • Binding: Quality Paper - also called trade paper
  • ISBN: 9798341627789
  • ISBN-13: 9798341627789
  • Related category: CUDA
  • Overseas special-order book (requires separate checkout)

Product description

Elevate your AI system performance capabilities with this definitive guide to unlocking peak efficiency across every layer of your AI infrastructure. In today's era of ever-growing generative models, AI Systems Performance Engineering equips professionals with actionable strategies to co-optimize hardware, software, and algorithms for high-performance and cost-effective AI systems. Authored by Chris Fregly, a performance-focused engineering and product leader, this comprehensive resource transforms complex systems into streamlined, high-impact AI solutions.

Inside, you'll discover step-by-step methodologies for fine-tuning GPU CUDA kernels, PyTorch-based algorithms, and multinode training and inference systems. You'll also master the art of scaling GPU clusters for high performance, distributed model training jobs, and inference servers.

  • Codesign and optimize hardware, software, and algorithms to achieve maximum throughput and cost savings
  • Implement cutting-edge inference strategies that reduce latency and boost throughput in real-world settings
  • Utilize industry-leading scalability tools and frameworks
  • Profile, diagnose, and eliminate performance bottlenecks across complex AI pipelines
  • Integrate full stack optimization techniques for robust, reliable AI system performance

Whether you're an engineer, researcher, or developer, AI Systems Performance Engineering offers a holistic roadmap for building resilient, scalable, and cost-effective AI systems that excel in both training and inference.
