Vision Language Models: Building VLMs with Hugging Face (Paperback)
Provisional Chinese title: 視覺語言模型：使用 Hugging Face 建立 VLMs (平裝本)

Name: Vision Language Models: Building VLMs with Hugging Face (Paperback)
Price: 2641 TWD
Availability: Online only
Author: Merve Noyan, Andrés Marafioti, Miquel Farré
ISBN: 9798341624047

Publisher: O'Reilly
Publication date: 2026-07-14
List price: $2,780
VIP price: 5% off, $2,641
Language: English
Pages: 406
Binding: Quality paper (also called trade paper)
ISBN-13: 9798341624047
Related category: DeepLearning

Overseas special-order title (must be checked out separately)

Product description

Vision language models (VLMs) combine computer vision and natural language processing to create powerful systems that can interpret, generate, and respond in multimodal contexts. Vision Language Models is a hands-on guide to building real-world VLMs using the most up-to-date stack of machine learning tools from Hugging Face, Meta (PyTorch), NVIDIA (CUDA), OpenAI (CLIP), and others. It is written by leading researchers and practitioners Merve Noyan, Miquel Farré, Andrés Marafioti, and Orr Zohar. From image captioning and document understanding to advanced zero-shot inference and retrieval-augmented generation (RAG), the book covers the full VLM application and development lifecycle.

Designed for ML engineers, data scientists, and developers, this guide distills cutting-edge VLM research into practical techniques. Readers will learn how to prepare datasets, select the right architectures, fine-tune and deploy models, and apply them to real-world tasks across a range of industries.

- Explore core model architectures and alignment techniques
- Train and fine-tune VLMs with Hugging Face, PyTorch, and others
- Deploy models for applications like image search and captioning
- Implement advanced inference strategies, from zero-shot to agentic systems
- Build scalable VLM systems ready for production use
