Planning with Markov Decision Processes: An AI Perspective (Synthesis Lectures on Artificial Intelligence and Machine Learning)

Mausam, Andrey Kolobov

  • Publisher: Morgan & Claypool
  • Publication date: 2012-07-03
  • List price: $1,590
  • Member price: $1,511 (5% off)
  • Language: English
  • Pages: 210
  • Binding: Paperback
  • ISBN: 1608458865
  • ISBN-13: 9781608458868
  • Related categories: Artificial Intelligence, Machine Learning
  • Overseas special-order book (must be checked out separately)

Product description

Markov Decision Processes (MDPs) are widely popular in Artificial Intelligence for modeling sequential decision-making scenarios with probabilistic dynamics. They are the framework of choice when designing an intelligent agent that needs to act for long periods of time in an environment where its actions could have uncertain outcomes. MDPs are actively researched in two related subareas of AI, probabilistic planning and reinforcement learning. Probabilistic planning assumes known models for the agent's goals and domain dynamics, and focuses on determining how the agent should behave to achieve its objectives. On the other hand, reinforcement learning additionally learns these models based on the feedback the agent gets from the environment.

This book provides a concise introduction to the use of MDPs for solving probabilistic planning problems, with an emphasis on the algorithmic perspective. It covers the whole spectrum of the field, from the basics to state-of-the-art optimal and approximation algorithms. We first describe the theoretical foundations of MDPs and the fundamental solution techniques for them. We then discuss modern optimal algorithms based on heuristic search and the use of structured representations. A major focus of the book is on the numerous approximation schemes for MDPs that have been developed in the AI literature. These include determinization-based approaches, sampling techniques, heuristic functions, dimensionality reduction, and hierarchical representations. Finally, we briefly introduce several extensions of the standard MDP classes that model and solve even more complex planning problems.

Table of Contents: Introduction / MDPs / Fundamental Algorithms / Heuristic Search Algorithms / Symbolic Algorithms / Approximation Algorithms / Advanced Notes
