Learning-From-Observation 2.0: Automatic Acquisition of Robot Behavior from Human Demonstration
Katsushi Ikeuchi, Naoki Wake, Jun Takamatsu
- Publisher: Springer
- Publication date: 2025-11-02
- List price: $2,040
- VIP price: 5% off, $1,938
- Language: English
- Pages: 204
- Binding: Hardcover (also called cloth, retail trade, or trade)
- ISBN-10: 3032034442
- ISBN-13: 9783032034441
Related categories:
Reinforcement
Imported title (must be checked out separately)
Description
This book presents recent breakthroughs in the field of Learning-from-Observation (LfO) resulting from advancements in large language models (LLMs) and reinforcement learning (RL), and positions them in the context of the area's historical development. LfO involves observing human behaviors and generating robot actions that mimic them.

While LfO may appear similar, on the surface, to Imitation Learning (IL) in the machine learning community and Programming-by-Demonstration (PbD) in the robotics community, a significant difference lies in the fact that those methods directly imitate human hand movements, whereas LfO encodes human behaviors into abstract representations and then maps these representations onto the currently available hardware (individual body) of the robot, thus mimicking them indirectly. This indirect imitation absorbs changes in the surrounding environment and differences in robot hardware. Additionally, passing through an abstract representation allows filtering, which distinguishes important from less important aspects of human behavior and enables imitation from fewer and less demanding demonstrations.

The authors have been researching the LfO paradigm for the past decade or so. Previously, the focus was primarily on designing necessary and sufficient task representations to define specific task domains such as the assembly of machine parts, knot-tying, and human dance movements. Recent advancements in Generative Pre-trained Transformers (GPT) and RL have led to groundbreaking developments in methods for obtaining and mapping these abstract representations: by utilizing GPT, the authors can automatically generate abstract representations from videos, and by employing libraries of RL-trained agents, implementing robot actions becomes more feasible.
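The indirect-imitation pipeline the description outlines (observe → encode into an abstract, filtered task representation → map onto the robot's own skills) can be sketched minimally as below. All names here are illustrative stand-ins, not the book's API: the rule-based encoder stands in for the GPT-based video encoder, and the lambda table stands in for an RL-trained skill library.

```python
from dataclasses import dataclass, field

# Hypothetical abstract task representation: a sequence of task steps,
# each naming a task type and its essential parameters. In LfO, only the
# task-relevant aspects of the demonstration survive this encoding (the
# "filtering" step); incidental hand trajectories are discarded.
@dataclass(frozen=True)
class TaskStep:
    task: str              # e.g. "grasp", "place"
    target: str            # object the step operates on
    params: dict = field(default_factory=dict)

def encode_demonstration(observed_events):
    """Encode observed human behavior into abstract task steps.

    A real system would use a vision pipeline plus a GPT-class model;
    a rule-based filter stands in for it here.
    """
    steps = []
    for event in observed_events:
        if event["kind"] in ("grasp", "place"):  # keep task-relevant events
            steps.append(TaskStep(event["kind"], event["object"]))
        # incidental events (e.g. hand jitter) are filtered out here
    return steps

# Hypothetical skill library: each abstract task type maps to a controller
# suited to the *robot's own* hardware, so the same abstract plan can run
# on different robot bodies (indirect imitation).
SKILL_LIBRARY = {
    "grasp": lambda step: f"run grasp policy on {step.target}",
    "place": lambda step: f"run place policy on {step.target}",
}

def map_to_robot(steps):
    """Map abstract task steps onto the available robot skills."""
    return [SKILL_LIBRARY[s.task](s) for s in steps]

demo = [
    {"kind": "grasp", "object": "bolt"},
    {"kind": "hand_jitter", "object": None},  # unimportant, filtered out
    {"kind": "place", "object": "bolt"},
]
plan = map_to_robot(encode_demonstration(demo))
```

The key design point the sketch illustrates is the decoupling: the encoder never refers to robot hardware, and the skill library never refers to the human demonstration, so either side can change independently.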
About the Authors
Jun Takamatsu received his Ph.D. degree in Computer Science from the University of Tokyo, Japan, in 2004. From 2004 to 2008, he was with the Institute of Industrial Science, the University of Tokyo. In 2007, he was with Microsoft Research Asia as a Visiting Researcher. From 2008 to 2021, he was with the Robotics Laboratory, Nara Institute of Science and Technology, Japan, as an Associate Professor. He was also with Carnegie Mellon University as a Visitor in 2012 and 2013 and with Microsoft as a Visiting Scientist in 2018. Now, he is with Microsoft as a Senior Research Scientist. His research interests are in robotics, including learning-from-observation, task/motion planning, feasible motion analysis, 3D shape modeling and analysis, and physics-based vision.

Kazuhiro Sasabuchi received his Ph.D. degree in Information Science and Technology at the University of Tokyo, Japan, in 2019. He has worked across various fields in robotics including human-robot interaction, hardware design, field robotics, robot systems, robot teaching, reinforcement learning, and mobile manipulation. He currently works at Microsoft as a Research Scientist for Industrial Solutions and Engineering. His interests are in practical robot systems which leverage composable skills, cloud operations, large language models, human interaction, simulation, and machine learning.
Katsushi Ikeuchi received his Ph.D. in Information Engineering from the University of Tokyo. He was a postdoctoral fellow at the MIT Artificial Intelligence Laboratory and a research professor at the Robotics Institute, Carnegie Mellon University, before joining Microsoft in 2015. At MIT, he was involved in developing the algorithms for the world's first pick-up system based on photometric stereo. At CMU, he initiated the Learning-from-Observation project, focused on developing robots that learn behaviors from human demonstration. At the University of Tokyo, he applied Learning-from-Observation to developing humanoid robots capable of performing the Aizu-Bandaisan dance, tying knots, and assembling machine parts. He has served as General or Program Chair of several international conferences, including IROS 1995, CVPR 1996, ICRA 2009, and ICCV 2017. He served as Editor-in-Chief of the International Journal of Computer Vision (Springer) for more than ten years, and has received the Distinguished Researcher Award from the IEEE PAMI Technical Committee and the Medal of Honor with Purple Ribbon from the Emperor of Japan. He is a Fellow of IEEE, IEICE, IPSJ, RSJ, and IAPR.

Naoki Wake received his Ph.D. in Information Science and Technology from the University of Tokyo, Japan, in 2019. He currently works at Microsoft as a Research Scientist for Industrial Solutions and Engineering. His current research involves developing multimodal perception systems and co-speech gesture systems for robots. His past research covered auditory neuroscience, neurorehabilitation, and speech processing.