Multimodal Knowledge Systems: Construction and Reasoning
Tentative Chinese title: 多模態知識系統:建構與推理

Zheng, Changmeng; Li, Qing

  • Publisher: Springer
  • Publication date: 2026-06-06
  • List price: $6,950
  • VIP price: $6,602 (5% off)
  • Language: English
  • Pages: 230
  • Binding: Hardcover - also called cloth, retail trade, or trade
  • ISBN: 3032187605
  • ISBN-13: 9783032187604
  • Related category: Large language model
  • Overseas special-order title (requires separate checkout)

Product Description

This book focuses on advancing the integration of multimodal data (text, images, and structured knowledge) to enable precise knowledge extraction and human-like reasoning. Its primary objective is to address critical challenges such as modality gaps, semantic misalignment, dataset biases, and static reasoning paradigms. By introducing novel frameworks that unify graph-based learning, hierarchical representation, bias mitigation, and iterative refinement, it provides systematic solutions for building robust, interpretable, and scalable AI systems. The book addresses gaps caused by incomplete textual semantics, spurious correlations across modalities, and inflexible reasoning pipelines through three pivotal contributions. First, the authors offer theoretical innovations in graph alignment techniques, hierarchical learning paradigms, and multi-agent reasoning frameworks. Second, the book provides practical tools, including benchmark datasets, reproducible methodologies, and applications validated on state-of-the-art tasks. Third, it aims for broader impact through solutions tailored to low-resource settings, ethical considerations in AI deployment, and integration with emerging technologies such as large foundation models. By bridging the divide between theoretical advances and real-world applicability, the book serves as an essential resource for researchers and practitioners aiming to leverage multimodal data effectively, ethically, and at scale.

About the Authors

Dr. Changmeng Zheng received his Bachelor's and Master's degrees from South China University of Technology (Guangzhou, China) and his PhD degree from the Hong Kong Polytechnic University (Hong Kong SAR). He is currently a Research Assistant Professor with the Department of Computing, the Hong Kong Polytechnic University, Hong Kong SAR. His research has been published in refereed journals and conferences such as IEEE Transactions on Multimedia, IEEE Transactions on Circuits and Systems for Video Technology, Neural Networks, ACL, EMNLP, COLING, and ACM MM. His research interests are in the areas of multimodal learning and social media analytics, especially knowledge graphs and large language models.

Prof. Qing Li received his BEng degree from Hunan University (Changsha, China) and his MSc and PhD degrees from the University of Southern California (Los Angeles, USA), all in computer science. He is currently a Chair Professor at the Hong Kong Polytechnic University, a visiting professor at Zhejiang University, a guest professor at the University of Science and Technology of China, and an adjunct professor at Hunan University. His research interests include multi-modal data modeling, multimedia retrieval and management, and e-learning systems. Prof. Li has published over 500 papers in technical journals and international conferences in these areas, and is actively involved in the research community, serving as a journal reviewer, programme committee chair/co-chair, and organizer/co-organizer of several international conferences. He currently serves as the Chairman of the Hong Kong Web Society, a councillor of the Database Society of the China Computer Federation, and a Steering Committee member of the international WISE Society. He is a Fellow of IEEE, AAIA, and IET.
