Intelligent and Efficient Video Moment Localization

Liu, Meng; Hu, Yupeng; Guan, Weili

  • Publisher: Springer
  • Publication date: 2025-07-01
  • List price: $6,380
  • VIP price: $6,061 (5% off)
  • Language: English
  • Binding: Hardcover (also called cloth, retail trade, or trade)
  • ISBN: 3031875877
  • ISBN-13: 9783031875878
  • Not yet released; cannot be ordered

Product Description

This book provides a comprehensive exploration of video moment localization, a rapidly emerging research field focused on enabling precise retrieval of specific moments within untrimmed, unsegmented videos. With the rapid growth of digital content and the rise of video-sharing platforms, users face significant challenges when searching for particular content across vast video archives. This book addresses how video moment localization uses natural language queries to bridge the gap between video content and semantic understanding, offering an intuitive solution for locating specific moments across diverse domains like surveillance, education, and entertainment.

This book explores the latest advancements in video moment localization, addressing key issues such as accuracy, efficiency, and scalability. It presents innovative techniques for contextual understanding and cross-modal semantic alignment, including attention mechanisms and dynamic query decomposition. Additionally, the book discusses solutions for enhancing computational efficiency and scalability, such as semantic pruning and efficient hashing, while introducing frameworks for better integration between visual and textual data. It also examines weakly-supervised learning approaches to reduce annotation costs without sacrificing performance. Finally, the book covers real-world applications and offers insights into future research directions.

About the Authors

Meng Liu is a Professor in the School of Computer Science and Technology at Shandong Jianzhu University. She received her Ph.D. from Shandong University, China, in 2019. Her research interests include multimedia computing and information retrieval. She has co-authored over 70 papers in leading conferences and journals, including ICML, CVPR, IEEE TPAMI, and IEEE TIP. She has also served as a reviewer and area chair for conferences such as ICLR, CVPR, AAAI, ICME, and ACM MM, as well as a reviewer for journals such as IEEE TIP and IEEE TMM.

Yupeng Hu is currently an Associate Professor in the School of Software at Shandong University. He received his Ph.D. in Software Engineering from Shandong University, Jinan, China, in 2018. His research interests include information retrieval, data mining, and explainable AI. He has published his work in leading journals and conferences, such as IEEE TIP and ACM MM. He has also served as a reviewer for the conferences ACM MM, ACL, and AAAI, and for the journals IEEE TKDE and IEEE TMM.

Weili Guan is currently a Professor in the School of Electronics and Information Engineering at Harbin Institute of Technology (Shenzhen), China. She received her Master's degree from the National University of Singapore and her Ph.D. from Monash University. She has approximately six years of experience working in industry. Her research interests include multimedia computing and information retrieval. She has published over 40 papers in top-tier conferences and journals, including ACM MM, SIGIR, IEEE TPAMI, and IEEE TIP.

Liqiang Nie is a Professor in the School of Computer Science and Technology at Harbin Institute of Technology (Shenzhen). He received his B.Eng. from Xi'an Jiaotong University and his Ph.D. from the National University of Singapore. His research interests focus primarily on multimedia computing and information retrieval. Dr. Nie has co-authored over 100 papers and four books, amassing more than 15,000 citations on Google Scholar. He serves as an Associate Editor for IEEE TKDE, IEEE TMM, IEEE TCSVT, and ACM ToMM, and regularly acts as an Area Chair for ACM MM, NeurIPS, IJCAI, and AAAI. He has received numerous honors, including an Honorable Mention for Best Paper at both ACM MM and SIGIR in 2019, the SIGMM Rising Star award in 2020, TR35 China 2020, DAMO Academy Young Fellow in 2020, and the SIGIR Best Student Paper award in 2021.
