Probabilistic Indexing for Information Search and Retrieval in Large Collections of Handwritten Text Images
Provisional Chinese title: 大規模手寫文本影像資訊搜尋與檢索的機率索引技術
Toselli, Alejandro Héctor, Puigcerver, Joan, Vidal, Enrique
- Publisher: Springer
- Publication date: 2025-04-12
- List price: $6,460
- VIP price: 5% off, $6,137
- Language: English
- Pages: 344
- Binding: Quality Paper - also called trade paper
- ISBN: 3031553918
- ISBN-13: 9783031553912
Overseas special-order book (requires separate checkout)
Product description
The book is structured into 11 chapters and three appendices. The first two chapters briefly outline the necessary fundamentals and state of the art in pattern recognition, statistical decision theory, and handwritten text recognition. Chapter 3 presents approaches for indexing (as opposed to "spotting") each region of a handwritten text image which is likely to contain a word. Next, Chapter 4 describes the models adopted for handwritten text in images, namely hidden Markov models, convolutional and recurrent neural networks, and language models, and provides full details of the weighted finite-state transducer (WFST) concepts and methods needed in later chapters of the book. Chapter 5 explains the set of techniques and algorithms developed to generate probabilistic indexes of images, which allow for fast search and retrieval of textual information in the indexed images. Chapter 6 then presents experimental evaluations of the proposed framework and algorithms on different traditional benchmark datasets and compares them with other approaches, while Chapter 7 reviews the most popular keyword-spotting approaches. Chapter 8 explains how probabilistic indexing (PrIx) can support classical free-text search tools, while Chapter 9 presents new methods that use PrIx not only for searching, but also for text analytics and other related natural language processing and information extraction tasks. Chapter 10 shows how the proposed solutions can be used to effectively index very large collections of handwritten document images, before Chapter 11 finally summarizes the book and suggests promising lines of future research. The appendices detail the necessary mathematical foundations for the work and present details of the text image collections and datasets used in the experiments throughout the book.
This book is written for researchers and (post-)graduate students in pattern recognition and information retrieval. It will also be of interest to people in areas like history, criminology, or psychology who need technical support to evaluate, understand or decode historical or contemporary handwritten text.
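Product description (translated from the Chinese version)
This book gives a comprehensive introduction to a recently proposed framework called probabilistic indexing (PrIx) for searching text in large collections of document images, along with other related applications. It supports the development of new search engines that retrieve information effectively from manuscripts, which typically lack the electronic text (transcripts) that such search and retrieval tasks would normally require.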
About the authors
Alejandro Héctor Toselli is currently working as a postdoctoral researcher (María Zambrano grant) at the Universitat Politècnica de València (UPV). He obtained an Electrical Engineering degree from the Universidad Nacional de Tucumán (Argentina, 1997) and a PhD in Computer Science from the UPV (Spain, 2004). His research focuses primarily on Document Analysis and Recognition, a field in which he has more than 20 years of experience publishing and working on related projects funded by European and US institutions. He held postdoctoral fellowships at Northeastern University (Boston, USA), in the multi-institutional Open Islamicate Texts Initiative (OpenITI), and at the "Institut de Recherche en Informatique et Systèmes Aléatoires" (IRISA, Rennes, France).
Joan Puigcerver received his MSc and PhD in Computer Science from the Universitat Politècnica de València, in 2014 and 2018, respectively, focusing on probabilistic indexing and handwritten text recognition. In 2018, he joined Google Research as a software engineer. His research focuses on deep learning architectures, transfer learning, and computer vision. Joan is a member of the Spanish Society for Pattern Recognition and Image Analysis (AERFAI), an affiliate organization of the International Association for Pattern Recognition (IAPR).
Enrique Vidal is an emeritus professor of the Universitat Politècnica de València (Spain) and former co-leader of the PRHLT research center there. He is co-author of hundreds of research papers in the fields of Pattern Recognition, Multimodal Interaction and applications to Language, Speech and Image Processing and has led many important projects in these fields. Enrique is a fellow of the International Association for Pattern Recognition (IAPR).