Apache Solr: A Practical Approach to Enterprise Search

Dikshant Shahi

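  • Publisher: Apress
  • Publication date: 2015-12-19
  • List price: $2,040
  • VIP price: $1,938 (95% of list price)
  • Language: English
  • Pages: 328
  • Binding: Paperback
  • ISBN: 1484210719
  • ISBN-13: 9781484210710
  • Related category: Full-text search engines
  • Overseas special-order title (requires separate checkout)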

Product description

Build an enterprise search engine using Apache Solr: index and search documents; ingest data from varied sources; apply various text processing techniques; utilize different search capabilities; and customize Solr to retrieve the desired results. Apache Solr: A Practical Approach to Enterprise Search explains each essential concept, backed by practical and industry examples, to help you attain expert-level knowledge.

The book, which assumes a basic knowledge of Java, starts with an introduction to Solr, followed by the steps for setting it up, indexing your first set of documents, and searching them. It then introduces you to information retrieval and its implementation in Apache Solr; this will help you understand your search problem, decide on an approach to building an effective solution, and use various metrics to evaluate the results.

The book next covers the schema design and techniques to build a text analysis chain for cleansing, normalizing and enriching your documents and addressing different types of search queries. It describes various popular matching techniques which are generally applied to improve the precision and recall of searches.

You will learn the end-to-end process of data ingestion from varied sources, metadata extraction, pre-processing and transformation of content, various search components, query parsers and other advanced search capabilities.

After covering out-of-the-box features, Solr expert Dikshant Shahi dives into ways you can customize Solr for your business and its specific requirements, along with ways to plug in your own components. Most importantly, you will learn how Solr scoring is implemented, which factors affect the document score, and how to tune the score for the application at hand. The book explains why textual scoring alone is not sufficient for practical ranking of documents and shows ways to integrate real-world factors into the document ranking.

You'll see how to influence user experience by providing suggestions and recommendations. You'll also see integration of Solr with important related technologies such as OpenNLP and Tika. Additionally, you will learn about scaling Solr using SolrCloud.

This book concludes with coverage of semantic search capabilities, which are crucial for taking the search experience to the next level. By the end of Apache Solr, you will be proficient in designing and developing your search engine.
