Build and configure your own search engine using Apache Solr 5.X
About This Book
- Use Apache Solr to build and customize Solr/Lucene-based search solutions using indexing
- Apache Solr provides powerful text search with faceting and geospatial support, along with rich document handling
- This is a hands-on guide packed with real-world implementations of the Solr search feature
Who This Book Is For
This book is an ideal starting point for anyone who wants to embed a search engine in their site.
In particular, if you are a data architect or a project manager who needs to make key design decisions, you will find that every example is applicable in real-world contexts.
If you are a Java developer who would like to start using Apache Solr to build and customize Solr/Lucene-based search solutions, then this is the book for you.
What You Will Learn
- Define a simple and effective full-text search
- Write configurations incrementally and test them with the Solr web UI or cURL
- Get acquainted with the logical structure of an inverted index
- Understand how to use the text analysis chain and customize searches for different use cases
- Use faceted search, simple analytics, or data clustering to enhance users' search experience
- Import data from various sources (including XML and databases), clean or expand it with scripting, and expose it in several formats such as CSV, JSON, and XML
- Use the Solr UI for simple maintenance tasks
Apache Solr is a standalone enterprise search server that exposes services for advanced text search, spatial search, faceted search, and analytics. Solr is fast, and its architecture scales from working prototypes to complex distributed deployments. Its internal workflow is also open to component customization and to integration with external tools for advanced text analysis.
This book is a practical introduction to the Solr platform that shows you how to configure your own search engine experience and embed a search engine in your website to help users navigate the data.
We start with the basics of how to use Solr and perform indexing on the default installation. You'll be introduced to the workings of the Solr schema API, the structure of an inverted index, text analysis, and the concept of similarity. Next, we demonstrate indexing and searching with some sample data.
Moving on, you'll learn how to use faceted search, work with multiple entities and multicores, and index external data sources such as open source datasets. You'll get to grips with basic SolrCloud concepts such as routing, shard splitting, and ZooKeeper, and with clustering Solr for distributed search. You'll also learn how to detect language with Tika and LangDetect.
At the end of the book, we build a project for a bookcrossing site, which brings all the concepts together to give you the bigger picture.