Improving search results with Search API Solr (Partial Search)

Introduction

At Finalist we use the Search API and Search API Solr modules to provide the search functionality for most of our websites. With a little configuration you get a good search experience that works well for a basic website. Sometimes, however, customers want more than a standard implementation. In this post I’ll explain some of the improvements we make and how they work. The following topics are covered in a series of blog posts:

  • Stemming
  • Partial search
  • Better search excerpts
  • Custom boosting

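Partial search is supported by Apache Solr but is not enabled by default. It can be added through the so-called NGramFilterFactory, which splits each indexed word into shorter substrings (n-grams) so that a query for part of a word can still match. The filter definition could look something like this: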

<filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/>

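The minGramSize and maxGramSize attributes define the shortest and longest n-gram the NGramFilterFactory generates. With the sizes above, a seven-letter word such as "tonight" is indexed as 15 separate terms: five 3-grams, four 4-grams, three 5-grams, two 6-grams and one 7-gram. Partial search can therefore have a large impact on performance, so test it properly with different n-gram sizes. The quote below (read more in the original blog) explains where the performance cost comes from: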

“There is a high price to be paid for n-gramming. Recall that in the earlier example, Tonight was split into 15 substring terms, whereas typical analysis would probably leave only one. This translates to greater index sizes, and thus a longer time to index. Note the ten-fold increase in indexing time for the artist name, and a five-fold increase in disk space. Remember that this is just one field!”

The best place to add the filter in the schema.xml file for Apache Solr is the following field type:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">

In practice, add the filter in the same places where the SnowballPorterFilterFactory is already listed within that field type.
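As a rough illustration of that placement, the sketch below shows how the field type might look once the filter is in. It is a simplified assumption, not the schema.xml that ships with Search API Solr: the tokenizer, the lowercase filter and the Snowball language attribute are placeholders, and your real file will contain more filters. Note that Solr spells the size attributes in camel case (minGramSize, maxGramSize).

```xml
<!-- Simplified sketch: the tokenizer and the surrounding filters are
     placeholders and will differ from your actual schema.xml. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Added: split every token into n-grams of 3 to 25 characters. -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
    <!-- Added in the same place as the stemmer, mirroring the index analyzer. -->
    <filter class="solr.NGramFilterFactory" minGramSize="3" maxGramSize="25"/>
  </analyzer>
</fieldType>
```

After changing schema.xml, restart Solr (or reload the core) and reindex your content, because documents that were indexed earlier do not contain any n-gram terms yet.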
