Improving search results with Search API Solr (Stemming)

Introduction

At Finalist we use the Search API and Search API Solr modules to provide the search functionality for most of our websites. With a little bit of configuration you can get a great search experience that works perfectly for a basic website. However, sometimes customers want more than a standard implementation. In this post I’ll explain some of the improvements we make and how they work. The following topics will be covered in a series of blog posts:

  • Stemming
  • Partial search
  • Better search excerpts
  • Custom boosting

Stemming

As Wikipedia puts it: “Stemming is the term used in linguistic morphology and information retrieval to describe the process for reducing inflected (or sometimes derived) words to their word stem, base or root form—generally a written word form.”

Basically, stemming helps your users find what they are looking for. Searching for cars will also return results for the word car. Searching for working will also return results for work, worked, and so on.

Apache Solr uses the so-called SnowballPorterFilterFactory to add stemming. In your schema.xml file for Solr you will probably find something like this:

<filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>

As you can see, the stemming algorithm needs a language to make sure the stemming is accurate. For example, the root form of an English word is derived differently from the root form of a Dutch word. Most languages are supported. You can find a list of languages in the documentation.
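To show where such a filter usually sits, here is a minimal sketch of a field type with a stemming analyzer. The field type name text_stemmed_en and the choice of tokenizer and lowercase filter are assumptions for illustration; the schema.xml shipped with your setup may well look different.

```xml
<!-- Hypothetical field type name; the surrounding analyzer chain is an illustrative assumption -->
<fieldType name="text_stemmed_en" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Split the text into word tokens -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase tokens so "Cars" and "cars" are handled the same way -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Reduce each token to its stem; words listed in protwords.txt are left unstemmed -->
    <filter class="solr.SnowballPorterFilterFactory" language="English" protected="protwords.txt"/>
  </analyzer>
</fieldType>
```

Because a single analyzer element is used here, the same chain applies at index time and at query time, which is why a query for cars can match a document that only contains car.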

Dutch stemming

For the Dutch language (which a lot of our customers use), there are two supported language values: Dutch and Kp (Kraaij-Pohlmann). We’ve found that the Kp stemmer provides much better results than the Dutch stemmer.
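As a sketch, switching to the Kraaij-Pohlmann stemmer comes down to changing the language attribute of the filter shown earlier. Keeping the protected attribute pointed at protwords.txt is an assumption carried over from the English example.

```xml
<!-- Use the Kraaij-Pohlmann stemmer instead of the default Dutch Snowball stemmer -->
<filter class="solr.SnowballPorterFilterFactory" language="Kp" protected="protwords.txt"/>
```

If the filter is also applied at index time, content that has already been indexed will most likely need to be reindexed after this change, so that the stored terms and the query terms are stemmed by the same algorithm.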
