Sharing ApacheSolr Module

This week I gave a talk for the Vancouver Island Java User Group on integrating Apache Solr search into web applications. Since the group is, of course, Java-focused, I didn't dwell much on Drupal, except to demo a non-trivial example of integration showing some of the more advanced capabilities of Solr search, including faceted search, search spelling correction, "find similar content", and so on. All of these are available out of the box with Robert Douglass, pwolanin, claudiu.cristea et al.'s excellent ApacheSolr module for Drupal.

Slides are available here.

Since I was originally scheduled to give the talk in November of 2008, this was a great opportunity to look back over the past year and a half or so and see what has changed in the Solr and ApacheSolr world.

Solr has had two major point releases, going from version 1.2 to 1.4, adding substantial performance improvements, replication, multi-select faceting, range queries (e.g., date between Sep 2004 and Oct 2006), nested queries, multiple cores, a more flexible architecture, and much more. The number of installations and the community of developers seem to be steadily growing; I'd estimate that the numbers have at least doubled in the past 16 months.
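To make two of those features concrete, here is a minimal sketch of how a date range filter and a multi-select facet might be expressed as Solr request parameters. The field names (`created`, `type`), the tag name `typeFilter`, and the `localhost` URL are assumptions for illustration only; they are not taken from the talk or from the ApacheSolr module's schema.

```python
from urllib.parse import urlencode


def date_range_filter(field, start, end):
    """Build an inclusive Solr range filter of the form field:[start TO end]."""
    return f"{field}:[{start} TO {end}]"


def build_select_url(base_url, params):
    """Encode a list of (name, value) pairs onto a Solr /select URL."""
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical field names and host; adjust to your own schema and server.
params = [
    ("q", "solr"),
    # Range query: documents created between Sep 2004 and Oct 2006.
    ("fq", date_range_filter("created", "2004-09-01T00:00:00Z", "2006-10-31T23:59:59Z")),
    # Multi-select faceting: tag the filter, then exclude it when counting
    # the facet so the other "type" values keep their counts.
    ("fq", "{!tag=typeFilter}type:blog"),
    ("facet", "true"),
    ("facet.field", "{!ex=typeFilter}type"),
]

url = build_select_url("http://localhost:8983/solr", params)
print(url)
```

The tag/exclude pair is what lets a facet keep listing its other values after the user has already selected one of them.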

ApacheSolr has had steady development releases, leading to full DRUPAL-5--2, DRUPAL-6--1, and DRUPAL-6--2 releases. More than 240 issues and feature requests have been addressed since Jan 2009. Many issues, including indexing of attached documents, access control, and implementation of various Solr features, have been addressed in one or more ways by this module or by various associated modules.

One issue that seems to come up relatively frequently is the difficulty of using Solr's fuzzy or wildcard matching out of the box. The ApacheSolr module uses the DisMax query handler rather than Solr's "Standard" query handler in order to better deal with weighted fields, if I understand the rationale correctly (i.e., a core use case trumps a special use case). This situation may soon improve with proposed improvements to the DisMax handler. Let's hope so (better yet, in the magical event of a sudden rush of free time or significant client interest, pitch in and help make this happen!).
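The trade-off can be sketched roughly as follows. The standard (Lucene) query syntax interprets a trailing `~` as a fuzzy match and a trailing `*` as a prefix wildcard, while a DisMax request takes the user's input largely as plain text and spreads it across boosted fields listed in `qf`. The field names and boost values below are hypothetical, and the parameters the ApacheSolr module actually sends may differ.

```python
from urllib.parse import urlencode


def standard_query(term, fuzzy=False, prefix=False):
    """Query string for the standard parser, which interprets ~ and *."""
    if fuzzy:
        return f"{term}~"   # fuzzy match on the term
    if prefix:
        return f"{term}*"   # prefix wildcard match
    return term


def dismax_params(user_input, weighted_fields):
    """Params for a DisMax request: plain user input plus field boosts in qf."""
    qf = " ".join(f"{field}^{boost}" for field, boost in weighted_fields.items())
    return {"defType": "dismax", "q": user_input, "qf": qf}


# Hypothetical fields and weights, for illustration only.
weights = {"title": 5.0, "body": 1.0}

print(urlencode({"q": standard_query("drupal", fuzzy=True)}))
print(urlencode(dismax_params("drupal", weights)))
```

With the first form you get the special operators but have to spell out field weighting by hand in the query; with the second you get weighting for free, and operators such as `~` and `*` in the input are not treated as fuzzy or wildcard syntax.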
