Solr Integrations with Drupal Sarnia Module

Every day, companies and organizations with lots of content weigh the pros and cons of adopting Drupal. Often the question is “to what extent should we adopt Drupal?”, meaning whether an organization wants to manage all of its content in Drupal or only some of it. Having chosen the latter (as practical concerns often warrant), organizations and their technical teams must then integrate Drupal with third-party or sometimes proprietary data sources.

We’re going to focus on one specific facet of this problem today: what to do about custom Apache Solr cores that need to be searchable on a Drupal webpage.

When we hear “Apache Solr” and “Drupal” in the same sentence, the first thing that comes to mind is the Search API module and its dependent module, Search API Apache Solr. This combo is great if you want to index content managed in Drupal (e.g. lists of nodes, products from Drupal Commerce, users, etc.). But imagine that you already have a Solr index, and you’ve spent years customizing it to be exactly what you need. Maybe it feeds multiple existing web properties, or maybe it is fed by an ERP system. Any of these factors would make it troublesome to migrate the indexed content to Drupal.

Fortunately, the Sarnia module offers an effective way to bring your custom index into Drupal while still leveraging the power of Search API and its Views and Facet API integration.

The Sarnia module provides its own comprehensive installation guide. I followed it and it works, so I don’t want to simply repeat what it recommends. Instead I’m going to focus on a few points of interest that I gleaned while setting up this module.

Search API Apache Solr dependency

One key feature of Sarnia is that, although it relies on the Search API Solr module, you do not have to create a Search API Solr server or index for it to work. Sarnia lets you add its own type of server, which accepts Solr connection settings. The module automatically generates a Sarnia index when you enable a “Sarnia entity type” for the server (edit the server and click the “Sarnia” tab).

Understanding the Sarnia Entity

The whole point of the Sarnia module is to get Search API features working on indexes managed outside of Drupal, right? Then why does the module define a “Sarnia entity type” that claims to “represent data from Solr”? At first I thought this implied that the module was going to replicate indexed data in a table. However, Sarnia entity types are rather unusual:

'label' => $type['label'],
'controller class' => 'SarniaController',
'fieldable' => TRUE,
'static cache' => TRUE,
'uri callback' => 'sarnia_uri',
'view callback' => 'sarnia_view_multiple',
'base table' => NULL, // Prevent undefined array index errors from Views.
'entity keys' => array(
  'id' => 'id',
  'revision' => FALSE,
  'bundle' => FALSE,
),

Most importantly, the base table is NULL, so entities of this type are not stored. What is the point, then? It turns out that these entity types exist mainly because Search API requires an entity type to work.
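The check behind that observation can be sketched in plain PHP. This is only an illustration: the helper name mymodule_entity_type_is_stored() and the sample arrays are hypothetical and are not part of Sarnia or Drupal. On a Drupal 7 site, the info array would come from entity_get_info() instead.

```php
<?php

/**
 * Hypothetical helper: reports whether an entity info array declares storage.
 *
 * An entity type with an empty 'base table' has no table of its own to
 * save entities into.
 */
function mymodule_entity_type_is_stored(array $entity_info) {
  return !empty($entity_info['base table']);
}

// Assumed sample data, modeled on the definition quoted above.
$sarnia_like = array(
  'label' => 'Sarnia test',
  'fieldable' => TRUE,
  'base table' => NULL,
);
// Assumed sample data for a conventional, stored entity type.
$node_like = array(
  'label' => 'Node',
  'base table' => 'node',
);

$sarnia_is_stored = mymodule_entity_type_is_stored($sarnia_like);
$node_is_stored = mymodule_entity_type_is_stored($node_like);
```

With these sample arrays, the Sarnia-like definition reports as not stored and the node-like definition reports as stored.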

You’ll notice that in the “Sarnia” menu scope for Sarnia Search API servers, there are “Manage fields” and “Manage display.” It looks like at some point there was an initiative to allow developers to store field values for Solr documents so that Drupal can remember things about them. However, the Sarnia devs note:

“It is possible to add fields here, but there is no corresponding interface for editing field content; saving content has not been tested, even programmatically.”

I doubt that you would be successful trying to save values for Sarnia entity fields, because the Solr ID is not an integer and the field data tables require integers for entity ids. Fortunately, if you’re using the Sarnia module, saving field data in Drupal about your indexed Solr documents is more of an edge case.
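To make the ID mismatch concrete, here is a minimal sketch in plain PHP. The helper name and the sample Solr ID are hypothetical; the only assumption taken from above is that field data tables expect a non-negative integer entity ID.

```php
<?php

/**
 * Hypothetical helper: checks whether an ID would fit an integer
 * entity ID column, i.e. it is a non-negative whole number.
 */
function mymodule_id_fits_integer_column($id) {
  if (is_int($id)) {
    return $id >= 0;
  }
  // Accept only strings made up entirely of digits.
  return is_string($id) && preg_match('/^[0-9]+$/', $id) === 1;
}

// Assumed example of a Solr document ID; real IDs depend on your index.
$solr_id = 'products/sku-1042';
// A typical Drupal entity ID, such as a node ID.
$node_id = '1042';

$solr_id_fits = mymodule_id_fits_integer_column($solr_id);
$node_id_fits = mymodule_id_fits_integer_column($node_id);
```

A string ID like the assumed one above fails the check, which is why saving field values against Solr documents is unlikely to work without further changes.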

Solr field typing

The Sarnia module attempts to assign types to the fields that it finds in its target index. One weak point of this module is that these field mappings are not very robust. Properties are ingested into Search API as either “text”, if the Solr field is fulltext, or “none” for everything else.

In our case, we needed to use one of the Sarnia fields as a group-by field in one of our Solr queries, and group-by does not work on “text” fields. We needed to convert that field to type “string”, which we did with the following hook implementations:

/**
 * Implements hook_search_api_index_load().
 */
function mymodule_search_api_index_load($indexes) {
  // Sarnia only sets a type for fulltext fields, so we set this one manually.
  if (!empty($indexes['sarnia_sarnia_test']->options['fields']['ss_field_pattern$url'])) {
    $indexes['sarnia_sarnia_test']->options['fields']['ss_field_pattern$url']['type'] = 'string';
  }
}

/**
 * Implements hook_entity_property_info_alter().
 */
function mymodule_entity_property_info_alter(&$info) {
  // Add a definition for our grouping field. Left on its own, Sarnia only
  // adds properties for fulltext fields, which causes errors to be thrown
  // when we do our grouping implementation.
  if (isset($info['sarnia_sarnia_test']['properties'])) {
    $info['sarnia_sarnia_test']['properties']['ss_field_pattern$url'] = array(
      'type' => 'string',
      'label' => 'ss_field_pattern$url',
    );
  }
}

The Sarnia devs gave clues on how to produce this code in a comment explaining how the module’s field typing generally works:

“We have to inject the Solr properties both in hook_entity_load() and in hook_entity_property_info()”

Thus I employed a similar approach to alter the Solr properties. We did not test the Facet API integration, but my guess is that some similar work would have to be done to prepare the facetable Solr fields with a data type that Search API deems facetable.
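If several fields need explicit types (for grouping, or possibly for faceting), the per-field assignment could be driven from a single map. The following is a minimal sketch in plain PHP, not something taken from Sarnia: the helper name mymodule_apply_property_types() and the second field name are hypothetical, and whether Facet API accepts the resulting types is untested here.

```php
<?php

/**
 * Hypothetical helper: applies type overrides to a 'properties' array.
 *
 * @param array $properties
 *   Property definitions for one entity type, keyed by field name.
 * @param array $type_map
 *   Field name => data type to enforce.
 *
 * @return array
 *   The property definitions with the overrides applied.
 */
function mymodule_apply_property_types(array $properties, array $type_map) {
  foreach ($type_map as $field => $type) {
    $existing = isset($properties[$field]) ? $properties[$field] : array();
    // The new type wins, existing keys are kept, and the field name is
    // used as a label only when no label was defined.
    $properties[$field] = array('type' => $type) + $existing + array('label' => $field);
  }
  return $properties;
}

// 'ss_field_pattern$url' is the field from the hooks above;
// 'is_example_count' is an assumed field name for illustration only.
$type_map = array(
  'ss_field_pattern$url' => 'string',
  'is_example_count' => 'integer',
);

// Assumed starting point: one property already defined with a label.
$properties = array(
  'is_example_count' => array('label' => 'Example count'),
);

$typed = mymodule_apply_property_types($properties, $type_map);
```

Inside hook_entity_property_info_alter(), the returned array could be assigned back to $info['sarnia_sarnia_test']['properties'], with a matching loop over the index's field options in hook_search_api_index_load().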

Conclusion

Whether you are an organization adopting Drupal as a CMS but still wanting to use an externally managed Solr core to power web searches, or a Drupal agency going for a “land and expand” strategy by providing service to clients that may want only limited Drupal integration on day one, keep the Sarnia module in mind if Solr search is part of the project scope.

Additional Resources

Preparing for Solr in Four Easy Steps | Mediacurrent Blog Post
Your Intranet on Drupal | Mediacurrent Blog Post
