Upgrade Your Drupal Skills

We trained 1,000+ Drupal Developers over the last decade.

See Advanced Courses NAH, I know Enough

A view of nodes and their translations, part 2: optimizing the Views-generated query

Last week, I described a technique to query and display nodes in all available translations. This worked well enough, but a performance-minded reader pointed out that the query generated by Views (that includes N self-joins for N enabled languages) would not scale to a large number of nodes.

My usual approach when implementing new ideas is to ensure the logic works first, and only handle optimization when needed. It's a strategy that has worked well for me in the past. So I set out to test this hypothesis, and to optimize the query if it was needed. Here's what happened:

The first obstacle was to generate a large set of nodes and their translations. Devel Generate, the Devel sub-module that generates Drupal objects for development purposes, does not support content translation at the time of writing. I submitted a D7 patch to the 2 years old feature request to achieve this. I tested it with 10K nodes, and it seems to work well. Your review is appreciated!

Having generated 10K nodes and their translations to Arabic and French (30K nodes in total), I cloned the Proverbs view from last time to query and display this content. The result was quite explicit: the view page never finished loading! Clearly, the Views-generated query was not scaling. And for good reason: 3 SQL JOINS of 30,000 records each is a performance black hole. Optimization was needed.

My goal for optimizing the query was to retain all the advantages that Views offers in terms of theming query results, integrating into Drupal pages, etc. - these are indispensable features when creating real-world applications. In short, I wanted to transparently override the Views-generated query. To do so, I needed to:

  • Remove the peformance-killing JOINs from the query
  • Perform an optimized query to find node translations
  • Re-insert the results from the optimized query into the Views results, to allow it to proceed with the display

The code I used follows. I will explain the important parts below.

Remove the peformance-killing JOINs from the query

The function demo_i18n_views_query_alter() removes from the Views query object all references to the SQL JOINs, which are called "relationships" in Views parlance. Views core invokes this hook just before converting the query object into an SQL statement. The resulting query that Views will execute looks like this:

SELECT node.nid AS nid, node.created AS node_created
FROM {node} node
WHERE (( (node.status = '1') AND (node.type IN  ('multilingual_node')) AND (node.tnid = node.nid OR node.tnid = 0) ))

Perform an optimized query to find node translations

The query as modified above will only return nodes that are translation sources. It's now up to me to query the node translations, by waiting for Views to execute the modified query, and then gathering the nids to find their translations (as stored in {node}.tnid). This is a simple query using the SQL IN operator. I call this hand-made query in the demo_i18n_views_post_execute() function, which is invoked by Views after it executes its own query.

Re-insert the results from the optimized query into the Views results

The challenge with the new query is that it returns one node translation per row, as opposed to the original query which returned all translations on the same row. In addition, the results need to be copied into the view::result object, with the right key names that Views expects. In order to find the right key names, I first displayed the results from the unmodified Views query and noted the result keys. With this information, I then proceeded to loop over the optimized query results, and find the corresponding entry in the Views result array that would receive them. This loop is also implemented in the demo_i18n_views_post_execute() function.

The results were impressive! The view page loaded in very acceptable time (ApacheBench reports a mean time of ~1350ms, against ~650ms in the case of a view with just 4 nodes), and Views happily themed the translated nodes as if it had queried them itself. You can see this code in action on my i18n demo site.

The approach of hand-crafting Views queries has been on my mind for a long time, and I'm glad I took the first step. So far, I am not sure that a generic module can be created out of this, mainly due to the necessity to transform the result set after the optimized query is run. In any case, I'll be applying this technique in my projects!

Author: 
Original Post: 

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web