Jun 02 2013
Jun 02

Last week, I described a technique to query and display nodes in all available translations. This worked well enough, but a performance-minded reader pointed out that the query generated by Views (that includes N self-joins for N enabled languages) would not scale to a large number of nodes.

My usual approach when implementing new ideas is to ensure the logic works first, and only handle optimization when needed. It's a strategy that has worked well for me in the past. So I set out to test this hypothesis, and to optimize the query if it was needed. Here's what happened:

The first obstacle was to generate a large set of nodes and their translations. Devel Generate, the Devel sub-module that generates Drupal objects for development purposes, does not support content translation at the time of writing. I submitted a D7 patch to the 2 years old feature request to achieve this. I tested it with 10K nodes, and it seems to work well. Your review is appreciated!

Having generated 10K nodes and their translations to Arabic and French (30K nodes in total), I cloned the Proverbs view from last time to query and display this content. The result was quite explicit: the view page never finished loading! Clearly, the Views-generated query was not scaling. And for good reason: 3 SQL JOINS of 30,000 records each is a performance black hole. Optimization was needed.

My goal for optimizing the query was to retain all the advantages that Views offers in terms of theming query results, integrating into Drupal pages, etc. - these are indispensable features when creating real-world applications. In short, I wanted to transparently override the Views-generated query. To do so, I needed to:

  • Remove the peformance-killing JOINs from the query
  • Perform an optimized query to find node translations
  • Re-insert the results from the optimized query into the Views results, to allow it to proceed with the display

The code I used follows. I will explain the important parts below.

Remove the peformance-killing JOINs from the query

The function demo_i18n_views_query_alter() removes from the Views query object all references to the SQL JOINs, which are called "relationships" in Views parlance. Views core invokes this hook just before converting the query object into an SQL statement. The resulting query that Views will execute looks like this:

SELECT node.nid AS nid, node.created AS node_created
FROM {node} node
WHERE (( (node.status = '1') AND (node.type IN  ('multilingual_node')) AND (node.tnid = node.nid OR node.tnid = 0) ))

Perform an optimized query to find node translations

The query as modified above will only return nodes that are translation sources. It's now up to me to query the node translations, by waiting for Views to execute the modified query, and then gathering the nids to find their translations (as stored in {node}.tnid). This is a simple query using the SQL IN operator. I call this hand-made query in the demo_i18n_views_post_execute() function, which is invoked by Views after it executes its own query.

Re-insert the results from the optimized query into the Views results

The challenge with the new query is that it returns one node translation per row, as opposed to the original query which returned all translations on the same row. In addition, the results need to be copied into the view::result object, with the right key names that Views expects. In order to find the right key names, I first displayed the results from the unmodified Views query and noted the result keys. With this information, I then proceeded to loop over the optimized query results, and find the corresponding entry in the Views result array that would receive them. This loop is also implemented in the demo_i18n_views_post_execute() function.

The results were impressive! The view page loaded in very acceptable time (ApacheBench reports a mean time of ~1350ms, against ~650ms in the case of a view with just 4 nodes), and Views happily themed the translated nodes as if it had queried them itself. You can see this code in action on my i18n demo site.

The approach of hand-crafting Views queries has been on my mind for a long time, and I'm glad I took the first step. So far, I am not sure that a generic module can be created out of this, mainly due to the necessity to transform the result set after the optimized query is run. In any case, I'll be applying this technique in my projects!

Oct 20 2011
Oct 20

As part of the Knowledge Lab at Cocomore on 2 September, 2011, I outlined how to use Jenkins to implement Continuous Integration for a Drupal-based project. The basic building blocks for the demonstration included Jenkins, a demo Drupal project, and its automatic update from the SVN version control system, which triggers each run of SimpleTest. I will describe the primary steps to this process here.

Basic Installation

There are fully prepared Jenkins packages available for download for the most popular Linux distributions, including the Debian server version I used. The Jenkins project even maintains a repository which blends seamlessly into the software management. All the necessary commands are described on the Jenkins download page; with them, installing the base system is simply an “apt-get install jenkins” away.

Of course, since we want to test a Drupal system, our machine also needs a Web server (Apache), PHP, and a database (MySQL). Installing Ant is also useful, since it provides a good way to automate tasks and is very well supported by Jenkins.

Configuration

The next step is to create a new Job in Jenkins for testing our project. The “Build a free-style software project” option is the recommended setting for an Ant-based project like this.

Create a new job in Jenkins

On the configuration page, there are numerous settings to customize the job, according to your needs. For example, you can configure which version control system to use. CVS and SVN support are included in the Jenkins default installation, but there are easily-installed Jenkins plugins available for numerous other systems such as Git, Mercurial, or Bazaar. In this example, we are using SVN.

We need to add the repository URL (including path) to allow Jenkins to find our source code. We also need to configure what should trigger a build. For easy setup, it makes sense to either start the build at fixed intervals, or have Jenkins make periodic checks for changes in the repository and begin build execution after changes. We’ll choose the latter option and, for this example, the code in the “Schedule” field is set to check the repository every two minutes. It’s also useful that Jenkins examines not only the version of the “direct” project code, but also any linked dependencies via svn:externals.

Configure SVN in Jenkins

Now that Jenkins can find the source code, we need to configure what should happen during a build. We have chosen to create an Ant-based target, so we create a series of steps for our build, as Ant targets, which are executed in sequence.

Build processes in Selenium configured with Ant targets

We have yet to define what to do with the results of the build. We want to configure SimpleTest to write its output in JUnit XML format, so we select the JUnit plugin and enter the location our SimpleTest output will be found.

Post-build actions in Jenkins.

Here’s the relevant excerpt of the Ant target which runs the SimpleTests. In this example, it runs all tests in the group, “Demo”:

   target name="run_simpletests">
  [...]
   <exec dir="${basedir}/htdocs"
    executable="php"
    failonerror="true">
    arg line="scripts/run-tests.sh --php /usr/bin/php --url http://demo.debiserv/
      $--xml {basedir}/build/logs/simpletest Demo"/>
   </exec>
  </target>

The example below shows a very simple “Dummy” SimpleTest, which can be switched back and forth between returning “test successful” and “test failed”.

<?php
    
class DemoModuleTestCase extends DrupalWebTestCase {      public static function getInfo() {
       return array(
      
'name' => 'Demo Module tests',
      
'description' => 'Test the demo module.',
      
'group' => 'Demo',
        );
      }
   
      public function
testDummyTest() {
      
$value = true;
      
//$value = false;
      
$this->assertTrue($value, 'The value is true.');
      }
     }
?>

Jenkins build historyHere, if we look at the “Build History” block, we can see the results summary for recent test commits. The first was set to trigger a SimpleTest failure. Jenkins recognized a change in SVN, which triggered running build #34. The yellow ball indicates a build which completed, but a test failed. A subsequent commit to SVN triggered build #35; the blue ball indicates that the build and test were successful.

You can easily configure Jenkins to send notifications, via RSS or email, whenever the status of a project changes, so your development team is immediately made aware whenever any code changes cause a test to fail.

Conclusion: Continuous integration with Jenkins is useful

In conclusion, continuous integration can be implemented with Jenkins for a Drupal project. It is useful to formulate the steps as Ant targets; these targets can also be used for setting up development environments, so the time taken to write the automation is spent once, with continuous dividends throughout the life of your project. Drupal’s built-in SimpleTest framework can be a step in the project build process and further integrate its output with JUnit-compatible post-processing plug-ins. Jenkins notification features help keep developers up-to-date on the status of the project. With a little configuration, this can produce a complete solution.

PS: This basic demonstration of Jenkins did not cover the issues of security and authentication. As you’d expect, Jenkins supports users and access rights. On real installations such configuration should be considered mandatory, even if the system is in your internal network.

Jan 11 2010
Jan 11

Devel is a supremely useful module for Drupal development, but if you've never enabled the Development menu block, there are some useful links you might be missing out on. Here are some features of Devel that you might not know about:

Execute PHP
Path: devel/php
Provides a text area for entering PHP code into. Any output (print, print_r, var_dump) is shown in a drupal_set_message.

PHP Info
Path: devel/phpinfo
Get PHP configuration info from the server your site is running on.

View Theme Registry
Path: devel/theme/registry
Get down deep with the theme info your Drupal site knows about. Great for expert themers.

Function Reference
Path: devel/reference
Lists every function available to your Drupal site without loading any include files. Each function links to the API reference site you can specify at admin/settings/devel, so if it's a core function or a well-documented contrib funciton, it'll link to its documentation.

Reinstall Modules
Path: devel/reinstall
Uninstall and reinstall modules. This saves a lot of time compared to disabling the module, uninstalling it, and then re-enabling it.

Theme Developer Toggle
Path: devel/devel_themer
Another time saver. I love the Theme Developer module, but it's such a hassle to turn it on, use it for minute, and then off again right after using it. This path is a callback that switches the module on and off. Very handy.

View Source Code
Path: devel/source?file=sites/default/settings.php
View the actual, raw PHP source of any file on your website. If this sounds like a security issue to you, remember that every one of these Devel paths requires permission to view it.

Edit Stored Variables
Path: devel/variable
Edit variables stored in the {variable} table. These are the same variables available through variable_get() and variable_set().

Available Form Elements
Path: devel/elements
If you've worked with Form API, you've seen #type declarations like 'textfield', 'textarea', 'fieldset' and so on. This page lists all the types available to you. This can be handy with modules like gmap that silently add really cool form elements for you.

Dec 19 2009
mg
Dec 19

There are several modules I have used during building sites. I am not going to get into details of the modules because you should visit the modules. But by far, the most important / useful modules for me have been:

1. Views ; You can think of views as output to your custom query (except you don't write the query). There are several ways to present the views and Calendar is one example. So this is a query builder which can execute queries at run time and present the data however you want.

2. : You can add custom fields to a node or define a new content type based on CCK, This is powerful functionality with no coding. You can order the fields according to your preference. And CCK will become very powerful once is introduced. As of now, Multigroup is in alpha, but I have been using it with no problems. It lets you create a composite or a compound field, consisting of different base CCK fields.

3. : The module renders all administrative menu items below 'administer' in a clean, attractive and purely CSS-based menu at the top of your website. This is a life saver if you are administering a site. Essentially it is a overlay of the entire admin functionality as a ribbon on the top of your web site.

4. : I have to give it to katrib for this module. This is great if you want spreadsheet functionality in your website. We did some pretty advanced things with the module like embedding it within a node as a tab and YUI integration. We also configured a java bridge to be able to upload excel sheets from desktop to the website. Google spread sheet integration is pretty straightforward. This is based on SocialText platform

5. Enables users to create and manage their own groups (like forums). This can bring great social aspects to your website. For some of the sites, we changed the module to represent a company and it was great.

6. : This adds a lot of umph to the site. It is Mainly used to showcase featured content at a prominent place on the frontpage of the site. Demos and tutorials are excellent.

7. : Essential for developers and themers It can generate SQL query summaries, create data for your test site, show the theme information on your site. Don't develop without this module.

8. Pathauto : It automatically generates path aliases for various kinds of content (nodes, categories, users) without requiring the user to manually specify the path alias. Helps get rid of default / ugly paths (node/3 ... )

9. jQuery : jQuery depends on jQuery UI and jQuery update in drupal 6. And here is a good overview of jQuery.This architectural diagram might also be a useful reference. jQuery modules provides all the nice functionality jQuery library.

10. : This module provides several useful extensions to the login system of drupal. What I found really useful was to allow email based logins, email confirmation during registration and auto logout. Happy drupaling :)

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web