Sep 26 2019

I am writing this quick tutorial in the hope that it helps someone else out there. There are a few guides that cover similar tasks, but none were quite what I wanted.

To give everyone an idea on the desired outcome, this is what I wanted to achieve:

Example user profile with 2 custom tabs in it.

Before I dive into this, I will mention that you can do this with views, if all that you want to produce is content supplied by views. Ivan wrote a nice article on this. In my situation, I wanted a completely custom route, controller and theme function. I wanted full control over the output.

Steps to add sub tabs

Step 1 - create a new module

If you don't already have a module to house this code, you will need one. These commands make use of Drupal console, so ensure you have this installed first.

drupal generate:module --module='Example module' --machine-name='example' --module-path='modules/custom' --description='My example module' --package='Custom' --core='8.x'

Step 2 - create a new controller

Now that you have a base module, you need a controller and a route:

drupal generate:controller --module='example' --class='ExampleController' --routes='"title":"Content", "name":"example.user.contentlist", "method":"contentListUser", "path":"/user/{user}/content"'
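The generated controller is only a stub. As a rough sketch (the render array below is my placeholder, not the original implementation), the method might end up looking something like this:

```php
<?php

namespace Drupal\example\Controller;

use Drupal\Core\Controller\ControllerBase;
use Drupal\user\UserInterface;

/**
 * Controller for the custom user profile tabs.
 */
class ExampleController extends ControllerBase {

  /**
   * Renders the "Content" tab on a user's profile.
   */
  public function contentListUser(UserInterface $user) {
    // Placeholder render array; swap in your own query and theme function.
    return [
      '#markup' => $this->t('Content listing for %name.', ['%name' => $user->getDisplayName()]),
    ];
  }

}
```

Thanks to the route parameter upcasting configured in the next step, `$user` arrives as a fully loaded user entity rather than a raw ID.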

Step 3 - alter your routes

In order to use parameter upcasting (so {user} arrives in your controller as a loaded user entity) and proper access control, you can alter your routes to look like this. This is covered in the official documentation.

# Content user tab.
example.user.contentlist:
  path: '/user/{user}/content'
  defaults:
    _controller: '\Drupal\example\Controller\ExampleController::contentListUser'
    _title: 'Content'
  requirements:
    _permission: 'access content'
    _entity_access: 'user.view'
    user: \d+
  options:
    parameters:
      user:
        type: entity:user

# Reports user tab.
example.user.reportList:
  path: '/user/{user}/reports'
  defaults:
    _controller: '\Drupal\example\Controller\ExampleController::reportListUser'
    _title: 'Reports'
  requirements:
    _permission: 'access content'
    _entity_access: 'user.view'
    user: \d+
  options:
    parameters:
      user:
        type: entity:user

Step 4 - create the local tasks

Create an example.links.task.yml file in your module; this is the code that actually creates the tabs on the user profile. There is no Drupal console command for this unfortunately. The key part is defining base_route: entity.user.canonical.

example.user.zones_task:
  title: 'Content'
  route_name: example.user.contentlist
  base_route: entity.user.canonical
  weight: 1

example.user.reports_task:
  title: 'Reports'
  route_name: example.user.reportList
  base_route: entity.user.canonical
  weight: 2

Step 5 - enable the module

Don't forget to actually turn on your custom module; nothing will work until you do.

drush en example

Example module

The best (and simplest) example module I could find that demonstrates this is the Tracker module in Drupal core. The Tracker module adds a tab to the user profile.

Aug 19 2019

This is a short story on an interesting problem we were having with the Feeds module and Feeds directory fetcher module in Drupal 7.

Background on the use of Feeds

Feeds for this site is being used to ingest XML from a third party source (Reuters). The feed perhaps ingests a couple of hundred articles per day. There can be updates to the existing imported articles as well, but typically they are only updated the day the article is ingested.

Feeds had been working well for years, and then all of a sudden the ingests started to fail. The failure only occurred on production; on the lower environments the ingestion worked as expected.

The bizarre error

On production we were experiencing the error during import:

PDOStatement::execute(): MySQL server has gone away        database.inc:2227 [warning]
PDOStatement::execute(): Error reading result set's header database.inc:2227 [warning]
PDOException: SQLSTATE[HY000]: General error: 2006 MySQL server has gone away [error]

The error does not mean the database server is down; rather, PHP's connection to the database was severed after exceeding MySQL's wait_timeout value.

The reason this occurred only on production is that on Acquia the production public filesystem is a network file share (which is slower), whereas on the lower environments the filesystem is local disk (as those environments are not clustered), so access is a lot faster. The problem typically shows up when you need to read and write to the shared filesystem a lot.

Going down the rabbit hole

Working out why Feeds wanted to read and/or write so many files was the next question, and one thing immediately stood out: the sheer size of the config column in the feeds_source table:

mysql> SELECT id,SUM(char_length(config))/1048576 AS size FROM feeds_source GROUP BY id;
+-------------------------------------+---------+
| id                                  | size    |
+-------------------------------------+---------+
| apworldcup_article                  |  0.0001 |
| blogs_photo_import                  |  0.0003 |
| csv_infographics                    |  0.0002 |
| photo_feed                          |  0.0002 |
| po_feeds_prestige_article           |  1.5412 |
| po_feeds_prestige_gallery           |  1.5410 |
| po_feeds_prestige_photo             |  0.2279 |
| po_feeds_reuters_article            | 21.5086 |
| po_feeds_reuters_composite          | 41.9530 |
| po_feeds_reuters_photo              | 52.6076 |
| example_line_feed_article           |  0.0002 |
| example_line_feed_associate_article |  0.0001 |
| example_line_feed_blogs             |  0.0003 |
| example_line_feed_gallery           |  0.0002 |
| example_line_feed_photo             |  0.0001 |
| example_line_feed_video             |  0.0002 |
| example_line_youtube_feed           |  0.0003 |
+-------------------------------------+---------+
What 52 MB of ASCII looks like in a single cell.

Having to deserialize 52 MB of ASCII in PHP is bad enough.
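To get a feel for why, here is a small self-contained sketch (the file names are made up) of how quickly this serialized blob grows:

```php
<?php
// Illustrative only: roughly how the feeds_source config blob grows.
// serialize() stores every array key and value as ASCII, so an array of
// ~450,000 file paths (keyed by path, valued by timestamp) easily reaches
// tens of megabytes, all of which unserialize() must rebuild in memory on
// every single import run.
$fetched = [];
for ($i = 0; $i < 1000; $i++) {
  $fetched[sprintf('private://reuters/pass1/file-%05d.XML', $i)] = 1530693632;
}
$blob = serialize($fetched);
printf("%d entries -> %d bytes serialized\n", count($fetched), strlen($blob));
```

At roughly 60 bytes per entry, 450,000 entries lands right in the tens-of-megabytes range seen above.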

The next step was dumping the value of the config column for a single row:

drush --uri=www.example.com sqlq 'SELECT config FROM feeds_source WHERE id = "po_feeds_reuters_photo"' > /tmp/po_feeds_reuters_photo.txt
Get the 55 MB of ASCII in a file for analysis

Then open the resulting file in vim:

"/tmp/po_feeds_reuters_photo.txt" 1L, 55163105C
Vim struggles to open any file that has 55 million characters on a single line

And sure enough, inside this config column was a reference to every single XML file ever imported, a cool ~450,000 files.

a:2:{s:31:"feeds_fetcher_directory_fetcher";a:3:{s:6:"source";s:23:"private://reuters/pass1";s:5:"reset";i:0;
s:18:"feed_files_fetched";a:457065:{
s:94:"private://reuters/pass1/topnews/2018-07-04T083557Z_1_KBN1JU0WQ_RTROPTC_0_US-CHINA-AUTOS-GM.XML";i:1530693632;
s:94:"private://reuters/pass1/topnews/2018-07-04T083557Z_1_KBN1JU0WR_RTROPTT_0_US-CHINA-AUTOS-GM.XML";i:1530693632;
s:96:"private://reuters/pass1/topnews/2018-07-04T083557Z_1_LYNXMPEE630KJ_RTROPTP_0_USA-TRADE-CHINA.XML";i:1530693632;
s:97:"private://reuters/pass1/topnews/2018-07-04T083617Z_147681_KBE99T04E_RTROPTT-LNK_0_OUSBSM-LINK.XML";i:1530693632;
s:102:"private://reuters/pass1/topnews/2018-07-04T083658Z_1_KBN1JU0X2_RTROPTT_0_JAPAN-RETAIL-ZOZOTOWN-INT.XML";i:1530693632
457,065 is the array size in feed_files_fetched

So this is the root cause of the problem: Drupal is attempting to stat() ~450,000 files that do not exist, and these files are mounted on a network file share. This process took longer than MySQL's wait_timeout, and MySQL closed the connection. When Drupal finally wanted to talk to the database, the connection was gone.

Interestingly enough, the problem of the config column running out of space came up in 2012, and "the solution" was simply to change the type of the column, so you can now store 4 GB of content in this one column. In hindsight, perhaps this was not the smartest solution.

Also in 2012, you see the comment from @valderama:

However, as feed_files_fetched saves all items which were already imported, it grows endless if you have a periodic import.

Great to see we are not the only people having this pain.

The solution

The simple way to limp by is to increase the wait_timeout value of your database connection. This gives Drupal more time to scan for the previously imported files before importing the new ones.

$databases['default']['default']['init_commands'] = [
  'wait_timeout' => "SET SESSION wait_timeout=1500",
];
Increasing MySQL's wait_timeout in Drupal's settings.php.

As you might guess, this is not a good long term solution for sites with a lot of imported content, or content that is continually being imported.

Instead, we opted to write a fairly quick update hook that loops through all of the items in the feed_files_fetched key and unsets the older items.

<?php

/**
 * @file
 * Install file.
 */

/**
 * Finds the position of the first needle that occurs in the haystack.
 *
 * @see https://www.sitepoint.com/community/t/strpos-with-multiple-characters/2004/2
 *
 * @param string $haystack
 *   The string to search in.
 * @param array $needles
 *   The strings to search for.
 * @param int $offset
 *   The offset to begin searching from.
 *
 * @return bool|int
 *   The position of the first match, or FALSE if no needle is found.
 */
function multi_strpos($haystack, array $needles, $offset = 0) {
  foreach ($needles as $needle) {
    $position = strpos($haystack, $needle, $offset);
    if ($position !== FALSE) {
      return $position;
    }
  }
  return FALSE;
}

/**
 * Implements hook_update_N().
 */
function example_reuters_update_7001() {
  $feedsSource = db_select("feeds_source", "fs")
    ->fields('fs', ['config'])
    ->condition('fs.id', 'po_feeds_reuters_photo')
    ->execute()
    ->fetchObject();

  $config = unserialize($feedsSource->config);

  // We only want to keep the last week's worth of imported articles in the
  // database for content updates.
  $cutoff_date = [];
  for ($i = 0; $i < 7; $i++) {
    $cutoff_date[] = date('Y-m-d', strtotime("-$i days"));
  }

  watchdog('example_reuters', 'FeedSource records before trim: @count', ['@count' => count($config['feeds_fetcher_directory_fetcher']['feed_files_fetched'])]);

  // We attempt to match based on the filename of the imported file. This works
  // as the files have a date in their filename.
  // e.g. '2018-07-04T083557Z_1_KBN1JU0WQ_RTROPTC_0_US-CHINA-AUTOS-GM.XML'
  foreach ($config['feeds_fetcher_directory_fetcher']['feed_files_fetched'] as $key => $source) {
    if (multi_strpos($key, $cutoff_date) === FALSE) {
      unset($config['feeds_fetcher_directory_fetcher']['feed_files_fetched'][$key]);
    }
  }

  watchdog('example_reuters', 'FeedSource records after trim: @count', ['@count' => count($config['feeds_fetcher_directory_fetcher']['feed_files_fetched'])]);

  // Save back to the database.
  db_update('feeds_source')
    ->fields([
      'config' => serialize($config),
    ])
    ->condition('id', 'po_feeds_reuters_photo', '=')
    ->execute();
}

Before the code ran there were > 450,000 items in the array; afterwards, fewer than 100 remained. A massive decrease in database size.

More importantly, the importer now runs a lot quicker (as it is not scanning the shared filesystem for non-existent files).

Jul 28 2019

PHP 7.3.0 was released in December 2018, and brings with it a number of improvements in both performance and the language. As always with Drupal you need to strike a balance between adopting these new improvements early and running into issues that are not yet fixed by the community.

Why upgrade PHP to 7.3 over 7.2?

What hosting providers support PHP 7.3?

All the major players have support, here is how you configure it for each.

Acquia

Somewhere around April 2019 the option to choose PHP 7.3 was released. You can opt into this version by changing a value in Acquia Cloud. This can be done on a per environment basis.

The PHP version configuration screen for Acquia Cloud 

Pantheon

Pantheon has had support since April 2019 as well (see the announcement post). To change the version, you update your pantheon.yml file (see the docs on this).

# Put overrides to your pantheon.upstream.yml file here.
# For more information, see: https://pantheon.io/docs/pantheon-yml/
api_version: 1
php_version: 7.3
Example pantheon.yml file

On a side note, it is interesting that PHP 5.3 is still offered on Pantheon (it has been end of life for nearly 5 years).

Platform.sh

I am unsure when Platform.sh released PHP 7.3 support, but the process to enable it is very similar to Pantheon: you update your .platform.app.yaml file (see the docs on this).

# .platform.app.yaml
type: "php:7.3"
Example .platform.app.yaml file

Dreamhost

PHP 7.3 is also available on Dreamhost, and can be chosen in a dropdown in their UI (see the docs on this).

The 'Manage Domains' section of Dreamhost

Dreamhost also wins an award for allowing the oldest version of PHP that I have seen in a while (PHP 5.2).

When can you upgrade to PHP 7.3?

Drupal 8

As of Drupal 8.6.4 (6th December 2018), PHP 7.3 is fully supported in Drupal core (change record). I have been running PHP 7.3 with Drupal 8 for a while now and have seen no issues, and this includes running some complex installation profiles such as Thunder and Lightning.

Any Drupal 8 site that is reasonably up to date should be fine with PHP 7.3 today.

Drupal 7

PHP 7.3 support is slated for the next release of Drupal 7, being Drupal 7.68 (see the drupal.org issue); however, there are a number of related tasks that seem like deal breakers. There are also no daily tests running Drupal 7 against PHP 7.3.

It seems that for the meantime it is probably best to hold off on the PHP 7.3 upgrade until 7.68 is out the door, and contributed modules have had a chance to catch up and cut a new stable release.

A simple search on Drupal.org yields the following modules that look like they may need work (more are certainly possible):

  • composer_manager (issue)
  • scald (issue) [now fixed and released]
  • video (issue)
  • search_api (issue) [now fixed and released]

Most of the issues seem to be related to this deprecation: Deprecate and remove continue targeting switch. If you know of any other modules that have issues, please let me know in the comments.
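To make that deprecation concrete, here is a hypothetical loop exhibiting the construct PHP 7.3 warns about, with the fix applied:

```php
<?php
// A bare "continue" inside a switch acts like "break": it does NOT skip
// the loop iteration, which is almost never what the author intended.
// PHP 7.3 flags it with: '"continue" targeting switch is equivalent to
// "break". Did you mean to use "continue 2"?'
function filter_items(array $items): array {
  $kept = [];
  foreach ($items as $item) {
    switch ($item) {
      case 'skip':
        // Pre-7.3 code often had a bare "continue" here. Targeting the
        // foreach explicitly with "continue 2" silences the warning and
        // does what was actually meant: skip this iteration.
        continue 2;
    }
    $kept[] = $item;
  }
  return $kept;
}
```

The affected contributed modules mostly need this same one-line change.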

Drupal 6

For all you die-hard Drupal 6 fans out there (I know a few large websites still running it), you are going to be in for a rough ride. There is a PHP 7 branch of the d6lts Github repo, which is promising; however, the last commit was September 2018, so this does not bode well for PHP 7.3 support. I also doubt contributed modules are going to be up to scratch (drupal.org does not even list D6 versions of modules anymore).

To test this theory, I audited the current 6.x-2.x branch of views:

$ phpcs -p ~/projects/views --standard=PHPCompatibility --runtime-set testVersion 7.3
................................................W.W.WW.W....  60 / 261 (23%)
................................E........................... 120 / 261 (46%)
...................................................EE....... 180 / 261 (69%)
............................................................ 240 / 261 (92%)
.....................                                        261 / 261 (100%)

3 errors in views alone, and the errors are show stoppers too:

Function split() is deprecated since PHP 5.3 and removed since PHP 7.0; Use preg_split() instead
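The fix for this particular error is mechanical; a quick sketch with made-up input:

```php
<?php
// Removed in PHP 7.0:
//   $parts = split(',', 'a,b,c');
// preg_split() is the drop-in replacement; note the pattern now needs
// regex delimiters:
$parts = preg_split('/,/', 'a,b,c');
// For a plain string delimiter, explode() avoids the regex engine entirely:
$same = explode(',', 'a,b,c');
```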

If this is the state of one of the most popular modules for Drupal 7, then I doubt the lesser known modules will be any better.

If you are serious about supporting Drupal 6, it would pay to get in contact with My Drop Wizard, as they at least are providing support for people looking to adopt PHP 7.

Jul 08 2019

Part of my day job is to help tune the Cloudflare WAF for several customers. This blog post helps to summarise some of the default rules I will deploy to every Drupal (7 or 8) site as a base line.

The custom WAF rules in this blog post are shown in YAML format (for humans to read); if you want to create these rules via the API, you will need them in JSON format (see the end of this blog post for a sample API command).

Default custom WAF rules

Unfriendly Drupal 7 URLs

I often see bots trying to hit URLs like /?q=node/add and /?q=user/register. These are the default unfriendly URLs to hit on a Drupal 7 site to see if user registration is open, or if someone has messed up the permissions (and you can create content as an anonymous user). Needless to say, these requests are rubbish and add no value to your site; let's block them.

description: 'Drupal 7 Unfriendly URLs (bots)'
action: block
filter:
  expression: '(http.request.uri.query matches "q=user/register") or (http.request.uri.query matches "q=node/add")'

Autodiscover

If your organisation has bought Microsoft Exchange, then your site will likely receive loads of requests (GET and POST) for Autodiscover URLs, which just tie up resources on your application server serving 404s. I am yet to meet anyone that actually serves real responses from a Drupal site for Autodiscover URLs. Blocking is a win here.

description: Autodiscover
action: block
filter:
  expression: '(http.request.uri.path matches "/autodiscover\.xml$") or (http.request.uri.path matches "/autodiscover\.src/")'

Wordpress

Seeing as Wordpress has a huge market share (34% of all websites), a lot of Drupal sites get caught up in the mindless (and endless) crawling for Wordpress URLs. These rules will effectively remove all of this traffic from your site.

description: 'Wordpress PHP scripts'
action: block
filter:
  expression: '(http.request.uri.path matches "/wp-.*\.php$")'

description: 'Wordpress common folders (excluding content)'
action: block
filter:
  expression: '(http.request.uri.path matches "/wp-(admin|includes|json)/")'

I separate wp-content into its own rule, as you may want to disable this rule if you are migrating from an old Wordpress site (and want to put redirects in place, for instance).

description: 'Wordpress content folder'
action: block
filter:
  expression: '(http.request.uri.path matches "/wp-content/")'

SQLi

I have seen several instances in the past where obvious SQLi was being attempted and the default Cloudflare WAF rules were not intercepting it. This custom WAF rule attempts to fill that gap.

description: 'SQLi in URL'
action: block
filter:
  expression: '(http.request.uri.path contains "select unhex") or (http.request.uri.path contains "select name_const") or (http.request.uri.path contains "unhex(hex(version()))") or (http.request.uri.path contains "union select") or (http.request.uri.path contains "select concat")'

Drupal 8 install script

Drupal 8's default install script will expose the major, minor and patch version of Drupal you are running. This is bad for a lot of reasons.

Drupal 8's default install screen exposes far too much information

It is better to just remove these requests from your Drupal site altogether. Note, this is not a replacement for upgrading Drupal; it just makes fingerprinting a little harder.

description: 'Install script'
action: block
filter:
  expression: '(http.request.uri.path eq "/core/install.php")'

Microsoft Office and Skype for Business

Microsoft sure is good at making products that attempt to DoS their own customers' websites. These requests are always POST requests, often to your homepage, and you need partial string matching on the user agent, as it changes with the version of Office/Skype being run.

In large organisations, I have seen the number of requests here reach the hundreds of thousands per day.

description: 'Microsoft Office/Skype for Business POST requests'
action: block
filter:
  expression: '(http.request.method eq "POST") and (http.user_agent matches "Microsoft Office" or http.user_agent matches "Skype for Business")'

Microsoft ActiveSync

Yet another Microsoft product that insists on hitting a magic endpoint that doesn't exist.

description: 'Microsoft Active Sync'
action: block
filter:
  expression: '(http.request.uri.path eq "/Microsoft-Server-ActiveSync")'

Using the Cloudflare API to import custom WAF rules

It can be a pain to manually point and click a few hundred times per zone to create the above rules. Instead, you are better off using the API. Here is a sample cURL command you can use to import all of the above rules in one go.

You will need to replace the redacted sections with your details.

curl 'https://api.cloudflare.com/client/v4/zones/XXXXXXXXXXXXXX/firewall/rules' \
  -H 'X-Auth-Email: XXXXXXXXXXXXXX' \
  -H 'X-Auth-Key: XXXXXXXXXXXXXX' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'Accept-Encoding: gzip' \
  -X POST \
  -d '[{"ref":"","description":"Autodiscover","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path matches \"\/autodiscover\\.xml$\") or (http.request.uri.path matches \"\/autodiscover\\.src\/\")"}},{"ref":"","description":"Drupal 7 Unfriendly URLs (bots)","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.query matches \"q=user\/register\") or (http.request.uri.query matches \"q=node\/add\")"}},{"ref":"","description":"Install script","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path eq \"\/core\/install.php\")"}},{"ref":"","description":"Microsoft Active Sync","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path eq \"\/Microsoft-Server-ActiveSync\")"}},{"ref":"","description":"Microsoft Office\/Skype for Business POST requests","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.method eq \"POST\") and (http.user_agent matches \"Microsoft Office\" or http.user_agent matches \"Skype for Business\")"}},{"ref":"","description":"SQLi in URL","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path contains \"select unhex\") or (http.request.uri.path contains \"select name_const\") or (http.request.uri.path contains \"unhex(hex(version()))\") or (http.request.uri.path contains \"union select\") or (http.request.uri.path contains \"select concat\")"}},{"ref":"","description":"Wordpress common folders (excluding content)","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path matches \"\/wp-(admin|includes|json)\/\")"}},{"ref":"","description":"Wordpress content folder","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path matches \"\/wp-content\/\")"}},{"ref":"","description":"Wordpress PHP scripts","paused":false,"action":"block","priority":null,"filter":{"expression":"(http.request.uri.path matches \"\/wp-.*\\.php$\")"}}]'

How do you know the above rules are working?

Visit the firewall overview tab in Cloudflare's UI to see how many requests are being intercepted by the above rules.

Cloudflare's firewall overview screen showing the custom WAF rules in action

Final thoughts

The above custom WAF rules are likely not the only custom WAF rules you will need for any given Drupal site, but it should at least be a good start. Let me know in the comments if you have any custom WAF rules that you always deploy. I would be keen to update this blog post with additional rules from the community.

This is likely the first post in a series of blog posts on customising Cloudflare to suit your Drupal site. If you want to stay up to date - subscribe to the RSS feed, sign up for email updates, or follow us on Twitter.

Jun 21 2019

I am working with a customer at the moment who is looking to go through a JSON:API upgrade: from version 1.x on Drupal 8.6.x to 2.x, and then ultimately to Drupal 8.7.x (where it is bundled into core).

As this upgrade will involve many moving parts, and it is critical to not break any existing integrations (e.g. mobile applications etc), having basic end-to-end tests over the API endpoints is essential.

In the past I have written a lot about CasperJS, and since then a number of more modern frameworks have emerged for end-to-end testing. For the last year or so, I have been involved with Cypress.

I won't go too much in depth about Cypress in this blog post (I will likely post more in the coming months), instead I want to focus specifically on JSON:API testing using Cypress.

In this basic test, I just wanted to hit some known valid endpoints, and ensure the response was roughly OK.

Rather than rinse and repeat a lot of boilerplate code for every API endpoint, I wrote a custom Cypress command which abstracts all of this away into a convenient function.

Below is what the spec file (the test definition) looks like. It is very clean, being mostly just the JSON:API paths.

describe('JSON:API tests.', () => {

    it('Agents JSON:API tests.', () => {
        cy.expectValidJsonWithMinimumLength('/jsonapi/node/agent?_format=json&include=field_agent_containers,field_agent_containers.field_cont_storage_conditions&page[limit]=18', 6);
        cy.expectValidJsonWithMinimumLength('/jsonapi/node/agent?_format=json&include=field_agent_containers,field_agent_containers.field_cont_storage_conditions&page[limit]=18&page[offset]=72', 0);
    });
    
    it('Episodes JSON:API tests.', () => {
        cy.expectValidJsonWithMinimumLength('/jsonapi/node/episode?fields[file--file]=uri,url&filter[field_episode_podcast.nid][value]=4976&include=field_episode_podcast,field_episode_audio,field_episode_audio.field_media_audio_file,field_episode_audio.thumbnail,field_image,field_image.image', 6);
    });

});
jsonapi.spec.js

As for the custom command implementation, it is fairly straightforward. The basic checks are:

  • Ensure the response is an HTTP 200
  • Ensure the content-type is valid for JSON:API
  • Ensure there is a response body and it is valid JSON
  • Enforce a minimum number of entities you expect to be returned
  • Check for certain properties in those returned entities.

Cypress.Commands.add('expectValidJsonWithMinimumLength', (url, length) => {
    return cy.request({
        method: 'GET',
        url: url,
        followRedirect: false,
        headers: {
            'accept': 'application/json'
        }
    })
    .then((response) => {
        // Parse the JSON body.
        let body = JSON.parse(response.body);
        expect(response.status).to.eq(200);
        expect(response.headers['content-type']).to.eq('application/vnd.api+json');
        cy.log(body);
        expect(response.body).to.not.be.null;
        expect(body.data).to.have.length.of.at.least(length);

        // Ensure certain properties are present.
        body.data.forEach(function (item) {
            expect(item).to.have.all.keys('type', 'id', 'attributes', 'relationships', 'links');
            ['changed', 'created', 'default_langcode', 'langcode', 'moderation_state', 'nid', 'path', 'promote', 'revision_log', 'revision_timestamp', 'status', 'sticky', 'title', 'uuid', 'vid'].forEach((key) => {
                expect(item['attributes']).to.have.property(key);
            });
        });
    });

});
commands.js

One of the neat things about this function is that it logs the parsed JSON response with cy.log(body); this allows you to inspect the response in Chrome, and to extend the test function rather easily to meet your own needs (as you can see the full entity properties and fields).

Cypress with a GUI can show you detailed log information

Using Cypress is like having an extra pair of eyes on the Drupal upgrade. Over time Cypress will end up saving us a lot of developer time (and therefore money). The tests will be in place forever, and so regressions can be spotted much sooner (ideally in local development) and therefore fixed much faster.

If you do JSON:API testing with Cypress I would be keen to know if you have any tips and tricks.

May 19 2016

Introduction

I have now had the chance to be involved with two fresh Drupal 8 builds, so I thought I would describe some of the neat things I have found during this time, and some of my lessons learned. My hope is that this blog post will help you in your journey with Drupal 8.

1. Drupal Console is awesome

Every time you need to generate a custom module, or a new block in a custom module, you can quickly and easily use Drupal Console to produce the code scaffolding for you. This quite easily makes the job of a developer a lot less stressful, and allows you to focus on actually writing code that delivers functionality.

I plucked these example commands that I use frequently from my bash history:

drupal site:mode dev
drupal generate:module
drupal generate:plugin:block
drupal generate:routesubscriber
drupal generate:form:config

Documentation is online, but for the most part the commands are self-documenting; if you use the --help option, you get a great summary of the command and the other options you can pass in.

The other nice thing is that this is a Symfony Console application, so it should feel very familiar if you have used another tool written with the same framework.

2. Custom block types are amazing

In Drupal 7 land there was bean, which was an attempt to stop making ‘meta’ nodes to fill in the content-editable parts of complex landing pages. Fast forward to Drupal 8, and custom block types are now in Drupal core.

This basically means as a site builder you now have another really powerful tool at your disposal in order to model content effectively in Drupal 8.

Each custom block type can have it’s own fields, it’s own display settings, and form displays.

Here are the final custom block types on a recent Drupal 8 build:

One downside is that there is no access control per custom block type (just the global “administer blocks” permission); no doubt contrib will step in to fill this hole in the future (does anyone know a module that can help here?). In the meantime there is a drupal.org issue on the subject.

I also found it weird that the custom blocks administration section is not directly under the ‘Structure’ section of the site; there is another drupal.org issue about normalising this as well. Setting up some default shortcuts really helped me save time.

3. View modes on all the things

To create custom view modes in Drupal 7 required either a custom module or Dave Reid’s entity_view_mode contrib module. Now this is baked into Drupal 8 core.

View modes on your custom block types take things to yet another level. This is one more feather in the Drupal site builder’s cap.

4. Twig is the best

In Drupal 7 I always found it weird that you could not unleash a front end developer on your site and expect a pleasant result. To be successful, the themer needed to know PHP, preprocess hooks, template naming standards, the mystical order in which templates apply, and so on. This often meant that a back end and front end developer had to work together to get a good outcome.

With the introduction of Twig, I now feel that theming is back in the hands of the front end developer, and knowledge of PHP is no longer needed in order to override just about any markup that Drupal 8 produces.

Pro tip: use the Drupal Console command drupal site:mode dev to enable Twig development options and disable Drupal caching. Another positive side effect is that Twig will then render the entire list of templates you could be using, which one you actually are using, and where that template is located.
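For reference, these development options boil down to roughly the following Twig settings in sites/default/services.yml (drupal site:mode dev toggles them for you):

```yaml
# sites/default/services.yml
parameters:
  twig.config:
    # Print the list of template suggestions as HTML comments.
    debug: true
    # Recompile templates when their source changes.
    auto_reload: true
    # Skip the compiled-Twig cache entirely.
    cache: false
```

Remember to flip these back (or run drupal site:mode prod) before deploying.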

Pro tip: if you want to use a template per custom block type (which I did), you can use this PHP snippet in your theme’s .theme file (taken from drupal.org):

<?php
/**
 * Implements hook_theme_suggestions_HOOK_alter() for form templates.
 *
 * @param array $suggestions
 * @param array $variables
 */
function THEMENAME_theme_suggestions_block_alter(array &$suggestions, array $variables) {
  if (isset($variables['elements']['content']['#block_content'])) {
    array_splice($suggestions, 1, 0, 'block__bundle__' . $variables['elements']['content']['#block_content']->bundle());
  }
}
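With that alter in place, a custom block type with (for example) the hypothetical machine name hero_banner can be themed with a template named after the new suggestion (underscores become hyphens in the filename):

```twig
{# themes/THEMENAME/templates/block--bundle--hero-banner.html.twig #}
<div{{ attributes.addClass('hero-banner') }}>
  {{ title_prefix }}{{ title_suffix }}
  {{ content }}
</div>
```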

5. Panelizer for complex landing pages

When looking for a layout manager to help build the more complex landing pages, I came across panelizer + panels IPE. Using panelizer you are able to:

  • create per node layout variants
  • apply a single layout to all nodes of a particular bundle (e.g. all your news articles have the same layout)

The other neat thing is that the layouts themselves are now standardised between all the various layout managers using a contrib module called layout_plugin. Also they are just YAML and Twig. Simple. There is even an effort to get this merged into Drupal 8.2 which I think would be a great idea.

Downside - all JS is still rendered on the page even though the user (e.g. anonymous users) have no access to panelizer. There is a patch on drupal.org to help fix this.

Since starting this build there has also been a stable release of display suite come out for Drupal 8 as well giving you even more options.

6. You can build a rather complex site with very little contributed modules

For this most recent site I build I got away with using only 10 contributed modules (one of which - devel was purely for debugging purposes).

  • ctools
  • google_analytics
  • metatag
  • panels
  • token
  • contact_block
  • devel
  • layout_plugin
  • panelizer
  • pathauto

This means you are inherently building a more stable and supportable site, as most of the functionality now comes out of Drupal core.

In Drupal 7, the contact module was one of those modules to which I never turned on, as it was rather inflexible. You could not change the fields in a UI, nor add email recipients, or have more than 1 form. Now in Drupal 8 you can have as many “contact” forms as you want, each one is fieldable, and can send emails to as many people as needed.

You can also enhance the core module with:

  • contact_block - allows you to place the contact form in a block
  • contact_storage - allows you to store the submissions in the database, rather than firing an email and forgetting about it

There is still a place for webform, namely:

  • large complex form with lots of fields
  • multi-step forms
  • forms you want to ‘save draft’

You can read more about this in the OS training blog post on the contact module.

Downside - I wanted to have a plain page use the path /contact but the contact module registers this path, so pathauto gave my contact page a path of /contact-0. Luckily creating a route subscriber with Drupal Console was painless, so altering the contact module route was very simple to do. I can paste the code here if needed, but most of it is the code that Drupal Console generates for you.

8. PHPunit is bundled into core

Now that Drupal 8 is largely Object Oriented (OO), you are able to test classes using PHPunit. I have wrote about phpunit in the past if you want to know more.

9. Views is in core

This was the main reason why adoption of Drupal 7 was so slow after it’s initial 7.0 release, as everyone needed views to be stable before jumping ship. Now with views bundled into core, views plugins are also being ported at a great rate of knots too.

10. CKEditor is in core

I often found that this was one library that never (or hardly ever) got updated on sites that had been around for a while. More worryingly, CKEditor (the library) would from time to time fix security related issues. Now that this comes with Drupal 8 core, it is just one less thing to worry about.

Also I would love to shout out to Wim Leers (and other contributors) for revamping the image dialog with alignment and caption options. I cannot tell you how much pain and suffering this caused me in Drupal 7.

If you have built a site recently in Drupal 8 and have found anything interesting or exciting, please let me know in the comments. Also keen to see what sites people have built, so post a link to it if it is public.

May 18 2016
May 18

Introduction

I have had the chance to be involved with 2 fresh Drupal 8 builds now, so I thought I would describe some of the neat things I have found during this time and some of my lessons learned. My hope is that this blog post will help you in your journey with Drupal 8.

1. Drupal Console is awesome

Every time you need to generate a custom module, or a new block in a custom module, you can quickly and easily use Drupal Console to produce the code scaffolding for you. This quite easily makes the job of a developer a lot less stressful, and allows you to focus on actually writing code that delivers functionality.

I plucked these example commands that I use frequently from my bash history:

drupal site:mode dev
drupal generate:module
drupal generate:plugin:block
drupal generate:routesubscriber
drupal generate:form:config

Documentation is online but for the most part, the commands are self documenting, if you use the --help option, then you get a great summary on the command, and the other options you can pass in.

The other nice thing is that this is a Symfony Console application, so it should feel very familiar to you if you used another tool written in the same framework.

2. Custom block types are amazing

In Drupal 7 land there was bean, which was an attempt to stop making ‘meta’ nodes to fill in the content editable parts of complex landing pages. Fast forward to Drupal 8, and custom block types are now in Drupal core.

This basically means as a site builder you now have another really powerful tool at your disposal in order to model content effectively in Drupal 8.

Each custom block type can have its own fields, its own display settings, and its own form displays.

Here are the final custom block types on a recent Drupal 8 build:

Custom block types in Drupal 8

One downside is that there is no access control per custom block type (just a global permission “administer blocks”); no doubt contrib will step in to fill this hole in the future (does anyone know a module that can help here?). In the meantime there is a drupal.org issue on the subject.

I also found it weird that the custom blocks administration section was not directly under the ‘structure’ section of the site, there is another drupal.org issue about normalising this as well. Setting up some default shortcuts really helped me save some time.

Shortcuts are handy in Drupal 8 to remember where the custom block section is

3. View modes on all the things

To create custom view modes in Drupal 7 required either a custom module or Dave Reid’s entity_view_mode contrib module. Now this is baked into Drupal 8 core.

View modes on your custom block types takes things to yet another level still as well. This is one more feather in the Drupal site builder’s cap.

4. Twig is the best

In Drupal 7 I always found it weird that you could not unleash a front end developer upon your site and expect to have a pleasant result. In order to be successful the themer would need to know PHP, preprocess hooks, template naming standards, the mystical specific order in which the templates apply and so on. This often meant that a backend and front end developer would need to work together in order to create a good outcome.

With the introduction of Twig, I now feel that theming is back in the hands of the front end developer, and knowledge of PHP is no longer needed in order to override just about any markup that Drupal 8 produces.

Pro tip - use the Drupal Console command drupal site:mode dev to enable Twig development options, and disable Drupal caching. Another positive side effect is that Twig will then print (as HTML comments) the entire list of templates that you could be using, which one you actually are using, and where that template is located.

Twig developer comments in HTML are the best

Pro tip - if you want to use a template per custom block type (which I did), then you can use this PHP snippet in your theme’s .theme file (taken from drupal.org):

<?php
/**
 * Implements hook_theme_suggestions_HOOK_alter() for block templates.
 *
 * @param array $suggestions
 * @param array $variables
 */
function THEMENAME_theme_suggestions_block_alter(array &$suggestions, array $variables) {
  if (isset($variables['elements']['content']['#block_content'])) {
    array_splice($suggestions, 1, 0, 'block__bundle__' . $variables['elements']['content']['#block_content']->bundle());
  }
}

5. Panelizer and Panels IPE handle complex layouts

When looking for a layout manager to help build the more complex landing pages, I came across panelizer + panels IPE. Using panelizer you are able to:

  • create per node layout variants
  • apply a single layout to all nodes of a particular bundle (e.g. all your news articles have the same layout)

The other neat thing is that the layouts themselves are now standardised between all the various layout managers using a contrib module called layout_plugin. Also they are just YAML and Twig. Simple. There is even an effort to get this merged into Drupal 8.2 which I think would be a great idea.

Downside - all of the JS is still rendered on the page even for users (e.g. anonymous users) who have no access to panelizer. There is a patch on drupal.org to help fix this.

Since starting this build there has also been a stable release of display suite for Drupal 8, giving you even more options.

6. You can build a rather complex site with very few contributed modules

For this most recent site build I got away with using only 10 contributed modules (one of which, devel, was purely for debugging purposes):

  • ctools
  • google_analytics
  • metatag
  • panels
  • token
  • contact_block
  • devel
  • layout_plugin
  • panelizer
  • pathauto

This means you are inherently building a more stable and supportable site, as most of the functionality now comes out of Drupal core.

7. The contact module is now flexible

In Drupal 7, the contact module was one of those modules that I never turned on, as it was rather inflexible. You could not change the fields in a UI, nor add email recipients, nor have more than 1 form. Now in Drupal 8 you can have as many “contact” forms as you want; each one is fieldable, and can send emails to as many people as needed.

You can also enhance the core module with:

  • contact_block - allows you to place the contact form in a block
  • contact_storage - allows you to store the submissions in the database, rather than firing an email and forgetting about it

There is still a place for webform, namely:

  • large complex forms with lots of fields
  • multi-step forms
  • forms you want to ‘save draft’

You can read more about this in the OS training blog post on the contact module.

Downside - I wanted to have a plain page use the path /contact but the contact module registers this path, so pathauto gave my contact page a path of /contact-0. Luckily creating a route subscriber with Drupal Console was painless, so altering the contact module route was very simple to do. I can paste the code here if needed, but most of it is the code that Drupal Console generates for you.
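For what it’s worth, that route subscriber looks something like the sketch below (the module name example and the new path are illustrative; contact.site_page is the route name the core contact module registers for /contact). It also needs a matching entry in your module’s services.yml tagged as an event_subscriber.

```php
<?php

namespace Drupal\example\Routing;

use Drupal\Core\Routing\RouteSubscriberBase;
use Symfony\Component\Routing\RouteCollection;

/**
 * Moves the core contact form off /contact so a plain page can use that path.
 */
class RouteSubscriber extends RouteSubscriberBase {

  /**
   * {@inheritdoc}
   */
  protected function alterRoutes(RouteCollection $collection) {
    // 'contact.site_page' is the route the core contact module registers.
    if ($route = $collection->get('contact.site_page')) {
      $route->setPath('/contact-us');
    }
  }

}
```

With this in place, pathauto is free to hand /contact to your plain page.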

8. PHPUnit is bundled into core

Now that Drupal 8 is largely Object Oriented (OO), you are able to test classes using PHPUnit. I have written about PHPUnit in the past if you want to know more.

9. Views is in core

This was the main reason why adoption of Drupal 7 was so slow after its initial 7.0 release, as everyone needed views to be stable before jumping ship. Now with views bundled into core, views plugins are also being ported at a great rate of knots too.

10. CKEditor is in core

I often found that this was one library that never (or hardly ever) got updated on sites that had been around for a while. More worryingly, CKEditor (the library) would from time to time fix security related issues. Now that this comes with Drupal 8 core, it is just one less thing to worry about.

Also I would love to shout out to Wim Leers (and other contributors) for revamping the image dialog with alignment and caption options. I cannot tell you how much pain and suffering this caused me in Drupal 7.

The image dialog bundled with Drupal 8 in CKeditor is amazing

If you have built a site recently in Drupal 8 and have found anything interesting or exciting, please let me know in the comments. Also keen to see what sites people have built, so post a link to it if it is public.

Mar 23 2016
Mar 23

Background

I have been doing a bit of Drupal 8 development recently, and am loving the new changes, and entities everywhere. I am passionate about automated testing, and when I saw that integrating PHPUnit into your custom modules is now even easier, I set out to see how this all worked.

Why is PHPUnit important

There are a number of reasons why PHPUnit is a great idea:

  • it forces you to write testable code in the first place, this means small classes, with methods that do a single thing
  • it runs in only a few seconds, there is also no need to have a working Drupal install
  • integrates with PHPStorm, allowing you to run your tests from within the IDE

Step 1, set up your phpunit.xml.dist file

There is a file that comes included with Drupal 8 core, but by default it will not scan any sub-directories under /modules (e.g. like the very common /modules/custom). I stumbled across this question on stackoverflow. So you have a couple of options from here:

Option 1 - Create and use your own phpunit.xml.dist file

You can simply copy (and modify) Drupal 8 core’s phpunit.xml.dist file into your git repo somewhere (perhaps outside the webroot), and use this file for all your custom module tests.

Option 2 - Patch Drupal 8 core

Another option (which is the option I took) was to apply a simple patch to Drupal core. There is an open issue on drupal.org to look at scanning all sub-directories for test files. At the time of writing it was uncertain whether this patch would be accepted by the community.

Step 2, write your tests

There are some general conventions you should use when writing your PHPunit tests:

  • the suffix of the filename should be Test.php, e.g. MonthRangeTest.php
  • the files should all reside in either the directory /MY_MODULE/tests/src/Unit/ or a sub directory of that

More information on the requirements can be found on the drupal.org documentation.

Dataproviders

Data providers are pretty much the best thing to happen to automated testing. Instead of testing a single scenario, you can test a whole range of permutations in order to find those bugs. You start by declaring a @dataProvider annotation on your test method:

<?php
  /**
   * @covers ::getMonthRange
   * @dataProvider monthRangeDataProvider
   */
  public function testGetMonthRange($expected_start, $expected_end, $month_offset, $now) {
    // ... more code
  }

You then declare a method monthRangeDataProvider that returns an array of test cases (which are also arrays). The items in the data provider method are passed one at a time to the testing method, in the same order they are declared (so you can map them to friendly names).

<?php
  /**
   * Data provider for testGetMonthRange().
   *
   * @return array
   *   Nested arrays of values to check:
   *   - $expected_start
   *   - $expected_end
   *   - $month_offset
   *   - $now
   */
  public function monthRangeDataProvider() {
    return [
      // Feb 29 @ noon.
      [1454284800, 1456790399, 0, 1456747200],
      // ... more tests follow
    ];
  }

More information can be found in the phpunit documentation for data providers.

Testing for expected exceptions

Just as important as testing valid inputs, you should test invalid inputs as well. This is easily achieved with an @expectedException annotation above your test method:

<?php
  /**
   * Tests that an end date that is before the start date produces an exception.
   *
   * @expectedException        Exception
   * @expectedExceptionMessage Start date must be before end date
   */
  public function testGetWorkingDaysInRangeException() {
    // ... more code in here
  }

Step 3, annotate your tests

You can annotate both the test class and the methods to provide additional information and metadata about your tests:

@covers

This is mainly used for PHPUnit’s automated code coverage report, but I find it also very helpful for developers to state up front which method they are testing.

@coversDefaultClass

This is used at the class level, and saves you having to write a rather lengthy @covers statement for every test method, if they all test the same class.

@depends

If a certain test makes no sense to run unless a previous test passed, then you can add a ‘depends’ annotation above the test method in question. You can depend on multiple other tests too. Note that this does not change the execution order of the tests; they are still executed top to bottom.

@group or @author

You can think of adding a ‘group’ to a PHPUnit class as tagging it. It is free tagging in that sense, and you can tag a single class with many groups. This should allow you to categorise your tests. @author is an alias of @group, the idea being you can run all tests written by a particular developer.

More information can be found in the PHPunit documentation on annotations.
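Putting a few of these annotations together, a test class might look like this sketch (the class under test and the group name are made up for illustration):

```php
<?php

namespace Drupal\Tests\example\Unit;

use Drupal\Tests\UnitTestCase;

/**
 * @coversDefaultClass \Drupal\example\MonthRange
 * @group tamdash
 */
class MonthRangeTest extends UnitTestCase {

  /**
   * @covers ::getMonthRange
   */
  public function testGetMonthRange() {
    // ... assertions on the happy path.
  }

  /**
   * Only worth running if the happy path passes first.
   *
   * @covers ::getMonthRange
   * @depends testGetMonthRange
   */
  public function testGetMonthRangeLeapYear() {
    // ... assertions on the edge case.
  }

}
```

Because of @coversDefaultClass, each @covers only needs the ::methodName shorthand.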

Step 4, run your test suite

This section assumes you have opted to use Drupal core’s phpunit.xml.dist file (modify the paths as appropriate if you are using a file in another location).

List groups (or tags)

cd core/
../vendor/bin/phpunit --list-groups

Run all tests that are tagged with a particular group (or tag)

cd core/
../vendor/bin/phpunit --group tamdash

Example CLI output

$ ../vendor/bin/phpunit --group tamdash
PHPUnit 4.8.11 by Sebastian Bergmann and contributors.
...........
Time: 5.01 seconds, Memory: 144.25Mb
OK (11 tests, 18 assertions)

If you are using PHPStorm, spend a few minutes and set this up too.

Set up PHPStorm to run PHPunit tests

Example output

Running PHPunit tests in PHPStorm

So now there is no need to flip back to your terminal if you just want to quickly run a group of tests.

Conclusion

PHPUnit is a great way to run quick tests on isolated parts of your code. Tests often take less than 10 seconds to run, so developer feedback is near instant. It also forces your developers to write better, more testable code from the get go. This can only be a good thing. Personally I am very excited to see PHPUnit join Drupal 8, and cannot wait to see what people do with it.

There seems to be quite healthy debate on whether contrib or custom modules should ship with their own phpunit.xml.dist file or whether Drupal core’s file should cover both. I am keen to hear anyone’s thoughts on this. Also let me know if you have any contrib modules in the wild shipping their own phpunit.xml.dist files, and how you found that process.

Feb 16 2016
Feb 16

Background

I was recently tasked with cleaning up a legacy Drupal 7 database that had accumulated a lot of data in several fields. The fields were no longer used, and the data could be deleted. The data however totalled more than 30 GB across those 7 fields, and this presented a few challenges.

Existing solutions that did not quite cut it

I searched around the internet for the best way to delete fields in Drupal 7 using an update hook. I came across this stack overflow post, which advised using something like:

<?php
function EXAMPLE_update_7001() {
  field_delete_field('field_name');
  field_purge_batch(1);
}

The main issue with the above code is that it only marks the data for deletion, and then slowly removes it on cron. This could take weeks to fully remove the data. It was not an ideal solution in our case.

Solution

If you want to expedite the process of removing the data from the fields, and also the fields themselves, we found the code below would remove the 30 GB of data, remove the tables, and update the content types, all in one update hook. It does rely on the fields having already been removed from your features, so that the respective content types can be reverted successfully.

<?php
function EXAMPLE_update_7001() {
  $fields_to_delete = array(
    'field_example_a',
    'field_example_b',
    'field_example_c',
    'field_example_d',
    'field_example_e',
    'field_example_f',
    'field_example_g',
  );
  foreach ($fields_to_delete as $field) {
    db_truncate('field_data_' . $field)->execute();
    db_truncate('field_revision_' . $field)->execute();
    field_delete_field($field);
    watchdog('EXAMPLE', 'Deleted the field @field from all content types.', array('@field' => $field));
  }

  /**
   * The fields aren't really deleted until the purge function runs, ordinarily
   * during cron. Count the number of fields we need to purge, and add five in
   * case a few other miscellaneous fields are in there somehow.
   */
  field_purge_batch(count($fields_to_delete) + 5);

  // Revert features to update the content type in Drupal to drop the field.
  features_revert(array('EXAMPLE' => array('field')));
}

Conclusions

A couple of lessons can be learnt here:

  • Truncations, even on large data sets, are extremely quick, and awesome. The above update hook takes less than 1 minute to run end to end.
  • If you are going to programmatically update nodes with a high degree of frequency, consider disabling revisions on those content types. Your database thanks you in advance.

If you have had to delete fields in update hooks in Drupal, let me know if you used another method and how that went for you.

Feb 11 2016
Feb 11

Background

Estimating Drupal projects can be tricky, and there are a number of tools and guides out there to help make your life easier in this area. I thought it would be a great idea to aggregate this data, and present some of the best information out there. It is up to you to choose the best match for your team.

Resources

palantir.net

Developing Drupal sites: Plan or Perish - by Larry Garfield (2013)

Even though this article is nearly 3 years old, it still is completely valid for Drupal. There is an included spreadsheet that you get the tech lead, product owner and another engineer to collaborate on prior to building.

What I like about this:

  • Emphasis on involving tech leads early to perform scoping and breaking up functionality into tangible Drupal elements. This allows you to have less experienced developers do the actual build, as largely everything has been spec’d out.
  • View modes are used extensively, which is great for re-usability of content and maintaining flexibility
  • Questions, or opportunities for simplification can come up before anything has been built

lullabot.com

There are 3 articles on lullabot.com that are worth reading, they all follow on from one another, so it would pay to start at the beginning and work your way to the end.

In here they introduce a spreadsheet that attempts to combine 2 developers’ estimates into a single number with variance.

What I like about this:

  • It exposes the fact that every website build has a range of time in which it can be completed
  • It helps to explain the ‘Wideband Delphi’ method of estimation
  • Excellent explanation on how to use the spreadsheet is provided

wunderkraut.com

Resources for my session on early estimating by Jakob Persson (2010)

Included in the article here is the Drupal early estimation sheet v3.

What I like about this:

  • Walks you through an actual example on how an email was turned into a concrete estimate, and how that was analysed
  • Introduces often overlooked things such as:
    • migration of data
    • producing help or documentation
    • deployments
    • third party system integration
    • working with a third party (e.g. external design vendor)

drupalize.me

There are several training videos on drupalize.me that would be worth checking out if you have some time.

If you have an existing Acquia subscription you might already have access to Drupalize.me, so this might be worth looking into.

Personal thoughts

Having been involved in a lot of this in my previous role, one additional point I would like to make is to never start from a fresh Drupal install; it often makes sense to standardise on a Drupal technology suite, and installation profiles make this easy.

For instance, Acquia typically use an installation profile called lightning which provides built in ways to do lots of common tasks, e.g. layout, workflow, media management. This saves you having to re-invent the wheel on every project, and should help you provide more solid estimates when it comes to the base features of the site.

Another common theme across the different methods is to involve 1 or 2 tech leads early on to help break the requirements down into Drupal functionality. Having these conversations with the product owner early on can often lead to a better solution, not only in time to build, but feature set too.

Also remember, the above articles and spreadsheets may not work perfectly for you and your organisation, so feel free to adapt them to suit your needs.

If you know of some other method to help here, please let me know in the comments. Especially keen to know of any other training content or spreadsheets which other companies use to help estimate Drupal projects.

Dec 02 2015
Dec 02

On a recent project I was tasked with working out why the database was so large, in particular certain tables like field_revision_body and field_data_body had grown to be several gigabytes in size.

SQL to the rescue

Here is a simple SQL statement you can execute on your production database to retrieve a count of nodes created grouped by day.

drush --uri=www.example.com sqlq "SELECT count(nid) as count, DATE_FORMAT(FROM_UNIXTIME(created),'%d-%m-%Y') AS month FROM node  GROUP BY YEAR(FROM_UNIXTIME(created)), MONTH(FROM_UNIXTIME(created)), DAY(FROM_UNIXTIME(created))" > /tmp/example-nodes.txt

PostgreSQL users - you can add a custom function to add FROM_UNIXTIME functionality:

CREATE OR REPLACE FUNCTION from_unixtime(integer) RETURNS timestamp AS '
     SELECT to_timestamp($1)::timestamp AS result
' LANGUAGE 'SQL';

Cleaning up results

The SQL will return in a few seconds, and now we need to clean up the data before it is graphed. First let’s trim the start of the file; here you can see a few nodes with no valid date, and 1 created in 1980, and we want to remove these lines.

head /tmp/example-nodes.txt

count   month
662     NULL
1       08-06-1980
194     30-06-1992
198     01-07-1992
186     02-07-1992
...

Now if we look at the tail, you can see a few nodes that are created in the future. I also like to remove these from the results, as they tend to blow out the timescale of any graphs:

tail /tmp/example-nodes.txt

1       15-04-2022
12      30-05-2022
1       19-06-2022
1       24-06-2022
1       27-06-2022
...

Conversion to CSV

By default the SQL will return a tab separated values file; let’s convert this to CSV and enforce consistent line endings:

cat /tmp/example-nodes.txt | sed $'s/\r//' | tr "\\t" "," > /tmp/example-nodes.csv

Graphing results

Now that you have clean data you can import into your favourite graphing tool, here is Excel making a graph of that data. Here are the number of nodes created by day:

Drupal content growth by day over time

And if you make the total cumulative, you can track total growth in Drupal, and use this to predict into the future:

Drupal total content growth over time

Wait, what about taxonomy terms

You can extend this method to work with taxonomy terms as well with a slight modification to the SQL (Drupal 7):

drush --uri=www.example.com sqlq "SELECT count(tid) as count, DATE_FORMAT(FROM_UNIXTIME(created),'%d-%m-%Y') AS month FROM taxonomy_term_data GROUP BY YEAR(FROM_UNIXTIME(created)), MONTH(FROM_UNIXTIME(created)), DAY(FROM_UNIXTIME(created))" > /tmp/example-terms.txt

Where to from here

Based on the analysis, you might be able to make recommendations to:

  • perhaps look at exporting (e.g. with node_export) and removing content from Drupal’s database after X number of years. This will reduce the database size immediately
  • removing unused revisions from content where the content is older than X number of years
  • predict into the future how many nodes you will have in 1 year, 5 years etc and be able to plan for this in terms of hardware capacity

If you are having challenges with Drupal content growing over time, how are you keeping track of this? Are there any other tips and tricks you know about?

Oct 14 2015
Oct 14

There are a number of ways this can be solved with Drupal 7:

1. Use a search core per site

Using this approach, you create a unique Solr core (N.B. a single Solr application can contain 0 or more cores) per site, and use each site’s settings.php to connect the appropriate search core to the appropriate Drupal environment. Then you can enable (but there is no need to configure) the apachesolr_multisitesearch module.

Pros

  • Each site’s index is isolated from the others, so there is no interfering with other indexes, and no ability to drop other indexes
  • You can customise the Solr config per core

Cons

  • Need to provision a new Solr core every time you add a new site
  • Need to change settings.php every time you add a new site

2. Use a single shared core, and apachesolr_multisitesearch

In this option you utilise a single shared Solr core, and send the documents from all sites into it. Using special attributes on the documents, you are able to query and get only the results you desire back out again.

The module you need to enable is apachesolr_multisitesearch. Just enabling the module makes your standard search pages return results only for your site. This is accomplished internally by adding a filtering query to each Solr query using hook_apachesolr_query_alter().
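Conceptually, the filter that apachesolr_multisitesearch adds boils down to something like this (a simplified sketch of the mechanism, not the module’s exact code):

```php
<?php

/**
 * Implements hook_apachesolr_query_alter().
 *
 * Limit results to documents that were indexed by this site, using the
 * per-site hash stamped on every document at index time.
 */
function example_apachesolr_query_alter($query) {
  $query->addFilter('hash', apachesolr_site_hash());
}
```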

You can see the site hashes are different based on the base_url in Drupal already.

Example of this working

Note the site hashes are different:

$ drush --uri=www.example1.com vget apachesolr_site_hash
apachesolr_site_hash: 'lm0ujj'

$ drush --uri=www.example2.com.au vget apachesolr_site_hash
apachesolr_site_hash: '9qt6o6'

Example of this not working

Note the site hashes are the same, thus the search results will be blended:

$ drush --uri=www.example1.com vget apachesolr_site_hash
apachesolr_site_hash: '92ub55'

$ drush --uri=www.example2.com vget apachesolr_site_hash
apachesolr_site_hash: '92ub55'

You will find yourself in this predicament if you clone new sites from old sites (thus the database variable persists).

In order to fix blended site hashes

  1. drop the current index (both sites)
  2. delete the variable ‘apachesolr_site_hash’ (both sites)
  3. visit the Solr admin page (both sites) - this will regenerate the site hash
  4. verify the site hash is now different (both sites)
  5. reindex (both sites)

Pros

  • No need to provision a new Solr core every time you add a new site
  • No need to change settings.php every time you add a new site
  • Hooks into the delete query to prevent dropping the entire index

Cons

  • You need to ensure the site hash is different for each site, so if you clone one, you will need to delete the variable apachesolr_site_hash straight away
  • If you ever drop Solr’s index, this will drop all documents across all sites. This should be hard to do (you will need to disable the apachesolr_multisitesearch module)
  • You must share Solr config (e.g. the schema XML), so you cannot change this per site

If you run a multisite setup, and make use of Apache Solr, I am keen to hear how you integrated it, and if you have any tips or tricks to share.

Sep 05 2015
Sep 05

Out of the box, Drupal 7 comes with the ability to set the global cache lifetime for all pages on the site. I find this generally works quite well for small sites without a lot of content or complex caching requirements. Pantheon have a nice write up on the performance settings page that comes with Drupal 7 core and what each of the settings mean.

N.B. In this article I am assuming you are running some form of reverse proxy (e.g. varnish) in front of Drupal (most Drupal managed cloud vendors like Acquia or Pantheon provide varnish as part of their standard offering).

The problem with Drupal 7 core settings

What happens when you are running a large site, with a very long tail of content? All of a sudden running a global cache TTL of 10 minutes can result in a very poor hit rate in varnish. Having a poor hit rate in varnish ultimately means your web servers end up doing more work, which often leads to having to shell out for additional hardware.

Generally, faced with this issue, you have 2 options:

1) Varnish with purging

In this option, you set a high global cache TTL, so that all content lives in varnish for an excessively long time (say 12 hours), and when certain events occur (say a news article gets published), purge requests tell varnish to drop certain URLs from its cache. I don’t want to go into too much detail on this, but I will say there are a number of drawbacks with this approach:

  • When you publish 1 node, it can potentially appear on dozens of pages (e.g. on landing pages as a teaser, or in a view), which means the purge rules get overly complex in a hurry. If you ever re-architect a portion of the site, often hundreds of purge rules need to be rewritten and tested, which can be very costly.
  • Running a CDN (e.g. Akamai, or Cloudflare) can complicate things further, as you need to purge varnish first, then the CDN and never in the reverse order, else the CDN will end up caching the stale content all over again
  • Having exposed filters on the page means you will need to use bans in varnish (to utilise regex), this is another topic for another day
  • All too often, content editors resort to nuking the varnish cache from orbit and dropping the whole lot when the rules do not function as desired. This is especially bad as it means all the content in there needs to be entirely rebuilt again. This can (and often does) take down busy sites when it occurs in production.

2) Varnish with intelligent cache TTLs

The other option, which I want to discuss in this blog post, is using the attributes of the node to determine at runtime how long that piece of content should be cached in varnish.

Attributes you can use to help you work out the most effective cache TTL:

  • content type
  • content created age (now - created)
  • recency of most recent revision
  • when the last comment was made
  • current time of day
  • any other attribute of the node

What do I mean by this? Well let’s say you run a large digital media organisation, and publish hundreds of fresh news articles per day, but after a week or so, they generally get no edits done to them at all. Your cache TTL rules could be:

  • if content type == ‘news_article’ then
  • if age < 1 week, then the cache TTL inherits from the global configuration
  • otherwise, increase the cache TTL to 1 week

This way, updates to your new content go live quite promptly, and your long tail of older news articles gets a massively inflated cache TTL, meaning your varnish server can really help out here. It is important to note that these rules apply on a node-by-node basis, and can be made to suit your business requirements.

Modules out there already

I had a look around on Drupal.org to see what others had done in this area in the past. I was genuinely surprised at the lack of modules in this area; here is a short summary:

  • cacheexclude - used to exclude certain URLs from cache, kind of the opposite of what we want here.
  • max_age - an (abandoned?) Drupal 6 module that aimed to provide different cache TTLs per content type in addition to the site wide default. This is close to what we want, there even is a patch to help port the module to Drupal 7.

Example code

Rather than using any existing contrib modules, the code required to do what we want is very simple. Here is some simple custom code that can be used as a starting point for your code:

<?php
/**
 * Implements hook_page_build().
 *
 * Responsible for setting the cache TTL based on the content attributes.
 */
function MYMODULE_page_build(&$page) {
  $node = menu_get_object('node');
  if (drupal_page_is_cacheable() && isset($node)) {
    $age_since_last_change = REQUEST_TIME - $node->changed;

    switch ($node->type) {
      case 'news_article' :
        // If the news article has not been edited in a week, then set the cache
        // TTL to a week.
        if ($age_since_last_change > 604800) {
          $max_age = 604800;
          drupal_add_http_header('Cache-Control', 'public,max-age=' . $max_age);
          drupal_add_http_header('Expires', gmdate(DATE_RFC1123, REQUEST_TIME + $max_age));
        }
        break;
    }
  }
}
?>

The above example is simple, but hopefully gives you a taste of what can be achieved with very little code.

How to test your code

I have a few cURL aliases in my .bashrc that will come in handy if you do a lot of this type of thing:

# cURL
function curlh() { curl -sLIXGET "$@"; }
function curlc() { curl -sLIXGET "$@" | grep -E -i "^(Cache-Control|Age|Expires|Set-Cookie|X-Cache|X-Varnish|X-Hits|Vary)"; }

So if you are after all response headers you can:

$ curlh www.example.com
HTTP/1.1 200 OK
Accept-Ranges: bytes
Cache-Control: max-age=604800
Content-Type: text/html
Date: Sun, 06 Sep 2015 22:52:40 GMT
Etag: "359670651"
Expires: Sun, 13 Sep 2015 22:52:40 GMT
Last-Modified: Fri, 09 Aug 2013 23:54:35 GMT
Server: ECS (cpm/F9D5)
X-Cache: HIT
x-ec-custom-error: 1
Content-Length: 1270
Connection: Keep-Alive
Age: 2265

If you just want the relevant caching response headers:

$ curlc www.example.com
Cache-Control: max-age=604800
Expires: Sun, 13 Sep 2015 22:52:40 GMT
X-Cache: HIT
Age: 2249

What this technique relies on

  1. Expectation setting - you will need to ensure all content authors and stakeholders are aware of what this means. If a really old news article is edited, there will potentially be a lag before the changes go live.
  2. Enough RAM in varnish to house all your objects. If you are RAM constrained, and are seeing a lot of evictions at present, then this technique will not have the impact you want it to. You might be better off upsizing varnish first, then looking at this technique.

I am keen to hear what other large websites do in this area, especially around multiple layers of external cache invalidation and/or whether custom cache TTL headers are used. Also if you know of any other contrib modules that can help here, let me know.

May 26 2015
May 26

Why is this important

How you size the PHP memory_limit setting is actually really important for the health of your Drupal application. Size it too small, and you risk PHP fatal errors from requests running out of allocated memory. Size it too large, and you are essentially under-utilising your hardware, which in turn can lead to more cost.
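For reference, this limit is the memory_limit setting in php.ini, and it can also be overridden per site from settings.php (the value below is purely illustrative; use the profiling described below to pick a real one):

```php
<?php
// settings.php: override the PHP memory limit for this Drupal site only.
// 128M is an illustrative starting point, not a recommendation.
ini_set('memory_limit', '128M');
?>
```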

How to record every Drupal request’s PHP max memory usage

Tim Hillard created a really nice module called Memory profiler, which probably wins some sort of award for being one of the smallest modules on drupal.org. Essentially this module registers a shutdown function that gets called at the end of every normal Drupal request.

The module is lightweight enough to run on production and only produces an extra syslog line per request.

Analyse the data

The data for memory profiler flows into watchdog, so if you run syslog (which you should), you can use CLI tools to analyse the data.

What does a single request look like

$ grep "memory profiler" drupal-watchdog.log | head -n 1

May 26 06:25:21 10.212.4.16 sitename: https://www.sitename.com|1432621521|memory profiler|1.152.97.17|https://www.sitename.com/|https://www.sitename.com/home|0||4.75 MB - home request_id="v-fc9573dc-036f-11e5-a8c0-22000af91462"

This comes from your syslog format (which can be changed on a per site basis):

$ drush vget syslog_format

syslog_format: '!base_url|!timestamp|!type|!ip|!request_uri|!referer|!uid|!link|!message'

Extract the data from syslog

From here you can tokenise out the parts you actually care about, namely the:

  • URL requested (part 5)
  • PHP max memory (part 9)

Using more bash foo

$ grep "memory profiler" drupal-watchdog.log | head -n 1 | awk -F'|' -v OFS=',' '{print $5, $9}'

https://www.sitename.com/,4.75 MB - home request_id="v-fc9573dc-036f-11e5-a8c0-22000af91462"

On Acquia Cloud a request ID is added to all requests, we don’t need this. Also having the string ‘MB’ there is superfluous.

$ grep "memory profiler" drupal-watchdog.log | head -n 1 | awk -F'|' -v OFS=',' '{print $5, $9}' | sed 's/ MB.*//'

https://www.sitename.com/,4.75

Perfect.

So in order to create a CSV for analysing in a spreadsheet you could do:

$ echo "request_uri,max_memory" > /tmp/memory.csv && grep "memory profiler" drupal-watchdog.log | awk -F'|' -v OFS=',' '{print $5, $9}' | sed 's/ MB.*//' >> /tmp/memory.csv

And then you can make pretty graphs if you want:

Graph showing PHP memory usage sorted by smallest to largest

Or if you just want to find the top requests to your application by memory, you can do:

$ grep "memory profiler" drupal-watchdog.log | awk -F'|' -v OFS=',' '{print $5, $9}' | sed 's/ MB.*//' | sort -t, -k+2 -n -r | head -n 20
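If you would rather have summary numbers than a spreadsheet, you can wrap the same awk approach in a small helper (a sketch; it assumes the request_uri,max_memory CSV layout produced above):

```shell
# Summarise a memory CSV (request_uri,max_memory): average, max and count.
# Skips the header row; assumes column 2 holds a plain number of MB.
summarise_memory_csv() {
  awk -F',' 'NR > 1 { sum += $2; if ($2 > max) max = $2; n++ }
             END { if (n) printf "avg=%.2f MB, max=%.2f MB, requests=%d\n", sum / n, max, n }' "$1"
}
```

Running summarise_memory_csv /tmp/memory.csv then prints a single line with the average and worst-case request memory.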

Conclusions

Based on your findings in the logs, you should be able to come up with:

  • A better understanding of your request memory profile
  • Better max memory settings for your Drupal application
  • The ability to identify poorly performing pages (memory-wise) and look to optimise them

Gotchas

This module will only work if:

  • hook_boot() is called (which might not be the case if you run custom lightweight PHP scripts that do not bootstrap Drupal)
  • The Drupal request is not terminated with a SIGTERM or SIGKILL signal

Let me know if you found this helpful, or if you have any changes to my bash foo. If you have profiled your Drupal application recently, what methods and tools did you use?

May 21 2015
May 21

DrupalCon LA

So I did not make it along to DrupalCon Los Angeles, but I did spend some time reading twitter, and watching the sessions online. Here are some of the sessions I found entertaining and insightful and would recommend to others.

Driesnote Keynote

Dries, as always, sets out the lay of the land with Drupal. He also goes into the early days of Drupal, and how some key people he was involved with have now gone on to form organisations that centre around Drupal.

Best quote:

Obstacles don’t block the path, they are the path

[embedded content]

No

Larry Garfield gives an interesting talk on why sometimes it is best to say NO in order to give focus to the things that actually matter.

Best quote:

Case in point, the new MacBook Airs, they say NO TO EVERYTHING.

[embedded content]

PHP Containers at Scale: 5K Containers per Server

David Strauss explains the history of web hosting, and how this is now far more complex. David is CTO of Pantheon, and they now run 100,000+ websites, all with dev + test + production environments. Pantheon run 150+ containers on a 30GB box (205MB each on average). Really interesting talk on how to run large amounts of sites efficiently.

[embedded content]

Decoupled Drupal: When, Why, and How

Amitai Burstein and Josh Koenig give a really entertaining presentation on monolithic architectures and some developer frustrations. They then introduce REST web services in Drupal 8, and how these can be used to provide better consumer interfaces for other frameworks.

[embedded content]

Features for Drupal 8

Mike Potter goes through what role Features played in Drupal 7, and how Features will adapt in Drupal 8 now that CMI is in. Features in Drupal 8 will be going back to its roots, providing ‘bundles’ of configuration for re-use.

[embedded content]

Meet Commerce 2.x

Ryan and Bojan go through 1.x on Drupal 7, and how they have chosen to develop Commerce 2.x on Drupal 8. This is a complete rewrite. The hierarchical product model is really exciting.

[embedded content]

How, When and Why to Patch a Module

Joshua Turton goes over what a patch is, when you should patch contributed modules, and how to keep track of these with Drush make.

[embedded content]

My colleague Josh also wrote a blog post on how to use Drush make.

CI for CSS: Creating a Visual Regression Testing Workflow

A topic that I am passionate about is visual regression testing; here Kate Kligman goes through some tools that can help you test your site for visual changes. Tools covered include PhantomJS, SlimerJS, Selenium and Wraith.

[embedded content]

Speeding up Drupal 8 development using Drupal Console

Eduardo and Jesus give us an introduction to your new best friend in Drupal 8. Drupal console is a Symfony CLI application to help you write boilerplate code, e.g. to create a new module. Personally, I am excited for the form API generator, and the ability to create a new entity with a single command.

[embedded content]

For more information see drupalconsole.com.

Q&A with Dries

As Drupal 8 heads from 130 critical issues down to the current 22, what are some of the community’s key concerns? The questions are answered by dries, xjm, webchick and alexpott.

[embedded content]

Where can I find more videos

Don’t worry, there are plenty more videos on the Drupal Association YouTube page.

If you have any awesome sessions that I have missed let me know in the comments.

Apr 12 2015
Apr 12

This post is a follow up to my previous blog post on how to upgrade PHP to 5.4 to support Drupal 8.

Why you should upgrade PHP

If you are looking for reasons to ditch PHP 5.3, here are some:

Security

PHP 5.3 reached end of life in August 2014, this means that if you are running this version, you are running an insecure version of PHP that potentially has security holes in it. This is bad for obvious reasons.

Bundled opcode cache

PHP 5.5 is the first version that bundles an opcode cache with PHP, which means there is no need to also run APC (unless you need userland caching via APCu).

Performance

The PHP project profiled the 5.4 release against 5.3 using Drupal, and found:

  • 7% more requests/second
  • 50% PHP memory reduction

PHP 5.5 offers more performance again, and there is a section at the bottom of this article that goes through a real life scenario.

Cool new features

Read through the list of new features, here are some neat things you are missing out on:

  • Short array syntax

$array = [
  "foo" => "bar",
  "bar" => "foo",
];

  • Function array dereferencing

  $secondElement = getArray()[1];

And many others.

How to upgrade to PHP 5.5

There are a number of ways to update your server to PHP 5.5.

Upgrade to Ubuntu Trusty Tahr 14.04

Ubuntu Trusty Tahr 14.04 is an LTS version, and comes bundled with PHP 5.5.9. Upgrading is probably the best solution if you are managing your own Ubuntu box.

Install a PPA on Ubuntu Precise 12.04

If you are running the older Ubuntu Precise 12.04, you can add a PPA:

sudo add-apt-repository ppa:ondrej/php5
sudo apt-get update
sudo apt-get install php5
php5 -v

It would still be worth considering a dist upgrade, but this at least buys you some time.

Acquia Cloud UI

If you use Acquia Cloud for hosting there is a convenient PHP version selector in the UI.

Acquia Cloud UI allows site administrators to change the PHP version

More information can be found in the documentation. Be aware, once you upgrade beyond PHP 5.3, you cannot downgrade, so ensure you test your code on a development server first ;)

Common coding issues

Although Drupal 7 core, and most popular contributed modules, will already support PHP 5.5, it pays to audit any custom code to ensure you are not using things you should not be.

Below are some of the most common issues I have found in sites:

Call time pass-by-reference

If you have this in your code, you will have a bad time, as this is now a PHP fatal error.

foo(&$a); // Bad times.

Only variables can be passed by reference

This will cause PHP to throw strict standards notices.

$ php -a
Interactive shell

php > ini_set('error_reporting', E_ALL);
php > var_dump(reset(explode('|', 'Jim|Bob|Cat')));
PHP Strict Standards:  Only variables should be passed by reference in php shell code on line 1

Strict Standards: Only variables should be passed by reference in php shell code on line 1
string(3) "Jim"

Where you will likely find this in Drupal in my experience is when manually rendering nodes:

This code works in PHP 5.3, but will throw notices in PHP 5.5:

$rendered = drupal_render(node_view(node_load(1), 'teaser'));

The fix is to simply use a temporary variable:

$view = node_view(node_load(1), 'teaser');
$rendered = drupal_render($view);

The reason being that drupal_render() expects a variable to be passed in (as it is passed by reference).

How do you find coding issues

Enable the syslog module, and tail the log in your development environment; hunt down and fix as many notices and warnings as possible. The noisier your logs are, the harder it is to find actual issues in them. While you are at it, turn off the dblog module; it is only helpful if you do not have access to your syslog (and continually writing to the database is a performance problem).

Real world performance comparison

This was taken from a recent site that underwent a PHP 5.3 to 5.5 upgrade. Here are 2 New Relic overviews, taken with identical performance tests run against the same codebase. The first image is taken with PHP 5.3 running:

Performance of Drupal on PHP 5.3 is not that flash

You can see PHP time is around 260ms of the request.

Performance of Drupal on PHP 5.5 is much better than 5.3

With an upgrade to PHP 5.5, the time spent in PHP drops to around 130ms. So this is around a 50% reduction in PHP time. This not only makes your application faster, it also means you can serve more traffic from the same hardware.

If you have gone through a recent PHP upgrade, I would be interested to hear how you found it, and what performance gains you managed to achieve.

Mar 06 2015
Mar 06

What is site preview

The ability to see (and have selected others see) changes to content and layout that is not yet visible to the general public

So recently I did a talk at DrupalSouth Melbourne on site preview solutions that exist within Drupal at present. I noticed that no one had managed to do a comparison between them as they stand at the moment, so I aimed to help out there.

Solutions compared in the talk:

  • Drupal 7 (stock)
  • SPS
  • CPS
  • Zariz
  • Live preview
  • What is coming in Drupal 8

I also wanted to introduce the new solution that was developed for the Ministry of Health New Zealand, that aimed to solve site preview in an entirely different manner.

Slides

Here are my full slides from the talk if you want to read about the above options in more detail.

How preview sync is different

Instead of trying to take over your production site, polluting it with complex revisioning and access control, and altering your EntityFieldQueries and views using magic, preview sync works with your existing workflow (workbench_moderation integration is provided out of the box), and aims to be a lightweight solution.

Preview sync syndicates your production database to a separate preview environment

Preview sync takes an optimised live snapshot of your production database, and imports it into a separate dedicated preview environment. Then a number of actions take place, all of which are entirely alterable, so you can add your own tasks in, and remove tasks you don’t need.

Example tasks in preview sync:

  • environment switch to preview, this allows you to enable and disable modules, perform actions (e.g. redirecting email to a log). This is a powerful hook.
  • publish the content currently in ‘Needs Review’, this allows your content approvers to see the content, including all surrounding content as if it was published on production, but in a safe and controlled environment
  • re-index solr, if your site is largely driven by Apache Solr (which is not uncommon), this will allow the newly published content above to be inserted into the preview Solr index. This is a unique feature
  • your task here, seriously, the task list is completely alterable, and any drush command can be remotely executed on the preview environment. Custom drush commands can be added. An example of which is the bundled workbench-moderation-publish drush command.

Security

As all the complex access control is not needed on production (e.g. you are never sending un-published content to Solr), there is a huge security benefit to using preview sync. Access control to nodes is kept simple on production.

Also, as the preview environment is dedicated, you can lock down access, e.g. only allow access to your preview site from certain IPs. This way, your internal content approvers can still see the content, and no one else.
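For example, if Apache fronts the preview environment, the lockdown can be as simple as the following (Apache 2.2 syntax; the path and IP range are illustrative):

```apache
# Only allow internal reviewers to reach the preview docroot.
<Directory /var/www/preview/docroot>
  Order deny,allow
  Deny from all
  Allow from 192.0.2.0/24
</Directory>
```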

If preview sync sounds like it could be useful to you, I am keen to know - please leave a comment below. I am also keen to hear if

  • there is a missing feature that is needed for you to adopt preview sync
  • key integrations with other contributed modules are missing
Dec 12 2014
Dec 12

As you end up building more and more websites that target mobile devices (e.g. iPhone, iPad, Android, Windows), you need to supply an ever-increasing number of favicons. This process can be complex if done by hand; luckily there is an easy way to introduce these into your Drupal site.

What you will need

Before we start you will need a high quality icon to begin with, the icon should be:

  • 260x260px (i.e. square)
  • a PNG with transparency as needed
  • recognizable when shrunk right down to browser-favicon size (so don’t use your entire logo complete with words).

Generating the favicons

This is where the really handy realfavicongenerator.net website comes into play. I have used many other websites that offer similar functionality, but this seems to be the best, and is dead simple to use.

You will need to upload the 260x260px PNG file, and also select a hex color for the Windows 8 tile, but this should be straight forward.

I also opt for the option “I will place favicon files (favicon.ico, apple-touch-icon.png, etc.) at the root of my web site.” as this seems the most sensible place for them anyway.

When you complete the process, you will be able to download a zip file containing a whole bunch of icons and XML files, this is fine, extract them to your docroot for Drupal.

Adding the favicons to Drupal

You now will need to edit your html.tpl.php inside your theme, and add the code that the generator provides. The code should resemble something like this:

<link rel="apple-touch-icon" sizes="57x57" href="/apple-touch-icon-57x57.png">
<link rel="apple-touch-icon" sizes="114x114" href="/apple-touch-icon-114x114.png">
<link rel="apple-touch-icon" sizes="72x72" href="/apple-touch-icon-72x72.png">
<link rel="apple-touch-icon" sizes="144x144" href="/apple-touch-icon-144x144.png">
<link rel="apple-touch-icon" sizes="60x60" href="/apple-touch-icon-60x60.png">
<link rel="apple-touch-icon" sizes="120x120" href="/apple-touch-icon-120x120.png">
<link rel="apple-touch-icon" sizes="76x76" href="/apple-touch-icon-76x76.png">
<link rel="apple-touch-icon" sizes="152x152" href="/apple-touch-icon-152x152.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon-180x180.png">
<link rel="icon" type="image/png" href="/favicon-192x192.png" sizes="192x192">
<link rel="icon" type="image/png" href="/favicon-160x160.png" sizes="160x160">
<link rel="icon" type="image/png" href="/favicon-96x96.png" sizes="96x96">
<link rel="icon" type="image/png" href="/favicon-16x16.png" sizes="16x16">
<link rel="icon" type="image/png" href="/favicon-32x32.png" sizes="32x32">
<meta name="msapplication-TileColor" content="#b91d47">
<meta name="msapplication-TileImage" content="/mstile-144x144.png">

You will notice though that Drupal likes to place its default favicon into the <head> section of the page; we need to remove this so it does not interfere with the code you inserted above.

<link rel="shortcut icon" href="http://[YOURSITE]/misc/favicon.ico" type="image/vnd.microsoft.icon" />

The following code below can be inserted into your template.php file for your theme to remove the default favicon from Drupal:

<?php
/**
 * Remove the unneeded favicon from the head section.
 */
function YOURTHEME_html_head_alter(&$head_elements) {
  foreach ($head_elements as $key => $element) {
    if (!empty($element['#attributes'])) {
      if (array_key_exists('href', $element['#attributes'])) {
        if (strpos($element['#attributes']['href'], 'misc/favicon.ico') > 0) {
          unset($head_elements[$key]);
        }
      }
    }
  }
}
?>

There you have it, all done.

Update 5 January

We have created a simple module “Responsive Favicons” to help people new to Drupal get these meta tags into the head section of the HTML; you can also upload the zip file and the module will extract it for you as well.

Google recently announced that from Chrome 39 onwards on Android Lollipop (5.0+), a new meta tag will be supported

<meta name="theme-color" content="#b91d47" />

This is what your site’s title bar now looks like (instead of boring and grey).

The theme-color meta tag in use on www.maoritelevision.com

This meta tag can be added to your html.tpl.php file as above.

Let me know if this has helped you, and also if you have any other tips and tricks when it comes to favicons on your mobile devices.

Dec 03 2014
Dec 03

All too often when peer reviewing code done by other Drupalers, I spot debug code left in the commit, waiting for the chance to be deployed to staging and break everything.

I started to read up on git hooks, paying particular attention to pre-commit:

This hook is invoked by git commit, and can be bypassed with --no-verify option. It takes no parameter, and is invoked before obtaining the proposed commit log message and making a commit. Exiting with non-zero status from this script causes the git commit to abort.

You can write your pre-commit hook in any language; bash seems the most sane choice due to the power of the text analysis tools at your disposal.

Where is the code

Here is a link to the github repository with the pre-commit hook:

git clone https://github.com/wiifm69/drupal-pre-commit.git

Features

  • Executes PHP lint across the PHP files that were changed
  • Checks PHP for a blacklist of function names (e.g. dsm(), kpr())
  • Checks JavaScript/CoffeeScript for a blacklist of function names (e.g. console.log(), alert())
  • Ignores deleted files from git and will not check them
  • Tells you all of the fails at the end (and stores a log)
  • Only lets the commit go ahead when there are no fails
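To give a feel for the blacklist check, here is a minimal sketch of that part in bash (illustrative only; the actual script in the repository does more, including PHP lint and logging):

```shell
#!/usr/bin/env bash
# Sketch of a debug-function blacklist check for staged files.
blacklist='dsm\(|dpm\(|kpr\(|console\.log\(|alert\('

# Return non-zero if any staged PHP/JS file contains a blacklisted call.
check_staged_files() {
  local status=0 file
  for file in $(git diff --cached --name-only --diff-filter=ACM -- '*.php' '*.module' '*.inc' '*.js'); do
    # Inspect the staged version of the file, not the working copy.
    if git show ":$file" | grep -Eq "$blacklist"; then
      echo "Debug code found in $file" >&2
      status=1
    fi
  done
  return $status
}

# A real .git/hooks/pre-commit would end with: check_staged_files || exit 1
```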

Installation

cd /tmp
git clone https://github.com/wiifm69/drupal-pre-commit.git
cd drupal-pre-commit
cp scripts/pre-commit.sh [PATH_TO_YOUR_DRUPAL_PROJECT]/scripts
cd [PATH_TO_YOUR_DRUPAL_PROJECT]
ln -s ../../scripts/pre-commit.sh .git/hooks/pre-commit

Feedback

I am keen to hear from anyone else on how they do this, and if you have any enhancements to the code then I am happy to accept pull requests on github. Happy coding.

Sep 29 2014
Sep 29

In the course of developing larger Drupal sites, you typically find yourself having multiple environments: one for development, one or more for staging or user acceptance testing, and another for production (and perhaps disaster recovery).

One thing that always comes up is making Drupal “environment” aware, so it knows how it should behave, what modules should be turned on (or off) and what servers it should be talking to for instance.

Environment module

The environment module allows you to define arbitrary environments in code (it also comes with some out of the box). An example hook implementation:

<?php
/**
 * Implements hook_environment().
 */
function MYMODULE_environment() {
  return array(
    'prod' => array(
      'label' => t('Production'),
      'description' => t('Live sites are in full production and browsable on the web.'),
      'allowed' => array(
        'default' => TRUE,
      ),
    ),
    'dev' => array(
      'label' => t('Development'),
      'description' => t('Developer machines.'),
      'allowed' => array(
        'default' => FALSE,
      ),
    ),
    'staging' => array(
      'label' => t('Staging'),
      'description' => t('Bug fixes and testing are done here.'),
      'allowed' => array(
        'default' => FALSE,
      ),
    ),
  );
}
?>

You also may want to remove the default environments that come with the environment module

<?php
/**
 * Implements hook_environment_alter().
 */
function MYMODULE_environment_alter(&$environments) {
  // Remove default environments.
  unset($environments['production']);
  unset($environments['development']);
}
?>

Using this code, and some magic in settings.php, you can effectively tell Drupal which environment it is in by switching on the server name. A stripped-back example is:

<?php
// Include environment-specific config by parsing the URL.
// To override this, set $environment in settings.php
// BEFORE including this file.
if (!isset($environment)) {
  if (strpos($_SERVER['SERVER_NAME'], '.demo.net.nz') !== FALSE) {
    $environment = 'staging';
  }
  elseif (strpos($_SERVER['SERVER_NAME'], 'local') !== FALSE) {
    $environment = 'dev';
  }
  else {
    // Default to production.
    $environment = 'prod';
  }
}
// The environment module uses a lowercase variable.
$conf['environment']['default'] = $environment;
define('ENVIRONMENT', $environment);

// Load the environment config file, followed by host-specific
// over-rides (if any) in non-production environments.
$conf_path = DRUPAL_ROOT . "/sites/default/";
require $conf_path . "settings.$environment.php";
?>

Note this also lets you include a separate PHP file called “settings.dev.php” for development, where variable overrides (using global $conf) can be done on a per-environment basis.
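To make that concrete, a settings.dev.php might contain nothing more than a handful of $conf overrides (the values below are illustrative):

```php
<?php
// sites/default/settings.dev.php: loaded for the 'dev' environment by the
// include above. Override whatever suits your development workflow.
$conf['preprocess_css'] = 0;
$conf['preprocess_js'] = 0;
$conf['cache'] = 0;
$conf['error_level'] = 2; // Show all errors and warnings on screen.
?>
```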

Environment switching

A natural extension of the environment module is to allow pulling a production database back to development, and then “switching” it into a development state. This switching is already a hook you can implement, allowing you to react on both the current and target environments.

An example of this is from the module’s homepage:

<?php
/**
 * Implements hook_environment_switch().
 */
function HOOK_environment_switch($target_env, $current_env) {
  // Declare each optional development-related module.
  $devel_modules = array(
    'context_ui',
    'devel',
    'devel_generate',
    'devel_node_access',
    'update',
    'views_ui',
  );
  switch ($target_env) {
    case 'production':
      module_disable($devel_modules);
      drupal_set_message('Disabled development modules');
      return;

    case 'development':
      module_enable($devel_modules);
      drupal_set_message('Enabled development modules');
      return;
  }
}
?>

Other common things we do in our environment switch to development include:

Removing JS and CSS aggregation

<?php
variable_set('preprocess_css', 0);
variable_set('preprocess_js', 0);
drupal_set_message(t('Removed aggregation from CSS and JS.'));
?>

Changing the Apache Solr environment to point at localhost

<?php
if (module_exists('apachesolr')) {
  $solr_env = array(
    'url' => 'http://127.0.0.1:8983/solr/dev',
    'make_default' => TRUE,
    'name' => 'DEV',
    'env_id' => 'solr',
    'service_class' => '',
    'conf' => array(
      'apachesolr_read_only' => '0',
    ),
  );
  apachesolr_environment_save($solr_env);
  drupal_set_message(t('Set solr environment to @name at @url', array(
    '@name' => $solr_env['name'],
    '@url' => $solr_env['url'],
  )));
}
?>

Preventing API writes to production systems

<?php
variable_set('brightcove_api_write_enabled', 0);
drupal_set_message(t('Stopped the Brightcove write API sync.'));
?>

Granting helpful debugging permissions to certain roles

<?php
if (module_exists('devel')) {
  $dev_perms = array(
    'access devel information',
    'switch users',
  );
  user_role_grant_permissions(DRUPAL_ANONYMOUS_RID, $dev_perms);
  user_role_grant_permissions(DRUPAL_AUTHENTICATED_RID, $dev_perms);
}
?>

The list goes on.

Environment indicator module

Now that Drupal is environment aware, it is really helpful if Drupal can inform the user which environment they are currently looking at. Out of the box the environment module has no UI, so enter the Environment Indicator module to save the day.

The new 7.x-2.x branch of environment indicator contains a lot of improvements over the 7.x-1.x branch, one of which is the integration with the core toolbar and shortcut modules.

To illustrate this, here is a screenshot of our toolbar in development (some links were stripped)

Environment indicator and the toolbar module working together

The module also alters the favicon to include a tiny letter and coloured background to match the colour you chose

Environment indicator and the favicon working together

This way you will never again forget which environment you are using, as the colours will be right there at the top of every page.

I am keen to hear how other people solve the issue of environment aware Drupal applications, and how this is communicated to the end users of the application. What other modules are out there? What experiences have you had?

Jul 31 2014
Jul 31

What is Dashing

Dashing is a Sinatra-based framework (think Ruby, not Rails) that lets you build dashboards. It was originally made by the team at Shopify for displaying custom dashboards on TVs around the office.

Why use Dashing

Dashing makes your life easier, freeing you up to focus on more important things, like what data you want to display and what type of widget to use.

Features of Dashing:

  • Open source (MIT license)
  • Widgets are tiny and encapsulated, made with SASS, HTML and coffeescript
  • The dashboard itself is simply HTML and SASS, meaning you can theme and style it to suit your needs
  • Comes bundled with several powerful widgets
  • Widgets are powered by simple data bindings (via batman.js)
  • Push and pull methods available to each widget
  • Pull jobs can be configured to run in the background on a set interval (e.g. every 30 seconds, poll Chartbeat for new data)
  • The layout is a drag & drop interface for re-arranging widgets

Why not make a dashboard in Drupal

There are several dashboard modules in Drupal, and yes, you could go to a lot of trouble and re-create the power of Dashing in Drupal, but there is no need.

Dashing is great at what it does, and it only does one thing.

Another advantage is that you can query other sources of data - e.g. Google Analytics or MailChimp and display metrics from those applications on your dashboard.

A really great example (including code) can be found over at http://derekweitzel.blogspot.co.nz/2014/03/a-hcc-dashboard-with-osg-accounting.html

Installation of Dashing on Ubuntu 14.04

The only real requirement is Ruby 1.9+ (this comes by default in Ubuntu 14.04; in Ubuntu 12.04 you need to install ruby 1.9 explicitly)

sudo apt-get install ruby ruby-dev nodejs g++ bundler
sudo gem install dashing

You can create a new dashboard with:

dashing new awesome_dashboard
cd awesome_dashboard
bundle

You start the application with:

sudo dashing start

You now have a dashboard on http://localhost:3030 ready to go

There are already a few tutorials online, but the best resource is probably the existing suite of widgets available.

Here we will go through a simple example where we want to show the number of pieces of content in the “Needs review” state (provided by Workbench Moderation) in Drupal. This serves as a mini to-do list for content authors, as ideally this number should be as low as possible.

In this example, we are recycling the “List” widget.

Place an instance of the “List” widget on a dashboard - e.g. sample.erb

<li data-row="1" data-col="1" data-sizex="1" data-sizey="1">
  <div data-id="newsarticlesreview" data-view="List" data-unordered="true" data-title="News articles in 'Needs review'"></div>
  <i class="icon-check-sign icon-background"></i>
</li>

Create a new job to poll for data

Create a new file in jobs/newsarticlesreview.rb, and place:

#!/usr/bin/env ruby
# encoding: utf-8

require 'net/http'
require 'uri'
require 'json'

# TODO replace with a real production host
server = "https://localhost"

SCHEDULER.every '30s', :first_in => 0 do |job|

  url = URI.parse("#{server}/api/content/dashboard?token=FawTP0fJgSagS1aYcM2a5Bx-MaJI8Y975NwYWP12B0E")
  http = Net::HTTP.new(url.host, url.port)
  http.use_ssl = (url.scheme == 'https')
  http.verify_mode = OpenSSL::SSL::VERIFY_NONE
  response = http.request(Net::HTTP::Get.new(url.request_uri))

  # Convert to JSON
  j = JSON[response.body]

  # Send the counts to the list widget
  review_content = {}
  review_content['en'] = { label: 'English', value: j['en']['news_article']['needs_review'] }
  review_content['mi'] = { label: 'Māori', value: j['mi']['news_article']['needs_review'] }
  send_event("newsarticlesreview", { items: review_content.values })
end

Create a Drupal data source

Now we need to feed the Dashing request with a Drupal API. I have chosen to build these by hand, as they are straightforward. In theory you could also craft them with the Services module.

Create hook_menu() entry

/**
 * Implements hook_menu().
 */
function CUSTOM_menu() {
  // Dashboard API requests. Protected using a token.
  // e.g. api/content/dashboard?token=FawTP0fJgSagS1aYcM2a5Bx-MaJI8Y975NwYWP12B0E
  $items['api/content/dashboard'] = array(
    'title' => 'Content types broken down by workflow status',
    'page callback' => 'CUSTOM_content_dashboard',
    'access callback' => 'CUSTOM_dashboard_api_access',
    'access arguments' => array('api/content/dashboard'),
    'type' => MENU_CALLBACK,
    'file' => 'CUSTOM.dashboard.inc',
  );
  return $items;
}

Here we define a custom route and declare the access callback. The access callback is special, as it needs to ensure that access is restricted to requests carrying a valid token. The token is created from a hash of the Drupal salt combined with the current path and private key, then base64 encoded (much like drupal_get_token() without the session ID check).

/**
 * Access callback to the dashboard API endpoints. These are protected by a
 * token.
 *
 * @param String $path
 *   The path that is being requested.
 *
 * @return Boolean
 *   Whether or not the user has access to the callback.
 */
function CUSTOM_dashboard_api_access($path) {
  global $is_https;

  // HTTPS only.
  $only_allow_https = (bool) variable_get('api_https_only', 1);
  if ($only_allow_https && !$is_https) {
    return FALSE;
  }

  // Only allow get requests.
  if ($_SERVER['REQUEST_METHOD'] !== 'GET') {
    return FALSE;
  }

  // Check token is correct.
  $params = drupal_get_query_parameters();
  if (!isset($params['token']) || empty($params['token'])) {
    return FALSE;
  }
  $valid_token = CUSTOM_token_validation($path);
  if ($params['token'] !== $valid_token) {
    return FALSE;
  }

  return TRUE;
}
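The CUSTOM_token_validation() function referenced above is not shown in the post. Here is a rough, self-contained sketch of the idea (this is not the original code: the real helper would pull the salt and key from drupal_get_hash_salt() and the Drupal private key, and the exact concatenation used here is an assumption):

```php
<?php
// Hypothetical sketch of CUSTOM_token_validation(). In Drupal 7 the salt
// and key would come from drupal_get_hash_salt() and the site's private
// key; here they are plain parameters so the logic is self-contained.
function custom_token_validation($path, $salt, $private_key) {
  // URL-safe base64-encoded HMAC of the path, keyed on salt + private key
  // (the same shape as drupal_hmac_base64(), minus the session ID).
  $hmac = base64_encode(hash_hmac('sha256', $path, $salt . $private_key, TRUE));
  return strtr($hmac, array('+' => '-', '/' => '_', '=' => ''));
}
```

The same function would generate the token you paste into the Dashing job's URL.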

And finally the data for the callback.

/**
 * Gathers current content statistics from Drupal, including the amount of
 * content broken down by a) content type, b) workflow state, c) status.
 *
 * @return JSON
 */
function CUSTOM_content_dashboard() {
  $output = array();

  $languages = language_list('enabled');
  $types = node_type_get_types();

  // Workbench states.
  foreach ($languages[1] as $langcode => $language) {
    foreach ($types as $machine_name => $type) {
      // Workbench moderation in use (remove this if you do not have the module).
      if (workbench_moderation_node_type_moderated($machine_name)) {
        $results = db_query("SELECT COUNT(n.vid) AS total, w.state
                             FROM {node} n
                             JOIN {workbench_moderation_node_history} w
                               ON w.vid = n.vid
                             WHERE n.type = :type
                               AND n.language = :lang
                               AND w.current = 1
                             GROUP BY w.state", array(':type' => $machine_name, ':lang' => $langcode))->fetchAllAssoc('state');

        foreach ($results as $state => $result) {
          $output[$langcode][$machine_name][$state] = (int) $result->total;
        }
      }

      // No workbench moderation for this content type, use the status column.
      else {
        $results = db_query("SELECT COUNT(n.nid) AS total, n.status
                             FROM {node} n
                             WHERE n.type = :type
                               AND n.language = :lang
                             GROUP BY n.status", array(':type' => $machine_name, ':lang' => $langcode))->fetchAllAssoc('status');

        foreach ($results as $status => $result) {
          if ($status == NODE_PUBLISHED) {
            $status = 'published';
          }
          else {
            $status = 'unpublished';
          }
          $output[$langcode][$machine_name][$status] = (int) $result->total;
        }
      }
    }
  }

  drupal_json_output($output);
  drupal_exit();
}

And there you have it. The above code relies on Workbench moderation being present; if you do not have it, simply remove the relevant section of code. Note that the API response is considerably more complex and complete than the example calls for, but this just means you can display more data in more ways on your dashboard.

Here is the finished product:

Dashing example widget

Extra for experts

Create an init.d script for Dashing; here is a good starter.

Let me know if you have completed (or started) a recent project to visualise data from Drupal (or related third party applications) and your experiences there. Pictures are always welcome.

Mar 22 2014
Mar 22

As of 28 February 2014, Drupal 8 requires a minimum PHP version of 5.4.2. For background information read the drupal.org issue.

This places everyone running Ubuntu 12.04 LTS in an awkward situation as the PHP version bundled with this release is PHP 5.3.10.

Luckily there are options to solve this:

Perform a dist-upgrade to 14.04 LTS

This may not be the easiest option, but I mention it for completeness, as this newer version of Ubuntu (Trusty Tahr) contains PHP 5.5.9 out of the box.

Add a PPA and install a newer version of PHP

For most people this will be the easiest option. For PHP 5.4.x run the command:

sudo add-apt-repository ppa:ondrej/php5-oldstable

or for PHP 5.5.x run:

sudo add-apt-repository ppa:ondrej/php5

And then update your packages:

sudo apt-get update
sudo apt-get upgrade

The PPA maintainer has more information on the launchpad site https://launchpad.net/~ondrej/+archive/php5

Common issues

I was getting the message “The following packages have been kept back” when running the upgrade command earlier:

seanh /var/www/D8 git:8.x » sudo apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following packages have been kept back:
  libapache2-mod-php5 linux-generic linux-headers-generic linux-image-generic php-pear php5-cli php5-common php5-curl php5-dev php5-gd php5-mcrypt php5-mysql php5-pgsql php5-xdebug

This was solved by manually installing the packages:

sudo apt-get install php-pear php5-cli php5-common php5-curl php5-dev php5-gd php5-mcrypt php5-mysql php5-pgsql php5-xdebug

Let me know if this worked for you in the comments, or if you have another way to easily update PHP on your stack.

Jan 07 2014
Jan 07

So Google has a new version of Google Analytics, dubbed “Universal Analytics”, which has a bunch of new features that could be handy for your website. I won't dive into exactly what they are here, as you can read about them on Google’s own website.

In this post I will go through the steps to upgrade the Google Analytics 7.x-1.x module to the new 7.x-2.x version that supports Universal Analytics.

Update the Drupal module

If you read the Google Analytics module page you will spot that there are two different branches in use; to get Universal Analytics support you will need the 7.x-2.x version.

You can do this with Drush:

drush dl google_analytics
drush updb

Event tracking

If you have used custom event tracking in your website, a few changes are required.

Instead of

_gaq.push(['_trackEvent', 'category', 'action', 'opt_label', opt_value, opt_noninteraction]);

It is now

ga('send', 'event', 'category', 'action', 'opt_label', opt_value, {'nonInteraction': 1});

Handy grep command

If you want to find the offending lines of code, you can use grep

grep -nrI "_trackEvent" *

Custom variables are now dimensions and metrics

If you were using the old style custom variables, these are now completely gone, now replaced with dimensions and metrics. You can read more about these on Google’s website.

Instead of

_gaq.push(['_setCustomVar',
  1,                           // Slot
  'Customer Type',             // Name
  'Paid',                      // Value
  1                            // Scope (1 = User scope)
]);

It is now

ga('set', 'dimension1', 'Paid');

Drupal support of custom dimensions and metrics

The Drupal module has an active issue that allows you to configure this through the UI. Unfortunately this is still only a patch at the moment, but it looks likely to be committed shortly.

Update 24 July 2014 - this is now bundled into the latest stable release now.

DoubleClick data

If you were using the additional data that DoubleClick integration provides, this is now supported; it is just a tickbox on the admin settings page.

To enable it

variable_set('googleanalytics_trackdoubleclick', 1);

Other new features in Universal Analytics

UserID tracking

This effectively allows you to track the same user across multiple devices. This comes in handy if your users can log in to your Drupal site and are likely to log in on their mobile phones, tablets, etc. You can read more on Google’s page about User ID tracking

To enable it

variable_set('googleanalytics_trackuserid', 1);

Enhanced link attribution

This allows Google Analytics to differentiate URLs based on which link the user clicked on; really handy if you have many links pointing at the same page. You can read more on Google’s page about enhanced link attribution

To enable it

variable_set('googleanalytics_tracklinkid', 1);

Finally

Run this little gem over your codebase to ensure there is no legacy Google Analytics code lying around.

grep -nrI "_gaq" *

Let me know in the comments if you have any tips or tricks for the new Google Analytics.

Dec 11 2013
Dec 11

As part of the series of blog posts on the top 10 OWASP web application security risks and how to defend against them in Drupal 7, here is the first post in the series. This post deals with the top security hole - classified as “injection”.

From the OWASP top 10 security risks:

Injection flaws, such as SQL, OS, and LDAP injection occur when untrusted data is sent to an interpreter as part of a command or query.

For this post I will cover what SQL injection is and how Drupal 7’s built in tools can help you avoid this.

What is SQL injection

Incorrectly filtered escape characters

This form of SQL injection occurs when the user supplied input does not have the escape characters filtered from the query. This is easily demonstrated with a SQL query example (which you should never write). Imagine the query:

'SELECT nid FROM node WHERE nid = ' . $_GET['nid']

What happens when the query parameter ‘nid’ is:

' or '1'='1

Or even worse, imagine this is the query parameter ‘nid’:

1'; DROP TABLE NODE; --

Incorrect type handling

This occurs when user supplied input is not strongly typed. For instance you may be expecting an integer but as a developer make no effort to enforce this. The above example shows this in action as well.
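A single cast is often all the enforcement needed. A tiny standalone illustration (not from the original post):

```php
<?php
// Enforce the expected type before the value ever reaches a query: the
// cast keeps only the leading integer and discards the injected SQL.
$raw = "1'; DROP TABLE NODE; --";
$nid = (int) $raw; // 1
```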

Blind SQL injection

This is used when an application is vulnerable to SQL injection but the results are not visible to the attacker. A crafted query parameter ‘nid’ (for the above SQL query) might look something like:

5 AND substring(@@version,1,1) = 4

If you received a non-error page it would indicate that the server was running MySQL 4.

For more information see the wikipedia page on SQL injection.

Database abstraction layer

The easiest defence is to use the database abstraction layer effectively. By writing your database queries in this manner you are not only defending against injection from unclean inputs, you are also ensuring your query will execute on all supported databases (MySQL, PostgreSQL, SQLite and contributed drivers, e.g. Oracle). So this method is great for both portability and security.

Example insert query showing the abstraction layer:

$nid = db_insert('node')
  ->fields(array('title', 'uid', 'created'))
  ->values(array(
    'title' => 'Example',
    'uid' => 1,
    'created' => REQUEST_TIME,
  ))
  ->execute();

See the documentation for more information on the database abstraction layer.

Dynamic select queries are possible by adding to the query object:

// Create an object of type SelectQuery.
$query = db_select('users', 'u');
// Add extra detail to this query object: a condition, fields and a range.
$query->condition('u.uid', 0, '<>');
$query->fields('u', array('uid', 'name', 'status', 'created', 'access'));
$query->range(0, 50);
$result = $query->execute();

Adding tags to your queries allows you (or another module) to alter the query before it is executed. The best example of this is the node_access tag.

$query->addTag('node_access');

By adding that simple tag onto any SELECT query, you ensure that node access restrictions are applied to all returned node IDs. All queries that retrieve a list of nodes (or node IDs) for display to users should have this tag (this touches on a few other OWASP security risks).

Static or custom queries with db_query()

If you do need to write static and fast (no placeholders) or extremely custom (multiple complex joins, sub-queries, temporary tables) SQL, you can opt for the less standardised db_query().

N.B. this should never be the first choice for dynamic (with placeholders) database queries - db_select() should be used where possible.

Example static query that is perfect for db_query():

$result = db_query("SELECT nid, title FROM {node}");

With db_query() and dynamic queries you need to perform any sanitisation of the query yourself, luckily there are built in methods you can take advantage of here.

Example select query showing the raw SQL:

$result = db_query('SELECT n.nid, n.title, n.created FROM {node} n WHERE n.uid = :uid', array(':uid' => $uid))->fetchAll();

With the above example you can see the uid parameter is escaped as it is passed in as an argument. In general if you are creating raw SQL with any form of string concatenation you are most likely doing it wrong.
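Drupal 7's database layer is built on top of PDO, so the same placeholder mechanics can be seen in a standalone plain-PDO sketch (using an in-memory SQLite database; this is not Drupal code):

```php
<?php
// A bound placeholder keeps user input as data: the malicious string
// cannot change the structure of the query.
$db = new PDO('sqlite::memory:');
$db->exec('CREATE TABLE node (nid INTEGER, title TEXT)');
$db->exec("INSERT INTO node VALUES (1, 'Example')");

$malicious = "1; DROP TABLE node; --";
$stmt = $db->prepare('SELECT title FROM node WHERE nid = :nid');
$stmt->execute(array(':nid' => $malicious));
$rows = $stmt->fetchAll(PDO::FETCH_COLUMN);
// $rows is empty and the node table is still intact.
```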

See the documentation for more information on db_query().

Update 13 November

  • I have added a section on what SQL injection is, and the common attacks
  • I have made clearer that db_select() is the preferred method of querying the database over db_query()
  • Removed the section on db_escape_table() as it was not helpful.

Further reading

Want to read more about this topic? Here are a few resources that can help you:

If I have missed any techniques, or you know of any modules or automated tools that may help, let me know in the comments. I am also after better ‘further reading’ material, so if there are other blog posts that deal with SQL injection let me know.

Nov 28 2013
Nov 28

As the number of modules on your site grows, so do the memory and SQL queries required to perform a full bootstrap. Even though your AJAX callback might only need to perform a single SELECT query, Drupal can spend a lot of time loading and executing code that will never be used.

Introducing the JS module

The JS module aims to solve this problem by providing an alternative way to bootstrap only to the level required for the task at hand. This has the major advantage of including only the files necessary to serve the request.

From the module’s description:

JavaScript callback handler is an interim solution for high-performance server requests including (but not limited to) AHAH, AJAX, JSON, XML, etc. This project targets module developers and provides a "bare bone" callback handler which is intended to be addressed by modules wanting to improve response times for specialized tasks.

Drupal and bootstrap levels

By default a vanilla hook_menu() in Drupal will bootstrap to DRUPAL_BOOTSTRAP_FULL, which means it will load all .module files and execute hook_boot() and hook_init() for all enabled modules. With the JS module, however, you can select the bootstrap level you wish to go to (the fewer levels you load, the faster your code can run); the only trade-off is the functionality you might not have available.

Here is a list of the bootstrap levels defined in Drupal core (the earliest are the most lightweight, with each subsequent phase adding code and complexity):

static $phases = array(
  DRUPAL_BOOTSTRAP_CONFIGURATION,
  DRUPAL_BOOTSTRAP_PAGE_CACHE,
  DRUPAL_BOOTSTRAP_DATABASE,
  DRUPAL_BOOTSTRAP_VARIABLES,
  DRUPAL_BOOTSTRAP_SESSION,
  DRUPAL_BOOTSTRAP_PAGE_HEADER,
  DRUPAL_BOOTSTRAP_LANGUAGE,
  DRUPAL_BOOTSTRAP_FULL,
);

For example if you wished to use the function variable_get() in your AJAX callback, you would need to ensure you had bootstrapped to at least the DRUPAL_BOOTSTRAP_VARIABLES level, and if you required access to the current $user object you would need DRUPAL_BOOTSTRAP_SESSION etc.
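Conceptually, the bootstrap walks that list in order and stops once the requested phase has run. A simplified standalone model of the mechanism (not Drupal's actual code, which also tracks already-completed phases):

```php
<?php
// Simplified model of incremental bootstrapping: run each phase handler
// in order and stop once the target phase has executed.
function bootstrap_to($target, array $phases, array $handlers) {
  $completed = array();
  foreach ($phases as $phase) {
    $handlers[$phase](); // e.g. read settings, connect to the database...
    $completed[] = $phase;
    if ($phase === $target) {
      break;
    }
  }
  return $completed;
}
```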

Real world site performance of the JS module

In order to demonstrate the difference in performance, I decided to create a really simple module that highlights the difference between a traditional full Drupal bootstrap and the lightweight approach of the JS module. This simple module can be cloned from my sandbox on Drupal.org.

On a large site with more than 185 modules enabled (including memcache + entitycache), I ran a series of tests to see what impact that had:

Full bootstrap (default Drupal hook_menu())

After a drush cc all (206 database queries):

[email protected] /var/www/example git:master » time curl http://example.co.nz/js_example/results/35513
    {"title":"Example node title","success":true}
    real        0m5.085s

Primed cache (17 database queries):

[email protected] /var/www/example git:master » time curl http://example.co.nz/js_example/results/35513
    {"title":"Example node title","success":true}
    real        0m0.514s

Lightweight bootstrap (with JS module bootstrapping to DRUPAL_BOOTSTRAP_DATABASE)

After a drush cc all (9 database queries):

[email protected] /var/www/example git:master » time curl http://example.co.nz/js/js_example/results/35513
    {"title":"Example node title","success":true}
    real        0m0.300s

Primed cache (6 database queries):

[email protected] /var/www/example git:master » time curl http://example.co.nz/js/js_example/results/35513
    {"title":"Example node title","success":true}
    real        0m0.078s

The savings in speed (more than 6 times faster with a primed cache) and system resources (around one third of the database queries with a primed cache) are remarkable.

Limitations

Access control can be trickier on the lightweight bootstrap, so be careful and, if anything, be overly paranoid about any data you receive. You may also want to read up on SQL injection and Drupal 7 - top 1 of 10 OWASP security risks

Theming can also get tricky with a lightweight bootstrap; if you need to harness the power of view modes, templating and node rendering, you might be better off with a full bootstrap.

Let me know if you have used the JS module in your Drupal site (and a link if it is public), and I would also be interested to see what benchmarks you get by running my sandbox module on your latest Drupal site.

Oct 11 2013
Oct 11

Knowing when an issue started occurring, and what impact it is having on real people using your site, is critical business information that too often gets overlooked.

Existing (core) modules that can help

Drupal core ships with a few modules that go some way towards helping you track down application errors:

  • dblog - writes PHP error and watchdog messages to the database, where they can be filtered. On cron the old messages are optionally truncated at pre-defined intervals.
  • syslog - writes PHP error and watchdog messages to the syslog on the server. Logs are plain text and can be parsed by numerous standard programs (e.g. grep, Splunk).

The main issue with the above modules is that they:

  • fail to aggregate errors for you - if the issue happens 1 or 1,000 times, you have no way of knowing without working it out yourself
  • fail to provide enough detail to fix the underlying issue - there is no stacktrace, no line numbers, etc.
  • are not application version specific - you don’t know which release introduced the bug
  • cannot log PHP white screens of death - for instance, exceeding the PHP max execution time will not result in a watchdog entry, so you are potentially missing really important information.

Introducing the Raygun.io module

The Raygun.io module aims to solve this problem by replacing PHP’s exception and error handlers with custom ones. These handlers simply send the error and all its surrounding metadata off to a third party server - in this case Raygun.io.

Raygun.io takes care of the aggregation and alerting around the errors. You can choose to have immediate notifications for new errors, or receive daily summaries of the day’s activity.

The error sending happens asynchronously, so errors will not hold up your main web thread (useful if there are firewall(s) between your Drupal web servers and the internet).
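The handler-swapping technique itself is plain PHP. A minimal standalone sketch (the $send callable is a stand-in for the Raygun.io client, which in reality also captures exceptions and posts asynchronously):

```php
<?php
// Minimal sketch of capturing errors with a custom handler. $send stands
// in for the real client; a real setup would also install a handler via
// set_exception_handler() for uncaught exceptions.
function install_error_capture($send) {
  set_error_handler(function ($severity, $message, $file, $line) use ($send) {
    $send(array(
      'severity' => $severity,
      'message' => $message,
      'file' => $file,
      'line' => $line,
    ));
    // Returning FALSE lets PHP's normal error handling continue too.
    return FALSE;
  });
}
```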

Screenshots

Main dashboard that you can filter by time and application version:

Error drilldown:

Drupal configuration form for the module:

Let me know if you have used third party error aggregation and alerting modules with your Drupal site and what lessons you have learned.


About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web
