Apr 28 2016

Recently I needed to migrate a small set of content into a Drupal 8 site from a JSON feed, and since documentation for this particular scenario is slightly thin, I decided I'd post the entire process here.

I was given a JSON feed available over the public URL http://www.example.com/api/products.json which looked something like:

{
  "upcs" : [ "11111", "22222" ],
  "products" : [ {
    "upc" : "11111",
    "name" : "Widget",
    "description" : "Helpful for many things.",
    "price" : "14.99"
  }, {
    "upc" : "22222",
    "name" : "Sprocket",
    "description" : "Helpful for things needing sprockets.",
    "price" : "8.99"
  } ]
}

I first created a new 'Product' content type inside Drupal, with the Title field label changed to 'Name', and with additional fields for UPC, Description, and Price. Then I needed to migrate all the data in the JSON feed into Drupal, in the product content type.

Note: at the time of this writing, Drupal 8.1.0 had just been released, and many of the migrate ecosystem of modules (still labeled experimental in Drupal core) require specific or dev versions to work correctly with Drupal 8.1.x's version of the Migrate module.

Required modules

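Drupal core includes the base 'Migrate' module, but you'll need to download and enable all the following modules to create JSON migrations:

Migrate Plus (migrate_plus)
Migrate Tools (migrate_tools)
Migrate Source JSON (migrate_source_json)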

After enabling those modules, you should be able to use the standard Drush commands provided by Migrate Tools to view information about migrations (migrate-status), run a migration (migrate-import [migration]), rollback a migration (migrate-rollback [migration]), etc.

The next step is creating your own custom migration by adding custom migration configuration via a module:

Create a Custom Migration Module

In Drupal 8, instead of creating a special migration class for each migration, registering the migrations in an info hook, etc., you can just create a migration configuration YAML file inside config/install (or, technically, config/optional if you're including the migration config inside a module that does a bunch of other things and may or may not be used with the Migrate module enabled). When your module is installed, the migration configuration is read into the active configuration store.

The first step in creating a custom migration module in Drupal 8 is to create a folder (in this case, migrate_custom_product), and then create an info file with the module information, named migrate_custom_product.info.yml, with the following contents:

type: module
name: Migrate Custom Product
description: 'Custom product migration.'
package: Migration
core: 8.x
dependencies:
  - migrate_plus
  - migrate_source_json

Next, we need to create a migration configuration file, so inside migrate_custom_product/config/install, add a file titled migrate_plus.migration.product.yml (I'm going to call the migration product to keep things simple). Inside this file, define the entire JSON migration (don't worry, I'll go through each part of this configuration in detail later!):

# Migration configuration for products.
id: product
label: Product
migration_group: Products
migration_dependencies: {}

source:
  plugin: json_source
  path: http://www.example.com/api/products.json
  headers:
    Accept: 'application/json'
  identifier: upc
  identifierDepth: 1
  fields:
    - upc
    - name
    - description
    - price

destination:
  plugin: entity:node

process:
  type:
    plugin: default_value
    default_value: product

  title: name
  field_upc: upc
  field_description: description
  field_price: price

  sticky:
    plugin: default_value
    default_value: 0
  uid:
    plugin: default_value
    default_value: 0

The first section defines the migration machine name (id), human-readable label, group, and dependencies. You don't need to separately define the group outside of the migration_group defined here, though you might want to if you have many related migrations that need the same general configuration (see the migrate_example module included in Migrate Plus for more).

source:
  plugin: json_source
  path: http://www.example.com/api/products.json
  headers:
    Accept: 'application/json'
  identifier: upc
  identifierDepth: 1
  fields:
    - upc
    - name
    - description
    - price

The source section defines the migration source and provides extra data to help the source plugin know what information to retrieve, how it's formatted, etc. In this case, it's a very simple feed, and we don't need to do any special transformation to the data, so we can just give a list of fields to bring across into the Drupal Product content type.

The most important parts here are the path (which tells the JSON source plugin where to go to get the data), the identifier (the unique ID that should be used to match content in Drupal to content in the feed), and the identifierDepth (the level in the feed's hierarchy where the identifier is located).
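To make the idea of an identifier depth more concrete, here is a minimal sketch in plain PHP that walks the sample feed and collects the records holding the identifier. It is only an illustration of one way to count levels (the arrays found one level below the feed's top-level values), not the code used by the JSON source plugin, and find_records() is a hypothetical helper name.

```php
<?php
// Illustration only: a simplified stand-in, not the migrate_source_json code.
// The sample feed shown earlier in this post.
$json = <<<'JSON'
{
  "upcs" : [ "11111", "22222" ],
  "products" : [ {
    "upc" : "11111",
    "name" : "Widget",
    "description" : "Helpful for many things.",
    "price" : "14.99"
  }, {
    "upc" : "22222",
    "name" : "Sprocket",
    "description" : "Helpful for things needing sprockets.",
    "price" : "8.99"
  } ]
}
JSON;
$feed = json_decode($json, TRUE);

/**
 * Hypothetical helper: collects the arrays found $depth levels below the
 * values of $data that contain the $identifier key.
 */
function find_records(array $data, string $identifier, int $depth): array {
  $records = [];
  foreach ($data as $value) {
    if (!is_array($value)) {
      continue;
    }
    if ($depth === 0) {
      // At the target level, keep only arrays that carry the identifier.
      if (array_key_exists($identifier, $value)) {
        $records[] = $value;
      }
    }
    else {
      // Not deep enough yet: descend one level.
      $records = array_merge($records, find_records($value, $identifier, $depth - 1));
    }
  }
  return $records;
}

foreach (find_records($feed, 'upc', 1) as $record) {
  echo $record['upc'] . ': ' . $record['name'] . PHP_EOL;
}
```

With this way of counting, a depth of 0 finds nothing in the sample feed, because the records sit inside the products list rather than directly under the top level.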

destination:
  plugin: entity:node

Next we tell Migrate the data should be saved to a node entity (you could also define a destination of entity:taxonomy, entity:user, etc.).

process:
  type:
    plugin: default_value
    default_value: product

  title: name
  field_upc: upc
  field_description: description
  field_price: price

  sticky:
    plugin: default_value
    default_value: 0
  uid:
    plugin: default_value
    default_value: 0

Inside the process configuration, we'll tell Migrate which specific node type to migrate content into (in this case, product), then we'll give a simple field mapping between the Drupal field name (e.g. title) and the name of the field in the JSON feed's individual record (name). For certain properties, like a node's sticky status, or the uid, you can provide a default using the default_value plugin.
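As a rough mental model of what this mapping does to each record, the sketch below applies the same rules to one row in plain PHP. It only mimics the shape of the process section; Migrate's real process pipeline is far more capable, and map_row() is a hypothetical name used for illustration.

```php
<?php
// Illustration only: mimics the process section above, not Migrate's pipeline.

/**
 * Hypothetical helper: builds destination values from one source row.
 */
function map_row(array $row, array $process): array {
  $values = [];
  foreach ($process as $destination => $rule) {
    if (is_array($rule) && ($rule['plugin'] ?? '') === 'default_value') {
      // The default_value rule ignores the source row entirely.
      $values[$destination] = $rule['default_value'];
    }
    elseif (is_string($rule)) {
      // A plain string copies the named source field across.
      $values[$destination] = $row[$rule] ?? NULL;
    }
  }
  return $values;
}

$process = [
  'type' => ['plugin' => 'default_value', 'default_value' => 'product'],
  'title' => 'name',
  'field_upc' => 'upc',
  'field_description' => 'description',
  'field_price' => 'price',
  'sticky' => ['plugin' => 'default_value', 'default_value' => 0],
  'uid' => ['plugin' => 'default_value', 'default_value' => 0],
];
$row = [
  'upc' => '11111',
  'name' => 'Widget',
  'description' => 'Helpful for many things.',
  'price' => '14.99',
];

print_r(map_row($row, $process));
```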

Enabling the module, running a migration

Once the module is ready, go to the module page or use Drush to enable it, then use migrate-status to make sure the Product migration configuration was picked up by Migrate:

$ drush migrate-status
Group: Products  Status  Total  Imported  Unprocessed  Last imported
product          Idle    2      0         2

Use migrate-import to kick off the product migration:

$ drush migrate-import product
Processed 2 items (2 created, 0 updated, 0 failed, 0 ignored) - done with 'product'           [status]

You can then check under the content administration page to see if the products were migrated successfully:

Drupal 8 content admin - successful product JSON migration

If the products appear here, you're done! But you'll probably need to do some extra data transformation using a custom JSONReader to transform the data from the JSON feed into your custom content type. That's another topic for another day!

Note: Currently, the Migrate UI at /admin/structure/migrate is broken in Drupal 8.1.x, so using Drush is the only way to inspect and interact with migrations; even with a working UI, it's generally best to use Drush to inspect, run, roll back, and otherwise interact with migrations.

Reinstalling the configuration for testing

Since the configuration you define inside your module's config/install directory is only read into the active configuration store at the time when you enable the module, you will need to re-import this configuration frequently while developing the migration. There are two ways you can do this. You could use some code like the following in your custom product migration's migrate_custom_product.install file:

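<?php
/**
 * Implements hook_uninstall().
 */
function migrate_custom_product_uninstall() {
  // Delete the migration configuration that was installed from
  // config/install/migrate_plus.migration.product.yml.
  \Drupal::configFactory()->getEditable('migrate_plus.migration.product')->delete();
  drupal_flush_all_caches();
}
?>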

...or you can use the Configuration Development module to easily re-import the configuration continuously or on-demand. The latter option is recommended, and is also the most efficient when dealing with more than just a single migration's configuration. I have a feeling config_devel will be a common module in a Drupal 8 developer's tool belt.

Further Reading

Some of the inspiration for this post was found in this more fully-featured example JSON migration module, which was referenced in the issue Include JSON example in the module on Drupal.org. You should also make sure to read through the Migrate API in Drupal 8 documentation.

Apr 06 2016

As Drupal VM has passed 500 stars on GitHub and is becoming a fairly mature local development environment, especially for teams of Drupal developers who want to maintain consistency and flexibility when developing many sites, I've been working to get more stable releases, better documentation, and a more focused feature set.

Also, in the past few months, as interest has surged, I've even had the opportunity to talk about all things Drupal VM on the DrupalEasy podcast! Check out DrupalEasy Podcast 172 - The Coup (Jeff Geerling - Drupal VM), which was just posted a few days ago.

And to keep the conversation flowing, I'm going to be moderating a BoF on Drupal VM at DrupalCon New Orleans, Drupal VM and local Drupal development for teams.

The BoF will be Wednesday morning, from 10:45-11:45 a.m., in room 289 (Chromatic), and if you want to talk about local development environments for teams, or just the future of Drupal VM, please stop by—I hope it will be a lively and productive conversation!

And who knows, maybe I'll bring my Raspberry Pi Zero cluster and see how Drupal VM performs on it :D

Pi Zero cluster: “go small or go home” #raspberrypi pic.twitter.com/wn4GDQV4AB

— Jeff Geerling (@geerlingguy) March 26, 2016
Mar 31 2016

For the past few days, I've been diving deep into testing Drupal 8's experimental new BigPipe feature, which allows Drupal page requests for authenticated users to be streamed and loaded in stages—cached elements (usually the majority of a page) are loaded almost immediately, meaning the end user can interact with the main elements on the page very quickly, then other uncacheable elements are loaded in as Drupal is able to render them.

Here's a very quick demo of an extreme case, where a particular bit of content takes five seconds to load; BigPipe hugely improves the usability and perceived performance of the page by streaming the majority of the page content from cache immediately, then streaming the harder-to-generate parts as they become available (click to replay):

BigPipe demonstration in an animated gif
Drupal BigPipe demo - click to play again.

BigPipe takes advantage of streaming PHP responses (using flush() to flush the output buffer at various times during a page load), but to ensure the stream is delivered all the way from PHP through to the client, you need to make sure your entire webserver and proxying stack streams the request directly, with no buffering. Since I maintain Drupal VM and support Apache and Nginx as webservers, as well as Varnish as a reverse caching proxy, I experimented with many different configurations to find the optimal way to stream responses through any part of this open source stack.

And because my research dug up a bunch of half-correct, mostly-untested assumptions about output buffering with PHP requests, I figured I'd set things straight in one comprehensive blog post.

Testing output buffering

I've seen a large number of example scripts used to test output_buffering on Stack Overflow and elsewhere, and many of them assume output buffering is disabled completely. Rather than doing that, I decided to make a little more robust script for my testing purposes, and also to document all the different bits for completeness:

<?php
// Set a valid header so browsers pick it up correctly.
header('Content-type: text/html; charset=utf-8');

// Emulate the header BigPipe sends so we can test through Varnish.
header('Surrogate-Control: BigPipe/1.0');

// Explicitly disable caching so Varnish and other upstreams won't cache.
header("Cache-Control: no-cache, must-revalidate");

// Setting this header instructs Nginx to disable fastcgi_buffering and disable
// gzip for this request.
header('X-Accel-Buffering: no');

$string_length = 32;
echo 'Begin test with an ' . $string_length . ' character string...<br />' . "\r\n";

// For 3 seconds, repeat the string.
for ($i = 0; $i < 3; $i++) {
  $string = str_repeat('.', $string_length);
  echo $string . '<br />' . "\r\n";
  echo $i . '<br />' . "\r\n";
  flush();
  sleep(1);
}

echo 'End test.<br />' . "\r\n";
?>

If you place this file into a web-accessible docroot, then load the script in your terminal using PHP's cli, you should see output like (click to replay):

PHP CLI streaming response test
PHP response streaming via PHP's CLI - click to play again.

And if you view it in the browser? By default, you won't see a streamed response. Instead, you'll see nothing until the entire page loads (click to replay):

PHP webserver streaming response test not working
PHP response not streaming via webserver in the browser - click to play again.

That's good, though—we now have a baseline. We know that the script works on PHP's CLI, but either our webserver or PHP is not streaming the response all the way through to the client. If you change the $string_length to 4096, and are using a normal PHP/Apache/Nginx configuration, you should see the following (click to replay):

PHP webserver streaming response test not working
PHP response streaming via webserver in the browser - click to play again.

The rest of this post will go through the steps necessary to ensure the response is streamed through your entire stack.

PHP and output_buffering

Some guides say you have to set output_buffering = Off in your php.ini configuration in order to stream a PHP response. In some circumstances, this is useful, but typically, if you're calling flush() in your PHP code, PHP will flush the output buffer immediately after the buffer is filled (the default value is 4096, which means PHP will flush its buffer in 4096 byte chunks).

For many applications, 4096 bytes of buffering offers a good tradeoff for better transport performance vs. more lively responses, but you can lower the value if you need to send back much smaller responses (e.g. tiny JSON responses like {setting:1}).

One setting you definitely do need to disable, however, is zlib.output_compression. Set it to zlib.output_compression = Off in php.ini and restart PHP-FPM to make sure gzip compression is disabled.

There are edge cases where the above doesn't hold absolutely true... but in most real-world scenarios, you won't need to disable PHP's output_buffering to enable streaming responses.
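If you want a quick check of these two settings on a given server, a small script along the following lines can help. This is a minimal sketch: streaming_warnings() is a hypothetical helper, and the 4096-byte threshold simply mirrors the default mentioned above rather than any hard limit.

```php
<?php
/**
 * Hypothetical helper: returns warnings about PHP settings that can get in
 * the way of streamed responses. Values are passed in as strings, in the
 * form ini_get() returns them.
 */
function streaming_warnings(string $zlibCompression, string $outputBuffering): array {
  $warnings = [];
  // zlib.output_compression may be On/Off, 1/0, or a buffer size in bytes.
  if (!in_array(strtolower(trim($zlibCompression)), ['', '0', 'off'], TRUE)) {
    $warnings[] = 'zlib.output_compression is enabled; set it to Off.';
  }
  // output_buffering may be Off/0, On, or a buffer size in bytes.
  if (is_numeric($outputBuffering) && (int) $outputBuffering > 4096) {
    $warnings[] = 'output_buffering is larger than 4096 bytes; small chunks will be delayed.';
  }
  return $warnings;
}

// ini_get() returns FALSE for unknown settings, so cast to string first.
print_r(streaming_warnings(
  (string) ini_get('zlib.output_compression'),
  (string) ini_get('output_buffering')
));
```

Run it through the webserver rather than the CLI, since the CLI usually loads a different php.ini.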

Nginx configuration

I recommend using Nginx with PHP-FPM for the most flexible and performant configuration, but still run both Apache and Nginx in production for various reasons. Nginx has a small advantage over Apache for PHP usage in that it doesn't have the cruft of the old mod_php approach where PHP was primarily integrated with the webserver, meaning the proxied request approach (using FastCGI) has always been the default, and is well optimized.

All you have to do to make streaming responses work with Nginx is set the header X-Accel-Buffering: no in your response. Once Nginx recognizes that header, it automatically disables gzip and fastcgi_buffering for only that response.

header('X-Accel-Buffering: no');

You can also manually disable gzip (gzip off) and buffering (fastcgi_buffering off) for an entire server directive, but that's overkill and would harm performance in any case where you don't need to stream the response.

Apache configuration

Because there are many different ways of integrating PHP with Apache, it's best to discuss how streaming works with each technique:

mod_php

Apache's mod_php seems to be able to handle streaming without disabling deflate/gzip for requests out of the box. No configuration changes required.

mod_fastcgi

When configuring mod_fastcgi, you must add the -flush option to your FastCgiExternalServer directive, otherwise if you have mod_deflate/gzip enabled, Apache will buffer the entire response and delay until the end to deliver it to the client:

# If using PHP-FPM on TCP port 9000.
FastCgiExternalServer /usr/lib/cgi-bin/php5-fcgi -flush -host 127.0.0.1:9000 -pass-header Authorization

mod_fcgid

I've never configured Apache and PHP-FPM using mod_fcgid, and it seems cumbersome to do so; however, according to the Drupal BigPipe environment docs, you can get output buffering disabled for PHP responses by setting:

FcgidOutputBufferSize 0

mod_proxy_fcgi

If you use mod_proxy_fcgi with PHP-FPM, then you have to disable gzip in order to have responses streamed:

SetEnv no-gzip 1

In all the above cases, PHP's own output buffering will take effect up to the default output_buffering setting of 4096 bytes. You can always change this value to something lower if absolutely necessary, but in real-world applications (like Drupal's use of BigPipe), many response payloads will have flushed output chunks greater than 4096 bytes, so you might not need to change the setting.

Varnish configuration

Varnish buffers output by default, and you have to explicitly disable this behavior for streamed responses by setting do_stream on the backend response inside vcl_backend_response. Drupal, following Facebook's lead, uses the header Surrogate-Control: BigPipe/1.0 to flag a response as needing to be streamed. Streaming support arrived in Varnish 3.0 (see the Varnish blog post announcing streaming support in 3.0), but note that vcl_backend_response is Varnish 4.0+ syntax; in 3.x the equivalent subroutine is vcl_fetch. Make the following changes:

Inside your Varnish VCL:

sub vcl_backend_response {
    ...
    if (beresp.http.Surrogate-Control ~ "BigPipe/1.0") {
        set beresp.do_stream = true;
        set beresp.ttl = 0s;
    }
}

Then make sure you output the header anywhere you need to stream a response:

header('Surrogate-Control: BigPipe/1.0');

Debugging output streaming

During the course of my testing, I ran into a strange and nasty networking issue with a VMware Vagrant box, which was causing HTTP responses delivered through the VM's virtual network to be buffered no matter what, while responses inside the VM itself worked fine. After trying to debug it for an hour or two, I gave up, rebuilt the VM in VirtualBox instead of VMware, couldn't reproduce the issue, then rebuilt again in VMware, couldn't reproduce again... so I just put that there as a warning: your entire stack (including any OS, network and virtualization layers) has to be functioning properly for streaming to work!

To debug PHP itself, and make sure PHP is delivering the stream even when your upstream webserver or proxy is not, you can analyze packet traffic routed through PHP-FPM on port 9000 (it's a lot harder to debug via UNIX sockets, which is one of many reasons I prefer defaulting to TCP for PHP-FPM). I used the following command to sniff port 9000 on localhost while making requests through Apache, Nginx, and Varnish:

tcpdump -nn -i any -A -s 0 port 9000

You can press Ctrl-C to exit once you're finished sniffing packets.
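It can also help to look at the problem from the client's side and record when each line of a response actually arrives. The following is a minimal sketch of that idea; time_lines() is a hypothetical helper, the demo uses an in-memory stream so it runs anywhere, and the URL in the final comment is only a placeholder.

```php
<?php
/**
 * Hypothetical helper: reads a stream line by line and records how many
 * seconds after the start of reading each line arrived.
 */
function time_lines($stream): array {
  $start = microtime(TRUE);
  $lines = [];
  while (($line = fgets($stream)) !== FALSE) {
    $lines[] = [round(microtime(TRUE) - $start, 2), rtrim($line, "\r\n")];
  }
  return $lines;
}

// Self-contained demo: an in-memory stream, so everything "arrives" at once.
$demo = fopen('php://memory', 'r+');
fwrite($demo, "first<br />\r\nsecond<br />\r\n");
rewind($demo);
foreach (time_lines($demo) as [$elapsed, $line]) {
  printf("%5.2fs  %s\n", $elapsed, $line);
}
fclose($demo);

// Against a real server you could pass an HTTP stream instead, assuming
// allow_url_fopen is enabled, e.g. fopen('http://localhost/stream-test.php', 'r').
```

If the response is streamed, the elapsed times should be spread out roughly a second apart for the test script above; if something is buffering, they will all cluster at the end.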

Mar 24 2016

tl;dr: Drupal 8's defaults make most Drupal sites perform faster than equivalent Drupal 7 sites, so be wary of benchmarks which tell you Drupal 7 is faster based solely on installation defaults or raw PHP execution speed. Architectural changes have made Drupal's codebase slightly slower in some ways, but the same changes make the overall experience of using Drupal and browsing a Drupal 8 site much faster.

When some people see reports of Drupal 8 being 'dramatically' slower than Drupal 7, they wonder why, and they also use this performance change as ammunition against some of the major architectural changes that were made during Drupal 8's development cycle.

First, I wanted to give some more concrete data behind why Drupal 8 is slower (specifically, what kinds of things does Drupal 8 do that make it take longer per request than Drupal 7 on an otherwise-identical system), and also why this might or might not make any difference in your choice to upgrade to Drupal 8 sooner rather than later.

Load test benchmarks with a cluster of Raspberry Pis

For a hobby project of mine, the Raspberry Pi Dramble, I like to benchmark every small change I make to the infrastructure—I poke and prod to see how it affects load capacity (how many requests per second can be served without errors), per-page load performance (how many milliseconds before the page is delivered), and availability (how many requests are served correctly and completely).

I've compiled these benchmarks from time to time on the Dramble - Drupal Benchmarks page, and I also did a much more detailed blog post on the matter (especially comparing PHP 5.6 to 7.0 to HHVM): Benchmarking PHP 7 vs HHVM - Drupal and Wordpress.

The most recent result paints a pretty sad picture if you're blindly comparing Drupal 8's standard configuration with Drupal 7's (with anonymous page caching enabled [1]):

Drupal 8 vs Drupal 7 standard profile performance on home page load - anonymous vs authenticated

These particular benchmarks highlight the maximum load capacity with 100% availability that the cluster of five (incredibly slow, in comparison to most modern servers) Raspberry Pis can sustain. Chances are you'll get more capacity just spinning up an instance of Drupal VM on your own laptop! But the fact of the matter is: Drupal 7, both when loading pages for anonymous and authenticated users, in a very bare (no custom modules, no content) scenario, is much faster than Drupal 8. But why?

XHProf page profiling with Drupal 7 and Drupal 8

With Drupal VM, it's very easy to profile code with XHProf, so I spun up one VM for Drupal 8, then shut that one down and spun up an identical environment for Drupal 7 (both using PHP 5.6 and Apache 2.4), and ran an XHProf analysis on the home page, standard profile, anonymous user, with anonymous page cache enabled, on the first page load (e.g. when Drupal stores its anonymous cache copy).

Subsequent page loads use even less of Drupal's critical code path, and it would be helpful to also analyze what's happening there, but for this post I'll focus on the first anonymous page request to the home page.

Compare, first, the zoomed out callgraph image for Drupal 7 (126ms, 13,406 function calls, 3.7 MB memory) vs Drupal 8 (371ms, 41,863 function calls, 11.1 MB memory):

Drupal 7 vs Drupal 8 - Anonymous request to standard profile home page - XHProf call graph comparison
Call graphs for Drupal 7 (left) vs Drupal 8 (right) - click the links to download full size

Callgraphs allow you to visualize the flow of the code from function to function, and to easily identify areas of the code that are 'hotspots', which either take a long time to run or are called many, many times.

Just glancing at the callgraphs, you can see the difference in the way the page is rendered. In Drupal 7, Drupal's homegrown request routing/menu system efficiently chooses the proper menu callback, and most of the time is spent in regular expressions (that 'preg_grep' red box) during theming and rendering the page.

In Drupal 8, there is a bit of extra time spent routing the request to the proper handler and notifying subscribers of the current request and response flow [2], while about as much time as in Drupal 7 is spent theming and rendering the page. On top of that, since Drupal 8 has been architected in a more OOP way, especially with the splitting out of functionality into discrete PHP files, more time is spent scanning file data on each page load. This can be mitigated in some circumstances by disabling opcache's stat of each file on each page load, but even then, there is a lot of time spent in file_exists, Composer\Autoload\ClassLoader::findFileWithExtension, is_file, and filemtime.

In both cases, one of the most time-consuming tasks is retrieving data from the database; in Drupal 8, the front page took about 29ms grabbing data from MySQL, in Drupal 7, about 26ms—close enough to be practically the same. In most real-world scenarios, database access is a much larger portion of the page load, so the total page render times in real world usage are often a bit closer between Drupal 7 and Drupal 8. But even there, Drupal 8 adds in a tiny bit of extra time for its more flexible (and thus expensive) entity loading.

So Drupal 8's hot/minimal code path is verifiably slower than Drupal 7 in many small ways, due to additional function calls and object instantiation for Symfony integration, notification handling (on top of some remaining Drupal 7-style hooks) and time spent rummaging through the highly individual-file-per-class-heavy codebase. But does this matter for you? That can be a difficult question to answer.

You can download the full .xhprof reports below; if you want to view them in XHProf and generate your own callgraphs, you can do so by placing them in your XHProf output directory without the .txt extension:

Drupal 8 changes - more than just the architecture

Most Drupal 7 site builders feel quite at home in Drupal 8, especially considering many of the features that are baked into Drupal 8 core were the most popular components of many Drupal 7 sites—Views, Wysiwyg, entity relationships, etc. Already, just adding those modules (which are used on many if not most Drupal 7 sites) to a standard Drupal 7 site evens the playing field by a large margin, at least for uncached requests:

Drupal 7 vs Drupal 8 with D8 core modules included in Drupal 7
Drupal 7 and Drupal 8 authenticated requests are much more even when including all of D8's core functionality

It's rare to see a Drupal 7 site with less than ten or fifteen contributed modules; many sites have dozens—or even hundreds—of contributed modules that power the various admin and end-user-facing features that make a Drupal 7 site work well. Using real-world sites as examples, rather than clean-room Drupal installs, benchmarks between functionally similar Drupal 7 and Drupal 8 sites are often much closer (like the one above); though Drupal 7 still takes the raw performance crown per-page-request.

For the above D7 + D8 core module test, I ran the following drush command to get (most of) the modules that are in D8 core, enabled via the standard install profile:

drush en -y autoupload backbone bean breakpoints ckeditor date date_popup_authored edit email entityreference entity_translation file_entity filter_html_image_secure jquery_update link magic module_filter navbar phone picture resp_img save_draft strongarm transliteration underscore uuid variable views

So, Drupal 8 is slightly slower than a Drupal 7 site with a comparable suite of modules... excluding many of the amazing new features like Twig templating, built-in Wysiwyg and file upload integration, a better responsive design for everything, more accessibility baked in, and huge multilingual improvements—what else in Drupal 8 makes the raw PHP performance tradeoff worth it?

Easier and more robust caching for anonymous users

Varnish cache hit in Drupal 8

What's the best way to speed up any kind of dynamic CMS? To bypass it completely, using something like Varnish, Nginx caching, or a CDN acting as a caching or 'reverse' proxy like Fastly, CloudFlare or Akamai. In Drupal 7, all of these options were available, and could be made to work fairly easily. However, the elephant in the room was always how do you keep content fresh?

The problem was Drupal couldn't pass along any information with pages that were cached to help upstream reverse proxies intelligently cache the documents. You'd end up with dozens of custom-configured rules and a concoction of modules like Expire, Purge, and/or Varnish, and then you'd still have people who publish content on your site asking why their changes aren't visible on page XYZ.

In Drupal 8, cache tags are built into core and passed along with every page request. Cache tags allow reverse proxies to attach a little extra metadata to every page on the site (this doesn't need to be passed along to the client, since it's only for cacheability purposes), and then Drupal can intelligently say "expire any page where node:118292 appears". Then Varnish could add a ban rule that will mark any view, content listing, block, or other node where node:118292 appears as needing to be refreshed from the backend.
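The core idea is easy to model in a few lines of plain PHP. The toy cache below is only an illustration of tag-based invalidation; it is not Drupal's cache API or Varnish's ban mechanism, and the TaggedCache class and the paths and tags in the demo are made up for the example.

```php
<?php
// Illustration only: a toy in-memory cache, not Drupal's cache API.
class TaggedCache {

  /** @var array Cached items keyed by path: ['value' => ..., 'tags' => [...]]. */
  private $items = [];

  public function set(string $key, string $value, array $tags): void {
    $this->items[$key] = ['value' => $value, 'tags' => $tags];
  }

  public function get(string $key): ?string {
    return $this->items[$key]['value'] ?? NULL;
  }

  /**
   * Drops every cached item that was stored with the given tag.
   */
  public function invalidateTag(string $tag): void {
    foreach ($this->items as $key => $item) {
      if (in_array($tag, $item['tags'], TRUE)) {
        unset($this->items[$key]);
      }
    }
  }

}

$cache = new TaggedCache();
$cache->set('/node/118292', '<html>Full node</html>', ['node:118292']);
$cache->set('/', '<html>Front page listing</html>', ['node:118292', 'node:5']);
$cache->set('/about', '<html>About</html>', ['node:7']);

// Editing node 118292 invalidates every page it appears on, and nothing else.
$cache->invalidateTag('node:118292');

var_dump($cache->get('/'));
var_dump($cache->get('/about'));
```

Because only the pages tagged with the changed node are dropped, everything else can stay cached for as long as you like.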

Instead of setting extremely short TTLs (time to live) for content, meaning more requests to Drupal (and thus a slower average response time), you will be free to set TTLs much longer—for some sites, you could even set the cache TTL to days, weeks or longer, so Drupal is only really ever touched when new content is added or specific content is updated.

I wrote a very detailed article on how you can use cache tags with Varnish and the Purge module in Drupal 8; you can also more easily use Drupal 8 with CloudFlare, Fastly, and other CDNs and reverse proxies; for simple cases, you can use Drupal 8 with CloudFlare's free plan, like I did with my Raspberry Pi Dramble. Paid plans allow you to integrate more deeply and use cache tags effectively.

Faster for authenticated users and slow-loading content

If you need to support many logged in users (e.g. a community site/forum, or a site with many content editors), you know how difficult it is to optimize Drupal 6 or 7 for authenticated users; the Authcache module and techniques like Edge-Side Includes have been the most widely-adopted solutions, but if, like me, you've ever had to implement these tools on complex sites, you know that they are hard to configure correctly, and in some cases can cause slower performance while simultaneously making the site's caching layers harder to debug. Authenticated user caching is a tricky thing to get right!

In Drupal 8, because of the comprehensive cacheability metadata available for content and configuration, a new Dynamic Page Cache module is included in core. It works basically the same as the normal anonymous user page cache, but it uses auto-placeholdering to patch in the dynamic and uncacheable parts of the cached page. For many sites, this will make authenticated page requests an order of magnitude faster (and thus more scalable) in Drupal 8 than in Drupal 7, even though the raw Drupal performance is slightly slower.
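Conceptually, placeholdering means the expensive, shared part of a page is cached once, and only the small dynamic parts are rendered per request. Here is a minimal sketch of that idea in plain PHP; it is not Drupal's render system, and fill_placeholders() and the placeholder markup are invented for illustration.

```php
<?php
// Illustration only: not Drupal's render system or placeholder markup.

/**
 * Hypothetical helper: replaces named placeholders in a cached page with
 * freshly rendered output from per-request callbacks.
 */
function fill_placeholders(string $cachedPage, array $callbacks): string {
  $replacements = [];
  foreach ($callbacks as $name => $callback) {
    $replacements['<placeholder id="' . $name . '"></placeholder>'] = $callback();
  }
  return strtr($cachedPage, $replacements);
}

// The cacheable part of the page is stored once for all users.
$cachedPage = '<h1>Products</h1><placeholder id="welcome"></placeholder><p>Cached body.</p>';

// The user-specific part is rendered on every request.
echo fill_placeholders($cachedPage, [
  'welcome' => function () {
    return '<p>Welcome, Jeff!</p>';
  },
]) . PHP_EOL;
```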

That's well and good... but the end user still doesn't see the rendered page until Drupal is completely finished rendering the page and placing content inside the placeholders. Right? Well, Drupal 8.1 adds a new and amazing experimental feature modeled after Facebook's "BigPipe" tech:

BigPipe demonstration in an animated gif
BigPipe demo - click the above gif to play it again.

The image above illustrates how BigPipe can help even slow-to-render pages deliver usable content to the browser very quickly. If you have a block in a sidebar that pulls in some data from an external service, or only one tiny user-specific block (like a "Welcome, Jeff!" widget with a profile picture) that takes a half second or longer to render, Drupal can now serve the majority of a page immediately, then send the slower content when it's ready.
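In very rough terms, the delivery order looks like the sketch below: send the cached skeleton first, then send each slow part as soon as it has been rendered. This is only a conceptual sketch and not the BigPipe module's code; stream_page() and the template markup are hypothetical.

```php
<?php
// Illustration only: not the BigPipe module's code or wire format.

/**
 * Hypothetical helper: yields the page skeleton first, then one chunk per
 * slow part, rendering each slow part only when its turn comes.
 */
function stream_page(string $skeleton, array $slowParts): Generator {
  yield $skeleton;
  foreach ($slowParts as $id => $render) {
    yield '<template data-replaces="' . $id . '">' . $render() . '</template>';
  }
}

$chunks = stream_page('<h1>Products</h1><div id="welcome"></div>', [
  'welcome' => function () {
    // Stand-in for a slow, user-specific block.
    return '<p>Welcome, Jeff!</p>';
  },
]);

foreach ($chunks as $chunk) {
  echo $chunk . PHP_EOL;
  // Push each chunk towards the client as soon as it is ready.
  flush();
}
```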

To the end user, it's a night-and-day difference; users can start interacting with the page very quickly, and content seamlessly loads into other parts of the page as it is delivered. Read more about BigPipe in Drupal—it's currently labeled as an 'experimental' module in Drupal 8.1, and I'm currently poking and prodding BigPipe with Drupal VM.

Also, in case you're wondering, here's a great overview of the difference between ESI and BigPipe.

There are a few caveats with BigPipe—depending on your infrastructure's configuration, you may need to make some changes so BigPipe can stream the page correctly to the end user. Read BigPipe environment requirements for more information.

Only the beginning of what's possible

Drupal 8's architecture also allows for other innovative ways of increasing overall performance. One trend on the upswing is decoupled Drupal, where all or parts of the front-end of the site (the parts of the website your users see) are actually rendered via javascript either on a server, or on the client (or a mix of both). These decoupled sites have the potential to make more seamless browsing experiences, and can also make some types of sites perform much faster.

Caveat: Before decoupling, have a read through Dries Buytaert's recent (and very insightful) blog post: How should you decouple Drupal?

In Drupal 7, building a fully decoupled site was extremely difficult, as everything would need to work around the fact that Drupal < 8 was built mainly for generating HTML pages. Drupal 8's approach is to generate generic "responses". The default is to generate an HTML page... but it's much easier to generate JSON, XML, or other types of responses. And things like cacheability metadata are also flexible enough to work with any kind of response, so you can have a fully cacheable decoupled Drupal site if you want, without even having to install extra modules or hack around Drupal's rendering system, like you did in Drupal 7.



Another recent development is the RefreshLess module, which uses JavaScript and the HTML5 history API to make any Drupal 8 site behave more like a single-page app: if you click on a link, the page remains in place, but the URL updates, and the parts of the page that are different are swapped out seamlessly, using the cacheability data that's already baked into Drupal 8 and powers all the other awesome new caching tools!

On top of all that, we're still very early in Drupal 8's release cycle. Since Drupal is using semantic versioning for releases, new features and improvements can be added to minor releases (e.g. 8.1, 8.2, etc.), meaning as we see more of what's possible with BigPipe, Dynamic page cache, etc., we'll make even more improvements—maybe to the point where even the tiniest Drupal 8 page request is close to Drupal 7 in terms of raw PHP execution speed!

What are your thoughts and experiences with Drupal 8 performance so far?

1 Drupal 7's standard profile doesn't enable the anonymous page cache out of the box. You have to enable it manually on the Performance configuration page. This is one area where Drupal 8's initial out of the box experience is actually faster than Drupal 7. Additionally, Drupal 7's anonymous page cache was much less intelligent than Drupal 8's (any content update or comment posting in Drupal 7 resulted in the entire page cache emptying), meaning content updates and page caching in general are much less painful in Drupal 8.

2 One of the biggest contributors to the slower request routing performance is Drupal 8's use of Symfony components for matching routes, notifying subscribers, etc. chx's comment on the far-reaching nature of this change was prescient; much of Drupal's basic menu handling and access control had to be adapted to the new (less efficient, but more structured) Symfony-based routing system.

Mar 22 2016

Varnish cache hit in Drupal 8

Over the past few months, I've been reading about BigPipe, Cache Tags, Dynamic Page Cache, and all the other amazing-sounding new features for performance in Drupal 8. I'm working on a blog post that more comprehensively compares and contrasts Drupal 8's performance with Drupal 7, but that's a topic for another day. In this post, I'll focus on cache tags in Drupal 8, and particularly their use with Varnish to make cached content expiration much easier than it ever was in Drupal 7.

Purging and Banning

Varnish and Drupal have long had a fortuitous relationship; Drupal is a flexible CMS that takes a good deal of time (relatively speaking) to generate a web page. Varnish is an HTTP reverse proxy that excels at sending a cached web page extremely quickly—and scaling up to thousands or more requests per second even on a relatively slow server. For many Drupal sites, using Varnish to make the site hundreds or thousands of times faster is a no-brainer.

But there's an adage in programming that's always held true:

There are two hard things in computer science: cache invalidation, naming things, and off-by-one errors.

Cache invalidation is rightly positioned as the first of those two (three!) hard things. Anyone who's set up a complex Drupal 7 site with dozens of views, panels pages, panelizer layouts, content types, and configured Cache expiration, Purge, Acquia Purge, Varnish, cron and Drush knows what I'm talking about. There are seemingly always cases where someone edits a piece of content then complains that it's not updating in various places on the site.

The traditional answer has been to reduce the TTL for the caching; some sites I've seen only cache content for 30 seconds, or at most 15 minutes, because it's easier than accounting for every page where a certain type of content or menu will change the rendered output.

In Varnish, PURGE requests have been the de-facto way to deal with this problem for years, but it can be a complex task to purge all the right URLs... and there could be hundreds or thousands of URLs to purge, meaning Drupal (in combination with Purge/Acquia Purge) would need to churn through a massive queue of purge requests to send to Varnish.

Drupal 8 adds in a ton of cacheability metadata to all rendered pages, which is aggregated from all the elements used to build that page. Is there a search block on the page? There will be a config:block.block.bartik_search cache tag added to the page. Is the main menu on the page? There will be a config:system.menu.main cache tag, and so on.
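
The aggregation itself is simple to picture. The sketch below is plain Python, not Drupal code: the `tags` and `children` keys are a hypothetical stand-in for Drupal's render arrays, and `collect_cache_tags` is a made-up helper. It only shows how tags from nested elements bubble up into one set for the whole page:

```python
from typing import Any, Dict, Set


def collect_cache_tags(element: Dict[str, Any]) -> Set[str]:
    """Merge an element's own tags with the tags of all of its children."""
    tags = set(element.get("tags", []))
    for child in element.get("children", []):
        tags |= collect_cache_tags(child)
    return tags


# Hypothetical page: a node, a search block, and a menu that links to the node.
page = {
    "tags": ["node:1"],
    "children": [
        {"tags": ["config:block.block.bartik_search"]},
        {"tags": ["config:system.menu.main"], "children": [{"tags": ["node:1"]}]},
    ],
}

page_tags = sorted(collect_cache_tags(page))
```

Using a set means a tag that shows up in several places (like `node:1` here) is only listed once for the page.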

Adding this data to every page allows us to do intelligent cache invalidation. Instead of us having to tell Varnish which particular URLs need to be invalidated, when we update anything in the main menu, we can tell Varnish "invalidate all pages that have the config:system.menu.main cache tag", using a BAN instead of a PURGE. If you're running Varnish 4.x, all you need to do is add some changes to your VCL to support this functionality, then configure the Purge and Generic HTTP Purger modules in Drupal.

Whereas Varnish would process PURGE requests immediately, dropping cached pages matching the PURGE URL, Varnish can more intelligently match BAN requests using regular expressions and other techniques against any cached content. You have to tell Varnish exactly what to do, however, so there are some changes required in your VCL.
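
Here is a toy model of that difference in Python. The `TinyCache` class is hypothetical and much simpler than Varnish: it deletes matching entries right away, whereas Varnish keeps a list of bans and tests cached objects against it at lookup time or via the ban lurker. The sketch only illustrates "one URL" versus "everything matching an expression":

```python
import re
from typing import Dict


class TinyCache:
    """Toy cache keyed by URL; each entry remembers its cache tags header."""

    def __init__(self) -> None:
        self.entries: Dict[str, str] = {}

    def store(self, url: str, tags_header: str) -> None:
        self.entries[url] = tags_header

    def purge(self, url: str) -> int:
        """Drop exactly one URL, the way a PURGE request targets one page."""
        return 1 if self.entries.pop(url, None) is not None else 0

    def ban(self, tag_pattern: str) -> int:
        """Drop every entry whose tags header matches the regular expression."""
        matcher = re.compile(tag_pattern)
        doomed = [url for url, tags in self.entries.items() if matcher.search(tags)]
        for url in doomed:
            del self.entries[url]
        return len(doomed)


cache = TinyCache()
cache.store("/", "config:system.menu.main node:1")
cache.store("/about", "config:system.menu.main node:2")
cache.store("/feed.xml", "node_list")

# One ban removes every page that rendered the main menu.
banned = cache.ban(r"config:system\.menu\.main")
remaining = sorted(cache.entries)
```

With purging, the caller would have needed to know that both `/` and `/about` contain the menu; with the ban, it only needs to know which tag changed.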

Varnish VCL Changes

Borrowing from the well-documented FOSHttpCache VCL example, you need to make the following changes in your Varnish VCL (see the full set of changes that were made to Drupal VM's VCL template):

Inside of vcl_recv, you need to add some logic to handle incoming BAN requests:

sub vcl_recv {
    ...
    # Only allow BAN requests from IP addresses in the 'purge' ACL.
    if (req.method == "BAN") {
        # Same ACL check as above:
        if (!client.ip ~ purge) {
            return (synth(403, "Not allowed."));
        }

        # Logic for the ban, using the Purge-Cache-Tags header. For more info
        # see https://github.com/geerlingguy/drupal-vm/issues/397.
        if (req.http.Purge-Cache-Tags) {
            ban("obj.http.Purge-Cache-Tags ~ " + req.http.Purge-Cache-Tags);
        }
        else {
            return (synth(403, "Purge-Cache-Tags header missing."));
        }

        # Throw a synthetic page so the request won't go to the backend.
        return (synth(200, "Ban added."));
    }
}

The above code basically inspects BAN requests (e.g. curl -X BAN http://127.0.0.1:81/ -H "Purge-Cache-Tags: node:1"), then adds a new ban() if the request comes from an address in the purge ACL and the Purge-Cache-Tags header is present. In this case, the ban is set using a regex search inside each cached object's obj.http.Purge-Cache-Tags property. Using this property (on obj instead of req) allows Varnish's ban lurker to clean up bans more efficiently, so you don't end up with thousands (or millions) of stale ban entries. Read more about Varnish's ban lurker.
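
If you would rather script the BAN than use curl, the same request can be built with Python's standard library. This is a small sketch, not part of Drupal or Purge; `build_ban_request` is a hypothetical helper, and the URL and tag are the same example values used in the curl command above:

```python
import urllib.request


def build_ban_request(varnish_url: str, tag_expression: str) -> urllib.request.Request:
    """Build (but do not send) a BAN request carrying a cache tag expression."""
    return urllib.request.Request(
        varnish_url,
        method="BAN",
        headers={"Purge-Cache-Tags": tag_expression},
    )


request = build_ban_request("http://127.0.0.1:81/", "node:1")

# To actually send it (from an address in the purge ACL):
# with urllib.request.urlopen(request) as response:
#     print(response.status, response.reason)
```

Note that urllib stores the header name as `Purge-cache-tags`; HTTP header names are case-insensitive, so Varnish still sees it as `req.http.Purge-Cache-Tags`.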

Inside of vcl_backend_response, you can add a couple extra headers to help the ban lurker (and, potentially, allow you to make more flexible ban logic should you choose to do so):

sub vcl_backend_response {
    # Set ban-lurker friendly custom headers.
    set beresp.http.X-Url = bereq.url;
    set beresp.http.X-Host = bereq.http.host;
    ...
}

Then, especially for production sites, you should make sure Varnish doesn't pass along all the extra headers needed to make Cache Tags work (unless you want to see them for debugging purposes) inside vcl_deliver:

sub vcl_deliver {
    # Remove ban-lurker friendly custom headers when delivering to client.
    unset resp.http.X-Url;
    unset resp.http.X-Host;
    unset resp.http.Purge-Cache-Tags;
    ...
}

At this point, if you add these changes to your site's VCL and restart Varnish, Varnish will be ready to handle cache tags and expire content more efficiently with Drupal 8.

Drupal Purge configuration

First of all, so that external caches like Varnish know they are safe to cache content, you need to set a value for the 'Page cache maximum age' on the Performance page (admin/config/development/performance). You can configure Varnish or other reverse proxies under your control to cache for as long or short a period of time as you want, but a good rule-of-thumb default is 15 minutes—even with cache tags, clients cache pages based on this value until the user manually refreshes the page:

Set page cache maximum age

Now we need to make sure Drupal does two things:

  1. Send the Purge-Cache-Tags header with every response, containing a space-separated list of all the page's cache tags.
  2. Send a BAN request with the appropriate cache tags whenever content or configuration is updated that should expire pages with the associated cache tags.

Both of these can be achieved quickly and easily by enabling and configuring the Purge and Generic HTTP Purger modules. I used drush en -y purge purge_purger_http to install the modules on my Drupal 8 site running inside Drupal VM.

Purge automatically sets the http.response.debug_cacheability_headers property to true via its purge.services.yml, so Step 1 above is taken care of. (Note that if your site uses its own services.yml file, the http.response.debug_cacheability_headers setting defined in that file will override Purge's settings, so make sure it's set to true if you define settings via services.yml on your site!)

Note that you currently (as of March 2016) need to use the -dev release of Purge, at least until 8.x-3.0-beta4 or later is available, because the -dev release is the one that sets the Purge-Cache-Tags header properly.

For step 2, you need to add a 'purger' that will send the appropriate BAN requests using purge_purger_http: visit the Purge configuration page, admin/config/development/performance/purge, then follow the steps below:

  1. Add a new purger by clicking the 'Add Purger' button: Add Purger
  2. Choose 'HTTP Purger' and click 'Add': HTTP Purger
  3. Configure the Purger's name ("Varnish Purger"), Type ("Tag"), and Request settings (defaults for Drupal VM are hostname 127.0.0.1, port 81, path /, method BAN, and scheme http): Configure HTTP purger request settings
  4. Configure the Purger's headers (add one header Purge-Cache-Tags with the value [invalidation:expression]): Configure HTTP purger header settings

Testing cache tags

Now that you have an updated VCL and a working Purger, you should be able to do the following:

  1. Send a request for a page and refresh a few times to make sure Varnish is caching it:

    $ curl -s --head http://drupalvm.dev:81/about | grep X-Varnish
    X-Varnish: 98316 65632
    X-Varnish-Cache: HIT

  2. Edit that page, and save the edit.

  3. Run drush p-queue-work to process the purger queue:

    $ drush @drupalvm.drupalvm.dev p-queue-work
    Processed 5 objects...

  4. Send another request to the same page and verify that Varnish has a cache MISS:

    $ curl -s --head http://drupalvm.dev:81/about | grep X-Varnish
    X-Varnish: 47
    X-Varnish-Cache: MISS

  5. After the next request, you should start getting a HIT again:

    $ curl -s --head http://drupalvm.dev:81/about | grep X-Varnish
    X-Varnish: 50 48
    X-Varnish-Cache: HIT
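
If you want to automate that check, the header output is easy to parse. The sketch below is plain Python with hypothetical helper names (`parse_headers`, `is_cache_hit`, `varnish_ids`); it assumes your VCL sets an X-Varnish-Cache header like the one in the output above, which is not something Varnish does on its own:

```python
from typing import Dict, List


def parse_headers(raw: str) -> Dict[str, str]:
    """Turn 'Name: value' lines into a dict with lower-cased names."""
    headers = {}
    for line in raw.splitlines():
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def is_cache_hit(raw: str) -> bool:
    """Report a hit when the X-Varnish-Cache header says HIT."""
    return parse_headers(raw).get("x-varnish-cache", "").upper() == "HIT"


def varnish_ids(raw: str) -> List[str]:
    """Return the transaction IDs listed in the X-Varnish header."""
    return parse_headers(raw).get("x-varnish", "").split()


# Sample output copied from the steps above.
hit_output = "X-Varnish: 98316 65632\nX-Varnish-Cache: HIT"
miss_output = "X-Varnish: 47\nX-Varnish-Cache: MISS"
```

The X-Varnish header is a useful cross-check: a cached response carries two IDs (the current request and the request that originally stored the object), while a miss carries only one.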

You can also use Varnish's built-in tools like varnishadm and varnishlog to verify what's happening. Run these commands from the Varnish server itself:

# Watch the detailed log of all Varnish requests.
$ varnishlog
[wall of text]

# Check the current list of Varnish bans.
$ varnishadm
varnish> ban.list
200       
Present bans:
1458593353.734311     6    obj.http.Purge-Cache-Tags ~ block_view

# Check the current parameters.
varnish> param.show
...
ban_dups                   on [bool] (default)
ban_lurker_age             60.000 [seconds] (default)
ban_lurker_batch           1000 (default)
ban_lurker_sleep           0.010 [seconds] (default)
...
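
If you end up watching the ban list often, a few lines of Python can turn that output into something easier to inspect. This is only a sketch based on the sample output above: `parse_ban_list` is a hypothetical helper, the second column is assumed to be a reference count, and lines with extra flag columns are not handled specially.

```python
import re
from typing import List, NamedTuple


class Ban(NamedTuple):
    created: float   # Unix timestamp of when the ban was added
    refs: int        # second column; assumed to be a reference count
    expression: str  # the ban expression itself


BAN_LINE = re.compile(r"^(\d+\.\d+)\s+(\d+)\s+(.+)$")


def parse_ban_list(output: str) -> List[Ban]:
    """Extract ban entries from ban.list output, skipping status/header lines."""
    bans = []
    for line in output.splitlines():
        match = BAN_LINE.match(line.strip())
        if match:
            created, refs, expression = match.groups()
            bans.append(Ban(float(created), int(refs), expression.strip()))
    return bans


# Sample copied from the varnishadm session above.
sample = """200
Present bans:
1458593353.734311     6    obj.http.Purge-Cache-Tags ~ block_view
"""
bans = parse_ban_list(sample)
```

A steadily growing list here is usually the first sign that the ban lurker can't clean up your bans.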

If you're interested in going a little deeper into general Varnish debugging, read my earlier post, Debugging Varnish VCL configuration files.

Other notes and further reading

I spent a few days exploring cache tags, and how they work with Varnish, Fastly, CloudFlare, and other services with Drupal 8, as part of adding cache tag support to Drupal VM. Here are some other notes and links to further reading so you can go as deep as you want into cache tags in Drupal 8:

  • If you're building custom Drupal modules or renderable arrays, make sure you add cacheability metadata so all the cache tag magic just works on your site! See the official documentation for Cacheability of render arrays.
  • The Varnish module is actively being ported to Drupal 8, and could offer an alternative option for using cache tags with Drupal 8 and Varnish.
  • Read the official Varnish documentation on Cache Invalidation, especially regarding the effectiveness and performance of using Bans vs Purges vs Hashtwo vs. Cache misses.
  • There's an ongoing meta issue to profile and rationalize cache tags in Drupal 8, and the conversation there has a lot of good information about cache tag usage in the wild, caveats with header payload size and hashing, etc.
  • As mentioned earlier, if you have a services.yml file for your site, make sure you set http.response.debug_cacheability_headers: true inside (see note here).
  • Read more about Varnish bans
  • Read more about Drupal 8 cache tags
  • Read a case study of cache tags (with Fastly) dramatically speeding up a large Drupal 8 site.
  • Be careful with your ban logic in the VCL; you need to avoid using regexes on req to allow the ban lurker to efficiently process bans (see Why do my bans pile up?).
  • If you find Drupal 8's cache_tags database table is growing very large, please check out the issue Garbage collection for cache tag invalidations. For now, you can safely truncate that table from time to time if needed.
Mar 14 2016

I think today was my most Pi-full day, ever! Let's see:

Early in the morning, I finished upgrading all the Ansible playbooks used by the Raspberry Pi Dramble so my cluster of five Raspberry Pis would run faster and better on the latest version of the official Raspberry Pi OS, Raspbian Jessie.

Later, opensource.com published an article I wrote about using Raspberry Pis placed throughout my house to help my kids sleep better:

Raspberry Pi project to regulate room temperature and sleep better https://t.co/ikwRbS5wns by @geerlingguy pic.twitter.com/rXA1eWodIm

— Open Source Way (@opensourceway) March 14, 2016

Meanwhile, my wife brought home some tasty Dutch Apple pie, of which I had a slice at lunch:

#PiDay celebration continues… pic.twitter.com/W4HhAmigrT

— Jeff Geerling (@geerlingguy) March 14, 2016

Later, at 3:14 p.m. on 3/14, while I was in the 314 area code, I tweeted:

It’s 3:14 on 3/14 in the 314 #PiDay #stlouis

— Jeff Geerling (@geerlingguy) March 14, 2016

...and I was walking into our Micro Center to pick up a Raspberry Pi 3, which I'm going to benchmark like crazy for my Raspberry Pi Dramble Drupal 8 cluster, and for my Drupal Pi project:

Time to run some Drupal benchmarks on the Pi 3 I picked up from @microcenter on #PiDay #RaspberryPi pic.twitter.com/BwiOsSvLJR

— Jeff Geerling (@geerlingguy) March 14, 2016
Mar 14 2016

I've been supporting Drupal VM (a local Drupal CMS development environment) for Windows, Mac, and Linux for the past couple years, and have been using Vagrant and virtual machines for almost all my development (mostly PHP, but also some Python and Node.js at the moment) for the past four years. One theme that comes up quite frequently when dealing with VMs, open source software stacks (especially Drupal/LAMP), and development, is how much extra effort there is to make things work well on Windows.

Problem: tool-builders use Linux or Mac OS X

The big problem, as I see it, is that almost all the tool-builders for OSS web software run either Mac OS X or a flavor of Linux, and many don't even have access to a Windows PC (outside of maybe an odd VM for testing sites in Internet Explorer or Edge, if they're a designer/front-end developer). My evidence is anecdotal, but go to any OSS conference/meetup and you'll likely see the same.

When tool-builders don't use Windows natively, and in many cases don't even have access to a real Windows environment, you can't expect the tools they build to always play nice in that kind of environment. Which is why virtualization is almost an essential component of anyone's Windows development workflow.

However, that's all a bit of an aside leading up to the substance of this post: common issues with Windows-based development using virtual machines (e.g. VirtualBox, Hyper-V, VMware, etc.), and some solutions or tips for minimizing the pain.

As an aside, I bought a Lenovo T420 and stuck 2 SSDs and an eMMC card in it, then I purchased and installed copies of Windows 7, 8, and 10 on them so I could make sure the tools I build work at least decently well on Windows in multiple different environments. Many open source project maintainers aren't willing to fork over a $500+ outlay just to test in a Windows environment, especially since the Windows bugs often take 2-4x more time and cause many more grey hairs than similar bugs on Linux/Mac OS X.

Tips for more efficiency on Windows

First: if there's any way you can use Linux or Mac instead of Windows, you'll be in much less pain. This is not a philosophical statement, nor is it an anti-Microsoft screed. Almost all the modern web development toolkits are supported primarily (and sometimes only) for Linux/POSIX-like environments. Getting everything in a modern stack working together in Windows natively is almost never easy; getting things working within a VM is a viable but sometimes annoying alternative.

Second: if you can do all development within the VM, you'll be in somewhat less pain. If you can check out your codebase inside the VM's filesystem (and not use a synced folder), then edit files in the VM, or edit files in Windows through a mounted share (instead of using built-in shared folders), some of the pain will be mitigated.

Here are some of the things I've found that make development on Windows possible, and sometimes even enjoyable:

Working with symbolic links (symlinks)

You should not use Git Bash, Git Shell, or PowerShell when managing the Vagrant environment (there is a way, but it's excruciating). It's highly recommended to use either Cygwin (with openssh) or Cmder instead. There are additional caveats when you require symlink support:

  • You need to run Cygwin or Cmder as administrator (otherwise you can't create symlinks).
  • You have to do everything inside the VM (you can do git stuff outside, and edit code outside, but anything you need to do dealing with creating/changing/deleting symlinks should be done inside the VM).
  • If you touch the symlinks outside the VM, bad things will happen.
  • Symlinks only work with SMB or native shares (possibly rsync, too, but that's a bit harder to work with in my experience).
  • If you switch from native to SMB, or vice-versa, you have to rebuild all symlinks (symlinks between the two synced folder types are incompatible).
  • If you use SMB, you have to set custom mount options for Vagrant in the synced folder configuration, e.g.: mount_options: ["mfsymlinks,dir_mode=0755,file_mode=0755"]
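
A quick way to find out whether the shell you're in is actually allowed to create symlinks is to try it in a throwaway directory. This is a small Python sketch (the `can_create_symlinks` helper is hypothetical); it assumes Python is available wherever you run it, whether that's the Windows host or inside the VM:

```python
import os
import tempfile


def can_create_symlinks() -> bool:
    """Try to create a throwaway symlink and report whether it worked."""
    with tempfile.TemporaryDirectory() as workdir:
        target = os.path.join(workdir, "target.txt")
        link = os.path.join(workdir, "link.txt")
        with open(target, "w") as handle:
            handle.write("test")
        try:
            os.symlink(target, link)
        except (OSError, NotImplementedError):
            # On Windows this usually means the shell lacks the privilege.
            return False
        return os.path.islink(link)


symlinks_ok = can_create_symlinks()
```

If this returns False in Cygwin or Cmder on Windows, the first thing to check is whether the terminal was started as administrator.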

VirtualBox Guest Additions

Probably half the problems I've seen are due to outdated (or not-installed) VirtualBox Guest Additions. Many Vagrant box maintainers don't update their boxes regularly, and if this is the case, and you have a newer version of VirtualBox, all kinds of strange issues (ssh errors, synced folder errors, etc.) ensue.

I highly recommend (for Mac, Linux, and Windows) you install the vagrant-vbguest plugin: vagrant plugin install vagrant-vbguest

Delete a deep folder hierarchy (nested directories)

For many of the projects I've worked on, folder hierarchies can get quite deep, e.g. C:/Users/jgeerling/Documents/GitHub/myproject/docroot/sites/all/themes/mytheme/node_modules/dist/modulename/etc. Windows hates deep folder hierarchies, and if you try deleting a folder with such a structure either in Explorer or using rmdir in PowerShell, you'll likely end up with an error message (e.g. File name too long...).

To delete a folder with a deep hierarchy (many nested directories), you need to install robocopy (part of the Windows Server set of tools), then follow these directions to delete the directory.
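
If you have Python on the machine, there is another route that doesn't involve robocopy. This is a different technique from the one linked above, sketched under one assumption: that prefixing an absolute local path with `\\?\` lets Windows skip its usual path length limit. The `remove_deep_tree` helper is hypothetical, and it does not handle network (UNC) paths:

```python
import os
import shutil
import sys
import tempfile


def remove_deep_tree(path: str) -> None:
    """Delete a directory tree, opting in to long paths on Windows."""
    full_path = os.path.abspath(path)
    if sys.platform == "win32" and not full_path.startswith("\\\\?\\"):
        # The extended-length prefix asks Windows to skip the MAX_PATH limit.
        full_path = "\\\\?\\" + full_path
    shutil.rmtree(full_path)


# Demo: build a nested tree in a temp directory, then remove it.
base = tempfile.mkdtemp()
deep = os.path.join(base, *["nested"] * 20)
os.makedirs(deep)
remove_deep_tree(os.path.join(base, "nested"))
still_there = os.path.exists(os.path.join(base, "nested"))
os.rmdir(base)
```

On Mac and Linux the prefix is skipped and this is just a plain `shutil.rmtree` call.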

Node.js and npm problems

There are myriad issues running Node.js, NPM, and the ecosystem of associated build tools. It's hard enough keeping things straight with multiple Node versions and nvm on Mac/Linux... but toss in a Windows environment and most corporate networks/group policies, and you will also need to deal with:

  • If you have certain flavors of antivirus running, you might have trouble with Node.js and NPM.
  • If you are behind a corporate proxy, you will need to run a few extra commands to make sure bolt can do what it needs to do.

If you attempt to use Node/NPM within Windows, you should run Cygwin or Cmder as administrator, and possibly disable antivirus software. If working behind a proxy, you will also need to configure NPM to work with the proxy (in addition to configuring the VM/Linux in general to work behind the proxy):

$ npm config set proxy http://username:password@proxy-host:port/
$ npm config set https-proxy http://username:password@proxy-host:port/
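
One thing that commonly trips people up: if your username or password contains characters like @ or :, they need to be percent-encoded before they go into the proxy URL. Here's a small Python sketch for building the value; `proxy_url` is a hypothetical helper, and the host, port, and credentials are made-up examples:

```python
from urllib.parse import quote


def proxy_url(username: str, password: str, host: str, port: int) -> str:
    """Build an http proxy URL with percent-encoded credentials."""
    user = quote(username, safe="")
    secret = quote(password, safe="")
    return f"http://{user}:{secret}@{host}:{port}/"


# Made-up example values; substitute your own proxy details.
example = proxy_url("jdoe", "p@ss:word", "proxy.example.com", 8080)
```

The resulting string is what you would pass to both npm config set commands above.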

Intel VT-x virtualization

Many PC laptops (especially those from Lenovo, HP, and Dell) have Intel's VT-x virtualization turned off by default, which can cause issues with many Vagrant boxes. Check your computer manufacturer's knowledge base for instructions for enabling VT-x in your system BIOS/UEFI settings.

I have a Lenovo T420, and had to follow these instructions for enabling virtualization from Lenovo's support site.

Other Notes

I've also compiled a list of Windows tips and tricks in the Drupal VM project documentation: Drupal VM Docs - Windows Notes.

Summary

Developing for Drupal with Vagrant and VMs on Windows is possible—I've used Drupal VM and related projects with Windows 7, 8, and 10, with and without proxies, on a variety of hardware—but not optimal for all cases. If you keep running into issues like those listed above (or others), you might want to investigate switching your development environment to Linux or Mac OS X instead.

Feb 29 2016

I'm going to bring the Raspberry Pi Dramble with me to php[tek] on May 25 in St. Louis this year, and I'm hoping to also bring it with me to DrupalCon New Orleans in early May (I submitted the session Highly-available Drupal 8 on a Raspberry Pi Cluster, hopefully it's approved!).

Raspberry Pi model 3 B from Raspberry Pi Foundation

After this morning's official announcement of the Raspberry Pi model 3 B, I placed two orders with separate vendors as quickly as possible; I'm hoping I can get at least one or two to run some benchmarks and see where the Pi Dramble can get the most benefit from the upgraded ARMv8 processor (it's a 64 bit processor with a higher base clock speed than the model 2); I'm also going to see if any of the other small improvements in internal SoC architecture make an impact on real-world Drupal and networking benchmarks.

I also now have three Raspberry Pi Zeros that I'm working with to build a creative, battery-powered cluster for educational purposes, but without the quad core processor of the Pi 2/3, speed is a huge limitation in what this smaller cluster (it's tiny!) can do.

At a minimum, I'll have a slightly faster single Pi for running Drupal 8 / www.pidramble.com from home while the cluster is on the road, using the Drupal Pi project!

Feb 29 2016

With the rampant speculation there will be a new Raspberry Pi model released next week, I was wondering if the official Raspberry Pi blog might reveal anything of interest; they just posted a Four Years of Pi blog post on the 26th, which highlighted the past four years, and mentioned the excitement surrounding the 4th anniversary of Pi sales, coming up on February 29th, 2016.

Glancing at the blog's source, I noticed it looks like a WordPress blog (using httpie on the cli):

$ http https://www.raspberrypi.org/blog/four-years-of-pi/ | grep generator
<meta name="generator" content="WordPress 4.4.2" />

Having set up a few WP sites in the past, I knew there was a simple way to load content by its ID, using a URL in the form:

https://www.example.org/?p=[post-id-here]

Trying this on the RPi blog, I put in the post ID of the latest blog post, which is set as a class on the <body> tag: postid-20167: https://www.raspberrypi.org/?p=20167

This URL redirects to the pretty URL (yay for good SEO, at least :), so this means if I can put in other IDs, and get back valid pages (or just valid redirects), I can start enumerating the post IDs and seeing what I can find. Checking a few other IDs gets some interesting images, a few 404s with no redirects... and eventually, a 404 after a redirect, with a fairly large spoiler (well, not so large if you're following Raspberry Pi news almost anywhere this weekend!).

$ http head https://www.raspberrypi.org/?p=[redacted]
HTTP/1.1 301 Moved Permanently
Location: https://www.raspberrypi.org/[redacted]

If I load the 'Location' URL, I get:

$ http head https://www.raspberrypi.org/[redacted]
HTTP/1.1 404 Not Found

So... for SEO purposes, it's best to either drop the ?p=[id] format, or redirect it to the pretty URL. However, for information security, this redirect should only happen if the content is published, because it can lead (like we see here) to information disclosure.
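
In code, the safer behavior boils down to one extra condition. The sketch below is plain Python rather than WordPress code; `resolve_post_id`, the post IDs, and the slugs are all hypothetical. The point is that an unpublished post and a nonexistent post get exactly the same response:

```python
from typing import Dict, Optional, Tuple

Post = Dict[str, object]


def resolve_post_id(post_id: int, posts: Dict[int, Post]) -> Tuple[int, Optional[str]]:
    """Return (status, location) for a ?p=<id> style request."""
    post = posts.get(post_id)
    # Unknown and unpublished posts get the same answer, so the response
    # never reveals the slug of a draft.
    if post is None or not post["published"]:
        return 404, None
    return 301, f"/blog/{post['slug']}/"


# Hypothetical content: one published post, one draft.
posts = {
    101: {"slug": "published-post", "published": True},
    102: {"slug": "unannounced-product", "published": False},
}

published_result = resolve_post_id(101, posts)
draft_result = resolve_post_id(102, posts)
missing_result = resolve_post_id(999, posts)
```

Someone enumerating IDs can still tell which IDs redirect, but they can no longer read a draft's title out of the Location header.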

Other CMSes like Drupal and Joomla (or most any other CMS/custom CMS I've seen) can also suffer from the same enumeration problem, and I know at least for Drupal, there are tools like Username Enumeration Prevention for usernames and m4032404 for other content. Another way to work around this particular problem is to stage content in a non-production environment, and only have the content exist in production at all once it's published.

Note that enumeration by ID (posts, users, etc.) is not necessarily considered a security vulnerability (and it's really not... it's not like someone can hack your site using this attack). But it can lead to unwanted information leakage.

Feb 25 2016

After months of having this on my todo list, I've finally had the time to record a quick introduction video for Drupal VM. Watch the video below; a transcript follows:

Drupal VM is a local development environment for Drupal that's built with Vagrant and Ansible. It helps you build and maintain Drupal sites using best practices and the best tools. In this quick overview, I'll show you where you can learn more about Drupal VM, then show you a simple Drupal VM setup.

The Drupal VM website gives a general overview of the project and links to:

I'm going to build Drupal VM on my Mac using the Quick Start Guide.

First, download Vagrant using the link in the Quick Start Guide and install it on your computer. Vagrant will install the only other required application, VirtualBox, the first time you run it. (If you're on a Mac or Linux PC, you should also install Ansible for the best experience.)

Next, go back to the Drupal VM website, then download Drupal VM using the download link.

  • Copy the example.drupal.make.yml file to drupal.make.yml.
  • Then copy example.config.yml to config.yml, and make changes to suit your environment.

I removed some of the tools inside the installed_extras section since I don't need them for this demonstration.

Open your Terminal, and change directories into the Drupal VM directory using the cd command.

cd ~/Downloads/drupal-vm-master

Type in vagrant up, and after a few minutes, Drupal VM will be set up and ready for you to use.

Once the VM is built, visit http://dashboard.drupalvm.dev/ to see an overview of all the sites and software on your VM. Visit http://drupalvm.dev/ to see the Drupal 8 site that was automatically created.

At this point, after I'm finished working on my project, I can shut down the VM using vagrant halt, restart it with vagrant reload, or delete it and start over from scratch with vagrant destroy.

I'll be posting other videos demonstrating Drupal VM on Windows, Drupal VM with PHP 7, and how to use Drupal VM with existing Drupal sites, or multisite Drupal installs!

For more information about Drupal VM, visit the Drupal VM website at http://www.drupalvm.com/.

Feb 22 2016

Update: I just posted a new video about Drupal VM, Drupal VM - Quick Introduction, covering some of these new features!

I'm excited to announce the release of Drupal VM 2.3.0 "Miracle and Magician"—with over 21 new features and bugs fixed!

One of the most amazing improvements is the new Drupal VM dashboard; after you build Drupal VM, visit the VM's IP address to see all the sites, tools, and connection details in your local development environment:

Drupal VM 2.3.0 release - new dashboard UI

This feature was singlehandedly implemented by Oskar Schöldström—who also happens to have practically matched my commit activity for the past month or so. I'm pretty sure I owe him something like 100 beers at this point!

Here are some of the other great new features of Drupal VM in 2.3.0:

  • Greater stability on Windows and when Ansible is not installed on a Mac/Linux host.
  • Full test coverage using Docker containers on Travis CI—instead of just testing syntax, we're fully installing Drupal VM in two Ubuntu 14.04 and one CentOS 7 environment to test every aspect of Drupal VM.
  • Deployment to a DigitalOcean droplet - this is an experimental feature for now, but you can easily build a more secure Drupal VM instance on the Cloud using nothing but Ansible.
  • Vastly improved and expanded documentation
  • Support for a local Vagrantfile - this allows you to easily override the default Vagrant configuration if needed

For more information about this release, check out the 2.3.0 Release Notes.

Feb 22 2016

I'll be hosting a Reddit AMA on the Drupal subreddit tomorrow morning, Monday February 22, starting at 10 a.m. Eastern / 9 a.m. Central.

During the AMA, I would love to hear any questions you have about Drupal VM, Honeypot, Ansible, writing, open source communities, or really anything else you can think of! I just wrapped up a big project last week, so I'll have a couple hours tomorrow to talk about anything and everything with the Drupal community on Reddit. Even horse-sized ducks and Legos, if you're so inclined.

I'll also be formally announcing the next major release of Drupal VM, with some amazing new features for local Drupal development, so please check in tomorrow morning!

Feb 04 2016

I wanted to document this here just because it took me a little while to get all the bits working just right so I could have a hierarchical taxonomy display inside a Facet API search facet, rather than a flat display of only the taxonomy terms directly related to the nodes in the current search.

Basically, I had a search facet on a search page that allowed users to filter search results by a taxonomy term, and I wanted it to show the taxonomy's hierarchy:

Flat taxonomy to hierarchical taxonomy display using Search API Solr and Facet API in Drupal 7

To do this, you need to do two main things:

  1. Make sure your taxonomy field is being indexed with taxonomy hierarchy data intact.
  2. Set up the Facet API facet for this taxonomy term so it will display the full hierarchy.

Let's first start by making sure the taxonomy information is being indexed (refer to the image below):

Search API Solr index Filters configuration for hierarchical taxonomy

  1. In Search API's configuration, edit the Filters for the search index you're using (e.g. /admin/config/search/search_api/index/[index]/workflow).
    1. Make sure the 'Index hierarchy' checkbox is checked.
    2. In the 'Index hierarchy' Callback settings (which appear after you check the box in step 1), scroll down and make sure you select 'Parent terms' and 'All parent terms' under the Taxonomy type you need to display hierarchically.
  2. Save the Filters configuration, then reindex all the content on your site (otherwise Solr won't have the updated hierarchy information).

Next, we need to edit the Facet API facet for this taxonomy:

  1. Go to the taxonomy Facet's configuration page (e.g. /admin/config/search/facetapi/search_api%40[index]/block/field_release/edit).
  2. Check the 'Expand hierarchy' checkbox under 'Display settings' (near the top of the form).
  3. Set 'Treat parent items as individual facet items' to 'No'.
  4. Set 'Flatten hierarchy' to 'No'.
  5. Set 'Minimum facet count' to 0 (to show all terms in the taxonomy).

After you've done that (make sure you reindexed your content!), you should have a nice hierarchical facet display.

Jan 21 2016
Jan 21

In a prior post on the constraints of in-home website hosting, I mentioned one of the major hurdles to serving content quickly and reliably over a home Internet connection is the bandwidth you get from your ISP. I also mentioned one way to mitigate the risk of DoSing your own home Internet is to use a CDN and host images externally.

At this point, I have both of those things set up for www.pidramble.com (a Drupal 8 site hosted on a cluster of Raspberry Pis in my basement!), and I wanted to outline how I set up Drupal 8 and CloudFlare so almost all requests to www.pidramble.com are served through CloudFlare directly to the end user!

CloudFlare Configuration

Before anything else, you need a CloudFlare account; the free plan offers all the necessary features (though you should consider upgrading to a better plan if you have anything beyond the simplest use cases in mind!). Visit the CloudFlare Plans page and sign up for a Free account.

Once there, you can add your site and use all the default settings for security, SSL, DNS, etc. You'll have to configure your website's DNS to point to CloudFlare, then CloudFlare will have some DNS records that point to your 'origin' (the server IP where your Drupal 8 site is running).

After all that's done, go to the Caching section and choose the 'Standard' level of caching, as well as 'Always Online' (so CloudFlare keeps your static site up even if your server goes down).

The most important part of the configuration is adding 'Page Rules', which will allow you to actually enable the cache for certain paths and bypass cache for others (e.g. site login and admin pages). Free accounts are limited to only 3 rules, so we have to be a bit creative to make the site fully cached but not accidentally lock ourselves out of it!

We'll need to add three rules total:

  1. A rule to 'cache everything' on www.pidramble.com/*
  2. A rule to 'bypass cache' on www.pidramble.com/user/login (allows us to log into the site)
  3. A rule to 'bypass cache' on www.pidramble.com/admin/* (allows content management and administration)

The free account 3-rule limitation means that we have to do a little trickery to bypass the cache on non-admin paths when we're working on the site. Otherwise, our options would be to have some sort of alternate URL for editing (e.g. edit.example.com) that bypasses CloudFlare, or turn off caching entirely while doing development work through the CloudFlare-powered URL!

One major downside to this approach: URLs like node/[id]/edit, if accessed by someone who is not logged in, will be cached in CloudFlare as a '403 - Access Denied' page, and then you won't be able to edit that content (even when logged in) unless you purge that path from CloudFlare or use one of the workarounds mentioned above.

www.pidramble.com CloudFlare caching rules for Drupal 8

For the three rules, set the following options (only the non-default options you should change are shown here):

'cache everything' on /*:

  • Custom caching: Cache everything
  • Edge cache expire TTL: Respect all existing headers
  • Browser cache expire TTL: 1 hour (adjust as you see fit)

'bypass cache' on /user/login:

  • Custom caching: Bypass cache
  • Browser cache expire TTL: 4 hours (adjust as you see fit)

'bypass cache' on /admin/*:

  • Custom caching: Bypass cache
  • Browser cache expire TTL: 4 hours (adjust as you see fit)
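To make the combined effect of the three rules concrete, here's a rough sketch in shell. It's only a model of the intended outcome: the page_rule_for function is hypothetical, it says nothing about how CloudFlare evaluates rules internally, and it assumes the two bypass patterns take precedence over the catch-all.

```shell
# Hypothetical model of the three page rules described above.
# Assumption: the specific bypass patterns win over the catch-all rule.
page_rule_for() {
  case "$1" in
    /user/login) echo "bypass cache" ;;      # rule 2: allows logging in
    /admin/*)    echo "bypass cache" ;;      # rule 3: content management
    *)           echo "cache everything" ;;  # rule 1: everything else
  esac
}

page_rule_for /admin/content   # prints "bypass cache"
page_rule_for /node/1/edit     # prints "cache everything"
```

Note that /node/1/edit falls through to 'cache everything', which is exactly the downside with node edit paths described earlier.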

Drupal 8 Configuration

To make sure CloudFlare (or any other reverse proxy you use) caches your Drupal site pages correctly, you need to make the following changes to your Drupal 8 site:

  1. Make sure the 'Internal Page Cache' module is enabled.
  2. Set a 'Page cache maximum age' on the Performance configuration page (/admin/config/development/performance).
  3. Add a few options to tell Drupal about your reverse proxy inside your settings.php file:

Inside sites/default/settings.php, add the following configuration to tell Drupal it is being served from behind a reverse proxy (CloudFlare), and also to make sure the trusted_host_patterns are configured:

<?php
// Reverse proxy configuration.
$settings['reverse_proxy'] = TRUE;
$settings['reverse_proxy_addresses'] = array($_SERVER['REMOTE_ADDR']);
$settings['reverse_proxy_header'] = 'HTTP_CF_CONNECTING_IP';
$settings['omit_vary_cookie'] = TRUE;

// Trusted host settings.
$settings['trusted_host_patterns'] = array(
  '^pidramble\.com$',
  '^.+\.pidramble\.com$',
);
?>

Once you've added this configuration, open another browser or an incognito browser session so you can access your site as an anonymous user. Click around on a few pages so CloudFlare gets a chance to cache your pages.

You can check that pages are being served correctly by CloudFlare by checking the HTTP headers returned by a request. The quickest way to do this is using curl --head in your terminal:

$ curl -s --head http://www.pidramble.com/ | grep CF
CF-Cache-Status: HIT
CF-RAY: 25bf2a08a7f425a3-ORD

If you see a value of HIT for the CF-Cache-Status, that means CloudFlare is caching the page. You should also notice it loads very fast now; for this site, I'm seeing the page load in < .3 seconds when cached through CloudFlare; it takes almost twice as long without CloudFlare caching!
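If you'd rather script that check than eyeball the grep output, here's a minimal sketch. The cf_cache_status function name is made up for illustration (it isn't part of curl or CloudFlare); it reads response headers on standard input and prints the CF-Cache-Status value, or NONE if the header is missing.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print the
# CF-Cache-Status value, or "NONE" when the header is absent.
cf_cache_status() {
  awk -F': *' '
    tolower($1) == "cf-cache-status" {
      gsub(/\r/, "", $2)   # strip the trailing carriage return from the header
      print $2
      found = 1
      exit
    }
    END { if (!found) print "NONE" }
  '
}

# Usage (needs network access, so it is left commented out here):
# curl -s --head http://www.pidramble.com/ | cf_cache_status
```

A result other than HIT on a page you expect to be cached is a hint to re-check the page rules and the Drupal cache settings above.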

Jan 16 2016
Jan 16

tl;dr: Drupal VM 2.2.0 'Wormhole' was released today, and it adds even more features for local dev!

Over the past few months, I've been working towards a more reliable release cadence for Drupal VM, and I've targeted one or two large features, a number of small improvements, and as many bugfixes as I have time to review. The community surrounding Drupal VM's development has been amazing; in the past few months I've noticed:

  • Lunchbox, a new Node.js-based app wrapper for Drupal VM for managing local development environments.
  • A mention of using Drupal VM + docker-selenium for running Behat tests with Chrome or Firefox, complete with automatic screenshots of test steps!
  • A great discussion about using Drupal VM with teams in the issue queue, along with a PR with some ideas in code.
  • A total of 27 individual contributors to Drupal VM (who have helped me work through 307 issues and 77 pull requests), along with hundreds of contributors for the various Ansible roles that support it.

Drupal VM is the fruit of a lot of open-source effort, and one of the things that I'm most proud of is the architecture—whereas many similar projects (whether they use Docker, Vagrant, or locally-installed software) maintain an 'island' of roles/plugins/configuration scripts within one large project, I decided to build Drupal VM on top of a few dozen completely separate Ansible roles, each of which serves an independent need, can be used for a variety of projects outside of Drupal or PHP-land, and is well tested, even in some cases on multiple platforms via Travis CI and Docker.

For example, the Apache and Nginx roles that Drupal VM uses are also used for many individuals' and companies' infrastructure, even if they don't even use PHP! I'm happy to see that even some other VM-based Drupal development solutions use some of the roles as a foundation, because by sharing a common foundation, all of our tooling can benefit. It's kind of like Drupal using Twig, which benefits not only our community, but all the other PHP developers who are used to Twig!

If you want to kick the tires on Drupal VM (want to test Drupal 8 with Redis, PHP 7, Nginx, and MariaDB, or easily benchmark Drupal 8 on PHP 7 and HHVM?), follow the Quick Start Guide and let me know how it goes!

Dec 30 2015
Dec 30

I spent about an hour yesterday debugging a Varnish page caching issue. I combed the site configuration and code for anything that might be setting cache to 0 (effectively disabling caching), I checked and re-checked the /admin/config/development/performance settings, verifying the 'Expiration of cached pages' (page_cache_maximum_age) had a non-zero value and that the 'Cache pages for anonymous users' checkbox was checked.

After scratching my head a while, I realized that the headers I was seeing when using curl --head [url] were specified as the defaults in drupal_page_header(), and were triggered any time there was a message displayed on the page (e.g. via drupal_set_message()):

X-Drupal-Cache: MISS
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
X-Content-Type-Options: nosniff

On this particular site, the error_level was set to 1 to show all errors on the screen, and the page in question had a PHP error displayed on every page load.

After setting error_level to 0 ('None' on the /admin/config/development/logging page), Drupal sent the correct cache headers, Varnish was able to cache the page, and my sanity was restored.

Kudos especially to this post on coderwall, which jogged my memory.

Other potential reasons a page might not be showing as cacheable:

  • A form with a unique per-user token may be present.
  • An authenticated user is viewing the page (Drupal by default marks any page view with a valid session as no-cache).
  • Some code is calling \Drupal::service('page_cache_kill_switch')->trigger() (Drupal 8) or drupal_page_is_cacheable(FALSE) (Drupal 7).
  • Some configuration file that's being included is either setting cache or page_cache_maximum_age to 0.
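When chasing down any of these causes, it helps to test the headers the same way every time. Here's a minimal sketch: the headers_cacheable function is a hypothetical helper (not part of Drupal, Varnish, or curl), and treating no-cache, no-store, and private as 'not cacheable' is my own simplification of what a reverse proxy looks at.

```shell
# Hypothetical helper: read response headers on stdin; return 0 when a
# Cache-Control header is present and does not forbid shared caching,
# and 1 otherwise.
headers_cacheable() {
  awk -F': *' '
    tolower($1) == "cache-control" {
      seen = 1
      if (tolower($2) ~ /no-cache|no-store|private/) blocked = 1
    }
    END { exit ((seen && !blocked) ? 0 : 1) }
  '
}

# Usage (replace [url] with the page you are debugging):
# curl -s --head [url] | headers_cacheable && echo cacheable || echo "not cacheable"
```

The headers shown above (Cache-Control: no-cache, must-revalidate, ...) would be reported as not cacheable by this check.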
Dec 23 2015
Dec 23

[Multiple updates: I've added results for concurrencies of 1 and 10, results on bare metal vs. VMware instances, tested Drupal 8 vs Drupal 7 vs Wordpress 4.4, and I've also retested every single benchmark at least twice! Please make sure you've read through the entire post prior to contesting these benchmark results!]

tl;dr: Always test your own application, and trust, but verify every benchmark you see. PHP 7 is actually faster than HHVM in many cases, neck-and-neck in others, and slightly slower in others. Both PHP 7 and HHVM blow PHP ≤ 5.6 out of the water.

Introduction and Methodology

As PHP 7 became a reality through this past year, there were scores of benchmarks pitting PHP 7 against 5.6 and HHVM using applications and frameworks like Drupal, Wordpress, Joomla, Laravel, October, etc.

One benchmark that really stood out to me (in that it seemed so wrong for Drupal, based on my experience) was The Definitive PHP 7.0 & HHVM Benchmark from Kinsta. Naming a benchmark that way certainly makes the general PHP populace take it seriously!

The results are pretty damning for PHP 7:

PHP 7 HHVM Definitive Benchmark screenshot by Kinsta

In the comments on that post, Thomas Svenson mentioned:

Standard installation for Drupal 8 has cache on as default. If you did not turn that off, then it is probably a reason to why the PHP 7 boost isn't bigger.

Would be interesting to see the result comparing the benchmark with/without caching enabled in Drupal 8. Should potentially reveal something interesting.

This was my main concern too, as there wasn't enough detail in the benchmarking article to determine what exactly was the system under test. Therefore, I'll submit my own PHP 7 vs HHVM benchmark here, using the following versions:

  • Ubuntu 14.04
  • Drupal 8.0.1
  • Nginx 1.4.6
  • MySQL 5.5.46
  • PHP 5.6.16, PHP 7.0.1, or HHVM 3.11.0

All tests were run using Drupal VM version 2.1.2 with VMware Fusion 8.1.0, on my mid-2013 MacBook Air 13" 1.7 GHz i7 with 8GB of RAM. Using the above notes, you can exactly replicate this benchmarking environment should you desire. All tests were run five times, the first two results were discarded (because they often reflect times when some caches are still warming), and the latter three were averaged.

After installing Drupal 8.0.1 with the standard installation profile (this is done automatically by Drupal VM), I logged in as the admin user (user 1), then grabbed the admin user's session cookie, and ran the following two commands:

# Benchmark Drupal 8 home page out of the box with default caching options enabled.
ab -n 750 -c 10 http://drupalvm.dev/

# Benchmark Drupal 8 /admin page logged in as user 1.
ab -n 750 -c 10 -C "SESSxyz=value" http://drupalvm.dev/admin
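The 'run five times, discard the first two, average the rest' procedure can be sketched as a pair of small shell helpers. Both function names are hypothetical, and the sketch assumes ab's usual 'Requests per second:' report line.

```shell
# Hypothetical helper: pull the mean requests/second figure out of ab's report
# (the line looks like "Requests per second:    214.39 [#/sec] (mean)").
requests_per_second() {
  awk '/^Requests per second:/ { print $4 }'
}

# Hypothetical helper: read one result per line on stdin, skip the first two
# (cache warm-up runs), and print the mean of the remaining ones.
average_after_warmup() {
  awk 'NR > 2 { sum += $1; n++ } END { if (n) printf "%.2f\n", sum / n }'
}

# Usage against the VM (left commented out because it needs ab and a running site):
# for run in 1 2 3 4 5; do
#   ab -n 750 -c 10 http://drupalvm.dev/ | requests_per_second
# done | average_after_warmup
```

Keeping the parsing and the averaging separate makes it easy to log the raw per-run numbers as well, which helps when a result looks suspicious.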

Drupal 8 results (concurrency 10)

| Environment | D8 Caching | Requests/second | Percent difference |
| --- | --- | --- | --- |
| PHP 5.6.16 | Enabled (home, anonymous) | 214.39 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 407.10 req/s | 62% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 260.19 req/s | 19% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 20.09 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 39.26 req/s | 65% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 34.41 req/s | 53% faster than 5.6 |

...and some graphs of the above data:

PHP 5.6, PHP 7, and HHVM running Drupal 8.0.1, cached

PHP 5, PHP 7, HHVM benchmark cached Drupal 8 home page request

PHP 5.6, PHP 7, and HHVM running Drupal 8.0.1, uncached

PHP 5, PHP 7, HHVM benchmark uncached Drupal 8 admin request

Drupal 8 results (concurrency 1)

Sometimes, the use of concurrency (-c 10 in the above case) to simulate concurrent users hitting the site at the same time can cause benchmarks to be slightly inaccurate. The reason I usually use a level of concurrency is so the benchmark more closely mirrors real-world usage, and tests the full stack a little better (because PHP by itself is nice to benchmark, but very few sites are run on top of PHP alone!).

Anyways, I re-ran all the tests using -c 1, and am publishing the results below:

| Environment | D8 Caching | Requests/second | Percent difference |
| --- | --- | --- | --- |
| PHP 5.6.16 | Enabled (home, anonymous) | 171.34 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 242.00 req/s | 34% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 192.92 req/s | 12% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 19.89 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 30.07 req/s | 41% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 23.37 req/s | 16% faster than 5.6 |

In all my benchmarking, I care more about deltas and reproducibility than measuring raw, clean-room-scenario performance, because unless a result is absolutely reproducible, it's of no value to me. Therefore if I can prove that there's no particular difference to testing with certain concurrency levels, I typically move the benchmark to a level that mirrors traffic patterns I actually see on my sites :)

Absolute numbers mean nothing to me—it's the comparison between test A and test B, and how reproducible that comparison is, that matters. That's why I enjoy benchmarking on the incredibly slow Raspberry Pi model 2 sometimes, because though it's much slower than my i7 laptop, it sometimes exposes surprising results!
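A note on reading the 'Percent difference' column in these tables: the figures appear to be symmetric percent differences (the gap divided by the mean of the two values) rather than the speedup relative to PHP 5.6. That's my inference from the numbers, not something stated explicitly, and the function names below are made up for illustration. A minimal sketch of both calculations:

```shell
# Symmetric percent difference between a baseline and a new measurement:
# |cur - base| / ((cur + base) / 2) * 100, rounded to a whole percent.
percent_difference() {
  awk -v base="$1" -v cur="$2" 'BEGIN {
    diff = cur - base
    if (diff < 0) diff = -diff
    printf "%.0f\n", diff / ((cur + base) / 2) * 100
  }'
}

# Conventional relative change, for comparison: (cur - base) / base * 100.
relative_change() {
  awk -v base="$1" -v cur="$2" 'BEGIN {
    printf "%.0f\n", (cur - base) / base * 100
  }'
}

# PHP 5.6.16 vs PHP 7.0.1, cached home page, concurrency 10:
percent_difference 214.39 407.10   # prints 62
relative_change 214.39 407.10      # prints 90
```

So the 62% in the first table corresponds to PHP 7 serving roughly 90% more requests per second than PHP 5.6 in that test; either way of expressing it is fine as long as you compare like with like.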

Drupal 8 Results ('bare metal', concurrency 1)

Some people argue that running benchmarks in a VM is highly unreliable and leads to incorrect benchmarks, so I've also sacrificed a partition of a Lenovo T420 core i5 laptop (it has 3 SSDs inside, so I just formatted one), installed Ubuntu desktop 15.10, then installed PHP, MySQL, and Nginx exactly the same as with Drupal VM (same settings, same apt repos, etc.), and re-ran all the tests in that environment: so-called 'bare metal', where there's absolutely no overhead from shared filesystems, the hypervisor, etc.

| Environment | D8 Caching | Requests/second | Percent difference |
| --- | --- | --- | --- |
| PHP 5.6.16 | Enabled (home, anonymous) | 152.35 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 230.67 req/s | 41% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 142.50 req/s | 7% slower than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 11.37 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 13.13 req/s | 14% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 11.40 req/s | 0.3% faster than 5.6 |

After running these benchmarks with an identical environment on 'bare metal' (e.g. a laptop with a brand new/fresh install of Ubuntu 15.10 running the same software, with 8 GB of RAM and an SSD), it seems HHVM for some reason performed even worse than PHP 5.6.16 for Drupal 8.

Since this result is wildly different than the Kinsta post (basically the opposite of their results for Drupal 8), I decided to test Wordpress 4.4 as well.

Wordpress 4.4 Results ('bare metal', concurrency 1)

For Wordpress, I ran the test using the exact same Lenovo T420 environment as the test above, and tested an anonymous user (no cookie value) hitting the default home page, and an admin logged in (using a valid session cookie; actually all five of the cookies Wordpress uses to track valid sessions) visiting the admin Dashboard page (/wp-admin/index.php).

| Environment | WP 4.4 Caching | Requests/second | Percent difference |
| --- | --- | --- | --- |
| PHP 5.6.16 | Enabled (home, anonymous) | 18.76 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 40.45 req/s | 73% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 40.14 req/s | 73% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/wp-admin/index.php, admin) | 13.45 req/s | ~ |
| PHP 7.0.1 | Bypassed (/wp-admin/index.php, admin) | 28.10 req/s | 71% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/wp-admin/index.php, admin) | 35.43 req/s | 90% faster than 5.6 |

...and some graphs of the above data:

PHP 5.6, PHP 7, and HHVM running Wordpress 4.4, anonymous home page

PHP 5 7 and HHVM benchmark comparison of Wordpress 4.4 home page anonymous

PHP 5.6, PHP 7, and HHVM running Wordpress 4.4, admin dashboard

PHP 5 7 and HHVM benchmark comparison of Wordpress 4.4 admin dashboard

These results highlight to me how much the particular project's architecture influences the benchmark. Wordpress still uses a traditional quasi-functional-style design, while Drupal 8 is heavily invested in OOP and a bit more formal data architecture. While I'm not as familiar with Wordpress's quirks as I am with Drupal's, I know that it's no speed demon, and also benefits from added caching layers in front of the site! It's interesting to see that PHP 7 and HHVM are practically neck-and-neck for front-facing portions of Wordpress (and FAR faster than 5.6), while HHVM runs even a little faster than PHP 7 for administrative tasks.

Drupal 7 Results (concurrency 10)

I also benchmarked Drupal 7 on Drupal VM for another point of comparison (using -c 10):

| Environment | D7 Caching | Requests/second | Percent difference |
| --- | --- | --- | --- |
| PHP 5.6.16 | Enabled (home, anonymous) | 511.40 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 736.90 req/s | 36% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 585.71 req/s | 14% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 93.78 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 169.95 req/s | 57% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 143.25 req/s | 42% faster than 5.6 |

For these tests, I went to the Performance configuration page prior to running the tests, and enabled anonymous page cache, block cache, and CSS and JS aggregation (to make D7 match up to D8 cached anonymous user results a little more evenly).

Some people point out benchmarks like these and say "Drupal 8 is slow"... and they're right, of course. But Drupal 8 trades performance for better architecture, much more pluggability, and the inclusion of many more essential 'out-of-the-box' features than Drupal 7, so there's that. Having built a few Drupal 8 sites, I don't ever want to go back to 7 again—but it's nice to know that PHP 7 can still accelerate all my existing D7 sites quite a bit!

Summary

tl;dr: For Drupal 7 and Drupal 8 at least, PHP 7 takes the performance crown—by a wide margin.

After running the benchmarks, I scratched my head, because almost every other benchmark I've seen either puts HHVM neck-and-neck with PHP 7 or makes it seem HHVM is still the clear victor. Maybe other people running these benchmarks didn't have PHP's opcache turned on? Maybe something else was missing? Not sure, but if you'd like to reproduce the SUT and find any results different than the above (in terms of percentages), please let me know!

I ran the HHVM benchmarks three times with fresh new VM instances just because I was surprised PHP 7 stepped out in front. PHP 5.6's performance is as expected... it's better than 5.3, but that's not saying much :)

The moral of the story: Trust, but verify... especially for benchmarks which compare a plethora of totally different applications, each result can tell a completely different story depending on the test process and system under test! Please run your own tests with your own application before definitively stating that one server is faster than another.

Installing HHVM in Drupal VM

Just for posterity, since I want people to be able to reproduce the steps exactly, here's the process I used after using Drupal VM's default config.yml (with Ubuntu 14.04) to build the VM:

  1. Log into Drupal VM with vagrant ssh
  2. $ sudo su
  3. # service php5-fpm stop
  4. # apt-get install -y python-software-properties
  5. # curl http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add -
  6. # add-apt-repository http://dl.hhvm.com/ubuntu
  7. # apt-get update && apt-get install -y hhvm
  8. # update-rc.d hhvm defaults
  9. # /usr/share/hhvm/install_fastcgi.sh
  10. # vi /etc/nginx/sites-enabled/drupalvm.dev.conf and inside the location ~ \.php$|^/update.php block:
    1. Clear out the contents of this configuration block.
    2. Replace with include hhvm.conf;
  11. # service hhvm restart
  12. # service nginx restart

Visit the /admin/reports/status/php page after logging in to confirm you're running HHVM instead of PHP.

Dec 15 2015
Dec 15

One of the motivations behind Drupal VM is flexibility in local development environments. When you develop many different kinds of Drupal sites you need to be able to adapt your environment to the needs of the site—some sites use Memcached and Varnish, others use Solr, and yet others cache data in Redis!

Drupal VM has recently gained much more flexibility in that it now allows configuration options like:

  • Choose either Ubuntu or CentOS as your operating system.
  • Choose either Nginx or Apache as your webserver.
  • Choose either MySQL or MariaDB for your database.
  • Choose either Memcached or Redis as a caching layer.
  • Add on extra software like Apache Solr, Node.js, Ruby, Varnish, Xhprof, and more.

Out of the box, Drupal VM installs Drupal 8 on Ubuntu 14.04 with PHP 5.6 (the most stable release as of December 2015) and MySQL. We're going to make a few quick changes to config.yml so we can run the following local development stack on top of CentOS 7:

Drupal VM - Drupal 8 status report page showing Nginx, Redis, MariaDB, and PHP 7

Configure Drupal VM

To get started, download or clone a copy of Drupal VM, and follow the Quick Start Guide, but before you run vagrant up (step 2, #6), edit config.yml and make the following changes/additions:

# Update vagrant_box to use the geerlingguy/centos7 box.
vagrant_box: geerlingguy/centos7

# Update drupalvm_webserver to use nginx instead of apache.
drupalvm_webserver: nginx

# Make sure 'redis' is listed in installed_extras, and memcached, xhprof, and
# xdebug are commented out.
installed_extras:
  [ ... ]
  - redis

# Switch the PHP version to "7.0".
php_version: "7.0"

# Add the following variables to the end of the file to make sure the PhpRedis
# extension is compiled to run with PHP 7.
php_redis_install_from_source: true
php_redis_source_version: php7

# Add the following variables to the 'MySQL Configuration' section to make sure
# the MariaDB installation works correctly.
mysql_packages:
  - mariadb
  - mariadb-server
  - mariadb-libs
  - MySQL-python
  - perl-DBD-MySQL
mysql_daemon: mariadb
mysql_socket: /var/lib/mysql/mysql.sock
mysql_log_error: /var/log/mariadb/mariadb.log
mysql_syslog_tag: mariadb
mysql_pid_file: /var/run/mariadb/mariadb.pid

To make Drupal use Redis as a cache backend, you have to include and enable the Redis module on your site. The official repository on Drupal.org doesn't currently have a Drupal 8 branch, but there's a fork on GitHub that currently works with Drupal 8. We need to add that module to the drupal.make.yml make file. Add the following just after the line with devel:

  devel: "1.x-dev"
  redis:
    download:
      type: git
      url: https://github.com/md-systems/redis.git
      branch: 8.x-1.x

Run vagrant up, and wait for everything to install inside the VM. After a bit, you can visit http://drupalvm.dev/, and log in (username admin and password admin). Go to the 'Extend' page, and enable the Redis module.

Once the module is enabled, you'll need to follow the Redis module's installation guide to make Drupal actually use Redis instead of MariaDB for persistent caching. The basic steps are:

  1. Create a new file services.yml inside the Drupal 8 codebase's sites/default folder, with the following contents:

    services:
      cache_tags.invalidator.checksum:
        class: Drupal\redis\Cache\RedisCacheTagsChecksum
        arguments: ['@redis.factory']
        tags:
          - { name: cache_tags_invalidator }

  2. Open sites/default/settings.php and add the following to the end of the file:

    $settings['redis.connection']['interface'] = 'PhpRedis';
    $settings['redis.connection']['host'] = '127.0.0.1';
    $settings['cache']['default'] = 'cache.backend.redis';

Once you've made those changes, go to the performance page (http://drupalvm.dev/admin/config/development/performance) and click the 'Clear all caches' button. If you log into the VM (vagrant ssh), then run the command redis-cli MONITOR, you can watch Drupal use Redis in real time; browse the site and watch as Redis reports all its caching data to your screen.
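
Besides MONITOR, redis-cli info stats reports cumulative keyspace_hits and keyspace_misses counters, which give a rough idea of how often cache lookups succeed. Here is a small sketch, assuming a hypothetical hit_ratio helper (not part of Redis or Drupal VM) that reads that output on stdin:

```shell
# hit_ratio is a hypothetical helper: it reads `redis-cli info stats` output
# on stdin and prints the percentage of key lookups that were hits.
hit_ratio() {
  LC_ALL=C awk -F: '
    $1 == "keyspace_hits"   { hits = $2 + 0 }
    $1 == "keyspace_misses" { misses = $2 + 0 }
    END {
      total = hits + misses
      if (total > 0) printf "%.1f\n", hits / total * 100
      else print "n/a"
    }'
}

# Inside the VM (after `vagrant ssh`):
#   redis-cli info stats | hit_ratio
```

The counters are cumulative since Redis started, so the number is most meaningful after you have browsed the site for a while.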

Benchmarking Redis, PHP 7, and Drupal 8

These are by no means comprehensive benchmarks, but the results are easily reproducible and consistent. I used ApacheBench (ab) to simulate a single authenticated user requesting the /admin page as quickly as possible.

# ApacheBench command used:
ab -n 750 -c 10 -C "SESSxyz=value" http://drupalvm.dev/admin

With these settings, Drupal VM's CPU usage was pegged at 200%, and it reported the following results (averaged over three runs):

Cache location   PHP version   Requests/second   Percent difference
MariaDB          5.6.16        21.86 req/s       ~
Redis            5.6.16        21.34 req/s       2% slower
MariaDB          7.0.0         30.32 req/s       32% faster
Redis            7.0.0         34.64 req/s       45% faster

Assuming you're using PHP 7, there's approximately a 13% performance boost using a local Redis instance rather than a local database to persist Drupal 8's cache. This falls in line with my findings in a related project, when I was building a cluster of Raspberry Pis to run Drupal 8 and found Redis to speed things up by about 15%!
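
The percentages in the results appear to be symmetric percent differences (the gap divided by the mean of the two values) rather than a plain increase over the baseline. That reading is inferred from the numbers, not stated in the post. Here is a minimal sketch for checking them, using a hypothetical pct_diff helper built on awk:

```shell
# pct_diff is a hypothetical helper: symmetric percent difference between
# two values, i.e. (b - a) / mean(a, b) * 100.
pct_diff() {
  LC_ALL=C awk -v a="$1" -v b="$2" \
    'BEGIN { printf "%.1f\n", (b - a) / ((a + b) / 2) * 100 }'
}

pct_diff 21.86 30.32   # MariaDB on PHP 5.6 vs. MariaDB on PHP 7
pct_diff 21.86 34.64   # MariaDB on PHP 5.6 vs. Redis on PHP 7
pct_diff 30.32 34.64   # MariaDB vs. Redis, both on PHP 7
```

This prints 32.4, 45.2, and 13.3, which lines up with the 32%, 45%, and roughly 13% figures quoted here.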

It's odd that PHP 5.6 benchmarks showed a very slight performance decrease when using Redis, but I'm wondering if that's because the PhpRedis extension had some optimizations in its php7 branch that weren't present in the older compiled versions.

It's important to run your own benchmarks in your own environment, to make sure the performance optimizations are worth the extra applications running on your infrastructure... and that they're actually helping your Drupal site run better, not worse!

Summary

I hope Drupal VM can help you build a great local development environment; I have been using it for every Drupal project I work on, and have even taken to using it as a base for building out single-server Drupal infrastructure as-needed, by removing roles and settings I don't need, and enabling the extra security settings, and it has served me well.

If there's anything you see missing from Drupal VM that would make your local Drupal development experience easier, please take a look in the issue queue and let me know what else you'd like to see!

Dec 09 2015

I was recently futzing around with a Drupal site that has a fairly complex theme setup, and which relies on npm/gulp to set up and build the theme assets. One time after not touching the project for a couple weeks, when I came back and ran the gulp command again, I got the following error:

/path/to/node_modules/node-sass/lib/extensions.js:158
    throw new Error([
          ^
Error: The `libsass` binding was not found in /path/to/node_modules/node-sass/vendor/darwin-x64-14/binding.node
This usually happens because your node version has changed.
Run `npm rebuild node-sass` to build the binding for your current node version.
    at Object.sass.getBinaryPath (/path/to/node_modules/node-sass/lib/extensions.js:158:11)
    at Object.<anonymous> (/path/to/node_modules/node-sass/lib/index.js:16:36)
    at Module._compile (module.js:460:26)
    at Object.Module._extensions..js (module.js:478:10)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/path/to/node_modules/gulp-sass/index.js:176:21)
    at Module._compile (module.js:460:26)

And I think that the line This usually happens because your node version has changed. was exactly right. I manage npm and my Node.js install using Homebrew, and I remember having updated recently; it went from 4.x to 5.x. I also remembered that the project itself was supposed to be managed with 0.12!

So first things first, I installed nvm and used it to switch back to Node.js 0.12:

brew install nvm
source $(brew --prefix nvm)/nvm.sh
nvm install 0.12

Then, I re-ran npm install inside the theme directory to make sure I had all the proper versions/dependencies. But I was still getting the error. So I found this answer on Stack Overflow, which suggested rebuilding node-sass with:

npm rebuild node-sass

After that ran a couple minutes, gulp worked again, and I could move along with development on this particular Drupal site. Always mind your versions for Node.js!
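
To catch this kind of mismatch before gulp fails, a build script could compare the active Node.js version with the one the project expects. This is a minimal sketch: version_matches is a hypothetical helper, and 0.12 is simply the version this particular project needed.

```shell
# version_matches is a hypothetical helper: it succeeds when the actual
# version (e.g. "v0.12.7", as printed by `node --version`) is the expected
# release or one of its point releases.
version_matches() {
  expected="$1"
  actual="${2#v}"   # strip the leading "v"
  case "$actual" in
    "$expected"|"$expected".*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage in a build script:
#   version_matches 0.12 "$(node --version)" || echo "Run: nvm use 0.12" >&2
```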

Nov 27 2015

Recently I had to upgrade someone's Apache Solr installation from 1.4 to 5.x (the current latest version). For the most part, a Solr upgrade is straightforward, especially if you're doing it for a Drupal site that uses the Search API or Solr Search modules, as the Solr configuration files are already upgraded for you (you just need to switch them out when you do the upgrade, making any necessary customizations).

However, I ran into the following error when I tried loading the core running Apache Solr 4.x or 5.x:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: MMapIndexInput(path="/var/solr/cores/[corename]/data/spellchecker2/_1m.cfx") [slice=_1m.fdx]): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.

To fix this, you need to upgrade your index using Solr 3.5.0 or later, then you can upgrade to 4.x, then 5.x (using each version of Solr to upgrade from the previous major version):

  1. Run locate lucene-core to find your Solr installation's lucene-core.jar file. In my case, for 3.6.2, it was named lucene-core-3.6.2.jar.
  2. Find the full directory path to the Solr core's data/index.
  3. Stop Solr (so the index isn't being actively written to).
  4. Run the command to upgrade the index: java -cp /full/path/to/lucene-core-3.6.2.jar org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose /full/path/to/data/index

It will take a few seconds for a small index (hundreds of records), or a bit longer for a huge index (hundreds of thousands of records), and then once it's finished, you should be able to start Solr again using the upgraded index. Rinse and repeat for each version of Solr you need to upgrade through.

If you have directories like index, spellchecker, spellchecker1, and spellchecker2 inside your data directory, run the command over each subdirectory to make sure all indexes are updated.
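
Running the upgrader over each subdirectory can be scripted. The sketch below assumes every subdirectory of the core's data directory is a Lucene index, as in the example above. The upgrade_indexes helper, the LUCENE_JAR placeholder path, and the DRY_RUN switch are all assumptions for illustration:

```shell
# Placeholder path: point this at the lucene-core jar found with `locate`.
LUCENE_JAR="${LUCENE_JAR:-/full/path/to/lucene-core-3.6.2.jar}"

# upgrade_indexes is a hypothetical helper: for every subdirectory of the
# given data directory it prints (DRY_RUN=1, the default) or runs
# (DRY_RUN=0) the IndexUpgrader command from step 4.
upgrade_indexes() {
  data_dir="$1"
  for dir in "$data_dir"/*/; do
    [ -d "$dir" ] || continue
    dir="${dir%/}"   # drop the trailing slash left by the glob
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "java -cp $LUCENE_JAR org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose $dir"
    else
      java -cp "$LUCENE_JAR" org.apache.lucene.index.IndexUpgrader \
        -delete-prior-commits -verbose "$dir"
    fi
  done
}

# Example (with Solr stopped):
#   upgrade_indexes /var/solr/cores/[corename]/data
#   DRY_RUN=0 upgrade_indexes /var/solr/cores/[corename]/data
```

Reviewing the dry-run output first makes it easy to confirm that only the intended directories will be touched.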

For more info, see the IndexUpgrader documentation, and the Stack Overflow answer that instigated this post.

Nov 17 2015

Drupal 8 Logo

On November 19, the St. Louis Drupal Users Group is having a party to celebrate the release of Drupal 8, which has been 4 years in the making! The party will be hosted at Spry Digital in downtown St. Louis, and will have beer provided by Manifest, food and drinks provided by Spry, and a Raspberry Pi 2 model B giveaway sponsored by Midwestern Mac!

Drupal 8.0.0 has been built by over 3,000 contributors in all corners of the globe, and will help kick off the next generation of personalized, content-driven websites. During the meetup, we'll build a brand new Drupal 8 site on the Raspberry Pi using Jeff Geerling's Drupal Pi project, and we'll highlight some of the awesome new features of Drupal 8.

Raspberry Pi and Acquia dancing man

After we build one of the first Drupal 8 sites, we'll give away the Raspberry Pi to a lucky winner to take home and tinker with! Special thanks to the Austin Drupal Users Group, who came up with the Pi giveaway idea!

We'll also eat, drink and be merry, celebrating the start of a new era of site building with the best version of Drupal yet!

If you'd like to join us, please RSVP on the STLDUG Meetup page: STLDUG Drupal 8.0.0 Release Party.

Oct 19 2015

Ansible is a simple, but powerful, server and configuration management tool. Ansible for DevOps is a book I wrote to teach you to use Ansible effectively, whether you manage one server or thousands.

Ansible for DevOps cover - Book by Jeff Geerling

I've spent a lot of time working with Ansible and Drupal over the past couple years, culminating in projects like Drupal VM (a VM for local Drupal development) and the Raspberry Pi Dramble (a cluster of Raspberry Pi computers running Drupal 8, powering http://www.pidramble.com/). I've also given multiple presentations on Ansible and Drupal, like a session at DrupalCon Austin, a session at MidCamp earlier this year, and a BoF at DrupalCon LA.

Ansible for DevOps includes a few different examples of Drupal deployments specifically, and many examples pertaining to LAMP-based infrastructure management. In the next few months, I'm finally going to publish posts I've had in the wings about using Ansible for Drupal infrastructure management, beginning with one of my simplest and most fun projects, the Drupal Pi.

Check for it soon on Drupal Planet!

Purchase Ansible for DevOps on LeanPub, Amazon, or iTunes.

Aug 21 2015

[Update 2015-08-25: I reran some of the tests using two different settings in VirtualBox. First, I explicitly set KVM as the paravirtualization mode (it was saved as 'Legacy' by default, due to a bug in VirtualBox 5.0.0), which showed impressive performance improvements, making VirtualBox perform 1.5-2x faster, and bringing some benchmarks to a dead heat with VMware Fusion. I also set the virtual network card to use 'virtio' instead of emulating an Intel PRO/1000 MT card, but this made little difference in raw network throughput or any other benchmarks.]

My Mac spends the majority of the day running between one and a dozen VMs. I do all my development (besides iOS or Mac dev) running code inside VMs, and for many years I used VirtualBox, a free virtualization tool, along with Vagrant and Ansible, to build and manage all these VMs.

Since I build and rebuild dozens of VMs per day, and maintain a popular Vagrant configuration for Drupal development (Drupal VM), as well as dozens of other VMs (like Ansible Vagrant Examples), I am highly motivated to find the fastest and most reliable virtualization software for local development. I switched from VirtualBox to VMware Fusion (which requires a for-pay plugin) a year ago, as a few benchmarks I ran at the time showed VMware was 10-30% faster.

Since VirtualBox 5.0 was released earlier this year, I decided to re-evaluate the two VM solutions for local web development (specifically, LAMP/LEMP-based Drupal development, but most of these benchmarks apply to any dev workflow).

I benchmarked the raw performance bits (CPU, memory, disk access) as well as some 'full stack' scenarios (load testing and per-page load performance for some CMS-driven websites). I'll present each benchmark, some initial conclusions based on the result, and the methodology I used for each benchmark.

The key question I wanted to answer: Is purchasing VMware Fusion and the required Vagrant plugin ($140 total!) worth it, or is VirtualBox 5.0 good enough?

Baseline Performance: Memory and CPU

I wanted to make sure VirtualBox and VMware could both do basic operations (like copying memory and performing raw number crunching in the CPU) at similar rates; both should pass through as much of this performance as possible to the underlying system, so numbers should be similar:

Memory and CPU benchmark - VirtualBox and VMware Fusion

VMware and VirtualBox are neck and neck when it comes to raw memory and CPU performance, and that's to be expected these days, as both solutions (as well as most other virtualization solutions) are able to use features in modern Intel processors and modern chipsets (like those in my MacBook Air) to their fullest potential.

CPU- or RAM-heavy workloads should perform similarly, though VMware Fusion has a slight edge.

Methodology - CPU/RAM

I used sysbench for the CPU benchmark, with the command sysbench --test=cpu --cpu-max-prime=20000 --num-threads=2 run.

I used Memory Bandwidth Benchmark (mbw) for the RAM benchmark, with the command mbw -n 2 256 | grep AVG, and I used the MEMCPY result as a proxy for general RAM performance.

Baseline Performance: Networking

More bandwidth is always better, though most development work doesn't rely on a ton of bandwidth being available. A few hundred megabits should serve web projects in a local environment quickly.

Network throughput benchmark - VirtualBox and VMware Fusion

This is one of the few tests in which VMware really took VirtualBox to the cleaners. It makes some sense, as VMware (the company) spends a lot of time optimizing VM-to-VM and VM-to-network-interface throughput since their products are more often used in production environments where bandwidth matters a lot, whereas VirtualBox is much more commonly used for single-user or single-machine purposes.

Having 40% more bandwidth available means VMware should be able to perform certain tasks, like moving files between host/VM, or your network connection (if it's fast enough) and the VM, or serving hundreds or thousands of concurrent requests, with much more celerity than VirtualBox—and we'll see proof of this fact with a Varnish load test, later in the post.

Methodology - Networking

To measure raw virtual network interface bandwidth, I used iperf, and set the VM as the server (iperf -s), then connected to it and ran the benchmark from my host machine (iperf -c drupalvm.dev). iperf is an excellent tool for measuring raw bandwidth, as no non-interface I/O operations are performed. Tests such as file copies can have irregular results due to filesystem performance bottlenecks.

Disk Access and Shared/Synced Folders

One of the largest performance differentiators—and one of the most difficult components to measure—is filesystem performance. Virtual Machines use virtual filesystems, or connect to folders on the host system via some sort of mounted share, to provide a filesystem the guest OS uses.

Filesystem I/O performance is impossible to measure simply and universally, because every use case (e.g. media streaming, small file reads, small file writes, or database access patterns) benefits from different types of file read/write performance.

Since most filesystems (and even the slowest of slow microSD cards) are fast enough for large file operations (reading or writing large files in large chunks), I decided to benchmark one of the most brutal metrics of file I/O, 4k random read/write performance. For many web applications and databases, common access patterns either require hundreds or thousands of small file reads, or many concurrent small write operations, so this is a decent proxy of how a filesystem will perform under the most severe load (e.g. reading an entire PHP application's files from disk, when opcaches are empty, or rebuilding key-value caches in a database table).

I measured 4k random reads and writes across three different VM scenarios: first, using the VM's native share mechanism (or 'synced folder' in Vagrant parlance); second, using NFS, a common and robust network share mechanism that's easy to use with Vagrant; and third, reading and writing directly to the native VM filesystem:

Disk or drive random access benchmark - VirtualBox and VMware Fusion

The results above, as with all other benchmarks in this post, were repeated at least four times, with the first result set discarded. Even then, the standard deviation on these benchmarks was typically 5-10%, and the benchmarks were wildly different depending on the exact benchmark I used.

I was able to reproduce the strange I/O performance numbers in Mitchell Hashimoto's 2014 post when I didn't use direct filesystem access to do reads and writes; certain benchmarks suggest the VM filesystem is capable of over 1 GB/sec of random 4K reads and writes! Speaking of which, running the same benchmarks on my MacBook Air's internal SSD showed maximum performance of 1891 MB/s read, and 389 MB/s write.

Passing the -I option to the iozone benchmarking tool makes sure the tests bypass the VM's disk caching mechanisms, which mask the actual filesystem performance. Unfortunately, this parameter (which uses O_DIRECT filesystem access) doesn't work with native VM shares, so those numbers may be a bit inflated over real-world performance.

The key takeaway? No matter the filesystem you use in a VM, raw file access is an order of magnitude slower than native host I/O if you have a fast SSD. Luckily, the raw performance isn't horrendous (as long as you're not copying millions of tiny files!), and common development access patterns help filesystem and other caches speed up file operations.

Methodology - Disk Access

I used iozone to measure disk access, using the command iozone -I -e -a -s 64M -r 4k -i 0 -i 2 [-f /path/to/file]. I also repeated the tests numerous times with different -s values ranging from 128M to 1024M, but the performance numbers were similar with any value.

If you're interested in diving deeper into filesystem benchmarking, iozone's default set of tests is much broader and applies across a very wide range of use cases (besides typical LAMP/LEMP web development).

Full Stack - Drupal 7 and Drupal 8

When it comes down to it, the most meaningful benchmark is a 'full stack' benchmark, which tests the application I'm developing. In my case, I am normally working on Drupal-based websites, so I wanted to test both Drupal 8 and Drupal 7 (the current stable release) in two scenarios—a clean install of Drupal 8 (with nothing extra added), and a fairly heavy Drupal 7 site, to mirror some of the more complicated sites I have to work with.

First, here's a comparison of 'requests per second' with VirtualBox and VMware. Higher numbers are better, and this test is a decent proxy for how fast the VM is rendering specific pages, as well as how many requests the full stack/server can serve in a short period of time:

Drupal 8 requests per second benchmark - VirtualBox and VMware Fusion

The first two benchmarks are very close. When your application is mostly CPU-and-RAM-constrained (Drupal 8 is running almost entirely out of memory using PHP's opcache and MySQL caches), both virtualization apps are about the same, with a very slight edge going to VMware Fusion.

The third graph is more interesting, as it shows a large gap: VMware can serve up 43% more traffic than VirtualBox. When you compare this graph with the raw network throughput graph above, it's clear that VMware Fusion's higher network bandwidth is why it serves so many more requests/sec in a network-constrained benchmark like Varnish capacity.

Developing a site with frequently-changing code requires more disk I/O, since the opcache has to be rebuilt from disk, so I tested raw page load times with a fresh PHP thread:

Page load performance for Drupal 7 and 8 - VirtualBox and VMware Fusion

For this test, I restarted Apache entirely between each page request, which wiped out PHP's opcache, causing all the PHP files to be read from the disk. These benchmarks were run using an NFS share, so the main performance increase here (over the load test in the previous benchmark) comes from VMware's slightly faster NFS shared filesystem performance.

In real world usage, there's a perceptible performance difference between VirtualBox and VMware Fusion, and these benchmarks confirm it.

Many people decide to use native synced folders because file permissions and setup can often be simpler, so I wanted to see how much not using NFS affects these numbers:

Page load performance for Drupal 7 and 8 with different synced folder methods - VirtualBox and VMware Fusion

As it turns out, NFS has a lot to offer in terms of performance for apps running in a shared folder. Another interesting discovery: VMware's native shared folder performs nearly as well as the ideal scenario in VirtualBox (running the codebase on an NFS mount).

I still highly recommend using NFS instead of native shared folders if you're sharing more than a few files between host and guest.

Methodology - Full Stack Performance

I used ab, wrk, and curl to run performance benchmarks and simple load tests:

  • Drupal anonymous cached page load: wrk -d 30 -c 2 http://drupalvm.dev/
  • Drupal authenticated page load: ab -n 500 -c 2 -C "SESSxyz=value" http://drupalvm.dev/ (SESSxyz=value stands in for the name and value of the uid 1 user's session cookie)
  • Varnish anonymous proxied page load: wrk -d 30 -c 2 http://drupalvm.dev:81/ (a cache lifetime value of '15 minutes' was set on the performance configuration page)
  • Drupal 8 front page uncached: time curl --silent http://drupalvm.dev/ > /dev/null (run once after clicking 'Clear all caches' on the admin/config/development/performance page, averaged over six runs)
  • Large Drupal 7 site views/panels page request: time curl --silent http://local.example.com/path > /dev/null (run once after clicking 'Clear all caches' on the admin/config/development/performance page, averaged over six runs)

Drupal 8 tests were run with a standard profile install of a Drupal 8 site (ca. beta 12) on Drupal VM 2.0.0, and Drupal 7 tests were run using a very large scale Drupal codebase, with over 150 modules.

Summary

I hope these benchmarks help you to decide if VMware Fusion is right for your Vagrant-based development workflow. If you use synced folders a lot and need as much bandwidth as possible, choosing VMware is a no-brainer. If you don't, then VirtualBox is likely 'fast enough' for your development workflow.

It's great to have multiple great choices for VM providers for local development—and in this case, the open source option holds its own against the heavyweight proprietary virtualization app!

Methodology - All Tests

Since I detest when people post benchmarks but don't describe the system under test and all their reasons behind testing things certain ways, I thought I'd explicitly outline everything here, so someone else with the time and materials could replicate all my test results verbatim.

  • I ran all benchmarks four times (with the exception of some of the disk benchmarks, which I ran six times for better coverage of random I/O variance), discarded the first result, and averaged the remaining results.
  • All tests were run using an unmodified copy of Drupal VM version 2.0.0, with all the example configuration files (though all extra installations besides Varnish were removed), using the included Ubuntu 14.04 LTS minimal base box (which is built using this Packer configuration, the same for both VirtualBox and VMware Fusion).
  • For full stack Drupal benchmarking for Varnish-cached pages, I logged into Drupal and set a minimum cache lifetime value of '15 minutes' on the performance configuration page, and for authenticated page loads, I used the session cookie for the logged in uid 1 user.
  • All tests were run on my personal 11" Mid 2013 MacBook Air, with a 1.7 GHz Intel Core i7 processor, 8 GB of RAM, and a 256 GB internal SSD. The only other applications (besides headless VMs and Terminal) that were open and running during tests were Mac OS X Mail and Sublime Text 3 (in which I noted benchmark results).
  • All tests were performed with my Mac disconnected entirely from the Internet (WiFi disabled, and no connection otherwise), to minimize any strange networking problems that could affect performance.
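
The "discard the first result, average the rest" rule from the first bullet can be sketched as a small helper. The avg_after_warmup name is hypothetical, and the sample numbers are made up for illustration, not measured results:

```shell
# avg_after_warmup is a hypothetical helper: it drops the first result
# (treated as a warm-up run) and prints the mean of the remaining ones.
avg_after_warmup() {
  [ "$#" -ge 2 ] || { echo "need at least two results" >&2; return 1; }
  shift   # discard the first (warm-up) result
  printf '%s\n' "$@" | LC_ALL=C awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

# Four runs of some benchmark; only the last three count.
avg_after_warmup 15 20 22 24
```

With these illustrative inputs the helper prints 22.00, the mean of 20, 22, and 24.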

Jul 31 2015

I've been working with Drupal 8 for a long time, keeping Honeypot and some other modules up to date, and doing some dry-runs of migrating a few smaller sites from Drupal 7 to Drupal 8, just to hone my D8 familiarity.

Raspberry Pi Dramble Drupal 8 Website

I finally launched a 'for real' Drupal 8 site, which is currently running on Drupal 8 HEAD—on a cluster of Raspberry Pi 2 computers in my basement! You can view the site at http://www.pidramble.com/, and I've already started posting some articles about running Drupal 8 on the servers, how I built the cluster, some of the limitations of at-home webhosting, etc.

Some of the things I've already learned from building and running this cluster for the past few days:

  • Drupal 8 (just core, alone) is awesome. Building out simple sites with zero contributed modules, and no custom code, is a real possibility in Drupal 8. Drupal 7 will never feel the same again :(
  • Drupal 8 is finally fast; not super fast, but fast enough. And with some recent cache stampede protections that have been added, Drupal 8 is running much more stable in my testing—stable enough that I was finally comfortable launching a site on Drupal 8 on these Raspberry Pis!
  • My (very) limited upload bandwidth isn't yet an issue. I only have 4-5 Mbps up, and as long as I host most images externally, serving up tiny 8-10 KB resources for normal page loads allows for a pretty large amount of traffic without a hiccup. Or, more importantly, without interfering with my day-to-day Internet use as a work-from-home employee!
  • It's really awesome being able to see the live traffic to the servers using the LEDs on the front. See for yourself: Nginx Load Balancer Visualization w/ LEDs. It's fun watching live traffic a few feet away from my desk, especially when I do things like tweet the URL (immediately following, I can see all the requests come in from Twitter-related bots!).

I'm hoping to continue writing about my experiences with Drupal 8 (especially on the Pi cluster), etc. in the next few weeks, both here and elsewhere!

Jul 28 2015

After some more tinkering with the Raspberry Pi Dramble (a cluster of 6 Raspberry Pis used to demonstrate Drupal 8 deployments using Ansible), I finally was able to get the RGB LEDs to react to Nginx accesses—meaning every time a request is received by Nginx, the LED toggles to red momentarily.

This visualization allows me to see exactly how Nginx is distributing requests among the servers in different load balancer configurations. The default (not only for Nginx, but also for Varnish, HAProxy, and other balancers) is to use round-robin distribution, meaning each request is sent to the next server. This is demonstrated first, in the video below, followed by a demonstration of Nginx's ip_hash method, which pins one person's IP address to one backend server, based on a hash of the person's IP address:

It's fun to be able to visualize things like Drupal deployments, Nginx requests, etc., on this cluster of Raspberry Pis, and in addition to a presentation on Ansible + Drupal 8 at MidCamp, and Ansible 101, I'll be showing the Dramble in a soon-to-be-released episode of Jam's Drupal Camp from Acquia—stay tuned!

Jul 21 2015

On many Drupal 7 sites, I have encountered issues with Emoji (mostly) and other special characters (rarely) when importing content from social media feeds, during content migrations, and in other situations, so I finally decided to add a quick blog post about it.

Have you ever noticed an error in your logs complaining about incorrect string values, with an emoji or other special character, like the following:

PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x98\x89" ...' for column 'body_value' at row 1: INSERT INTO {field_data_body} (entity_type, entity_id, revision_id, bundle, delta, language, body_value, body_summary, body_format) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8); Array ( [:db_insert_placeholder_0] => node [:db_insert_placeholder_1] => 538551 [:db_insert_placeholder_2] => 538550 [:db_insert_placeholder_3] => story [:db_insert_placeholder_4] => 0 [:db_insert_placeholder_5] => und [:db_insert_placeholder_6] => <p>????</p> [:db_insert_placeholder_7] => [:db_insert_placeholder_8] => filtered_html ) in field_sql_storage_field_storage_write() (line 514 of /drupal/modules/field/modules/field_sql_storage/field_sql_storage.module).

To fix this, you need to switch the affected MySQL table's encoding to utf8mb4, and also switch any table columns ('fields', in Drupal parlance) which will store Emojis or other exotic UTF-8 characters. This will allow these special characters to be stored in the database, and stop the PDOExceptions.

Using Sequel Pro on a Mac, this process is relatively quick and painless:

  1. Open the affected tables (in the above case, field_data_body, and the corresponding revision table, field_revision_body), and click on the 'Table info' tab.
  2. In the 'Encoding' menu, switch from "UTF-8 Unicode (utf8)" to "UTF-8 Unicode (utf8mb4)". This will take a little time for larger data sets.
  3. Switch over to the 'Structure' tab, and for each field which will be storing data (in our case, the body_value and body_summary fields), choose "UTF-8 Unicode (utf8mb4)" under the 'Encoding' column. This will take a little time for larger data sets.
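
If you'd rather not click through a GUI, MySQL's ALTER TABLE ... CONVERT TO CHARACTER SET statement converts a table's default encoding and its character columns in one step. This is a minimal sketch: the utf8mb4_statements helper and the drupal database name are assumptions, and you should back up the database first.

```shell
# utf8mb4_statements is a hypothetical helper: it prints one ALTER TABLE
# statement per table name passed as an argument.
utf8mb4_statements() {
  for table in "$@"; do
    printf 'ALTER TABLE `%s` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;\n' "$table"
  done
}

# Print the statements for the two tables from the example error.
utf8mb4_statements field_data_body field_revision_body

# To apply them (assumed database name "drupal"):
#   utf8mb4_statements field_data_body field_revision_body | mysql drupal
```

Note that CONVERT TO may change a TEXT column to a larger TEXT type so the existing data still fits, so review the table structure afterwards.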

After converting the affected tables, you will also need to patch Drupal 7 to make sure the MySQL connection uses the correct encoding. Apply the latest patch from the issue Drupal 7 MySQL does not support full UTF-8, and add the following keys to your default database connection settings:

$databases = array(
  'default' => array(
    'default' => array(
      'database' => 'database',
      'username' => 'username',
      'password' => 'password',
      'host' => '127.0.0.1',
      'driver' => 'mysql',
      // Add default charset and collation for mb4 support.
      'charset' => 'utf8mb4',
      'collation' => 'utf8mb4_general_ci',
    ),
  ),
);

That issue is actually a child issue of MySQL driver does not support full UTF-8, which has already been fixed in Drupal 8 (which now requires MySQL 5.5.3 or later as a result). It may take a little time for the problem to get an 'official' fix in Drupal 7, since it's a complicated problem that requires a delicate touch—we don't want a bunch of people's sites to go belly up because some contributed modules are using large VARCHAR columns, or because their hosting provider is running an old version of MySQL!

There's also a handy table_converter module for Drupal 7, which helps you automate the process of converting tables to the new format. It still requires the core patch mentioned above, but it can help smooth out the process of actually converting the tables to the new format.

Once you've fixed the issue, you won't be quite as annoyed next time you see one of these guys: 😉

Jul 20 2015

I build and destroy a lot of VMs using Vagrant in the course of the day. Between developing Drupal VM, writing Ansible for DevOps, and testing dozens of Ansible Galaxy roles, I probably run vagrant up and vagrant destroy -f at least a dozen times a day.

Building all these VMs would be a pain, and require much more user intervention, if it weren't for a few things I've done on my local workstation to help with the process. I thought I'd share these tips so you can enjoy a much more streamlined Vagrant workflow as well!

Extremely helpful Vagrant plugins

None of my projects require particular Vagrant plugins—but many, like Drupal VM, will benefit from adding at least one venerable plugin, vagrant-hostsupdater. Every time you start or shut down a VM with Vagrant, the relevant hosts entries will be placed in your system's hosts file, without requiring you to do anything manually. Great time-saver, and highly recommended! To install: vagrant plugin install vagrant-hostsupdater

Another plugin that many people have used to provide the fastest filesystem synchronization support is vagrant-gatling-rsync, which uses an rsync-based sync mechanism similar to the one built into Vagrant, but much faster and less resource-intensive on your host machine.
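
If you set up workstations often, it's nice to install plugins in a way that's safe to re-run. A minimal sketch, assuming `vagrant plugin list` prints one plugin per line in the form `name (version)`; ensure_plugin is a made-up helper name for illustration, not a Vagrant command:

```shell
#!/bin/sh
# ensure_plugin NAME: install a Vagrant plugin only if it isn't already listed.
# (ensure_plugin is a hypothetical helper, not part of Vagrant.)
ensure_plugin() {
  if vagrant plugin list | grep -q "^$1 "; then
    echo "$1 already installed"
  else
    vagrant plugin install "$1"
  fi
}

# Only attempt installs when Vagrant itself is available.
if command -v vagrant >/dev/null 2>&1; then
  ensure_plugin vagrant-hostsupdater
  ensure_plugin vagrant-gatling-rsync
fi
```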

Helpful modifications to /etc/sudoers

One major downside to using the vagrant-hostsupdater plugin, or to using NFS mounts (which are much faster than native shares in either VirtualBox or VMWare Fusion), is that you have to enter your sudo password when you build and destroy VMs. You can avoid this gotcha by adding the following lines to your /etc/sudoers configuration (then quit and restart your Terminal session so the new settings are picked up):

# Vagrant configuration.
# Allow Vagrant to manage NFS exports.
Cmnd_Alias VAGRANT_EXPORTS_ADD = /usr/bin/tee -a /etc/exports
Cmnd_Alias VAGRANT_NFSD = /sbin/nfsd restart
Cmnd_Alias VAGRANT_EXPORTS_REMOVE = /usr/bin/sed -E -e /*/ d -ibak /etc/exports
# Allow Vagrant to manage hosts file.
Cmnd_Alias VAGRANT_HOSTS_ADD = /bin/sh -c echo "*" >> /etc/hosts
Cmnd_Alias VAGRANT_HOSTS_REMOVE = /usr/bin/sed -i -e /*/ d /etc/hosts
%admin ALL=(root) NOPASSWD: VAGRANT_EXPORTS_ADD, VAGRANT_NFSD, VAGRANT_EXPORTS_REMOVE, VAGRANT_HOSTS_ADD, VAGRANT_HOSTS_REMOVE

Important note: If you're editing sudoers by hand, make sure you edit the file with sudo visudo instead of just editing it in your favorite editor. This ensures the file is valid when you save it, so you don't get locked out from sudo on your system!
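
If you'd rather draft the lines in a separate file first, visudo can syntax-check any file with its -c (check only) and -f (file) options. A minimal sketch; check_sudoers and the draft file path are names I made up for the example:

```shell
#!/bin/sh
# check_sudoers FILE: syntax-check a sudoers draft without installing it.
# Returns 2 if visudo isn't available, otherwise visudo's own exit status.
check_sudoers() {
  if ! command -v visudo >/dev/null 2>&1; then
    echo "visudo not found; cannot validate $1" >&2
    return 2
  fi
  visudo -c -f "$1"
}

# Usage (hypothetical draft path):
#   check_sudoers "$HOME/vagrant-sudoers.draft" && echo "draft parses cleanly"
```

A clean check only tells you the syntax is valid; it doesn't confirm that the aliases match the commands Vagrant actually runs on your system.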

This configuration works out of the box on Mac OS X, and only needs slight modifications to make sure it works on Linux distributions (make sure the 'admin' group is changed to whatever group your user account is in).

I've even wrapped up the configuration of /etc/sudoers into my Mac Development Ansible Playbook, so I can automatically ensure all my Macs are configured for an optimal Vagrant experience!

SSH keys inside your VM

If you want to use your SSH credentials inside a Vagrant-powered VM, you can turn on SSH agent forwarding by adding the following line inside your Vagrantfile:

  config.ssh.forward_agent = true

Drupal VM includes agent forwarding by default, so you can build your VM, log in, and work on Git projects, log into remote servers, use drush, etc., just as you would on your host computer.
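
Forwarding only works for keys that are loaded into the SSH agent on your host. Here's a rough check, assuming you run it from the directory containing your Vagrantfile with the VM already up; check_forwarding is a name I made up for this sketch:

```shell
#!/bin/sh
# check_forwarding: confirm the host agent has keys, then list them from the VM.
# (check_forwarding is a hypothetical helper, not a Vagrant or OpenSSH command.)
check_forwarding() {
  # ssh-add -l exits non-zero when the agent has no keys or can't be reached.
  if ! ssh-add -l >/dev/null 2>&1; then
    echo "No keys in the host SSH agent; load one with ssh-add first."
    return 1
  fi
  # With forwarding enabled, the same keys should be listed inside the VM.
  vagrant ssh -c 'ssh-add -l'
}

# Usage, from the project directory:
#   check_forwarding
```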

Note that I usually don't have forwarding enabled in my own environments, as I treat Vagrant VMs strictly as sandboxed development environments; if I install some software for testing inside the VM as the vagrant user, I don't want it to be able to use my SSH credentials to do anything nefarious! Generally that won't happen, but I like erring on the side of caution.

Summary

What are some of your favorite tips and tricks for Vagrant-based workflows? Any other tricks you know of to solve common pain points (e.g. using the vagrant-vbguest plugin if you have issues with native shares or guest additions)?

Jul 03 2015
Jul 03

Drupal VM - Vagrant and Ansible Virtual Machine for Drupal Development

For the past couple years, I've been building Drupal VM to be an extremely-tunable, highly-performant, super-simple development environment. Since MidCamp earlier this year, the project has really taken off, with almost 200 stars on GitHub and a ton of great contributions and ideas for improvement (some implemented, others rejected).

In the time since I wrote Developing for Drupal with Vagrant and VMs, I've focused on meeting all my defined criteria for the perfect local development environment. And now, I'm able to say that I use Drupal VM when developing all my projects—as it is now flexible and fast enough to emulate any production environment I use for various Drupal projects.

Easy PHP 7 testing with CentOS 7 and MariaDB

After a few weeks of work, Drupal VM now officially supports running PHP 7 (currently, 7.0.0 alpha 2) on CentOS 7 with MariaDB, or you can even tweak the settings to compile PHP from source yourself (following the PHP role's documentation).

Doing this allows you to see how your own projects will fare when run with the latest (and fastest) version of PHP. Drupal 8 performance improves dramatically under PHP 7, and most other PHP applications will have similar gains.

Read PHP 7 on Drupal VM for more information.

Other major improvements and features

Here are some of the other main features that have recently been added or improved:

  • Flexible database support: MySQL, MariaDB, or (soon) Percona are all supported out of the box, pretty easily. See guide for Use MariaDB instead of MySQL.
  • Flexible OS support: Drupal VM officially supports Ubuntu 14.04, Ubuntu 12.04, CentOS 7, or CentOS 6 out of the box; other OSes like RHEL, Fedora, Arch and Debian may also work, but are not supported. See: Using different base OSes.
  • Use with any Drupal deployment methodology — works with any dev workflow, including Drush make files, local Drupal codebases, and multisite installations.
  • Automatic local drush alias configuration
  • 'Batteries included' — developer utilities and essentials like Varnish, Solr, MailHog, XHProf are easy to enable or disable.
  • Production-ready, security-hardened configuration you can install on DigitalOcean
  • Thoroughly-documented — check out the Drupal VM Wiki on GitHub
  • First class support for any host OS — Mac, Linux or Windows
  • Drupal version agnostic — works great with 6, 7, or 8.
  • Easy configuration of thousands of parameters (powered by a few dozen component-specific Ansible roles) through the config.yml file.

I'd especially like to thank the dozens of people who have filed issues against the project to add needed functionality or fix bugs (especially for multi-platform, multi-database support!), and have helped improve Drupal VM through over 130 issues and 17 pull requests.

There are dozens of other VM-based or Docker/container-based local development solutions out there, and Drupal VM is one of many, but I think that—even if you don't end up using it for your own work—you will find sound ideas and best practices in environment configuration in the project.

Jun 21 2015
Jun 21

DrupalCamp St. Louis 2015 was held this past weekend, June 20-21, 2015, at SLU LAW in downtown St. Louis. We had nine sessions and a great keynote on Saturday, and a full sprint day on Sunday.

DrupalCamp St. Louis 2015 Registration
The view coming off the elevators at SLU LAW.

Every session was recorded (slides + audio), and you can view all the sessions online:

The Camp went very well, with almost sixty participants this year! We had a great time, learned a lot together, and enjoyed some great views of downtown St. Louis (check out the picture below!), and we can't wait until next year's DrupalCamp St. Louis (to be announced)!

PS Thug Life St. Louis - Jeff Geerling and Mike Ryan
A candid shot of myself and Mike Ryan, 'the Drupal Migrate guy' who lives near St. Louis.

High Performance Drupal

geerlingguy delivers presentation on High Performance Drupal at DrupalCamp St. Louis 2015
Yours truly, talking about Drupal and Performance.

I delivered a session titled High Performance Drupal, going over performance planning, benchmarking, and easy performance wins. You can click the link in the previous line to see more details, watch the session video on YouTube, or view the slides from the presentation on SlideShare.

Check out more from DrupalCamp St. Louis 2015 on the official camp website: DrupalCamp St. Louis 2015.

Jun 08 2015
Jun 08

The organizers of DrupalCamp St. Louis 2015 are excited to announce that the schedule is set for DrupalCamp STL.15; we will have sessions from a variety of presenters on a variety of topics—for both beginners and seasoned veterans alike!

DrupalCamp 2015 St. Louis - SLU LAW

Some of the great sessions lined up include a session on Git basics, the status of Migrate in Drupal 8, content strategy, securing Drupal, improving performance, improving search, Twig, and more! To kick it off, we'll have an awesome keynote from Alina Mackenzie (alimac) about getting involved in the Drupal Community.

Check out the sessions: DrupalCamp St. Louis 2015 Session Schedule.

Register for DrupalCamp STL.15 today, and build your schedule on the site—besides these excellent sessions, you'll get a tasty catered lunch, a comfy t-shirt, and some great memories and networking opportunities on both days of the Camp!

May 29 2015
May 29

DrupalCamp STL.15 (June 20-21, in St. Louis, MO) will be the first DrupalCamp in St. Louis with a day dedicated to sprints to help the Drupal community. We're expecting a great turnout, and there are already a number of proposed sessions (many of which will be selected and announced on June 5!), and it's not yet too late to propose a session of your own!

This year's keynote, by Alina Mackenzie, will focus on the Drupal Community—what it is, why it rocks, and how you can get involved in the community. After the keynote, some great sessions, a tasty lunch, happy hour, and a good night's rest, we'll spend sprint day (Sunday June 21) making Drupal better, and maybe even pushing Drupal 8 a little closer to an 8.0.0 rc1 release!

Registration is now open, so go reserve your spot at DrupalCamp St. Louis 2015; I'll see you there, hopefully at one of the sessions I proposed, either on High Performance Drupal, or Local Development Environments and Drupal VM!

May 21 2015
May 21

DrupalCamp St. Louis is scheduled for June 20-21, 2015, and will be held at SLU LAW in downtown St. Louis, MO. Less than a month away, there are a few important bits of news:

DrupalCamp STL.15 Keynote Speaker: Alina Mackenzie (alimac)

Alina Mackenzie is a developer and system administrator based in Chicago. In the Drupal community she is a camp organizer, speaker and communications lead for DrupalCon mentored sprints. She is passionate about learning organizations, automation, and making open source friendly for beginners.

Alina's keynote will focus on "Finding the entrance: Why and how to get involved with the Drupal community".

Alina's Drupal.org profile is https://www.drupal.org/u/alimac

Session Submission Deadline: May 29

Please submit your session proposals by Friday, May 29—just over a week from today! We'll notify speakers on June 5th whether a session was accepted or not.

We hope to see you at DrupalCamp St. Louis 2015! Registration will open next Monday, and sessions will be announced on June 5th.

May 13 2015
May 13

We had a great discussion about how different companies and individuals are using Ansible for Drupal infrastructure management and deployments at DrupalCon LA, and I wanted to post some slides from my (short) intro to Ansible presentation here, as well as a few notes from the presentation.

The slides are below:

And video/audio from the BoF:

[embedded content]

Notes from the BoF

I first gave an overview of the basics of Ansible, demonstrating some Ad-Hoc commands on my Raspberry Pi Dramble (a cluster of six Raspberry Pi 2 computers running Drupal 8), then we dove headfirst into a great conversation about Ansible and Drupal.

Raspberry Pi Dramble - Hero
The Raspberry Pi #Dramble

Some notes from that discussion:

  • There are now many different local and production open source environment stacks built with Ansible, like Drupal VM, DevShop, Pubstack, Valkyrie, and Vlad.
  • Many companies are using Ansible as an infrastructure management tool, but sticking with tools like Cobbler, Bower, etc. for actual code deployment. Some people also use Ansible for deployment, but it really depends on the project/team's needs.
  • A lot of people liked (especially in comparison to tools like Chef and Puppet) how approachable and straightforward Ansible is; instead of taking days or weeks to get up to speed, you can dive right into Ansible and start using it in a day.
  • Connor Krukowsky has Drupal 8 running on his 8-core rooted Android phone!

Discount on Ansible for DevOps

I'm almost finished writing Ansible for DevOps, and you can purchase it now from LeanPub and keep getting updates as I continue writing—here's a coupon code for half off!

Summary

It was a great BoF, and I hope we can keep the discussion going about how different teams are using Ansible with Drupal infrastructure, and how we can all help each other through shared projects, roles, and techniques!

And maybe I'll finally get back to my work on a drush module for Ansible ;)

May 13 2015
May 13

After taking the trifecta of Acquia Developer Certification (General, Back-end, Front-end) exams and earning a new black 'Grand Master' sticker, I decided to complete the gauntlet and take the Acquia Certified Drupal Site Builder Exam at DrupalCon LA.

Acquia Certified Drupal Site Builder - 2015

Taking the test in Acquia's testing center was a welcome reprieve from taking the exams online. There's much less of a 'big brother' feel when you don't have a 'sentinel' application running on your computer and a camera focusing on your face the entire time. Also, the exam room is nice and quiet, and has a good 'library' vibe to it.

Exam Content

The site builder exam is, in many ways, the most straightforward of the Drupal certification exams. Most of the scenarios are very cut-and-dried, and there are only 50 questions on the test (as opposed to 60 for the other exams).

There are a few questions that made me think a bit. Most are presented as general scenarios, just like the other exams, with a list of solutions from which you pick the best, and many are things that I've encountered or been asked about on a project in the past (e.g. "User johndoe tried doing X, but got an error... how do I give johndoe the ability to do X"). There were also a decent number of questions asking about how to set up a view and/or block correctly to display relevant information for a given scenario.

One oddity was the number of aggregator-related questions on the exam. I can count on one finger the number of times I've used aggregator.module, and I've built hundreds of Drupal sites. I think I had three or four aggregator-related questions, and it felt a little strange (though those questions were straightforward enough I could answer them without much familiarity with aggregator).

I think the exam could use a tiny bit more expansion into contrib-powered site building. Maybe a question or two on panels-based layout, flag, organic groups, or some of the other more popular contrib modules that a seasoned Drupal site builder would need to use on a larger project.

Results

On this exam, I scored a 92%, and (as with the other exams) a nice breakdown of all the component scores was provided in case I want to brush up on a certain area:

  • Drupal Features: 80.00%
  • Content and User Management: 100.00%
  • Content Modeling: 91.66%
  • Site Display: 90.00%
  • Community and Contributed Projects: 100.00%
  • Module and Theme Management: 75.00%
  • Security and Performance: 100.00%

Again, I like how the exam gives a breakdown of each area of strength/weakness. It helps me to validate areas where I could improve my skills through workshops, books, research, etc.

Summary

On the whole, the exam hits on most of the right bits of Drupal core + Views, and gives a good set of questions to evaluate how good you may be at site building in general. It's a simpler, less 'development'-focused exam in comparison to the other exams, and would be great for those wishing to validate their general site building skills.

Apr 29 2015
Apr 29

Many blog posts have outlined the benefits of using VMs (Virtual Machines) for local Drupal development instead of either using native PHP and Apache, or a bundled environment like MAMP, XAMPP, or Acquia Dev Desktop. The advantages of using virtualization (usually managed by Vagrant) are numerous, but in certain cases, you can make a good argument for sticking with the traditional solutions.

If you'd like to take the dive and start using virtualized development environments, or if you're already using Vagrant and VirtualBox or some other VM environment (e.g. VMWare Fusion or Parallels Desktop), how do you optimize local development, and which pre-bundled Drupal development VM will be best for you and your team?

Criteria for the Perfect Local Development Environment

These are the criteria I use when judging solutions for local Drupal development (whether virtualized or traditional):

  • Should be simple and easy to set up
  • Should be fast by default
  • Should be flexible:
    • Should work with multiple providers; VirtualBox is free, but VMWare can be much faster!
    • Should allow configuration of the PHP version.
    • Should work with your preferred development workflow (e.g. drush, makefiles, manual database sync, etc.)
    • Should prevent filesystem friction (e.g. permissions issues, slow file access speeds, etc.)
    • Shouldn't have hardcoded defaults
  • Should be complete:
    • Should work without requiring a bunch of extra plugins or 3rd party tools
    • No extra languages or libraries should be required (why install Ruby gems, npm modules, etc. unless you need them for your particular project?)
  • Should be Free and Open Source
  • Should include all the tools you need, but allow you to disable whatever you don't need (e.g. XHProf, Apache Solr, etc.)
  • Should work on Windows, Mac, and Linux with minimal or no adjustment
  • Should be deployable to production (so your local dev environment matches prod exactly)

A lot of these points may have more or less importance to a particular team or individual developer. If you're a die-hard Mac user and don't ever work with any developers on Windows or Linux, you don't need to worry about Windows support. But some of these points apply to everyone, like being fast, simple, and flexible.

If you're looking for a way to improve team-based Drupal development, all these bullet points apply. If your entire team is going to standardize on something, you should standardize on something that gives everyone the standard layout that's required, but the flexibility to work with each developer's environment and preferred development tools.

Announcing Drupal VM

I built Drupal VM over the past two years for my local Drupal development needs, and continue to improve it so it meets all the above criteria.

Drupal VM is a local development environment that works with a variety of Drupal site development workflows with minimal friction. Whether a site is built via drush makefiles, uses a 'codebase-in-a-git-repo' approach, or is built with install profiles and drush commands, it works with Drupal VM. Drupal VM also includes all the tools I need in my day-to-day development, and even installs helpful software like Apache Solr, Memcache, and MailHog.

Another common scenario I have as a contrib module maintainer and core contributor is my need for a quick, fresh Drupal environment where I can run Drupal 8, 7 or 6 HEAD and hack on core or one of my contrib modules (like Honeypot). Drupal VM is preconfigured to install a fresh copy of Drupal 8 for local hacking, but it's easy to configure it to run whatever Drupal site and configuration you like!

Since Drupal VM has been helpful to other developers, I've made it more flexible, built a simple marketing page (at www.drupalvm.com), and polished up the documentation on the Drupal VM Wiki. I'm continuing to improve Drupal VM as I get time, adding features like:

  • Ability to choose between Nginx and Apache for the webserver.
  • Ability to deploy to DigitalOcean, Linode, or AWS with the same (but security-hardened) configuration as your local environment.
  • Ability to add Varnish or Nginx as a reverse-proxy cache.

Drupal VM has also been a fun project to work on while writing Ansible for DevOps. My work on Drupal VM allows me to flex some Ansible muscle and work on a large number of Ansible Galaxy roles (like geerlingguy.php and geerlingguy.solr) that are used by Drupal VM—in addition to hundreds of other projects not related to Drupal!

A VM for Everyone

Drupal VM is my weapon of choice... but there are many great projects with similar features:

Alternatively, if you know how to use Puppet, Chef, Ansible, or SaltStack, and want to fork and develop your own alternative dev environment, or build one on your own, that's always an option! Especially if you have a highly specialized production environment, it may be best to reflect that environment with a more specialized local development environment.

On Docker and LXC/LXD (Container-based environments)

Before I wrap up, I wanted to also specifically call out some projects like Drocker and the next-generation Drupal.org testbot infrastructure project, DrupalCI, both of which are using Docker containers for local development. Containerized development environments offer many of the same benefits of virtualization, but can be faster to build and rebuild, and easier to maintain.

Container-based infrastructure is likely going to become standard in the next 5-10 years (much like VM-based infrastructure has become standard in the past 5-10 years)—whether with Docker or some other standard format/methodology (a container's just a container!).

Many hosting platforms use a container-everywhere approach, like:

  • Platform.sh
  • Pantheon
  • Google Container Engine
  • Amazon EC2 Container Service

However, I caution that container-based development has its own complexities, especially in production and especially with more complicated web applications like Drupal. I also caution against blindly running other people's pre-built container images in production; you should build them and manage them on your own (just like I build and manage my own VM images using Packer, e.g. packer-ubuntu-1404).

In Summary

In short, I've been working on Drupal VM for the past couple years, and I've made it flexible enough for the variety of Drupal sites I work on. I hope it's flexible enough for your development needs, and if not, open an issue and I'll see what I can do!

Apr 27 2015
Apr 27

Almost three years ago, on Feb 19, 2013, I opened the 8.x-dev branch of the Honeypot module (which helps prevent form spam on thousands of Drupal sites). These were heady times in the lifetime of the then-Drupal 8.x branch; 8.0-alpha1 wasn't released until three months later, on May 19. I made the #D8CX pledge—when Drupal 8 was released, I'd make sure there was a full, stable Honeypot release ready to go.

Little did I know it would be more than 2.5 years—and counting—before I could see that promise through to fruition!

As months turned into years, I've kept to the pledge, and eventually decided to also port a couple other modules that I use on many of my own Drupal sites, like Wysiwyg Linebreaks and Simple Mail.

Two years ago, I mentioned in the original Honeypot D8 conversion issue that I'd likely write a blog post "about the process of porting a moderately-complex module like this from D7 to D8". Well, I finally had some time to write that post, and I'm still wondering how far off the release of Drupal 8.0.0 will be!

Ch-Ch-Changes

When working on the initial port, and when opening a new issue almost on a monthly basis to rework parts of the module to keep up with Drupal 8 core changes, I would frequently read through all the new nodes posted to the list of Change records for Drupal core.

These change records are like the Bible of translating 'how do I do Y in Drupal 8 when I did X in Drupal 7'? Most of the change records have fitting examples, contain a good amount of detail, and link back to the one, two, or ten issues that caused the particular change record to be written.

However, there were a few that were in a sorry state; these change records didn't have references back to all the relevant Drupal core issues, or only provided contrived examples that didn't help me much. In these cases, I took the following approach:

  1. Try to find the git commits that caused Honeypot tests or code to fail (git blame helps here).
  2. Find the issue(s) referenced by the breaking commits.
  3. Read through the issue summary and see if it helps figure out how to fix my code.
  4. If that doesn't help, read through the commit itself, then the code that was changed, and see if that helps.
  5. If that doesn't help, read the entire issue comment history to see if that helps.
  6. If that doesn't help, pop over to the ever-helpful #drupal-contribute IRC channel.
  7. (The most important part) Go back to the deficient change record and edit it, adding appropriate issue references, code examples and documentation.
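
For the first two steps, git does most of the work. A rough sketch, assuming core commit subjects follow the usual "Issue #NNNNNNN by username: Summary" pattern; issue_urls and recent_issue_urls are helper names I made up, and the file path in the usage comment is only an example:

```shell
#!/bin/sh
# issue_urls: read commit subjects on stdin and print the Drupal.org issue URL
# for each subject that starts with "Issue #<number>".
issue_urls() {
  sed -n 's|^Issue #\([0-9][0-9]*\).*|https://www.drupal.org/node/\1|p'
}

# recent_issue_urls FILE [COUNT]: issue URLs for the latest commits touching FILE.
recent_issue_urls() {
  git log -n "${2:-10}" --format=%s -- "$1" | issue_urls
}

# Usage, from a Drupal core checkout:
#   recent_issue_urls core/lib/Drupal/Core/Form/FormBuilder.php 5
#   git blame -L 100,120 core/lib/Drupal/Core/Form/FormBuilder.php
```

Commits whose subject doesn't match the pattern are simply skipped, so you may still need to read a few commit messages by hand.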

In the course of the 71 distinct Honeypot 8.x commits that have been added so far, I had to go all the way to numbers 5 and 6 quite often. If it weren't for the incredible helpfulness of people like webchick, tim.plunkett, and others who seem to be living change record references, I would've probably given up the endeavor to keep Honeypot's Drupal 8 branch up to date the past three years!

Automated tests are a pain to maintain... but help immensely

The Drupal 7 version of Honeypot had almost complete SimpleTest coverage for primary module functionality. One of the first steps in porting the module to Drupal 8—and the best way to make sure all the primary functionality was working correctly—was to port the tests to Drupal 8.

There have been dozens of automated testing changes in Drupal 8 that have caused tests to fail or give unexpected results. This caused some frustration in figuring out whether a particular failure was due to failing code or changes to the testing API.

Even with the small frustrations of broken tests every month or two, the test coverage is a huge help in ensuring long-term stability for a moderately-complex module like Honeypot. Especially when refactoring a large part of the module, or porting a feature between major Drupal versions, automated test coverage has more than made up for the extra time spent creating the tests.

The Drupal community is ever-helpful

The other thing that's been an immense help throughout the development cycle is community involvement. Since Honeypot was one of the earliest modules with a stable Drupal 8 version (it's already seen 15 stable releases with 100% passing tests!), it's already used on many public Drupal 8 sites (over 80 at this point!). And this means there are users of the module invested in its success.

These early Drupal 8 adopters and other generous Drupal developers contributed code to fix a total of 12 of the hairiest issues during the D8 development cycle so far.

Come for the code, stay for the community; my experience porting Honeypot to Drupal 8 (the easy part), and chasing Drupal 8 HEAD for three years (the hard part) has again proven to me the truth of this catch phrase. I hope I can say thanks in person to at least some of the following Honeypot D8 contributors over the past three years:

2.5 years, and counting

Much has been written about core contributor burnout, but I wanted to give some credit and kudos to the army of dedicated contributed module maintainers who have already made the #D8CX pledge. A major reason for Drupal's success in so many industries is the array of contributed modules available.

The very long development cycle between major releases—coupled with the fact that many contrib maintainers are now supporting three major versions of their modules—means that contributed module maintainers are at risk for burning out too.

I'd really like to be able to focus more of my limited time for Honeypot development on new features again, especially since a few of these new features would greatly benefit the 55,000+ Drupal 6 and 7 websites already using the module today. But until we have a solid API freeze for Drupal 8.0.x, most of my time will be spent fixing tests and code just to keep Honeypot working with HEAD.

I'll be at DrupalCon LA, and I hope to do whatever small part I can to get Drupal 8.0.0 out the door—will you do the same?

Apr 16 2015
Apr 16

Previously, I posted my thoughts on the Acquia Certified Developer - Back End Specialist exam as well as my thoughts on the Certified Developer exam. To round out the trifecta of developer-oriented exams, I took the Front End Specialist exam this morning, and am posting some observations for those interested in taking the exam.

Acquia Certified Developer - Front End Specialist badge

My Theming Background

I started my Drupal journey working on design/theme-related work, and the first few Drupal themes I built were in the Drupal 5 days (I inherited some 4.7 sites, but I only really started learning how Drupal's front end worked in Drupal 5+). Luckily for me, a lot of the basics have remained the same (or at least similar) from 5-7.

For the past couple years, though, I have shied away from front end work, only doing as much as I need to keep building out features on sites like Hosted Apache Solr and Server Check.in, and making all my older Drupal sites responsive (and sometimes, mobile-first) to avoid penalization in Google's search rankings... and to build a more usable web :)

Exam Content

A lot of the questions on the exam had to do with things like properly adding javascript and CSS resources (both internal to your theme and from external sources), setting up theme regions, managing templates, and working with theme hooks, the render API, and preprocessors.

In terms of general styling/design-related content, there were few questions on actual CSS and jQuery coding standards or best practices. I only remember a couple questions that touched on breakpoints, mobile-first design, or responsive/adaptive design principles.

There were also a number of questions on general Drupal configuration and site building related to placing blocks, menus, rearranging content, configuring views etc. (which would all rely on a deep knowledge of Drupal's admin interface and how it interacts with the theme layer).

Results

On this exam, I scored an 86.66%, and (as with the other exams) a nice breakdown of all the component scores was provided in case I want to brush up on a certain area:

  • Fundamental Web Development Concepts : 90%
  • Theming concepts: 80%
  • Sub-theming concepts: 100%
  • Templates: 75%
  • Template functions: 87%
  • Layout Configuration: 90%
  • Performance: 80%
  • Security: 100%

Not too surprising, in that I hate using templates in general, and try to do almost all work inside process and preprocess functions, so my templates just print the markup they need to print :P

I think it's somewhat ironic that the Front End and general Developer exams both gave me pretty good scores for 'Fundamentals', yet the back-end exam (which would target more programming-related situations) gave me my lowest score in that area!

Summary

I think, after taking this third of four currently-available exams (the Site Builder exam is the only one remaining—and I'm planning on signing up for that one at DrupalCon LA), I now qualify for being in Acquia's Grand Master Registry, so yay!

If you'd like to take or learn about this or any of the other Acquia Certification exams, please visit the Acquia Certification Program overview.

Apr 10 2015
Apr 10

A little under a year ago, I took the Acquia Certified Developer exam at DrupalCon Austin, and posted Thoughts on the Acquia Drupal Developer Certification Exam. My overall thoughts on the idea of certifications for OSS like Drupal remain unchanged, so go read that previous post to hear them.

I wanted to post a little more about the additional certifications Acquia is now offering; in addition to the initial, more generalist-oriented Acquia Certified Developer Exam, Acquia now offers:

Earlier today, I took the Back End Specialist Exam, which focuses more specifically on things like Drupal's core API, general PHP syntax and style, secure code, content caching, debugging, and interacting with the Drupal community.

Acquia Certified Developer - Back End Specialist badge

Like the other certification exams, you get 90 minutes to complete the exam (60 questions total), and you have to take the exam either online or in a testing center with an active proctor. This time, I elected to take the exam on my own computer, which was a little more annoying than taking the exam in-person at a test center (as I did at DrupalCon last year).

Taking the online proctored exam

To prevent cheating, there are a few things you have to set up before you can start the exam. First, you have to install a 'Sentinel' app on your computer that basically takes control of everything (including webcam, microphone, screen, UI, etc.), which is a little off-putting (for privacy/security reasons). Then you have to position your webcam in such a way that the proctor can see your face, hands, and keyboard at all times.

It made me feel a little weird, in that scratching an itch or even stretching made me feel like I was about to be denied access or auto-flunked. Thankfully, I was only stopped once, when I was told to remove my headset at the beginning of the exam (I usually have it on when at my work desk, and I didn't realize headsets weren't allowed).

I felt a little bit more comfortable and relaxed when I took the exam at DrupalCon Austin, so I'm planning on taking at least one exam in person at DrupalCon LA in a month or so (you can register for one of the exams here!).

The exam

The exam was pretty well balanced with problems you'll face day-to-day in backend development. There were a few performance and caching-related questions geared a little more towards larger sites, and there were a couple CSS and JS-related questions that I felt would be more fitting in the Front End Specialist exam, but on the whole, the questions were challenging and unambiguous.

There were even two questions about very specific PHP coding standards that made my OCD tendencies very happy!

I completed this exam in about 55 minutes (a little more quickly than the general exam), and only had four questions to review at the end. The questions felt like they were half cut-and-dried, "here's some code, answer a specific question," and half user-story-like, "here's the situation, what would you do?"

I was a little disappointed there weren't any questions (at least, not that I recall) specific to configuring or running MySQL, PHP, or any other back-end components of a modern infrastructure stack... but maybe Acquia will add a DevOps or Infrastructure specialist exam soon. I can dream, can't I?

Results

Overall, I passed with an 88.33%, and the exam report provides a nice, detailed breakdown to highlight areas for improvement:

  • Fundamental Web Development Concepts: 75.00%
  • Drupal core API: 89.47%
  • Database Abstraction Layer: 83.33%
  • Debug code and troubleshooting: 75.00%
  • Theme Integration: 100.00%
  • Performance: 87.50%
  • Security: 100.00%
  • Leveraging Community: 100.00%

Apparently I need to brush up on 'Fundamental Web Development Concepts' (strange, because that's the area where I scored highest in the general exam from last year!).

Pricing

The price of this exam is $350, though I was able to take the exam free of charge as an Acquia employee. This exam is $100 more than the price of the general Developer exam, so if you haven't taken that exam yet, and are considering the Acquia Certification, I'd recommend taking the less expensive general exam first, then seeing if taking the Back End Specialist exam is worth it for you.

Summary

I've now taken two of four exams currently available, and I'm looking forward to taking the rest, possibly completing all the current ones at DrupalCon LA in a month!
