Feb 22 2016

I'll be hosting a Reddit AMA on the Drupal subreddit tomorrow morning, Monday February 22, starting at 10 a.m. Eastern / 9 a.m. Central.

During the AMA, I would love to hear any questions you have about Drupal VM, Honeypot, Ansible, writing, open source communities, or really anything else you can think of! I just wrapped up a big project last week, so I'll have a couple hours tomorrow to talk about anything and everything with the Drupal community on Reddit. Even horse-sized ducks and Legos, if you're so inclined.

I'll also be formally announcing the next major release of Drupal VM, with some amazing new features for local Drupal development, so please check in tomorrow morning!

Feb 04 2016

I wanted to document this here just because it took me a little while to get all the bits working just right so I could have a hierarchical taxonomy display inside a Facet API search facet, rather than a flat display of only the taxonomy terms directly related to the nodes in the current search.

Basically, I had a search facet on a search page that allowed users to filter search results by a taxonomy term, and I wanted it to show the taxonomy's hierarchy:

Flat taxonomy to hierarchical taxonomy display using Search API Solr and Facet API in Drupal 7

To do this, you need to do two main things:

  1. Make sure your taxonomy field is being indexed with taxonomy hierarchy data intact.
  2. Set up the Facet API facet for this taxonomy term so it will display the full hierarchy.

Let's first start by making sure the taxonomy information is being indexed (refer to the image below):

Search API Solr index Filters configuration for hierarchical taxonomy

  1. In Search API's configuration, edit the Filters for the search index you're using (e.g. /admin/config/search/search_api/index/[index]/workflow).
    1. Make sure the 'Index hierarchy' checkbox is checked.
    2. In the 'Index hierarchy' Callback settings (which appear after you check the box in step 1), scroll down and make sure you select 'Parent terms' and 'All parent terms' under the Taxonomy type you need to display hierarchically.
  2. Save the Filters configuration, then reindex all the content on your site (otherwise Solr won't have the updated hierarchy information).

Next, we need to edit the Facet API facet for this taxonomy:

  1. Go to the taxonomy Facet's configuration page (e.g. /admin/config/search/facetapi/search_api%40[index]/block/field_release/edit).
  2. Check the 'Expand hierarchy' checkbox under 'Display settings' (near the top of the form).
  3. Set 'Treat parent items as individual facet items' to 'No'.
  4. Set 'Flatten hierarchy' to 'No'.
  5. Set 'Minimum facet count' to 0 (to show all terms in the taxonomy).

After you've done that (make sure you reindexed your content!), you should have a nice hierarchical facet display.

Jan 21 2016

In a prior post on the constraints of in-home website hosting, I mentioned one of the major hurdles to serving content quickly and reliably over a home Internet connection is the bandwidth you get from your ISP. I also mentioned one way to mitigate the risk of DoSing your own home Internet is to use a CDN and host images externally.

At this point, I have both of those things set up for www.pidramble.com (a Drupal 8 site hosted on a cluster of Raspberry Pis in my basement!), and I wanted to outline how I set up Drupal 8 and CloudFlare so almost all requests to www.pidramble.com are served through CloudFlare directly to the end user!

CloudFlare Configuration

Before anything else, you need a CloudFlare account; the free plan offers all the necessary features (though you should consider upgrading to a better plan if you have anything beyond the simplest use cases in mind!). Visit the CloudFlare Plans page and sign up for a Free account.

Once there, you can add your site and use all the default settings for security, SSL, DNS, etc. You'll have to configure your website's DNS to point to CloudFlare, then CloudFlare will have some DNS records that point to your 'origin' (the server IP where your Drupal 8 site is running).

After all that's done, go to the Caching section and choose the 'Standard' level of caching, as well as 'Always Online' (so CloudFlare keeps your static site up even if your server goes down).

The most important part of the configuration is adding 'Page Rules', which will allow you to actually enable the cache for certain paths and bypass cache for others (e.g. site login and admin pages). Free accounts are limited to only 3 rules, so we have to be a bit creative to make the site fully cached but not accidentally lock ourselves out of it!

We'll need to add three rules total:

  1. A rule to 'cache everything' on www.pidramble.com/*
  2. A rule to 'bypass cache' on www.pidramble.com/user/login (allows us to log into the site)
  3. A rule to 'bypass cache' on www.pidramble.com/admin/* (allows content management and administration)

The free account 3-rule limitation means that we have to do a little trickery to bypass the cache on non-admin paths when we're working on the site. Otherwise, our options would be to have some sort of alternate URL for editing (e.g. edit.example.com) that bypasses CloudFlare, or turn off caching entirely while doing development work through the CloudFlare-powered URL!

One major downside to this approach: URLs like node/[id]/edit, if accessed by someone who is not logged in, will be cached in CloudFlare as a '403 - Access Denied' page, and then you won't be able to edit that content (even when logged in) unless you purge that path from CloudFlare or use one of the workarounds mentioned above.

www.pidramble.com CloudFlare caching rules for Drupal 8

For the three rules, set the following options (only the non-default options you should change are shown here):

'cache everything' on /*:

  • Custom caching: Cache everything
  • Edge cache expire TTL: Respect all existing headers
  • Browser cache expire TTL: 1 hour (adjust as you see fit)

'bypass cache' on /user/login:

  • Custom caching: Bypass cache
  • Browser cache expire TTL: 4 hours (adjust as you see fit)

'bypass cache' on /admin/*:

  • Custom caching: Bypass cache
  • Browser cache expire TTL: 4 hours (adjust as you see fit)
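As a rough illustration of how the three rules carve up the site, here is a minimal shell sketch that reports which rule a given path would fall under. The function name page_rule_for is made up for this example, and it assumes the two more specific 'bypass cache' rules take precedence over the catch-all rule; it does not talk to CloudFlare at all.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of CloudFlare or Drupal): print the Page Rule
# that would apply to a URL path, assuming the specific bypass rules win over
# the catch-all 'cache everything' rule.
page_rule_for() {
  case "$1" in
    /user/login) echo "bypass cache" ;;      # rule for /user/login
    /admin/*)    echo "bypass cache" ;;      # rule for /admin/*
    *)           echo "cache everything" ;;  # rule for /*
  esac
}
```

Running page_rule_for /node/1/edit prints 'cache everything', which is exactly the edge case described above: edit pages outside /admin/* are not covered by either bypass rule.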

Drupal 8 Configuration

To make sure CloudFlare (or any other reverse proxy you use) caches your Drupal site pages correctly, you need to make the following changes to your Drupal 8 site:

  1. Make sure the 'Internal Page Cache' module is enabled.
  2. Set a 'Page cache maximum age' on the Performance configuration page (/admin/config/development/performance).
  3. Add a few options to tell Drupal about your reverse proxy inside your settings.php file:

Inside sites/default/settings.php, add the following configuration to tell Drupal it is being served from behind a reverse proxy (CloudFlare), and also to make sure the trusted_host_patterns are configured:

<?php
// Reverse proxy configuration.
$settings['reverse_proxy'] = TRUE;
$settings['reverse_proxy_addresses'] = array($_SERVER['REMOTE_ADDR']);
$settings['reverse_proxy_header'] = 'HTTP_CF_CONNECTING_IP';
$settings['omit_vary_cookie'] = TRUE;

// Trusted host settings.
$settings['trusted_host_patterns'] = array(
  '^pidramble\.com$',
  '^.+\.pidramble\.com$',
);
?>
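To get a feel for which hostnames the two trusted host patterns accept, the sketch below tests a hostname against the same two expressions. The function name host_is_trusted is hypothetical, and grep -E is only a stand-in for Drupal's own regular expression matching, so treat it as an approximation for simple patterns like these.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the hostname matches either of the
# trusted host patterns from settings.php. grep -E approximates Drupal's
# regular expression matching for these simple patterns.
host_is_trusted() {
  printf '%s\n' "$1" | grep -Eq -e '^pidramble\.com$' -e '^.+\.pidramble\.com$'
}
```

For example, host_is_trusted www.pidramble.com succeeds, while host_is_trusted pidramble.com.example.org fails because both patterns are anchored at the end of the hostname.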

Once you've added this configuration, open another browser or an incognito browser session so you can access your site as an anonymous user. Click around on a few pages so CloudFlare gets a chance to cache your pages.

You can check that pages are being served correctly by CloudFlare by checking the HTTP headers returned by a request. The quickest way to do this is using curl --head in your terminal:

$ curl -s --head http://www.pidramble.com/ | grep CF
CF-Cache-Status: HIT
CF-RAY: 25bf2a08a7f425a3-ORD

If you see a value of HIT for the CF-Cache-Status, that means CloudFlare is caching the page. You should also notice it loads very fast now; for this site, I'm seeing the page load in < .3 seconds when cached through CloudFlare; it takes almost twice as long without CloudFlare caching!
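If you check this often, a small wrapper can pull out just the cache status. This is a minimal sketch; the function name cf_cache_status is made up for this example, and it only parses whatever headers it is given on standard input.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read HTTP response headers on stdin and print the value
# of the CF-Cache-Status header (e.g. HIT or MISS), or "none" if it is absent.
cf_cache_status() {
  awk -F': *' '
    tolower($1) == "cf-cache-status" {
      gsub(/\r/, "", $2)   # strip the carriage return HTTP headers end with
      print $2
      found = 1
      exit
    }
    END { if (!found) print "none" }
  '
}
```

It can be fed by the same command shown above, e.g. curl -s --head http://www.pidramble.com/ | cf_cache_status.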

Jan 16 2016

tl;dr: Drupal VM 2.2.0 'Wormhole' was released today, and it adds even more features for local dev!

Over the past few months, I've been working towards a more reliable release cadence for Drupal VM, and I've targeted one or two large features, a number of small improvements, and as many bugfixes as I have time to review. The community surrounding Drupal VM's development has been amazing; in the past few months I've noticed:

  • Lunchbox, a new Node.js-based app wrapper for Drupal VM for managing local development environments.
  • A mention of using Drupal VM + docker-selenium for running Behat tests with Chrome or Firefox, complete with automatic screenshots of test steps!
  • A great discussion about using Drupal VM with teams in the issue queue, along with a PR with some ideas in code.
  • A total of 27 individual contributors to Drupal VM (who have helped me work through 307 issues and 77 pull requests), along with hundreds of contributors for the various Ansible roles that support it.

Drupal VM is the fruit of a lot of open-source effort, and one of the things that I'm most proud of is the architecture—whereas many similar projects (whether they use Docker, Vagrant, or locally-installed software) maintain an 'island' of roles/plugins/configuration scripts within one large project, I decided to build Drupal VM on top of a few dozen completely separate Ansible roles, each of which serves an independent need, can be used for a variety of projects outside of Drupal or PHP-land, and is well tested, even in some cases on multiple platforms via Travis CI and Docker.

For example, the Apache and Nginx roles that Drupal VM uses are also used for many individuals' and companies' infrastructure, even if they don't use PHP at all! I'm happy to see that even some other VM-based Drupal development solutions use some of the roles as a foundation, because by sharing a common foundation, all of our tooling can benefit. It's kind of like Drupal using Twig, which benefits not only our community, but all the other PHP developers who are used to Twig!

If you want to kick the tires on Drupal VM (want to test Drupal 8 with Redis, PHP 7, Nginx, and MariaDB, or easily benchmark Drupal 8 on PHP 7 and HHVM?), follow the Quick Start Guide and let me know how it goes!

Dec 30 2015

I spent about an hour yesterday debugging a Varnish page caching issue. I combed the site configuration and code for anything that might be setting cache to 0 (effectively disabling caching), I checked and re-checked the /admin/config/development/performance settings, verifying the 'Expiration of cached pages' (page_cache_maximum_age) had a non-zero value and that the 'Cache pages for anonymous users' checkbox was checked.

After scratching my head a while, I realized that the headers I was seeing when using curl --head [url] were specified as the defaults in drupal_page_header(), and were triggered any time there was a message displayed on the page (e.g. via drupal_set_message()):

X-Drupal-Cache: MISS
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
X-Content-Type-Options: nosniff

On this particular site, the error_level was set to 1 to show all errors on the screen, and the page in question had a PHP error displayed on every page load.

After setting error_level to 0 ('None' on the /admin/config/development/logging page), Drupal sent the correct cache headers, Varnish was able to cache the page, and my sanity was restored.
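A quick way to spot this kind of problem is to look at the Cache-Control header directly. The sketch below is a rough heuristic only, not a model of Varnish's actual caching rules; the function name page_cacheability is made up for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read response headers on stdin and report whether the
# Cache-Control header contains a directive that typically prevents caching.
page_cacheability() {
  awk '
    tolower($0) ~ /^cache-control:/ {
      seen = 1
      if (tolower($0) ~ /no-cache|no-store|private/) blocked = 1
    }
    END {
      if (blocked)   print "not cacheable"
      else if (seen) print "cacheable"
      else           print "unknown"
    }
  '
}
```

Piping the output of curl -s --head [url] into page_cacheability prints 'not cacheable' for the headers shown above, since they include no-cache.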

Kudos especially to this post on coderwall, which jogged my memory.

Other potential reasons a page might not be showing as cacheable:

  • A form with a unique per-user token may be present.
  • An authenticated user is viewing the page (Drupal by default marks any page view with a valid session as no-cache).
  • Someone called \Drupal::service('page_cache_kill_switch')->trigger(); (Drupal 8), or drupal_page_is_cacheable(FALSE) (Drupal 7).
  • Some configuration file that's being included is either setting cache or page_cache_maximum_age to 0.
Dec 23 2015

[Multiple updates: I've added results for concurrencies of 1 and 10, results on bare metal vs. VMware instances, tested Drupal 8 vs Drupal 7 vs Wordpress 4.4, and I've also retested every single benchmark at least twice! Please make sure you've read through the entire post prior to contesting these benchmark results!]

tl;dr: Always test your own application, and trust, but verify, every benchmark you see. PHP 7 is actually faster than HHVM in many cases, neck-and-neck in others, and slightly slower in others. Both PHP 7 and HHVM blow PHP ≤ 5.6 out of the water.

Introduction and Methodology

As PHP 7 became a reality through this past year, there were scores of benchmarks pitting PHP 7 against 5.6 and HHVM using applications and frameworks like Drupal, Wordpress, Joomla, Laravel, October, etc.

One benchmark that really stood out to me (in that it seemed so wrong for Drupal, based on my experience) was The Definitive PHP 7.0 & HHVM Benchmark from Kinsta. Naming a benchmark that way certainly makes the general PHP populace take it seriously!

The results are pretty damning for PHP 7:

PHP 7 HHVM Definitive Benchmark screenshot by Kinsta

In the comments on that post, Thomas Svenson mentioned:

Standard installation for Drupal 8 has cache on as default. If you did not turn that off, then it is probably a reason to why the PHP 7 boost isn't bigger.

Would be interesting to see the result comparing the benchmark with/without caching enabled in Drupal 8. Should potentially reveal something interesting.

This was my main concern too, as there wasn't enough detail in the benchmarking article to determine what exactly was the system under test. Therefore, I'll submit my own PHP 7 vs HHVM benchmark here, using the following versions:

  • Ubuntu 14.04
  • Drupal 8.0.1
  • Nginx 1.4.6
  • MySQL 5.5.46
  • PHP 5.6.16, PHP 7.0.1, or HHVM 3.11.0

All tests were run using Drupal VM version 2.1.2 with VMware Fusion 8.1.0, on my mid-2013 MacBook Air 13" 1.7 GHz i7 with 8GB of RAM. Using the above notes, you can exactly replicate this benchmarking environment should you desire. All tests were run five times, the first two results were discarded (because they often reflect times when some caches are still warming), and the latter three were averaged.

After installing Drupal 8.0.1 with the standard installation profile (this is done automatically by Drupal VM), I logged in as the admin user (user 1), then grabbed the admin user's session cookie, and ran the following two commands:

# Benchmark Drupal 8 home page out of the box with default caching options enabled.
ab -n 750 -c 10 http://drupalvm.dev/

# Benchmark Drupal 8 /admin page logged in as user 1.
ab -n 750 -c 10 -C "SESSxyz=value" http://drupalvm.dev/admin
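The averaging step described above (five runs, drop the first two, average the rest) can be sketched in a few lines of shell. The function name average_warm_runs is made up for this example, and the numbers in the usage note are placeholders, not benchmark results.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read one requests/second figure per line on stdin,
# discard the first two (cache warm-up runs), and print the mean of the rest.
average_warm_runs() {
  awk '
    NR > 2 { sum += $1; n++ }
    END    { if (n > 0) printf "%.2f\n", sum / n }
  '
}
```

For example, printf '100\n200\n210\n220\n230\n' | average_warm_runs prints 220.00. The per-run figure can be taken from the 'Requests per second' line of each ab report.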

Drupal 8 results (concurrency 10)

| Environment | D8 Caching | Requests/second | Percent difference |
|---|---|---|---|
| PHP 5.6.16 | Enabled (home, anonymous) | 214.39 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 407.10 req/s | 62% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 260.19 req/s | 19% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 20.09 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 39.26 req/s | 65% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 34.41 req/s | 53% faster than 5.6 |

...and some graphs of the above data:

PHP 5.6, PHP 7, and HHVM running Drupal 8.0.1, cached

PHP 5, PHP 7, HHVM benchmark cached Drupal 8 home page request

PHP 5.6, PHP 7, and HHVM running Drupal 8.0.1, uncached

PHP 5, PHP 7, HHVM benchmark uncached Drupal 8 admin request

Drupal 8 results (concurrency 1)

Sometimes, the use of concurrency (-c 10 in the above case) to simulate concurrent users hitting the site at the same time can cause benchmarks to be slightly inaccurate. The reason I usually use a level of concurrency is so the benchmark more closely mirrors real-world usage, and tests the full stack a little better (because PHP by itself is nice to benchmark, but very few sites are run on top of PHP alone!).

Anyways, I re-ran all the tests using -c 1, and am publishing the results below:

| Environment | D8 Caching | Requests/second | Percent difference |
|---|---|---|---|
| PHP 5.6.16 | Enabled (home, anonymous) | 171.34 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 242.00 req/s | 34% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 192.92 req/s | 12% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 19.89 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 30.07 req/s | 41% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 23.37 req/s | 16% faster than 5.6 |

In all my benchmarking, I care more about deltas and reproducibility than measuring raw, clean-room-scenario performance, because unless a result is absolutely reproducible, it's of no value to me. Therefore if I can prove that there's no particular difference to testing with certain concurrency levels, I typically move the benchmark to a level that mirrors traffic patterns I actually see on my sites :)

Absolute numbers mean nothing to me—it's the comparison between test A and test B, and how reproducible that comparison is, that matters. That's why I enjoy benchmarking on the incredibly slow Raspberry Pi model 2 sometimes, because though it's much slower than my i7 laptop, it sometimes exposes surprising results!
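The 'Percent difference' figures in these tables appear to be symmetric percent differences (the gap between two values divided by their mean) rather than a simple increase over the PHP 5.6 baseline. That reading is an assumption based on the numbers themselves, and the function name percent_difference below is made up for this sketch.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the symmetric percent difference between two
# requests/second figures, rounded to a whole percent:
#   100 * |b - a| / ((a + b) / 2)
percent_difference() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    d = b - a
    if (d < 0) d = -d
    printf "%.0f\n", 100 * d / ((a + b) / 2)
  }'
}
```

For example, percent_difference 214.39 407.10 prints 62, matching the PHP 7.0.1 cached result in the concurrency 10 table.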

Drupal 8 Results ('bare metal', concurrency 1)

Some people argue that running benchmarks in a VM is highly unreliable and leads to incorrect benchmarks, so I've also sacrificed a partition of a Lenovo T420 Core i5 laptop (it has 3 SSDs inside, so I just formatted one), installed Ubuntu Desktop 15.10, then installed PHP, MySQL, and Nginx exactly the same as with Drupal VM (same settings, same apt repos, etc.), and re-ran all the tests in that environment: so-called 'bare metal', where there's absolutely no overhead from shared filesystems, the hypervisor, etc.

| Environment | D8 Caching | Requests/second | Percent difference |
|---|---|---|---|
| PHP 5.6.16 | Enabled (home, anonymous) | 152.35 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 230.67 req/s | 41% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 142.50 req/s | 7% slower than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 11.37 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 13.13 req/s | 14% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 11.40 req/s | 0.3% faster than 5.6 |

After running these benchmarks with an identical environment on 'bare metal' (e.g. a laptop with a brand new/fresh install of Ubuntu 15.10 running the same software, with 8 GB of RAM and an SSD), it seems HHVM for some reason performed even worse than PHP 5.6.16 for Drupal 8.

Since this result is wildly different than the Kinsta post (basically the opposite of their results for Drupal 8), I decided to test Wordpress 4.4 as well.

Wordpress 4.4 Results ('bare metal', concurrency 1)

For Wordpress, I ran the test using the exact same Lenovo T420 environment as the test above, and tested an anonymous user (no cookie value) hitting the default home page, and an admin logged in (using a valid session cookie—actually all five of the cookies wordpress uses to track valid sessions) visiting the admin Dashboard page (/wp-admin/index.php).

| Environment | WP 4.4 Caching | Requests/second | Percent difference |
|---|---|---|---|
| PHP 5.6.16 | Enabled (home, anonymous) | 18.76 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 40.45 req/s | 73% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 40.14 req/s | 73% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/wp-admin/index.php, admin) | 13.45 req/s | ~ |
| PHP 7.0.1 | Bypassed (/wp-admin/index.php, admin) | 28.10 req/s | 71% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/wp-admin/index.php, admin) | 35.43 req/s | 90% faster than 5.6 |

...and some graphs of the above data:

PHP 5.6, PHP 7, and HHVM running Wordpress 4.4, anonymous home page

PHP 5 7 and HHVM benchmark comparison of Wordpress 4.4 home page anonymous

PHP 5.6, PHP 7, and HHVM running Wordpress 4.4, admin dashboard

PHP 5 7 and HHVM benchmark comparison of Wordpress 4.4 admin dashboard

These results highlight to me how much the particular project's architecture influences the benchmark. Wordpress still uses a traditional quasi-functional-style design, while Drupal 8 is heavily invested in OOP and a bit more formal data architecture. While I'm not as familiar with Wordpress's quirks as I am with Drupal's, I know that it's no speed demon, and also benefits from added caching layers in front of the site! It's interesting to see that PHP 7 and HHVM are practically neck-and-neck for front-facing portions of Wordpress (and FAR faster than 5.6), while HHVM runs even a little faster than PHP 7 for administrative tasks.

Drupal 7 Results (concurrency 10)

I also benchmarked Drupal 7 on Drupal VM for another point of comparison (using -c 10):

| Environment | D7 Caching | Requests/second | Percent difference |
|---|---|---|---|
| PHP 5.6.16 | Enabled (home, anonymous) | 511.40 req/s | ~ |
| PHP 7.0.1 | Enabled (home, anonymous) | 736.90 req/s | 36% faster than 5.6 |
| HHVM 3.11.0 | Enabled (home, anonymous) | 585.71 req/s | 14% faster than 5.6 |
| PHP 5.6.16 | Bypassed (/admin, user 1) | 93.78 req/s | ~ |
| PHP 7.0.1 | Bypassed (/admin, user 1) | 169.95 req/s | 57% faster than 5.6 |
| HHVM 3.11.0 | Bypassed (/admin, user 1) | 143.25 req/s | 42% faster than 5.6 |

For these tests, I went to the Performance configuration page prior to running the tests, and enabled anonymous page cache, block cache, and CSS and JS aggregation (to make D7 match up to D8 cached anonymous user results a little more evenly).

Some people point out benchmarks like these and say "Drupal 8 is slow"... and they're right, of course. But Drupal 8 trades performance for better architecture, much more pluggability, and the inclusion of many more essential 'out-of-the-box' features than Drupal 7, so there's that. Having built a few Drupal 8 sites, I don't ever want to go back to 7 again—but it's nice to know that PHP 7 can still accelerate all my existing D7 sites quite a bit!

Summary

tl;dr: For Drupal 7 and Drupal 8 at least, PHP 7 takes the performance crown—by a wide margin.

After running the benchmarks, I scratched my head, because almost every other benchmark I've seen either puts HHVM neck-and-neck with PHP 7 or makes it seem HHVM is still the clear victor. Maybe other people running these benchmarks didn't have PHP's opcache turned on? Maybe something else was missing? Not sure, but if you'd like to reproduce the SUT and find any results different than the above (in terms of percentages), please let me know!

I ran the HHVM benchmarks three times with fresh new VM instances just because I was surprised PHP 7 stepped out in front. PHP 5.6's performance is as expected... it's better than 5.3, but that's not saying much :)

The moral of the story: Trust, but verify... especially for benchmarks that compare a plethora of totally different applications, since each result can tell a completely different story depending on the test process and system under test! Please run your own tests with your own application before definitively stating that one server is faster than another.

Installing HHVM in Drupal VM

Just for posterity, since I want people to be able to reproduce the steps exactly, here's the process I used after using Drupal VM's default config.yml (with Ubuntu 14.04) to build the VM:

  1. Log into Drupal VM with vagrant ssh
  2. $ sudo su
  3. # service php5-fpm stop
  4. # apt-get install -y python-software-properties
  5. # curl http://dl.hhvm.com/conf/hhvm.gpg.key | apt-key add -
  6. # add-apt-repository http://dl.hhvm.com/ubuntu
  7. # apt-get update && apt-get install -y hhvm
  8. # update-rc.d hhvm defaults
  9. # /usr/share/hhvm/install_fastcgi.sh
  10. # vi /etc/nginx/sites-enabled/drupalvm.dev.conf and inside the location ~ \.php$|^/update.php block:
    1. Clear out the contents of this configuration block.
    2. Replace with include hhvm.conf;
  11. # service hhvm restart
  12. # service nginx restart

Visit the /admin/reports/status/php page after logging in to confirm you're running HHVM instead of PHP.

Dec 15 2015

One of the motivations behind Drupal VM is flexibility in local development environments. When you develop many different kinds of Drupal sites you need to be able to adapt your environment to the needs of the site—some sites use Memcached and Varnish, others use Solr, and yet others cache data in Redis!

Drupal VM has recently gained much more flexibility in that it now allows configuration options like:

  • Choose either Ubuntu or CentOS as your operating system.
  • Choose either Nginx or Apache as your webserver.
  • Choose either MySQL or MariaDB for your database.
  • Choose either Memcached or Redis as a caching layer.
  • Add on extra software like Apache Solr, Node.js, Ruby, Varnish, Xhprof, and more.

Out of the box, Drupal VM installs Drupal 8 on Ubuntu 14.04 with PHP 5.6 (the most stable release as of December 2015) and MySQL. We're going to make a few quick changes to config.yml so we can run the following local development stack on top of CentOS 7:

Drupal VM - Drupal 8 status report page showing Nginx, Redis, MariaDB, and PHP 7

Configure Drupal VM

To get started, download or clone a copy of Drupal VM, and follow the Quick Start Guide, but before you run vagrant up (step 2, #6), edit config.yml and make the following changes/additions:

# Update vagrant_box to use the geerlingguy/centos7 box.
vagrant_box: geerlingguy/centos7

# Update drupalvm_webserver to use nginx instead of apache.
drupalvm_webserver: nginx

# Make sure 'redis' is listed in installed_extras, and memcached, xhprof, and
# xdebug are commented out.
installed_extras:
  [ ... ]
  - redis

# Switch the PHP version to "7.0".
php_version: "7.0"

# Add the following variables to the end of the file to make sure the PhpRedis
# extension is compiled to run with PHP 7.
php_redis_install_from_source: true
php_redis_source_version: php7

# Add the following variables to the 'MySQL Configuration' section to make sure
# the MariaDB installation works correctly.
mysql_packages:
  - mariadb
  - mariadb-server
  - mariadb-libs
  - MySQL-python
  - perl-DBD-MySQL
mysql_daemon: mariadb
mysql_socket: /var/lib/mysql/mysql.sock
mysql_log_error: /var/log/mariadb/mariadb.log
mysql_syslog_tag: mariadb
mysql_pid_file: /var/run/mariadb/mariadb.pid

To make Drupal use Redis as a cache backend, you have to include and enable the Redis module on your site. The official repository on Drupal.org doesn't currently have a Drupal 8 branch, but there's a fork on GitHub that currently works with Drupal 8. We need to add that module to the drupal.make.yml make file. Add the following just after the line with devel:

  devel: "1.x-dev"
  redis:
    download:
      type: git
      url: https://github.com/md-systems/redis.git
      branch: 8.x-1.x

Run vagrant up, and wait for everything to install inside the VM. After a bit, you can visit http://drupalvm.dev/, and log in (username admin and password admin). Go to the 'Extend' page, and enable the Redis module.

Once the module is enabled, you'll need to follow the Redis module's installation guide to make Drupal actually use Redis instead of MariaDB for persistent caching. The basic steps are:

  1. Create a new file services.yml inside the Drupal 8 codebase's sites/default folder, with the following contents:

    services:
      cache_tags.invalidator.checksum:
        class: Drupal\redis\Cache\RedisCacheTagsChecksum
        arguments: ['@redis.factory']
        tags:
          - { name: cache_tags_invalidator }

  2. Open sites/default/settings.php and add the following to the end of the file:

    $settings['redis.connection']['interface'] = 'PhpRedis';
    $settings['redis.connection']['host'] = '127.0.0.1';
    $settings['cache']['default'] = 'cache.backend.redis';

Once you've made those changes, go to the performance page (http://drupalvm.dev/admin/config/development/performance) and click the 'Clear all caches' button. If you log into the VM (vagrant ssh), then run the command redis-cli MONITOR, you can watch Drupal use Redis in real-time; browse the site and watch as Redis reports all its caching data to your screen.

Benchmarking Redis, PHP 7, and Drupal 8

These are by no means comprehensive benchmarks, but the results are easily reproducible and consistent. I used ApacheBench (ab) to simulate a single authenticated user requesting the /admin page as quickly as possible.

# ApacheBench command used:
ab -n 750 -c 10 -C "SESSxyz=value" http://drupalvm.dev/admin

With these settings, Drupal VM's CPU usage was pegged at 200%, and it reported the following results (averaged over three runs):

Cache location   PHP version   Requests/second   Percent difference
MariaDB          5.6.16        21.86 req/s       ~
Redis            5.6.16        21.34 req/s       2% slower
MariaDB          7.0.0         30.32 req/s       32% faster
Redis            7.0.0         34.64 req/s       45% faster

Assuming you're using PHP 7, there's approximately a 13% performance boost using a local Redis instance rather than a local database to persist Drupal 8's cache. This falls in line with my findings in a related project, when I was building a cluster of Raspberry Pis to run Drupal 8 and found Redis to speed things up by about 15%!
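The post doesn't spell out how the percentages were calculated. As a rough sketch, a symmetric percent difference (the gap divided by the mean of the two values) appears to reproduce the figures above; the function name and the choice of formula are my assumptions, not something stated in the original benchmarks.

```python
def percent_difference(baseline: float, value: float) -> float:
    """Symmetric percent difference: the gap divided by the mean of both values."""
    return (value - baseline) / ((value + baseline) / 2) * 100


# Requests/second figures taken from the benchmark results above.
mariadb_php56 = 21.86
results = {
    "Redis, PHP 5.6.16": 21.34,
    "MariaDB, PHP 7.0.0": 30.32,
    "Redis, PHP 7.0.0": 34.64,
}

# Compare each configuration against the MariaDB + PHP 5.6 baseline.
for label, req_per_sec in results.items():
    print(f"{label}: {percent_difference(mariadb_php56, req_per_sec):+.0f}%")

# Redis compared with MariaDB when both run on PHP 7.
print(f"Redis vs. MariaDB on PHP 7: {percent_difference(30.32, 34.64):+.0f}%")
```

Running your own numbers through the same function makes it easy to compare results from a different environment on the same footing.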

It's odd that PHP 5.6 benchmarks showed a very slight performance decrease when using Redis, but I'm wondering if that's because the PhpRedis extension had some optimizations in its php7 branch that weren't present in the older compiled versions.

It's important to run your own benchmarks in your own environment, to make sure the performance optimizations are worth the extra applications running on your infrastructure... and that they're actually helping your Drupal site run better, not worse!

Summary

I hope Drupal VM can help you build a great local development environment. I use it for every Drupal project I work on, and I have even taken to using it as a base for single-server Drupal infrastructure as needed, removing roles and settings I don't need and enabling the extra security settings. It has served me well.

If there's anything you see missing from Drupal VM that would make your local Drupal development experience easier, please take a look in the issue queue and let me know what else you'd like to see!

Dec 09 2015
Dec 09

I was recently futzing around with a Drupal site that has a fairly complex theme setup, and which relies on npm/gulp to set up and build the theme assets. One time after not touching the project for a couple weeks, when I came back and ran the gulp command again, I got the following error:

/path/to/node_modules/node-sass/lib/extensions.js:158
    throw new Error([
          ^
Error: The `libsass` binding was not found in /path/to/node_modules/node-sass/vendor/darwin-x64-14/binding.node
This usually happens because your node version has changed.
Run `npm rebuild node-sass` to build the binding for your current node version.
    at Object.sass.getBinaryPath (/path/to/node_modules/node-sass/lib/extensions.js:158:11)
    at Object.<anonymous> (/path/to/node_modules/node-sass/lib/index.js:16:36)
    at Module._compile (module.js:460:26)
    at Object.Module._extensions..js (module.js:478:10)
    at Module.load (module.js:355:32)
    at Function.Module._load (module.js:310:12)
    at Module.require (module.js:365:17)
    at require (module.js:384:17)
    at Object.<anonymous> (/path/to/node_modules/gulp-sass/index.js:176:21)
    at Module._compile (module.js:460:26)

And I think that the line "This usually happens because your node version has changed." was exactly right. I manage npm and my Node.js install using Homebrew, and I remember having updated recently; it went from 4.x to 5.x. I also remembered that the project itself was supposed to be managed with 0.12!

So first things first, I installed nvm and used it to switch back to Node.js 0.12:

brew install nvm
source $(brew --prefix nvm)/nvm.sh
nvm install 0.12

Then, I re-ran npm install inside the theme directory to make sure I had all the proper versions/dependencies. But I was still getting the error. So I found this answer on Stack Overflow, which suggested rebuilding node-sass with:

npm rebuild node-sass

After that ran a couple minutes, gulp worked again, and I could move along with development on this particular Drupal site. Always mind your versions for Node.js!

Nov 27 2015
Nov 27

Recently I had to upgrade someone's Apache Solr installation from 1.4 to 5.x (the current latest version), and for the most part, a Solr upgrade is straightforward, especially if you're doing it for a Drupal site that uses the Search API or Solr Search modules, as the solr configuration files are already upgraded for you (you just need to switch them out when you do the upgrade, making any necessary customizations).

However, I ran into the following error when I tried loading the core running Apache Solr 4.x or 5.x:

org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported (resource: MMapIndexInput(path="/var/solr/cores/[corename]/data/spellchecker2/_1m.cfx") [slice=_1m.fdx]): 1 (needs to be between 2 and 3). This version of Lucene only supports indexes created with release 3.0 and later.

To fix this, you need to upgrade your index using Solr 3.5.0 or later, then you can upgrade to 4.x, then 5.x (using each version of Solr to upgrade from the previous major version):

  1. Run locate lucene-core to find your Solr installation's lucene-core.jar file. In my case, for 3.6.2, it was named lucene-core-3.6.2.jar.
  2. Find the full directory path to the Solr core's data/index.
  3. Stop Solr (so the index isn't being actively written to).
  4. Run the command to upgrade the index: java -cp /full/path/to/lucene-core-3.6.2.jar org.apache.lucene.index.IndexUpgrader -delete-prior-commits -verbose /full/path/to/data/index

It will take a few seconds for a small index (hundreds of records), or a bit longer for a huge index (hundreds of thousands of records), and then once it's finished, you should be able to start Solr again using the upgraded index. Rinse and repeat for each version of Solr you need to upgrade through.

If you have directories like index, spellchecker, spellchecker1, and spellchecker2 inside your data directory, run the command over each subdirectory to make sure all indexes are updated.
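If there are several index directories, a small script can build the command for each one. This is only a sketch: it assumes every subdirectory of the core's data directory is a Lucene index, the function names are my own, and the paths in the usage note are the same placeholders used in the steps above. It defaults to a dry run that just prints the commands.

```python
import subprocess
from pathlib import Path


def upgrade_commands(data_dir, lucene_jar):
    """Build one IndexUpgrader command per subdirectory of data_dir."""
    commands = []
    for index_dir in sorted(Path(data_dir).iterdir()):
        if not index_dir.is_dir():
            continue  # Skip stray files; only directories are treated as indexes.
        commands.append([
            "java", "-cp", str(lucene_jar),
            "org.apache.lucene.index.IndexUpgrader",
            "-delete-prior-commits", "-verbose",
            str(index_dir),
        ])
    return commands


def upgrade_all(data_dir, lucene_jar, dry_run=True):
    """Print each command, and run it only when dry_run is False."""
    for command in upgrade_commands(data_dir, lucene_jar):
        print(" ".join(command))
        if not dry_run:
            subprocess.run(command, check=True)
```

With Solr stopped, something like upgrade_all("/full/path/to/data", "/full/path/to/lucene-core-3.6.2.jar") prints the commands; pass dry_run=False to actually run them.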

For more info, see the IndexUpgrader documentation, and the Stack Overflow answer that instigated this post.

Nov 17 2015
Nov 17

Drupal 8 Logo

On November 19, the St. Louis Drupal Users Group is having a party to celebrate the release of Drupal 8, which has been 4 years in the making! The party will be hosted at Spry Digital in downtown St. Louis, and will have beer provided by Manifest, food and drinks provided by Spry, and a Raspberry Pi 2 model B giveaway sponsored by Midwestern Mac!

Drupal 8.0.0 has been built by over 3,000 contributors in all corners of the globe, and will help kick off the next generation of personalized, content-driven websites. During the meetup, we'll build a brand new Drupal 8 site on the Raspberry Pi using Jeff Geerling's Drupal Pi project, and we'll highlight some of the awesome new features of Drupal 8.

Raspberry Pi and Acquia dancing man

After we build one of the first Drupal 8 sites, we'll give away the Raspberry Pi to a lucky winner to take home and tinker with! Special thanks to the Austin Drupal Users Group, who came up with the Pi giveaway idea!

We'll also eat, drink and be merry, celebrating the start of a new era of site building with the best version of Drupal yet!

If you'd like to join us, please RSVP on the STLDUG Meetup page: STLDUG Drupal 8.0.0 Release Party.

Oct 19 2015
Oct 19

Ansible is a simple, but powerful, server and configuration management tool. Ansible for DevOps is a book I wrote to teach you to use Ansible effectively, whether you manage one server or thousands.

Ansible for DevOps cover - Book by Jeff Geerling

I've spent a lot of time working with Ansible and Drupal over the past couple years, culminating in projects like Drupal VM (a VM for local Drupal development) and the Raspberry Pi Dramble (a cluster of Raspberry Pi computers running Drupal 8, powering http://www.pidramble.com/). I've also given multiple presentations on Ansible and Drupal, like a session at DrupalCon Austin, a session at MidCamp earlier this year, and a BoF at DrupalCon LA.

Ansible for DevOps includes a few different examples of Drupal deployments specifically, and many examples pertaining to LAMP-based infrastructure management. In the next few months, I'm finally going to publish posts I've had in the wings about using Ansible for Drupal infrastructure management, beginning with one of my simplest and most fun projects, the Drupal Pi.

Check for it soon on Drupal Planet!

Purchase Ansible for DevOps on LeanPub, Amazon, or iTunes.

Aug 21 2015
Aug 21

[Update 2015-08-25: I reran some of the tests using two different settings in VirtualBox. First, I explicitly set KVM as the paravirtualization mode (it was saved as 'Legacy' by default, due to a bug in VirtualBox 5.0.0), which showed impressive performance improvements, making VirtualBox perform 1.5-2x faster, and bringing some benchmarks to a dead heat with VMware Fusion. I also set the virtual network card to use 'virtio' instead of emulating an Intel PRO/1000 MT card, but this made little difference in raw network throughput or any other benchmarks.]

My Mac spends the majority of the day running between one and a dozen VMs. I do all my development (besides iOS or Mac dev) running code inside VMs, and for many years I used VirtualBox, a free virtualization tool, along with Vagrant and Ansible, to build and manage all these VMs.

Since I build and rebuild dozens of VMs per day, and maintain a popular Vagrant configuration for Drupal development (Drupal VM), as well as dozens of other VMs (like Ansible Vagrant Examples), I am highly motivated to find the fastest and most reliable virtualization software for local development. I switched from VirtualBox to VMware Fusion (which requires a for-pay plugin) a year ago, as a few benchmarks I ran at the time showed VMware was 10-30% faster.

Since VirtualBox 5.0 was released earlier this year, I decided to re-evaluate the two VM solutions for local web development (specifically, LAMP/LEMP-based Drupal development, but most of these benchmarks apply to any dev workflow).

I benchmarked the raw performance bits (CPU, memory, disk access) as well as some 'full stack' scenarios (load testing and per-page load performance for some CMS-driven websites). I'll present each benchmark, some initial conclusions based on the result, and the methodology I used for each benchmark.

The key question I wanted to answer: Is purchasing VMware Fusion and the required Vagrant plugin ($140 total!) worth it, or is VirtualBox 5.0 good enough?

Baseline Performance: Memory and CPU

I wanted to make sure VirtualBox and VMware could both do basic operations (like copying memory and performing raw number crunching in the CPU) at similar rates; both should pass through as much of this performance as possible to the underlying system, so numbers should be similar:

Memory and CPU benchmark - VirtualBox and VMware Fusion

VMware and VirtualBox are neck and neck when it comes to raw memory and CPU performance, and that's to be expected these days, as both solutions (as well as most other virtualization solutions) are able to use features in modern Intel processors and modern chipsets (like those in my MacBook Air) to their fullest potential.

CPU or RAM-heavy workloads should perform similarly, though VMware Fusion has a slight edge.

Methodology - CPU/RAM

I used sysbench for the CPU benchmark, with the command sysbench --test=cpu --cpu-max-prime=20000 --num-threads=2 run.

I used Memory Bandwidth Benchmark (mbw) for the RAM benchmark, with the command mbw -n 2 256 | grep AVG, and I used the MEMCPY result as a proxy for general RAM performance.

Baseline Performance: Networking

More bandwidth is always better, though most development work doesn't rely on a ton of bandwidth being available. A few hundred megabits should serve web projects in a local environment quickly.

Network throughput benchmark - VirtualBox and VMware Fusion

This is one of the few tests in which VMware really took VirtualBox to the cleaners. It makes some sense, as VMware (the company) spends a lot of time optimizing VM-to-VM and VM-to-network-interface throughput since their products are more often used in production environments where bandwidth matters a lot, whereas VirtualBox is much more commonly used for single-user or single-machine purposes.

Having 40% more bandwidth available means VMware should be able to perform certain tasks, like moving files between host/VM, or your network connection (if it's fast enough) and the VM, or serving hundreds or thousands of concurrent requests, with much more celerity than VirtualBox—and we'll see proof of this fact with a Varnish load test, later in the post.

Methodology - Networking

To measure raw virtual network interface bandwidth, I used iperf, and set the VM as the server (iperf -s), then connected to it and ran the benchmark from my host machine (iperf -c drupalvm.dev). iperf is an excellent tool for measuring raw bandwidth, as no non-interface I/O operations are performed. Tests such as file copies can have irregular results due to filesystem performance bottlenecks.

Disk Access and Shared/Synced Folders

One of the largest performance differentiators—and one of the most difficult components to measure—is filesystem performance. Virtual Machines use virtual filesystems, or connect to folders on the host system via some sort of mounted share, to provide a filesystem the guest OS uses.

Filesystem I/O performance is impossible to measure simply and universally, because every use case (e.g. media streaming, small file reads, small file writes, or database access patterns) benefits from different types of file read/write performance.

Since most filesystems (and even the slowest of slow microSD cards) are fast enough for large file operations (reading or writing large files in large chunks), I decided to benchmark one of the most brutal metrics of file I/O, 4k random read/write performance. For many web applications and databases, common access patterns either require hundreds or thousands of small file reads, or many concurrent small write operations, so this is a decent proxy of how a filesystem will perform under the most severe load (e.g. reading an entire PHP application's files from disk, when opcaches are empty, or rebuilding key-value caches in a database table).
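To make the access pattern concrete, here's a rough Python sketch of 4k random reads against a scratch file. It is not a substitute for a real benchmarking tool: the helper names are hypothetical, the file size and read count are arbitrary, and the reads go through the operating system's page cache, so the number it prints will be inflated compared to true disk performance.

```python
import os
import random
import tempfile
import time

BLOCK_SIZE = 4096  # 4k records


def make_test_file(directory, size_mb=8):
    """Create a scratch file of random bytes to read back (hypothetical helper)."""
    fd, path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        f.write(os.urandom(size_mb * 1024 * 1024))
    return path


def random_read_throughput(path, reads=2000, seed=0):
    """Read `reads` randomly placed 4k blocks from `path`; return MB/s."""
    blocks = os.path.getsize(path) // BLOCK_SIZE
    rng = random.Random(seed)
    start = time.perf_counter()
    with open(path, "rb", buffering=0) as f:
        for _ in range(reads):
            # Jump to a random block boundary, then read one block.
            f.seek(rng.randrange(blocks) * BLOCK_SIZE)
            f.read(BLOCK_SIZE)
    elapsed = time.perf_counter() - start
    return (reads * BLOCK_SIZE) / elapsed / 1_000_000


with tempfile.TemporaryDirectory() as directory:
    path = make_test_file(directory)
    print(f"{random_read_throughput(path):.0f} MB/s")
```

Pointing the directory at a synced folder, an NFS mount, or the VM's native filesystem gives a feel for the relative differences, though cache effects will dominate small files like this one.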

I measured 4k random reads and writes across three different VM scenarios: first, using the VM's native share mechanism (or 'synced folder' in Vagrant parlance), second, using NFS, a common and robust network share mechanism that's easy to use with Vagrant, and third, reading and writing directly to the native VM filesystem:

Disk or drive random access benchmark - VirtualBox and VMware Fusion

The results above, as with all other benchmarks in this post, were repeated at least four times, with the first result set discarded. Even then, the standard deviation on these benchmarks was typically 5-10%, and the benchmarks were wildly different depending on the exact benchmark I used.
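The averaging approach is simple enough to sketch in a few lines. The numbers below are made up purely for illustration (they are not results from these benchmarks), and the function name is my own.

```python
from statistics import mean, stdev


def summarize_runs(results):
    """Discard the first (warm-up) run, then average the rest."""
    kept = results[1:]
    average = mean(kept)
    # Standard deviation expressed as a percentage of the mean.
    spread = stdev(kept) / average * 100 if len(kept) > 1 else 0.0
    return average, spread


# Made-up throughput numbers (MB/s), for illustration only.
runs = [18.0, 24.0, 26.0, 25.0]
average, spread = summarize_runs(runs)
print(f"mean: {average:.1f} MB/s, std dev: {spread:.1f}% of mean")
```

Dropping the first run keeps one-time warm-up costs from skewing the average, and reporting the spread alongside the mean shows how much trust to put in small differences.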

I was able to reproduce the strange I/O performance numbers in Mitchell Hashimoto's 2014 post when I didn't use direct filesystem access to do reads and writes; certain benchmarks suggest the VM filesystem is capable of over 1 GB/sec of random 4K reads and writes! Speaking of which, running the same benchmarks on my MacBook Air's internal SSD showed maximum performance of 1891 MB/s read, and 389 MB/s write.

Passing the -I option to the iozone benchmarking tool makes sure the tests bypass the VM's disk caching mechanisms that mask the actual filesystem performance. Unfortunately, this parameter (which uses O_DIRECT filesystem access) doesn't work with native VM shares, so those numbers may be a bit inflated over real-world performance.

The key takeaway? No matter the filesystem you use in a VM, raw file access is an order of magnitude slower than native host I/O if you have a fast SSD. Luckily, the raw performance isn't horrendous (as long as you're not copying millions of tiny files!), and common development access patterns help filesystem and other caches speed up file operations.

Methodology - Disk Access

I used iozone to measure disk access, using the command iozone -I -e -a -s 64M -r 4k -i 0 -i 2 [-f /path/to/file]. I also repeated the tests numerous times with different -s values ranging from 128M to 1024M, but the performance numbers were similar with any value.

If you're interested in diving deeper into filesystem benchmarking, iozone's default set of tests is much broader and applicable across a very wide range of use cases (besides typical LAMP/LEMP web development).

Full Stack - Drupal 7 and Drupal 8

When it comes down to it, the most meaningful benchmark is a 'full stack' benchmark, which tests the application I'm developing. In my case, I am normally working on Drupal-based websites, so I wanted to test both Drupal 8 and Drupal 7 (the current stable release) in two scenarios—a clean install of Drupal 8 (with nothing extra added), and a fairly heavy Drupal 7 site, to mirror some of the more complicated sites I have to work with.

First, here's a comparison of 'requests per second' with VirtualBox and VMware. Higher numbers are better, and this test is a decent proxy for how fast the VM is rendering specific pages, as well as how many requests the full stack/server can serve in a short period of time:

Drupal 8 requests per second benchmark - VirtualBox and VMware Fusion

The first two benchmarks are very close. When your application is mostly CPU-and-RAM-constrained (Drupal 8 is running almost entirely out of memory using PHP's opcache and MySQL caches), both virtualization apps are about the same, with a very slight edge going to VMware Fusion.

The third graph is more interesting, as it shows a large gap: VMware can serve up 43% more traffic than VirtualBox. When you compare this graph with the raw network throughput graph above, it's obvious VMware Fusion's network bandwidth is the reason it can serve so many more requests/sec for a network-constrained benchmark like Varnish capacity.

Developing a site with frequently-changing code requires more disk I/O, since the opcache has to be rebuilt from disk, so I tested raw page load times with a fresh PHP thread:

Page load performance for Drupal 7 and 8 - VirtualBox and VMware Fusion

For this test, I restarted Apache entirely between each page request, which wiped out PHP's opcache, causing all the PHP files to be read from the disk. These benchmarks were run using an NFS share, so the main performance increase here (over the load test in the previous benchmark) comes from VMware's slightly faster NFS shared filesystem performance.

In real world usage, there's a perceptible performance difference between VirtualBox and VMware Fusion, and these benchmarks confirm it.

Many people decide to use native synced folders because file permissions and setup can often be simpler, so I wanted to see how much not using NFS affects these numbers:

Page load performance for Drupal 7 and 8 with different synced folder methods - VirtualBox and VMware Fusion

As it turns out, NFS has a lot to offer in terms of performance for apps running in a shared folder. Another interesting discovery: VMware's native shared folder performs nearly as well as the ideal scenario in VirtualBox (running the codebase on an NFS mount).

I still highly recommend using NFS instead of native shared folders if you're sharing more than a few files between host and guest.

Methodology - Full Stack Performance

I used ab, wrk, and curl to run performance benchmarks and simple load tests:

  • Drupal anonymous cached page load: wrk -d 30 -c 2 http://drupalvm.dev/
  • Drupal authenticated page load: ab -n 500 -c 2 -C "SESS:COOKIE" http://drupalvm.dev/ (used the uid 1 user session cookie)
  • Varnish anonymous proxied page load: wrk -d 30 -c 2 http://drupalvm.dev:81/ (a cache lifetime value of '15 minutes' was set on the performance configuration page)
  • Drupal 8 front page uncached: time curl --silent http://drupalvm.dev/ > /dev/null (run once after clicking 'Clear all caches' on the admin/config/development/performance page, averaged over six runs)
  • Large Drupal 7 site views/panels page request: time curl --silent http://local.example.com/path > /dev/null (run once after clicking 'Clear all caches' on the admin/config/development/performance page, averaged over six runs)

Drupal 8 tests were run with a standard profile install of a Drupal 8 site (ca. beta 12) on Drupal VM 2.0.0, and Drupal 7 tests were run using a very large scale Drupal codebase, with over 150 modules.

Summary

I hope these benchmarks help you to decide if VMware Fusion is right for your Vagrant-based development workflow. If you use synced folders a lot and need as much bandwidth as possible, choosing VMware is a no-brainer. If you don't, then VirtualBox is likely 'fast enough' for your development workflow.

It's great to have multiple great choices for VM providers for local development—and in this case, the open source option holds its own against the heavyweight proprietary virtualization app!

Methodology - All Tests

Since I detest when people post benchmarks but don't describe the system under test and all their reasons behind testing things certain ways, I thought I'd explicitly outline everything here, so someone else with the time and materials could replicate all my test results verbatim.

  • I ran all benchmarks four times (with the exception of some of the disk benchmarks, which I ran six times for better coverage of random I/O variance), discarded the first result, and averaged the remaining results.
  • All tests were run using an unmodified copy of Drupal VM version 2.0.0, with all the example configuration files (though all extra installations besides Varnish were removed), using the included Ubuntu 14.04 LTS minimal base box (which is built using this Packer configuration, the same for both VirtualBox and VMware Fusion).
  • For full stack Drupal benchmarking for Varnish-cached pages, I logged into Drupal and set a minimum cache lifetime value of '15 minutes' on the performance configuration page, and for authenticated page loads, I used the session cookie for the logged in uid 1 user.
  • All tests were run on my personal 11" Mid 2013 MacBook Air, with a 1.7 GHz Intel Core i7 processor, 8 GB of RAM, and a 256 GB internal SSD. The only other applications (besides headless VMs and Terminal) that were open and running during tests were Mac OS X Mail and Sublime Text 3 (in which I noted benchmark results).
  • All tests were performed with my Mac disconnected entirely from the Internet (WiFi disabled, and no connection otherwise), to minimize any strange networking problems that could affect performance.

Jul 31 2015
Jul 31

I've been working with Drupal 8 for a long time, keeping Honeypot and some other modules up to date, and doing some dry-runs of migrating a few smaller sites from Drupal 7 to Drupal 8, just to hone my D8 familiarity.

Raspberry Pi Dramble Drupal 8 Website

I finally launched a 'for real' Drupal 8 site, which is currently running on Drupal 8 HEAD—on a cluster of Raspberry Pi 2 computers in my basement! You can view the site at http://www.pidramble.com/, and I've already started posting some articles about running Drupal 8 on the servers, how I built the cluster, some of the limitations of at-home webhosting, etc.

Some of the things I've already learned from building and running this cluster for the past few days:

  • Drupal 8 (just core, alone) is awesome. Building out simple sites with zero contributed modules, and no custom code, is a real possibility in Drupal 8. Drupal 7 will never feel the same again :(
  • Drupal 8 is finally fast; not super fast, but fast enough. And with some recent cache stampede protections that have been added, Drupal 8 is running much more stable in my testing—stable enough that I was finally comfortable launching a site on Drupal 8 on these Raspberry Pis!
  • My (very) limited upload bandwidth isn't yet an issue. I only have 4-5 Mbps up, and as long as I host most images externally, serving up tiny 8-10 KB resources for normal page loads allows for a pretty large amount of traffic without a hiccup. Or, more importantly, without interfering with my day-to-day Internet use as a work-from-home employee!
  • It's really awesome being able to see the live traffic to the servers using the LEDs on the front. See for yourself: Nginx Load Balancer Visualization w/ LEDs. It's fun watching live traffic a few feet away from my desk, especially when I do things like tweet the URL (immediately following, I can see all the requests come in from Twitter-related bots!).

I'm hoping to continue writing about my experiences with Drupal 8 (especially on the Pi cluster), etc. in the next few weeks, both here and elsewhere!

Jul 28 2015
Jul 28

After some more tinkering with the Raspberry Pi Dramble (a cluster of 6 Raspberry Pis used to demonstrate Drupal 8 deployments using Ansible), I finally was able to get the RGB LEDs to react to Nginx accesses—meaning every time a request is received by Nginx, the LED toggles to red momentarily.

This visualization allows me to see exactly how Nginx is distributing requests among the servers in different load balancer configurations. The default (not only for Nginx, but also for Varnish, HAProxy, and other balancers) is to use round-robin distribution, meaning each request is sent to the next server. This is demonstrated first, in the video below, followed by a demonstration of Nginx's ip_hash method, which pins one person's IP address to one backend server, based on a hash of the person's IP address:
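Apart from the video, the difference between the two methods can be sketched in a few lines of Python. This is an illustration of the general idea only: the backend names are hypothetical, and the SHA-256 hash here is a stand-in I chose, not the hash function Nginx actually uses for ip_hash.

```python
import hashlib
from itertools import cycle

BACKENDS = ["pi-1", "pi-2", "pi-3", "pi-4"]  # hypothetical backend names


def round_robin(backends):
    """Return a picker that hands each request to the next backend in turn."""
    rotation = cycle(backends)
    return lambda client_ip: next(rotation)


def ip_hash(backends):
    """Return a picker that pins each client IP to one backend via a hash."""
    def pick(client_ip):
        digest = hashlib.sha256(client_ip.encode()).digest()
        return backends[int.from_bytes(digest[:4], "big") % len(backends)]
    return pick


rr = round_robin(BACKENDS)
pinned = ip_hash(BACKENDS)

# Three requests from the same client: round-robin moves on each time,
# while the hash-based picker keeps returning the same backend.
for _ in range(3):
    print(rr("203.0.113.7"), pinned("203.0.113.7"))
```

With round-robin the first column changes on every request, which is what makes the LEDs light up in sequence; with the hash-based method one visitor keeps lighting up the same Pi.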

It's fun to be able to visualize things like Drupal deployments, Nginx requests, etc., on this cluster of Raspberry Pis, and in addition to a presentation on Ansible + Drupal 8 at MidCamp, and Ansible 101, I'll be showing the Dramble in a soon-to-be-released episode of Jam's Drupal Camp from Acquia—stay tuned!

Jul 21 2015
Jul 21

On many Drupal 7 sites, I have encountered issues with Emoji (mostly) and other special characters (rarely) when importing content from social media feeds, during content migrations, and in other situations, so I finally decided to add a quick blog post about it.

Have you ever noticed an error in your logs complaining about incorrect string values, with an emoji or other special character, like the following:

PDOException: SQLSTATE[HY000]: General error: 1366 Incorrect string value: '\xF0\x9F\x98\x89" ...' for column 'body_value' at row 1: INSERT INTO {field_data_body} (entity_type, entity_id, revision_id, bundle, delta, language, body_value, body_summary, body_format) VALUES (:db_insert_placeholder_0, :db_insert_placeholder_1, :db_insert_placeholder_2, :db_insert_placeholder_3, :db_insert_placeholder_4, :db_insert_placeholder_5, :db_insert_placeholder_6, :db_insert_placeholder_7, :db_insert_placeholder_8); Array ( [:db_insert_placeholder_0] => node [:db_insert_placeholder_1] => 538551 [:db_insert_placeholder_2] => 538550 [:db_insert_placeholder_3] => story [:db_insert_placeholder_4] => 0 [:db_insert_placeholder_5] => und [:db_insert_placeholder_6] => <p>????</p> [:db_insert_placeholder_7] => [:db_insert_placeholder_8] => filtered_html ) in field_sql_storage_field_storage_write() (line 514 of /drupal/modules/field/modules/field_sql_storage/field_sql_storage.module).

To fix this, you need to switch the affected MySQL table's encoding to utf8mb4, and also switch any table columns ('fields', in Drupal parlance) which will store Emojis or other exotic UTF-8 characters. This will allow these special characters to be stored in the database, and stop the PDOExceptions.
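The reason for the error is easier to see by looking at the bytes. MySQL's older utf8 character set stores at most three bytes per character, while an emoji needs four. Here's a quick Python sketch; the helper name is my own, and the character is the one whose bytes appear in the error above.

```python
emoji = "\U0001F609"  # winking face; its bytes match the error message above

encoded = emoji.encode("utf-8")
print(encoded)       # b'\xf0\x9f\x98\x89'
print(len(encoded))  # 4


def needs_utf8mb4(text):
    """True if any character takes more than three bytes in UTF-8."""
    return any(len(ch.encode("utf-8")) > 3 for ch in text)


print(needs_utf8mb4("plain text"))      # False
print(needs_utf8mb4("hello " + emoji))  # True
```

A check like this can be handy during a migration or feed import to find out ahead of time which content will trip over a table that hasn't been converted yet.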

Using Sequel Pro on a Mac, this process is relatively quick and painless:

  1. Open the affected tables (in the above case, field_data_body, and the corresponding revision table, field_revision_body), and click on the 'Table info' tab.
  2. In the 'Encoding' menu, switch from "UTF-8 Unicode (utf8)" to "UTF-8 Unicode (utf8mb4)". This will take a little time for larger data sets.
  3. Switch over to the 'Structure' tab, and for each field which will be storing data (in our case, the body_value and body_summary fields), choose "UTF-8 Unicode (utf8mb4)" under the 'Encoding' column. This will take a little time for larger data sets.

After converting the affected tables, you will also need to patch Drupal 7 to make sure the MySQL connection uses the correct encoding. Apply the latest patch from the issue Drupal 7 MySQL does not support full UTF-8, and add the following keys to your default database connection settings:

$databases = array(
  'default' => array(
    'default' => array(
      'database' => 'database',
      'username' => 'username',
      'password' => 'password',
      'host' => '127.0.0.1',
      'driver' => 'mysql',
      // Add default charset and collation for mb4 support.
      'charset' => 'utf8mb4',
      'collation' => 'utf8mb4_general_ci',
    ),
  ),
);

That issue is actually a child issue of MySQL driver does not support full UTF-8, which has already been fixed in Drupal 8 (which now requires MySQL 5.5.3 or later as a result). It may take a little time for the problem to get an 'official' fix in Drupal 7, since it's a complicated problem that requires a delicate touch—we don't want a bunch of people's sites to go belly up because some contributed modules are using large VARCHAR columns, or because their hosting provider is running an old version of MySQL!

There's also a handy table_converter module for Drupal 7, which helps you automate the process of converting tables to the new format. It still requires the core patch mentioned above, but it can help smooth out the process of actually converting the tables to the new format.

Once you've fixed the issue, you won't be quite as annoyed next time you see one of these guys: 😉

Jul 20 2015
Jul 20

I build and destroy a lot of VMs using Vagrant in the course of the day. Between developing Drupal VM, writing Ansible for DevOps, and testing dozens of Ansible Galaxy roles, I probably run vagrant up and vagrant destroy -f at least a dozen times a day.

Building all these VMs would be a pain, and require much more user intervention, if it weren't for a few things I've done on my local workstation to help with the process. I thought I'd share these tips so you can enjoy a much more streamlined Vagrant workflow as well!

Extremely helpful Vagrant plugins

None of my projects require particular Vagrant plugins—but many, like Drupal VM, will benefit from adding at least one venerable plugin, vagrant-hostsupdater. Every time you start or shut down a VM with Vagrant, the relevant hosts entries will be placed in your system's hosts file, without requiring you to do anything manually. Great time-saver, and highly recommended! To install: vagrant plugin install vagrant-hostsupdater
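If you set up workstations with a script, you might only want to install the plugin when it's missing. Here's a minimal sketch; the helper name plugin_listed is hypothetical, and it assumes vagrant plugin list prints one plugin per line in the form "name (version)".

```shell
#!/bin/sh
# Hypothetical helper: reads `vagrant plugin list` output on stdin and
# succeeds if a line starts with the given plugin name followed by a space.
plugin_listed() {
  grep -q -- "^$1 "
}

# Example usage: install vagrant-hostsupdater only if it isn't already there.
# vagrant plugin list | plugin_listed vagrant-hostsupdater \
#   || vagrant plugin install vagrant-hostsupdater
```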

Another plugin that many people have used to provide the fastest filesystem synchronization support is vagrant-gatling-rsync, which uses an rsync-based sync mechanism similar to the one built into Vagrant, but is much faster and less resource-intensive on your host machine.

Helpful modifications to /etc/sudoers

One major downside to using the vagrant-hostsupdater plugin, or to using NFS mounts (which are much faster than native shares in either VirtualBox or VMWare Fusion), is that you have to enter your sudo password when you build and destroy VMs. You can avoid this gotcha by adding the following lines to your /etc/sudoers configuration (then quit and restart your Terminal session so the new settings are picked up):

# Vagrant configuration.
# Allow Vagrant to manage NFS exports.
Cmnd_Alias VAGRANT_EXPORTS_ADD = /usr/bin/tee -a /etc/exports
Cmnd_Alias VAGRANT_NFSD = /sbin/nfsd restart
Cmnd_Alias VAGRANT_EXPORTS_REMOVE = /usr/bin/sed -E -e /*/ d -ibak /etc/exports
# Allow Vagrant to manage hosts file.
Cmnd_Alias VAGRANT_HOSTS_ADD = /bin/sh -c echo "*" >> /etc/hosts
Cmnd_Alias VAGRANT_HOSTS_REMOVE = /usr/bin/sed -i -e /*/ d /etc/hosts
%admin ALL=(root) NOPASSWD: VAGRANT_EXPORTS_ADD, VAGRANT_NFSD, VAGRANT_EXPORTS_REMOVE, VAGRANT_HOSTS_ADD, VAGRANT_HOSTS_REMOVE

Important note: If you're editing sudoers by hand, make sure you edit the file with sudo visudo instead of just editing it in your favorite editor. This ensures the file is valid when you save it, so you don't get locked out from sudo on your system!

This configuration works out of the box on Mac OS X, and only needs slight modifications to make sure it works on Linux distributions (make sure the 'admin' group is changed to whatever group your user account is in).

I've even wrapped up the configuration of /etc/sudoers into my Mac Development Ansible Playbook, so I can automatically ensure all my Macs are configured for an optimal Vagrant experience!

SSH keys inside your VM

If you want to use your SSH credentials inside a Vagrant-powered VM, you can turn on SSH agent forwarding by adding the following line inside your Vagrantfile:

  config.ssh.forward_agent = true

Drupal VM includes agent forwarding by default, so you can build your VM, log in, and work on Git projects, log into remote servers, use drush, etc., just as you would on your host computer.

Note that I usually don't have forwarding enabled in my own environments, as I treat Vagrant VMs strictly as sandboxed development environments. If I install some software for testing inside the VM as the vagrant user, I don't want it to be able to use my SSH credentials to do anything nefarious! Generally that won't happen, but I like erring on the side of caution.

Summary

What are some of your favorite tips and tricks for Vagrant-based workflows? Any other tricks you know of to solve common pain points (e.g. using the vagrant-vbguest plugin if you have issues with native shares or guest additions)?

Jul 03 2015
Jul 03

Drupal VM - Vagrant and Ansible Virtual Machine for Drupal Development

For the past couple years, I've been building Drupal VM to be an extremely-tunable, highly-performant, super-simple development environment. Since MidCamp earlier this year, the project has really taken off, with almost 200 stars on GitHub and a ton of great contributions and ideas for improvement (some implemented, others rejected).

In the time since I wrote Developing for Drupal with Vagrant and VMs, I've focused on meeting all my defined criteria for the perfect local development environment. And now, I'm able to say that I use Drupal VM when developing all my projects—as it is now flexible and fast enough to emulate any production environment I use for various Drupal projects.

Easy PHP 7 testing with CentOS 7 and MariaDB

After a few weeks of work, Drupal VM now officially supports running PHP 7 (currently, 7.0.0 alpha 2) on CentOS 7 with MariaDB, or you can even tweak the settings to compile PHP from source yourself (following the PHP role's documentation).

Doing this allows you to see how your own projects will fare when run with the latest (and fastest) version of PHP. Drupal 8 performance improves dramatically under PHP 7, and most other PHP applications will have similar gains.

Read PHP 7 on Drupal VM for more information.

Other major improvements and features

Here are some of the other main features that have recently been added or improved:

  • Flexible database support: MySQL, MariaDB, or (soon) Percona are all supported out of the box, pretty easily. See the guide Use MariaDB instead of MySQL.
  • Flexible OS support: Drupal VM officially supports Ubuntu 14.04, Ubuntu 12.04, CentOS 7, or CentOS 6 out of the box; other OSes like RHEL, Fedora, Arch and Debian may also work, but are not supported. See: Using different base OSes.
  • Use with any Drupal deployment methodology — works with any dev workflow, including Drush make files, local Drupal codebases, and multisite installations.
  • Automatic local drush alias configuration
  • 'Batteries included' — developer utilities and essentials like Varnish, Solr, MailHog, XHProf are easy to enable or disable.
  • Production-ready, security-hardened configuration you can install on DigitalOcean
  • Thoroughly-documented — check out the Drupal VM Wiki on GitHub
  • First class support for any host OS — Mac, Linux or Windows
  • Drupal version agnostic — works great with 6, 7, or 8.
  • Easy configuration of thousands of parameters (powered by a few dozen component-specific Ansible roles) through the config.yml file.

I'd especially like to thank the dozens of people who have filed issues against the project to add needed functionality or fix bugs (especially for multi-platform, multi-database support!), and have helped improve Drupal VM through over 130 issues and 17 pull requests.

There are dozens of other VM-based or Docker/container-based local development solutions out there, and Drupal VM is one of many, but I think that—even if you don't end up using it for your own work—you will find sound ideas and best practices in environment configuration in the project.

Jun 21 2015
Jun 21

DrupalCamp St. Louis 2015 was held this past weekend, June 20-21, 2015, at SLU LAW in downtown St. Louis. We had nine sessions and a great keynote on Saturday, and a full sprint day on Sunday.

DrupalCamp St. Louis 2015 Registration
The view coming off the elevators at SLU LAW.

Every session was recorded (slides + audio), and you can view all the sessions online:

The Camp went very well, with almost sixty participants this year! We had a great time, learned a lot together, and enjoyed some great views of downtown St. Louis (check out the picture below!), and we can't wait until next year's DrupalCamp St. Louis (to be announced)!

PS Thug Life St. Louis - Jeff Geerling and Mike Ryan
A candid shot of myself and Mike Ryan, 'the Drupal Migrate guy' who lives near St. Louis.

High Performance Drupal

geerlingguy delivers presentation on High Performance Drupal at DrupalCamp St. Louis 2015
Yours truly, talking about Drupal and Performance.

I delivered a session titled High Performance Drupal, going over performance planning, benchmarking, and easy performance wins. You can click the link in the previous line to see more details, watch the session video on YouTube, or view the slides from the presentation on SlideShare.

Check out more from DrupalCamp St. Louis 2015 on the official camp website: DrupalCamp St. Louis 2015.

Jun 08 2015
Jun 08

The organizers of DrupalCamp St. Louis 2015 are excited to announce that the schedule is set for DrupalCamp STL.15; we will have sessions from a variety of presenters on a variety of topics—for both beginners and seasoned veterans alike!

DrupalCamp 2015 St. Louis - SLU LAW

Some of the great sessions lined up include a session on Git basics, the status of Migrate in Drupal 8, content strategy, securing Drupal, improving performance, improving search, Twig, and more! To kick it off, we'll have an awesome keynote from Alina Mackenzie (alimac) about getting involved in the Drupal Community.

Check out the sessions: DrupalCamp St. Louis 2015 Session Schedule.

Register for DrupalCamp STL.15 today, and build your schedule on the site—besides these excellent sessions, you'll get a tasty catered lunch, a comfy t-shirt, and some great memories and networking opportunities on both days of the Camp!

May 29 2015
May 29

DrupalCamp STL.15 (June 20-21, in St. Louis, MO) will be the first DrupalCamp in St. Louis with a day dedicated to sprints to help the Drupal community. We're expecting a great turnout, and there are already a number of proposed sessions (many of which will be selected and announced on June 5!), and it's not yet too late to propose a session of your own!

DrupalCamp 2015 St. Louis - SLU LAW

This year's keynote, by Alina Mackenzie, will focus on the Drupal Community—what it is, why it rocks, and how you can get involved in the community. After the keynote, some great sessions, a tasty lunch, happy hour, and a good night's rest, we'll spend sprint day (Sunday June 21) making Drupal better, and maybe even pushing Drupal 8 a little closer to an 8.0.0 rc1 release!

Registration is now open, so go reserve your spot at DrupalCamp St. Louis 2015; I'll see you there, hopefully at one of the sessions I proposed, either on High Performance Drupal, or Local Development Environments and Drupal VM!

May 21 2015
May 21

DrupalCamp 2015 St. Louis - SLU LAW

DrupalCamp St. Louis is scheduled for June 20-21, 2015, and will be held at SLU LAW in downtown St. Louis, MO. With the Camp less than a month away, there are a few important bits of news:

DrupalCamp STL.15 Keynote Speaker: Alina Mackenzie (alimac)

Alina Mackenzie is a developer and system administrator based in Chicago. In the Drupal community she is a camp organizer, speaker and communications lead for DrupalCon mentored sprints. She is passionate about learning organizations, automation, and making open source friendly for beginners.

Alina's keynote will focus on "Finding the entrance: Why and how to get involved with the Drupal community".

Alina's Drupal.org profile is https://www.drupal.org/u/alimac

Session Submission Deadline: May 29

Please submit your session proposals by Friday, May 29—just over a week from today! We'll notify speakers on June 5th whether a session was accepted or not.

We hope to see you at DrupalCamp St. Louis 2015! Registration will open next Monday, and sessions will be announced on June 5th.

May 13 2015
May 13

We had a great discussion about how different companies and individuals are using Ansible for Drupal infrastructure management and deployments at DrupalCon LA, and I wanted to post some slides from my (short) intro to Ansible presentation here, as well as a few notes from the presentation.

The slides are below:

And video/audio from the BoF:

[embedded content]

Notes from the BoF

I first gave an overview of the basics of Ansible, demonstrating some Ad-Hoc commands on my Raspberry Pi Dramble (a cluster of six Raspberry Pi 2 computers running Drupal 8), then we dove headfirst into a great conversation about Ansible and Drupal.

Raspberry Pi Dramble - Hero
The Raspberry Pi #Dramble

Some notes from that discussion:

  • There are now many different local and production open source environment stacks built with Ansible, like Drupal VM, DevShop, Pubstack, Valkyrie, and Vlad.
  • Many companies are using Ansible as an infrastructure management tool, but sticking with tools like Cobbler, Bower, etc. for actual code deployment. Some people also use Ansible for deployment, but it really depends on the project/team's needs.
  • A lot of people liked (especially in comparison to tools like Chef and Puppet) how approachable and straightforward Ansible is; instead of taking days or weeks to get up to speed, you can dive right into Ansible and start using it in a day.
  • Connor Krukowsky has Drupal 8 running on his 8-core rooted Android phone!

Discount on Ansible for DevOps

I'm almost finished writing Ansible for DevOps, and you can purchase it now from LeanPub and keep getting updates as I continue writing—here's a coupon code for half off!

Summary

It was a great BoF, and I hope we can keep the discussion going about how different teams are using Ansible with Drupal infrastructure, and how we can all help each other through shared projects, roles, and techniques!

And maybe I'll finally get back to my work on a drush module for Ansible ;)

May 13 2015
May 13

After taking the trifecta of Acquia Developer Certification (General, Back-end, Front-end) exams and earning a new black 'Grand Master' sticker, I decided to complete the gauntlet and take the Acquia Certified Drupal Site Builder Exam at DrupalCon LA.

Acquia Certified Drupal Site Builder - 2015

Taking the test in Acquia's testing center was a welcome reprieve from taking the exams online. There's much less of a 'big brother' feel when you don't have a 'sentinel' application running on your computer and a camera focusing on your face the entire time. Also, the exam room is nice and quiet, and has a good 'library' vibe to it.

Exam Content

The site builder exam is, in many ways, the most straightforward of the Drupal certification exams. Most of the scenarios are very cut-and-dried, and there are only 50 questions on the test (as opposed to 60 for the other exams).

There are a few questions that made me think a bit. Most are presented as general scenarios, just like the other exams, with a list of solutions from which you pick the best, and many are things that I've encountered or been asked about on a project in the past (e.g. "User johndoe tried doing X, but got an error... how do I give johndoe the ability to do X"). There were also a decent number of questions asking about how to set up a view and/or block correctly to display relevant information for a given scenario.

One oddity was the number of aggregator-related questions on the exam. I can count on one finger the number of times I've used aggregator.module, and I've built hundreds of Drupal sites. I think I had three or four aggregator-related questions, and it felt a little strange (though those questions were straightforward enough I could answer them without much familiarity with aggregator).

I think the exam could use a tiny bit more expansion into contrib-powered site building. Maybe a question or two on panels-based layout, flag, organic groups, or some of the other more popular contrib modules that a seasoned Drupal site builder would need to use on a larger project.

Results

On this exam, I scored a 92%, and (as with the other exams) a nice breakdown of all the component scores was provided in case I want to brush up on a certain area:

  • Drupal Features: 80.00%
  • Content and User Management: 100.00%
  • Content Modeling: 91.66%
  • Site Display: 90.00%
  • Community and Contributed Projects: 100.00%
  • Module and Theme Management: 75.00%
  • Security and Performance: 100.00%

Again, I like how the exam gives a breakdown of each area of strength/weakness. It helps me to validate areas where I could improve my skills through workshops, books, research, etc.

Summary

On the whole, the exam hits on most of the right bits of Drupal core + Views, and gives a good set of questions to evaluate how good you may be at site building in general. It's a simpler, less 'development'-focused exam in comparison to the other exams, and would be great for those wishing to validate their general site building skills.

Apr 29 2015
Apr 29

Many blog posts have outlined the benefits of using VMs (Virtual Machines) for local Drupal development instead of either using native PHP and Apache, or a bundled environment like MAMP, XAMPP, or Acquia Dev Desktop. The advantages of using virtualization (usually managed by Vagrant) are numerous, but in certain cases, you can make a good argument for sticking with the traditional solutions.

If you'd like to take the dive and start using virtualized development environments, or if you're already using Vagrant and VirtualBox or some other VM environment (e.g. VMWare Fusion or Parallels Desktop), how do you optimize local development, and which pre-bundled Drupal development VM will be best for you and your team?

Criteria for the Perfect Local Development Environment

These are the criteria I use when judging solutions for local Drupal development (whether virtualized or traditional):

  • Should be simple and easy to set up
  • Should be fast by default
  • Should be flexible:
    • Should work with multiple providers; VirtualBox is free, but VMWare can be much faster!
    • Should allow configuration of the PHP version.
    • Should work with your preferred development workflow (e.g. drush, makefiles, manual database sync, etc.)
    • Should prevent filesystem friction (e.g. permissions issues, slow file access speeds, etc.)
    • Shouldn't have hardcoded defaults
  • Should be complete:
    • Should work without requiring a bunch of extra plugins or 3rd party tools
    • No extra languages or libraries should be required (why install Ruby gems, npm modules, etc. unless you need them for your particular project?)
  • Should be Free and Open Source
  • Should include all the tools you need, but allow you to disable whatever you don't need (e.g. XHProf, Apache Solr, etc.)
  • Should work on Windows, Mac, and Linux with minimal or no adjustment
  • Should be deployable to production (so your local dev environment matches prod exactly)

A lot of these points may have more or less importance to a particular team or individual developer. If you're a die-hard Mac user and don't ever work with any developers on Windows or Linux, you don't need to worry about Windows support. But some of these points apply to everyone, like being fast, simple, and flexible.

If you're looking for a way to improve team-based Drupal development, all these bullet points apply. If your entire team is going to standardize on something, you should standardize on something that gives everyone the standard layout that's required, but the flexibility to work with each developer's environment and preferred development tools.

Announcing Drupal VM

I built Drupal VM over the past two years for my local Drupal development needs, and continue to improve it so it meets all the above criteria.

Drupal VM is a local development environment that works with a variety of Drupal site development workflows with minimal friction. Whether a site is built via drush makefiles, uses a 'codebase-in-a-git-repo' approach, or is built with install profiles and drush commands, it works with Drupal VM. Drupal VM also includes all the tools I need in my day-to-day development, and even installs helpful software like Apache Solr, Memcache, and MailHog.

Another common scenario I have as a contrib module maintainer and core contributor is my need for a quick, fresh Drupal environment where I can run Drupal 8, 7 or 6 HEAD and hack on core or one of my contrib modules (like Honeypot). Drupal VM is preconfigured to install a fresh copy of Drupal 8 for local hacking, but it's easy to configure it to run whatever Drupal site and configuration you like!

Since Drupal VM has been helpful to other developers, I've made it more flexible, built a simple marketing page (at www.drupalvm.com), and polished up the documentation on the Drupal VM Wiki. I'm continuing to improve Drupal VM as I get time, adding features like:

  • Ability to choose between Nginx and Apache for the webserver.
  • Ability to deploy to DigitalOcean, Linode, or AWS with the same (but security-hardened) configuration as your local environment.
  • Ability to add Varnish or Nginx as a reverse-proxy cache.

Drupal VM has also been a fun project to work on while writing Ansible for DevOps. My work on Drupal VM allows me to flex some Ansible muscle and work on a large number of Ansible Galaxy roles (like geerlingguy.php and geerlingguy.solr) that are used by Drupal VM—in addition to hundreds of other projects not related to Drupal!

A VM for Everyone

Drupal VM is my weapon of choice... but there are many great projects with similar features:

Alternatively, if you know how to use Puppet, Chef, Ansible, or SaltStack, and want to fork and develop your own alternative dev environment, or build one on your own, that's always an option! Especially if you have a highly specialized production environment, it may be best to reflect that environment with a more specialized local development environment.

On Docker and LXC (Container-based environments)

Before I wrap up, I wanted to also specifically call out some projects like Drocker and the next-generation Drupal.org testbot infrastructure project, DrupalCI, both of which are using Docker containers for local development. Containerized development environments offer many of the same benefits as virtualization, but can be faster to build and rebuild, and easier to maintain.

Container-based infrastructure is likely going to become standard in the next 5-10 years (much like VM-based infrastructure has become standard in the past 5-10 years)—whether with Docker or some other standard format/methodology (a container's just a container!).

Many hosting platforms use a container-everywhere approach, like:

  • Platform.sh
  • Pantheon
  • Google Container Engine
  • Amazon EC2 Container Service

However, I caution that container-based development has its own complexities, especially in production, and especially with more complicated web applications like Drupal. I also caution against blindly running other people's pre-built container images in production; you should build them and manage them on your own (just like I build and manage my own VM images using Packer, e.g. packer-ubuntu-1404).

In Summary

In short, I've been working on Drupal VM for the past couple years, and I've made it flexible enough for the variety of Drupal sites I work on. I hope it's flexible enough for your development needs, and if not, open an issue and I'll see what I can do!

Apr 27 2015
Apr 27

Almost three years ago, on Feb 19, 2013, I opened the 8.x-dev branch of the Honeypot module (which helps prevent form spam on thousands of Drupal sites). These were heady times in the lifetime of the then-Drupal 8.x branch; 8.0-alpha1 wasn't released until three months later, on May 19. I made the #D8CX pledge—when Drupal 8 was released, I'd make sure there was a full, stable Honeypot release ready to go.

Little did I know it would be more than 2.5 years—and counting—before I could see that promise through to fruition!

As months turned into years, I've kept to the pledge, and eventually decided to also port a couple other modules that I use on many of my own Drupal sites, like Wysiwyg Linebreaks and Simple Mail.

Two years ago, I mentioned in the original Honeypot D8 conversion issue that I'd likely write a blog post "about the process of porting a moderately-complex module like this from D7 to D8". Well, I finally had some time to write that post, and I'm still wondering how far off the release of Drupal 8.0.0 will be!

Ch-Ch-Changes

When working on the initial port, and when opening a new issue almost on a monthly basis to rework parts of the module to keep up with Drupal 8 core changes, I would frequently read through all the new nodes posted to the list of Change records for Drupal core.

These change records are like the Bible for translating 'how do I do Y in Drupal 8 when I did X in Drupal 7?' Most of the change records have fitting examples, contain a good amount of detail, and link back to the one, two, or ten issues that caused the particular change record to be written.

However, there were a few that were in a sorry state; these change records didn't have references back to all the relevant Drupal core issues, or only provided contrived examples that didn't help me much. In these cases, I took the following approach:

  1. Try to find the git commits that caused Honeypot tests or code to fail (e.g. with git blame).
  2. Find the issue(s) referenced by the breaking commits.
  3. Read through the issue summary and see if it helps figure out how to fix my code.
  4. If that doesn't help, read through the commit itself, then the code that was changed, and see if that helps.
  5. If that doesn't help, read the entire issue comment history to see if that helps.
  6. If that doesn't help, pop over to the ever-helpful #drupal-contribute IRC channel.
  7. (The most important part) Go back to the deficient change record and edit it, adding appropriate issue references, code examples and documentation.
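The first step in the list above can be partly scripted. This is just a sketch of one way to do it, assuming you're inside a git checkout of Drupal core; the helper name commits_touching is hypothetical, and drupal_set_message is only an illustrative search string. It uses git log's -S option (the "pickaxe"), which lists commits that changed the number of occurrences of a string.

```shell
#!/bin/sh
# Hypothetical helper: list commits that added or removed occurrences of a
# string. Any further arguments limit the search to those paths.
commits_touching() {
  needle=$1
  shift
  git log --oneline -S"$needle" -- "$@"
}

# Example usage, from inside a core checkout:
# commits_touching drupal_set_message core/includes
```

Core commit messages generally start with the issue number, so the one-line output is usually enough to jump straight to step 2.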

In the course of the 71 distinct Honeypot 8.x commits that have been added so far, I had to go all the way to numbers 5 and 6 quite often. If it weren't for the incredible helpfulness of people like webchick, tim.plunkett, and others who seem to be living change record references, I would've probably given up the endeavor to keep Honeypot's Drupal 8 branch up to date the past three years!

Automated tests are a pain to maintain... but help immensely

The Drupal 7 version of Honeypot had almost complete SimpleTest coverage for primary module functionality. One of the first steps in porting the module to Drupal 8—and the best way to make sure all the primary functionality was working correctly—was to port the tests to Drupal 8.

There have been dozens of automated testing changes in Drupal 8 that have caused tests to fail or give unexpected results. This caused some frustration in figuring out whether a particular failure was due to failing code or changes to the testing API.

Even with the small frustrations of broken tests every month or two, the test coverage is a huge help in ensuring long-term stability for a moderately-complex module like Honeypot. Especially when refactoring a large part of the module, or porting a feature between major Drupal versions, automated test coverage has more than made up for the extra time spent creating the tests.

The Drupal community is ever-helpful

The other thing that's been an immense help throughout the development cycle is community involvement. Since Honeypot was one of the earliest modules with a stable Drupal 8 version (it's already seen 15 stable releases with 100% passing tests!), it's already used on many public Drupal 8 sites (over 80 at this point!). And this means there are users of the module invested in its success.

These early Drupal 8 adopters and other generous Drupal developers contributed code to fix a total of 12 of the hairiest issues during the D8 development cycle so far.

Come for the code, stay for the community; my experience porting Honeypot to Drupal 8 (the easy part), and chasing Drupal 8 HEAD for three years (the hard part) has again proven to me the truth of this catch phrase. I hope I can say thanks in person to at least some of the following Honeypot D8 contributors over the past three years:

2.5 years, and counting

Much has been written about core contributor burnout, but I wanted to give some credit and kudos to the army of dedicated contributed module maintainers who have already made the #D8CX pledge. A major reason for Drupal's success in so many industries is the array of contributed modules available.

The very long development cycle between major releases—coupled with the fact that many contrib maintainers are now supporting three major versions of their modules—means that contributed module maintainers are at risk for burning out too.

I'd really like to be able to focus more of my limited time for Honeypot development on new features again, especially since a few of these new features would greatly benefit the 55,000+ Drupal 6 and 7 websites already using the module today. But until we have a solid API freeze for Drupal 8.0.x, most of my time will be spent fixing tests and code just to keep Honeypot working with HEAD.

I'll be at DrupalCon LA, and I hope to do whatever small part I can to get Drupal 8.0.0 out the door—will you do the same?

Apr 16 2015
Apr 16

Previously, I posted my thoughts on the Acquia Certified Developer - Back End Specialist exam as well as my thoughts on the Certified Developer exam. To round out the trifecta of developer-oriented exams, I took the Front End Specialist exam this morning, and am posting some observations for those interested in taking the exam.

Acquia Certified Developer - Front End Specialist badge

My Theming Background

I started my Drupal journey working on design/theme-related work, and the first few Drupal themes I built were in the Drupal 5 days (I inherited some 4.7 sites, but I only really started learning how Drupal's front end worked in Drupal 5+). Luckily for me, a lot of the basics have remained the same (or at least similar) from 5-7.

For the past couple years, though, I have shied away from front end work, only doing as much as I need to keep building out features on sites like Hosted Apache Solr and Server Check.in, and making all my older Drupal sites responsive (and sometimes, mobile-first) to avoid penalization in Google's search rankings... and to build a more usable web :)

Exam Content

A lot of the questions on the exam had to do with things like properly adding JavaScript and CSS resources (both internal to your theme and from external sources), setting up theme regions, managing templates, and working with theme hooks, the render API, and preprocessors.

In terms of general styling/design-related content, there were few questions on actual CSS and jQuery coding standards or best practices. I only remember a couple questions that touched on breakpoints, mobile-first design, or responsive/adaptive design principles.

There were also a number of questions on general Drupal configuration and site building related to placing blocks, menus, rearranging content, configuring views etc. (which would all rely on a deep knowledge of Drupal's admin interface and how it interacts with the theme layer).

Results

On this exam, I scored an 86.66%, and (as with the other exams) a nice breakdown of all the component scores was provided in case I want to brush up on a certain area:

  • Fundamental Web Development Concepts: 90%
  • Theming concepts: 80%
  • Sub-theming concepts: 100%
  • Templates: 75%
  • Template functions: 87%
  • Layout Configuration: 90%
  • Performance: 80%
  • Security: 100%

Not too surprising, in that I hate using templates in general, and try to do almost all work inside process and preprocess functions, so my templates just print the markup they need to print :P

I think it's somewhat ironic that the Front End and general Developer exams both gave me pretty good scores for 'Fundamentals', yet the back-end exam (which would target more programming-related situations) gave me my lowest score in that area!

Summary

I think, after taking this third of four currently-available exams (the Site Builder exam is the only one remaining—and I'm planning on signing up for that one at DrupalCon LA), I now qualify for being in Acquia's Grand Master Registry, so yay!

If you'd like to take or learn about this or any of the other Acquia Certification exams, please visit the Acquia Certification Program overview.

Apr 10 2015
Apr 10

A little under a year ago, I took the Acquia Certified Developer exam at DrupalCon Austin, and posted Thoughts on the Acquia Drupal Developer Certification Exam. My overall thoughts on the idea of certifications for OSS like Drupal remain unchanged, so go read that previous post to hear them.

I wanted to post a little more about the additional certifications Acquia is now offering; in addition to the initial, more generalist-oriented Acquia Certified Developer Exam, Acquia now offers:

Earlier today, I took the Back End Specialist Exam, which focuses more specifically on things like Drupal's core API, general PHP syntax and style, secure code, content caching, debugging, and interacting with the Drupal community.

Acquia Certified Developer - Back End Specialist badge

Like the other certification exams, you get 90 minutes to complete the exam (60 questions total), and you have to take the exam either online or in a testing center with an active proctor. This time, I elected to take the exam on my own computer, which was a little more annoying than taking the exam in-person at a test center (as I did at DrupalCon last year).

Taking the online proctored exam

To prevent cheating, there are a few things you have to get set up before you can start the exam: you have to install a 'Sentinel' app on your computer that basically takes control of everything (including webcam, microphone, screen, UI, etc.), which is a little off-putting (for privacy/security reasons), then you have to position your webcam in such a way that the proctor can see your face, hands, and keyboard at all times.

It made me feel a little weird, in that scratching an itch or even stretching made me feel like I was about to be denied access or auto-flunked. Thankfully, I was only stopped once, when I was told to remove my headset at the beginning of the exam (I usually have it on when at my work desk, and I didn't realize headsets weren't allowed).

I felt a little bit more comfortable and relaxed when I took the exam at DrupalCon Austin, so I'm planning on taking at least one exam in person at DrupalCon LA in a month or so (you can register for one of the exams here!).

The exam

The exam was pretty well balanced with problems you'll face day-to-day in backend development. There were a few performance and caching-related questions geared a little more towards larger sites, and there were a couple CSS and JS-related questions that I felt would be more fitting in the Front End Specialist exam, but on the whole, the questions were challenging and unambiguous.

There were even two questions about very specific PHP coding standards that made my OCD tendencies very happy!

I completed this exam in about 55 minutes (a little more quickly than the general exam), and only had four questions to review at the end. The questions felt like they were half cut-and-dry, "here's some code, answer a specific question," and half user-story-like, "here's the situation, what would you do?"

I was a little disappointed there weren't any questions (at least, not that I recall) specific to configuring or running MySQL, PHP, or any other back-end components of a modern infrastructure stack... but maybe Acquia will add a DevOps or Infrastructure specialist exam soon. I can dream, can't I?

Results

Overall, I passed with an 88.33%, and the exam results provide a nice, detailed results breakdown to highlight areas for improvement:

  • Fundamental Web Development Concepts: 75.00%
  • Drupal core API: 89.47%
  • Database Abstraction Layer: 83.33%
  • Debug code and troubleshooting: 75.00%
  • Theme Integration: 100.00%
  • Performance: 87.50%
  • Security: 100.00%
  • Leveraging Community: 100.00%

Apparently I need to brush up on 'Fundamental Web Development Concepts' (strange, because that's the area where I scored highest in the general exam from last year!).

Pricing

The price of this exam is $350, though I was able to take the exam free of charge as an Acquia employee. This exam is $100 more than the price of the general Developer exam, so if you haven't taken that exam yet, and are considering the Acquia Certification, I'd recommend taking the less expensive general exam first, then seeing if taking the Back End Specialist exam is worth it for you.

Summary

I've now taken two of four exams currently available, and I'm looking forward to taking the rest, possibly completing all the current ones at DrupalCon LA in a month!

Apr 09 2015
Apr 09

Posts in this series:

In earlier Solr for Drupal Developers posts, you learned about Apache Solr and its history in and integration with Drupal. In this post, I'm going to walk you through a quick guide to getting Apache Solr running on your local workstation so you can test it out with a Drupal site you're working on.

The guide below is for those using Mac or Linux workstations, but if you're using Windows (or even if you run Mac or Linux), you can use Drupal VM instead, which optionally installs Apache Solr alongside Drupal.

As an aside, I am writing this series of blog posts from the perspective of a Drupal developer who has worked with large-scale, highly customized Solr search for Mercy (example), and with a variety of small-to-medium sites who are using Hosted Apache Solr, a service I've been running as part of Midwestern Mac since early 2011.

Installing Apache Solr in a Virtual Machine

Apache Solr can be run directly from any computer that has Java 1.7 or later, so technically you could run it on any modern Mac, Windows, or Linux workstation natively. But to keep your local workstation cleaner, and to save time and hassle (especially if you don't want to kludge your computer with a Java runtime!), this guide will show you how to set up an Apache Solr virtual machine using Vagrant, VirtualBox, and Ansible.

Let's get started:

  1. Clone the ansible-vagrant-examples project from GitHub (you can also download ansible-vagrant-examples directly).
  2. Change directory in Terminal to the /solr subdirectory, and follow the instructions in the Solr example's README for installing Vagrant, VirtualBox, and Ansible, then follow the rest of the instructions for building that example (e.g. vagrant up).
  3. At this point, if you visit http://192.168.33.44:8983/solr in your browser, you should see the Apache Solr admin interface:
    Apache Solr Administration Dashboard - 4.10
  4. The next step is to point your local Drupal installation (assuming you have a Drupal site running locally) at this Solr instance and make sure it can connect. We're using the Apache Solr Search module in this example, but Search API Solr Search setup is similar.
    1. Visit /admin/config/search/apachesolr/settings, and click 'Add search environment'.
    2. Enter http://192.168.33.44:8983/solr/collection1 (this is the default search core that Apache Solr includes out of the box) for 'Solr server URL', check the checkbox to make this the default environment, add a description (e.g. 'Local Solr server'), and click 'Save':
      Drupal Apache Solr module search environment configuration form
    3. After saving the new environment, the next page should show the environment with a green-colored background. That means your Drupal site can connect to the Solr server.
  5. After Drupal is able to connect, you need to add the Drupal module's Solr configuration files to the search core you'll be using. This takes a few steps, but will ensure all your Drupal content is indexed by Solr correctly.
    1. Change directory in Terminal to the /solr directory (where you ran vagrant up earlier), and run vagrant ssh to log into the Solr VM.
    2. While logged into the VM, enter the following commands:
      1. curl http://ftp.drupal.org/files/projects/apachesolr-7.x-1.x-dev.tar.gz | tar -xz (download the Apache Solr module into the current directory).
      2. sudo cp -r apachesolr/solr-conf/solr-4.x/* /var/solr/collection1/conf/ (copy the Apache Solr module configuration into the default Solr core).
      3. sudo chown -R solr:solr /var/solr/collection1/conf/* (fix permissions for the copied files).
      4. sudo service solr restart (restart Apache Solr so the configuration is updated).
    3. Once this is complete, go back to the Apache Solr search settings page (/admin/config/search/apachesolr/settings), and click on the 'Index' configuration in your local solr server row. You should see something like drupal-4.3-solr-4.x for the 'Schema', meaning the Drupal module schema.xml has been picked up successfully.
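If you'd like to confirm from a script (rather than the browser) that the Solr core from the steps above is answering, something like the following Python sketch could work. It assumes the core exposes Solr's standard /admin/ping request handler and that the VM is reachable at the address used above; the function names are just illustrative, not part of any Drupal or Solr tooling.

```python
import json
from urllib.request import urlopen

# Core URL from the steps above; adjust it if your VM uses another address.
SOLR_CORE_URL = "http://192.168.33.44:8983/solr/collection1"


def ping_url(core_url: str) -> str:
    """Build the URL for the core's ping handler (assumed to be enabled)."""
    return core_url.rstrip("/") + "/admin/ping?wt=json"


def ping_ok(body: str) -> bool:
    """Return True if a JSON ping response reports a status of OK."""
    return json.loads(body).get("status") == "OK"


def check_solr(core_url: str = SOLR_CORE_URL, timeout: float = 3.0) -> bool:
    """Request the ping handler; treat any network or parse error as 'down'."""
    try:
        with urlopen(ping_url(core_url), timeout=timeout) as response:
            return ping_ok(response.read().decode("utf-8"))
    except (OSError, ValueError):
        return False

# Example: print("Solr reachable:", check_solr())
```

A False result only means the ping did not succeed; the Drupal settings page described above is still the place to confirm that the module's schema was picked up.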

At this point, you should be able to index your site content into Apache Solr (scroll down and check some content types you want to index), and start playing around with Apache Solr search!

The best first steps are to look around in all the Apache Solr configuration pages, test indexing your entire site, then work on setting up search pages and maybe even install the Facet API module to configure some search facets. In very little time, you should be able to make your site search as user-friendly and speedy as Amazon, Newegg, etc.

Further Reading

Apr 02 2015
Apr 02

Drupal.org has an excellent resource page to help you create a static archive of a Drupal site. The page references tools and techniques to take your dynamically-generated Drupal site and turn it into a static HTML site with all the right resources so you can put the site on mothballs.

From time to time, one of Midwestern Mac's hosted sites is no longer updated (e.g. LOLSaints.com), or the event for which the site was created has long since passed (e.g. the 2014 DrupalCamp STL site).

I thought I'd document my own workflow for converting typical Drupal 6 and 7 sites to static HTML to be served up on a simple Apache or Nginx web server without PHP, MySQL, or any other special software, since I do a few special things to preserve the original URL alias structure, keep CSS, JS and images in order, and make sure redirections still work properly.

1 - Disable forms and any non-static-friendly modules

The Drupal.org page above has some good guidelines, but basically, you need to make sure all the 'dynamic' aspects of the site are disabled: turn off all forms, turn off modules that use AJAX requests (like Fivestar voting), turn off search (if it's using Solr or Drupal's built-in search), and make sure AJAX and exposed filters are disabled in all views on the site. A fully static site doesn't support this kind of functionality, and if you leave it in place, there will be a lot of broken functionality.

2 - Download a verbatim copy of the site with SiteSucker

CLI utilities like HTTrack and wget can be used to download a site, using a specific set of parameters to make sure the download is executed correctly, but since I only convert one or two sites per year, I like the easier interface provided by SiteSucker.

SiteSucker lets you set options for a download (you can save your custom presets if you like), and then it gives a good overview of the entire download process:

SiteSucker Drupal Download Site

I change the following settings from the defaults to make the download go faster and result in a mostly-unmodified download of the site:

  • General
    • Ignore Robot Exclusions
      (If you have a slower or shared server and hundreds or thousands of pages on the site, you might not want to check this box; ignoring the exclusions and the crawler delay can greatly increase the load on a slow or misconfigured webserver when crawling a Drupal site).
    • Always Download HTML and CSS
    • File Modification: None
    • Path Constraint: Host
  • Webpage
    • Include Supporting Files

After the download completes, I zip up the archive for the site, transfer it to my static Apache server, and set up the virtualhost for the site like any other virtualhost. To test things out, I point the domain for my site to the new server in my local /etc/hosts file, and visit the site.

3 - Make Drupal paths work using Apache rewrites

Once you're finished getting all the files downloaded, there are some additional things you need to configure on the webserver level—in this case, Apache—to make sure that file paths and directories work properly on your now-static site.

A couple neat tricks:

  • You can preserve Drupal pager functionality without having to modify the actual links in HTML files by setting DirectorySlash Off (otherwise Apache will inject an extra / in the URL and cause weird side effects), then setting up a specialized rewrite using mod_rewrite rules.
  • You can redirect links to /node (or whatever was configured as the 'front page' in Drupal) to / with another mod_rewrite rule.
  • You can preserve links to pages that are now also directories in the static download using another mod_rewrite rule (e.g. if you have a page at /archive that should load archive.html, and there are also pages accessible at /archive/xyz, then you need a rule to make sure a request to /archive loads the HTML file, and doesn't try loading a directory index!).
  • Since the site is now static, and presumably won't be seeing much change, you can set far future expires headers for all resources so browsers can cache them for a long period of time (see the mod_expires section in the example below).

Here's the base set of rules that I put into a .htaccess file in the document root of the static site on an Apache server for static sites created from Drupal sites:

<IfModule mod_dir.c>
  # Without this directive, directory access rewrites and pagers don't work
  # correctly. See 'Rewrite directory accesses' rule below.
  DirectorySlash Off
</IfModule>

<IfModule mod_rewrite.c>
  RewriteEngine On

  # Fix /node pagers (e.g. '/node?page=1').
  RewriteCond %{REQUEST_URI} ^/node$
  RewriteCond %{QUERY_STRING} ^page=(.+$)
  RewriteRule ^([^\.]+)$ index-page=%1.html [NC,L]

  # Fix other pagers (e.g. '/archive?page=1').
  RewriteCond %{REQUEST_URI} !^/node$
  RewriteCond %{QUERY_STRING} ^page=(.+$)
  RewriteRule ^([^\.]+)$ $1-page=%1.html [NC,L]

  # Redirect /node to home.
  RewriteCond %{QUERY_STRING} !^page=.+$
  RewriteRule ^node$ / [L,R=301]

  # Rewrite directory accesses to 'directory.html'.
  RewriteCond %{REQUEST_FILENAME} -d
  RewriteCond %{QUERY_STRING} !^page=.+$
  RewriteRule ^(.+[^/])/$ $1.html [NC,L]

  # If no extension included with the request URL, invisibly rewrite to .html.
  RewriteCond %{REQUEST_FILENAME} !-f
  RewriteRule ^([^\.]+)$ $1.html [NC,L]

  # Redirect non-www to www.
  RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  RewriteRule ^(.*)$ http://www.example.com/$1 [L,R=301]
</IfModule>

<IfModule mod_expires.c>
  ExpiresActive On
  <FilesMatch "\.(ico|pdf|flv|jpg|jpeg|png|gif|js|css|swf)$">
    ExpiresDefault "access plus 1 year"
  </FilesMatch>
</IfModule>
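To make the order of the rewrite rules above a little easier to reason about, here's a rough Python model of which file each kind of request ends up mapped to. It's only a sketch under some assumptions: the function name resolve, the files and dirs sets (standing in for Apache's -f and -d filesystem checks), and the returned tuples are all made up for illustration, and it leaves out the non-www redirect and the [NC] flag. It doesn't try to reproduce everything mod_rewrite does.

```python
import re

PAGE_QUERY = re.compile(r"^page=(.+)$")    # query strings like 'page=1'
NO_EXTENSION = re.compile(r"^[^.]+$")      # paths without a dot
TRAILING_SLASH = re.compile(r"^(.+[^/])/$")


def resolve(path, query="", files=frozenset(), dirs=frozenset()):
    """Return ('serve', file), ('redirect', url), or ('passthrough', path)."""
    # Per-directory rules see the path without its leading slash.
    rel = path.lstrip("/")
    page = PAGE_QUERY.match(query)

    # Pager rules: '/node?page=1' and '/archive?page=1'.
    if page and NO_EXTENSION.match(rel):
        if path == "/node":
            return ("serve", f"index-page={page.group(1)}.html")
        return ("serve", f"{rel}-page={page.group(1)}.html")

    # Redirect '/node' (without a pager query) to the home page.
    if rel == "node" and not page:
        return ("redirect", "/")

    # Directory requested with a trailing slash: serve 'directory.html'.
    slash = TRAILING_SLASH.match(rel)
    if slash and slash.group(1) in dirs and not page:
        return ("serve", f"{slash.group(1)}.html")

    # No extension and not an existing file: serve '<path>.html'.
    if rel not in files and NO_EXTENSION.match(rel):
        return ("serve", f"{rel}.html")

    # Everything else (CSS, JS, images, the home page) is left alone.
    return ("passthrough", rel)


print(resolve("/archive", "page=2"))             # pager on a listing page
print(resolve("/archive", dirs={"archive"}))     # page that is also a directory
```

Walking a handful of the site's real URLs through a model like this before flipping DNS can help spot paths that none of the rules cover.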

Alternative method using a localized copy of the site

Another, more time-consuming method is to download a localized copy of the site, where links are transformed to be relative and point directly to .html files instead of the normal Drupal paths (e.g. /archive.html instead of /archive). To do this, download the site using SiteSucker as outlined above, but select 'Localize' for the 'File Modification' option in the General settings.

There are some regex-based replacements that can clean up this localized copy, depending on how you want to use it. If you use Sublime Text, you can use these for project-wide find and replace, and use the 'Save All' and 'Close All Files' options after each find/replace operation.

I'm adding these regexes to this post in case you might find one or more of them useful—sometimes I have needed to use one or more of them, other times none:

Convert links to index.html to links to /:

  • Find: (<a href=")[\.\./]+?index\.html(")
  • Replace: \1/\2

Remove .html in internal links:

  • Find: (<a href="[^http].+)\.html(")
  • Replace: \1\2

Fix one-off link problems (e.g. Feedburner links detected as internal links):

  • Find: (href=").+(feeds2?.feedburner)
  • Replace: \1http://\2

Fix other home page links that were missed earlier:

  • Find: href="index"
  • Replace: href="/"

Fix relative links like ../../page:

  • Find: ((href|src)=")[\.\./]+(.+?")
  • Replace: \1/\3

Fix relative links in top-level files:

  • Find: ((href|src)=")([^/][^http].+?")
  • Replace: \1/\3

This secondary method can sometimes make for a static site that's easier to test locally or distribute offline, but I've only ever localized the site like this once or twice, since the other method is generally easier to get going and doesn't require a ton of regex-based manipulation.
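If you'd rather script a replacement than run it through an editor, here's a minimal Python sketch applying the 'Fix relative links like ../../page' pattern listed above to a string of HTML. The function name and the sample markup are made up for illustration; in practice you would read and write each downloaded .html file yourself, and it's worth testing on a copy first.

```python
import re

# Same pattern as the "Fix relative links like ../../page" entry above.
RELATIVE_LINK = re.compile(r'((href|src)=")[\.\./]+(.+?")')


def fix_relative_links(html: str) -> str:
    """Turn leading '../' runs in href/src attributes into a single '/'."""
    return RELATIVE_LINK.sub(r"\1/\3", html)


# Hypothetical snippet of a localized page.
sample = '<a href="../../blog/post">Post</a> <img src="../logo.png">'
print(fix_relative_links(sample))
# prints: <a href="/blog/post">Post</a> <img src="/logo.png">
```

Absolute links such as href="http://example.com/" are left untouched, since the pattern only matches values that begin with dots or slashes.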

Mar 21 2015
Mar 21

Earlier today, I gave a presentation on Ansible and Drupal 8 at MidCamp in Chicago. In the presentation, I introduced Ansible, then deployed and updated a Drupal 8 site on a cluster of 6 Raspberry Pi computers, nicknamed the Dramble.

Video from the presentation is below (sadly, slides/voice only—you can't see the actual cluster of Raspberry Pis... for that, come see me in person sometime!):

[embedded content]

My slides from the presentation are embedded below:

Ansible + Drupal: A Fortuitous DevOps Match from geerlingguy

Mar 21 2015
Mar 21

MidCamp Camp Organizers sign

On March 21, 2015, there was a fairly well-attended Camp Organizers BoF at MidCamp in Chicago. I took notes during the BoF and am simply publishing them here for the benefit of camp organizers in the Drupal Community. They're fairly raw, but hopefully they'll be helpful for you!

Camps Represented

  • DrupalCorn (Iowa)
  • MidCamp (Chicago)
  • RADCamp (potential future camp)
  • BADCamp (San Francisco)
  • DrupalCamp STL (St. Louis)
  • DrupalNorth (Toronto)
  • DrupalCamp Costa Rica - July 29-31

Ideas

  • Add your camp dates to:
  • Pooling resources for things like signage, printed booklets, etc.
  • Video recording equipment at MidCamp is in 3rd revision; next step is to make things 'less unwieldy'
  • Pre-communication: especially with speakers
  • Have some people who don't have particular responsibilities (mainly), but just 'do things that need to be done'
    • Some camps have 'runners' (this role, day of and/or before hand)
  • Camp doesn't run itself—but if you can put together a good team, you can make amazing things happen.
  • Best part of this BoF / sharing in camps: cross-pollination for ideas, layout, etc.
  • Bluespark + RedHat thinking about starting a 'RADCamp' :)
  • WiFi: Made sure there were additional access points here, helped give bandwidth/reliable connectivity
  • Some things work great at one camp, disastrous at another (e.g. video recording at Fox Valley vs. BADCamp vs. MidCamp)
  • Good to get a list together of 'things you could do at a DrupalCamp' (e.g. board game night)
  • Resource guide like Rails Girls
    • A kit of information for DrupalCamps
    • Documentation is on GitHub; allow improvements/contributions via GitHub PRs
  • Date selection:
    • Not just a North American problem; in Europe, camp dates run into each other pretty frequently as well.
    • Mailing list: not necessarily the most effective way to organize dates; needs to have a high signal/noise ratio.
    • Drupical, coordinate in #drupalcamp, etc. (not much consensus here)

Current resources

Video recording

  • Need sets of video recording equipment for session recording
  • Working on getting equipment into small/neat 'packages', and have them transportable in a pelican case or something
  • MidCamp PVR kit
    • About 3 lbs, costs about $425
    • Currently in 'beta 3' (Fox Valley, BADCamp, MidCamp)
    • Uses HDMI, requires some dongles for VGA and other formats
    • Records audio, but also has a backup Zoom voice recorder
    • Doesn't work great with older PC laptops
    • Goal is for DA and/or Camps to ship around the equipment
    • More information: Blue Drop Shop
  • Pain points:
    • Training the speakers (make sure they do things in the right order)
    • Packaging the parts so they're simpler to set up and use

Dates

  • DrupalCon moving around all the time upsets all the DrupalCamp apple carts!
  • Drupal Association should hopefully be able to help coordinate dates a little better.

Accessibility

  • MidCamp had a few unique accessibility features, e.g. blue tape lines throughout venue for accessible paths, pre-camp online walkthrough
  • Pre-walkthrough:
    • Included pictures for how to get through the actual venue
    • Give plenty of detail/guides/signage for physical location
    • Go around the venue, take lots of pictures, make sure you take lots of notes
  • Wanted to do three more things:
    1. Pay for a 'sprint room' so we have one every day that we do sprints
    2. Pay for captioning of talks while they're happening
    3. Pay for ASL for talks while they're happening
  • ASL Interpreter / Closed Captioning
    • about $125/hour per interpreter... but if it's more than 1 hour, 2 required
    • 2 hour minimum
    • Another idea: Skype the sessions to a remote transcriber, and that captioning can come back in real time to the screen
    • Issues:
      • Might need a second screen, and video post-production gets more complicated
      • Need a low-latency connection to get a skype-based transcription service working

Volunteers / Organizers

  • MidCamp
    • 10 regular volunteers
    • 6 people who came and went
    • Question: How many organizers do you need, how much commitment do you need?
    • Some people came in and did small things, then stopped (e.g. putting tape on the floor). That's fine!
  • Need to help people figure out how many organizers / volunteers are needed.
  • Session on recruiting and retaining dedicated volunteers

Session selection

  • MidCamp had anonymous session selection.
    • Problems:
      • Quality: One group said sessions could have presenters who are terrible, making for terrible camp
      • Homogeneity: Worry that entire group of speakers would be a bunch of white guys (not very diverse).
    • Answers:
      • Quality:
        • Past performance doesn't necessarily indicate future success
        • "Take really good care of the presenters"
        • Send out reminders for speakers, help train them, improve them, etc.
        • Communication to the speakers was amazing; lots of great feedback
      • Homogeneity:
        • We have unconscious biases when we know about things. Anonymous selection helps mitigate this.
        • Non-anonymous solicitation helps with this (intentionally email/solicit a more diverse group)
        • MidCamp has about 20% female speakers.
    • Solicitations:
      • YesCT had a huge list of topics that were good at camps; seeded the list to dozens of people via Twitter, email, etc.
    • Process:
      • Dumped everything to spreadsheet, removing all identifying information (e.g. pronouns, names, business, etc.)
      • Can't be 100% anonymous (e.g. 'everyone knows what Larry talks about')
      • Put UIDs on everything to track
      • 25% of people submitted 2 sessions—but picked only one session per presenter
      • Deduplicated all the sessions, made sure groupings were good
      • Read through the sessions before the selection meeting (individuals made some notes)
      • 6 or so people went through the list and voted "Yes/No" (marathon)
      • After selection, selection group was dispersed, but schedule needed to be made
    • "It's a lot of work"

Taking care of yourselves

  • Need to eat well, relax, use good posture, etc.
  • Services like massage therapy, aroma therapy, etc.
    • BADCamp: "The Hippie Tent"
    • If no space for an entire tent, maybe at least a table for helping people 'be well'

Feb 26 2015
Feb 26

Dramble - 6 Raspberry Pi 2 model Bs running Drupal 8 on a cluster
Version 0.9.3 of the Dramble—running Drupal 8 on 6 Raspberry Pis

I've been tinkering with computers since I was a kid, but in the past ten or so years, mainstream computing has become more and more locked down, enclosed, lightweight, and, well, polished. I even wrote a blog post about how, nowadays, most computers are amazing. Long gone are the days when I had to worry about line voltage, IRQ settings, diagnosing bad capacitors, and replacing 40-pin cables that went bad!

But I'm always tempted back into my earlier years of more hardware-oriented hacking when I pull out one of my Raspberry Pi B+/A+ or Arduino Unos. These devices are about as raw as modern computers get, requiring you to actually touch the silicon chips and pins to be able to even use the devices. I've been building a temperature monitoring network that's based around a Node.js/Express app using Pis and Arduinos placed around my house. I've also been working a lot lately on a project that incorporates three of my current favorite technologies: The Raspberry Pi 2 model B (just announced earlier this month), Ansible, and Drupal!

In short, I'm building a cluster of Raspberry Pis, and designating it a 'Dramble'—a 'bramble' of Raspberry Pis running Drupal 8.

Motivation


This LED will light up that wonderful Drupal Blue, #0678BE

I've been giving a number of presentations on managing infrastructure with Ansible in the past couple years. And in the course of writing Ansible for DevOps (available on LeanPub!), I've done a lot of testing on VMs both locally and in the cloud.

But doing this testing on a 'local datacenter'—especially one that fits in the palm of my hand—is great for two reasons:

  • All networking is local; conferences don't always have the most stable networking, so I can do all my infrastructure testing on my own 'local cloud'.
  • It's pretty awesome to be able to hold a cluster of physical servers and a Gigabit network in my hand!

Lessons Learned (so far!)


Drool... I own these!

Building out the Pi-based infrastructure has taught me a lot about small-scale computing, efficient use of resources, benchmarking, and also how Drupal 8 differs (spoiler: it's way better) from Drupal 7 in terms of multi-server deployment and high-availability/high-performance configurations.

I've also learned:

Benchmarking


Wiring up the mini Cat5e network cables.

I've been benchmarking the heck out of this infrastructure, and besides finding that the major limiting factor with a bunch of low-cost computers is almost always slow I/O, I've found that:

  • On-the-fly gzip actually harms performance (in general) when your CPU isn't that fast.
  • Redis caching gives an immediate 15% speedup for Drupal 8.
  • Different microSD cards deliver order-of-magnitude speedups. As an example, one card took 20 minutes to import a 6MB database; another card? 9 seconds.
  • Drupal 8 is kinda slow (but I don't need to tell you that).
  • Still to come: Nginx vs. Apache with php-fpm, Nginx vs. Varnish for load balancing, Redis vs. Memcached for caching, MySQL vs. MariaDB for database, and more!
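To put the microSD numbers above in perspective, here's a quick bit of arithmetic in Python. It only uses the figures quoted in the list (a 6MB database, 20 minutes versus 9 seconds); the 'effective rate' is simply database size divided by import time, not a raw read/write benchmark of either card.

```python
database_mb = 6          # size of the imported database
slow_seconds = 20 * 60   # one card: 20 minutes
fast_seconds = 9         # another card: 9 seconds

speedup = slow_seconds / fast_seconds
slow_rate_kb = database_mb / slow_seconds * 1024
fast_rate_kb = database_mb / fast_seconds * 1024

print(f"Speedup: {speedup:.0f}x")
print(f"Effective import rate, slow card: {slow_rate_kb:.1f} KB/s")
print(f"Effective import rate, fast card: {fast_rate_kb:.0f} KB/s")
```

That works out to a difference of roughly 133x for that one import.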

Since I have this nice little cluster of Raspberry Pis humming along using half the power of a standard light bulb, the sky is the limit! And the fact that the servers are slower and have different performance considerations than typical modern cloud-based infrastructure actually helps to expose certain performance-related flaws that I wouldn't have otherwise!

Finally, it helps me stay creative in finding ways to eke out another 50 KB/sec of bandwidth here, or 100 iops there :)

See the Dramble in person!

So why am I mentioning all this? Because I want to bring the Dramble with me to some Drupal events, and I'd love to share it with you, explain everything in more detail, and most importantly: demonstrate modern and easy Drupal 8 deployment with Ansible on it.

I'll be bringing it to #MidCamp in Chicago on Saturday, March 21, and I've also submitted a session for DrupalCon LA: Deploying Drupal 8 to Bare Metal with Ansible - Live!

I hope the session is selected and I can bring the Dramble with me to LA in a couple months :)

Also, if you haven't submitted your own session for DrupalCon LA, the deadline is Friday; go submit it now!

For more on the Dramble itself, check out the Raspberry Pi Dramble project on GitHub, and see what I'm working on over in the Dramble issue queue.

Dec 15 2014
Dec 15

I just posted a large excerpt from Ansible for DevOps over on the Server Check.in blog: Highly-Available Infrastructure Provisioning and Configuration with Ansible. In it, I describe a simple set of playbooks that configures a highly-available infrastructure primarily for PHP-based websites and web applications, using Varnish, Apache, Memcached, and MySQL, each configured in a way optimal for high-traffic and highly-available sites.

Here's a diagram of the ultimate infrastructure being built:

Highly Available Infrastructure

The configuration is similar to what many larger Drupal sites would use, and with the exception of the varnish default.vcl and the actual PHP script being deployed (in the example, it's just a PHP file that tests the rest of the infrastructure and outputs success/fail statuses), you could drop a Drupal site on the Apache servers and immediately start scaling up your traffic!

The example highlights the powerful simplicity of Ansible as a tool for not only configuration management (like Puppet, Chef, etc.), but also for provisioning and managing servers in different cloud providers. With under a hundred lines of YAML configuration, I can spin up the exact same infrastructure locally with Vagrant and VirtualBox, on DigitalOcean droplets, or on AWS EC2 instances!

Nov 24 2014
Nov 24

It's been a well-known fact that using native VirtualBox or VMWare shared folders is a terrible idea if you're developing a Drupal site (or some other site that uses thousands of files in hundreds of folders). The most common recommendation is to switch to NFS for shared folders.

NFS shared folders are a decent solution, and using NFS does indeed speed up performance quite a bit (usually on the order of 20-50x for a file-heavy framework like Drupal!). However, it has its downsides: it requires extra effort to get running on Windows, requires NFS support inside the VM (not all Vagrant base boxes provide support by default), and is not actually all that fast compared to native filesystem performance.

I was developing a relatively large Drupal site lately, with over 200 modules enabled, meaning there were literally thousands of files and hundreds of directories that Drupal would end up scanning/including on every page request. For some reason, even simple pages like admin forms would take 2+ seconds to load, and digging into the situation with XHProf, I found a likely culprit:

is_dir xhprof Drupal

There are a few ways to make this less painful when using NFS (since NFS incurs a slight overhead for every directory/file scan):

  • Use APC and set stat=0 to prevent file lookups (this is a non-starter, since that would mean every time I save a file in development, I would need to restart Apache or manually flush the PHP APC cache).
  • Increase PHP's realpath_cache_size ini variable, which defaults to '16K' (this has a small, but noticeable impact on performance).
  • Micro-optimize the NFS mounts by basically setting them up on your own outside of Vagrant's shared folder configuration (another non-starter... and the performance gains would be almost negligible).

I wanted to benchmark NFS against rsync shared folders (which I've discussed elsewhere), to see how much of a difference using VirtualBox's native filesystem can make.

For testing, I used a Drupal site with about 200 modules, and used XHProf to measure the combined Excl. Wall Time for calls to is_dir, readdir, opendir, and file_scan_directory. Here are my results after 8 test runs on each:

NFS shared folder:

  • 1.5s* (realpath_cache_size = 16K - PHP default)
  • 1.0s (realpath_cache_size = 1024K)
  • Average page load time: 1710ms (realpath_cache_size = 1024K, used admin/config/development/devel)

*Note: I had two outliers on this test, where the time would go to as much as 6s, so I discarded those two results. But realize that, even though this NFS share is on a local/internal network, every file access goes through the full TCP stack of the guest VM, so networking issues can make NFS performance unstable.

Native filesystem (using rsync shared folder):

  • 0.15s (realpath_cache_size = 16K - PHP default)
  • 0.1s (realpath_cache_size = 1024K)
  • Average page load time: 900ms (realpath_cache_size = 1024K, used admin/config/development/devel)

Tuning PHP's realpath_cache_size makes a meaningful difference (though not too great), since the default 16K cache doesn't handle a large Drupal site very well.
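To confirm what value PHP actually sees, here is a minimal sketch using the PHP CLI's -d flag. It assumes php is on your PATH, and the helper name show_realpath_cache_size is made up for illustration. Keep in mind the CLI and Apache may read different php.ini files, so for the web server you would still set realpath_cache_size = 1024K in the ini file Apache's PHP loads.

```bash
# Hypothetical helper: run the PHP CLI with an overridden realpath_cache_size
# and print the value PHP reports for it.
show_realpath_cache_size() {
  php -d "realpath_cache_size=$1" -r 'echo ini_get("realpath_cache_size"), PHP_EOL;'
}

if command -v php >/dev/null 2>&1; then
  show_realpath_cache_size 1024K
else
  echo "php not found on PATH; skipping check" >&2
fi
```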

As you can see, there's really no contest—just as NFS is an order of magnitude faster than standard VirtualBox shared folders, native filesystem performance is an order of magnitude faster than NFS. Overall site page load times for the Drupal site I was testing went from 5-10s to 1-3s by switching from NFS to rsync!
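For anyone who wants to check the arithmetic, here is a quick sketch that computes the ratios from the numbers listed above (the ratio helper is just an illustrative name). It suggests the directory-scanning calls got roughly 10x faster, while the averaged page load improved by a bit under 2x, presumably because scanning is only one part of each request.

```bash
# Ratio helper: prints a/b rounded to one decimal place.
ratio() {
  awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

echo "Directory scan speedup, 16K cache:   $(ratio 1.5 0.15)x"
echo "Directory scan speedup, 1024K cache: $(ratio 1.0 0.1)x"
echo "Average page load speedup:           $(ratio 1710 900)x"
```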

I've updated my Drupal Development VM and Acquia Cloud VM to use rsync shares by default (though you can still configure NFS or any other supported share type), and to use a realpath_cache_size of 1024K. Hopefully Drupal developers everywhere will save a few minutes a day from these changes :)

Note that other causes for abysmal filesystem performance and many calls to is_dir, opendir, etc. may include things like a missing module or major networking issues. Generally, when fixing performance issues, it's best to eliminate the obvious, and only start digging deeper (like this post) when you don't find an obvious problem.

Notes on using rsync shared folders

Besides the comprehensive rsync shared folder documentation in Vagrant's official docs, here are a few tips to help you get up and running with rsync shared folders:

  • Use rsync__args to pass CLI options to rsync. The defaults are ["--verbose", "--archive", "--delete", "-z"]; if you want to preserve files created within the shared folder on the guest, set this option to the same list without --delete.
  • Use rsync__exclude to exclude directories like .git and other non-essential directories that are unnecessary for running your application within the VM. While not incredibly impactful, it could shave a couple seconds off the rsync process.
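The two tips above could look something like the following in a Vagrantfile. This is only a sketch: it writes a sample Vagrantfile into a scratch directory so you can inspect it, and the box name, the "/vagrant" guest path, and the single .git/ exclude are assumptions to adapt to your own project.

```bash
# Write a sample Vagrantfile into a scratch directory (hypothetical location).
example_dir="$(mktemp -d)"

cat > "$example_dir/Vagrantfile" <<'EOF'
Vagrant.configure("2") do |config|
  config.vm.box = "geerlingguy/ubuntu1404"

  config.vm.synced_folder ".", "/vagrant",
    type: "rsync",
    # Vagrant's defaults minus --delete, so files created on the guest survive.
    rsync__args: ["--verbose", "--archive", "-z"],
    # Skip directories the application doesn't need inside the VM.
    rsync__exclude: [".git/"]
end
EOF

echo "Wrote $example_dir/Vagrantfile"
```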

Not all is perfect; there are a few weaknesses in the rsync model as it is currently implemented out-of-the-box:

  1. You have to either manually run vagrant rsync when you make a change (or have your IDE/editor run the command every time you save a file), or have vagrant rsync-auto running in the background while you work.
  2. rsync is currently one-way only (though there's an issue to add two-way sync support).
  3. Permissions can still be an issue, since permissions inside the VM sometimes require some trickery; read up on the rsync__chown option in the docs, and consider passing additional options to the rsync__args to manually configure permissions as you'd like.
Nov 14 2014
Nov 14

Drupal 8's expanded and broadly-used Entity API extends even to Contact Forms, and recently I needed to create a contact form programmatically as part of Honeypot's test suite. Normally, you can export a contact form as part of your site configuration, then when it's imported in a different site/environment, it will be set up simply and easily.

However, if you need to create a contact form programmatically (in code, dynamically), it's a rather simple affair:

First, use Drupal's ContactForm class at the top of the file so you can use the class in your code later:

<?php
use Drupal\contact\Entity\ContactForm;
?>

Then, create() and save() a ContactForm entity using:

<?php
$feedback_form = ContactForm::create([
  'id' => 'help',
  'label' => 'Help',
  'recipients' => ['[email protected]'],
  'reply' => '',
  'weight' => 0,
]);
$feedback_form->save();
?>

If you also want to update the default contact form so you can set your new form as the default sitewide contact form category, you can do so by updating the global contact.settings.

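<?php
// \Drupal::config() returns read-only config in Drupal 8 releases; use
// getEditable() to get a config object that can be changed and saved.
$contact_settings = \Drupal::configFactory()->getEditable('contact.settings');
$contact_settings->set('default_form', 'help')->save();
?>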

One of the things I'm most excited about in Drupal 8 is how this entire process is the same (or almost exactly so) for every kind of entity, and almost everything's an entity! Need to create a content type? A configuration entity? A node? User? Almost everything follows this pattern now, and Drupal 8's APIs are so much easier to learn as a side effect.

Nov 13 2014
Nov 13

In support of my mission to make local development easier and faster, I've released boxes for four of the most popular Linux distributions I use and see used for Drupal sites: CentOS 6/7 and Ubuntu 12.04/14.04.

Vagrant Boxes - Midwestern Mac, LLC

I've been using other base boxes in the past, but it's hard to find updated boxes (especially for newer OSes) from people or companies you can trust that are truly minimal base boxes (e.g. no extra configuration management tools or junk to kludge up my development environment!). These boxes are all minimal installs that let you bring your own configuration however you want; I typically use an Ansible playbook to build a LAMP server, or a Solr server, or an ELK server for monitoring all the other servers...

You can find all the info on the boxes (including links to the Packer/Ansible build configuration used to create the boxes) on files.midwesternmac.com, and the boxes are also available on Vagrant Cloud: geerlingguy's boxes.

You can quickly build a Linux VM using Vagrant and VirtualBox for local Drupal development with vagrant init geerlingguy/[boxname] (e.g. for Ubuntu 14.04, vagrant init geerlingguy/ubuntu1404). These boxes are also used as the base boxes for the Drupal Development VM (which is currently being reworked to be much more powerful/flexible) and Acquia Cloud VM (which simulates the Acquia Cloud environment locally).

I'll be writing more about local development with these VMs as well as many other interesting DevOps-related tidbits in Ansible for DevOps, on this blog, and on the Server Check.in Blog.

Nov 06 2014
Nov 06

For all the sites I maintain, I have at least a local and production environment. Some projects warrant a dev, qa, etc. as well, but for the purposes of this post, let's just assume you often run drush commands on local or development environments during development, and eventually run a similar command on production during a deployment.

What happens if, at some point, you are churning through some Drush commands, using aliases (e.g. drush @site.local break-all-the-things to break things for testing), and you accidentally enter @site.prod instead of @site.local? Or what if you were doing something potentially disastrous, like deleting a database table locally so you can test a module install file, using drush sqlq to run a query?

$ drush @site.prod break-all-the-things -y
Everything is broken!                                    [sadpanda]

Most potentially-devastating drush commands will ask for confirmation (which could be overridden with a -y in the command), but I like having an extra layer of protection to make sure I don't do something dumb. If you use Bash for your shell session, you can put the following into your .profile or .bash_profile, and Bash will warn you whenever the string .prod is in one of your commands:

prod_command_trap () {
  if [[ $BASH_COMMAND == *.prod* ]]
  then
    read -p "Are you sure you want to run this command on prod [Y/n]? " -n 1 -r
    if [[ $REPLY =~ ^[Yy]$ ]]
    then
      echo -e "\nRunning command \"$BASH_COMMAND\" \n"
    else
      echo -e "\nCommand was not run.\n"
      return 1
    fi
  fi
}
shopt -s extdebug
trap prod_command_trap DEBUG

Now if I accidentally run a command on production I get a warning/confirmation before the command is run:

$ drush @site.prod break-all-the-things -y
Are you sure you want to run this command on prod [Y/n]?
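One caveat: the *.prod* match fires on any command containing .prod, including something harmless like a filename with .production in it. If that gets noisy, here is a rough sketch of a narrower check that only looks for a Drush-style alias ending in .prod. The function name is_prod_alias_command is made up for this example, and the match is still loose (a command containing both an @alias and a separate word ending in .prod would also trigger it).

```bash
# Hypothetical helper: succeed only if the command contains a word that
# starts with "@" and ends with ".prod" (e.g. @site.prod).
is_prod_alias_command() {
  # Pad with spaces so an alias at the start or end of the command still
  # has a space on both sides.
  case " $1 " in
    *" @"*".prod "*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_prod_alias_command "drush @site.prod break-all-the-things -y"; then
  echo "would ask for confirmation"
fi
```

To use it in the trap, you could swap the [[ $BASH_COMMAND == *.prod* ]] test for is_prod_alias_command "$BASH_COMMAND".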

This code, as well as other aliases and configuration I use to help make my command-line usage more efficient, is also viewable in my Dotfiles repository on GitHub.

Oct 30 2014
Oct 30

I recently ran into an issue where drush vset was not setting a string variable (in this case, a time period that would be used in strtotime()) correctly:

# Didn't work:
$ drush vset custom_past_time '-1 day'
Unknown options: --0, --w, --e, --k.  See `drush help variable-set`      [error]
for available options. To suppress this error, add the option
--strict=0.

Using the --strict=0 option resulted in the variable being set to a value of "1".

After scratching my head a bit, trying different ways of escaping the string value, using single and double quotes, etc., I finally realized I could just use variable_set() with drush's php-eval command (shortcut ev):

# Success!
$ drush ev "variable_set('custom_past_time', '-1 day');"
$ drush vget custom_past_time
custom_past_time: '-1 day'

This worked perfectly and allowed me to go make sure my time was successfully set to one day in the past.
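As for why changing the quoting didn't help: the shell strips the quotes before the program ever sees its arguments, so the value arrives as a single argument that starts with a dash, and (judging by the error above) Drush's option parser then treats it as options. Here is a small sketch that shows what a command receives; show_args is just a stand-in name, not anything from Drush.

```bash
# Stand-in for any command: print each argument it receives, one per line,
# wrapped in brackets so the boundaries are visible.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

# Single and double quotes produce the same argument list.
show_args vset custom_past_time '-1 day'
show_args vset custom_past_time "-1 day"
```

Passing the value inside a PHP string via drush ev sidesteps the problem, since the dash is no longer at the start of a command-line argument.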
