Feeds

Apr 04 2019

Responsive images overview

As screen resolutions and pixel densities continue to climb year after year, it's becoming more important to deliver the best possible image quality to your visitors. The easy way out is to deliver a single high resolution image, but this can have a real impact on page load time & bandwidth usage, especially for visitors on mobile devices & networks. The better solution is to deliver an appropriately sized image based on the screen width/resolution of the browser. So, instead of always delivering a super high res image to mobile device users (whose browsers will be forced to downsize the image to fit anyway), deliver an image that's better sized for that screen. Smaller resolution images have a much smaller filesize, so your visitors won't have to download as much data and the image will download faster.

Thankfully, a native HTML solution for delivering different images for different browser viewports has existed for years: using the "srcset" and "sizes" attributes of the existing <img> element.

To quickly demonstrate how it works, let's take this super simple scenario of an image on your site that will always be displayed at 100% width of the browser. This is how the image element would look:

<img src="https://bkosborne.com/path/to/fallback.jpg" srcset="/path/to/higher/resolution.jpg 1500w, /path/to/lower/resolution.jpg 750w" sizes="100vw"/>

The srcset attribute provides your browser a list of images and how wide each is in real pixels. The sizes attribute tells the browser how wide the image will be displayed after it's been laid out and CSS rules applied to it.

But wait, don't browsers already know how wide an image will be when it's rendered on a page? It's responsible for rendering the page after all! Why can't it just figure out how wide the image will be rendered and then just select the most appropriate image source from the "srcset" list? Why is this "sizes" attribute needed at all?

Well, it's true that browsers know this information, but they don't know it until they have finished parsing all the JS and CSS on the page. Because that processing takes a while, browsers don't wait; they begin downloading images referenced in your HTML immediately, which means they need to decide which image to download before the final layout is known.

In the simple scenario above, the site is designed to always render the image at 100% width via CSS, so we indicate as such by adding a single value "100vw" (vw stands for viewport width) to the sizes attribute. The browser then decides which image to load depending on the width of the viewport when the page is loaded. An iPhone 8 in portrait mode has a "CSS" width of 375 pixels, but it has a 2:1 pixel density ratio (a "retina" screen), which means it can actually display images that are double that width at 750px wide. So the browser on this phone will download the lower resolution version of the image which happens to match exactly at 750px wide. On a 1080p desktop monitor the browser will be wider than 750px wide, so the larger resolution image will be downloaded.

Responsive images delivered in this manner work really well for this simple use case.

Things start to get more complicated when the image being displayed on your site does NOT take up the full width of the browser viewport. For example, imagine a site design where an image is displayed 1500px wide at the desktop breakpoint, but is displayed at 50% width at tablet/mobile breakpoints. Now the image element changes to this:

<img src="https://bkosborne.com/path/to/fallback.jpg" srcset="/path/to/high/resolution.jpg 1500w, /path/to/low/resolution.jpg 750w" sizes="(min-width: 1500px) 1500px, 50vw"/>

The sizes attribute has changed to indicate that if the viewport width is at least 1500px wide, then the site's CSS is going to render the image at 1500px and no larger. If the viewport width is lower, then that first rule in the sizes attribute fails, and it falls back to the next one, so the site will render the image at 50% viewport width. The browser will translate that value to an actual pixel width (and take into account pixel density of the device) to select the appropriate image to download.

The problem this creates for dynamic layout builders

Now, imagine a dynamic layout builder tool on a content management system, like the new Layout Builder module for Drupal 8:

layout builder edit page

This great layout tool allows site builders to dynamically add rows and columns to the content region of a page and insert blocks of content into the columns.

One of the "blocks" that can be inserted into a column is an image. How do you determine the value of the "sizes" attribute for the image element? Remember, the sizes attribute tells the browser how wide the image will be when it's rendered and laid out by your CSS. Let's just focus on desktop screen resolutions for now, and say that your site will display the content region at a width of 1500 CSS pixels for desktops. A site builder could decide to insert an image in any of the following ways:

  • Into a single column row (image displays at 1500px wide)
  • Into the left-most column of a 50% - 25% - 25% row (image displays at 750px wide)
  • Into the right-most column of a 33% - 33% - 33% row (image displays at 500px wide)

The value of the "sizes" attribute differs for each of those three scenarios, which means that when Drupal is generating the image element markup, it needs to know the width of the column that the image was placed in.
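
To make that concrete, here's a hypothetical sketch of the kind of mapping Drupal would need to perform. The function name is made up, it assumes the 1500px content region from the example above, and the column fraction parameter is exactly the piece of information that's hard to get:

/**
 * Hypothetical helper: build a "sizes" value for an image placed in a
 * layout column, assuming a 1500px-wide content region at the desktop
 * breakpoint.
 */
function my_theme_layout_image_sizes($column_fraction) {
  // Fixed pixel width once the viewport hits the desktop breakpoint.
  $desktop_width = round(1500 * $column_fraction);
  // Below the breakpoint, the image scales with the viewport.
  $viewport_width = round($column_fraction * 100);
  return "(min-width: 1500px) {$desktop_width}px, {$viewport_width}vw";
}

For the three scenarios above, this would produce "… 1500px, 100vw", "… 750px, 50vw", and "… 500px, 33vw" respectively.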

The Drupal-specific problem is that (to my current knowledge) there's no practical way for the code that generates the image element markup to know information about the column the image was inserted in. Without this knowledge transfer, it's impossible to convey an accurate value for the "sizes" attribute.

Things get even more complicated if you're developing a solution that has to work with multiple different themes, where each theme may have different breakpoints and rules about the width of the content region at various breakpoints.

Moving forward

I think this is a new and interesting challenge, and I don't know that anyone has put much thought into how to solve it yet. I'm certainly hoping others read this and provide some ideas, because I'm not sure what the best solution is. The easy solution is of course to just not output the image responsively, and just use a single image src like the old days. In the example above, the image would need to be 1500px wide to account for the largest possibility.

Jul 21 2018

Unicode characters encoded using UTF8 can technically use 1 to 4 bytes to represent a single character. However, older versions of MySQL only provided support for storing UTF8 encoded characters that used 1 to 3 bytes. This was enough to cover the most commonly used characters, but is not suitable for applications that accept user input where any character can be submitted (like emojis, which use 4 bytes). Newer versions of MySQL provide a character encoding called utf8mb4 to fix this issue. Drupal 7 supports this, but requires some special configuration. Drupal 8 is configured this way by default.

Existing Drupal 7 sites that were set up with MySQL's old 3-byte-max UTF8 encoding must undergo a conversion process to change the character set on tables and text columns from utf8 to utf8mb4. The collation value (what MySQL uses to determine how text fields are sorted) also needs to be changed to the newer utf8mb4 variant. Thankfully, there's already a drush command you can download that does this conversion for you on a single database. Before running it, you should ensure that your MySQL server is properly set up to use the utf8mb4 character encoding. There's a helpful guide on this available on Drupal.org. After the conversion is run, you still must configure Drupal to communicate with MySQL using this new encoding as described in the guide I linked to.

Part of my job is to help maintain hundreds of sites running as multi-site in a single codebase. So, same codebase, but hundreds of databases, each of which needed to have its database tables converted over to the new encoding. Converting a single database is not such a big deal, because it only takes a few minutes to run, but since I was dealing with hundreds, I wanted to make sure I had a good process laid out with plenty of logging. I created the below bash script which placed each site in maintenance mode (if it wasn't already), ran the drush command to convert the database, then took the site out of maintenance mode.

All in all, it took about 10 hours to do this for ~250 websites. While the script was running, I was monitoring for errors or other issues, ready to kill the script off if needed. I added a 3 second sleep at the end of each conversion to allow me time to cleanly kill the script.

After the script was completed, I pushed up new code for the common settings.php file (each site is configured to load a common settings file that they all share) which configured Drupal to connect to MySQL using the proper character set. Between the time a database was converted and the time settings.php was updated for that site, there still should not have been any issues, because MySQL's utf8mb4 character encoding is backwards compatible with the original encoding that only supports 3-byte characters.
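
For reference, the settings.php change looks something like this (the connection details are placeholders; the charset and collation keys are the important parts, and if I recall correctly they require Drupal 7.50 or later):

$databases['default']['default'] = array(
  'driver' => 'mysql',
  'database' => 'example_db',
  'username' => 'example_user',
  'password' => 'example_pass',
  'host' => 'localhost',
  // Tell Drupal to talk to MySQL using the 4-byte-capable encoding.
  'charset' => 'utf8mb4',
  'collation' => 'utf8mb4_general_ci',
);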

Here's the script for any that may be interested:

#!/usr/bin/env bash

#
# Usage:
# Alter this script to specify the proper Drupal docroot.
# 
# Run this command and pass to it a filename which contains a list of
# multisite directory names, one per line.
#
# For each site listed in the file, this script will first put the site in
# maintenance mode (if it's not already in that state), then run the
# utf8mb4 conversion script. Afterwards it will disable maintenance mode if
# it was previously disabled.
#

### Set to Drupal docroot
docroot="/var/www/html/"

script_begin=$(date +"%s")

count=0
total="$(wc -l $1 | awk '{ print $1 }')"
while read -r site || [[ -n "$site" ]]; do
    start_time=$(date +"%s")
    count=$((count+1))
    echo "--- Processing site #${count}/${total}: $site ---"
    mm="$(drush --root=${docroot} -l ${site} vget --exact maintenance_mode)"
    if [ $? -ne 0 ]; then
        echo "Drush command to check maintenance mode failed, skipping site"
        continue
    fi

    # If maintenance mode is not enabled, enable it.
    if [ -z "$mm" ] || [ "$mm" = '0' ]; then
        echo "Enabling maintenance mode."
        drush --root=${docroot} -l ${site} vset maintenance_mode 1
    else
        echo "Maintenance mode already enabled."
    fi

    drush --root=${docroot} -l ${site} utf8mb4-convert-databases -y $site

    # Now disable maintenance mode, as long as it was already disabled before.
    if [ -z "$mm" ] || [ "$mm" = '0' ]; then
        echo "Disabling maintenance mode."
        drush --root=${docroot} -l ${site} vset maintenance_mode 0
    else
        echo "Maintenance mode will remain on, it was already on before update."
    fi

    echo "Clearing cache"
    drush --root=${docroot} -l ${site} cc all

    end_time=$(date +"%s")
    echo "Completed in $(($end_time - $start_time)) seconds"
    echo "Done, sleeping 3 seconds before next site"
    sleep 3
done < "$1"

script_end=$(date +"%s")

echo "Ended: $script_end ; Total of $(($script_end - $script_begin)) seconds."

Mar 23 2018

I'm working on creating a Drupal 8 installation profile and learning how installation profiles can override the default configuration that modules provide at install time.

All Drupal 8 modules can provide a set of configuration that should be installed to the site when the module is installed. This configuration is placed in the module's config/install or config/optional directory. The only difference is that the configuration objects placed in the config/optional directory will only be installed if all of their dependencies are met. For example, the core "media" module has a config file config/optional/views.view.media.yml which will install the standard media listings view, but only if the views module is available on your site at the time of install.

The power of installation profiles is that they can provide overrides for any configuration objects that a module would normally provide during its installation. This is accomplished simply by placing the config object file in the installation profile's config/install or config/optional directory. This works because when Drupal's ConfigInstaller is installing any configuration object, it checks to see if that config object exists in your installation profile, and uses that version of it if it exists.

However, overriding default configuration that a module would normally provide is a double-edged sword and brings up some interesting challenges.

If you dramatically alter a configuration object that a module provides, what happens when that module releases a new version that includes an update hook to modify that config? The module maintainers may write the update hook assuming that the config object that's installed on your site is identical to the one that it provided out-of-the-box during install time. I think this falls on the module maintainer to write update hooks that first check to make sure that the config object is mostly what it expects it to be before modifying it. If not, fatal errors could be thrown.
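
As a rough sketch of what such a defensive update hook might look like (the config name and the properties being checked are just illustrative):

/**
 * Example update hook that verifies config before modifying it.
 */
function mymodule_update_8101() {
  $view = \Drupal::configFactory()->getEditable('views.view.media');
  // Bail out if the site has deleted or heavily customized the view.
  if ($view->isNew() || $view->get('base_table') !== 'media_field_data') {
    return;
  }
  $view->set('label', 'Media library')->save();
}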

Another challenge that I ran into recently is more complicated. My installation profile was overriding an entity browser config object provided by the Lightning Media module. Entity browsers use views to display lists of entities on your site that an editor can choose from. My override changed this config object to point to a custom view that my installation profile provided (placed in its config/install directory), but it didn't work. When installing a site with the profile, I was met with an UnmetDependenciesException which claimed that the entity browser override I provided depended on a view that didn't exist. Well, it did exist, it's right there in the install folder for the profile! After some debugging, I found this happens because Drupal doesn't install config from the installation profile until all of the modules the profile depends on have been installed first. So to summarize, it's not possible for a module's default config objects to depend on config that is provided by an install profile.

Feb 14 2018

Sometimes you need to make custom modifications to a composer package. Assuming that your modification is a bug fix, the best approach is to file an issue with the package's issue queue and submit the fix as a pull request (or a patch file when dealing with Drupal projects). Then you can use the composer-patches plugin to include the change in your project.

However this doesn't always work. I had a need to modify the composer.json file of a package that my project used. I tried creating a patch to modify it as I mentioned above, but composer didn't use the patched changes to composer.json. I imagine this is because a package's composer.json file is parsed before composer-patches has a chance to modify it.

So the next best thing is to fork the package you need to modify to make the changes you need. The package I was modifying was already hosted on GitHub, so I forked it, made my change in a new branch, and pushed it up to my fork.

From there, I just had to change my project's composer.json file to add my fork to the list of package repositories to scan when looking for project dependencies. This is described in composer's documentation. I changed the version to "dev-my-branch-name" as instructed.
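
For anyone following along, the relevant pieces of the project's composer.json end up looking roughly like this (the fork URL and package name are placeholders):

{
    "repositories": [
        {
            "type": "vcs",
            "url": "https://github.com/your-account/forked-package"
        }
    ],
    "require": {
        "vendor/package": "dev-my-branch-name"
    }
}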

But for some reason, composer was still refusing to use my version of the repo. After more digging, it turns out that's because composer looks at the default branch of the forked repo to "discover" what package it is. Turns out my fork was really old, and the default branch was an older branch. This old branch of code used a different name for the package in its composer.json file! The package name needs to match exactly what you have in your project's requirements list. To fix this, all I had to do was sync the default branch of my fork with the upstream.

Feb 14 2018

Yes, a blog post about Drupal 7!

I recently worked on an enhancement for a large multi-site Drupal 7 platform to allow its users to import news articles from RSS feeds. Pretty simple request, and given the maturity of the Drupal 7 contrib module ecosystem, it wasn't too difficult to implement.

One somewhat interesting requirement was that images from the RSS feed be imported to an image field on the news article content type. RSS doesn't have direct support for an image element, but it has indirect support via the enclosure element. According to the RSS spec:

It has three required attributes. url says where the enclosure is located, length says how big it is in bytes, and type says what its type is, a standard MIME type.

RSS feeds will often use the enclosure element to provide an image for each item in the feed.
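
For example, an item's enclosure might look like this (the values are illustrative):

<enclosure url="https://example.com/images/photo.jpg" length="24560" type="image/jpeg" />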

Despite still being in a beta release, the Drupal 7 Feeds module is considered quite stable and mature, with its most recent release in September 2017. It has a robust interface that suited my use case quite well, allowing me to map RSS elements to fields on the news article content type. However, it doesn't support pulling data out of enclosure elements in the source. But alas, there's an 8-year-old issue containing a very small patch that adds the ability.

With that patch installed, the final step is to find the proper "target" to map its data to. It's not immediately clear how this should work. Feeds needs to be smart enough to accept the URL to the image, download it, create a file entity from it, and assign the appropriate data to the image field on the node. Feeds exposes 4 different targets for an image field:

Feeds image field targets

Selecting the "URI" target is the proper choice. Feeds will recognize that you're trying to import a remote image and download it.

May 18 2017

Imagine you have a view that lists upcoming events on your Drupal 8 site. There's a date filter that filters out any event whose start date is less than the current date. This works great until you realize that the output of the view will be cached in one or many places (dynamic page cache, internal page cache, varnish, etc). Once it's cached, views doesn't execute the query and can't compare the date to the current time, so you may get older events sticking around.

One way of fixing this is to assign a custom cache tag to your view, and then run a cron task that purges that cache tag at least once a day, like so:

/**
 * Implements hook_cron().
 */
function YOUR_MODULE_cron() {
  // Invalidate the events view cache tag if we haven't done so today.
  // This is done so that the events list always shows the proper "start"
  // date of today when it's rendered. If we didn't do this, then it's possible
  // that events from previous days could be shown.
  // This relies on us setting a custom cache tag "public_events_block" on the
  // view that lists the events via the views_custom_cache_tag module.
  $state_key = 'events_view_last_cleared';
  $last_cleared = \Drupal::state()->get($state_key);
  $today = date('Y-m-d');
  if ($last_cleared != $today) {
    \Drupal::state()->set($state_key, $today);
    \Drupal::service('cache_tags.invalidator')->invalidateTags(['public_events_block']);
  }
}

Assuming you have cron running just after midnight, this will refresh the cache of the view's block and the page at an appropriate time so that events from the previous day are not shown.

Dec 09 2016

I'm working on a site where the editorial staff may occasionally produce animated GIFs and place them in an article. Image styles and animated GIFs in Drupal don't play nice out of the box. Drupal's standard image processing library, GD, does not preserve GIF animation when it processes them, so any image styles applied to the image will remove the animation. The ImageMagick image processing library is capable of preserving animation, but I believe the only way is to first coalesce the GIF, which dramatically increases the output size and was unacceptable for this project (my sample 200kb GIF ballooned to nearly 2mb). For anyone interested in this approach anyway, the Drupal ImageMagick contrib module has a seemingly stable alpha release, but it would require a minor patch to retain animation.

I'm mostly interested in somehow getting Drupal to just display the original image when it's a GIF to prevent this problem. On this site, images are stored in an image field that's part of an Image Media Bundle. This media bundle supports JPEGs and PNGs as well, and those are typically uploaded in high resolution and need to have image styles applied to them. So the challenge is to use the same media bundle and display mode for GIFs, JPEGs, and PNGs, but always display the original image when rendering a GIF.

After some digging and xdebugging, I created an implementation of hook_entity_display_build_alter which lets you alter the render array used for displaying an entity in all view modes. I use this hook to remove the image style of the image being rendered.

/**
 * Implements hook_entity_display_build_alter().
 */
function my_module_entity_display_build_alter(&$build, $context) {
  $entity = $context['entity'];

  // Checks if the entity being displayed is a image media entity in the "full" display mode.
  // For other display modes it's OK for us to process the GIF and lose the animation.
  if ($entity->getEntityTypeId() == 'media' && $entity->bundle() == 'image' && $context['view_mode'] == 'full') {
    /** @var \Drupal\media_entity\Entity\Media $entity */
    if (isset($build['image'][0])) {
      $mimetype = $build['image'][0]['#item']->entity->filemime->value;
      $image_style = $build['image'][0]['#image_style'];
      if ($mimetype == 'image/gif' && !empty($image_style)) {
        $build['image'][0]['#image_style'] = '';
      }
    }
  }
}

So now whatever image style I have configured for this display mode will still be applied to JPEGs and PNGs but will not be applied for GIFs.

However, as a commenter pointed out, this would be better served as an image field formatter so you can configure it to be applied to any image field and display mode. I've created a sandbox module that does just that. The code is even simpler than what I've added above.

Feb 07 2014

I recently worked on porting over a website to Drupal that had several dynamic elements throughout the site depending on the IP address of the user. Different content could be shown depending on if the user was within a local network, a larger local network, or completely outside the network.

When porting the site over, I realized that it wouldn't be possible to enable page caching for any page that had this dynamic content on it. In Drupal, standard page caching is all or nothing. If you have it enabled and a page is "eligible" to be cached, Drupal saves the entire output of the page and uses it for future requests for the same page (I go into much more detail about page caching in a previous blog post). In my case, if I enabled it, users within one of the local intranets could trigger a page cache set, and any users outside the intranet would then view that same content.

I wanted a solution that let me either differentiate cache entries by visitor "type" (but not role), or to at least prevent Drupal from serving cached pages to some of the visitors when a cached page already existed. I found a solution for the latter that I'll describe below. But first...

Why this is a hard problem

I already knew I could prevent Drupal from generating a page cache entry using drupal_page_is_cacheable(FALSE);. In fact, there's a popular yet very simple module called Cache Exclude that uses this function and provides an admin interface to specify which pages you want to prevent from being cached.

But what if you wanted to cache the pages, but force some visitors to view the un-cached version? This is what I needed, but Drupal has no API functions to do this. Many Drupal developers know that hook_boot is run on every page request, even for cache hits. So why can't you implement the hook and tell Drupal you don't want to serve a cached page? The reason is because of the way Drupal bootstraps, and when it determines if it should return a cached page or not.

There's a whole bootstrap "phase" dedicated to serving a cached page called _drupal_bootstrap_page_cache. If you take a close look, you can see that Drupal doesn't invoke the boot hook until after it already determined it's going to serve a cached page. In other words, there's no going back at this point.

Enter the "Dynamic Cache" module

I came across the Dynamic Cache module that seemed to solve this problem. Once enabled, this module lets you disable serving a cached page by setting $GLOBALS['conf']['cache'] = false; within your own module's hook_boot implementation - exactly what I suggested was not possible above!

So how was Dynamic Cache doing this? In summary, Dynamic Cache implements hook_boot, checks if you tried to disable serving the cached page, and if so will "hijack" the bootstrap process to render the whole page and ignore the page cache entry that may exist. It then makes sure to "finish" up the request by completing the bootstrap process itself and calling menu_execute_active_handler(); as is normally done in index.php (which no longer gets executed because of the hijack).

I want to note that what Dynamic Cache is doing is pretty scary in that it's almost hacking core without actually modifying any core functions. This fear is actually what triggered me to explore how the Drupal bootstrap process works under the hood so I could understand if there'd be any potential issues.

It's not an easy concept to understand initially, especially since for Drupal 7 you have to enable a second module called "Dynamic Cache Bootfix" that hijacks the bootstrap process a second time to properly finish up the request! I don't want to go into much more detail, but the module's code is pretty slim and I encourage developers to take a look. It will help you get a greater understanding of the bootstrap process and the obstacles this module tries to overcome.

There's also a core issue that is trying to address this problem of not being able to easily disable a cached page from being served. I also encourage you to read through that to get a better understanding of what the problems are.

How I implemented it

In my case, I found that the majority of traffic to the site was from users outside any of the intranets, so I decided to allow them to both trigger cache entries being generated and to be served those cached page entries. For everyone else (a small % of traffic), Drupal would always ignore whatever was in the cache for that page and would also not generate a cache entry:

function my_module_boot() {
  $location = _my_module_visitor_network();
  if ($location != 'world') {
    # Prevent Drupal from serving a cached page thanks to help from the Dynamic Cache module
    $GLOBALS['conf']['cache'] = false;
    # Prevent Drupal from generating a cached page (standard Drupal function)
    drupal_page_is_cacheable(FALSE);
  }
}

Note that Dynamic Cache relies on having a heavy module weight so it runs last - which allows me to disable the cache in my own hook_boot. Make sure you read the README that comes with the module so you set everything up properly.

Also note that I still called drupal_page_is_cacheable(FALSE);. Without this, Drupal may still generate a cached page based on what this user saw. With my code in place, anonymous users outside the networks I was checking would both generate page cache entries and be served page cache entries. Anonymous users within the networks/intranets would never trigger a cache generation and would never be served a cached page.

Final Thoughts

Ideally, I would be able to generate separate page caches for each "type" of visitor I had. I think this is possible by creating your own cache store (which is not that difficult in Drupal 7) and changing the cache ID for the page to include the visitor type. I think the boost module may also allow for this sort of thing.

For really high traffic sites, you're probably going to be using something like Varnish anyway - and completely disable Drupal's page caching mechanism. I don't know much about Varnish but I imagine you could put this similar type of logic in the Varnish layer and selectively let some users through and hit Drupal directly to get the dynamically generated page (especially since my check for visitor network is just based on IP address).

There you have it. Dynamic Cache is by no means an elegant module, but it gets the job done! If you're better informed than I and I made a mistake somewhere in this writeup, please let me know in the comments. I certainly don't want to spread misinformation!

Feb 05 2014

I just finished up a small project at work to create a basic resource management calendar to visualize and manage room and other asset reservations. The idea was to have a calendar that displayed reservations for various resources and allow privileged users the ability to add reservations themselves. The existing system that was being used was a pain to work with and very time consuming - and I knew this could be done easily in Drupal 7. The solution could be extended to create a more general resource booking / room booking system.

I wanted to share the general setup I used to get this done. I won't go into fine detail, and this is not meant to be a complete step by step guide. I'm happy to answer any questions in the comments.

Step 1: The "Reservation" content type

I quickly created a new content type "Resource Reservation" and added a required date field. Due to a bug in a module I used below, I had to use a normal date field and not ISO or Unix (I usually prefer Unix timestamps). These three different types of date fields are explained here. Aside from that, I also made the date field have a required end date and support repeating dates using the Date Repeat Field module (part of the main Date module). I then needed to decide how I would manage the resources and link them to a reservation.

I created another content type "Resource" and linked it to a reservation using the Entity Reference module. Another option I considered was using a Taxonomy vocabulary with terms for each resource, and adding a term reference field to the reservation content type. I decided to go for a full blown entity reference to allow greater flexibility in the future for the actual resource node.

In my case, I created the 6 "Resource" nodes (all rooms in a building) that would be used in my department.

Step 2: The Calendar

Years ago at the 2011 DrupalCamp NJ, I attended Tim Plunkett's session "Calendaring in Drupal." Tim provided a great introduction to a new Drupal module called Full Calendar that utilized an existing JavaScript plugin with the same name. I was very impressed with the capability of the module and wrote about it after the camp was over.

I immediately knew I wanted to use the module and was happy to see it has been well maintained since I last checked it out in 2012. The setup was incredibly simple:

  • Create a new "Page" view displaying node content
  • Set the style to "Full Calendar"
  • Add a filter to only show published "Resource Reservation" nodes
  • Add the date field that is attached to "Resource Reservation" nodes

The style plugin for Full Calendar has a good set of options that let you customize the look and functionality of the calendar. I was quickly able to shorten the time display quite a bit so that start and end times appear as "7:30a - 2:00p".

One thing to note is that while you can add any fields you want to the view, the style plugin only utilizes two: A date field and a title field. Both are displayed on the calendar cell - and nothing else. If you add a date field, the style plugin automatically uses it as "the" date field to use, but if you have multiple date fields for whatever reason you can manually specify it in the settings. Similarly, for the title field, you can add any field and tell the plugin which one to use as "the" title for the event. In my case the node title was suitable. If you wanted to display more than one field, try adding them and then add a global field that combines them, then assign that as the title field.

I loaded up some reservation nodes and viewed them in the calendar and everything was looking great so far. Next I wanted to provide some filtering capability based on the resource of the reservation "events".

Step 3: Filtering the Calendar by Resource

In my case there was a desire to be able to display the reservations for select resources at a time instead of all of them at once. This would be a heavily used calendar with lots of events each day, and it would become a mess without some filtering capability. This was easy enough by creating an exposed filter for the calendar view.

Ideally I would have a filter that exposed all of the possible resources as checkboxes - allowing the user to control what reservations for what resource they are viewing. I'm sure I could have done that by writing my own views filter plugin or doing some form altering, but I settled for this approach:

  • Added a new input filter for my "Resource" entity reference field.
  • Exposed it
  • Made it optional
  • Changed it to "grouped filter" instead of "single filter". This let me specify each Resource individually since there's no out-of-the-box way of listing all available.
  • Used the "radio" widget
  • Allowed multiple selections - this actually changed the radio buttons to checkboxes instead - exactly what I want.
  • Added 6 options for the filter - one for each resource. I looked up the node ID's for each resource and put them in with their appropriate label. Downside is each time a new resource is added I have to manually update the filter.
  • Changed the "filter identifier" to the letter "r", so that the query string params when filters are used aren't so awful looking

There are two major gotchas here. The first is that if you have more than 4 options to choose from, Views changes the checkboxes to a multi select field (bleh). This is an easy fix:

function YOUR_MODULE_form_views_exposed_form_alter(&$form, &$form_state) {
  if ($form['#id'] == 'views-exposed-form-calendar-page') { # find your own views ID
    $options =& $form['r']; # my exposed field is called "r" (see last step above)
    if ($options['#type'] == 'select') {
      $options['#type'] = 'checkboxes';
      unset($options['#size']);
      unset($options['#multiple']);
    }
  }
}

This ensures that the exposed filter is ALWAYS going to be checkboxes. The second gotcha is how views handles the multiple selections. By default, views will "AND" all of the selections together. So if you select "Room 5" and "Room 6", you get reservations that have both selected - which is not possible in my case since I purposely limited the entity reference field on the reservation to only reference one resource. Instead I want views to "OR" them, so it shows any reservations for either "Room 5" or "Room 6". The fix for this is simple, but not obvious:
  • In the filter criteria section in the View UI, I went to "And/Or, Rearrange" which is a link in the drop down next to the "Add" button.
  • I created a new filter group and dragged my exposed filter into it.
  • The top group has the published filter and the content type filter, and the operator is set to AND.
  • The bottom group has my single exposed filter for the resource, and the operator is set to OR.
  • The two groups are joined together with an AND operator.

Setting the second group to use OR is the key here. Even though there is just one item in the filter group, it's a special filter because it allows multiple selections. Views recognizes this and will apply the OR operator to each selection that was made within that filter. By default I had everything checked (which is actually the same as having nothing checked, at least in terms of the end result). This makes it obvious to calendar viewers that they can uncheck resources.

Step 4: Adding Colors for each Resource

Since the default calendar view includes 6 resources, I wanted each resource to be displayed with a color that corresponded to the resource it was reserving. The Full Calendar module can sort of do this for you with help of the Colors module. This module lets you arbitrarily assign colors to taxonomy terms, content types, and users. Colors then exposes an API for other modules to utilize those color assignments however they want. Full Calendar ships with a sub module called "Full Calendar Colors" that does just this by letting you color the background of the event cells in the calendar based on any of those three types of assignments that may apply.

In my case, since I wasn't using Taxonomy terms, I couldn't use the Colors module to color my reservations. Someone opened an issue to get Colors working with entity references like in my case, but it's not an easy addition and I couldn't come up with a practical way of adding it to the Colors module myself.

Instead, I examined the API for Full Calendar and found I could add my own basic implementation in a custom module. Here's the basics of what I did:

  • Add my own color assignment form element to each "Resource" node using form alters and variable set/get.
  • Implement hook_fullcalendar_classes to add a custom class unique to each "Resource" for the calendar cell, like ".resource-reservation-[nid]" (a sketch of this hook follows the list).
  • Implement hook_preprocess_fullcalendar to attach my own custom CSS file (created using ctools API functions) to the calendar that has the CSS selectors for each resource reservation with the proper color.
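
Here's a minimal sketch of that classes hook, assuming it receives the event node and returns an array of class names (check fullcalendar.api.php in the module for the exact signature), with field_resource standing in for whatever your entity reference field is named:

/**
 * Implements hook_fullcalendar_classes().
 */
function my_module_fullcalendar_classes($entity) {
  $classes = array();
  // Add a class unique to the referenced "Resource" node, so custom CSS
  // can color each resource's reservations differently.
  if (!empty($entity->field_resource[LANGUAGE_NONE][0]['target_id'])) {
    $classes[] = 'resource-reservation-' . $entity->field_resource[LANGUAGE_NONE][0]['target_id'];
  }
  return $classes;
}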

Finally I added a "legend" block that lists each Resource (with a link to that Resource node) displaying the color as the background, so users can quickly see what the colors in the calendar meant. You could also avoid some of this complexity by removing the ability to assign colors via the node form and just hardcode the color assignments in your theme CSS file. You'd still need to implement hook_fullcalendar_classes.

Step 5: Reservation Conflicts

With the basic calendar view completed and displaying the reservations, I shifted focus to the management aspect of the feature. Specifically, I needed to prevent reservations for the same resource to overlap with one another.

A little bit of digging led me to a great module called Resource Conflict. This module "simply detects conflicts/overlaps between two date-enabled nodes, and lets you respond with Rules". It requires Rules which is used to setup reaction rules when a conflict is detected, allowing you to set a form validation error. Note the module integrates with Rules Forms as well, but I've found it's not actually required. Resource Conflict is a very slim but capable module - I was very impressed and happy with its capabilities.

The module provides a Rules event "A resource conflict node form is validated". To get this event to trigger, I had to enable "conflict detection" for the Resource Reservation content type (part of the Resource Conflict module). To do this, I edited the Resource Reservation type, went to the new "Resource Conflict" vertical tab, and enabled it by selecting the date field to perform conflict checking on.

The Resource Conflict module provides a default rule that prevents form submissions if there are any other nodes of the same type with an overlapping date. This is too general, because I want the rule to only throw a validation error if the conflicting reservation is for the same resource I'm trying to reserve. I disabled that default rule and worked to create a rule that also takes the resource into consideration. This part was somewhat complicated and I was happy to find some guidance in the issue queue. EDIT: I've since taken maintainership of the module and updated the real documentation page with details on how to perform the following steps.

First, I needed to create a Rule Component that encapsulates the logic to compare two Reservation nodes, check if they have the same Resource entity reference, and if so set a form error. Here's how I did that:

2 Variables:

  • "Node" data type, "Unsaved Reservation" label, "unsaved_reservation" machine name, usage as a "parameter"
  • "Node" data type, "Conflicting Reservation" label, "conflicting_reservation" machine name, usage as a "parameter"

3 conditions:

  • "Entity has field" on the "unsaved-reservation" data selector, checking it has the resource entity reference field
  • "Entity has field" on the "conflicting-reservation" data selector, checking it has the resource entity reference field
  • "Data comparison" to make sure that the values of the two entity reference fields are equal

1 action:

  • Set a form validation error. I wrote in a message including a link to the conflicting resource using the available tokens.

Rule Component

Now, with this rule component in place, I could incorporate it into a normal Rule that reacted on the node submission, loading all the conflicting reservations (based on date alone) and looping through each one to execute the component actions for the more complicated comparison. Here's how I did that:

  • React on event "A resource conflict node form is validated"
  • Added condition for "Contains a resource conflict" - this relies on the "node" param that is made available from the event
  • Added action for "Load a list of conflicting nodes". This is provided by the Resource Conflict module and this is where all the conflict detection is done, comparing other nodes of the same type for conflicting dates. This action is added as a Loop.
  • Add a Rule Component within the action loop, selecting the one we just created.

Since I setup the component with three variables, I needed to pass them in as arguments to the component after adding it into the loop. For the "unsaved-reservation" variable, I fill in "node". For the "conflicting-reservation" variable, I supply the original "list-item" variable from the loop.

Main Rule

Testing the rule proved that I was not able to overlap any dates for the same resource when creating a reservation. Perfect!

Final Thoughts

The basic functionality of the resource management was there. Users could add new reservations for existing resources and were alerted if the reservation conflicted with others. Reservations were displayed in a calendar for the department to see, and users could filter out specific resources to provide a cleaner view. Here are some additional notes and considerations:

  • To allow the calendar to scale, you'll want to enable AJAX on the calendar view which will only display events for a given month (+/- two weeks). There's a bug in the stable release of the module related to AJAX but I provided a patch.
  • If you're using repeating date fields, make sure you uncheck "Display all values in the same row" on the date field settings in the view. If you don't, any exposed filters for the date range (which is how the AJAX feature works for Full Calendar) will not apply to dates with multiple values. If you do this properly, only the repeating dates for the given date range will be loaded.
  • There's a bug in the Resource Conflict module that only allows you to use the standard "Date" database storage type for a date field. I'm working on a patch.
  • Remember that you could also implement a "resource" using taxonomy terms instead of entity references. If you do, you'll have a much better time getting the Colors stuff working.

And that's pretty much it! Let me know if you have any questions in the comments below.

Dec 16 2013

I've been away from full time Drupal development for a couple of years and have recently returned, this time making a commitment to improve my understanding of core. There's a lot of information out there on Drupal caching, but I found much of it to be fragmented and outdated (Drupal 6). I wanted to provide a more comprehensive look at Drupal 7's core caching, explaining how some of this stuff is actually working under the hood.

Measuring Performance

Before we get started, it's worth discussing how you can measure the performance of your site so you can see for yourself the impact caching will have. The easiest way to do this is to use the devel module, which most Drupal developers should already be familiar with. Among other useful features, this module allows you to print out the time it took to render the page and the total memory usage of PHP to serve a page request.

Devel will reveal that for a Drupal 7 site with a couple dozen contrib modules enabled and no caching enabled, about 30-50 MB of memory will be used to serve each page request. It will also show that page execution time (time it took to render the HTML) is around 400-500ms.

Generally, those numbers are not performant and you won't be serving a lot of simultaneous page requests before bringing your server down. You should always be concerned with optimizing your site to increase page response time and reduce memory usage. Even if you're not developing for high traffic sites, you want every visitor to have the best experience possible.

Another useful and easy to use tool is your browser's developer tools. Years ago you had to use FireBug w/ FireFox, but most of the FireBug features are now built into the dev tools native to all the popular browsers, including Internet Explorer (which isn't so bad these days!).

I use Chrome, and the rapid release cycle for the browser means the packaged dev tools suite is very robust and constantly improving. For looking at performance of your site, dev tools is useful in showing you the number of HTTP requests made (the fewer the better), the time it took the server to respond to these requests, and the HTTP headers sent and received for each request. I encourage you to explore the dev tools and discover their usefulness.

How Drupal sets cached pages

Page caching is when Drupal takes the entire rendered output of a page and stores it in the database (or another cache store; defaults to the database). Pages will only be cached for anonymous traffic and for users that don't have session data, like items in a shopping cart. This is because if dynamic data like a shopping cart or a "Welcome Brian" message was cached, it would screw things up when that cached page was delivered for other anonymous traffic.

We need to look into how Drupal loads up every time a page is requested. It's not as complicated as you may think, and it's fairly straightforward to follow. This process is called "bootstrapping" and is split into many different phases. Each phase loads a different part of Drupal, progressively loading more core API functions, theme code, and module code.

If we look at index.php, you can see a call to drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);. If you take a look at the code for drupal_bootstrap, you can see each of the 8 phases and get an idea of what each is doing. Drupal's index.php passes in DRUPAL_BOOTSTRAP_FULL, which indicates that every single phase should be executed to load the entire environment. Each phase is loaded in succession. Also of note are the comments for this function, which indicate how you could call drupal_bootstrap yourself to load the Drupal environment for a custom script (very useful!).
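
For example, a standalone script placed in the Drupal root can load the full environment with just a few lines (this mirrors what index.php itself does):

// Bootstrap Drupal from a custom script placed in the site root.
define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);
// The full Drupal API is now available.
$node = node_load(1);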

So how does page caching tie into this? Well, more time, memory, and CPU is used for each bootstrap function that is loaded. Under normal circumstances, each phase of the process is needed so that the page can be properly rendered. However, when page caching is turned on, and the page is eligible to be cached, Drupal will store the rendered page output via the drupal_page_set_cache function. This function is called right before the rendered output is flushed and delivered to the browser.

I mentioned above that the page must be eligible to be cached. Even with page caching enabled, Drupal may prevent some pages from being cached. An example is a page that displays a dynamic message, like when a user doesn't fill out a form properly and validation errors are displayed. You wouldn't want that message to be part of the cached page result.

Also of note, there's a useful function drupal_page_is_cacheable that, when passed FALSE, instructs Drupal NOT to cache the page it was called on.

How Drupal serves cached pages

Now let's say the user reloads that page that was just generated and cached. Drupal again kicks off the full bootstrap process, but this time things are different because of the second phase of the bootstrap: DRUPAL_BOOTSTRAP_PAGE_CACHE. This phase is used to determine if a cached page can be delivered to the user, and if so, output it directly. The code is simple to follow. Checks are made to see that:

  • The user has no Drupal session cookie (therefore, user is "anonymous" with no dynamic data)
  • Page caching is actually enabled
  • A page cache entry exists for the requested page

If all three conditions are met, then Drupal loads the cached data out of the cache store using drupal_page_get_cache, outputs it, and exits out of the bootstrap process early.

It's worth noting that under normal circumstances, two additional phases are loaded before Drupal can serve the page cache (thanks Mark Pavlitski for the heads up): DRUPAL_BOOTSTRAP_DATABASE and DRUPAL_BOOTSTRAP_VARIABLES. The database phase is needed so Drupal knows how to access the cache, which by default is stored in the database. The variables phase will load all the settings in the variables table and load the "bootstrap" modules (see next section for more info on that).

Alternative cache implementations (like memcache) don't typically need the database for anything when serving a cached page. You can explicitly tell Drupal to skip loading up the database and system variables by setting page_cache_without_database to TRUE in the settings.php file, making responses even faster. Note that since the "bootstrap" modules are not loaded when this setting is enabled, you can't use the hooks discussed below. Everything has a trade-off when it comes to performance.
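
The override is a one-liner in settings.php:

// Serve cached pages without bootstrapping the database or variables.
$conf['page_cache_without_database'] = TRUE;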

Two hooks you can count on

Since the cache delivery happens almost immediately and early in the bootstrap process, most of Drupal's core API and modules are not loaded at all. That means you cannot run any hooks that affect page output. However, there are two hooks that Drupal will execute even on cached page delivery: hook_boot and hook_exit.

How are any hooks executed if Drupal doesn't load the hooks system and modules that implement them (this happens at a later bootstrap phase)? Well, when a module implements hook_boot or hook_exit, Drupal makes note of it in the "system" database table when the module is enabled. These modules will be loaded on demand when the hooks are invoked in DRUPAL_BOOTSTRAP_PAGE_CACHE. However, the more modules that implement these hooks, the slower it is for Drupal to actually serve a cached page entry (more code = more time).

Modules can use hook_boot to execute any code that must run on every page, whereas its companion hook_init is invoked only when a page is first rendered (meaning not on cached pages).

Almost all of the time you'll want to use hook_init, typically for things like adding specific CSS or JS files to a page. hook_exit is used to execute any code after a page has already been sent to the browser and right before the php process exits.

The popular devel module uses hook_boot so it can ensure its profiling code is run even for cached pages. Note I previously wrote that the redirect module implemented hook_boot, but that is incorrect. Must have been a late night when I wrote that!

Both hook_boot and hook_exit can actually be disabled on cached pages as well to provide even further performance gains for cached pages. This can be done by setting page_cache_invoke_hooks to false in your settings.php file. A lot of modules rely on those hooks though, so you'd really need to understand the repercussions of turning those hooks off. In Drupal 6 you could control this on the performance settings page, but now it's just an override in your settings.php file.
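
Like the database override above, this is done in settings.php:

// Skip hook_boot() and hook_exit() when serving cached pages.
$conf['page_cache_invoke_hooks'] = FALSE;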

Page compression

Once you enable page caching, Drupal will reveal an additional option on the performance page called "Compress cached pages." With this enabled, Drupal will first compress the rendered content using PHP's gzencode function (see drupal_page_set_cache) before saving it. This dramatically reduces the size of the data stored in the cache, and offers an additional benefit: web browsers can accept this compressed content directly and uncompress it themselves.

Browsers that support this (just about all of them) add a header indicating as such, and Drupal will deliver the gzipped content directly to the browser. The heavy lifting of decompressing the data is left up to the resources on the user's machine - which is a good thing. It reduces the load on your server (except for that first "hit" that must be compressed) and decreases the transfer time and bandwidth.

If page caching is disabled, Drupal won't compress it before delivering it, but your web server can do that if you wish. Apache and Nginx both support this. There's some debate about whether you should use this in conjunction with Drupal's compression or not. For the most part you should be okay just having Drupal handle it for you. If you are working on a site where performance is a huge concern, this is something you'll need to look into more yourself.

Performance gains

The benefits of page caching are immediately clear. Above I mentioned that a Drupal site could use around 30-50 MB of RAM just to serve one page request. While that RAM is used for only a half second or so, it severely limits the amount of traffic you can serve. If a cached page is delivered instead, you're looking at around 2-4 MB of RAM paired with a dramatic improvement in page response time.

You won't be able to use the Devel module to print out the memory usage and execution time for cached page results. That's because Devel has no opportunity to alter the output of a cached page (nor does any other module, as I discussed above). I wrote a blog post a while back explaining how you can determine php memory usage for cached page results. Check it out if you're interested.

Your browser's dev tools will also show the dramatic improvement in response time. To test it out, clear your page cache (in performance settings, or using drush) and then load a page with dev tools open. Note the time it took to get the page from the server. Now reload the page and look at the time again (make sure you're logged out). On the second request, Drupal is returning the cached page that was stored from the first request.

Of all the caching methods available in Drupal core, page caching is by far the most effective and performant. Of course it's of no use unless you're serving to "anonymous" logged out users, but the majority of Drupal sites are probably aimed toward static content delivery.

How and when the page cache is cleared

There's a Drupal function called cache_clear_all that is used all over the place to wipe out cache entries in various "bins". Here are some of the actions that trigger a call to cache_clear_all, clearing (among other cache bins) the page cache:

  • A node is created/edited/deleted
  • A block is created/edited/deleted
  • A comment is created/edited/deleted
  • A vote is registered in a poll
  • User profile fields are manipulated
  • System theme settings are changed
  • Taxonomy terms/vocabularies are manipulated
  • Permissions for roles are changed
  • Cron is run

That's quite a list! Why do so many actions trigger a cache clear? For the most part, it's because Drupal doesn't know where your content is displayed on the site. It's not quite intelligent enough (but it will be in Drupal 8). It makes the assumption that any one of your cached pages may include a poll, a node, a comment, a taxonomy term, etc. So any time those are changed or added, Drupal clears the entire page cache!

Here's a common example: Let's say you have a View that displays the 5 most recent news articles on your homepage. When you submit a new news article, you'd want that list to be updated. The only way that list is updated is if you clear the page cache entry for the homepage, or else it will display stale content.

All those cache clears can be problematic for a site that sees even a small amount of updates. Whenever a cached page is wiped out Drupal has to regenerate it on the next hit. That unlucky visitor will have to wait a few seconds while the whole thing is rendered instead of the snappy cached version. To combat this, Drupal allows you to enforce a minimum amount of time a cache must be valid.

Minimum cache lifetime

This is a setting on the performance page that has been confusing users for years. The minimum cache lifetime determines the minimum amount of time that must pass between entire cache clears. Many users misunderstand this setting to apply to the lifetime of individual cached entries, but it has nothing to do with individual entries. If you have the minimum set to 10 minutes, you could create a new page and have it be cached for only 1 minute before it is cleared from the cache. It doesn't mean that a page will be cached for at least 10 minutes, or automatically cleared out after 10 minutes. Nothing is broken; this is how the system is designed, for better or worse.

If you don't have the minimum lifetime set (which is the default), the page cache will be cleared no matter what on any of the above actions (including cron!). If you do set a minimum, any time cache_clear_all is called to clear the page cache, it will first set a system variable recording the timestamp of the request. On a subsequent request to clear the page cache, Drupal compares the current time to that recorded time, and only clears the cache if the difference exceeds the minimum you set.
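
Here's a simplified sketch (not the exact core code) of that gating logic; the variable names mirror Drupal 7's, but the function itself is illustrative:

function example_clear_page_cache() {
  $lifetime = variable_get('cache_lifetime', 0);
  $flush_requested = variable_get('cache_flush_cache_page', 0);
  if ($lifetime && !$flush_requested) {
    // A clear was requested: just record when, don't clear anything yet.
    variable_set('cache_flush_cache_page', REQUEST_TIME);
  }
  elseif (!$lifetime || REQUEST_TIME > $flush_requested + $lifetime) {
    // Either no minimum lifetime is set, or enough time has passed since
    // the first clear request: actually empty the page cache.
    cache_clear_all('*', 'cache_page', TRUE);
    variable_set('cache_flush_cache_page', 0);
  }
}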

No matter what you do, the process is rather inefficient. What this usually means is that a lot of your visitors will be hitting non-cached pages and having a slower experience. One solution is to "warm" the cache after it has been cleared, which you can do with a crawler that hits all the pages on your site. You can also use Boost, which has a built-in crawler and more advanced cache logic. Sites with serious traffic will probably use a reverse HTTP proxy like Varnish instead of Drupal's page caching. There's also the Alternative Database Cache module, which aims to correct some of these core shortcomings (thanks to Eric Peterson for authoring it and bringing it to my attention).

Expiration of cached pages

This other option on the performance page is more straightforward, and hopefully shouldn't confuse people thanks to the helpful comment alongside it. At first you may think this is a way to control the maximum amount of time an individual page will be cached before Drupal forces a new rendering of it. However, the comment reads "The maximum time an external cache can use an old version of a page."

This controls the Cache-Control HTTP response header, setting its "max-age" parameter to whatever value you indicated. HTTP reverse proxies like Varnish or Nginx (or a CDN like Akamai), which can provide an extra caching layer in front of Drupal, use this important header to expire pages in their caches. The setting has nothing to do with Drupal's internal caching mechanisms.
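
To make that concrete: with the setting at 300 seconds, cached responses go out with a header like "Cache-Control: public, max-age=300". This is roughly what core does when serving a cached page (a hedged sketch; page_cache_maximum_age is the real Drupal 7 variable behind this setting):

// Send the max-age that external caches should respect.
$max_age = variable_get('page_cache_maximum_age', 0);
drupal_add_http_header('Cache-Control', 'public, max-age=' . $max_age);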

Conclusion

Page caching is a no-brainer for most websites. It dramatically reduces the system resources consumed when serving pages and allows you to serve much more traffic at once. A lot of the page cache settings may seem counterintuitive, and it can take hours of digging through the code to see what's going on and figure out why. Hopefully this blog post clears up some of the confusion and gives you a better understanding of what's happening under the hood.

I plan on writing up more on the other forms of caching in Drupal, like Views, Block, and Form caches. Stay tuned, and please comment below.

Dec 06 2013
Dec 06

I recently started on a project that involves migrating some data from a legacy app & database into Drupal. The old application is a collection of PHP scripts that basically just generate forms, accept data, insert said data into the database, and output it on a website. Pretty simple stuff - there's not a whole lot going on. It was developed long before many of the popular CMS's and frameworks came to be, and probably before people really started paying attention to the character encoding of their data.

I initially began the data migration using Drupal's Migrate module, setting up my new content types and fields, and running through some test imports. Things seemed to go well, until I started scanning the imported data and seeing some really strange characters like ’ and é.

For the most part, it was pretty clear what these characters were supposed to represent, based on the context they were placed in. I knew right off the bat that it had something to do with character encodings, something I'd never taken the time to truly understand. I made the mistake of thinking "oh, this shouldn't take too long to fix," later unraveling my very own "character encoding hell". I'll talk more about that towards the bottom of the post, but let's first get an intro to character encodings.

So what are character encodings exactly?

When textual data is stored and transmitted, it has to be converted to binary just like everything else. So how is text converted? There needs to be a lookup table matching characters with binary representations. That mapping is determined by the character encoding that is chosen - and there are many to choose from.
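
Here's a quick PHP illustration (assuming the script file itself is saved as UTF-8) of how the same character maps to different bytes under different encodings:

// The same character becomes different bytes depending on the encoding.
$char = 'é';                // this source file is saved as UTF-8
echo bin2hex($char) . "\n"; // "c3a9" - two bytes in UTF-8
echo bin2hex(iconv('UTF-8', 'Windows-1252', $char)) . "\n"; // "e9" - one byte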

Whenever data is decoded, the character encoding must be known beforehand. Web browsers do this every time you view a website. How do they know what encoding the text is in? The web server tells them, or the browser has to make an educated guess (not what you want!).

ASCII is a very basic character encoding that many are familiar with. It covers the English language along with common symbols and control characters, using only 7 bits to provide a maximum of 128 characters in the set. ASCII isn't really used anymore on the web because of the small number of characters it supports. There are many encodings that do use a full 8-bit byte for each character, re-claiming that wasted 8th bit from ASCII and bringing the total to 256 characters. In America, the most common single byte encodings are probably Windows-1252 and ISO-8859-1. However, no single-byte encoding can possibly hold all of the characters necessary to create one "universal" encoding that supports all known languages and symbols.

UTF-8 is a Unicode compliant character encoding that has become the dominant encoding on the web. One of the strongest properties of UTF-8 is that it's a variable width byte encoding - meaning a single character can be represented with one or more bytes (more advanced, less used characters take up more bytes). Most importantly, UTF-8 supports just about every character in every language you can think of. This is very important for the web. It makes multilingual sites easier to manage since you don't have to worry about any localized character sets for each language. Everything uses the same character set.

Most developers should only be dealing with UTF-8 at this point (or another Unicode encoding), and should understand how character encodings are involved in every part of their website or application.

Where you need to worry about them

Remember that any time textual data is transmitted, it needs to be encoded in a specific encoding, and decoded on the other end. The other end needs to know what encoding was used. There are at least 4 major areas where a web developer needs to be concerned with character encodings:

Web pages

When a response with text in it is delivered from a server to a client, the server needs to tell the client about the encoding.

There are two opportunities to do this - one is the "Content-Type" HTTP response header which is typically set to text/html; charset=utf-8 for standard HTML pages. Your application should set this before delivering the response to the browser. It also includes the MIME type of the response which tells the browser what type of document is being delivered (image, video, document, etc).

The other is a meta tag in the header: <meta charset="utf-8"> or <meta http-equiv="Content-Type" content="text/html; charset=utf-8">. The former is the newer HTML5 version. It's a bit odd to indicate the character encoding within the very data that needs to be decoded, but this is allowed in HTML. Parsers will interpret everything as ASCII until they hit that tag (which works, since HTML syntax falls within ASCII and can be parsed that way), then re-parse the document with the new encoding. The reason it's supported in HTML is to account for any inability to set the HTTP response header, which would otherwise provide the same info.

You should be using both methods. Without this information, the browser will have to guess, and it may display "garbage" text.
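
In plain PHP, covering both bases looks something like this (a minimal sketch):

// 1. HTTP response header - must be sent before any output:
header('Content-Type: text/html; charset=utf-8');
// 2. And declare it in the markup too, inside <head>:
//    <meta charset="utf-8">  (HTML5)
//    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">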

You can actually see what character encoding your browser chose to render the page in, and even force it to render the page using a different encoding. It's a handy tool that can help diagnose whether a page was meant to be rendered in some other encoding. Chrome, Firefox, and Safari all support this ability (IE probably does as well) in the "View" menu.

Form submissions

When data is input in text boxes or text areas in an HTML form, the browser has to encode it before sending it to the server. What encoding does it use? Again, this is up to you to decide. By default, most browsers will just use the same encoding that the page was rendered with. However, you can specify it in the <form> tag: <form action="/process.php" accept-charset="UTF-8">. So, while explicitly indicating the character set to encode the data with is not strictly necessary, you should do it anyway just in case. If it's not set, you risk the browser encoding the data in some encoding your back-end is not anticipating.

MySQL connections

Something that is often overlooked: when you communicate with a database server and send textual data, you need to indicate the character encoding used when sending and receiving data between your back-end and the MySQL server. This makes sense once you understand that any time text is transferred from one place to another, you need to indicate what encoding the text is in - the text does not automatically carry a universal indication of its own encoding.

It gets pretty tricky here, at least with MySQL. There are specific settings in MySQL that you can set after a connection has been established that dictate how characters are treated between the client and server. In particular, there are three things that are important to look at:

  • The encoding of data you send to the server from the client
  • What encoding the server should convert to after receiving the data
  • The encoding the server should return to the client when queries are run (a conversion if necessary).

Assuming the data you're sending MySQL actually IS in UTF-8 and the data in MySQL is stored as UTF-8, you'll want all three of those things to be UTF-8. No conversions will actually take place, and MySQL will just pass everything along as UTF-8. To set these values, you can simply execute the statement SET NAMES utf8 after making a connection.
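
With plain PHP and PDO, for example, that might look like this (connection details are illustrative; the charset option in the DSN requires PHP 5.3.6 or newer):

// Preferred: declare the connection charset right in the DSN.
$pdo = new PDO('mysql:host=localhost;dbname=example;charset=utf8', 'user', 'pass');
// Equivalent fallback for older PHP versions:
$pdo->exec('SET NAMES utf8');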

If you DON'T set these values, MySQL will more than likely default to latin1 (Windows-1252), which is just asking for trouble! A knowledgeable developer may recognize that they need to set the character encoding for their web page, their forms, and even their database fields (see below). But if they have a back-end script that accepts UTF-8 data from a form submission and doesn't tell MySQL it's UTF-8, then MySQL will treat it as latin1 when it's actually UTF-8.

MySQL text field storage

Text fields in MySQL require you to indicate their character encoding. Defaults are set up at the server, database, and table level (each inheriting from the one above). Since you're telling MySQL a field is a text field, it needs to know how to interpret the raw data it stores as textual data. Without that, you wouldn't be able to query for text or have MySQL compare text fields with one another. You could alternatively use BLOB fields, where the data is stored as binary and not interpreted in any way - just keep in mind you won't be able to search on those fields.

From what I can tell (along with another blog author I reference below), MySQL doesn't change how the data is stored based on the encoding of a field; it just interprets the data differently when reading it. I could be wrong here (if I am, please let me know in the comments). However, you can alter the encoding of an existing field, which will re-encode it (and actually change the stored bytes), as long as you recognize the limitations outlined in that article.

For instance, if the string é was encoded as latin1 and stored in MySQL in a latin1 field, it will be stored as hex value E9. You can verify this yourself by running something like SELECT hex(text_field_name) FROM table_name. If you then run ALTER TABLE `table_name` CHANGE `text_field_name` `text` MEDIUMTEXT CHARACTER SET utf8 NULL; MySQL will convert that data from latin1 to UTF-8 for you. If you run the hex query again, you'll get back C3 A9.

In MySQL 5, all the text storage defaults are set to latin1 (Windows-1252), just like the connection settings discussed above. While it's well known at this point that UTF-8 is preferred, I believe the team behind MySQL didn't want to make the dramatic change of switching the defaults to UTF-8 quite yet (though I think this is how it will be in MySQL 6).

Note: the collation of a field has nothing to do with how the data is stored; it instead affects how the data is compared and sorted.

What can go wrong

One of the important things to understand is that UTF-8 is a multibyte encoding. This means that the majority of characters are represented with more than one byte (typically two or three), while a traditional character set uses just one byte per character. To really understand, let's look at a practical example. Here's the word "Résumé" in two different common encodings:

Encoded in      Bytes                      Interpreted in Windows-1252    Interpreted in UTF-8
Windows-1252    52 E9 73 75 6D E9          Résumé                         R�sum�
UTF-8           52 C3 A9 73 75 6D C3 A9    RÃ©sumÃ©                       Résumé

The normal letters in Résumé are actually the same bytes in both Windows-1252 and UTF-8 (that was an intentional design choice when creating UTF-8: it's backwards-compatible with ASCII). However, once we get to the accented é, it's represented with two bytes in UTF-8 and just one in Windows-1252.

If your data is sent from the browser to the server as UTF-8 (something you as a web developer have control over), is then stored in your database as UTF-8, and is finally pulled out of the database and displayed on your website as UTF-8, then all will be dandy. This is how things should be, and you'll have little to worry about.

But let's say that you accidentally removed the charset meta tag from your page template, and your web server and app don't set the Content-Type HTTP header. Your app will still be delivering UTF-8 encoded text to the browser, but the browser no longer knows it's UTF-8. So the browser guesses. It's possible it will correctly guess UTF-8, but it's also possible it will guess Windows-1252. If that happens, Résumé will look like RÃ©sumÃ©. Why? Because the é was sent to the browser as C3 A9, which in Windows-1252 translates to the characters Ã and ©. If it were correctly interpreted as UTF-8, the browser would translate C3 A9 back to é.

Whoops! You catch the error a few hours later and make sure the character set is properly set to UTF-8. No harm done, right? Since the data didn't actually change at all, there's no data corruption to worry about, and the browser now interprets the text properly as UTF-8. Well, not so fast. Let's say you have a form on your site that accepts comment submissions, and someone submitted the form while the browser was rendering the page as Windows-1252. Since most browsers encode form submissions using the same encoding used to render the page, that text may have been encoded as Windows-1252 and sent along to your back-end.

Now you have a problem. Your app is expecting the data to come in as UTF-8 when it's actually encoded as Windows-1252! If the comment contains the word "Résumé", it will be encoded as 52 E9 73 75 6D E9 instead of 52 C3 A9 73 75 6D C3 A9 like it should be in UTF-8. You may have your database driver set up to indicate that the data you're sending is UTF-8, and the database field may be set to store UTF-8. It's possible your DB abstraction code throws an exception when it sees that 52 E9 73 75 6D E9 is not valid UTF-8 (which is correct - it's not), but it may let it slide and insert it anyway.

After you've fixed the website to tell browsers you're sending UTF-8, let's say someone goes to view the comment that was submitted earlier. Instead of displaying Résumé, the browser will actually display R�sum�. That funny question mark box is the Unicode "replacement character", used when a byte sequence in the text is not valid UTF-8. E9, the byte that Windows-1252 uses for é, is NOT valid in UTF-8. If you're curious why, you'll need to read more about how UTF-8 encodes text.

You now have data corruption. To fix it, you'd need to find the entries that were submitted with the wrong encoding and manually convert them to UTF-8. Not a fun task. You can't even assume that all text submitted after your bad commit was encoded as Windows-1252 and stored incorrectly; some browsers may have properly sent UTF-8 data that was properly stored. So if you tried to re-encode all the data after that commit to UTF-8, you'd end up double-encoding the valid UTF-8 characters.
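
One imperfect heuristic for auditing a mixed-up table is to flag values that aren't valid UTF-8 and convert only those (a hedged sketch - it can't catch everything, since plenty of Windows-1252 byte sequences also happen to be valid UTF-8):

// "Résumé" encoded as Windows-1252 - not valid UTF-8.
$bytes = "\x52\xE9\x73\x75\x6D\xE9";
if (!mb_check_encoding($bytes, 'UTF-8')) {
  // Assume it was Windows-1252 and convert it.
  $bytes = iconv('Windows-1252', 'UTF-8', $bytes);
}
echo $bytes; // Résumé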

Conclusion and further reading

Character encodings are not trivial to deal with, and it's not terribly fun trying to figure out why funny characters are being displayed in your app or website. Hopefully after reading this you'll take the subject seriously and really begin to understand what character encodings mean for your website and application. Generally, you should be using UTF-8 everywhere! If you're using Drupal, you don't need to worry about much, since Drupal already creates database fields as UTF-8 and sets the correct database connection settings for UTF-8. You should still take the time to understand as much as you can about encodings, though.

The application I'm working on has been operating for years without a character encoding set in its HTML pages or form submissions. I've determined that form submission data has been encoded in UTF-8, Windows-1252, and ISO-8859-1, with no definitive way to figure out which data is which. The application does not set the character encoding for the MySQL connection at all, so it defaults to latin1 (Windows-1252). To top it all off, the database text fields are a mix of UTF-8 and latin1 (and the UTF-8 fields may have been "converted" from latin1). It's been frustrating trying to determine what is right and wrong in the database and how to properly fix it, because there's a mix of encodings at both the field storage level and in the input coming into the app. There's no sure fix for any of it.

If you're interested in learning more, these are some great starting points:

Feb 16 2012
Feb 16

In case you've been living under a rock for the past two years: Drupal 7 is known to use quite a bit of PHP memory every time a page is loaded. I won't get into why that is or why it's such a big jump from Drupal 6 (I'm in no position to comment on that), but what you need to know is that Drupal 7 can easily use 40-50 MB per page load for a small-to-mid-size website, and much, much more for larger websites. Turn on a PHP opcode cache like APC and you can probably get that small-to-mid site down to 10 MB of usage or lower. With this number in mind, as well as an idea of how much traffic your site will see per day, you can come up with a ballpark server specification that will handle your website.

How do I know that? The Devel module of course! There's an option in the settings for Devel that will report to you (at the very bottom of your page) the total memory that Drupal used from bootstrap until the page was completely built and served to the browser. If you've never done this before, go try it out for yourself. There are also a bunch of other performance reporting options in there that are worth playing around with as well. 

But let's get back to the memory usage itself... that 10 MB per page load I was talking about actually only relates to authenticated users on your Drupal site. Most likely, the vast majority of your visitors will be anonymous, non-logged-in users. This is very important to understand. If you are building a content-driven website with no community portal or user login area, that 10 MB is not an accurate representation of the memory usage of your website. What you really want to know is how much memory Drupal uses when serving pages to anonymous users.

Assuming you have anonymous page caching turned on (you do, right?), Drupal will use far less memory and system resources to serve page requests for anonymous users than it does for authenticated users. Cool, so let's just go to the Devel permissions and give anonymous users access to see devel info.

Nope. For reasons I don't completely understand at the moment (I'm not an expert on Drupal page caching), the devel summary information is not printed out for cached pages. So I did some exploring and found the function in devel.module that gathers the memory usage and prints it out. Turns out the code is still executed, but the text it generates is never printed. For Drupal 7, the code is in the function devel_shutdown_memory in the file devel.module. This is a helper function that is called when Drupal is just about done doing its thing, after all its code has been processed and memory has been used. Here's that function:

function devel_shutdown_memory() {
  global $memory_init;

  if (variable_get('dev_mem', FALSE)) {
    $memory_shutdown = memory_get_usage();
    $args = array('@memory_boot' => round($memory_init / 1024 / 1024, 2), '@memory_shutdown' => round($memory_shutdown / 1024 / 1024, 2), '@memory_peak' => round(memory_get_peak_usage(TRUE) / 1024 / 1024, 2));
    $msg = ' Memory used at: devel_boot()=@memory_boot MB, devel_shutdown()=@memory_shutdown MB, PHP peak=@memory_peak MB.';
    // theme() may not be available. not t() either.
    return t_safe($msg, $args);
  }
}

This function is still called for anonymous page views, so this is where we'll put our dirty hack. That return statement sends off the built string with the memory usage in it, which eventually gets printed out; but like I said, that's not visible for cached pages. Instead, let's write the memory usage to the PHP error log. This is just a temporary adjustment to the code so we can get reports of how much memory Drupal is using; once we have the info we need, we can just remove the error_log line.

Add this just before the return statement:

// Temporary hack: log this page's memory usage (in MB) to the PHP error log.
error_log(round($memory_shutdown / 1024 / 1024, 2));

Then you can tail your error log as you browse your website as an anonymous user. Now you'll be able to see the memory usage of anonymous page visits on various pages of your site.

I wrote this blog post because over the summer I was unable to find any information on how to easily find memory usage for anonymous page views, and I was in a situation where I needed to know that. Working on a site where memory usage for logged in users (with APC) was reaching 30MB, I was happy to find out that anonymous users were only using 4-6MB. This allowed me to tweak Apache's settings and find a good cloud hosting environment for the website.

Feb 13 2012
Feb 13

Last weekend @ DrupalCamp New Jersey, Tim Plunkett presented a wonderful alternative to the Calendar module for Drupal, called FullCalendar. I could be wrong, but I believe this was the first presentation given on the module, which is very close to a stable release for Drupal 7.

Many who have wanted to create calendars in Drupal have used the time-tested Calendar module that's been around for years. I haven't used that module in about a year, and I don't know what sort of advancements have been made in the Drupal 7 version of the module (also close to a stable release it seems). This blog post will focus on FullCalendar and its features, and not compare it to the Calendar module.

Side note: this post provides a very general overview of how to use the module and its features. I urge anyone looking to use it to read over the documentation and visit the project page for a more comprehensive overview.

Set up

Creating visual calendars in Drupal should be a fairly easy thing to do - and that's exactly what FullCalendar sets out to make possible. After installation, all you have to do is create a view, set the format to FullCalendar, and add a date field. With just the base module installed, the style format settings allow a good amount of customization of the calendar. The whole module is based on the FullCalendar jQuery plug-in created by Adam Shaw, which itself is modeled on Google Calendar. So just about anything that can be done using that plugin is also possible using the Drupal module.

You can have the calendar display Month, Week, or Day, and allow the user to toggle between the displays without reloading the page or generating a new Views display. There will likely be support for a Year view down the road as well, but that's dependent on the maintainer of the jQuery plug-in itself, not Tim. There are options to control how the header of the calendar looks, change the time & date formats (for everything), and control which fields in the view correspond to the functional components of the calendar.

If you want more control over the plugin, enable the FullCalendar Options sub-module. This will present you with a ton more options when editing the display format settings in the view. Casual users won't need these options, which is why they're placed in a separate module.

Styling/Theming the Calendar

Out of the box, the module utilizes jQuery UI to theme the calendar. This presents a beautiful-looking calendar that can change to match your site's theme quite easily using ThemeRoller. If you aren't satisfied with the display options you have with jQuery UI, you can disable its inclusion so you don't have to fight against all of the complex styles it adds. Instead, you can look at the markup the module produces and style it as you see fit. It will take longer, but you'll have much more control over how it looks.

If you want to embed the calendar into a smaller block on the site - that's easy enough. Just create a block display and the calendar will adapt to fit the area the best it can. For whatever the calendar style settings can't do, you can override using CSS.

If you are using the Colorbox module, you can also have events open up in a Colorbox on click instead of bringing you to the event page. You can specify which HTML element from the event you want to bring in and display in the colorbox. This is a really great feature to have. And of course you can change the styling of the colorbox using that module or your own CSS.

The power of FullCalendar

Because the module uses jQuery so heavily, you can do some really powerful things with it. Right out of the box, users have the ability to drag events to new days and even change their times, simply by dragging an event from one day to another or resizing the box it takes up on the calendar. This can be disabled if that's not what you need the calendar for, or a module can hook into the access permissions for fine-grained control.

Enable the FullCalendar Colors sub-module (and its dependency, the Colors module) to bring in coloring for events based on node type, taxonomy term, and user role. You can control the weight at which the colors are applied to each event as well!

By default, all results from the view are loaded, even if the user may never view them. FullCalendar supports Views AJAX to load only the results needed for the current display type - a relief for performance junkies.

Many may be asking what happens when JavaScript is turned off. FullCalendar creates the calendar dynamically using JavaScript, so when it's turned off, it displays just a flat list of the events and nothing more. I think that's a good compromise: all of the events and their dates/times are still viewable, just not in the great calendar view.

But the true power of FullCalendar is that it's really good at one thing: displaying calendars with your events in them. jQuery is utilized heavily to create a really slick and customizable appearance that will leave site builders in awe at the pure ease of creating a simple calendar display for a site they're working on.

Closing Comments

FullCalendar is seeing a ton of active development at the moment. Tim stated during the presentation that bug reports and feature requests are usually addressed really quickly - in part because the other maintainer lives in Europe, so one of them is typically available. I can't see a reason why someone would opt to use the Calendar module instead of FullCalendar. The only thing Calendar has that FullCalendar doesn't seem to have is a Year view, but like I said earlier, that may eventually make it in.

Something that was stressed by Tim is that FullCalendar is for display only. There are a few functional aspects of the module, like dragging events around to change their time/date, but not much more. There's some recent code in the FullCalendar Create module that allows admins to create events through the calendar interface by selecting the date to create the event under, so that's something to keep an eye on... But for the most part, FullCalendar doesn't attempt to do more than just display a calendar - and it's very good at doing it!

Feb 04 2012
Feb 04

At DrupalCamp NJ (the first ever - glad NJ is getting some love!), Jesse Beach from Acquia presented her thoughts on how content is served in Drupal, and how to fix some problems that have surfaced over the years of web development. The traditional method for serving content to browsers is to have the server send the entire DOM all at once. This was okay when websites were much simpler and developers created them as individual HTML pages, but websites are much more complicated now and often sit behind a complex CMS like Drupal.

The problems

To help illustrate this problem, let's take a small Drupal 7 site configuration into consideration. When Drupal serves a page, it does a lot of work for that one request. Drupal needs to load and process every module that is enabled on your website all at once, as part of the Drupal "bootstrap" process. This uses up a ton of memory on your web server... A Drupal 7 site with ~20 contrib modules enabled could use between 40-50 MB of RAM every time a page is requested! To alleviate this, there are a ton of ways you can cache data to prevent the server from working too hard. With anonymous page caching, APC, and CSS & JS caching enabled, you can reduce that memory usage to around 5 MB per anonymous page request. Big difference!

Even with all of the advances in caching techniques, there is a large push to alter the traditional way that pages are served. Jesse's presentation focused on what's called client-side content inclusion. All this really means is loading content as it's needed and not all at once. Most users only care about the primary page content, especially on mobile devices where bandwidth is a big concern. 

How it's done

So how is this done? Instead of having the server serve everything that could possibly be on the page, have it just load the menu and main content area. In addition, instead of having the server handle templating the data, have the client do it. Web browsers and JavaScript are good at working with the DOM; web servers are best at processing code and sending off data.

There are two main ways this is handled. The first is providing links to load secondary content, like a sidebar containing related articles, or comments for a blog post. Once a user clicks one of these links, JavaScript intercepts the click and sends a POST request to the server for the additional content. Once the server responds, JavaScript injects the content into the DOM dynamically.

The second approach is similar, but instead of loading the content when a user takes an action, the content is automatically loaded after the main content has been retrieved. This is not a new concept by any means, but it's not something that's easily achieved today - and it's not ideal for Drupal, because every request bootstraps the whole CMS.

Ideally, the bits of data requested from the server are returned in a raw format like JSON. Then you can use a JavaScript templating plugin to inject the data into an HTML template before placing that content into the page DOM. This approach takes significant load off of the web server and hands it to the web browser. Browsers like Chrome and Firefox are constantly releasing new versions with faster and faster JavaScript engines that help this workflow.

Thoughts & concerns

One big concern with client-side content inclusion is how it affects SEO. Since search engine crawlers don't run JavaScript, they won't see secondary content that isn't served with the primary page. To get around this, there are some "best practices" you can use for both SEO and accessibility (screen readers don't run JS either). Have each "loader link" for secondary content actually point to a real page with the content on it. That way, Google and screen readers will follow those links just fine and index the content, while you intercept the links with JavaScript for normal page views. Jesse also mentioned that Google recently blogged about how they are going to begin processing JavaScript on pages, so that's something to keep an eye on. I couldn't find the blog post she mentioned, but did find this.

As for Drupal, work is being done to make it easier to serve content without doing a massive Drupal bootstrap each time. Check out the Content API module (in active development). There is also a fairly large initiative for Drupal 8 called WSCCI that is attempting to make Drupal more of a "service" with a CMS on top of it. That will make it much, much easier to do content includes dynamically.

Jan 24 2012
Jan 24

I've been working on a project that requires that the search block look a very specific way: the text input and submit button needed to be directly next to one another. I could probably have gotten it done with just CSS, but all the extra markup in the search block was really bothering me.

I've heard great things about Drupal 7's new rendering system, but it was pretty mysterious to me. I remember attending a session about it @ DrupalCon Chicago, but it was a bit over my head at the time. Now that I had to mess around with this search block, it was a great opportunity to dig into drupal_render(). FAPI adds what are called theme_wrapper functions to elements before they get rendered. These are like "mini" theme functions that are applied, in order, to each element they are attached to.

For the search block, there are two elements I wanted to strip extra, unnecessary markup from. The search input field is one; it has a theme_wrapper function called "form_element". When drupal_render renders this form, it sees that the element has a theme_wrapper and passes the element data into it for theming - this is when the extra markup around the form element is added. All you have to do is clear out the theme_wrapper before rendering to prevent that from happening. The same applies to the submit button.

But where do we clear it out? Turns out this can't be done in a form_alter, because the theme_wrappers haven't been added yet by the time your module gets its hands on the form. Thanks to Drupal 7's new rendering, you can use hook_block_view_alter to unset the theme_wrapper functions. Since a block's $content variable is populated with a renderable array and not static content (like it was in Drupal 6!), we can modify the fully populated form array using this hook. What's even greater is that you can use this hook in your theme's template.php file, which I don't believe was possible in Drupal 6.

Here's the code:

function YOURTHEMENAME_block_view_alter(&$data, $block) {
  // Only touch the search module's search form block.
  if ($block->module == 'search' && $block->bid == 'search-form') {
    // Unset the wrappers so the submit button and the text field render
    // without the extra container markup their wrappers would add.
    unset($data['content']['actions']['#theme_wrappers']);
    unset($data['content']['search_block_form']['#theme_wrappers']);
  }
}

And there you go. Now the extra markup around the form elements will be removed.

Jan 23 2012
Jan 23

This is a fairly common menu structure for Drupal developers to deal with, and depending on how it functions, it could be super easy. Here's the scenario: you need to display the 1st level of links in a menu horizontally, and the 2nd level in that same menu right beneath it.

There are at least two different ways that this can be implemented. The first is a static approach, where you do NOT need to show children of each primary link when hovering.

Consider this menu structure:

- About
--- Staff
--- Company
--- Employment
- Services
--- Our Industries
--- Our Packages

The static approach means the 2nd level of horizontal links will only be populated with links from the active menu trail. So when you are on the "About" page, the second level is populated with the Staff, Company, and Employment links. If you want to see the sub-links of Services, you have to first browse to the Services page. To get this method working quickly in Drupal, make sure your menu settings are configured to populate the main links and secondary links variables properly (see screenshot). Then you can place them one on top of the other, and you're good with some CSS styling.

However, this approach could be considered undesirable if you want your visitors to find what they are looking for quickly.

That brings us to the second approach: using JavaScript to show the second level of links for each primary link when hovering over any of them. That means that if you hover over the "Services" link, the second horizontal menu is populated with Our Industries and Our Packages. When hovering over "About", the second level dynamically changes to those sub-links. You get the idea.

The problem? You need to pre-load all menu links for the 1st and 2nd levels of the menu. Thanks to the Menu Block module, this is very easy to do. Just add a new menu block and configure it to "expand all children of this tree" with a maximum depth of 2. This will populate the menu block with every link in the 1st and 2nd levels of the menu.

By default, your menu block is probably styled to display nicely in a sidebar or similar region. We don't want that. Use your CSS skills to get the links displaying horizontally for both levels: set each <li> in the menu block to float left or display inline instead of block. The hard part is getting the second level of links to display properly. Because the 2nd level of links is rendered as a <ul> nested within the parent link's <li>, you have an issue here. This can be solved by setting all of the nested <ul>'s (the second level) to position: absolute, and giving them a "top" value with enough spacing to push them below the primary links. You'll probably need to give the <ul>'s for both the 1st and 2nd levels a fixed width and height as well. Don't forget to set position: relative on a <div> that encapsulates all of the menu links.

Obviously each implementation will be different here, but this is a pretty good approach given how Menu Block outputs its menu tree in HTML. After you get the CSS coded out, you'll want to add some basic jQuery to hide/show the second-level <ul>'s based on where the user is hovering.

This approach should work for all major browsers, including *gasp* IE6.

Dec 02 2011
Dec 02

Drupal is a wonderful content management system. Any Drupal web developer will tell you that there seems to be a module for anything you can think of. For the most part, this is what makes Drupal so great. The community support is second to none, and it's never too hard to find a module that suits your needs.

With that said, it's important to understand when there may be something other than Drupal that you should be using for a particular functionality. Commenting is the best example of this.

Drupal has a comment module included in core that provides all of the standard features of commenting. You can restrict what content types allow commenting, control threading and anonymous commenting, and more. I don't think anyone would argue that the comment module is by any means useless or completely feature-lacking.

That said, I think it can be done better. The main issue with using Drupal's (or any other CMS's) comment implementation is that users either have to register to comment, or comment anonymously. Commenting anonymously has a bunch of issues associated with it, one of which is that users often forget they commented and won't come back to continue a conversation.

Sites that don't allow anonymous commenting will force users to register with a local account. If I want to comment on a random article I found on the web, and I have to register to leave a comment, I'm out of there. Users don't want to register with yet another website, remember the password, and give out their email address. And that's how it should be, right? There are a ton of websites that offer logging in using your Twitter or Facebook account as an alternative to registering separately - because no one wants another account to remember.

This is especially true if the only benefit of registration is the ability to leave comments. I feel it's an older way of approaching SPAM control and perhaps gathering email addresses for marketing. Modern SPAM prevention software and algorithms have greatly reduced the need to force user registration, and most users probably won't opt-in to marketing during registration.

There are a few services out there that do comments really well that should be a serious consideration when starting up a new site. These services fix the above issue by allowing users to have one account that they use for commenting all over the web. Two big ones are LiveFyre and IntenseDebate. But the one I'm going to focus on is Disqus.

Disqus is a proven service used by sites like CNN, IGN, and Engadget. At the time of writing, there are 58 million users and over 1 million sites using the service. It's super easy to install on any website, and there is a Drupal module that provides integration with the service as well. Here are a few of the main features of Disqus:

  • One profile for all commenting - create one account and use it on every site that uses Disqus
  • Social integration - create a Disqus profile with one of your existing social site logins, and share comments with them
  • Liking - users (even anonymous ones) can "like" comments and you can then sort on the most liked comments
  • Spam - Disqus has an incredible Spam prevention engine
  • Moderation - log in to Disqus' great control panel to moderate comments

There are a lot more features, but the idea here is that Disqus does only commenting, whereas Drupal does not. As a result, Disqus has refined their offerings to provide the best possible commenting experience.

There is one major caveat worth mentioning, though. Disqus stores the comments on their servers, unless you are a huge company that runs its own Disqus servers. This is a cause for concern for two reasons:

  1. You're tied to Disqus - when they go down, so do your comments.
  2. Comments are brought in dynamically via JavaScript - you don't get the SEO benefits of the comments.

The first problem shouldn't be that big of a deal. Disqus has a very solid infrastructure, and they've been around for years. If you're worried about the comments being on Disqus forever, they let you export all of the comments via XML should you decide to leave the service.

The second problem is the big one. A lot of sites are hesitant to use the service because they lose the SEO benefit of serving comments server-side. Google will never see the comments when using Disqus - unless you use a synchronization feature. Such a feature uses the API to periodically bring comments from Disqus back into your local database. You can then render the comments server-side and just hide them using CSS.

The WordPress plugin already supports this. I've recently become a maintainer of the Disqus module for Drupal and have been working on a similar feature. Once that's completed, Drupal developers (including myself) will be more open to using the service.

I haven't tried out LiveFyre or IntenseDebate, so they may be better for you depending on your situation. I've taken an interest in Disqus because I see it all over the web, and they have a great API that's getting bigger and more powerful. The point is, if you run a blog or news site, you want it to be as easy as possible for your users to get involved. Forcing users to register on a site they may only use a few times a year is not the way to go about it.

Apr 21 2011
Apr 21

I had a great time at DrupalCon. This was the last day of presentations before I had to catch my flight back to Philly. Because of my flight time I had to miss the last session track and any meet ups that night, but such is life. My favorite presentation of the conference was on this day: Drupal Commerce. I look forward to exploring all of these modules, ideas, and technologies in the upcoming months!

This was actually the first session I went to that focused completely on Drupal 7. The Render API is what allows you to easily keep HTML markup out of your code. I absolutely love everything about it, and was excited to learn more. Franz Heinzmann gave a great and informative presentation on the subject.

The basic concept of the Render API is that everything you want rendered is placed in a multi-dimensional array indicating how you want your data to be output. Rendering data this way allows other modules to easily hook into the pipeline and mess around with whatever they want. The Render API unifies the way modules render data throughout the site's code base, using the drupal_render function. This was available in Drupal 6, but it didn't work that well and was full of limitations; I also believe it was used primarily for form output.
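
To give a flavor of what that looks like, here's a minimal, hedged example of a Drupal 7 render array (the keys and values are mine, not from the presentation):

$build = array(
  'heading' => array(
    '#markup' => '<h2>' . check_plain('Recent items') . '</h2>',
  ),
  'list' => array(
    '#theme' => 'item_list',
    '#items' => array('First', 'Second', 'Third'),
  ),
);
// Other modules get a chance to alter $build before it is
// finally flattened into HTML:
print drupal_render($build);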

Now in Drupal 7, just about everything can be rendered using the API. I'm not sure how far along the documentation is yet and I haven't had time to check it out. Franz gave a pretty in depth look at how the Render API works though, and he did a pretty good job at it. Things got a bit technical in a few areas but I was able to follow along. Hopefully as I get more time to code for Drupal 7 I can blog more about the API.

Arguably the most anticipated session at DrupalCon. Ryan Szrama, the man behind Ubercart and now Drupal Commerce, demonstrated how to get a simple shop started in Drupal 7. He covered as much as he could in the hour he had for the session, but the audience was craving more.

Having worked on several eCommerce sites in Drupal 6 using Ubercart, I couldn't have been more excited to see Ryan's demo site up and running. Every 5 minutes, Ryan was explaining how a new feature in Drupal Commerce knocked out 6 contrib modules for Ubercart. Ryan stated that he has yet to be stumped by a developer wishing for a specific feature. The backbone of Drupal Commerce consists of Entities, Views, and Rules. I am so ecstatic that conditional actions are gone, replaced with Rules, and that every single admin form and user-facing table is now handled in Views.

I have yet to get started on building a site using Drupal Commerce, but I can't wait. My agency often utilizes a payment gateway that is not commonly used throughout the industry, so a good way for me to get started is to write a payment module. I'm glad to see that Ryan completely rethought and re-programmed everything from the ground up when working on Drupal Commerce. His goal is for it to be the #1 e-commerce solution on the web, and I really believe that's possible.

Irakli Nadareishvili and Erik Summerfield gave a presentation on how to make Drupal as efficient as possible. Drupal is known throughout the industry as a rather large memory hog and gets a bad rap for it. There is always a debate in the community around making the software efficient vs. making it practical and useful out of the box. Drupal 7 "ate" somewhere around 50 contrib modules, which are now in core, and more are under consideration (Views?).

The session focused on how to write modules and code with performance in mind. While many large-scale Drupal sites run on dedicated servers with great software support (memcached is a big one), a bigger portion run on shared hosting. Drupal has always had some problems running on shared hosting, most notably because of the PHP memory it requires. I've worked on sites that demand 128M or more, which is certainly a lot. Some shared hosting providers only allow you 32M or less, making Drupal 7 very difficult to work with.

Writing modules that can scale and utilize the server's resources efficiently is extremely important. I was surprised to find out that an average Drupal page load runs over 100 queries against the database. This is attributed to the fact that Drupal is extremely modular, and many modules run the same or similar queries to get the information they need. The presenters acknowledged this and suggested utilizing frameworks like CTools when writing code.

The session gave a great overview of the challenges module developers are presented with and how to address them in a sensible way. I personally don't do too much heavy module development, but it's great to have a good understanding of how to write modules the correct way!

Mar 22 2011
Mar 22

This is the second entry recapping my trip to Chicago for DrupalCon. On the opening night (Monday), a party was held at the Field Museum of Natural History. Beer, wine, and plenty of food were served as we were all free to roam around most of the building. The most impressive exhibit was certainly Sue, the world's most complete T-Rex fossil. Certainly worth checking out if you end up in Chicago.

This was the second session focusing on jQuery, and it was great. The focus was on the somewhat recently announced jQuery Mobile. For those who have not heard of it, there are a few great benefits it brings to mobile websites. Something themers can get excited about is that jQM provides an elegant, simple, and consistent user interface. This is especially important now that there are tons of different mobile browser resolutions and display types out there. Another big pro is that jQM has a ton of awesome AJAX functionality built into the UI to load only the data that is needed. This is huge for mobile sites, where bandwidth is almost always the biggest concern.

Tom Cosgrove and Brian McMurray gave a great presentation on not only jQuery Mobile in general, but also on utilizing it in Drupal themes. Dries stressed in his keynote on Tuesday how important mobile is becoming on the web. With the number of mobile devices like iPads and smartphones out there, you can see why. Drupal has a few mobile themes that work well with jQuery Mobile, but the presenters stressed that jQuery Mobile relies heavily on HTML5 markup (specifically the data-role attribute). This makes sense of course, and I'm glad the developers behind jQuery are focusing on HTML5.

All in all, this was a great presentation. Drupal is not quite ready for jQuery Mobile yet, but it is very close. Brian and Tom demonstrated using jQuery Mobile in a theme to do simple administrative tasks in Drupal on an iPhone. It went extremely well, though there are some caveats, like form tables that don't display correctly. Whenever I decide to develop a mobile version of a site, I'll be utilizing this for sure.

This was a great presentation by Sumit Kataria, who works for Civic Actions. The presentation focused on using Drupal as a service platform (using the Services module) to serve as the back end for mobile apps. There were a good deal of "wow" moments as Sumit went through how easy it is (relatively speaking) to develop a simple app.

The focus was on a software package called Titanium. This software lets you develop mobile apps using only JavaScript, which then gets compiled into Java/Objective-C. This allows you to easily develop an app that will run on Android and iPhones/iPads alike. I should note that there was a very large focus at DrupalCon on mobile development. Sumit made some great points about how more and more web traffic is generated by serving pages to mobile browsers and applications. It's very important as a developer to understand and utilize some of the latest tech (like Titanium).

Sumit's presentation was the first I'd heard of using Drupal as a service. From what little I know, it allows applications to ping Drupal for information and send data back and forth. He demonstrated a simple app that lets a user log in with a username and password, just like a Drupal site. The accounts and data were all stored on a Drupal site somewhere on the web. Titanium essentially acts as a layer bridging communication between the app and Drupal. The communication for this app was done using JSON, and Sumit showed a window displaying all of the code that was executing, along with debug info, as a user attempted a login.

Harnessing the power of Drupal as a backend for an app, the possibilities are rather endless. I got very excited to learn more about mobile development after the presentation was over. Sumit showed some JS code samples, and it certainly piqued my interest. He went further and showed a more complex app to get people excited about development. For anyone interested, you can view the video of his complete presentation on the DrupalCon site (follow the header link).

This was the smallest presentation I went to, but well worth it. Kyle Cunninham presented the awesomeness of Haml and Sass and how much easier they can make a Drupal themer's life. Haml is a new style of writing HTML (which seems outdated and clunky to many). Sass is something similar, targeted at writing CSS more efficiently.

Haml seems really odd at first - they even admit it on the official site. However, Kyle claims that once you get the hang of it, you'll never want to go back to writing HTML "the old way". It got me thinking about how clunky writing HTML can really be; if not for a decent IDE like Coda, I'd be totally lost at times. If you look at the output of a standard Drupal site utilizing Views or Panels, there is an insane amount of markup there. While it's all there for good reason, it is difficult to maintain and keep track of. Haml simplifies everything and allows you to take a programmatic approach to writing markup. My favorite part about Haml is that whitespace is relevant! I can't do it justice in this short paragraph, so be sure to check it out here.

Sass cleans up CSS style sheets and can often reduce the amount of code written by half. Anyone who has themed a large and complex site knows how bad CSS files can get. It's annoying and tough to keep track of hierarchies and prevent duplicated code. Sass takes the same approach as Haml (which inspired it). My favorite feature of Sass is the ability to use "functions", referred to as mixins. These allow you to easily create base styles and include them in other style elements, with parameters. Check out the great site on Sass, which includes side-by-side examples of writing in Sass vs. CSS.

Kyle is also the maintainer of the Peroxide theme engine for Drupal. I remember when I picked up my first Drupal book, it mentioned the ability to use different theming engines. Drupal comes packaged with PHPTemplate, which is common and easy to use. Peroxide allows you to utilize Sass and Haml in your development. Overall, this was a great presentation that opened my eyes a bit to the new tools out there for theming. I have to find the time to invest in this, as well as get some of my coworkers involved. It would be nice to work on a microsite utilizing Peroxide to take it for a test drive.

One of my co-workers told me that I was bound to see many tweens at DrupalCon who know a crap-load more than me about Drupal. Dmitri Gaskin was one of them, and oh boy, did he know what he was doing. Dmitri's presentation was about how Features and Drush Make allow you to deploy Drupal sites in the blink of an eye.

I have toyed around with Features before but never really got too interested in it. Features is great in that it allows you to create custom modules without writing any code. You can create a view, a content type, a bunch of custom fields, and some ImageCache presets, and package it all into one module. Dmitri demonstrated how you could do this to deploy a 'news' module. The interface for using Features is very straightforward. Modules can be updated and overridden when you have changes to make, and deployed across several websites quickly and easily. I've never really had a need to do this, but it's not hard to imagine how some people would.
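
Under the hood, the exported module is driven by its .info file, which simply lists the components it provides - roughly like this (all names here are made up):

; Sketch of a Features-generated .info file; component names are made up.
name = "News feature"
core = "6.x"
dependencies[] = "views"
dependencies[] = "imagecache"
features[node][] = "news"
features[views][] = "news_listing"
features[imagecache][] = "news_thumbnail"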

Drush Make (a module written by Dmitri) allows a developer to package entire sites and deploy them. So create a bunch of feature modules, and then create a drush make file that lists the site's modules (with versions) and dependencies. It's incredible watching a drush make file being executed as it quickly downloads and installs all of the modules needed for the site. Developers who often build microsites with similar functionality can save so much time using this. I'm all about saving time; it frees up some precious hours to work on more advanced functionality.
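
A .make file itself is just a plain-text manifest. A minimal sketch (the versions and the feature's git URL are placeholders):

; mysite.make - minimal sketch; versions and URLs are placeholders
api = 2
core = 6.x
projects[] = drupal
projects[views][version] = 2.12
projects[cck][version] = 2.9
; pull a custom feature module from its own repository
projects[news_feature][type] = module
projects[news_feature][download][type] = git
projects[news_feature][download][url] = git://example.com/news_feature.git

Running drush make mysite.make ./mysite then fetches Drupal core plus everything listed into ./mysite.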

Dmitri gave a great presentation and he is surely a great asset to the Drupal community. It was inspiring watching his presentation!

Mar 18 2011
Mar 18

Hello all... I've just returned from my first DrupalCon this past week, and it was a great time & experience. It was also the first time I was on a commercial flight, but all went well. I've been developing sites in Drupal for about a year now and couldn't have been more excited to head to Chicago to check out the bi-annual conference. Chicago is certainly a beautiful city!

There were certainly over 3,000 people at the Sheraton Towers in downtown Chicago, and it was a bit mind-boggling to see such an insane number of people as obsessed with Drupal as I am. A large focus of the conference was the recent release of Drupal 7 and what to look forward to in Drupal 8. It's going to be a few more months before I adopt Drupal 7 for a site, but I'm eager to get started on some small projects.

The conference started off with a memorable keynote speech by Dries Buytaert and lasted for three days straight, with some smaller activities and events before and after. I'm going to go through some of the memorable sessions I attended - both as a record for me and a good read for you.


This was the first session I attended at DrupalCon. The two speakers, Aaron Winbor and Jacob Singh, are actively involved in the development of the Media module, the new multimedia workhorse of Drupal 7. It's my understanding that this module is the logical replacement for the wonderful emfield module, and it looks promising. The module is not quite ready yet, but the speakers assured us it was close; a few large sites were already using it.

What really excites me about Media is the new unified interface for adding and managing media throughout the site. This means a new overlay interface with lists of all multimedia content in one place, shareable on any node. There are some nice bells & whistles that will please content managers who are used to working with the clunky IMCE interface.

The module has been in development for a while now and has a growing issue queue. It seems to be rather buggy at the moment, but they are working their asses off to get Media production-ready. This is a pretty major module I'll be keeping an eye on for Drupal 7 development.
 

Jenn Simmons gave a great, well-delivered session on the topic of HTML5 and Drupal. As a web developer, it's impossible not to get swept up in the buzz surrounding the new standard. It's a bit hard to get super excited about it, however; several of the projects I work on require compliance with older browsers that are far from supporting it. Jenn assured the audience that HTML5 degrades nicely for the most part, and there is a JavaScript plugin which adds support otherwise.

Something to take away from this session was the great new features introduced with the standard. Creating forms is going to be a helluva lot easier moving forward, with added support for specific field types like phone numbers. Semantics are a huge focus in HTML5, with new elements such as "header", "footer", and "aside" providing a nice alternative to the typical Drupal theme's huge amount of div markup. Looks like a dream come true for SEO. Offline in-browser storage, among a slew of other new features, is introduced as well. I'm not quite sure how to utilize this yet, and Jenn didn't get into much detail.
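
For instance, a page skeleton and form could read like this (a generic example, not one from the session):

<!-- semantic elements replace the usual wall of divs -->
<header>...site branding...</header>
<aside>...sidebar blocks...</aside>
<footer>...footer links...</footer>

<!-- new input types: mobile browsers show a matching keyboard, and
     supporting browsers validate the field without any JavaScript -->
<form>
  <input type="tel" name="phone" placeholder="555-0100">
  <input type="email" name="mail" required>
</form>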

A major issue discussed was the current lack of support for HTML5 in Drupal 6 & 7. Jenn seemed pretty flustered by the fact that it is still pretty difficult to get modules and themes to output HTML5 (without some significant code diving). A related discussion, touched upon in the Dries keynote, was having Drupal 8 un-coupled from HTML completely. This would allow data to be output in any format, such as XML, HTML, or JSON, letting Drupal serve as a services architecture - wonderful news for mobile developers using Drupal as a platform.
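
The idea being that a request for a node could return plain data instead of themed markup - something like this entirely hypothetical response:

{
  "nid": 1,
  "title": "First story",
  "body": "Body text...",
  "created": "2011-03-08"
}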

There is a module called HTML5 Tools and a base theme called HTML5 Base that are worth a look.
 

jQuery is a huge part of Drupal, so I thought I would check this session out. It was a bit less than I had hoped for, but I still took away some good info. The presenter was Nathan Smith, an HP employee who certainly knew his way around jQuery. He showed off his baby, an incredible desktop interface made completely in jQuery. I believe he created it just to mess around and learn new things, but it sure turned out great. It really demonstrates why jQuery is such a great and powerful JavaScript library.

Part of the session was dedicated to writing jQuery in a way that doesn't hurt site performance. He showed off some interesting code and gave away a few tips. To be honest, most of what he discussed was stuff I've already learned and been through. The biggest tip I can give anyone writing jQuery: target elements using an ID!
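
To illustrate (my example, not his): an ID selector maps straight to the browser's document.getElementById, while a bare class selector forces jQuery to scan the entire document.

// Slow: scans every element on the page looking for the class
$('.menu-item').hide();

// Fast: jumps straight to the container by ID, then searches only inside it
$('#main-menu').find('.menu-item').hide();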
 

I was really excited to check this session out but left a bit disappointed. The session was slow to start and get rolling, and the presenter didn't go into as much detail as I had hoped. He tried to demonstrate how to extend and customize Views to do what you want, but it just didn't flow that well. He gave a very good overview of how Views works and how it builds a query, but there was a beginners' session scheduled the next day that demonstrated this. The questions asked at the end were obnoxious at best, with several people asking basic Views questions that could have been googled. I suppose I'll just have to wade through the documentation (or lack thereof) on extending Views a bit more.

That's all for day 1. Check back soon for an overview of the sessions I attended on Wednesday.
