Feeds

Nov 21 2021

I'm not a fan of the summary option on body fields in Drupal. I've never really got on with how users interact with it or the content it produces.

The field type I'm talking about is called "Text (formatted, long, with summary)" and it appears as an add-on summary field to a normal content editor area. It comes as standard on Drupal installs and appears on all body fields. The field type has its uses, but I often find that the content it produces is unpredictable and that it doesn't offer a great editing experience. I have even written articles in the past about swapping the summary field to a fully fledged WYSIWYG area for Drupal 7, which worked on the project I implemented it on.

Let's look at the body summary field in Drupal and see why I have a few problems with it.

The Body Summary

If you create a new content type in Drupal you will automatically get a body field. This field will always be a "Text (formatted, long, with summary)" field and will have a machine name of "body". This is unlike most other fields in Drupal, which get a prefix of "field_" on the field machine name. You can also create your own fields of the same type as the default body field.

The field footprint looks like this in the field management page of the content type.

Drupal node body field, showing the description of the body field.

When editing or creating a page of content the body field is shown like this.

Drupal node body field, showing the collapsed summary field.

Clicking on "Edit summary" will show the summary of the field, allowing users to enter text into it.

Drupal node body field, showing the expanded summary field.

The problem is that if you enter content into the summary it doesn't get treated as markup because the summary isn't a markup field. Users with the ability to do so can enter raw HTML into the summary, but it will just be stripped out as the field is passed through the standard Twig output filters. You can allow users to bypass this, but that poses a security risk as no input filter is applied to the output at all.

If you don't enter content into the summary field then the beginning of the body field is used as the summary, with the markup intact. This can be controlled through the break element, but that feature isn't commonly used on most Drupal sites I have seen in the last few years.

This mix of content makes the summary unpredictable and difficult to work with. You can get around some of these inconsistencies by simply wrapping the summary field in paragraph tags when it is rendered on the front end, but this creates a unique situation where the summary doesn't act in the same way as normal fields. Users will often want to add markup to the summary since the body field contains that markup and they will expect to be able to add links or bold tags to it. The body summary field will quite often be the focus of support requests to allow it to produce HTML output or even be a normal wysiwyg area.

A much better way of capturing the summary is to create a secondary field called Summary that is a "Text (formatted, long)" field type. This allows users to enter rich content into both the body field and the summary field and is a standard formatted text field (without any summary option). The same field (with the same machine name) can be found in the Umami demo install profile.

Drupal node summary field, showing the description of the field.

With this new Summary field in place we also need to set up the content display so that the summary field is displayed in full on the teaser display mode. Since we don't want the summary field to be too long we can also include the Drupal Maxlength module, which will restrict the number of characters that can be entered into the field. We now have much better control over what is considered the summary of the page and it is much more predictable.

We also need to prevent users from entering copy into the body summary, since that copy will never be presented anywhere. Let's look at a couple of ways to remove the summary from the body field.

Solution 1: Turn Off The Summary

The simple solution here is to just turn off the summary input. This is easily achieved in Drupal via a checkbox on the field configuration screen, found by clicking the 'edit' button next to the field description on the field management page of the content type.

This is what that option looks like.

Drupal node body, showing the option to remove the summary.

Turning this Summary input setting off prevents users from seeing the summary field and entering content into it as the option to show the summary is no longer present. The database table for the body will still contain the summary field and any data already entered into it is now locked since users can't access the summary input.

The edit page of the content type we changed now looks like this. The body and summary fields are separate fields and we have now turned off the body summary entry.

Drupal node body field, showing the summary and body fields.

Solution 2: Completely Remove The Body Summary

Whilst turning that setting off is fine, I wanted to take this to another level and completely remove the summary from the body field. This essentially means changing the type of field that the body field is defined as, therefore removing the summary option from the field and database table. Finding out how to do this took me down a bit of a rabbit hole in the Drupal field and configuration API, but it turns out that it is perfectly possible.

What I also wanted to do was to take any content that was already in the body summary and copy it into the new Summary field.

I should warn you before going any further that this code can be dangerous! In other words, here be dragons! Running this code without fully understanding it can potentially butcher your site configuration and cause your site to crash and be rendered useless. If you do want to run this code then make sure you have database backups before entering into this process in case anything goes wrong. Changing the type of field is not something to be done lightly, but I was sure that this is something that was needed in order to have the body and summary fields as separate, fully controllable, fields with no option of adding in a summary from the body field.

Let's go through each step one at a time. There is a little bit of code to go through here and the idea is that this is run in an update hook, which I will detail in full at the end.

1) Remove the summary field from the body field storage definition. This setting tells Drupal what shape the database table for the body field is, so by removing the summary field we are effectively changing the shape of the table that Drupal can see. It is kept in the Drupal key/value store so we just fetch it out of there, remove the summary from the table definition, and add the setting back into the key/value storage.

# Load the field schema definition and remove the summary.
$bodyStorageDefinition = \Drupal::keyValue('entity.storage_schema.sql')->get('node.field_schema_data.body');
unset($bodyStorageDefinition['node__body']['fields']['body_summary']);
unset($bodyStorageDefinition['node_revision__body']['fields']['body_summary']);
\Drupal::keyValue('entity.storage_schema.sql')->set('node.field_schema_data.body', $bodyStorageDefinition);

2) Next we need to update the field storage definition to have a type of "text_long" instead of "text_long_summary". This setting is used to tell Drupal what the field looks like when creating new content types so without changing this setting you would end up creating new content types with the summary field. This setting is also kept in the Drupal key/value store, but there is a dedicated API that allows access to change and update this setting.

If you use the key/value API to get this setting it is unserialised into an array of objects and it is not possible to change the type property of the FieldConfig object that stores the body field definition. Using the entity definition update manager is perhaps the only way to do this, and it is also considered the best practice approach to changing field storage definitions.

# Update the field storage definition.
$manager = \Drupal::entityDefinitionUpdateManager();
$body_storage = $manager->getFieldStorageDefinition('body', 'node');
$new_body_storage = $body_storage->toArray();
$new_body_storage['type'] = 'text_long';

$new_body_storage = \Drupal\field\Entity\FieldStorageConfig::create($new_body_storage);
$new_body_storage->original = $new_body_storage;
$new_body_storage->enforceIsNew(FALSE);
$new_body_storage->save();

3) Set the body field storage configuration to have a field type of "text_long". This setting is used to connect the body field to the storage system and is closely linked with the entity.storage_schema.sql setting above.

# Set the body field to have a field type of 'text_long', without the summary field.
$config = \Drupal::service('config.factory')->getEditable('field.storage.node.body');
$config->set('type', 'text_long');
$config->save();

4) It is also quite important to change the field instance setting for every content type that contains a body field. This setting tells Drupal to load the body field as a "text_long" field when creating or editing content. We change this setting by loading the field instance configuration out of Drupal, changing the field type and then saving it. This could be further automated by loading in a list of all of the content types, but I have instead just hard coded the standard Article and Page content types.

# Set the body field instance on the content types that contain the body field to have a type of 'text_long'.
foreach (['article', 'page'] as $content_type) {
  $config = \Drupal::service('config.factory')->getEditable('field.field.node.' . $content_type . '.body');
  $config->set('field_type', 'text_long')->save();
}
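The hard coded list of content types could be replaced by discovering every body field instance from the configuration names. The following is only a minimal sketch and is not part of the original update hook: the helper name mymodule_body_field_config_names() is hypothetical, and it assumes the list of names would come from \Drupal::configFactory()->listAll('field.field.node.'), which returns the config names that start with the given prefix.

```php
<?php

/**
 * Picks out the body field instance config names from a list of config names.
 *
 * Hypothetical helper. In an update hook the input would be the result of
 * \Drupal::configFactory()->listAll('field.field.node.').
 */
function mymodule_body_field_config_names(array $config_names): array {
  $matches = array_filter($config_names, static function (string $name): bool {
    // Field instance config is named field.field.node.BUNDLE.FIELD_NAME.
    return preg_match('/^field\.field\.node\.[^.]+\.body$/', $name) === 1;
  });
  // Re-index the array so that the keys are sequential.
  return array_values($matches);
}

// Example input, standing in for the result of listAll().
$names = mymodule_body_field_config_names([
  'field.field.node.article.body',
  'field.field.node.article.field_summary',
  'field.field.node.page.body',
]);
```

Each name in $names could then be passed to getEditable() in place of the concatenated string used in the loop above.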

5) With the field definition sorted out we now need to copy the content from the body field into the new summary field. This will populate the summary field in the site with content from the body field and means that there are no gaps in the content. The simplest (and quickest) way to do this is to pull the available data out of the body field using the database and simply insert it into the relevant data storage tables. There is some logic here to look at the value of the body summary and either use the current summary or generate a new one on the fly from the existing markup.

# Load all of the data from the body field.
$database = \Drupal::database();
$result = $database->query("SELECT * FROM {node__body};");

if ($result) {
  # Copy the body_summary field data into the new field_summary value.
  while ($row = $result->fetchAssoc()) {
    if (is_null($row['body_summary']) || trim($row['body_summary']) == '') {
      # There is no summary so create one from the body copy.
      $summary = text_summary($row['body_value'], $row['body_format'], 1000);
    } else {
      # A summary has been set, so surround it with usable markup.
      $summary = '<p>' . $row['body_summary'] . '</p>';
    }

    $placeholders = [
      ':bundle' => $row['bundle'],
      ':deleted' => $row['deleted'],
      ':entity_id' => $row['entity_id'],
      ':revision_id' => $row['revision_id'],
      ':langcode' => $row['langcode'],
      ':delta' => $row['delta'],
      ':field_summary_value' => $summary,
      ':field_summary_format' => $row['body_format'],
    ];

    $database->query('INSERT INTO {node__field_summary} VALUES (:bundle, :deleted, :entity_id, :revision_id, :langcode, :delta, :field_summary_value, :field_summary_format)', $placeholders);
    $database->query('INSERT INTO {node_revision__field_summary} VALUES (:bundle, :deleted, :entity_id, :revision_id, :langcode, :delta, :field_summary_value, :field_summary_format)', $placeholders);
  }
}

If you have a lot of content on your site then you might want to run this as a batch process as doing so will prevent the code from timing out if there are lots of data to process. This process only took a few seconds on an existing Drupal site with around 800 items of content so it is quite quick.
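To give an idea of what a batched version might look like, here is a rough sketch of the progress tracking that Drupal's update hooks support through the $sandbox parameter. This is not the code I ran: the function name, the $rows array (standing in for the node__body query results) and the $copy callback (standing in for the insert queries) are all assumptions made to keep the example self-contained.

```php
<?php

/**
 * Processes one slice of rows per call, in the style of hook_update_N().
 *
 * Hypothetical sketch: $rows stands in for the result of the node__body
 * query and $copy for the code that writes one row to the summary tables.
 */
function mymodule_copy_summaries_step(array &$sandbox, array $rows, callable $copy, int $limit = 50): void {
  if (!isset($sandbox['progress'])) {
    // First call, so set up the progress counters.
    $sandbox['progress'] = 0;
    $sandbox['max'] = count($rows);
  }

  // Only handle the next $limit rows on this call.
  foreach (array_slice($rows, $sandbox['progress'], $limit) as $row) {
    $copy($row);
    $sandbox['progress']++;
  }

  // Drupal calls the update hook again until #finished reaches 1.
  $sandbox['#finished'] = $sandbox['max'] === 0 ? 1 : $sandbox['progress'] / $sandbox['max'];
}

// Simulate Drupal calling the update repeatedly until it is finished.
$sandbox = [];
$copied = [];
$rows = range(1, 120);
$calls = 0;
do {
  mymodule_copy_summaries_step($sandbox, $rows, function ($row) use (&$copied) {
    $copied[] = $row;
  });
  $calls++;
} while ($sandbox['#finished'] < 1);
```

In a real update hook the function would take only the $sandbox parameter and would fetch each slice from the database (for example with queryRange()) rather than slicing an array held in memory.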

After running all that code we are left with a field definition that looks like this (without the summary field).

Drupal node body field, showing the field without a summary.

Any new content types that we create will automatically get the same field definition of a body field without the summary option.

The final piece in all of this is the deployment. There's a sort of chicken and egg situation between the new configuration of the site and the update hook we are trying to run. We can't write data to a summary field table that doesn't exist, so we can't run the update hook before importing config. We also can't import the config and create the new Summary field table correctly as the new body field definition causes a configuration import error. It isn't possible to change the field storage type in the configuration so Drupal will just error when it sees that change.

My normal deployment process runs the update hooks first before importing config, so the update hook therefore needs to create the summary field tables before we can put data into it. I have written about adding config to Drupal in an update hook before, but this is slightly different since it's a field that we need to import from configuration. If we just import it using the \Drupal::service('config.storage') service then the tables are not created so we need an alternative method that will set up the fields correctly.

We can use the entity type manager service from Drupal to create the field storage and instance configuration (and by extension the tables needed for the field) on the fly. The following code will read the new Summary field configuration from the configuration files and write it into Drupal. Doing it this way ensures that the correct tables and configuration are set up for the rest of the code to run.

use Drupal\Core\Config\FileStorage;

$config_path = realpath('../config/sync');
$source = new FileStorage($config_path);

// Import the field_summary storage config.
\Drupal::entityTypeManager()->getStorage('field_storage_config')
  ->create($source->read('field.storage.node.field_summary'))
  ->save();

// Import the instance config for the field_summary field.
\Drupal::entityTypeManager()->getStorage('field_config')
  ->create($source->read('field.field.node.article.field_summary'))
  ->save();

Here is the full update hook with all of the needed code added together. This will create the new Summary field, delete the summary option from the body field and populate the new field with content from the body.

/**
 * Transfer body field into body without summary option.
 */
function mymodule_update_9001() {
  $config_path = realpath('../config/sync');
  $source = new \Drupal\Core\Config\FileStorage($config_path);

  // Import the field_summary storage config.
  \Drupal::entityTypeManager()->getStorage('field_storage_config')
    ->create($source->read('field.storage.node.field_summary'))
    ->save();

  // Import the instance config for the field_summary field.
  \Drupal::entityTypeManager()->getStorage('field_config')
    ->create($source->read('field.field.node.article.field_summary'))
    ->save();

  # Load the field schema definition and remove the summary.
  $bodyStorageDefinition = \Drupal::keyValue('entity.storage_schema.sql')->get('node.field_schema_data.body');
  unset($bodyStorageDefinition['node__body']['fields']['body_summary']);
  unset($bodyStorageDefinition['node_revision__body']['fields']['body_summary']);
  \Drupal::keyValue('entity.storage_schema.sql')->set('node.field_schema_data.body', $bodyStorageDefinition);

  # Update the field storage definition.
  $manager = \Drupal::entityDefinitionUpdateManager();
  $body_storage = $manager->getFieldStorageDefinition('body', 'node');
  $new_body_storage = $body_storage->toArray();
  $new_body_storage['type'] = 'text_long';

  $new_body_storage = \Drupal\field\Entity\FieldStorageConfig::create($new_body_storage);
  $new_body_storage->original = $new_body_storage;
  $new_body_storage->enforceIsNew(FALSE);
  $new_body_storage->save();

  # Set the body field to have a field type of 'text_long', without the summary field.
  $config = \Drupal::service('config.factory')->getEditable('field.storage.node.body');
  $config->set('type', 'text_long');
  $config->save();

  # Set the body field instance on the content types that contain the body field to have a type of 'text_long'.
  foreach (['article', 'page'] as $content_type) {
    $config = \Drupal::service('config.factory')->getEditable('field.field.node.' . $content_type . '.body');
    $config->set('field_type', 'text_long')->save();
  }

  # Load all of the data from the body field.
  $database = \Drupal::database();
  $result = $database->query("SELECT * FROM {node__body};");

  if ($result) {
    # Copy the body_summary field data into the new field_summary value.
    while ($row = $result->fetchAssoc()) {
      if (is_null($row['body_summary']) || trim($row['body_summary']) == '') {
        # There is no summary so create one from the body copy.
        $summary = text_summary($row['body_value'], $row['body_format'], 1000);
      } else {
        # A summary has been set, so surround it with usable markup.
        $summary = '<p>' . $row['body_summary'] . '</p>';
      }

      $placeholders = [
        ':bundle' => $row['bundle'],
        ':deleted' => $row['deleted'],
        ':entity_id' => $row['entity_id'],
        ':revision_id' => $row['revision_id'],
        ':langcode' => $row['langcode'],
        ':delta' => $row['delta'],
        ':field_summary_value' => $summary,
        ':field_summary_format' => $row['body_format'],
      ];

      $database->query('INSERT INTO {node__field_summary} VALUES (:bundle, :deleted, :entity_id, :revision_id, :langcode, :delta, :field_summary_value, :field_summary_format)', $placeholders);
      $database->query('INSERT INTO {node_revision__field_summary} VALUES (:bundle, :deleted, :entity_id, :revision_id, :langcode, :delta, :field_summary_value, :field_summary_format)', $placeholders);
    }
  }
}

Finally, since all the data has been moved around we can now delete the original summary field. I've created this as a separate update as the database still contains all of the needed data to get back to the old field type (if needed). As Drupal doesn't know about this field it won't have any problems writing new records to the tables, so leaving it in won't have any negative side effects. By removing the field from the table we essentially have no way to get back so this is more of a destructive action.

/**
 * Drop the body_summary column from the body field tables.
 */
function mymodule_update_9002() {
  // Drop the body_summary field from the body and revision tables.
  $schema = \Drupal::database()->schema();
  $schema->dropField('node__body', 'body_summary');
  $schema->dropField('node_revision__body', 'body_summary');
}

That is essentially it. Through a couple of update hooks the site now has no record of the old body summary and the users are now able to enjoy a rich content area for the summary field. The final touches to this would be to add some descriptive help text to the two fields to help users know where that content will appear and to use the Maxlength module to prevent the new summary field from being too long.

How do you feel about the body summary field? Have you encountered any issues with it in the past? Let us know in the comments.

Oct 17 2021

When creating Drupal modules I like to keep the hard coded components to a minimum. This helps when changing parts of the module in the future as hard coded links and other elements will require manual intervention and slow down maintenance. Sometimes, though, this isn't an option as you just need to have a few routes in your *.routing.yml file that point to controllers or forms within your module.

I had a situation today where I was looking to load all of the routes that are contained in a module. I could then construct a page of links that would handily point to different parts of the module or feed those links into a sitemap. This meant that I wouldn't need to hard code this list into a controller, I just needed to load all the routes and print that list out instead. Especially handy if I ever added or removed a route as that would mean that list would update without me having to do it manually.

Using Core Services

As it happens, Drupal doesn't have a service that allows you to search for routes that have a similar signature or structure. There are a couple of things that look like they might work, but end up not being what I was looking for. I'll go through them here for completeness.

The first option I found was the getRouteByName() method from the router.route_provider service. This does a one-to-one match of a given route name against the routes you have within a site. Because the search is done as a lookup on an array of routes the method doesn't accept wildcards.

The following example would only retrieve a single route.

$route = \Drupal::service('router.route_provider')->getRouteByName('some_route');

Next, I tried the getRoutesByPattern() method, which is part of the same service. Despite the name, this is actually a wrapper around a method called getRoutesByPath() that does a database lookup for routes that match a given path containing wildcard parameters. The basic idea is that you can find the routes that surround a given path. For example, you could pass in "/comment/%" and retrieve routes like "/comment/{comment}/approve" and "/comment/{comment}/delete". This turned out to be unsuitable for my needs as very few of the paths I had in the module were wildcard paths.

The method should be used like this, passing a wildcard path in to retrieve one or more routes that are connected to this path.

$routes = \Drupal::service('router.route_provider')->getRoutesByPattern('/somepath/%');

The next service I looked at is actually used during the bootstrap process by Drupal. The match() method on the router.no_access_checks service is used to find the current route based on the given path. The difference here is that this isn't a searching method: it either matches the given path or it doesn't.

$route = \Drupal::service('router.no_access_checks')->match('/mymodule');

Be careful of this one as if your path doesn't exist then it will throw an exception.

The Solutions

The solution to this problem fell into two routes (if you'll forgive the pun). Either pull the routes directly out of the database, or fetch them from the module file routing itself.

For these examples I am going to assume that we have a module called "my_module" that contains a number of routes in the routing file (called my_module.routing.yml) that would look something like this.

my_module_page:
  path: '/page'
  defaults:
    _controller: '\Drupal\my_module\Controller\MyModuleController::page'
    _title: 'Page'
  requirements:
    _access: 'TRUE'

my_module_form:
  path: '/form'
  defaults:
    _form: '\Drupal\my_module\Form\MyModuleForm'
    _title: 'Form'
  requirements:
    _access: 'TRUE'

I have included this snippet here to give some context to these examples.

1. Get Routes From The Database

The most obvious solution to this is to search the database for the routes that we want.

When Drupal builds its internal routing structure (after a cache clear) it will pick up the routes from your module and create records of them in the router table in your database. This is ultimately where the Drupal core route matching methods were getting their information from so it makes sense to pull the data from this table.

Using the code below we are querying the database for all routes that start with the string "my_module".

$database = \Drupal::database();
$query = $database->query("SELECT name, path FROM {router} WHERE name LIKE :name", [":name" => 'my_module%']);
$results = $query->fetchAll();

This will return an array of results, so the next step is to use getRouteByName() to turn this into a route object. This is a legitimate use of this method as we have a one to one match being done here.

foreach ($results as $id => $result) {
    $routeName = $result->name;
    /** @var \Symfony\Component\Routing\Route $route */
    $route = \Drupal::service('router.route_provider')->getRouteByName($routeName);

    // The title of the route can be used as the link text.
    $text = $route->getDefault('_title');

    // Do things with route name and text.
}

The $route variable in the above code now contains a standard Route object that we can use to extract information about the route.

2. Get Routes From The Module routing.yml File

Another solution to this problem is to pull data out of the module's *.routing.yml file.

To do this we need to load the contents of the file from the module and use the \Drupal\Core\Serialization\Yaml::decode() static method to convert the file data into a PHP array. This can be done using the following code.

$routingFilePath = DRUPAL_ROOT . '/' . drupal_get_path('module', 'my_module') . '/my_module.routing.yml';
$routingFileContents = file_get_contents($routingFilePath);
$results = \Drupal\Core\Serialization\Yaml::decode($routingFileContents);

With this PHP array in hand we can then go about loading the routes in the same way as the previous example. Note that all we really need to know about the route is the name, which we can then use to pull information about the route from Drupal.

foreach ($results as $id => $result) {
    // The keys of the decoded routing file are the route names.
    $routeName = $id;
    /** @var \Symfony\Component\Routing\Route $route */
    $route = \Drupal::service('router.route_provider')->getRouteByName($routeName);

    $text = $route->getDefault('_title');

    // Do things with route name and text.
}

The downside here is that we now have all of the routes in the module, regardless of what their original name was. We therefore need to use some logic to remove the ones we don't want. Not a big issue, but if you have administration pages in your module that you don't want to print out then you'll need to filter them out here.
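As a rough illustration of that filtering step, the sketch below works directly on the array decoded from the routing file. The helper name and the "my_module_settings" route are hypothetical, and the rule used here (keep routes with _access set to 'TRUE' whose path doesn't start with /admin) is just one assumption about what "the ones we don't want" means for your module.

```php
<?php

/**
 * Keeps only the routes that are open to everyone and are not admin pages.
 *
 * Hypothetical helper; $routes is the array produced by decoding the
 * module's routing file.
 */
function my_module_public_route_names(array $routes): array {
  $names = [];
  foreach ($routes as $name => $definition) {
    $path = $definition['path'] ?? '';
    $access = $definition['requirements']['_access'] ?? NULL;
    // Skip anything that isn't explicitly public or lives under /admin.
    if ($access === 'TRUE' && strpos($path, '/admin') !== 0) {
      $names[] = $name;
    }
  }
  return $names;
}

// Example data in the same shape as the decoded my_module.routing.yml file,
// with an extra (hypothetical) administration route added.
$routes = [
  'my_module_page' => ['path' => '/page', 'requirements' => ['_access' => 'TRUE']],
  'my_module_form' => ['path' => '/form', 'requirements' => ['_access' => 'TRUE']],
  'my_module_settings' => [
    'path' => '/admin/config/my_module',
    'requirements' => ['_permission' => 'administer site configuration'],
  ],
];
$public = my_module_public_route_names($routes);
```

The resulting route names can then be passed to getRouteByName() in the same way as in the loop above.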

A Real World Example: Adding Routes To The sitemap.xml File

Rather than just leave it there I thought it would be a good idea to add a real world example of this approach.

The Simple XML Sitemap Drupal module is used to generate sitemap.xml files. I use this module as standard on all my Drupal sites as it is very stable, feature rich, and is good at generating sitemap.xml files.

It is possible to add arbitrary links to the file through the user interface in your Drupal site, but the module also provides a couple of hooks to alter the sitemap.xml file generation. The hook hook_simple_sitemap_arbitrary_links_alter() can be used to inject additional links to the sitemap.xml generation process. This gives us a handy mechanism for us to load in the routes from a module and inject them into the sitemap.xml file.

The following code (which would be in a file called my_module.module) implements the hook_simple_sitemap_arbitrary_links_alter() hook and uses the database technique above to load routes that start with the text "my_module". These routes are then iterated over and each one is placed into the sitemap.xml file as an absolute URL.

We do this by adding each link to the $arbitrary_links array that is passed by reference to the hook. This means that anything we add to this array will be seen by the Simple XML Sitemap module as a new link.

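<?php

use Drupal\Core\Url;

/**
 * Implements hook_simple_sitemap_arbitrary_links_alter().
 */
function my_module_simple_sitemap_arbitrary_links_alter(array &$arbitrary_links) {
  // Find all of the routes that start with "my_module".
  $database = \Drupal::database();
  $query = $database->query("SELECT name, path FROM {router} WHERE name LIKE :name", [":name" => 'my_module%']);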
  $results = $query->fetchAll();

  // Loop through routes and add to sitemap.
  foreach ($results as $id => $result) {
    $routeName = $result->name;

    /** @var $route \Symfony\Component\Routing\Route */
    $route = \Drupal::service('router.route_provider')->getRouteByName($routeName);

    if ($route->getRequirement('_access') == 'TRUE') {
      // This is a public link, so add to the sitemap.xml file.
      $url = Url::fromRoute($routeName, [], ['absolute' => TRUE, 'https' => TRUE]);
      $arbitrary_links[] = [
        'url' => $url->toString(),
        'priority' => '0.3',
      ];
    }
  }
}

One important thing to realise from the above example is that we are performing a permission check on the route. In this case we are ensuring that the route is publicly available before attempting to place it into the sitemap.xml file. This is critical to remember as the links you load directly from the database are loaded without any knowledge of the permissions of the user. You must implement that check yourself before showing the link to the user (or in this case, the Simple XML Sitemap module). If you skip the check, the worst that happens is that the user receives a 403 error, since the page itself is still permission checked, but that leads to a poor user experience.

After regenerating the sitemap.xml file you will now see that it contains extra (publicly available) routes that come from the module's routing file.

Conclusion

Whilst this functionality is not built into Drupal it is simple to implement using Drupal classes and services. You just need to be very careful about the routes that you fetch from your module as you will be responsible for checking the permissions on those routes before showing them to your users.

Selecting between using the database or the file system is up to you, but I think the file based mechanism might be slightly easier to unit test as it doesn't rely on a database layer. The downside of using the file system is that you return all of the links, so you need to add more code to filter out the ones you don't need.

If you want to know more about how to create routes in Drupal modules take a look at the structure of routes documentation on Drupal.org. That documentation page is a good grounding in getting to grips with routes.

Aug 22 2021

If you've been building websites for a while you will realise that no site lives in isolation. Almost every site you build integrates with some form of API, and this is especially the case for the more enterprise sites where data is often synchronised back to a CRM system or similar. Drupal's hook and service architecture means that you can easily build integration points to that API to pull in data.

Pulling in data from an API into a Drupal site means installing an off the shelf module or creating a custom module to provide the integration. What route you go for depends on the integration, but for enterprise sites the API is quite often very custom to the business. I have even seen APIs being built at the same time as the site that it needs to integrate with, which is especially the case for startups and new businesses.

One of the biggest issues in getting things working with a site that relies heavily on an API is testing. Both in terms of behavioural testing and user testing you need to be able to have a repeatable set of items that you can use and having an active changing API makes this a little bit tricky. If your API is still being built then you literally have no API to integrate against and so you can't build your site yet.

This is where stubbing comes in. Instead of directly asking the API for data we can swap out the API integration for a different data source. In other words we redirect the API calls to a local file or database table so that the functionality still exists but the data source is different. This is possible to do in Drupal thanks to the modular and extensible way in which services are used.

By stubbing we also get a repeatable set of data that we can use to run tests. If special situations are found that cause issues on the site they can be added to the stub dataset and tests can be written to check those situations. This ultimately makes for a more solid website that can handle edge cases in the API data.

Making An API

In order for this process to work correctly your API must be integrated into your site through a Drupal service class. This means that your API is controlled through one or more classes that are the gateway to the data in the API. This is best practice as you would otherwise tightly couple your API into your system, meaning that it would be difficult to create a stub. The API doesn't need to be directly built into the Drupal classes, it can easily be abstracted out into a composer package, but you still need a point of entry into your system that will allow Drupal to use it.

To demonstrate how to do stubbing I decided to create a simple example module that shows an API in action. I didn't think it was a good idea to create a large module with a complex API integration for a single article so I went looking for a small scale API. To this end I decided to create a simple integration with the free to use Joke API. This API is a RESTful service that returns a random joke from some given parameters and is perfect to show this technique in action.

I didn't want to add lots of code to this article so the integration with the Joke API is about as simple as possible. We first need to define a service class called JokeApi that will contain the integration with the API. As the API is a RESTful service we need to use the GuzzleHttp\Client class that is available in Drupal, so this needs to be injected into the class as a dependency.

Assuming the module is called joke_api, then we create a file called joke_api.services.yml to register the JokeAPI as a service. This contains the following.

services:
  joke_api.joke:
    class: Drupal\joke_api\JokeApi
    arguments: ['@http_client']

The class JokeApi has the necessary dependency injection code and a single method called getJoke() that will pull a joke from the API. This method takes a couple of parameters that govern what sort of joke is returned from the API.

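<?php

namespace Drupal\joke_api;

use Drupal\Component\Utility\UrlHelper;
use GuzzleHttp\Client;
use Symfony\Component\HttpFoundation\Response;

class JokeApi implements JokeApiInterface {

  /**
   * Base URL of the joke endpoint (assumed value, the category is appended).
   *
   * @var string
   */
  protected $url = 'https://v2.jokeapi.dev/joke/';

  /** @var \GuzzleHttp\Client */
  protected $httpClient;

  public function __construct(Client $http_client) {
    $this->httpClient = $http_client;
  }

  /**
   * {@inheritdoc}
   */
  public function getJoke($options = [], $category = 'any') {
    $url = $this->url . $category;

    // Remove empty options so they are not sent as blank query parameters.
    $options = array_filter($options);
    if (!empty($options)) {
      $url .= '?' . UrlHelper::buildQuery($options);
    }

    $request = $this->httpClient->request('GET', $url);

    if ($request->getStatusCode() != Response::HTTP_OK) {
      return FALSE;
    }

    $data = json_decode($request->getBody()->getContents());
    return $data;
  }
}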

The JokeApi class itself implements an interface called JokeApiInterface, which just contains the getJoke() method. Creating an interface here follows SOLID principles and is very important for the stubbing process later on as we will be creating a new service that will mimic the behaviour of this class.

The parameters to the getJoke() method are separated like this as the URL for the RESTful API always consists of the category, followed by an optional list of query parameters. We just build up the URL and use the GuzzleHttp\Client object to make the request to the API. If the request is successful then the method decodes the JSON we receive back and returns this. If something went wrong the method returns false. By the way, returning "false" from methods that would otherwise return an object isn't best practice as it changes the return type of the method. It would be better to throw an exception here, but it would mean adding more code so I opted for this simpler version here.
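As a rough sketch of the exception-based alternative mentioned above, the response handling could look something like this. The JokeApiException class and the decodeJokeResponse() function are hypothetical names used for illustration, and the function takes a status code and body directly so that it can run without Guzzle or Drupal.

```php
<?php

/**
 * Hypothetical exception type for failed joke requests.
 */
class JokeApiException extends \RuntimeException {}

/**
 * Decodes an API response body, throwing instead of returning FALSE.
 *
 * @param int $status_code
 *   The HTTP status code of the response.
 * @param string $body
 *   The raw response body.
 *
 * @return object
 *   The decoded joke data.
 */
function decodeJokeResponse(int $status_code, string $body): object {
  if ($status_code !== 200) {
    throw new JokeApiException('Joke API returned status ' . $status_code);
  }

  $data = json_decode($body);
  if (!is_object($data)) {
    throw new JokeApiException('Joke API returned invalid JSON.');
  }

  return $data;
}
```

Calling code would then wrap the call in a try/catch block rather than checking the return value for FALSE, and the method always returns the same type.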

There is a second API endpoint that allows us to submit a joke to the API, but I won't be integrating with that in this code.

A service class that interacts with the API isn't that interesting if we can't see the output. So let's create a form that will allow us to pull jokes from the joke API service using this service class. The first thing that we need is to add a route for the form so we create a file called joke_api.routing.yml that contains a rule to point the path '/get-joke' to a form class called GetJokeForm.

get_joke:
  path: '/get-joke'
  defaults:
    _form: '\Drupal\joke_api\Form\GetJokeForm'
    _title: 'Get Joke'
  requirements:
    _access: 'TRUE'

In Drupal, the FormBase class implements ContainerInjectionInterface, and so we use that to inject the JokeApi service into the form itself. Note that the form constructor accepts an object of the type JokeApiInterface and as such allows us to pass in any object that implements this interface. This is part of following SOLID principles and is critical in later allowing us to swap out the JokeApi service for our own stubbed service.

The form consists of a submit button and a single text field that allows us to search for a joke. The joke returned from the API will be printed out above the form. I have tried to keep the code as short as possible, but there is always a little bit of boilerplate code in Drupal forms.

<?php

namespace Drupal\joke_api\Form;

use Drupal\Core\Form\FormBase;
use Drupal\Core\Form\FormStateInterface;
use Drupal\joke_api\JokeApiInterface;
use Symfony\Component\DependencyInjection\ContainerInterface;

class GetJokeForm extends FormBase {

  /** @var \Drupal\joke_api\JokeApiInterface */
  protected $jokeApi;

  /**
   * {@inheritdoc}
   */
  public function getFormId() {
    // Assumed form ID, any unique machine name will do.
    return 'joke_api_get_joke_form';
  }

  public function __construct(JokeApiInterface $joke_api) {
    $this->jokeApi = $joke_api;
  }

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container) {
    return new static(
      $container->get('joke_api.joke')
    );
  }

  /**
   * {@inheritdoc}
   */
  public function buildForm(array $form, FormStateInterface $form_state) {
    $output = $form_state->getValue('joke');
    if ($output) {
      $form['joke'] = [
        '#markup' => '<p>' . $output . '</p>',
      ];
    }

    $form['contains'] = [
      '#type' => 'textfield',
      '#title' => 'Contains',
    ];

    $form['submit'] = [
      '#type' => 'submit',
      '#value' => 'Get Joke',
    ];

    return $form;
  }

  /**
   * {@inheritdoc}
   */
  public function submitForm(array &$form, FormStateInterface $form_state) {
    $options = [
      'contains' => $form_state->getValue('contains'),
    ];

    $joke = $this->jokeApi->getJoke($options);

    if ($joke === FALSE || $joke->error == 'true') {
      $jokeString = 'Could not get joke.';
    }
    elseif ($joke->type == 'single') {
      $jokeString = $joke->joke;
    }
    elseif ($joke->type == 'twopart') {
      $jokeString = $joke->setup . '<br>' . $joke->delivery;
    }

    $form_state->setValue('joke', $jokeString);
    $form_state->setRebuild(TRUE);
  }

}

The return from the API can either be a 'single' joke, or a 'twopart' joke, each of which has a slightly different data structure. Depending on what it is we build up a variable called $jokeString and pass that back into the Drupal form state. It could also be an error returned from the API service, so we account for that as well.

There are a number of other options available to filter the Joke API in different ways, but as a simple integration this works and allows some filtering to happen to show the API in action.

When we visit the page at /get-joke we see the form printed out to the screen.

A custom Drupal form, ready to pull data from the Joke API.

If we click the "Get Joke" button the page is refreshed and we see a joke printed out above the form elements.

A custom Drupal form, showing data pulled from the Joke API.

This is clearly a joke. If we click the button again we get more jokes returned from the API. Be warned that some of the jokes from this API are NSFW, but you can add filters to remove that stuff from the results if you need to.

Stubbing The API

Now that we have this API integration with the Joke API we can set about creating a stub module that will mimic the API without actually pulling data from it.

In order to make the stub a plug and play solution a second module is created so that we can activate the stub by just enabling a module. Best practice is to keep the stub module within the parent module it is overriding and simply add the suffix "stub" to the module name. This will create a directory structure like this.

/joke_api
- /modules
- - /joke_api_stub
- - - joke_api_stub.info.yml
- /src
joke_api.info.yml

The stub module will override the API using a built in feature of Drupal that allows us to override service providers. This feature gives us the ability to alter things about any service defined in Drupal, but in this case we will be overriding the PHP class that gets used. To do this we need to create a PHP class with a special name by converting the module name to camel case and adding "ServiceProvider" to the end.

joke_api_stub -> JokeApiStubServiceProvider
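The naming rule can be sketched in plain PHP as follows. The serviceProviderClass() function is a hypothetical helper written for this illustration, not Drupal's own code, and it only handles module names made of lowercase words separated by underscores.

```php
<?php

/**
 * Builds the expected service provider class name for a module.
 */
function serviceProviderClass(string $module): string {
  // Convert snake_case to CamelCase: joke_api_stub becomes JokeApiStub.
  $camelized = str_replace(' ', '', ucwords(str_replace('_', ' ', $module)));
  // The class lives in the root of the module's namespace.
  return 'Drupal\\' . $module . '\\' . $camelized . 'ServiceProvider';
}

$class = serviceProviderClass('joke_api_stub');
// $class is now 'Drupal\joke_api_stub\JokeApiStubServiceProvider'.
```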

This JokeApiStubServiceProvider class needs to extend the ServiceProviderBase class. This is an abstract class that we can either use to alter or register a service using methods that are automatically detected and called by Drupal. To alter the service we just need to implement the alter() method in our new class and use the parameter sent to the method to find and alter the class used in the joke_api.joke service.

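<?php

namespace Drupal\joke_api_stub;

use Drupal\Core\DependencyInjection\ContainerBuilder;
use Drupal\Core\DependencyInjection\ServiceProviderBase;

class JokeApiStubServiceProvider extends ServiceProviderBase {

  /**
   * {@inheritdoc}
   */
  public function alter(ContainerBuilder $container) {
    // Swap the class behind the existing service for the stub class.
    $definition = $container->getDefinition('joke_api.joke');
    $definition->setClass('Drupal\joke_api_stub\JokeApiStub');
  }

}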

By placing this specially named class in the module's src directory Drupal will automatically find it and run any methods we have created. There is no need to add additional service definitions for this.

We have now overridden the joke_api.joke service with a different class. The next step is to extend the JokeApi class so that we get all of the same injected services as the original class. We can then override methods in the class for our own needs. Also, as we told the GetJokeForm to accept an interface instead of a class name, we don't need to change anything to get this to work. The new JokeApiStub class will still implement the JokeApiInterface and as such can be passed without any problems to anything we built in the original module.

The simplest implementation of the stub class in this situation is to return a single joke, so that's what I have done here. The important part is that the return of the getJoke() method must return the same kind of data that our original method returned so this takes a JSON string from the API and runs it through json_decode() just like the original method did.
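A minimal sketch of what such a stub could look like is shown below. To keep the example self-contained it implements a cut-down copy of JokeApiInterface directly rather than extending JokeApi, it leaves out the Drupal\joke_api_stub namespace, and the JSON is a placeholder holding only the fields the form reads rather than a real API response.

```php
<?php

/**
 * Cut-down version of the interface described in the article.
 */
interface JokeApiInterface {

  public function getJoke($options = [], $category = 'any');

}

/**
 * Stub that returns the same joke for every request.
 */
class JokeApiStub implements JokeApiInterface {

  /**
   * {@inheritdoc}
   */
  public function getJoke($options = [], $category = 'any') {
    // Placeholder payload containing only the fields the form reads.
    $json = '{"error": false, "type": "single", "joke": "Placeholder joke."}';
    // Decode to an object, just like the real service does.
    return json_decode($json);
  }

}
```

Because the method signature and the shape of the returned object match the original service, the form does not need to know which of the two classes it has been given.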

All you need to do to activate this stub is turn the module on. With the module active, when we use the form again we get the same joke over and over again. This is because we are now pulling data from the stubbed class instead of from the actual API itself.

If you want to give this code a go then I have created a GitHub repo that contains the entire JokeApi and stub modules. Feel free to give this a go and see how it works.

Conclusion

The code here demonstrates the very simplest version of stubbing data. We could store this data in a database table or CSV files and write some code to return that data. Using more complex data storage like database tables or CSV is good because it allows us to create situations that can react to input, which is important for any API. For example, we could return different jokes depending on what sort of search query or filter was performed.
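As a rough illustration of a stub that reacts to input, the sketch below filters rows of CSV data by the 'contains' option. The CsvJokeStub class name and the CSV layout (a type and a joke on each row) are assumptions made for this example, and the data is passed in as a string so the code runs without any files.

```php
<?php

/**
 * Hypothetical stub that serves jokes from CSV data.
 */
class CsvJokeStub {

  /**
   * Rows of stub data, each an array of [type, joke].
   *
   * @var array
   */
  protected $rows = [];

  public function __construct(string $csv) {
    foreach (preg_split('/\R/', trim($csv)) as $line) {
      if ($line === '') {
        continue;
      }
      $this->rows[] = str_getcsv($line, ',', '"', '\\');
    }
  }

  /**
   * Returns the first joke matching the 'contains' option, or FALSE.
   */
  public function getJoke($options = [], $category = 'any') {
    $contains = $options['contains'] ?? '';
    foreach ($this->rows as [$type, $joke]) {
      if ($contains === '' || stripos($joke, $contains) !== FALSE) {
        return (object) ['error' => FALSE, 'type' => $type, 'joke' => $joke];
      }
    }
    return FALSE;
  }

}

$stub = new CsvJokeStub("single,A joke about cats\nsingle,A joke about dogs");
$joke = $stub->getJoke(['contains' => 'dogs']);
```

The same idea extends to a database table, where the lookup becomes a query instead of a loop.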

Taking this a step further we can create full data sets where we return things like user account details or event information from this stubbed service. It's possible to create fleshed out interactions by just using stubbed data. It's probably a good idea to use CSV as this will allow the stub data to live with the code and allow different code to be tested by just checking out that branch.

The usefulness of this technique speaks for itself. It can be easily activated and adds lots of possibilities to your development workflow. By also combining this with configuration split you can automatically activate the stub module locally so that it always provides that consistent interface.

In addition to testing there is also the benefit of convenience. By making the API essentially self contained with the site it means we don't need to access the API when building things. If the API being integrated with is behind a VPN or whitelisted IP address setup then there's a good chance that not all of your team will have access. By creating a stub module you instantly give the entire team the ability to work on the site without having access to the API. This really helps for building out front end components that integrate with the API as your theme builder will not need access to the API at all.

You can also turn the stub module around and prevent your Drupal site leaking testing data to a third party API. For example, if you have an analytics module that relies on user interaction you can swap out that module for something that records the user actions to a log, rather than sending data to the API. This allows you to run tests and ensure that your event setup works, without sending a bunch of test data to the analytics provider. This is especially useful if you are integrating with an analytics provider that has a limited stage environment.
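As an illustration of turning the stub around, the sketch below records events locally instead of sending them anywhere. The AnalyticsInterface interface and the LoggingAnalyticsStub class are hypothetical names, and the events are kept in an array here where a real module would more likely write to Drupal's logger.

```php
<?php

/**
 * Hypothetical interface shared by the real and stubbed analytics services.
 */
interface AnalyticsInterface {

  public function track(string $event, array $data = []): void;

}

/**
 * Stub that records events locally rather than calling a third party API.
 */
class LoggingAnalyticsStub implements AnalyticsInterface {

  /**
   * The recorded events.
   *
   * @var array
   */
  protected $log = [];

  public function track(string $event, array $data = []): void {
    $this->log[] = ['event' => $event, 'data' => $data];
  }

  public function getLog(): array {
    return $this->log;
  }

}

$analytics = new LoggingAnalyticsStub();
$analytics->track('joke_requested', ['contains' => 'cats']);
```

A test can then inspect the recorded events to confirm that the right interactions were tracked.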

The only downside is that it takes a little bit of time to get this set up and working. You must also keep the stub module up to date with any changes to the API, otherwise it's less than useless.

I have used this technique on a variety of different Drupal sites and it has always improved the lives of the developers working on these projects.

Aug 15 2021

Drupal 8 and 9 are built upon services, with many parts of the system available through dependency injection, so it's important to understand the concepts. Services are a way to wrap objects and use dependency injection to produce a common interface. They are powerful and are used all over Drupal to do pretty much everything.

They can, however, be a little difficult for newcomers to the system to understand, especially if they are coming from Drupal 7 or other non-object oriented systems. When you look at some Drupal source code you are likely to see objects being created out of apparent thin air. It's a little hard to know where they come from if you aren't used to how they work.

I first came across services when I started using Drupal 8 and it took me a little while to get my head around what they are and what they do. Before I understood them, I saw a lot of people online attempting to help by just pointing people to one service or another using this sort of construct.

$thing = \Drupal::service('thing');

This is helpful if you are familiar with Drupal services, but if you aren't then this doesn't tell you much. It is also bad practice to use this construct in certain situations, which I'll get into later on. If you have seen that construct around the internet but don't know what it means then I hope to clear things up a little.

I actually gave this article as a talk at DrupalCamp London 2018, but I have found myself referring to the slides quite often since then. I thought I would write it up as a couple of articles. Since I gave that talk around Drupal 8 I have updated the examples to be in line with Drupal 9.

Let's start with using Drupal services.

Using Drupal Services

The good news is that using Drupal services is pretty simple. Indeed, most of the complexity of services is deliberately hidden away from you. This allows you to get on with the work at hand without having to worry about where to get this or that object from and what parameters its constructor needs. 

There are many different services in Drupal that govern pretty much everything. If you want access to configuration, the internal cron system, path and routing, the rendering process, translations, queues, cache and even date calculations then you can use a Drupal service to do that. I have just mentioned a handful here, but there are plenty more services available in Drupal 9.

A good example of a service that is often used is the alias manager service. This service wraps the Drupal\path_alias\AliasManager class in Drupal and allows developers to find an alias for a given path. This means that given a path like "/node/123" you can translate this to an alias in the form of "/page/some-page". This is useful if you have the node ID and want to find the correct path to the node so you can print it out. There are other ways to do this, especially if you have the full Node object, but this is used outside of that situation.

The service can be used in the following way. We use the \Drupal::service() method to get an instantiated AliasManager object and then use a method on that object to translate the path to the alias.

$aliasManager = \Drupal::service('path_alias.manager');
$path = '/node/123';
$alias = $aliasManager->getAliasByPath($path);

As the service returns an object we can chain together the method calls and do the alias lookup in one line, like this.

$path = '/node/123';
$alias = \Drupal::service('path_alias.manager')->getAliasByPath($path);

This does exactly the same thing as the above example, but in a single line of code.

If you are writing code in Drupal it is also good practice to include a docblock comment above this line so that your IDE knows what type of object the $aliasManager variable contains.

/** @var \Drupal\path_alias\AliasManager $aliasManager */
$aliasManager = \Drupal::service('path_alias.manager');

When you start writing code your IDE will now print out a list of the methods you have access to through the service object. Having this in place really helps you tap into the full functionality of the service and will absolutely speed up your development. This is an example of this working in PHPStorm.

Using docblock comments in PHPStorm to show the methods inside a Drupal service object.

Before you go off and start using this construct in all of your custom Drupal classes you should know that the above code should only really be used in static methods, hooks and preprocess functions in your theme. This is because you can use Drupal to inject services into your objects, whereas this isn't possible with static methods and stand alone functions. I will address this again later in the article.

Where To Find Services In Drupal

Services are all defined in YML files within Drupal. Every module that wants to define a service needs to create a file with the name of [module name].services.yml. This means that if we want to find services we just need to search the Drupal codebase within files that end in services.yml.

The path_alias.manager service I looked at earlier is defined in the file path_alias.services.yml, along with a few other alias based services. Since I have already shown how this works let's look at the service footprint.

The path_alias.manager service is defined in the following way.

  path_alias.manager:
    class: Drupal\path_alias\AliasManager
    arguments: ['@path_alias.repository', '@path_alias.whitelist', '@language_manager', '@cache.data']

The first line here is the name of the service and is used to ask Drupal to instantiate it, in this case the name is path_alias.manager.

The second line tells Drupal where to find the class it needs to instantiate. This points to the namespace of the class, rather than the filename, but it tells us that this class is in the src directory of the path_alias module.

The final line consists of an array of four arguments. These arguments are an optional parameter that tell Drupal what arguments the constructor of the AliasManager requires. If we look at the constructor of the AliasManager class we can see that it requires four parameters, which map from the list of arguments to the parameters of the constructor.

class AliasManager implements AliasManagerInterface {
  /**
   * Constructs an AliasManager.
   *
   * @param \Drupal\path_alias\AliasRepositoryInterface $alias_repository
   *   The path alias repository.
   * @param \Drupal\path_alias\AliasWhitelistInterface $whitelist
   *   The whitelist implementation to use.
   * @param \Drupal\Core\Language\LanguageManagerInterface $language_manager
   *   The language manager.
   * @param \Drupal\Core\Cache\CacheBackendInterface $cache
   *   Cache backend.
   */
  public function __construct($alias_repository, AliasWhitelistInterface $whitelist, LanguageManagerInterface $language_manager, CacheBackendInterface $cache) {
    $this->pathAliasRepository = $alias_repository;
    $this->languageManager = $language_manager;
    $this->whitelist = $whitelist;
    $this->cache = $cache;
  }
}

The @ symbol in the argument list above denotes that these arguments are other services. There are actually different types of arguments we can use here.

'@path_alias.repository' - This is a reference to another service. So in this case we are referencing the path_alias.repository, which is defined in the same services file. If you see this structure around the Drupal codebase you can find the service it referenced by searching for "path_alias.repository:" (i.e. with a trailing colon) in any services.yml file.

'%app.root%' - This is a container parameter. Some of these are set by Drupal internally (like app.root) but you can also define your own parameters and inject them in this way. This tends to be used less often but it's an option if you want to inject a setting directly into the class.

'value' - This is a literal value, so the string 'value' will be passed as an argument. You can also use numeric and boolean values here so you can pass values like 123 or true.
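To make the three argument types more concrete, here is a much simplified sketch of how a container might resolve them. The resolveArgument() function and the $services and $parameters arrays are hypothetical and are not part of Drupal or Symfony, whose real containers do considerably more.

```php
<?php

/**
 * Resolves a single services.yml style argument.
 *
 * @param mixed $argument
 *   The raw argument from the service definition.
 * @param array $services
 *   Already instantiated services, keyed by service name.
 * @param array $parameters
 *   Container parameters, keyed by parameter name.
 */
function resolveArgument($argument, array $services, array $parameters) {
  if (is_string($argument) && strpos($argument, '@') === 0) {
    // '@name' is a reference to another service.
    return $services[substr($argument, 1)];
  }
  if (is_string($argument) && preg_match('/^%(.+)%$/', $argument, $matches)) {
    // '%name%' is a container parameter.
    return $parameters[$matches[1]];
  }
  // Anything else is passed through as a literal value.
  return $argument;
}

// Example data, the path used here is made up for the illustration.
$services = ['path_alias.repository' => new stdClass()];
$parameters = ['app.root' => '/var/www/html'];

$repository = resolveArgument('@path_alias.repository', $services, $parameters);
$root = resolveArgument('%app.root%', $services, $parameters);
$literal = resolveArgument(123, $services, $parameters);
```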

By knowing where to find services in Drupal you already know how to get access to the numerous different types of services that Drupal offers. There are a number of other options that are available when setting up a service class, but this is the minimum required. You should know that not all entries in services.yml files are pure services as there are a few other constructs that can be added to these files. You can create cache bins or event listeners through this interface and although they are created like services they shouldn't be created outside of Drupal's control.

Why Use Dependency Injection?

Dependency injection within Drupal is automated dependency injection. This means that with a few rules in a settings file we can create objects and have the dependencies automatically injected into it without having to create them manually. In the Drupal codebase, the Symfony component DependencyInjection manages the dependencies. If you use Symfony then you might find a lot of familiarity in how Drupal manages dependencies. 

But why use dependency injection? Couldn't we just create the objects we need and figure things out when we need them? Let's look at creating the AliasManager object without using any dependency injection.

Starting off with the AliasManager class, we know that it needs four parameters passed to it, so let's create that as a starting point where we create the AliasManager object.

use Drupal\path_alias\AliasManager;

$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

Of course, this doesn't work as we haven't defined any of the parameters, so let's start with the $alias_repository parameter. This is actually a reference to another service called path_alias.repository, which has its own entry in the path_alias.services.yml file. The object we need to create here is called AliasRepository, so let's put the footprint of that object into the code.

use Drupal\path_alias\AliasManager;
use Drupal\path_alias\AliasRepository;

$alias_repository = new AliasRepository($connection);
$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

The AliasRepository constructor takes one parameter, which is a database connection. Thankfully, there exists a database factory that we can use to get a connection to the default database. Adding this to our code finishes the first parameter.

use Drupal\path_alias\AliasManager;
use Drupal\path_alias\AliasRepository;
use Drupal\Core\Database\Database;

$connection = Database::getConnection();
$alias_repository = new AliasRepository($connection);
$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

Moving onto the $whitelist parameter, this is also a service called path_alias.whitelist, also found in the path_alias.services.yml file. This service points to a class called AliasWhitelist. The constructor for this class takes 5 parameters. Adding the footprint of that object to our code we now have this.

use Drupal\path_alias\AliasManager;
use Drupal\path_alias\AliasRepository;
use Drupal\Core\Database\Database;
use Drupal\path_alias\AliasWhitelist;

$connection = Database::getConnection();
$alias_repository = new AliasRepository($connection);

$whitelist = new AliasWhitelist($cid, $cache, $lock, $state, $alias_repository);

$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

The next step is to start creating the other parameters for the AliasWhitelist object. The first is easy as this is just a string passed to the object to set up the cache identifier.

use Drupal\path_alias\AliasManager;
use Drupal\path_alias\AliasRepository;
use Drupal\Core\Database\Database;
use Drupal\path_alias\AliasWhitelist;

$connection = Database::getConnection();
$alias_repository = new AliasRepository($connection);

$cid = 'path_alias_whitelist';
$whitelist = new AliasWhitelist($cid, $cache, $lock, $state, $alias_repository);

$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

After this it starts getting complicated. The second parameter to the AliasWhitelist constructor is a Drupal cache object, which we can create using the built in CacheFactory object, which we also pass a couple of parameters to in order to create it.

use Drupal\path_alias\AliasManager;
use Drupal\path_alias\AliasRepository;
use Drupal\Core\Cache\CacheFactory;
use Drupal\Core\Database\Database;
use Drupal\Core\Site\Settings;
use Drupal\path_alias\AliasWhitelist;

$connection = Database::getConnection();
$alias_repository = new AliasRepository($connection);

$settings = Settings::getInstance();
$default_bin_backends = $container->getParameter('cache_default_bin_backends');
$cacheFactory = new CacheFactory($settings, $default_bin_backends);

$cid = 'path_alias_whitelist';
$cache = $cacheFactory->get('bootstrap');
// Placeholders, the lock and state objects still need to be created.
$lock = null;
$state = null;
$whitelist = new AliasWhitelist($cid, $cache, $lock, $state, $alias_repository);

$aliasManager = new AliasManager($alias_repository, $whitelist, $language_manager, $cache);

I'm already lost. I have written lots of code, I have more than 10 source code files open in my IDE, and I still haven't even finished creating the second parameter. There are another two parameters to create both of which have equally complex dependencies, and I haven't even gotten close to using the getAliasByPath() method.

What's worse is that I have already hard coded the database connection and the cache setup we are using. If I go further I would need to also hard code other parameters and options into the code. Making these decisions means that it would be very hard to change this code in the future. If the AliasManager class changed in the future I would spend hours re-writing this code to make it work again. If this seems far fetched then remember that the path_alias.manager service used to be called path.alias_manager in Drupal 8, and this change also changed the underlying classes used by the service.

Compare all of that complexity with using the dependency injection method. We would take more than 50 lines of code and reduce this down to just a single line.

$path = '/node/123';
$alias = \Drupal::service('path_alias.manager')->getAliasByPath($path);

This is far easier to read and understand and easily adaptable to changes to the underlying service without having to change our own code implementation. The example I have gone through here might seem convoluted, but I once showed all of this to a junior programmer who had been struggling to understand dependency injection. As soon as they saw the effect of not using dependency injection and all of the complexity involved they said that they understood why it was used. I wanted to include this example here as it really shows how services mask complexity.

Creating Custom Services With Injected Services

Whilst it is possible to use the \Drupal::service() construct wherever you need it, this shouldn't be used most of the time. Actually, it technically should only be used in static methods, hooks and theme functions where the flat function structure doesn't lend itself to dependency injection. Drupal allows you to inject services into classes so that they are there and ready to use. When you are developing your own modules you will probably want to create your own services, which might have their own services being injected into them. The best way to show this in action is with a simple example.

To create a service we need to create a services.yml file. In the following example we are defining a custom service called mymodule.service_example that creates an object called ServiceExample, which will be created with another service called config.factory. The config.factory service is used to access the configuration of the Drupal site and is quite a common service to use.

services:
  mymodule.service_example:
    class: Drupal\mymodule\ServiceExample
    arguments: ['@config.factory']

Next, we need to create the ServiceExample class. It's best practice to create an interface that comes with your service, in the example this is ServiceExampleInterface. Using an interface allows you to follow proper SOLID principles by allowing other services that use this service to also accept different types of this class, which allows for better unit tests and a more versatile codebase.

The ServiceExample class just needs a constructor to accept the config.factory service and a class property to keep it in. When the object is created the config factory will automatically be injected into it, ready to use. I have added an example method called doThing() that makes use of the service.

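<?php

namespace Drupal\mymodule;

use Drupal\Core\Config\ConfigFactoryInterface;

class ServiceExample implements ServiceExampleInterface {

  /**
   * Name of the configuration object to load (assumed for this example).
   *
   * @var string
   */
  protected $configName = 'mymodule.settings';

  /** @var \Drupal\Core\Config\ConfigFactoryInterface */
  protected $configFactory;

  public function __construct(ConfigFactoryInterface $config_factory) {
    $this->configFactory = $config_factory;
  }

  public function doThing() {
    $config = $this->configFactory->get($this->configName);
  }
}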

To make use of this service you just need to create and use it like any other service. As an example we could use a hook_preprocess_block() to alter things within the block rendering system and use our new service to perform those modifications.

function mymodule_preprocess_block(&$variables) {
  \Drupal::service('mymodule.service_example')->doThing($variables);
}

This construct is pretty much the same for any service you want to inject. For example, if you want to use the path_alias.manager service you just need to add the service to the arguments in the module's services.yml file and then update the class to accept that new parameter. Once the service is created you can use the object just like normal.

Drupal Dependency Injection Interface

Controllers and Forms in Drupal are not defined through the services file and as a result they need a different mechanism to allow dependency injection to be used. In the case of Controllers and Forms the dependency injection is built right into the class and so it has a slightly different construct.

Controllers extend ControllerBase and Forms extend FormBase, both of which implement ContainerInjectionInterface. This interface requires a static method called create(). This method receives the service container, pulls out the services that the class needs and passes them to the constructor, so they are injected into the object when it is created.

As an example, the following controller class called ExampleController injects the config.factory service using the create() dependency injection interface.

class ExampleController extends ControllerBase {

  /**
   * The config factory object.
   *
   * @var \Drupal\Core\Config\ConfigFactoryInterface
   */
  protected $configFactory;

  /**
   * {@inheritdoc}
   */
  public static function create(ContainerInterface $container) {
    return new static(
      $container->get('config.factory')
    );
  }

  /**
   * Constructs an ExampleController object.
   *
   * @param \Drupal\Core\Config\ConfigFactoryInterface $config_factory
   *   A configuration factory instance.
   */
  public function __construct(ConfigFactoryInterface $config_factory) {
    $this->configFactory = $config_factory;
  }
}

This controller is used in the normal way, we just need to define a route that uses this class in the module's routing.yml file. This calls a method in the class called page() that has access to all of the services that we have injected into the class using the create() construct.

mymodule_example_controller:
  path: '/example'
  defaults:
    _controller: '\Drupal\mymodule\Controller\ExampleController::page'
    _title: 'Example'
  requirements:
    _access: 'TRUE'

That's pretty much it, just remember that if the class you are extending implements ContainerInjectionInterface then it uses the create() method to do the dependency injection for the class. If not, then you need to define the services in your module's services.yml file.
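That rule of thumb could be sketched in plain PHP like this. The SimpleContainer class, the InjectableInterface interface and the instantiate() function are simplified stand-ins invented for this example; Drupal's own ContainerInjectionInterface and class resolver are more involved.

```php
<?php

/**
 * Hypothetical stand-in for the service container.
 */
class SimpleContainer {

  protected $services;

  public function __construct(array $services) {
    $this->services = $services;
  }

  public function get(string $id) {
    return $this->services[$id];
  }

}

/**
 * Hypothetical stand-in for ContainerInjectionInterface.
 */
interface InjectableInterface {

  public static function create(SimpleContainer $container);

}

class ExamplePage implements InjectableInterface {

  public $config;

  public function __construct($config) {
    $this->config = $config;
  }

  public static function create(SimpleContainer $container) {
    // Pull the needed service out of the container and pass it on.
    return new static($container->get('config.factory'));
  }

}

/**
 * Creates an object, using create() when the class supports it.
 */
function instantiate(string $class, SimpleContainer $container) {
  if (is_subclass_of($class, InjectableInterface::class)) {
    return $class::create($container);
  }
  return new $class();
}

$container = new SimpleContainer(['config.factory' => new stdClass()]);
$page = instantiate(ExamplePage::class, $container);
```

The calling code never needs to know which services ExamplePage depends on, as that knowledge lives entirely inside its create() method.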

Conclusion

As I said at the start of this article, this can be a bit of a complex topic, especially for beginners to Drupal. Once you get your head around it, it becomes a really powerful tool and allows you to pull in different services without having to write lots of complex and fragile code. You can even override services and inject your own classes using ServiceProviderBase that I have talked about previously to poke holes in the Shield module. Just remember that services and their dependencies are defined in *.services.yml files and Controllers and Forms can have dependency injection built into them on creation.

Services and their automated dependency injection are intended to make your life as a developer easier so that you can concentrate on the code that matters to you.

You can read more about services in the Drupal documentation.

Aug 01 2021

Naming things is hard[citation needed] and there are a lot of things that you can name when configuring a Drupal site. Picking the right machine names for the different parts of Drupal can make your life easy in the long run. Changing labels is simply a case of tweaking the label in the interface, or through configuration updates. The issue is that once you decide on a machine name for something in Drupal it's pretty much set in stone forever.

The machine names you pick are often used in database tables, paths, interface elements and pretty much anywhere else the item is referenced. Changing entity or field machine names at a later date is difficult and can mean writing complex code or running migrations.

I have built a lot of Drupal sites over the years and done detailed audits on quite a few as well. This experience has given me a lot of insight into how to set up machine names in Drupal. I have seen some horrific naming practices and poorly configured sites, and in my experience there is a correlation between poorly named things in Drupal and bugs caused by those poor choices of name.

As far as I can tell the drupal.org documentation doesn't actually cover this aspect of setting up Drupal. It does address naming conventions in code and modules as part of the coding standards, but not with machine names for content and fields.

Recently, I got into a discussion with a couple of other Drupal developers about content entities and fields, what to name them, and what to avoid when reusing fields. Surprisingly few people have written about naming conventions in Drupal, despite the fact that it has such an effect on the life cycle of the site itself. I thought I would write down some of the conventions I use when naming things and some of the best practices when reusing fields within a Drupal site.

I will start by addressing the naming of content entities.

Naming Content Entities

Content entities aren't just the configurable content types you have in Drupal, although they play a big part. They also include things like content blocks, taxonomy terms, paragraphs or anything else that you can configure and start making content with. Drupal gives you 32 characters to name your content entities so you have a bit of space to play with.

When you create a content type Drupal will automatically create a machine name for you.

Creating a new content type in Drupal

Click on the edit link here to expand the machine name and ensure that it meets your naming requirements. This is much the same for different types of content entity that you might create around the site. This machine name expand functionality is built into most forms like this.

Creating a new content type in Drupal and editing the machine name.

When you set out to name content types in Drupal you must think about the long term use of that content. As an arbitrary example, let's say that you create a content type to store resources for university students in the first year. You can easily name this content "First year resources", which then has the machine name of first_year_resources. This proves popular and so you decide to roll the feature out to second year students as well. This means you either need to create a new content type called second_year_resources or continue to use the first_year_resources content type for second year students. Either way, if you then need to roll out the same feature to third year students then you will have multiple content types doing the same thing or a single content type with a name that doesn't fit its purpose.

This will cause a problem behind the scenes as your developers will be faced with one of two situations. They will either have a content type called "first_year_resources" that isn't just for first years, or they will have multiple content types that are essentially doing the same thing and have lots of duplicated templates and configuration. In the long run this causes frustration, and the confusion can ultimately lead to mistakes (and bugs).

The solution to this situation would be to start out with a content type called "Resources", which can then be expanded to other years or courses when needed. This creates the machine name of "resources", which is easily understood by developers. This single content type can then be themed easily and rolled out to different users. It can be easily segmented into lists by the use of taxonomy terms. Once the content type and code is in place, it doesn't need to be changed if you want to create resources for other types of users. 

Fundamentally, you should be thinking about the underlying data structures of the content you are creating. I have seen multiple Drupal sites where a content type of "News" has been created and at a later date another content type called "Press Release" is created. The difference between an item of news and a press release is irrelevant to how the data is stored as they will both contain a title, some body copy and maybe some taxonomy terms. In one particular example I worked on a site that had no less than 5 different content types that were just for news related content. This caused a lot of confusion as even the users didn't know what sort of news item they should be creating. Content creators would often create the wrong news type and then had to copy/paste the content to create a different type of news item.

What the developers intended was that press releases would appear in one list and news items would appear in another, and so on. Instead of creating a single content type for news and using taxonomy terms to filter, they created multiple versions of the same thing. The problem is that because of the similarity of the data they will probably have duplicate templates and preprocess hooks to ensure the presentation is correct. A better approach, in my opinion, is to create a content type called "Article" and then use taxonomy to segment that content type into different lists. Permissions for different news types can be achieved by using taxonomy access control.

The same things apply to the other types of content entity in Drupal. The key thing to remember is that things like content blocks and paragraphs are meant to be re-used across the site. This means that you must adhere to the rule of 'name once, use many' when creating them.

It is quite easy to fall into the trap of setting up a paragraph with the name of 'Two column content' and then need to alter that paragraph to use different numbers of columns in different situations. Having a paragraph with that name that can have anywhere between one and four columns makes little sense and will confuse developers as well as editors of the site.

The rule is, don't get too specific. I have seen plenty of situations where a content type called "Homepage" was created that is only ever used to present the homepage. Or a content block called "Copyright" that is only ever used to print out the copyright statement at the bottom of the page. This sort of thing creates baggage that your Drupal site needs to drag along with it during its life. There are better solutions to this, especially in Drupal 9 with the layout system or the configuration pages module.

Paragraphs are quite often abused in Drupal sites. I once inherited a Drupal site that had over 45 types of paragraphs. I didn't even know the paragraph administration page was paginated until I saw that mess! The root of the issue was that the original developer had created 8 different variants of a "Hero" paragraph, each containing just an image field, but with a slightly different way to display the content. When the author wanted to change the type of hero on a given page they had to remove one paragraph and add another one with the correct setup. This really should have been a single paragraph called Hero that contained fields used to control the output. All this complexity could have been avoided through the use of a small amount of templating or preprocess logic.

Here are some best practice tips for naming content entities.

  • Don't make the content entities specific to a single type of use. This means avoiding names like "Two columns" where more columns might be used, or a content type called "Homepage".
  • Don't use similar looking names. For example, avoid using Event and Events as content types on the same site. Also, avoid paragraphs called "Hero" and "Hero wide" as this can easily be accomplished via templating.
  • Don't use the site name or project name for the name of the entity. This is generally a good rule to follow anyway as giving your content types site only names will restrict re-use on other projects.
  • Do keep the name short and simple.
  • Do describe what the content entity is.
  • Do try to avoid the use of underscores in your content entity names.
  • Do use abbreviations to keep the name short, but still avoid the use of similarly named entities and difficult to understand (or spell) names.
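Some of these tips can be checked mechanically. The following is a rough sketch in plain PHP, not part of Drupal, of a function that flags a proposed machine name if it contains unexpected characters, is longer than 32 characters, or looks too similar to a name already in use. The function name and the similarity rules are assumptions made for illustration.

```php
<?php

// Returns a list of problems with a proposed machine name. An empty
// array means none of the checks in this sketch were triggered.
function check_entity_machine_name(string $name, array $existing = []): array {
  $problems = [];
  if (preg_match('/^[a-z0-9_]+$/', $name) !== 1) {
    $problems[] = 'Use only lowercase letters, numbers and underscores.';
  }
  if (strlen($name) > 32) {
    $problems[] = 'The name is longer than 32 characters.';
  }
  foreach ($existing as $other) {
    if ($other === $name) {
      continue;
    }
    // Similar if the names match once trailing "s" characters are removed
    // (event/events) or if one is the other plus a suffix (hero/hero_wide).
    $plural = rtrim($other, 's') === rtrim($name, 's');
    $extends = strpos($name, $other . '_') === 0 || strpos($other, $name . '_') === 0;
    if ($plural || $extends) {
      $problems[] = "Too similar to the existing name \"$other\".";
    }
  }
  return $problems;
}

// "events" is flagged because "event" already exists.
print_r(check_entity_machine_name('events', ['event', 'resources']));
```

A check like this only catches the mechanical problems; whether a name is too specific to one use still needs a human decision.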

Naming Fields

One of the most critical things to name correctly is the field as this is where all of your data will be stored. Developers will probably interact with fields the most so having them named correctly is critical to the development and maintenance of your site.

Whilst content entities have a single item of configuration, fields are created using two. There is the field storage configuration that tells Drupal what and how to store the data relating to the field. In addition to this there is a field instance configuration that is used when a field is connected to a content entity as well as any custom configuration the field might need.

Most of the difficulties I have found with field names in Drupal arise when a field is either named counter to its function, or is named quite similarly to another field. You should be looking at creating field names that describe what their function is and what they are connected to. This means creating fields with a format like the following.

field_[content entity]_[description]

As an example, let's say that you created a field called field_page_subtitle. It is immediately obvious that the field belongs to the page content type and contains some sort of string. This means that when I want to load in the value of field_page_subtitle I just need to reference the value, rather than load any files or entity references. Similarly, for a paragraph field containing a title the field name should be field_paragraph_title.

Adding the type of content that is being stored to the field can be helpful in understanding its function, although this shouldn't be a blanket rule. For example, adding a field called field_page_title or field_user_first_name is fine as people will understand that the field contains a string of some kind. Fields that store data like dates or images should ideally be named to show that they contain a different type of data. For example, if you create a field for the start date of an event it therefore makes sense to call it field_event_start_date. If the field also contains the time then you might call the field field_event_start_datetime to show that it contains more than just the date.

Similarly named fields can cause confusion, so you should endeavour to keep your field names as distinct as possible. I remember seeing one site I was auditing that had a field called field_image, which was fine, but the same content type had a field called field_images (with an extra 's'). There were quite a number of bugs associated with just these two fields having slightly different names, despite them clearly having the same function (i.e. storing images). One of the fields should have been called field_page_hero in order to show where the image is used.

Just like content type machine names you have 32 characters to use, although this includes the prefix "field_" at the start, so you actually have a few fewer characters to play with.

Creating a new field in Drupal and editing the machine name.

One thing to watch out for is when you create a long field name on a content entity that also has a long name. Although the maximum length of the field name is 32 characters, the maximum length of a table name is 64 characters if MySQL is being used. A good example of this is the taxonomy tables, as the taxonomy term revision table for a field must be prefixed with "taxonomy_term_revision__field". If that length, plus the length of your field name, exceeds the limit Drupal allows for table names (which sits below 64 characters to leave room for any database table prefix) then Drupal will reduce the prefix to "taxonomy_term_r__" and replace the field name with a short hash. This is what has happened if you spot any tables with a name like "taxonomy_term_r__2de1caf063" in your database.
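The shortening behaviour can be sketched in a few lines of plain PHP. This is a simplified illustration and not Drupal's actual code: the function name is made up, the 48 character default threshold is an assumption (Drupal appears to leave headroom below MySQL's 64 character limit for database table prefixes), and hashing the field name with sha256 is a simplification.

```php
<?php

// Sketch of how a revision table name for a field might be shortened.
// The default $limit of 48 is an assumption; pass another value to experiment.
function field_revision_table_name(string $entity_type, string $field_name, int $limit = 48): string {
  $table = $entity_type . '_revision__' . $field_name;
  if (strlen($table) <= $limit) {
    return $table;
  }
  // Too long: switch to the short separator and a 10 character hash.
  // Hashing the field name is a simplification made for this sketch.
  return $entity_type . '_r__' . substr(hash('sha256', $field_name), 0, 10);
}

// Short enough to be used as is.
echo field_revision_table_name('taxonomy_term', 'field_tags'), PHP_EOL;

// Too long, so the readable field name is replaced by a hash.
echo field_revision_table_name('taxonomy_term', 'field_term_related_resources'), PHP_EOL;
```

The practical point is that once the hash kicks in, the table name no longer tells you which field it belongs to, which is one more reason to keep field names short.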

Here are some best practice tips for naming fields.

  • Don't use similar field names. Don't use field_image and field_images as this is bound to cause confusion and errors in the long run.
  • Don't name your fields with regards to the project or site as this prevents their re-use on other projects.
  • Do describe what the field is for in the name. If it is storing a name call it 'name'.
  • Do include the type of data being stored in the field, but only if it makes sense to do so.
  • Do add the name of the content entity the field is connected to. For example, a field that stores a name field on a User entity would be called field_user_name.
  • Do keep the field name to a minimum, especially if it's being used on content entities with long names like taxonomy terms or paragraphs.
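As a small illustration of these tips, the following plain PHP sketch builds names like field_page_subtitle from the content entity and a description, and refuses anything over the 32 character limit. The build_field_name() helper is hypothetical and is not something Drupal provides.

```php
<?php

// Builds a field machine name from the entity it is attached to and a
// short description, then enforces the 32 character limit.
function build_field_name(string $entity, string $description): string {
  $clean = static function (string $part): string {
    // Lowercase, then collapse anything that isn't a-z or 0-9 into "_".
    return trim(preg_replace('/[^a-z0-9]+/', '_', strtolower($part)), '_');
  };
  $name = 'field_' . $clean($entity) . '_' . $clean($description);
  if (strlen($name) > 32) {
    throw new LengthException("$name is " . strlen($name) . ' characters; the limit is 32.');
  }
  return $name;
}

echo build_field_name('page', 'Subtitle'), PHP_EOL;
echo build_field_name('event', 'start datetime'), PHP_EOL;
```

Throwing an exception for long names is a design choice for the sketch; it forces the abbreviation decision to be made by a person rather than silently truncating the name.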

Re-Using Fields

Finally, after naming content entities and fields it is worth thinking about re-using fields. Drupal will allow you to add the same field to different types of the same content entity. This means that the field you created for a Page content type can be re-used on an Article content type. You can't share fields between different types of entity, so you can't add the same field to a paragraph and a content type.

If you intend to re-use a field then you should think about making the name of the field more generic. For example, let's say that you have a field that is used to store the subtitle as a text field on a Page content type. If you intend to re-use this field then you can name it with the generic content entity name instead of the content type. In this instance the field would be called field_node_subtitle.

It is also acceptable to leave out the content entity name in the field name. Drupal itself does this with a lot of fields, like field_comments or field_tags, which are part of the standard Drupal install profile. The fact that you can't share the field between different content entities means the confusion is somewhat reduced.

I would, however, still try to name the field in line with the type of content entity being used. For example, instead of field_page_subtitle you would call it field_node_subtitle to show that this is a node field. This will really help if you start dealing with entity relations as you might have node, paragraph, media and taxonomy entities all in the same context.

To understand re-use correctly, it is important to understand how a field is stored in Drupal. As I mentioned above there is a field storage configuration that tells Drupal the data type of the field and a field instance configuration that tells Drupal that this field is attached to a certain content entity. Drupal keeps track of field re-use by having one storage configuration and multiple field instance configurations. A single table in the database will be used for the field, even if it is present on multiple content entities.

The single table approach to data storage is an important consideration in the re-use of fields on your site. If you add a field to more than one content entity, then both content entities will store data in the same table. If you are happy with different content items writing data to a single table then that's fine.
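The relationship between the two kinds of configuration can be pictured with a short plain PHP sketch. The array keys follow the naming pattern Drupal uses for these configuration items, but the values are heavily trimmed and the field_data_table() helper is hypothetical.

```php
<?php

// One storage definition for the field, shared by every content type.
$storage = [
  'field.storage.node.field_node_subtitle' => [
    'entity_type' => 'node',
    'field_name' => 'field_node_subtitle',
    'type' => 'string',
    'cardinality' => 1,
  ],
];

// One field instance per content type that uses the field.
$instances = [
  'field.field.node.page.field_node_subtitle' => ['bundle' => 'page', 'label' => 'Subtitle'],
  'field.field.node.article.field_node_subtitle' => ['bundle' => 'article', 'label' => 'Subtitle'],
];

// The data table depends only on the storage, not on the content type.
function field_data_table(array $storage_item): string {
  return $storage_item['entity_type'] . '__' . $storage_item['field_name'];
}

$table = field_data_table($storage['field.storage.node.field_node_subtitle']);
foreach ($instances as $instance) {
  echo $instance['bundle'], ' -> ', $table, PHP_EOL;
}
```

Both content types end up writing to node__field_node_subtitle, which is why settings that live on the storage (such as the data type and cardinality) cannot differ between them.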

Before you go adding the same field across all of your content entities you need to think about how they will be used. What you want to do is add the same field (i.e. with the same form and function) to different content entities in order to duplicate functionality. You can quite easily create confusion when you have a field on one content type that does one thing, and change the label or template of the field on another content type to do something else. This means that developers and users have to know that a field in a certain context does something different.

It is quite tempting to add a field to another content entity and change the label. For example, you might have a "subtitle" field on a content type and re-use it, but change the label to "author name". Although this is still a text field, what you have essentially done here is change the function of that field. Developers might get confused as to why a field called field_node_subtitle is being used to store author names.

This difficulty goes beyond the label though, as a developer may create a twig template for a field and find that they have to add additional logic to stop the template being used for fields on certain content types. Changing what the field does on other content types can cause confusion, which will ultimately lead to bugs.

Whilst it is also possible to alter the output of the field for each content type it's probably a good idea to keep them the same. Again, try to make the field behave the same everywhere it is used. This feeds back into the "create once, use many" way of thinking.

Common fields that share functionality like titles, hero images, taxonomy terms, or even comments are good candidates for sharing between content entities. They generally stay the same between different types of content. It is essential to have the same field on different content entities act and behave the same.

Here are some tips on maintaining best practice on reusing fields.

  • Do re-use the field if it will have the exact same form and function wherever it is used.
  • Do think about the long term use of the field. If you re-use a number field on a content entity and that needs to be changed to a text field then you will need to be careful with that change. You need to write some migration code, but only for that field in a certain context.
  • Don't change the form or function of the field if it is re-used on different content entities. This includes changing the label of the field as this can be confusing to users and developers alike.
  • Don't change the cardinality (i.e. number of items) of the field after the fact as this will change the cardinality for all fields of that type. This won't delete any data but will prevent items from appearing in content forms, which can lead to data loss if the item is saved.
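If you want to audit a site for re-used fields whose label has drifted, a check along the following lines is one option. This is a plain PHP sketch working on a simple list of arrays; the function name and the input shape are assumptions, and on a real site you would need to build that list from the field configuration yourself.

```php
<?php

// Returns the names of fields that have more than one distinct label
// across the content types they are attached to.
function fields_with_mixed_labels(array $instances): array {
  $labels = [];
  foreach ($instances as $instance) {
    // Use the label as an array key so duplicates collapse automatically.
    $labels[$instance['field_name']][$instance['label']] = TRUE;
  }
  $mixed = [];
  foreach ($labels as $field_name => $seen) {
    if (count($seen) > 1) {
      $mixed[] = $field_name;
    }
  }
  return $mixed;
}

$instances = [
  ['field_name' => 'field_node_subtitle', 'bundle' => 'page', 'label' => 'Subtitle'],
  ['field_name' => 'field_node_subtitle', 'bundle' => 'article', 'label' => 'Author name'],
  ['field_name' => 'field_tags', 'bundle' => 'page', 'label' => 'Tags'],
  ['field_name' => 'field_tags', 'bundle' => 'article', 'label' => 'Tags'],
];

// Only field_node_subtitle is reported.
print_r(fields_with_mixed_labels($instances));
```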

Conclusion

This can be a complex topic, but the general rule is to keep things as simple as possible. Name things according to their function and don't alter things to introduce surprises as no one will thank you in the future. Avoid confusing names and situations and don't always accept the auto-generated machine name that Drupal gives you.

When planning Drupal content entities and fields I always think about the phrase "create once, use many". This means that a single item of content can be re-used around the site without the users having to manually add it.

Think about the long term use of the entity or field and what developers will find. Analyse the bugs and issues on your site to see if they are connected to poorly named elements or re-used fields. You might be able to solve these problems by changing names or removing confusing fields.

There is also a Drupal module available called Naming Conventions that deals with some aspects of what I have talked about here. This module is currently Drupal 8 only, but includes the ability to disable the machine name autocomplete field and to add help about the naming conventions in use to different routes on the site. It won't enforce the naming conventions though; that is still up to you, but it will help educate developers on how to name things.

Have I missed anything out here? What rules do you use in your Drupal sites when naming content entities and fields? Have you seen any bugs created from poorly named fields? Please comment below and let me know. Also, feel free to use these tips in your developer documentation.

Jul 25 2021

I have previously talked about configuring a Drupal site to authenticate against a remote SimpleSAMLphp install, but as Drupal is an excellent user management system I wanted to turn it around and use Drupal as the identity provider. This means that Drupal would allow users to log into other systems using their Drupal username and password by leveraging the power of SimpleSAMLphp.

This can be accomplished by wrapping the Drupal site and SimpleSAMLphp together along with a couple of modules to power the communication between the two systems.

The same terms apply as I described in the previous post, but to reiterate their meaning in this context I will go over them again.

SP - Service Provider - This is the system that users are trying to log into, which in this setup is some other site or service. Service providers will generally create a local user to track the user within the site and in this setup the user will be a Drupal user.

IdP - Identity Provider - The Drupal system holds information about the users and is therefore called an identity provider as it provides the identity of the user. This is used by the Service Provider (SP) to authenticate the user.

I'm going to assume that you have a Drupal site already installed via composer, preferably using the Drupal recommended composer file. This will be the basis of the rest of the article.

Installing SimpleSAMLphp

To get this working we need to require SimpleSAMLphp in the same project as your Drupal site. The first step, therefore, is to require SimpleSAMLphp as a project dependency, which will install SimpleSAMLphp alongside Drupal.

composer require simplesamlphp/simplesamlphp

In order to allow SimpleSAMLphp to communicate with Drupal we need to install a SimpleSAMLphp module called drupalauth. This is required as another composer dependency.

composer require drupalauth/simplesamlphp-module-drupalauth

This should install the module in your SimpleSAMLphp module directory inside your vendor directory (i.e. in the directory vendor/simplesamlphp/simplesamlphp/modules/drupalauth).

The SimpleSAMLphp application needs to be served as a standalone application. This means that Drupal will be the main application (served from the root directory) and SimpleSAMLphp will be served from a sub directory. I chose to serve it from the path /idp.

The www directory in the SimpleSAMLphp vendor directory needs to be exposed as the /idp path. I have tried a few methods of doing this, but I have found the best and most reliable approach is just to create a symlink between the www directory and your chosen path in your web root.

Navigate to your Drupal web directory and run the following to create that symlink.

ln -s ../vendor/simplesamlphp/simplesamlphp/www idp

The last thing to do is follow through the rest of the instructions on configuring the application. I have detailed these instructions in a separate post that deals with installing SimpleSAMLphp as a composer project. The project structure should be the same once you have completed setting things up.

The slight difference to those instructions is that you need to configure SimpleSAMLphp to understand that it lives in a separate directory. As this is different from the default path we need to alter the baseurlpath setting to point to the correct place.

'baseurlpath' => '/idp',

Your project should have the following structure, with 'web' being used as the Drupal directory.

simplesamlphp/
- dev/
- - certs/
- - config/
- - metadata/
- prod/
- - certs/
- - config/
- - metadata/
vendor/
- simplesamlphp/
- - simplesamlphp/
- - - modules/
- - - - drupalauth/
web/
- autoload.php
- core/
- .htaccess
- idp -> ../vendor/simplesamlphp/simplesamlphp/www
- index.php
- modules/
- profiles/
- sites/
- themes/
- update.php
composer.json
composer.lock

Note that I have removed some of the items above to keep things brief.

Configure SimpleSAMLphp

Now that SimpleSAMLphp is installed we need to configure it to work with the drupalauth module. This follows on from the instructions in my previous article on getting SimpleSAMLphp installed using composer.

A requirement of the drupalauth module is a database connection. This doesn't need to be the same database that Drupal is installed in, but it does require changing some settings in the config.php file.

    'store.type' => 'sql',
    'store.sql.dsn' => 'mysql:host=localhost;dbname=drupal',
    'store.sql.username' => 'drupal',
    'store.sql.password' => 'drupal',

If you have the database details correct then you will see the tables SimpleSAMLphp_kvstore and SimpleSAMLphp_tableVersion created in your database. This is assuming that your 'store.sql.prefix' setting is set to the default of 'SimpleSAMLphp'.

You now need to enable the drupalauth module in SimpleSAMLphp. This is done by editing the config.php file and adding the drupalauth entry to the 'module.enable' option.

     'module.enable' => [
         'core' => true,
         'saml' => true,
         'drupalauth' => true,
     ],

With that in place we can add our Drupal site authentication details to the authsources.php file. This needs to reference the Drupal web root, the login and logout links as well as any attributes you want to pass between Drupal and the service providers when the user authenticates. In the example below we are returning the Drupal user ID, name and email address in the authentication package. The site www.ssotest.local is just a locally created domain for this example.

  'drupal-userpass' => [
    'drupalauth:External',

    // The filesystem path of the Drupal directory.
    'drupalroot' => '/var/www/drupaltest/web',

    // Whether to turn on debug
    'debug' => true,

    // Cookie name. Set this to use a cache-busting cookie pattern
    // (e.g. 'SESSdrupalauth4ssp') if hosted on Pantheon so that the cookie
    // is not stripped away by Varnish. See https://pantheon.io/docs/cookies#cache-busting-cookies .
    'cookie_name' => 'SESSdrupalauth4ssp',

    // the URL of the Drupal logout page
    'drupal_logout_url' => 'https://www.ssotest.local/user/logout',

    // the URL of the Drupal login page
    'drupal_login_url' => 'https://www.ssotest.local/user/login',

    // Which attributes should be retrieved from the Drupal site.
    'attributes' => [
      ['field_name' => 'uid', 'attribute_name' => 'uid'],
      ['field_name' => 'name', 'attribute_name' => 'cn'],
      ['field_name' => 'mail', 'attribute_name' => 'mail'],
    ],

  ],
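To make the 'attributes' mapping a little more concrete, here is a plain PHP sketch of the idea. It is not the drupalauth module's code; map_user_attributes() and the sample user record are invented for illustration. It takes the same field_name/attribute_name pairs and turns a user record into the kind of attribute array that gets passed on to a service provider.

```php
<?php

// Maps values from a Drupal-like user record onto SAML attribute names
// using field_name/attribute_name pairs like those in authsources.php.
function map_user_attributes(array $user, array $mapping): array {
  $attributes = [];
  foreach ($mapping as $map) {
    $field = $map['field_name'];
    if (isset($user[$field])) {
      // Attributes are multi-valued, so each value is wrapped in an array.
      $attributes[$map['attribute_name']] = [(string) $user[$field]];
    }
  }
  return $attributes;
}

$mapping = [
  ['field_name' => 'uid', 'attribute_name' => 'uid'],
  ['field_name' => 'name', 'attribute_name' => 'cn'],
  ['field_name' => 'mail', 'attribute_name' => 'mail'],
];

// Hypothetical user record standing in for a loaded Drupal user.
$user = ['uid' => 42, 'name' => 'alice', 'mail' => 'alice@example.com'];

print_r(map_user_attributes($user, $mapping));
```

Note how the Drupal 'name' field arrives at the service provider as 'cn'; the attribute names on the right hand side are the ones the SP needs to be configured to read.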

We now need to inform SimpleSAMLphp of our intent to use it as an IdP. This is done by editing the metadata file saml20-idp-hosted.php and adding an array that details the name, certificates and authentication source for the IdP. The certificates in this array are the same certificates created for SimpleSAMLphp in the previous article on setting up Drupal as an SP. The authentication name of 'drupal-userpass' must match the authentication source added in the previous step.

$metadata['ssotestdrupal'] = [
    'host' => '__DEFAULT__',
    'privatekey' => 'simplesaml.pem',
    'certificate' => 'simplesaml.crt',
    'auth' => 'drupal-userpass',
];

With this configuration array in place, head over to the "Federation" tab in your SimpleSAMLphp setup, which is at the path /idp on your site. You should now see your IdP in the list. This is viewable even without being logged in as the administrator of the application. Click "Show metadata" and you'll see a bunch of output. This consists of your metadata URL, the SAML 2.0 metadata XML, a PHP representation of the same and the public x509 certificate we created above that is used to encrypt/decrypt the data.

Copy the PHP output and paste it into saml20-idp-remote.php in your metadata directory. It should contain something like the following output.

$metadata['ssotestdrupal'] = [
  'metadata-set' => 'saml20-idp-remote',
  'entityid' => 'ssotestdrupal',
  'SingleSignOnService' => [
    [
      'Binding' => 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect',
      'Location' => 'https://www.ssotest.local/idp/saml2/idp/SSOService.php',
    ],
  ],
  'SingleLogoutService' => [
    [
      'Binding' => 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect',
      'Location' => 'https://www.ssotest.local/idp/saml2/idp/SingleLogoutService.php',
    ],
  ],
  'certData' => '[REMOVED]',
  'NameIDFormat' => 'urn:oasis:names:tc:SAML:2.0:nameid-format:transient',
  'contacts' => [
    [
      'emailAddress' => '[email protected]',
      'contactType' => 'technical',
      'givenName' => 'Administrator',
    ],
  ],
];

I have removed the certificate here as it is of no use really.

That is SimpleSAMLphp configured, so let's move onto configuring Drupal.

Configure Drupal As An IdP

A requirement of getting this all working in Drupal is the use of the drupalauth4ssp module. This is installed via composer.

composer require drupal/drupalauth4ssp

Once the module is installed you can head over to the configuration page for the module at the path /admin/config/people/drupalauth4ssp. There isn't much to configure here; you just need to make sure that the Authsource field matches the authentication source you set up in the authsources.php file. By default this is drupal-userpass.

Drupal auth 4 SimpleSAMLphp administration page, showing default settings.

You can optionally add URLs to the list of allowed ReturnTo parameters. This is used by the authentication system to push users back to their originating sites (i.e. the SP they are trying to log into) and so can be an added security measure to prevent unwanted sites from using your site as an IdP.
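The idea behind an allowed ReturnTo list can be sketched in plain PHP as follows. This is only an illustration of the concept and not how the drupalauth4ssp module implements it; the return_to_is_allowed() function, the "*" wildcard handling and the sp.ssotest.local host name are all assumptions.

```php
<?php

// Checks a ReturnTo URL against a list of allowed patterns, where "*"
// matches any run of characters. Hypothetical helper for illustration.
function return_to_is_allowed(string $url, array $allowed_patterns): bool {
  foreach ($allowed_patterns as $pattern) {
    // Escape the pattern for use in a regex, then turn "*" back into ".*".
    $regex = '#^' . str_replace('\*', '.*', preg_quote($pattern, '#')) . '$#';
    if (preg_match($regex, $url) === 1) {
      return TRUE;
    }
  }
  return FALSE;
}

$allowed = ['https://sp.ssotest.local/*'];

var_dump(return_to_is_allowed('https://sp.ssotest.local/saml/acs', $allowed));
var_dump(return_to_is_allowed('https://evil.example/saml/acs', $allowed));
```

With a scheme like this it matters that the pattern includes the slash after the host name, otherwise a wildcard could also match a longer, unrelated host that merely starts with the same characters.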

That's pretty much it in terms of setting up Drupal as an IdP. The drupalauth4ssp Drupal module is pretty much plug and play.

Of course, this setup does absolutely nothing on its own; you need a system to be a service provider so that you can log into it. Let's do that.

Configuring A Drupal SP To Log Into The Drupal IdP

What we want to do is set up another Drupal site as an SP so that users from the Drupal IdP can log in to the SP using credentials created in the IdP. Essentially, if users want to log into site 2 (i.e. the SP), they must authenticate against site 1 (i.e. the IdP).

The instructions here are very similar to the instructions in the previous article on setting up Drupal to be an SP, so you should install and configure the module in a similar way. In this instance, however, we have a much better IdP than a standard SimpleSAMLphp install (i.e. Drupal + SimpleSAMLphp) and so we just need to repoint the Drupal SP site at our IdP. This only requires altering the SAML Authentication module's settings to change the identity provider settings and the user info and syncing settings.

The settings for the identity provider in the SAML authentication module are pretty straightforward (even though they don't look like it). You need to copy the values from the IdP settings we created in the saml20-idp-remote.php file. The fields are labelled well so that it's possible to match up what item goes where.

The following is a screenshot of the settings from the array in saml20-idp-remote.php added to the form.

Drupal SAML auth administration page, showing the identity provider section.
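
As a rough guide to matching things up, a saml20-idp-remote.php entry generally has the shape shown below. This is only a sketch: idp.example.com is a placeholder host, the certificate value is a stand-in, and the form fields named in the comments are assumptions based on the labels in the SAML Authentication module, so check them against your own form.

```php
<?php
// metadata/saml20-idp-remote.php (sketch; idp.example.com is a placeholder).
$metadata['https://idp.example.com/simplesaml/saml2/idp/metadata.php'] = [
  // Goes into the identity provider entity ID field.
  'entityid' => 'https://idp.example.com/simplesaml/saml2/idp/metadata.php',
  // The Location goes into the single sign on service URL field.
  'SingleSignOnService' => [
    [
      'Binding' => 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect',
      'Location' => 'https://idp.example.com/simplesaml/saml2/idp/SSOService.php',
    ],
  ],
  // The Location goes into the single logout service URL field.
  'SingleLogoutService' => [
    [
      'Binding' => 'urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect',
      'Location' => 'https://idp.example.com/simplesaml/saml2/idp/SingleLogoutService.php',
    ],
  ],
  // Goes into the X.509 certificate field (stand-in value, not a real certificate).
  'certData' => 'BASE64-CERTIFICATE-FROM-YOUR-IDP',
];
```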

Next is the user information and syncing section. Here, we just need to enter the user name and user email attributes to match the fields we set up in the authsources.php file. In this case the username is 'cn' and the user email address is 'mail'. The other options here are up to you to enable or disable. The settings in the screenshot below are pretty open and allow more control over user field matching than should be enabled in a production environment. That said, it's up to you how open you make the settings for your install.

Drupal SAML auth administration page, showing the user synchronisation section.

Finally, we need to copy the configuration from the Drupal SP into the Drupal IdP. This is detailed in the previous article under the section "Adding Drupal Configuration To SimpleSAMLphp". You should go to the path /saml/metadata in the SP site, copy the XML from that page, and convert it to a PHP array in SimpleSAMLphp. The key difference is that you need to add the PHP array to the saml20-sp-remote.php file in your IdP SimpleSAMLphp instance.
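
The converted array ends up looking something like the following. This is a minimal sketch that assumes a hypothetical SP host of sp.example.com, an SP entity ID set to its metadata URL, and the /saml/acs and /saml/sls paths provided by the SAML Authentication module. The output of the SimpleSAMLphp metadata converter for your own site will contain more detail and should be preferred over this.

```php
<?php
// metadata/saml20-sp-remote.php on the IdP (sketch; sp.example.com is a placeholder).
// The array key is the entity ID of the Drupal SP (assumed to be its metadata URL).
$metadata['https://sp.example.com/saml/metadata'] = [
  // Where the IdP sends the SAML response after a successful login.
  'AssertionConsumerService' => 'https://sp.example.com/saml/acs',
  // Where the IdP sends logout requests and responses.
  'SingleLogoutService' => 'https://sp.example.com/saml/sls',
];
```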

Now, when users want to log into the Drupal SP site they will log into the Drupal IdP site and be redirected back to the SP site and authenticated there too. If the user isn't known to the SP site then the attributes sent over by the IdP are used to create the user and fill in their details (assuming they also exist on the IdP site).

Following these instructions, there is nothing to stop us from setting up multiple SP instances of Drupal and allowing them all to authenticate against the central Drupal IdP install.
