Jun 18 2020

Drupal 7 to 9 Upgrade

If you're one of the 70% of Drupal sites that are still on Drupal 7 at the time of this writing, you may be wondering what the upgrade path looks like to go from Drupal 7 to Drupal 9. What does the major lift look like to jump ahead two Drupal versions? How is this different than if you'd upgraded to Drupal 8 sometime in the last few years? And how long will it be before you have to do it again?

Before the release of Drupal 9, the best path for Drupal 7 sites to upgrade to Drupal 9 was to upgrade to Drupal 8. The big selling point in Drupal 9's evolution is that updating from a late version of Drupal 8 to Drupal 9.0 is more like an incremental upgrade than the massive replatforming effort that the older Drupal migrations used to entail. Sites that jumped on the Drupal 8 bandwagon before Drupal 9.0 was released could benefit from the simple upgrade path from Drupal 8 to Drupal 9.0 instead of another big migration project.

Migrating to Drupal 8 is still a good option for Drupal 7 sites, even though Drupal 9 is now out.

You might find that essential modules or themes you need are ready for Drupal 8 but not yet available for Drupal 9. The Drupal 8 to Drupal 9 upgrade path for many modules and themes should be relatively trivial, so many of them should be ready soon. But, there could be some outliers that will take more time. In the meantime, you can do the heavy lift of the Drupal 7 to Drupal 8 migration now, and the simpler Drupal 8 to Drupal 9 upgrade later, when everything you need is ready.

The Drupal 7 to Drupal 8 migration

The Drupal 7 to Drupal 8 upgrade involves some pretty significant changes. Some of the things you previously needed contributed modules for in Drupal 7 are now included in Drupal 8 core. However, the way you implement them may not be the same, so some refactoring might be required to reach feature parity when you migrate to Drupal 8.

The migration itself isn't a straight database upgrade like it was in Drupal 6 to Drupal 7; instead, you migrate your site configuration and site content to Drupal 8. You can do it in one of two ways:

  1. Migrate everything, including content and configuration, into an empty Drupal 8 installation (the default method).
  2. Manually build a new Drupal 8 site, setting the content types and fields up as you want them, and then migrate your Drupal 7 content into it. 

For a deeper dive into what these migrations look like, check out An Overview for Migrating Drupal Sites to 8.

Planning migrations

The Migration Planner is a helpful tool to consider in your migration planning process. It queries a database to generate an Excel file that project managers or technical architects can use to plan migrations; developers performing the migrations can then work from those spreadsheets.

Performing migrations

Core comes with some capability to migrate content automatically. If your site sticks to core and common contributed content types and fields, you may be able to use these automatic migrations. However, if your site relies heavily on contributed modules or custom code, an automatic migration might not be possible; you may need a custom migration approach.

The Drupal Migrate Upgrade, Migrate Plus, and Migrate Tools modules are good starting points for performing a custom migration. They add things like Drush support for the migration tasks and migration support for some non-core field types. They also provide several migration process plugins that make fairly complex migrations possible with just a couple of lines in a YAML file; for example, an entity_lookup processor can take text from Drupal 7 content and look up the Drupal 8 entity that text refers to.
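As an illustration, here is a minimal sketch of the kind of YAML this involves, using the entity_lookup process plugin provided by Migrate Plus. The migration file name, source field, and destination vocabulary are hypothetical placeholders.

# Hypothetical fragment of a migration file, e.g. migrate_plus.migration.d7_article_tags.yml.
# entity_lookup maps incoming text from the Drupal 7 source to an existing
# Drupal 8 entity instead of creating a new one.
process:
  field_tags:
    plugin: entity_lookup
    source: tag_name          # text value coming from the Drupal 7 row
    entity_type: taxonomy_term
    bundle_key: vid
    bundle: tags              # assumed destination vocabulary
    value_key: name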

Drupal 7 works on older versions of PHP but recommends a minimum of 7.2. If you're migrating from an older Drupal 7 site, there may be several other platform requirements to investigate and implement. 

Tooling and paradigm shifts

With the change to Drupal 8, developers are also expected to use new tools. You now use Composer to add modules and their dependencies, rather than Drush. Twig has replaced PHPTemplate as the default templating engine. Some core paradigms have shifted; for instance, developers need to learn to think in terms of events, or extending objects, instead of the old system of hooks. Many hooks still work, but they will probably be deprecated over time, and the new methods are safer ways to write code. The changes aren't insurmountable, but your organization must invest in learning the new way of doing things. You'll need to account for this education overhead when coming from Drupal 7; development teams may need more time to complete tasks as they learn new tools and paradigms.

Drupal 8's deprecation model

In addition to big changes in Drupal 8 core and implementation details, Drupal 8 also features a deprecation model that's familiar in the software world, but new in Drupal version upgrades. Instead of deprecating a bunch of code when there's a major version upgrade, Drupal 8 has introduced a gradual deprecation model. 

As features and improvements are made in Drupal 8's codebase, old methods and functions are marked as deprecated within the code. Then, a few versions later - or in Drupal 9 - that code is removed. This gives development teams a grace period of backward compatibility, during which they can see alerts that code is deprecated, giving organizations time to implement the new code before it's completely removed. 

The deprecated code also provides an easy hint about how to rework your code using new services and methods. Just look at what the deprecated function or method does, and do that directly in your code.
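A well-known example: drupal_set_message() was deprecated during the Drupal 8 cycle and removed in Drupal 9, and the deprecation notice points straight at the messenger service that replaces it.

<?php

// Deprecated in Drupal 8 and removed in Drupal 9:
drupal_set_message(t('Settings saved.'));

// What the deprecated wrapper calls internally, and what you should call directly:
\Drupal::messenger()->addMessage(t('Settings saved.'));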

This gradual deprecation model is one of the core reasons that the Drupal 9 upgrade is more like a minor version release for Drupal 8 than a major replatforming effort.

With that said, can you jump ahead from Drupal 7 to Drupal 9? If you want to skip over Drupal 8 entirely, you can jump directly to Drupal 9. The Drupal 7 migration ecosystem is still available in Drupal 9. Drupal 9 contains the same migrate_drupal module you need to migrate to Drupal 8. There has been discussion around possibly moving this module to a contributed module by Drupal 10, although no decision has been made at the time of this writing.

If you intend to go this route, keep in mind that all of the considerations when upgrading from Drupal 7 to Drupal 8 apply if you jump straight to Drupal 9, as well. You'll still have to manage the migration planning, deal with tooling and paradigm shifts, and consider platform requirements.

Ultimately, however, jumping directly from Drupal 7 to Drupal 9 is a valid option for sites that haven't migrated to Drupal 8 now that Drupal 9 is released. 

Whichever route you choose, whether you're going to migrate via Drupal 8 or straight to Drupal 9, you should start the migration from Drupal 7 to Drupal 9 as soon as possible. Both Drupal 7 and Drupal 8 will reach end-of-life in November 2021, so you've got less than a year and a half to plan and execute a major platform migration before you'll face security implications related to the end of official Drupal security support. We'll cover that in more detail later in this series. 

For any site that's upgrading from Drupal 7, you'll need to do some information architecture work to prepare for the migration to Drupal 8 or Drupal 9. Once you're on Drupal 8, though, the lift to upgrade to Drupal 9 is minimal; you'll need to look at code deprecations, but there isn't a major content migration to worry about. Check out our Preparing for Drupal 9 guide for more details around what that planning process might look like.

But what about waiting for a later, more stable version of Drupal 9, you ask? This is a common strategy in the software world, but it doesn't apply to the Drupal 9 upgrade. Because Drupal 9 is being handled more like an incremental point-release upgrade to Drupal 8, there aren't any big surprises or massive swaths of new code in Drupal 9. The core code that powers Drupal 9 is already out in the world in Drupal 8. There are no new features in the Drupal 9.0 release; just the removal of code that has already been deprecated in minor versions of Drupal 8.

Going forward, the plan for Drupal 9 is to release new features every six months in minor releases. The intent is for these features to be backward compatible, and to bring Drupal into the current era of iterative development versus the major replatforming projects of olde. There aren't any big surprises or major reliability fixes on the horizon for Drupal 9; just continued iteration on a solid platform. So there's no need or benefit to waiting for a later version of Drupal 9!

Plan for migration

Planning for a Drupal 7 to Drupal 8 or Drupal 7 to Drupal 9 migration becomes a question of scope. Do you just want to migrate your existing site's content into a modern, secure platform? Or are you prepared to make a bigger investment to update your site by looking at information architecture, features, and design? Three factors that will likely shape this decision-making process include:

  • Time and budget
  • Developer skillset
  • Release window

Time and budget for a migration

How much time are you able to allocate for what is likely to be a major replatforming effort? What's your budget for the project? Do you need to launch before an important date for your organization, such as college registration or an important government deadline? Can your budget support additional work, such as a design refresh? 

For many organizations, getting the budget for a large project is easier as a one-time ask, so doing the design refresh as part of the migration project may be easier than migrating, and then planning a separate design project in six months. In other organizations, it may be difficult to get enough budget for all the work in one project, so it may be necessary to spread the project across multiple phases; one phase for the migration, and a separate phase for design.

When factoring in the time and budget for additional work, keep in mind that things like revisiting a site's information architecture could save you time and money in the migration process. Budgeting the time to do that work up-front can dramatically reduce time and cost later in the migration by removing unnecessary complexity before you migrate, instead of writing custom migrations to bring over content and entity types you no longer use. This also improves maintainability and saves time for developers and editors doing everyday work on the new site.

Consider developer skills when planning your migration

10 years is a long time for developers to be working with a specific framework. If you've been on Drupal 7 since 2011, your developers are likely very experienced with "the Drupal 7 way" of doing things. Many of those things change in Drupal 8. This is a big factor in developer resistance around upgrading to Drupal 8 and Drupal 9.

Composer, for example, is a huge change for the better when it comes to managing dependencies. However, developers who don't know how to use it will have to learn it. Another big difference is that a lot of Drupal 8 and Drupal 9's core code is built on top of Symfony, which changes many of the mental paradigms experienced Drupal developers are accustomed to. While some things may seem unchanged - a Block is still a Block, for example - the way they're implemented is different. Some things don't look the same anymore; developers will encounter things like defining menu items in YAML files instead of hooks (a small sketch follows). Even debugging has changed; simple debugging via print() statements doesn't always cut it in the new world, so many developers are using IDEs like PhpStorm, or a host of plugins with other editors, just to code effectively in newer versions of Drupal.
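For instance, where a Drupal 7 module would register an administration page link in hook_menu(), a Drupal 8 or 9 module declares it in a *.links.menu.yml file. A minimal sketch, with a hypothetical module and route name (the route itself would be defined in the module's routing file):

# mymodule.links.menu.yml - module, route, and titles are hypothetical.
mymodule.settings:
  title: 'My module settings'
  description: 'Configure the example module.'
  route_name: mymodule.settings
  parent: system.admin_config_system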

All of this change comes with overhead. Developers must learn new tools and new ways of doing things when switching from Drupal 7 to Drupal 9. That learning curve must be factored into time and budget not only for the migration itself but for ongoing development work and maintenance after the upgrade. Progress during sprints will likely slow, and developers may initially feel resistant or frustrated while they learn the new ways of doing things.

Bringing in outside help during the migration process can mitigate some of this learning overhead. Partnering with an experienced Drupal development firm means your migration can be planned and implemented more quickly. When selecting an outside partner, consider how closely they will collaborate with your internal team and whether their developers can "teach" your team the new ways of doing things. This reduces the learning curve for a team that's heavily experienced with older Drupal versions and can help your team get up to speed more quickly - saving money during the first year of your new site.

Plan a release window

The other aspect of planning for the Drupal 7 to Drupal 9 upgrade is planning a release window. Plan to have your migration project complete before Drupal 7 is scheduled to reach end-of-life in November 2021. If you can't make that deadline, then start planning now for an Extended Support engagement to keep your site secure until you're able to complete the migration.

You'll want to plan the release window around key dates for your organization, and around other support windows in your stack. For example, if you're a retailer, you may want to have the migration completed before the end of Q3 so you're not upgrading during holiday initiatives. Education organizations may plan their release during slow periods in the school's calendar, or government websites may need to be ready for key legislation. 

When it comes to your stack, you'll want to plan around other important release windows, such as end-of-support for PHP versions, or upgrading to Symfony 4.4. This is particularly important if you need to upgrade dependencies to support your Drupal 7 to Drupal 9 migration. Check out Drupal 8 Release Planning in the Enterprise for more insights about release planning.

Revisit information architecture, features, and design

Because the jump from Drupal 7 to Drupal 9 is so substantial, this is a good time to revisit the information architecture of the site, do a feature audit, and consider whether you want to make design changes. 

Is it time to update your site's information architecture?

Before you jump into a Drupal 9 upgrade project, you should perform an audit of your existing Drupal 7 site to see what you want to carry forward and what you can lose along the way. Did you set up a content type that you only used once or twice, and never touched again? Maybe you can delete that instead of migrating it. Are you using a taxonomy that was set up years ago, but no longer makes sense? Now is a good time to refine that for the new version of your site.

Content migration is also a relatively easy time to manipulate your data. You can migrate Drupal 7 nodes or files into Drupal 9 media entities, for instance. Or migrate text fields into address fields or list fields into taxonomy terms. Or merge multiple Drupal 7 content types into a single Drupal 9 content type. Or migrate content from a deprecated Drupal 7 field type into a different, but supported, Drupal 9 field type. These kinds of things take a bit more work in the migration, but are completely possible with the Migration toolset, and are not difficult for developers with migration experience. The internet is full of articles about how to do these kinds of things.

In addition to the fine details, it's also a good time to take a look at some big-picture questions, like who is the site serving? How has this changed since the Drupal 7 version of the site was established, and should you make changes to the information architecture to better serve today's audience in the upcoming Drupal 9 site? 

Have your feature needs changed?

Drupal 7 was released in 2011. Nearly a decade later, in 2020, the features that seemed important at Drupal 7's inception have changed. How have the feature needs of your content editors changed? Has your site become media-heavy, and do your content editors need large searchable image archives? Do you want to deliver a dynamic front-end experience via a bespoke React app, while giving content editors a decoupled Drupal framework to work in? 

Many editors love the new Layout Builder experience for creating customized site pages. It's something that doesn't exist in Drupal 7 core and is arguably better than what you get even when you extend Drupal 7 with contributed modules. Drupal 8 and 9 have built-in media handling and a WYSIWYG editor, eliminating the need for dozens of Drupal 7 contributed modules that do not always cooperate with each other, and focusing developer attention on the editorial UX for a single canonical solution.

Revisit the needs of your content editors and site users to determine whether any existing features of the current site are no longer important and whether new feature needs warrant attention in the upgrade process. This could be particularly helpful if you find that current features being provided by contributed modules are no longer needed; then you don't have to worry about whether a version of those modules is available in Drupal 8/9, and can deprecate those modules.

Ready for a design update?

If your Drupal 7 site hasn't had a design refresh in years, the upgrade project is a good opportunity to plan one; schedule the refresh itself for after the upgrade is complete. Drupal 9 will have a new default theme, Olivero, which features a modern, focused design that is flexible and conforms with WCAG AA accessibility guidelines. Olivero has not yet been added to Drupal core - it's targeted for 9.1 - but it is available now as a contributed theme that any Drupal 8 or Drupal 9 site can use. Olivero is a great starting point for sites that want an updated design.

If you're planning a custom design project, keep accessibility and simplicity at the forefront of your design process. You may want to engage in the design discovery process with a design firm before you plan your Drupal 9 release; a good design partner may make recommendations that affect how you proceed with your migration.

Perform the migration

The process of migrating from Drupal 7 to Drupal 8 has improved since Drupal 8's initial release, but it can still be an intricate and time-consuming process for complex sites. We wrote An Overview for Migrating Drupal Sites to 8 to provide some insight into this process, but at a high level, upgrading sites must:

  • Plan the migration
  • Generate or hand-write migration files
  • Set up a Drupal 8 site to actually run migrations
  • Run the migrations
  • Confirm migration success
  • Do some migration cleanup, if applicable

Unlike prior Drupal upgrades, migrating to Drupal 8 isn't an automatic upgrade. A Drupal 7 site's configuration and content are migrated separately into a new Drupal 8 site. There are tools available to automate the creation of migration files, but if you've got a complex site that uses a lot of custom code or many contributed modules, automated tools will only get you so far. You'll need to revisit business logic and select new options to achieve similar results, or drop Drupal 7 contributed modules and custom code from your site, to move forward to Drupal 8 and Drupal 9.
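As a rough sketch of what that tooling looks like in practice, with the Migrate Upgrade and Migrate Tools modules installed you can generate and run migrations from Drush. The database URL and legacy site address below are placeholders, and exact command names vary a bit between Drush and module versions.

# Generate migration configuration from the legacy Drupal 7 database (placeholders below).
drush migrate:upgrade --legacy-db-url=mysql://user:password@localhost/drupal7 \
  --legacy-root=https://old.example.com --configure-only

# Review what was generated, then run the migrations and check on them.
drush migrate:status
drush migrate:import --all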

Whether you're going to upgrade to Drupal 8 and then Drupal 9, or migrate directly from Drupal 7 to Drupal 9, these migration considerations and the process itself will be the same. The only difference is whether the new site you migrate content into is a Drupal 8 site or a Drupal 9 site.

Upgrading from Drupal 8 to Drupal 9

If you choose to go through Drupal 8, then once you get to Drupal 8, finishing the migration to Drupal 9 is relatively easy. Upgrade to the latest version of Drupal 8; the upgrade to Drupal 9 requires Drupal 8.8.x or 8.9.x. Along the way, you'll be notified of any deprecated code or contributed modules you'll need to remove before upgrading to Drupal 9. Make sure any custom code is compatible with Drupal 9, and then update the core codebase to Drupal 9 and run update.php.

Voila! The long upgrade process is complete. 

May 27 2020

Drupal 7 to 9 Upgrade

Drupal 7, our much-loved CMS that was released in 2011, is nearing the end of its life. No, that's not hyperbole; Drupal 7 is scheduled to reach end-of-life in November 2021. Drupal 8 has been out for a few years, but at the time of this writing, Drupal core usage statistics indicate that only about 350,000 of the more than 1.1 million reporting Drupal core sites are using Drupal 8.x. Over 730,000 of those sites are still using Drupal 7. If your site is one of those 730,000 still on Drupal 7, should you upgrade to Drupal 9? 

Drupal 7 is coming to an end

Whether or not you choose to upgrade to Drupal 9, it's time to acknowledge one very important truth: Drupal 7 is coming to an end. After a decade in service, Drupal 7 will stop receiving official community support in November 2021, and the Drupal Association will stop supporting Drupal 7 on Drupal.org. Automated testing for Drupal 7 will stop being supported via Drupal.org, and Drupal 7 will no longer receive official security support.

Beyond the loss of support for Drupal 7 core, there is less focus on the Drupal 7 version of many contributed modules. Some of them are quite stable and may work well into the future, but others are more neglected. The reality is that once module maintainers have moved their own sites to Drupal 8 or Drupal 9, they may lose interest in spending the time it takes to keep a Drupal 7 version of their code up to date.

Upgrading from Drupal 7 is harder than from Drupal 8

Drupal 8 was released in November 2015. When the Drupal Association announced Drupal 9, they discussed a big change coming to the Drupal ecosystem: Major Drupal version changes would no longer be a substantial replatforming effort, but would instead be a continuation of an iterative development process. In practice, this means that Drupal 9 is built in Drupal 8, using deprecations and optional updated dependencies. The result is that upgrading from Drupal 8 to Drupal 9 is just an iterative change from the final Drupal 8 version. Drupal 9.0 involves the removal of some deprecated code, but introduces no new features; it's a continuation of the fully-tested, stable codebase that is Drupal 8. Basically, Drupal 9.0 is just another release of Drupal 8. 

On the other hand, Drupal 7 has significant differences from Drupal 8 and 9. The jump from Drupal 7 to Drupal 9 can be an enormous undertaking. Third-party libraries replaced huge swaths of custom Drupal code. The procedural code was reworked into object-oriented code. The code changes were massive. Upgrading a Drupal 7 site to Drupal 9 will bring it into the new upgrade paradigm, but there's quite a bit of work to do to get there.  So the question of whether, and how, to make the jump to Drupal 9 is more complicated.

That leaves Drupal 7 sites with a handful of options: upgrade to Drupal 8 or Drupal 9, stay on Drupal 7 with Extended Support, or retire or replatform the site.

We'll focus on the first option in this article, and the others later.

Benefits of Drupal 8 and 9

While Drupal 8 is a big change from Drupal 7, it features many developmental and editorial improvements that pay dividends for users who are willing to take the time to learn how to use them.

Lots of contributed module functionality now in core

One of the biggest upsides of Drupal 8 and Drupal 9 versus Drupal 7 is the fact that many of the things that require contributed modules in Drupal 7 are just baked into core now. This includes things like:

  • Layout Builder provides the ability to create customized page layouts that Panels or Display Suite provide in Drupal 7.
  • Blocks have been re-imagined to be fieldable and re-usable, things that require contributed modules like Bean in Drupal 7.
  • You don’t need to install a contributed module and third-party libraries to get a WYSIWYG editor; it’s built into core.
  • Views is in core, and most of the custom lists in core are now fully customizable views.
  • Media handling is not an add-on. It's an integral feature. To get similar functionality in Drupal 7, you need a half dozen or more complicated contributed Media framework modules, each of which might require quite a bit of additional configuration. You can get a pretty decent media handling experience in Drupal 9 by doing nothing more than enabling the Media and Media Library modules and using the default configuration (see the short sketch after this list).
  • Web services are built in, like JSON:API.
  • Customized editorial workflows are now available in core, providing functionality that would have required contributed modules like Workbench Moderation or Workflow.
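For instance, the media handling mentioned above can be as simple as turning on two core modules. A minimal sketch, assuming a Drush-managed Drupal 8/9 site:

# Enable core media handling and the media library UI, then rebuild caches.
drush en -y media media_library
drush cache:rebuild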

That’s just to mention a few features; there are many things in core that would require contributed modules in Drupal 7.

Maintaining this functionality is simplified by having more of it in core. Managing fewer contributed modules simplifies the process of keeping them in sync as you update versions and dependencies, and trying to decide what to do when you get a conflict or something breaks. As Drupal 7 development falls by the wayside, this is even more important, as it could take months - or longer - to get updates to Drupal 7 contributed modules, until they’re no longer supported at all after end-of-life.

Having these solutions in core means everyone is using the same solution, instead of splintering developer focus in different directions. And having them in core means they’re well-tested and maintained.

Composer gets us off the island

One of the changes to Drupal since the Drupal 7 release is that Drupal 8 and 9 extensively use third party libraries like Symfony for important functionality, instead of relying on custom Drupal-specific code for everything. That move “off the island” has introduced a need to manage Drupal’s dependencies on those libraries. This is handled with yet another tool, a package called Composer.

You need to manage the dependencies of these new top-level third-party libraries, but each of those libraries has dependencies on other libraries, which have dependencies on more libraries, creating a confusing spiderweb of dependencies, requirements, and potential conflicts. Without a dedicated tool, dependency management quickly becomes a maintenance nightmare. Composer is yet another tool to learn, but it's a great dependency manager, and taking the time to learn it gives developers a powerful way to handle that web of dependencies.

Composer can do other things, too. If you add cweagans/composer-patches, it's also a very useful tool for managing patches from Drupal.org. You can add a patches section to composer.json with links to the patches you want to apply. Composer will automatically apply the patches, and your composer.json file becomes a self-documenting record of the patches in use.
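A minimal sketch of what that section looks like in composer.json, assuming the cweagans/composer-patches plugin is installed; the package name, description, and patch URL are placeholders:

{
  "extra": {
    "patches": {
      "drupal/some_module": {
        "Short description of the issue being patched": "https://www.drupal.org/files/issues/example.patch"
      }
    }
  }
}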

You can read more about Composer in another Lullabot article: Drupal 8 Composer Best Practices.

No more Features for configuration management

In Drupal 7, many sites deploy configuration using the Features module. Depending on who you ask, using Features for configuration management could be regarded as a good thing or a bad thing. Many developers maintain that Drupal 8 (and therefore Drupal 9’s) Configuration Management system, which allows database configuration to be exported to YML files, is much easier than the Drupal 7 Features system. As with Composer, it takes time to learn, but it enables developers who understand the system to accomplish more with less effort. 
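As a quick illustration of the workflow on a Drush-managed site: active configuration is exported to YAML files, committed to version control, and imported on the target environment.

# Export the active configuration to the sync directory as YAML files.
drush config:export

# After deploying those files, import them on the target environment.
drush config:import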

Secure PHP support

Drupal 7 sites could be running on deprecated versions of PHP, even as old as 5.3. Drupal 7 sites should already have moved to PHP 7, but could still be running on older, very outdated and insecure, versions of PHP. Drupal 7 currently works with PHP 7.3 but has problems with PHP 7.4. As PHP continues to progress and deprecate older versions, you may find that you can no longer keep your Drupal 7 site running on a secure version of PHP. Drupal 8 runs on PHP 7.0+, and Drupal 9 runs on and requires a minimum of PHP 7.3, so both provide a better window of compatibility with secure PHP versions.

Resistance to migrating to Drupal 8 and 9

There are some reasons why sites delay making this move:

Lack of Drupal 8 versions of Drupal 7 contributed modules

Early in Drupal 8’s release cycle, one of the big complaints about Drupal 8 was that many Drupal 7 contributed modules no longer worked in D8. It did take time for some contributed modules to be updated to Drupal 8. However, many Drupal 7 contributed modules were no longer needed in Drupal 8, because the functionality they provided is now a part of Drupal 8 core.

If you haven’t checked the state of Drupal contributed modules in the last few years, take a look at what’s now available for Drupal 8. You can check the Drupal 8 Contrib Porting Tracker to find the status of popular Drupal 7 modules and see whether they’ve gotten a Drupal 8 stable release. You may find that modules that were missing early on are now available, or that you no longer need some contributed modules because that functionality is now managed in another way.

More importantly, you don’t have to worry about lack of parity in Drupal 8 contributed modules when Drupal 9 is released; as long as the Drupal 8 module in question isn’t built on deprecated code, everything that works in 8.x should continue to work in Drupal 9. And if a D8 module is built on deprecated code, the maintainer should be aware of it. All the code that is being removed in Drupal 9 has already been deprecated in Drupal 8.8, so there won’t be any surprises for module or site maintainers.

Maintenance overhead for small teams

With the introduction of Drupal 8 and Drupal 9, the new paradigm in Drupal development is more frequent, smaller releases. This mirrors a larger trend in software development, where iterative development means frameworks make more frequent releases, and consequently, those releases aren’t supported as long. 

This means you need to commit to keeping your site current with the latest releases. If you’re part of a small team managing a large Drupal site, you may simply not have the bandwidth or expertise to keep up with updates. 

There are some tools to make it easier to keep a site current. There is an Automatic Updates module that might be helpful for small sites. That module is a work in progress, and it does not yet support contributed module updates or Composer-based site installs; these are planned for Phase 2. But this is a project to keep an eye on.

You can manage updates yourself using Composer and Drush. Sites of any size can also use Dependabot, a service that creates automatic pull requests with updates.
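If you manage updates yourself, the routine for a Composer-based Drupal 8/9 site typically looks something like the sketch below; it assumes the drupal/core-recommended project template, so adjust the package name if your site requires drupal/core directly.

# Update Drupal core and its dependencies.
composer update drupal/core-recommended --with-dependencies

# Apply pending database updates and rebuild caches.
drush updatedb
drush cache:rebuild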

And of course, some web hosts and most Drupal vendors will provide update services for a fee and just take care of this for you.

The new way of doing things is harder

The final complaint that has prevented many Drupal 7 sites from upgrading to Drupal 8 and Drupal 9 is that the new way of doing things is harder. Or, if not harder, different. There’s a lot to unpack here. In some cases, this reflects resistance to learning and using new tools. In other cases, it may be that long-time Drupal developers have a hard time learning new paradigms. Another option may be that developers are simply not interested in learning a new stack, and may no longer want to develop in new versions of Drupal. 

Drupal 6 and 7 have a lot of “Drupalisms,” Drupal-specific, custom ways of doing things, so developers who have been deep into Drupal for a long time may feel the number of things to re-learn is overwhelming. Fortunately, the “new” things, such as Composer, Twig, and PHPUnit are used by other PHP projects, so there is a lot that Drupal 7 developers can learn that will be useful if they work on a Symfony or Laravel project, for example.

Developing for Drupal 8 and Drupal 9 is certainly different compared to Drupal 7 and older versions. Some developers may choose this as a turning point to shift gears into other career paths, developing for a different stack, or making a more substantial change. But with the Drupal 7 end-of-life approaching, developers who don’t want to move to Drupal 8 and Drupal 9 must make some move, just as Drupal 7 sites must move to a modern platform.

Security considerations

In today's world, enterprises have a responsibility to protect their website users' personal data - and they face costly liability considerations if they don't. For many organizations, this means website security is a looming and ongoing concern. It's common for enterprise security policies to require that organizations only use services with ongoing security support. Relative to the Drupal 9 upgrade, this means that many enterprises can't continue to maintain Drupal 7 websites after they stop receiving security support.

But what does “no more security support” actually mean?

When Drupal 7 reaches end-of-life, the Drupal community at large will no longer provide “official” security updates or bug fixes. The Drupal Security Team will no longer provide support or Security Advisories for Drupal 7 sites. Automated or manual processes that you currently use to update your sites may no longer work.

There is a bit of nuance to the lack of security support, however. The Drupal 7 Extended Support (ES) program involves partnering with a Drupal Association-vetted vendor and ensuring that the vendor coordinates responsible disclosure of security issues and fixes while publicly sharing the work toward those fixes.

Practically speaking, this means that even if you're not partnered with an ES vendor, you can still get security patches for your site. However, websites using modules that aren't actively supported by ES vendors won't have the benefit of a partner to hunt down and fix issues with those modules, security or otherwise. If you have modules or other dependencies that age out of security updates, such as the PHP version you're hosting on reaching its end-of-life, you may be left with a website with an increasing number of security holes.

Additionally, after November 2021, Drupal 7 core and Drupal 7 releases on all project pages will be flagged as not supported. As a result, third-party scans may flag sites using Drupal 7 as insecure since they’ll no longer get official security support.

No more bug fixes or active development

Alongside security considerations, a lesser concern of the Drupal 7 end-of-life timeline is an official end to community-at-large bug fixes and active development. Drupal 7 development has already shifted to focus on Drupal 8 over the past several years, with Drupal 7 bugs lingering. For example, take a look at the Drupal.org issue queue for Drupal 7 core bugs; you’ll see issues that haven’t been updated for weeks or months, versus hours or days for Drupal 8/9 development issues.

Questions to ask when migrating from Drupal 7

So how do you decide which path is right for your organization? Here are some questions to ask.

What are the skills and size of your development team?

The shift from Drupal 7 to Drupal 8 and Drupal 9 involved a shift from Drupal-specific paradigms to incorporating more general object-oriented programming concepts. If your team consists of long-time Drupal developers who haven't done a lot of object-oriented programming, this paradigm shift involves a learning curve that does have an associated cost. For some budget-conscious organizations, this may mean it's more economical to remain on Drupal 7 while developers work on skilling up for Drupal 8/Drupal 9 paradigms.

Another consideration is the size of your development team. If your team is small, you may need to engage an agency for help or explore the other alternatives mentioned above.

What are the plans for the site?

How much active development is being done on the site? Are you planning to add new features, or is the site in maintenance mode? What is your budget and plan to maintain the site; do you have developers devoted to ongoing maintenance, or is it one small priority among many competing priorities? 

If you're planning to add new features, the best option is to migrate to Drupal 8 and Drupal 9. Drupal 9 is under active development, and these modern systems may already include the new features you want to add. If not, working in an ecosystem that's under active development generally reduces development overhead. 

What is the life expectancy of the site?

How many years do you expect the current iteration of the site to continue? Are you planning to use the site for three more years before a major redesign and upgrade? Eight more years? Sites with a shorter lifespan may be good candidates for Drupal 7 ES, while sites with longer life expectancies would benefit from upgrading to a modern platform with a longer lifespan.

What code is the site using?

Do an inventory of your site's code. What contributed modules are you using? What do you have that's custom? Drupal 8 upgrade evaluation is a good place to start. 

Some Drupal 7 contributed modules have Drupal 8 and Drupal 9 versions available, while others no longer apply in a world with different programming paradigms. Still, others may now be a part of Drupal 9 core. 

If you're using a lot of custom modules and code, migrating to Drupal 8 and Drupal 9 is a bigger project.  You might be able to mitigate some of that by altering the scope of your new site to take more advantage of the new capabilities of core and the available Drupal 8 contributed modules.

What features do you want?

Make a list of the features that are important to your organization. This should include features that your site currently has that you couldn't live without, and features you'd like to have but currently don't. Then, do a feature comparison between Drupal 8 and Drupal 9, and any other options you're considering. This may drive your decision to migrate, or you may decide that you can live without "must-have" features based on availability.

Where to go from here

Bottom line: with the Drupal 7 end-of-life date coming next year, now is the time to scope your site changes. But where do you go from here? The next few articles in this series explore how and when to upgrade from Drupal 7 to Drupal 9 and alternate solutions if upgrading isn’t a good choice for your organization. Stay tuned!

May 06 2020

Drupal 8 to 9 Upgrade

With Drupal 9 just around the corner, there's more and more buzz about preparing for the upgrade. From a project planning perspective, what do organizations need to consider when planning for the Drupal 9 upgrade? Developers may be wondering about the technical details: how do you actually upgrade to Drupal 9? We’ve discussed who should upgrade to Drupal 9 and when to upgrade to Drupal 9. Here’s how to do it.

Drupal 9 Upgrade Project Planning

Plan a release window

Drupal 9 is currently slated for release in June 2020. However, Drupal 8.9.x is scheduled to reach end-of-life in November 2021, and older versions, such as 8.7.x, are slated to stop receiving security support in June 2020. You first need to plan a release window to upgrade to Drupal 8.9.x during this timeframe to make sure your site is upgraded before the older Drupal versions are no longer supported. Once on Drupal 8.9, you can perform and release all the preparatory work described below. After that, you’ll be ready to plan a release window for the final upgrade to Drupal 9.

For more on planning a release window, check out Drupal 8 Release Planning in the Enterprise. Remember to factor in other development work, updates for the rest of your stack, and other ongoing development projects, and give yourself plenty of time to complete the work.

Scope the upgrade project

To scope the upgrade project, you'll need to consider a handful of factors:

  • Deprecated code that must be updated
  • Other development work that you'll do as part of the upgrade project

We'll dive deeper into how to check for and correct deprecated code and APIs shortly, but first, let's take a look at other development work you might do as part of the upgrade project.

Solicit feature requests from stakeholders

Does your website deliver stakeholder-required features using contributed modules that haven't yet been updated for Drupal 9? Do your key stakeholders want new features to better serve site users, or meet business objectives? 

For many organizations, major Drupal replatforming efforts have provided a cadence for other website development work, including new feature development. If it's been a while since your organization checked in with stakeholders, now might be a good time to do that. 

Regardless of whether or not you plan to deliver new feature development in the Drupal 9 upgrade project, it's a good idea to make sure you won't lose Drupal 8 contributed modules that provide the functionality your stakeholders can't live without - unless you've got a new way to deliver that functionality in Drupal 9.

Architecture, content, accessibility audits and more

For sites that are already on Drupal 8, the Drupal 9 upgrade is different than many previous major version updates; Drupal 8 to Drupal 9 does not require a content migration, so there's no real need to do a major information architecture audit and overhaul. In this new world, organizations should look at shifting the site redesign and content architecture cadence to an ongoing, iterative model.

How to Prepare for Drupal 9

Upgrade to Drupal 8.8 or Drupal 8.9

If you haven't already updated your Drupal 8 site to the most recent version of Drupal 8.8.x or 8.9.x, that's where you must start. Drupal 8.8 is a big milestone for API compatibility; it's the first release with an API that's fully compatible with Drupal 9. Practically speaking, this means that contributed modules released prior to 8.8 may not be compatible with Drupal 9.

Beyond API compatibility, Drupal 8.8 and 8.9 introduce further bugfixes, as well as database optimizations, to prepare for Drupal 9. If you upgrade your website and all contributed modules and themes to versions that are compatible with 8.9, those parts of your site should be ready for Drupal 9. 

Platform requirements 

One change between Drupal 8 and Drupal 9 is that Drupal 9 requires a minimum of PHP 7.3; Drupal 8 recommends but does not require 7.3. There are new minimum requirements for MySQL, MariaDB, and other databases. And if you use Drush, your Drush version must be Drush 10. If you need to update any of these, you should be able to do it while still on Drupal 8. There may be other changes to the Drupal 9 requirements in the future, so double-check the environment requirements.
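If you use Drush, a quick way to see where your environment currently stands is:

# Reports the Drupal version, PHP version, database driver and version, and Drush version.
drush status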

Audit for conflicting dependencies

Composer manages third-party dependency updates and will update Drupal dependencies when you do Composer updates. However, if anything else in your stack, such as contributed modules or custom code, has conflicting dependencies, you could run into issues after you update. For this reason, you should check your code for any third-party dependency that conflicts with the core dependencies. 

For example, Drupal 9 core requires Symfony 4.4, while Drupal 8 worked with Symfony 3.4. If you have contributed modules or custom code that depends on Symfony 3.4, you'll need to resolve those conflicts before you update to Drupal 9. If your code works with either version, you can update your composer.json to indicate that either version works. For instance, the following code in your module’s composer.json indicates that your code will work with either the 3.4 or 4.4 version of Symfony Console. This makes it compatible with both Drupal 8 and Drupal 9, and with any other libraries that require either of these Symfony versions.

{
  "require": {
    "symfony/console": "~3.4.0 || ^4.4"
  }
}

If you have code or contributed modules that require incompatible versions of third party libraries and won’t work with the ones used in Drupal 9, you’ll have to find some way to remove those dependencies. That may mean rewriting custom code, helping your contributed modules rewrite their code, or finding alternative solutions that don’t have these problems.

Check for deprecated code

Sites that are already on Drupal 8 can see deprecated code using a few different tools:

  • IDEs or code editors that understand `@deprecated` annotations;
  • Running Drupal Check, PHPStan Drupal, or Drupal Quality Checker from the command line or as part of a continuous integration system to check for deprecations and bugs (see the example after this list);
  • Installing the Drupal 8 branch of the Upgrade Status module to get Drupal Check functionality, plus additional scanning;
  • Configuring your test suite to fail when it tries to execute a method that calls a deprecated code path.
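For example, a minimal Drupal Check run against a custom modules directory might look like this; the path is an assumption, so adjust it to your project layout.

# Scan custom modules for uses of deprecated code (path is an example).
./vendor/bin/drupal-check web/modules/custom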

See Hawkeye Tenderwolf’s article How to Enforce Drupal Coding Standards via Git for more ideas. That article explains how Lullabot uses GrumPHP and Drupal Quality Checker to monitor code on some of our client and internal sites.  

Many organizations already have solutions to check for deprecated code built into their workflow. Some organizations do this as part of testing, while others do it as part of a CI workflow. In the modern software development world, these tools are key components of developing and maintaining complex codebases.

While you can do this check in any version of Drupal 8, you’ll need to do a final pass once you upgrade any older Drupal 8 version to Drupal 8.8, because new deprecations have been identified in every release up to Drupal 8.8.

Refactor, update and remove deprecated code

If you find that your site contains deprecated code, there are a few avenues to fix it prior to upgrading to Drupal 9, ranging from manual refactoring guided by the deprecation messages and Drupal.org change records to automated tools such as Drupal Rector.

Flag modules as Drupal 9 compatible

Once you’ve removed deprecated code from your custom modules, flag them as being compatible with both Drupal 8 and Drupal 9, by adding the following line to your module’s info.yml file.

core_version_requirement: ^8 || ^9
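In context, a hypothetical custom module's info.yml might look like this:

# example_module.info.yml - the name and description are placeholders.
name: Example module
type: module
description: 'A custom module that no longer uses deprecated code.'
package: Custom
core_version_requirement: ^8 || ^9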

What about contributed modules?

If you're using contributed modules that rely on deprecated code, work with module maintainers and offer help when possible to ensure that updates happen. You can find Drupal 9-compatible modules, check reports in the Drupal.org issue queues for Drupal 9 compatibility, or check the Drupal 9 Deprecation Status page.

You should update all contributed modules to a Drupal 9-compatible version while your site is still on Drupal 8. Do this before attempting an upgrade to Drupal 9!

Update to Drupal 9

One interesting aspect of the Drupal 9 upgrade is that you should be able to do all the preparatory work while you’re still on Drupal 8.8+. Find and remove deprecated code, update all your contributed modules to D9-compatible versions, etc. Once that is done, updating to Drupal 9 is simple:

  1. Update the core codebase to Drupal 9.
  2. Run update.php.
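For a Composer-managed site, those two steps might look roughly like the following; the package constraints are illustrative, so check the current upgrade documentation for your particular setup.

# Switch the core packages to Drupal 9.
composer require drupal/core-recommended:^9.0 drupal/core-composer-scaffold:^9.0 \
  --update-with-dependencies

# Run the database updates (the Drush equivalent of visiting update.php).
drush updatedb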

Drupal 9.x+

The Drupal Association has announced its intent to provide minor release updates every six months. Assuming Drupal 9.0 releases successfully in June 2020, the Drupal 9.1 update is planned for December 2020, with 9.2 to follow in June 2021.

To make Drupal 9.0 as stable as possible, no new features are planned for Drupal 9.0. The minor updates every six months may introduce new features and code deprecations, similar to the Drupal 8 release cycle. With this planned release cycle, there is no benefit to waiting for Drupal 9.x releases to upgrade to Drupal 9; Drupal 9.0 should be as stable and mature as Drupal 8.9.

Other resources

Apr 22 2020

Drupal 8 to 9 Upgrade

As the release of Drupal 9 approaches, organizations are starting to think about when to upgrade to Drupal 9. Quick adoption of a new Drupal version isn't a given; historically, it's taken years for some significant Drupal versions to gain traction. With a relatively short window between the Drupal 9 release and Drupal 8's end-of-life, however, organizations must move more quickly to adopt Drupal 9 or make other arrangements.

No penalty for early Drupal 9 adopters

A common strategy for many technology release cycles is to avoid the initial version of a major software release. Some organizations wait until one or more point releases after a new version, while others prefer to wait months or even years after a major version release for things like bug fixes, additional features, and helpful resources created by early adopters. In the Drupal world, this delay is often exacerbated by waiting for contributed modules to be compatible with the new version.

The nice thing about Drupal 9 is that there is no penalty for early adopters, so there's no reason to wait for a later version. The initial Drupal 9 version release introduces zero new features. Drupal 9.0 core code matches Drupal 8.9 core. The only differences between Drupal 9.0 and Drupal 8.8 or 8.9 are the removal of deprecated code and required upgrades to third-party dependencies.

The primary consideration is whether or not your favorite contributed modules have declared that they are Drupal 9 compatible. With past upgrades, waiting for contributed modules to be ready for the new Drupal version caused months or even years of delays. But the Drupal 9 upgrade path for contributed modules is relatively easy, so they should be able to adapt quickly. Many modules are already compatible, and others will need minimal changes.

When your code is ready

One of the core components of the Drupal 9 upgrade is the removal of deprecated code in Drupal 8. However, this means that when you're planning your release window, you'll need to schedule some time for the pre-work of auditing and refactoring deprecated code. If you've already been doing this, you may not need to pad your schedule to compensate for this work. We'll dive deeper into how to get your code ready in a future article.

In addition to scheduling time to address code deprecations, you'll also need to give yourself time to address any third-party dependencies that require newer versions in Drupal 9. When you're looking at when to upgrade to Drupal 9, you should do it after you've had a chance to resolve any third-party dependency updates that conflict with other things in your stack. If you've got a contrib module or custom code that requires an older version of a third-party dependency, but Drupal 9 calls for a newer version of that dependency, you'll need to make a plan and address this conflict before you upgrade to Drupal 9.

Consider other website work

Many organizations have traditionally used major Drupal version migrations as a time to plan overall website redesign projects, information architecture work, and other web development projects. Because the upgrade to Drupal 9 is more like a minor release than a major one, there's no need to deep dive into information architecture - there's no migration! That means your organization needs to establish a new strategy for these projects; we're working on an upcoming article to cover web development strategy for Drupal 9 for more insights around this.

If business logic dictates that your organization plan other web development projects for this year, make sure you give yourself time to complete the Drupal 9 upgrade before Drupal 8 reaches end-of-life in November 2021. 

Take the availability of preferred partners and development teams into account

If you're planning to work with vendor partners, make sure you factor their availability into your project plan. With an upgrade window of slightly over a year between the release of Drupal 9 and the end-of-life of Drupal 8, some vendor partners may have limited availability, especially if yours is a larger project. Planning ahead helps to ensure you can work with your preferred partners; otherwise, you might add the stress of working with a new partner into the mix.

At the same time, don't forget about internal initiatives such as serving multiple stakeholders. For example, doing new feature development for content editors while simultaneously maintaining an up-to-date platform consistent with your organization's security policies can mean a dance to prioritize development resources to meet everyone's priorities and deadlines. While this complicates the release planning process, it's essential to consider these factors when determining the timing of upgrading to Drupal 9.

We dipped our toes into these considerations in Drupal 8 Release Planning in the Enterprise, and hope to release an updated version of this article soon for Drupal 9 release planning.

Missing the Drupal 9 upgrade window

To summarize, you should upgrade to Drupal 9 earlier rather than later. But what if your site can't upgrade to Drupal 9 before Drupal 8 reaches end-of-life? Unlike Drupal 7, Drupal 8 does not have an extended support program. The upgrade from Drupal 8 to Drupal 9 is such a minor replatforming effort compared to prior versions that the decision was made not to offer an extended support program for Drupal 8. 

Support will continue through November 2021 for sites upgraded to 8.9.x, but support for that version ends when that Drupal 8 end-of-life date arrives. Older Drupal 8.x versions will cease getting support before that date; 8.7.x stops getting security support as of June 3, 2020, and security support ends for Drupal 8.8.x on December 2, 2020.

Long-term, your organization needs a plan to upgrade to Drupal 9, or to consider other options. A future article in this series offers more information about what that plan might look like.

Thanks to the Lullabot team for contributing to this article and to Dachary Carey for drafting it.

Karen Stevenson


Karen is one of Drupal's great pioneers, co-creating the Content Construction Kit (CCK) which has become Field UI, part of Drupal core.

Apr 15 2020

Drupal 8 to 9 Upgrade

This article is the first in a series discussing Who, What, Why, and How Drupal 8 sites can upgrade to the upcoming Drupal 9 release. A future series will discuss upgrading Drupal 7 sites.

With Drupal 9 scheduled for release in summer 2020, and with both Drupal 7 and Drupal 8 scheduled for end-of-life (EOL) in November 2021, it’s time to think about whether to upgrade Drupal sites to the new version. Upgrading to newer versions in the past was a significant replatforming effort that required a substantial investment and a non-trivial release window. The Drupal 8 to Drupal 9 upgrade is different, though; this is the first major version upgrade that’s reputed to be as simple as a minor point release. Can it really be that simple? Who should upgrade to Drupal 9?

The Easy Path: Upgrading from Drupal 8

Organizations that are already on Drupal 8 are several steps ahead in upgrading to Drupal 9. One of the biggest benefits of upgrading to Drupal 8 is that the platform and core code of Drupal 8 form the basis for Drupal 9. 

Drupal 9.0 doesn’t introduce any new features or new code, so sites that are on the final Drupal 8 point release are essentially ready to upgrade to Drupal 9.0. No big lift; no major replatforming effort; no content migration; just a final audit to make sure the site doesn’t rely on any deprecated code or outdated Composer dependencies. 

Sites that have kept up-to-date with Drupal 8’s incremental updates (see Andrew Berry’s article Drupal 8 Release Planning in the Enterprise) should be ready to go when it comes to core code. Many sites are already using automated tools or workflows to keep them up-to-date on code deprecations for contributed and custom modules. Ideally, you have been continuously replacing older versions of contributed modules with versions that have removed deprecated code, removing deprecated code in your custom code, and dealing with any Composer dependency conflicts. If so, the upgrade effort for your site should be relatively simple. The same is true if you rely on widely-used and well-supported contributed modules and have little custom code.

If you have custom code and use less widely-used contributed modules, but you’ve been paying attention to code deprecations in your custom code and the readiness of your contributed modules, you’re probably in a good position to upgrade. If you have strong test coverage and aren’t relying on any deprecated third-party dependencies, you’re in even better shape. You shouldn’t see substantial changes from Drupal 8 to Drupal 9.0, so even custom code is likely to work without issue as long as it doesn’t rely on deprecated functions or methods that are removed. 

The caveat is that if your custom code or contributed modules rely on older versions of Composer dependencies that are deprecated in Drupal 9 in favor of newer versions, you may need to do some refactoring to make sure that code works with the new third-party dependencies.

Can you stay on Drupal 8 past its EOL?

There should be no reason for anyone on Drupal 8 not to upgrade to Drupal 9. There is a window of time until November 2021 during which the last Drupal 8 release will be supported with security updates, which allows time to make the necessary changes to move to Drupal 9. But after that, you’ll need to make the switch.

When Drupal 6 reached its end of life, there was a Long Term Support (LTS) program, which made it possible to stay on Drupal 6 past its EOL. There are plans to provide an LTS program for Drupal 7; however, there will be no Long Term Support program for Drupal 8 because the upgrade path from Drupal 8 to Drupal 9 is much easier.

If you don’t make the move, you’ll be on your own to deal with security updates and other maintenance and bug fixes for your Drupal 8 code. And that would likely be more expensive and time-consuming than just doing the upgrade.

Prepare to upgrade or start considering alternatives.

With the Drupal 9 upgrade being relatively uncomplicated for sites that are already on Drupal 8, it's easy to recommend that those sites should upgrade. The main question is when, and what other options do you have? Later articles in this series will delve into more detail about how to prepare for the upgrade.

Thanks to the Lullabot team for contributing to this article and to Dachary Carey for drafting it.

Karen Stevenson

Karen is one of Drupal's great pioneers, co-creating the Content Construction Kit (CCK) which has become Field UI, part of Drupal core.

Mar 04 2020
Mar 04

Sending a Drupal Site into Retirement

The previous article in this series explained how to send a Drupal Site into retirement using HTTrack—one of the solutions to maintaining a Drupal site that isn't updated very often. While this solution works pretty well for any version of Drupal, another option is using the Static Generator module to generate a static site instead. However, this module only works for Drupal 7 as it requires the installation of some modules on the site, and it uses Drupal to generate the results. 

The Static Generator module relies on the XML sitemap module to create a manifest. The links in the XML sitemap serve as the list of pages that should be transformed into static pages. After generating the initial static pages, the Cache Expiration module keeps track of changed pages to be regenerated to keep the static site current. This combination of Static Generator, XML sitemap, and Cache Expiration is a good solution when the desire is to regenerate the static site again in the future, after making periodic updates.

There are many module dependencies, so quite a list of modules was downloaded and installed. Once installed, the high-level process is:

  • Create and configure the XML sitemap and confirm it contains the right list of pages.
  • Configure Cache expiration to use the Static Generator and expire the right caches when content changes.
  • Go to /admin/config/system/static and queue all static items for regeneration.
  • Click a Publish button to generate the static site.

Install Static Generator

Download and enable the modules using Drush. Depending on the makeup of the site, additional modules, like xmlsitemap_taxonomy, may also be needed.

drush dl static expire xmlsitemap

drush en static_file, static_views, static_xmlsitemap, static_node, static

drush en expire

drush en xmlsitemap_menu, xmlsitemap_node, xmlsitemap

Configure XMLSitemap

On /admin/config/search/xmlsitemap, make sure the site map is accurately generated and represents all pages that should appear in the static site. Click on the link to the sitemap to see what it contains.

  • Add all content types whose paths should be public.
  • Add menus and navigation needed to allow users to get to the appropriate parts of the site.
  • Make sure Views pages are available in the map.

A lot of custom XML sitemap paths may be required for dynamic views pages. If so, generate XML sitemap links in the code where the database is queried for all values that might exist as a path argument, then create a custom link for each path variation.

Code to add custom XML sitemap links looks like this (this is Drupal 7 code):



/**
 * Add a views path to xmlsitemap.
 *
 * @param string $path
 *   The path to add.
 * @param float $priority
 *   The decimal priority of this link, defaults to 0.5.
 */
function MYMODULE_add_xmlsitemap_link($path, $priority = '0.5') {
  drupal_load('module', 'xmlsitemap');

  // Create a unique namespace for these links.
  $namespace = 'MYMODULE';
  $path = drupal_get_normal_path($path, LANGUAGE_NONE);

  // See if link already exists.
  $current = db_query("SELECT id FROM {xmlsitemap} WHERE type = :namespace AND loc = :loc", array(
    ':namespace' => $namespace,
    ':loc' => $path,
  ))->fetchField();
  if ($current) {
    return;
  }

  // Find the highest existing id for this namespace.
  $id = db_query("SELECT max(id) FROM {xmlsitemap} WHERE type = :namespace", array(
    ':namespace' => $namespace,
  ))->fetchField();

  // Create a new xmlsitemap link.
  $link = array(
    'type' => $namespace,
    'id' => (int) $id + 1,
    'loc' => $path,
    'priority' => $priority,
    'changefreq' => '86400', // 1 day = 24 h * 60 m * 60 s
    'language' => LANGUAGE_NONE
  );

  xmlsitemap_link_save($link);
}
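
How the helper above gets called depends on the site. As a rough illustration (the hook_cron() wrapper, the {fisherman} table, and the path pattern are hypothetical; adjust them to whatever drives the dynamic views pages), the site could register a link for every possible path argument on cron:

/**
 * Implements hook_cron().
 *
 * Registers an XML sitemap link for every dynamic views page.
 */
function MYMODULE_cron() {
  // Query for every value that can appear as a views path argument.
  $names = db_query("SELECT DISTINCT name FROM {fisherman}")->fetchCol();
  foreach ($names as $name) {
    MYMODULE_add_xmlsitemap_link('fisherman/' . $name);
  }
}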

Configure Cache Expiration

On admin/config/system/expire, set up cache expiration options. Make sure that all the right caches will expire when content is added, edited, or deleted. For instance, the home page should expire any time nodes are added, changed, or deleted since the changed nodes change the results in the view of the latest content that appears there. 

Generate the Static Site

Once configured, a Publish Site button appears on every page as a shortcut. But the first time through, it’s better to visit /admin/config/system/static to configure static site options and generate the static site. During the initial setup, some pages were generated automatically and others were not. Once all the other modules are configured and the XML sitemap looks right, clear all the links and regenerate the static site.

The location where the static site is created can be controlled, but the default location is at the path, /static/normal, in the same repository as the original site. That location and other settings are configured on the Settings tab.

Generate the static site and ensure all the pages are accounted for and work correctly. This is an iterative process due to the discovery of missing links from the XML sitemap and elsewhere. Circle through updating the sitemap and regenerating the static site as many times as necessary.

The process of generating the static site runs in batches. It might also run only on cron depending on what options are chosen in settings. Uncheck the cron option when generating the initial static site and later use cron just to pick up changes. Otherwise, running cron multiple times to generate the initial collection of static pages is required.

For a 3,500 page site, it takes about seven minutes to generate the static pages. Later updates should be faster since only changed pages would have to be regenerated.

When making changes later, they need to be reflected in the XML sitemap before they will be picked up by Static Generator. If XML sitemap updates on cron, run cron first to update the sitemap, then update the static pages.

After generating the static site and sending it to GitHub, it was clear that the Static Generator module transforms a page like /about into the static file /about.html, then depends on an included .htaccess file that uses mod_rewrite to redirect requests to the right place. But GitHub Pages won’t recognize mod_rewrite. That makes the Static Generator a poor solution for a site to be hosted on GitHub Pages, although it should work fine on hosts where mod_rewrite is available.

Comparing HTTrack and Static Generator Options

Here’s a comparison of a couple of methods explored when creating a static site: 

  • HTTrack will work on any version of Drupal; Static Generator works only on Drupal 7.
  • HTTrack doesn’t require setup other than the standard preparation of any site, which is required for any static solution. Static Generator took some time to configure, especially since there wasn’t an existing XML sitemap and Cache Expiration installed and configured.
  • HTTrack can take quite a while to run, a half-hour to an hour, possibly longer. Static Generator is much faster—seven minutes for the initial pass over the whole site.
  • The Static Generator solution makes the most sense if there is a need to keep updating the site and regenerating the static pages. That situation justifies the up-front work required to configure it. HTTrack is easier to set up for a one-and-done situation.
  • The file pattern of /about/about.html created by our custom HTTrack arguments works fine for managing internal links on GitHub Pages. The file pattern of /about.html created by Static Generator will not correctly manage internal links on GitHub Pages. The second pattern will only work on a host that has mod_rewrite installed and the appropriate rules configured in .htaccess.

Both HTTrack and the Static Generator module can be excellent solutions, depending on where the static site will be hosted. To view an example of a site generated with HTTrack, go to https://everbloom.us.

Feb 26 2020
Feb 26

Sending a Drupal Site into Retirement

Maintaining a fully functional Drupal 7 site and keeping it updated with security updates year-round takes a lot of work and time. For example, some sites are only active during certain times of the year, so continuously upgrading to new Drupal versions doesn't always make the most sense. If a site is updated infrequently, it's often an ideal candidate for a static site. 

To serve static pages, GitHub Pages is a good, free option, especially when already using GitHub. GitHub Pages deploys Jekyll sites, but Jekyll is perfectly happy to serve up static HTML, which doesn't require any actions other than creating functional HTML pages to get a solution working. Using this fishing tournament website as the basis for this article, here’s how to retire a Drupal site using HTTrack. 

Inactivate the Site

To get started, create a local copy of the original Drupal site and prepare it to go static using ideas from Sending A Drupal Site into Retirement.

Create GitHub Page

Next, create a project on GitHub for the static site and set it up to use GitHub Pages. Just follow the instructions to create a simple Hello World repository to be sure it’s working. It’s a matter of choosing the option to use GitHub Pages in the settings and identifying the GitHub Pages branch to use. The GitHub Pages options are way down at the bottom of the settings page. There's an option to select a GitHub theme, but if one is provided in the static pages, it will override anything chosen, so really, any theme will do.

A committed index.html file echoes back "Hello World," and the new page becomes viewable at the GitHub Pages URL. The URL pattern is http://REPO_OWNER.github.io/REPO_NAME; the GitHub Pages information block in the repository settings will display the actual URL for the project.

Create Static Pages with HTTrack

Now that there's a place for the static site, it's time to generate the static site pages into the new repository. Wget could spider the site, but the preferred solution here uses HTTrack to create static pages. This is a tool that starts on a given page, generally the home page, then follows every link to create a static HTML representation of each page that it finds. This is only sufficient if every page on the site can be reached by following links from the home page's navigation and from other pages. HTTrack won't know anything about unlinked pages, although there are ways to customize the instructions to identify additional URLs to spider.

Since this solution doesn’t rely on Drupal at all, it's possible to use it for a site built with any version of Drupal, or even sites built with other CMSes. It self-discovers site pages, so there's no need to provide any manifest of pages to create. HTTrack has to touch every page and retrieve all the assets on each page, so it can be slow to run, especially when running it over the Internet. It's best to run it on a local copy of the site.

It's now time to review all the link elements in the head of the pages and make sure they are all intentional. On a Drupal 7 site using the Pathauto module, the shortlink head elements that Drupal adds, such as <link rel="shortlink" href="/node/9999" />, should be removed. They point to system URLs that don't need to be replicated in the static site, and HTTrack will try to create all those additional pages when it encounters those links.

When using the Metatag module, it's possible to configure it to remove those tags. Instead, a bit of code like the following is used in a custom module to strip the tags out (borrowed from the Metatag module; this code is appropriate for a Drupal 7 site):


/**
 * Implements hook_html_head_alter().
 *
 * Hide links added by core that we don't want in the static site.
 */
function MYMODULE_html_head_alter(&$elements) {
  $core_tags = array(
    'generator',
    'shortlink',
    'shortcut icon',
  );
  foreach ($elements as $name => &$element) {
    foreach ($core_tags as $tag) {
      if (!empty($element['#attributes']['rel']) && $element['#attributes']['rel'] == $tag) {
        unset($elements[$name]);
      }
      elseif (!empty($element['#attributes']['name']) && strtolower($element['#attributes']['name']) == $tag) {
        unset($elements[$name]);
      }
    }
  }
}

The easiest way to install HTTrack on a Mac is with Homebrew:

brew install httrack

Based on the documentation and further thought, it became clear that the following command string is the ideal way to use HTTrack. After moving into the local GitHub Pages repo, execute the command below, where LOCALSITE is the path to the local site copy being spidered, and DESTINATION is the path to the directory where the static pages should go:

httrack http://LOCALSITE -O DESTINATION -N "%h%p/%n/index%[page].%t" -WqQ%v --robots=0 --footer ''

The -N flag in the command rewrites the pages of the site, including pager pages, into the pattern /results/index.html. Without the -N flag, the page at /results would have been transformed into a file called results.html. The directory-plus-index pattern takes advantage of the GitHub Pages server configuration, which automatically serves the generated file /results/index.html for links that point to /results.

The --footer '' option omits the comments that HTTrack automatically adds to each page, which look like the following. This gets rid of the first comment, but nothing appears to get rid of the second one. Removing the first one, which has a date in it, avoids a Git repository in which every page appears to change every time HTTrack runs. It also obscures the URL of the original site, which may be confusing since it's a local environment.

<!-- Mirrored from everbloom-7.lndo.site/fisherman/aaron-davitt by HTTrack Website Copier/3.x [XR&CO'2014], Sun, 05 Jan 2020 10:35:55 GMT -->

<!-- Added by HTTrack --><meta http-equiv="content-type" content="text/html;charset=utf-8" /><!-- /Added by HTTrack -->

The pattern also deals with paged views results. It tells HTTrack to find a value in the query string called "page" and insert that value, if it exists, into the URL pattern in the spot marked by [page]. Paged views create links like /about/index2.html and /about/index3.html for each page of the view. Without specifying this, the pager links would be created as meaningless hash values of the query string. This way, the pager links are user-friendly and similar (but not quite the same) to the original link URLs.

Shortly after the process starts, it will stop and ask a question about how far to go in following links; '*' is the response to that question.

The progress is viewable as HTTrack runs, showing which sections of the site it is navigating into. The '%v' flag in the command tells it to use verbose output.

Running HTTrack on a local version of the site creates about 3,500 files, including pages for every event and result and every page of the paged views. HTTrack is too slow to use across the network on the live site URL, so it makes sense to do this on a local copy. The first attempt took nearly two hours because so many unnecessary files were created, such as an extra /node/9999.html file for every node in addition to the desired file at the aliased path. After a while, it was apparent they came from the shortlink in the header pointing to the system URL. Removing the shortlinks cut the spidering time by more than half. Invalid links and images in the body of some older content that HTTrack attempted to follow (creating 404 pages at each of those destinations) also contributed to the slowness. Cleaning up all of those invalid links dropped the time to spider the site to less than a half-hour.

The files created by HTTrack are then committed to the appropriate branch of the repository, and in a few minutes, the results appear at http://karens.github.io/everbloom.

Incoming links to /results now work, but internal links still look like this in the HTML:

/results/index.html

A quick command line fix to clean that up is to run this, from the top of the directory that contains the static files:

find . -name "*.html" -type f -print0 | xargs -0 perl -i -pe "s/\/index.html/\//g"

That will change all the internal links in those 3,500 pages from /results/index.html to /results/, resulting in a static site that pretty closely mirrors the file structure and URL pattern of the original site.

One more change is to fix index.html at the root of the site. When HTTrack generates the site, it creates an index.html page that redirects to another page, /index/index.html. To clean things up a bit and remove the redirect, I copy /index/index.html to /index.html. The relative links in that file then need to be fixed to work in the new location, so I do a find and replace on the source of that file to remove ../ from the paths, changing URLs like ../sites/default/files/image.jpg to sites/default/files/image.jpg.

Once this was working successfully, the final step was to have the old domain name redirect to the new GitHub Pages site. GitHub provides instructions about how to do that.

Updating the Site

Making future changes requires updating the local site and then regenerating the static pages using the method above. Since Drupal is not publicly available, there's no need to update or maintain it, nor worry about security updates, as long as it works well enough to regenerate the static site when necessary. When making changes locally, regenerate the static pages using HTTrack and push up the changes. 

The next article in this series will investigate whether or not there is a faster way of creating a static site.

Jan 02 2020
Jan 02

Site owners and administrators often want to send emails to users telling them about content creation or changes. This sounds basic, but there are a lot of questions. What exactly needs to be accomplished? Some examples could include:

  • Send notifications to authors when their content is published.
  • Send notifications to authors when users comment on it.
  • Send notifications to administrators when content is created or updated.
  • Send notifications to site users when content is commented on.
  • Mail the content of a node to site users when content is created or updated.
  • And the list goes on…

The first step in assessing solutions is to identify the specific need by asking the following questions:

Who needs to be notified? 

  • So many options! It could be an editor, the author, all site users, all site users of a given role, a specific list of users, anyone who commented on the node, or anyone who subscribed to the node.

When should notifications be created?

  • A little simpler, but it could be when the node is created, when it is published, or when it is commented on. A message might be initiated every time the action happens, or postponed and accumulated into a digest of activity that goes out once a day or once a week or once a month.

When should notifications be sent?

  • This could be immediate, sent to a queue and processed on cron, or scheduled for a specific time.

What should deliver the notification?

  • Is it both possible and feasible for the web server to be responsible for delivering the notification? Does a separate server need to deliver the mail, or perhaps a third party mail delivery service needs to be used? 

How should recipients be notified?

  • Email is the most common choice, but notifications could also be delivered some other way, such as SMS or an on-site message.

What should recipients receive?

  • Notifications could be just messages saying that the node has been created, changed, or published. It might include a summary of the node content, or the entire content of the node could be sent in the email. This could also be a digest of multiple changes over time.

How much system flexibility is required?

  • This could encompass anything from a very simple system, like a fixed list of users or roles who are always notified, all the way to complicated systems that allow users to select which content types they want to be notified about, maybe even allowing them to subscribe and unsubscribe from specific nodes.

Clarifying the answers to these questions will help define the solution you need and which module(s) might be options for your situation. There are dozens of modules that have some sort of ability to send emails to users. While I did not review all of them, below are reviews of a few of the most widely-used solutions.

Note: There are also several ways to use the Rules module (possibly in conjunction with one of these solutions), but I did not investigate Rules-based solutions in this collection of ideas. 

Admin Content Notification

The Admin Content Notification module is designed to notify administrators when new content is created, but it can also be used to notify any group of users, administrators or not. The configuration allows you either to enter a hard-coded list of email recipients or to send notifications to all users who have one or more specified roles. Since you can send notifications to users by role, you could create a new ‘email’ role and assign that role to anyone who should be notified.

Some of the capabilities of this module include:

  • Selecting content types that should generate notifications.
  • Choosing whether to send notifications only on node creation or also on updates, and whether to notify about unpublished nodes or only published nodes.
  • Selecting roles that have permissions to generate notifications.
  • Selecting either a hard-coded list of email addresses OR a list of roles that should receive notifications.
  • Creating a notification message to send when content is created or changed.
  • Adding the node content to the message by using tokens.

The settings for this module are in an odd location, in a tab on the content management page.

This module is extremely easy to set up; just download and enable the module and fill out the configuration settings. Because of its simplicity, it has little flexibility. All content types and situations use the same message template, and there is no way for users to opt in or out of the notifications. There is no integration with changes in workflow states, only with the published or unpublished status of a node. This module provides no capability to send notifications when comments are added or changed. If this capability matches what you need, this is a very simple and easy solution.

Workbench Email

Workbench Email is part of the Workbench collection, but it also works with just core Content Moderation. This module adds the ability to send notifications based on changes in workflow states. 

You begin by creating any number of email templates. In each template, you identify which content types it applies to, who should get emailed, and what the email should contain. The template uses tokens, so you can include tokens to display the body and other custom fields on the node in the message.

Once you’ve created templates, you edit workflow transitions and attach the templates to appropriate transitions. The screenshot below is a bit confusing. The place where it says Use the following email templates is actually below the list of available templates, not above the list. In this example, there is only one template available, called Notification, which has been selected for this transition.

Documentation for the Drupal 8 version does not exist and the screenshots on the project page don’t match the way the site pages look. There is, however, good integration with core Workflow and Content Moderation, and there is a certain amount of flexibility provided in that you can create different messages and send messages to different people for different transitions. There is a documented bug when uninstalling this module, so test it on a site where you can recover your original database until you decide if you want to keep it! This module provides no capability to send notifications when comments are added or changed.

Comment Notify

This module fills a gap in many of the other solutions: a way to get notified about comments on a post. It’s a lightweight solution to allow users to be notified about comments on content they authored or commented on. Configuration is at admin/config/people/comment_notify:

The module has some limitations:

  • Only the author can automatically receive notices about all comments on a thread.
  • You won’t receive notices about other comments unless you add a comment first.
  • You can only unsubscribe from comments by adding another comment and changing your selection.
  • There is only one template and it applies to all content types.
  • You can’t automatically subscribe groups of users to notifications. Each user manages their own subscriptions.

With the above restrictions, the module is an easy way to get the most obvious functionality: be notified about comments on content you created, and be notified about replies to any comments you made. This module could be combined with solutions that only provide notifications about changes to nodes for a more comprehensive solution.

Message Stack

The Message Stack is an API that you must implement using custom code. Unlike the above modules, this is absolutely not an out-of-the-box solution. It’s much more complex but highly flexible. Much of the available documentation is for the Drupal 7 version, so I spent quite some time trying to understand how the Drupal 8 version should work. 

The Message module creates a new “Message” entity. Your custom code then generates a new message for whatever actions you want to track—node creation or updates, comment creation or updates—all using Drupal’s hooks. You can create any number of token-enabled templates for messages, and implement whichever template applies for each situation in the appropriate hook. Using a separate module in the stack, Message Notify, you choose notification methods. It comes with email or SMS plugins, and you can create other notification methods by creating custom plugins. A third, separate module, Message Subscribe, is used to allow users to subscribe to content using the Flag module. You then create custom code that implements Drupal hooks (like hook_node_insert()) to create the appropriate message(s) and send the messages to the subscriber list.

One head-scratcher was how to set up the module to populate the email subject and message. You do it by creating two messages in the template. The first contains text that will go into the email subject, the second contains the text for the email body. You’ll see two new view modes on the message, mail_body and mail_subject. The subject is populated with Partial 0 (the first message value), the body with Partial 1 (the second message value).
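
If it's easier to see in code, here's a rough sketch of creating such a template programmatically. This is an assumption about the config entity's structure (the template name, label, and text are made up); creating the template through the UI works just as well, and the point is simply that the first text partial becomes the mail subject and the second becomes the mail body:

use Drupal\message\Entity\MessageTemplate;

// Partial 0 becomes the mail subject; Partial 1 becomes the mail body.
$template = MessageTemplate::create([
  'template' => 'update_node',
  'label' => 'Update node',
  'description' => 'Sent when subscribed content is updated.',
  'text' => [
    ['value' => 'Content was updated on [site:name]', 'format' => 'plain_text'],
    ['value' => 'A node you subscribe to has been updated.', 'format' => 'plain_text'],
  ],
]);
$template->save();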

Another thing that took me a while to get my head around is that you designate who the message should be sent to by making them the author of the message, which feels odd. "Author" is a single value field, so if you want to send a message to multiple people, you create a basic message, clone it for each recipient, and change the author on the cloned version for each person on the list to send them their own copy of the message. This way each recipient gets their own message, making it possible to dynamically tweak the specific message they receive.

The Message Subscribe module does all that work for you, but it’s useful to know that’s what it’s doing. It creates Flag module flags that allow users to select which content they want to be notified about and hooks that allow you to alter the list of subscribers and the messages each receives in any way you like. That means that your custom code only needs to create the right message in the right hook. If you want the message to just go to the list of people who already subscribed to the content, you do something like this:

use Drupal\message\Entity\Message;
use Drupal\node\Entity\Node;

/**
 * Implements hook_node_update().
 */
function mymodule_node_update(Node $node) {
  // Create a message from the 'update_node' template, owned by the
  // node's author.
  $message = Message::create([
    'template' => 'update_node',
    'uid' => $node->getOwnerId(),
  ]);
  $message->set('field_node_reference', $node);
  $message->set('field_published', $node->isPublished());
  $message->save();

  // Notify everyone who has subscribed to this node.
  $subscribers = \Drupal::service('message_subscribe.subscribers');
  $subscribers->sendMessage($node, $message);
}

You can also add to or alter the list of subscribers and/or the specific text you send to each one. To add subscribers, you create a DeliveryCandidate with two arrays. The first is an array of the flags you want them to have, the second is an array of notification methods that should apply. This is very powerful since you don’t have to wait until users go and subscribe to each node. You can “pseudo-subscribe” a group of users this way. This is probably most applicable for admin users since you might want them to automatically be subscribed to all content. Note that this also eliminates any possibility that they can unsubscribe themselves, so you’d want to use this judiciously.

use Drupal\message\MessageInterface;
use Drupal\message_subscribe\Subscribers\DeliveryCandidate;

/**
 * Implements hook_message_subscribe_get_subscribers().
 */
function mymodule_message_subscribe_get_subscribers(MessageInterface $message, array $subscribe_options, array $context) {
  $admin_ids = [/* list of uids goes here */];
  $uids = [];
  foreach ($admin_ids as $uid) {
    // Deliver by email; no flags are attached to this pseudo-subscription.
    $uids[$uid] = new DeliveryCandidate([], ['email'], $uid);
  }
  return $uids;
}

A lot of configuration is created automatically when you enable the Message Stack modules, but some important pieces will be missing. For instance, message templates and fields might be different in different situations, so you’re on your own to create them.

I created a patch in the issue queue with a comprehensive example of Message, Message Notify, and Message Subscribe that contains the code I wrote while reviewing these modules. It’s a heavily-commented example that creates some initial configuration and combines code from all the Message example modules, updated with information I derived from various issues and documentation found around the Internet, plus a bunch of trial and error on my test site. It’s a comprehensive alternative to the other examples, which are pretty basic, and should answer a lot of the most common questions about how to configure the module. I included lots of ideas of things you can do in the custom code, but you’ll probably want to remove some of them and alter others, using what’s there as a starting point for your own code. Read the README file included in that patch, visit the configuration pages, and review the example code for more ideas of how to use the stack. 

Note that I ran into several issues with Message Subscribe and submitted potential patches for them. Most of them related to the Message Subscribe Email module, and I ultimately realized I didn’t even need that module and removed it from my example. I found an easier way to email the subscribers using just one line of code in my example module. The remaining issue I ran into was one that created serious problems when I tried to queue the subscription process. I suggest you review and use that patch if you alter the list of recipients and use the queue to send messages.

Which module to use?

That’s it for the modules I reviewed this time. Which one(s) should you use? This is Drupal! Take your pick! There are many ways to solve any problem.

Seriously, it really comes down to whether you want something out-of-the-box that requires little or no code, or whether you need more customization and flexibility than that. 

In summary:

  • If you want an easy, out-of-the-box, solution to notify users about published content and comments, a combination of Admin Content Notification and Comment Notify could work well.
  • If you need to notify users about changes in workflow states, Workbench Email is an easy solution.
  • If you need a highly customized solution and you are comfortable writing code and using Drupal’s APIs, the Message stack has a lot of potential.

For my project, an intranet, I ultimately selected the Message stack. I created a public repository with a custom module, Message Integration, that integrates the Message stack with Swiftmailer and Diff. It automatically subscribes all users to new nodes and emails the subscriber list with the node’s content when nodes are published, a diff of the changes when new revisions are created, and the text of new comments on nodes they subscribe to. The code is too opinionated to be a contrib module, but it could be forked and used on other sites as a quick start to a similar solution.

Any of these modules (and probably lots of others) might work for you, depending on the specific needs of your site and your ability and desire to write custom code. Hopefully, this article will help you assess which modules might fit your communication needs, and provide some comparison for any other solutions you investigate.

Apr 18 2018
Apr 18

This first-ever Decoupled Summit at DrupalCon Nashville was a huge hit. Not only did it sell out but the room was packed to the gills, literally standing room only. Decoupled Drupal is a hot topic these days. The decoupled summit was an opportunity to look at the state of decoupled Drupal, analyze pros and cons of decoupling, and look at decoupling strategies and examples. There is lots of interest in decoupling, but there are still many hard problems to solve, and it isn’t the right solution for every situation. This summit was an opportunity to assess the state of best practices.

The summit was organized by Lullabot's Sally Young and Mediacurrent's Matt Davis, two of the innovators in this space.

What is “decoupled Drupal”? 

First, a quick explanation of what “decoupled Drupal” means, in case you haven’t caught the fever yet. Historically, Drupal is used to deliver all the components of a website, an approach that can be called “traditional,” “monolithic,” or “full stack” Drupal. In this scenario, Drupal provides the mechanism to create and store structured data, includes an editorial interface that allows editors to add and edit content and set configuration, and takes responsibility for creating the front-end markup that users see in their browsers. Drupal does it all.

“Decoupled”, or “headless” Drupal is where a site separates these website functions across multiple web frameworks and environments. That could mean managing data creation and storage in a traditional Drupal installation, but using React and Node.js to create the page markup. It could also mean using a React app as an editorial interface to a traditional Drupal site. 

Drupal tools and activity

Drupal core is enabling this activity through a couple of core initiatives, and Drupal and the Drupal community have numerous tools available to assist in creating a decoupled site:

  • Contenta, a pre-configured decoupled Drupal distribution.
  • Waterwheel, an emerging ecosystem of software development kits (SDKs) built by the Drupal community.
  • JSON API, an API that allows consumers to request exactly the data they need, rather than being limited to pre-configured REST endpoints.
  • GraphQL, another API that allows consumers to request only the data they want while combining multiple round-trip requests into one.

There’s lots of activity in headless CMSes. But the competitors are proprietary. Drupal and WordPress are the only end-to-end open source contenders. The others only open source the SDKs.

Highlights of the summit

The summit included several speakers, a business panel, and some demonstrations of decoupled applications. Participants brought up lots of interesting questions and observations. I jotted down several quotes, but it wasn't always possible to give attribution with such an open discussion, so my apologies in advance. Some general reflections from my notes:

Why decouple?

  • More and more sites are delivering content to multiple consumers, mobile apps, TV, etc. In this situation, the website can become just another consumer of the data.
  • It’s easier to find generalist JavaScript developers than expert Drupal developers. Decoupling is one way to ensure the front-end team doesn't have to know anything about Drupal.
  • If you have large teams, a decoupled site allows you to have a clean separation of duties, so the front and back end can work rapidly in parallel to build the site.
  • A modern JavaScript front-end can be fast—although several participants pointed out that a decoupled site is not automatically faster. You still need to pay attention to performance issues.
  • Content is expensive to create; decoupling is a way to re-use it, not just across platforms, but also from redesign to redesign.
  • You could launch a brand new design without making any changes to the back end, assuming you have a well-designed API (meaning an API that doesn't include any assumptions about what the front end looks like). As one participant said, “One day, React won't be cool anymore, we'll need to be ready for the next big thing.”

What are some of the complications?

  • It often or always costs more to decouple than to build a traditional site. There’s additional infrastructure, the need to create new solutions for things that traditional Drupal already does, and the fact that we’re still as a community figuring out the best practices.
  • If you only need a website, decoupling is a convoluted way to accomplish it. Decoupling makes sense when you are building an API to serve multiple consumers.
  • You don’t have to decouple to support other applications. Drupal can be a full-featured website, and also the source of APIs.
  • Some tasks are particularly tricky in a decoupled environment, like previewing content before publishing it. Although some participants pointed out that in a truly decoupled environment preview makes no sense anyway. “We have a bias that a node is a page, but that’s not true in a decoupled context. There is no concept of a page on a smartphone. Preview is complicated because of that.”
  • Many businesses have page-centric assumptions embedded deep into their content and processes. It might be difficult to shift to a model where editors create content that might be deployed in many different combinations and environments. One participant discussed a client that "used all the decoupled technology at their disposal to build a highly coupled CMS." On the other hand, some clients are pure Drupal top to bottom, but they have a good content model and are effectively already "decoupling" their content from its eventual display.
  • Another quote, “Clients trying to unify multiple properties have a special problem; they have to swallow that there will have to be a unified content model in order to decouple. Otherwise, you're building numerous decoupled systems.”
  • Once you are decoupled, you may not even know who is consuming the APIs or how they're being used. If you make changes, you may break things outside of your website. You need to be aware of the dependency you created by serving an API.

Speakers and Panelists

The following is a list of speakers and panelists. These are people and companies you could talk to if you have more questions about decoupling:

  • Sally Young (Lullabot)
  • Matt Davis (Mediacurrent)
  • Jeff Eaton (Lullabot)
  • Preston So (Acquia)
  • Matt Grill (Acquia)
  • Daniel Wehner (TES)
  • Wes Ruvalcaba (Lullabot)
  • Mateu Aguiló Bosch (Lullabot)
  • Suzi Arnold (Comcast)
  • Jason Oscar (Comcast)
  • Jeremy Dickens (Weather.com)
  • Nichole Davison (Edutopia)
  • Baddy Breidert (1xinternet)
  • Christoph Breidert (1xinternet)
  • Patrick Coffey (Four Kitchens)
  • Greg Amaroso (Softvision)
  • Eric Hestenes (Edutopia)
  • David Hwang (DocuSign)
  • Shellie Hutchens (Mediacurrent)
  • Karen Stevenson (Lullabot)

Summary

It was a worthwhile summit, I learned a lot, and I imagine others did as well. Several people mentioned that Decoupled Drupal Days will be taking place August 17-19, 2018 in New York City (there is a link to last year's event). The organizers say it will be “brutally honest, not a cheerleading session.” And they’re also looking for sponsors. I’d highly recommend marking those days on your calendar if you’re interested in this topic!

Nov 08 2017
Nov 08

Drupal 8 ships with a built-in WYSIWG editor called CKEditor. It’s great to have it included in core, but I had some questions about how to control the styling. In particular, I wanted the styling in the editor to look like my front-end theme, even though I use an administration theme for the node form. I spent many hours trying to find the answer, but it turned out to be simple if a little confusing.

In my example, I have a front-end theme called “Custom Theme” that extends the Bootstrap theme. I use core’s “Seven” theme as an administration theme, and I checked the box to use the administration theme for my node forms. 

My front-end theme adds custom fonts to Bootstrap and uses a larger than normal font, so it’s distinctly different from the standard styling that comes with the WYSIWYG editor. 

Front End Styling

Front end styling

WYSIWYG Styling

Out of the box, the styling in the editor looks very different than my front-end theme. The font family and line height are wrong, and the font size is too small.

Back end, before the changes

It turns out there are two ways to alter the styling in the WYSIWYG editor: adding some information to the default theme’s info.yml file, or implementing hook_ckeditor_css_alter() in a custom module. The kicker is that the info.yml changes go in the FRONT END theme, even though I’m using an admin theme on the node form.

I added the following information to my default theme info file, custom_theme.info.yml. The font-family.css and style.css files are the front-end theme CSS files that I want to pass into the WYSIWYG editor. Even if I select the option to use the front-end theme for the node form, the CSS from that theme will not make it into the WYSIWYG editor without making this change, so this is necessary whether or not you use an admin theme on the node form!  

name: "Custom Theme"
description: A subtheme of Bootstrap theme for Drupal 8.
type: theme
core: 8.x
base theme: bootstrap
ckeditor_stylesheets:
  - https://fonts.googleapis.com/css?family=Open+Sans
  - css/font-family.css
  - css/style.css
libraries:
  ...

WYSIWYG Styling

After this change, the font styles in the WYSIWYG editor match the text in the primary theme.

Back end, after the change

When CKEditor builds the editor iframe, it checks to see which theme is the default theme, then looks to see if that theme has values in the info.yml file for ckeditor_stylesheets. If it finds anything, it adds those CSS files to the iframe. Relative CSS file URLs are assumed to be files in the front-end theme’s directory, or you can use absolute URLs to other files.

The contributed Bootstrap theme does not implement ckeditor_stylesheets, so I had to create a sub-theme to take advantage of this. I always create a sub-theme anyway, to add in the little tweaks I want to make. In this case, my sub-theme also uses a Google font instead of the default font, and I can also pass that font into the WYSIWYG editor.
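
If you'd rather keep this out of the theme, the other route mentioned above is hook_ckeditor_css_alter(), which is invoked from the buildContentsCssJSSetting() method shown under More Information below. A minimal sketch in a hypothetical custom module (the module name and file paths are assumptions):

use Drupal\editor\Entity\Editor;

/**
 * Implements hook_ckeditor_css_alter().
 *
 * Adds the front-end theme's CSS files to the CKEditor iframe.
 */
function mymodule_ckeditor_css_alter(array &$css, Editor $editor) {
  $theme_path = drupal_get_path('theme', 'custom_theme');
  $css[] = $theme_path . '/css/font-family.css';
  $css[] = $theme_path . '/css/style.css';
}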

TaDa!

That was easy to do, but it took me quite a while to understand how it worked. So I decided to post it here in case anyone else is as confused as I was.

More Information

To debug this further and understand how to impact the styling inside the WYSIWYG editor, you can refer to the relevant code from two files in core, ckeditor.module:  

/**
 * Retrieves the default theme's CKEditor stylesheets.
 *
 * Themes may specify iframe-specific CSS files for use with CKEditor by
 * including a "ckeditor_stylesheets" key in their .info.yml file.
 *
 * @code
 * ckeditor_stylesheets:
 *   - css/ckeditor-iframe.css
 * @endcode
 */
function _ckeditor_theme_css($theme = NULL) {
  $css = [];
  if (!isset($theme)) {
    $theme = \Drupal::config('system.theme')->get('default');
  }
  if (isset($theme) && $theme_path = drupal_get_path('theme', $theme)) {
    $info = system_get_info('theme', $theme);
    if (isset($info['ckeditor_stylesheets'])) {
      $css = $info['ckeditor_stylesheets'];
      foreach ($css as $key => $url) {
        if (UrlHelper::isExternal($url)) {
          $css[$key] = $url;
        }
        else {
          $css[$key] = $theme_path . '/' . $url;
        }
      }
    }
    if (isset($info['base theme'])) {
      $css = array_merge(_ckeditor_theme_css($info['base theme']), $css);
    }
  }
  return $css;
}

and Plugin/Editor/CKEditor.php:  

 /**
   * Builds the "contentsCss" configuration part of the CKEditor JS settings.
   *
   * @see getJSSettings()
   *
   * @param \Drupal\editor\Entity\Editor $editor
   *   A configured text editor object.
   * @return array
   *   An array containing the "contentsCss" configuration.
   */
  public function buildContentsCssJSSetting(Editor $editor) {
    $css = [
      drupal_get_path('module', 'ckeditor') . '/css/ckeditor-iframe.css',
      drupal_get_path('module', 'system') . '/css/components/align.module.css',
    ];
    $this->moduleHandler->alter('ckeditor_css', $css, $editor);
    // Get a list of all enabled plugins' iframe instance CSS files.
    $plugins_css = array_reduce($this->ckeditorPluginManager->getCssFiles($editor), function($result, $item) {
      return array_merge($result, array_values($item));
    }, []);
    $css = array_merge($css, $plugins_css);
    $css = array_merge($css, _ckeditor_theme_css());
    $css = array_map('file_create_url', $css);
    $css = array_map('file_url_transform_relative', $css);
    return array_values($css);
  }

Aug 01 2017
Aug 01

TL;DR:

  • Structured data has become an important component of search engine optimization (SEO).
  • Schema.org has become the standard vocabulary for providing machines with an understanding of digital data.
  • Google prefers Schema.org data as JSON LD over the older methods using RDFa and microdata. Also, JSON LD might be a better solution for decoupled sites.
  • Google provides tools to validate structured data to ensure you’re creating the right results.
  • You can use the Schema.org Metatag module to add Schema.org structured data as JSON LD in Drupal and validate it using Google’s tools.

Why does structured data matter to SEO?

Humans can read a web page and understand who the author and publisher are, when it was posted, and what it is about. But machines, like search engine robots, can’t tell any of that automatically or easily. Structured data is a way to provide a summary, or TL;DR (Too long; didn't read), for machines, to ensure they accurately categorize the data that is being represented. Because structured data helps robots do their job, it should be a huge factor in improving SEO.

Google has a Structured Data Testing Tool that can provide a preview of what a page marked up with structured data will look like in search results. These enhanced results can make your page stand out, or at least ensure that the search results accurately represent the page. Pages that have AMP alternatives, as this example does, get extra benefits, but even non-AMP pages with structured data receive enhanced treatment in search results.

Structured data code example

Who is Schema.org and why should we care?

Schema.org has become the de-facto standard vocabulary for tagging digital data for machines. It’s used and recognized by Google and most or all of the other search engines.

If you go to the main Schema.org listing page, you’ll see a comprehensive list of all the types of objects that can be described, including articles, videos, recipes, events, people, organizations, and much much more. Schema.org uses an inheritance system for these object types. The basic type of object is a Thing, which is then subdivided into several top-level types of objects:

  • Thing
    • Action
    • CreativeWork
    • Event
    • Intangible
    • Organization
    • Person
    • Place
    • Product

These top-level Things are then further broken down. For example, a CreativeWork can be an Article, Book, Recipe, Review, WebPage, to name just a few options, and an Article can further be identified as a NewsArticle, TechArticle, or SocialMediaPosting.

Each of these object types has its own properties, like ‘name,’ ‘description,’ and ‘image,’ and each inherits the properties of its parents while adding additional properties of its own. For instance, a NewsArticle inherits properties from its parents, which are Thing, CreativeWork, and Article: it inherits ‘author’ and ‘description’ from them and adds a ‘dateline’ property that its parents don’t have.

NewsArticle Schema.org specification

Some properties are simple key/value pairs, like description. Other properties are more complex, such as references to other objects. So a CreativeWork object may have a publisher property, which is a reference to a Person or Organization object.

Further complicating matters, an individual web page might be home to multiple Schema.org objects, related or unrelated. A page might have an article and also a video. There could be other elements on the page that are not part of the article itself, like a breadcrumb or event information. Structured data can include as many objects as necessary to describe the page.

Because there’s no limit to the number of objects that might be described, there's also a property mainEntityOfPage, which can be used to indicate which of these objects is the primary object on the page.

What are JSON LD, RDFa, and Microdata, where do they go, and which is better?

Once you decide what Schema.org objects and properties you want to use, you have choices about how to represent them on a web page. There are three primary methods: JSON LD, RDFa, and Microdata.

RDFa and Microdata use slightly different methods of accomplishing the same end. They wrap individual items in the page markup with identifying information.

JSON LD takes a different approach. It creates a JSON array with all the Schema.org information and places that in the head of the page. The markup around the actual content of the page is left alone.

Schema.org includes examples of each method. For instance, here’s how the author of an article would be represented in each circumstance:

RDFa

<div vocab="http://schema.org/" typeof="Article">
 <h2 property="name">How to Tie a Reef Knot</h2>
 by <span property="author">John Doe</span>
 The article text.
</div>

Microdata

<div itemscope itemtype="http://schema.org/Article">
 <h2 itemprop="name">How to Tie a Reef Knot</h2>
 by <span itemprop="author">John Doe</span>
 The article text.
</div>

JSON LD

<script type="application/ld+json">
{
 "@context": "http://schema.org",
 "@type": "Article",
 "author": "John Doe",
 "name": "How to Tie a Reef Knot",
 "description": "The article text."
}
</script>

Which is better?

There are advantages and disadvantages to each of these. RDFa and Microdata add some complexity to the page markup and are a little less human-readable, but they avoid data duplication and keep the item's properties close to the item.

JSON LD is much more human-readable, but results in data duplication, since values already displayed in the page are repeated in the JSON LD array.

All of these are valid, and none is really “better” than the other. That said, there is some indication that Google may prefer JSON LD. JSON LD is the only method that validates for AMP pages, and Google indicates a preference for it in its guide to structured data.

From the standpoint of Drupal’s theme engine, the JSON LD method would be the easiest to implement, since there’s no need to inject changes into all the individual markup elements of the page. It also might be a better solution for decoupled sites, since you could theoretically use Drupal to create a JSON LD array that is not directly tied to Drupal’s theme engine, then add it to the page using a front-end framework.
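
As a rough sketch of what that could look like on the Drupal side (this is just an illustration of attaching JSON LD to the page head from a custom module, not how the Schema.org Metatag module described below does it; the module name and hard-coded values are assumptions):

/**
 * Implements hook_page_attachments().
 *
 * Adds a Schema.org JSON LD script element to the <head> of every page.
 */
function mymodule_page_attachments(array &$attachments) {
  $data = [
    '@context' => 'http://schema.org',
    '@type' => 'Organization',
    'name' => 'Example.com',
    'url' => 'https://www.example.com/',
  ];
  $attachments['#attached']['html_head'][] = [
    [
      '#type' => 'html_tag',
      '#tag' => 'script',
      '#attributes' => ['type' => 'application/ld+json'],
      '#value' => json_encode($data, JSON_UNESCAPED_SLASHES),
    ],
    'mymodule_schema_org',
  ];
}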

What about properties that reference other objects?

As noted above, many properties in structured data are references to other objects. A WebPage has a publisher, which is either an Organization or a Person.

There are several ways to configure those references. You can indicate the author of a CreativeWork either by using a shortcut, the string name or URL of the author, or by embedding a Person or Organization object. That embedded object could include more information about the author than just the name, such as a URL to an image of the person or a web page about them. In the following example, you can see several embedded references: image, author, and publisher.

<script type="application/ld+json">{
    "@context": "http://schema.org",
    "@graph": [
         {
            "@type": "Article",
            "description": "Example description.",
            "image": {
                "@type": "ImageObject",
                "url": "https://www.example.com/582753085.jpg",
                "width": "2408",
                "height": "1600"
            },
            "headline": "Example Title",
            "author": {
                "@type": "Person",
                "name": "Example Person",
                "sameAs": [
                    "https://www.example-person.com"
                ]
            },
            "dateModified": "2017-06-03T21:38:02-0500",
            "datePublished": "2017-03-03T19:14:50-0600",
            "publisher": {
                "@type": "Organization",
                "name": "Example.com",
                "url": "https://www.example.com//",
                "logo": {
                    "@type": "ImageObject",
                    "url": "https://www.example.com/logo.png",
                    "width": "600",
                    "height": "60"
                }
            }
        }
    ]
}</script>

JSON LD provides a third way to reference other objects, called Node Identifiers. An identifier is a globally unique identifier, usually an authoritative or canonical URL. In JSON LD, these identifiers are represented using @id. In the case of the publisher of a web site, you would provide structured data about the publisher that includes the @id property for that Organization. Then instead of repeating the publisher data over and over when referencing that publisher elsewhere, you could just provide the @id property that points back to the publisher record. Using @id, the above JSON LD might look like this instead:

<script type="application/ld+json">{
    "@context": "http://schema.org",
    "@graph": [
         {
            "@type": "Article",
            "description": "Example description.",
            "image": {
                "@type": "ImageObject",
                "@id": "https://www.example.com/582753085.jpg"
            },
            "headline": "Example Title",
            "author": {
                "@type": "Person",
                "@id": "https://www.example-person.com"
            },
            "dateModified": "2017-06-03T21:38:02-0500",
            "datePublished": "2017-03-03T19:14:50-0600",
            "publisher": {
                "@type": "Organization",
                "@id": "https://www.example.com//"
             }
        }
    ]
}</script>

How can we be sure that Google understands our structured data?

Once you’ve gone to the work of marking up your pages with structured data, you’ll want to be sure that Google and other search engines understand it the way you intended. Google has created a handy tool to validate structured markup. You can either paste the URL of a web page or the markup you want to evaluate into the tool. The second option is handy if you’re working on changes that aren't yet public.

Once you paste your code into the tool, Google provides its interpretation of your structured data. You can see each object, what type of object it is, and all its properties.

If you’re linking to a live page rather than just providing a snippet of code, you will also see a ‘Preview’ button you can click to see what your page will look like in search results. The image at the top of this article is an example of that preview.

Schema.org doesn’t require specific properties to be provided for structured data, but Google treats some properties as “required” or “recommended.” If required properties are missing, validation will fail.

You can see what Google expects on different types of objects. Click into the links for each type of content to see what properties Google is looking for.

Structured data testing tool

How and where can we add structured data to Drupal?

The next logical question is what modules are available to accomplish the task of rendering structured data on the page in Drupal 8. Especially tricky is doing it in a way that is extensible enough to support that gigantic list of possible objects and properties instead of being limited to a simple subset of common properties.

Because of the complexity of the standards and the flexibility of Drupal’s entity type and field system, there is no one-size-fits-all solution for Drupal that will automatically map Schema.org properties to every kind of Drupal data.

The RDFa module is included in core and seems like a logical first step. Unfortunately, the core solution doesn’t provide everything needed to create content that fully validates. It marks up some common properties on the page but has no way to indicate what type of object a page represents. Is it an Article? Person? Organization? Event? There is no way to flag that. And there is no way to support anything other than a few simple properties without writing code.

There is a Google Summer of Code project called RDF UI. It adds a way to link a content type to a Schema.org object type and to link fields to Schema.org properties. Though the module pulls the whole list of possible values from Schema.org, some linkages aren’t possible; for instance, there is no way to identify the title or creation date as anything other than the standard values. I tried it out, but content created using this module didn’t validate for me in Google’s tool. The module is very interesting, and it is a great starting point, but it still creates RDFa rather than JSON LD.

The architecture of the Schema.org Metatag module.

After looking for an existing solution for Drupal 8, I concluded there wasn’t a simple, valid, extensible solution available to create JSON LD, so I created a module to do it, Schema.org Metatag.

Most of the heavy lifting of Schema.org Metatag comes from the Metatag module. The Metatag module manages the mapping and storage of data, allowing you to either input hard-coded values or use tokens to define patterns that describe where the data originates. It also has a robust system of overrides, so you can define global patterns, then override some of them at the entity type level, at the individual content type level, or even per individual item, if necessary. There is no reason not to build on that framework, and any site that cares about SEO is probably already using the Metatag module. I considered it an ideal starting point for the Schema.org Metatag module.

The Schema.org Metatag module creates Metatag groups for each Schema.org object type and Metatag tags for the Schema.org properties that belong to that object.

The base classes created by the Schema.org Metatag module add a flag to groups and tags that can be used to identify those that belong to Schema.org, so they can be pulled out of the array that would otherwise be rendered as metatags, to be displayed as JSON LD instead.

Some Schema.org properties need more than the simple key/value pairs that Metatag provides, and this module creates a framework for creating complex arrays of values for properties like the Person/Organization relationship. These complex arrays are serialized down into the simple strings that Metatag expects and are unserialized when necessary to render the form elements or create the JSON LD array.
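
As a rough illustration of that round trip (this is not the module’s actual code, and the values are placeholders), a nested Schema.org value can be flattened into a string for storage and expanded again when the JSON LD array is assembled:

// Not the module's actual code, just the concept: a nested value is stored as
// a simple string, as Metatag expects, and expanded again when needed.
$publisher = [
  '@type' => 'Organization',
  'name' => 'Example.com',
  'url' => 'https://www.example.com/',
];
$stored = serialize($publisher);     // flattened for storage
$restored = unserialize($stored);    // expanded to build the form or the JSON LD array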

The primary goal was to make it easily and endlessly extensible. The initial module code focuses on the properties that Google notes as “Required” or “Recommended” for some basic object types. Other object types may be added in the future, but could also be added by other modules or in custom code. The module includes an example module as a model of how to add more properties to an existing type, and the existing modules provide examples of how to add other object types.

Also, there is a patch for the Metatag module to refactor it a bit so that a decoupled Drupal back end can share metatags with a front-end framework. Since this module is built on the Metatag model, that change could hopefully be leveraged to provide JSON LD to a decoupled front end as well.

This approach worked well enough in Drupal 8 that I am in the process of backporting it to Drupal 7 as well.

Enough talk, how do I get JSON LD on the page?

It’s helpful to understand how Schema.org objects and properties are intended to work, which is the reason for going into some detail about that here. It also helps to figure out ahead of time what values you expect to see in the finished output.

Start by scanning the Schema.org lists and Google’s requirements and recommendations to identify which objects and properties you want to define for the content on your site. If you’re doing this for SEO, spend some time reviewing Google's guide to structured data to see what interests Google. Not all content types are of interest to Google, and Google considers some properties to be essential while ignoring others.

Some likely scenarios are that you will have one or more types of Articles, each with images and relationships to the People that author them or the Organization that publishes them. You might have entity types that represent Events, or Organizations, or People or Places, or Products. Events might have connections to Organizations that sponsor them or People that perform in them. You should be able to create a map of the type of content you have and what kind of Schema.org object each represents.

Then install the Schema.org Metatag module and enable the sub-modules you need for the specific content types on your site. Use this module the same way you would use the Metatag module. If you understand how that works, you should find this relatively easy to do. See the detailed instructions for Metatag 8.x or Metatag 7.x. You can set up global default values using tokens, or override individual values on the node edit form.

In Conclusion

Providing JSON LD structured data on your website pages is bound to be good for SEO. But it takes a while to get comfortable with how structured data works and the somewhat confusing Schema.org standards, let alone Google’s unique set of requirements and recommendations.

No solution will automatically configure everything correctly out of the box, and you can’t avoid the need to know a little about structured data. Nevertheless, this article and the Schema.org Metatag module should enable you to generate valid JSON LD data on a Drupal site.

Jul 28 2017
Jul 28

Pantheon is rolling out free HTTPS to all the websites they host. This is great news for the Drupal community, since HTTPS is tremendously important, and Pantheon now provides an easy, free way to button up your Drupal site with HTTPS.

HTTPS is critical these days. I wrote a series of articles about HTTPS earlier this year, HTTPS Everywhere: Security is Not Just for Banks, HTTPS Everywhere: Quick Start With Cloudflare, and HTTPS Everywhere: Deep Dive Into Making the Switch.

As I noted in those articles, HTTPS is important for the privacy of your site users as well as the security of the site itself. And it's increasingly important in SEO.

Pantheon partnered with Fastly to deliver traffic across their edge cloud platform, and they integrated Let's Encrypt to provide HTTPS free to all sites on their platform. As a result, sites will run even faster, and content will be delivered even closer to users. As they say, "HTTPS on Pantheon is now automatic and free—forever."

All Pantheon sites are now automatically:

  • Distributed across 36 global points of presence (POPs)
  • Issued HTTPS certificates for free 
  • Getting an instant 2x boost in performance, at minimum

Pantheon provides details about how to take advantage of the HTTPS change. New sites will be set up on HTTPS automatically. Existing sites may need to make a small change to their DNS configuration and add some configuration to settings.php. Note that existing sites will be rolled out gradually. If you don't see a single "Domains/HTTPS" tab in your dashboard, your site hasn't been updated yet. Contact Pantheon, and they'll take care of it.

In my HTTPS series, I talked about the process of using a free Cloudflare account to add HTTPS to a Drupal site hosted on Pantheon. Any site set up as described in that article is ready for the Pantheon change and will not need changes to DNS or the settings.php file; those were configured correctly as part of setting the site up to use Cloudflare. The only change needed to switch from Cloudflare to Pantheon's CDN is to go to the "DNS" page on Cloudflare and toggle the orange cloud icon so it is gray instead of orange. That indicates that Cloudflare is no longer providing the proxy service, only DNS.

I tried this out on my own Pantheon site, which I had set up using Cloudflare's free SSL option. Since I had already made the necessary DNS changes as part of that setup, the switch to Pantheon's new CDN was seamless. I only had to contact them to tell them I was ready to switch, wait for the changes to propagate, and go to Cloudflare and toggle their CDN off; a few minutes later I could see that my site was serving HTTPS using the Pantheon certificate. After this change, Cloudflare still provides my DNS services, but not my SSL certificate.

I commend Pantheon for adding free HTTPS to their platform!

Mar 06 2017
Mar 06

HTTPS Everywhere: Deep Dive Into Making the Switch

In the previous articles, HTTPS Everywhere: Security is Not Just for Banks and HTTPS Everywhere: Quick Start With CloudFlare, I talked about why it’s important to serve even small websites using the secure HTTPS protocol, and provided a quick and easy how-to for sites where you don’t control the server. This article is going to provide a deep dive into SSL terminology and options. Even if you are offloading the work to a service like Cloudflare, it’s good to understand what’s going on behind the scenes. And if you have more control over the server you’ll need a basic understanding of what you need to accomplish and how to go about it.

At a high level, there are a few steps required to set up a website to be served securely over HTTPS:

  1. Decide what type of certificate to use.
  2. Install a signed certificate on the server.
  3. Configure the server to use SSL.
  4. Review your site for mixed content and other validation issues.
  5. Redirect all traffic to HTTPS.
  6. Monitor the certificate expiration date and renew the certificate before it expires.

Your options are dependent on the type of certificate you want and your level of control over the website. If you self-host, you have unlimited choices, but you’ll have to do the work yourself. If you are using a shared host service, you’ll have to see what SSL options your host offers and how they recommend setting it up. Another option is to set up SSL on a proxy service like the Cloudflare CDN, which stands between your website and the rest of the web.

I’m going to go through these steps in detail.

Decide Which Certificate to Use

Every distinct domain needs a certificate, so if you are serving content at www.example.com and blog.example.com, both domains need to be certified. Certificates are provided by a Certificate Authority (CA). There are numerous CAs that will sell you a certificate, including DigiCert, VeriSign, GlobalSign, and Comodo. There are also CAs that provide free SSL certificates, like Let's Encrypt.

Validation Levels

There are several certificate validation levels available.

Domain Validation (DV)

A DV certificate indicates that the applicant has control over the specified DNS domain. DV certificates do not assure that any particular legal entity is connected to the certificate, even if the domain name may imply that. The name of the organization will not appear next to the lock in the browser, since the controlling organization is not validated. DV certificates are relatively inexpensive, or even free. It’s a low level of authentication, but it provides assurance that the user is not on a spoofed copy of a legitimate site.

Organization Validation (OV)

OV certificates verify that the applicant is a legitimate business. Before issuing the SSL certificate, the CA performs a rigorous validation procedure, including checking the applicant's business credentials (such as the Articles of Incorporation) and verifying the accuracy of its physical and Web addresses.

Extended Validation (EV)

Extended Validation certificates are the newest type of certificate. They provide more validation than the OV level and adhere to industry-wide certification guidelines established by leading Web browser vendors and Certificate Authorities. To clarify the degree of validation, the name of the verified legal identity is displayed in the browser, in green, next to the lock. EV certificates are more expensive than DV or OV certificates because of the extra work they require from the CA. EV certificates convey more trust than the other alternatives, so they are appropriate for financial and commerce sites, but they are useful on any site where trust is important.

Certificate Types

In addition to the validation levels, there are several types of certificates available.

Single Domain Certificate

An individual certificate is issued for a single domain. It can be either DV, OV, or EV.

Wildcard Certificate

A wildcard certificate will automatically secure any sub-domains that a business adds in the future. They also reduce the number of certificates that need to be tracked. A wildcard domain would be something like *.example.com, which would include www.example.com, blog.example.com, help.example.com, etc. Wildcards work only with DV and OV certificates. EV certificates cannot be provided as wildcard certificates, since every domain must be specifically identified in an EV certificate.

Multi-Domain Subject Alternative Name (SAN)

A multi-domain SAN certificate secures multiple domain names on a single certificate. Unlike a wildcard certificate, the domain names can be totally unrelated. It can be used by services like Cloudflare that combine a number of domains into a single certificate. All domains are covered by the same certificate, so they have the same level of credentials. A SAN certificate is often used to provide multiple domains with DV level certification, but EV SAN certificates are also available.

Install a Signed Certificate

The process of installing an SSL certificate is initiated on the server where the website is hosted by creating a 2048-bit RSA public/private key pair, then generating a Certificate Signing Request (CSR). The CSR is a block of encoded text that contains information that will be included in the certificate, like the organization name and location, along with the server’s public key. The CA then uses the CSR and the public key to create a signed SSL certificate, or a certificate chain, which consists of multiple certificates where each certificate vouches for the next. This signed certificate or certificate chain is then installed on the original server. The public key is used to encrypt messages, and they can only be decrypted with the corresponding private key, making it possible for the user and the website to communicate privately with each other.

Obviously, this process only works if you have shell access or a control panel UI for the server. If your site is hosted by a third party, it will be up to the host to determine how, if at all, they allow their hosted sites to be served over HTTPS. Most major hosts offer HTTPS, but specific instructions and procedures vary from host to host.

As an alternative, there are services, like Cloudflare, that provide HTTPS for any site, no matter where it is hosted. I discussed this in more detail in my previous article, HTTPS Everywhere: Quick Start With CloudFlare.

Configure the Server to Use SSL

The next step is to make sure the website server is configured to use SSL. If a third party manages your servers, like a shared host or CDN, this is handled by the third party and you don’t need to do anything other than determine that it is being handled correctly. If you are managing your own server, you might find Mozilla's handy configuration generator and documentation about Server Side TLS useful.

One important consideration is that the server and its keys should be configured for PFS, short for Perfect Forward Secrecy (sometimes written Perfect Forward Security). Prior to the implementation of PFS, an attacker could record encrypted traffic over time and store it. If they got access to the private key later, they could then decrypt all that historic data. Security around the private key might be relaxed once the certificate expires, so this is a genuine issue. PFS ensures that even if the private key is disclosed later, it can’t be used to decrypt prior encrypted traffic. The Heartbleed bug is an example of why this matters; PFS would have prevented some of the damage it caused. If you’re using a third-party service for SSL, be sure it uses PFS. Cloudflare does, for instance.

Normally SSL certificates have a one-to-one relationship to the IP address of their domains. Server Name Indication (SNI) is an extension of TLS that provides a way to manage multiple certificates on the same IP address. SNI-compatible browsers (most modern browsers are SNI-compatible) can communicate with the server to retrieve the correct certificate for the domain they are trying to reach, which allows multiple HTTPS sites to be served from a single IP address.

Test the server’s configuration with Qualys' handy SSL Server Test. You can use this test even on servers you don’t control! It will run a battery of tests and give the server a security score for any HTTPS domain.

Review Your Site for HTTPS Problems

Once a certificate has been installed, it’s time to scrutinize the site to be sure it is totally valid using HTTPS. This is one of the most important, and potentially time-consuming, steps in switching a site to HTTPS.

To review your site for HTTPS validation, visit it by switching the HTTP in the address to HTTPS and scan the page source. Do this after a certificate has been installed; otherwise, the validation error from the lack of a certificate may prevent other validation errors from even appearing.

A common problem that prevents validation is the problem of mixed content, or content that mixes HTTP and HTTPS resources on the page. A valid HTTPS page should not include any HTTP resources. For instance, all JavaScript files and images should be pulled from HTTPS sources. Watch canonical URLs and link meta tags, as they should use the same HTTPS protocol. This is something that can be fixed even before switching the site to HTTPS, since HTTP pages can use HTTPS resources without any problem, just not the reverse.
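
If you want a rough first pass at finding these references before you switch, a quick PHP snippet like the following can help. It is only a sketch: it catches src attributes (images, scripts, iframes) on a single page, the URL is a placeholder, and stylesheets, canonical links, and content added by JavaScript still need a manual review.

// A rough first pass, not a complete checker: flag src attributes that still
// point at http:// resources on one page. The URL is a placeholder.
$html = file_get_contents('https://www.example.com/');
preg_match_all('/src\s*=\s*["\']http:\/\/[^"\']+["\']/i', $html, $matches);
foreach ($matches[0] as $match) {
  print $match . "\n";
}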

There used to be a recommendation to use protocol-relative links, such as //example.com instead of http://example.com, but now the recommendation is to always use HTTPS if available, since an HTTPS resource works fine under either protocol.

Absolute internal links should not mix HTTP and HTTPS references. Ideally, all internal links should be relative links anyway, so they will work correctly under either HTTP or HTTPS. There are lots of other benefits to relative links, and few reasons not to use them.

For the most part, stock Drupal websites already use relative links wherever possible. In Drupal, some common sources of mixed content problems include:

  • Hard-coded HTTP links in custom block content.
  • Hard-coded HTTP links added by content authors in body, text, and link fields.
  • Hard-coded HTTP links in custom menu links.
  • Hard-coded HTTP links in templates and template functions.
  • Contributed modules that hard-code HTTP links in templates or theme functions.

Most browsers will display HTTPS errors in the JavaScript console. That’s the first place to look if the page isn’t validating as HTTPS. Google has an example page with mixed content errors where you can see how this looks.

Mixed content illustration

Redirect all Traffic to HTTPS

Once you’ve assured yourself that your website passes SSL validation, it’s time to make sure that all traffic goes over HTTPS instead of HTTP. You need 301 redirects from your HTTP pages to their HTTPS equivalents. If a website was already in production on HTTP, search engines have already indexed your pages, and the 301 redirect ensures that search engines understand the new pages are a replacement for the old pages.

If you haven’t already, you need to determine whether you prefer the bare domain or the www version, example.com vs www.example.com. You should already be redirecting traffic away from one to the other for good SEO. When you include the HTTP and HTTPS protocols, at a minimum you will have four potential addresses to consider: http://example.com, http://www.example.com, https://example.com, and https://www.example.com. One of those should survive as your preferred address. You’ll need to set up redirects to reroute traffic away from all the others to that preferred location.

Specific details about how to handle redirects on the website server will vary depending on the operating system and configuration on the server. Shared hosts like Acquia Cloud and Pantheon provide detailed HTTPS redirection instructions that work on their specific configurations. Those instructions could provide useful clues to someone configuring a self-hosted website server as well.
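
If you end up handling redirects at the application level instead, the logic is roughly the following. This is only a generic sketch, not any host’s recommended configuration: www.example.com is a placeholder for your preferred address, and the HTTPS detection may need to change if your server sits behind a proxy or CDN.

// Generic sketch only: send a 301 for any request that is not already on the
// preferred HTTPS host. www.example.com is a placeholder, and HTTPS detection
// may differ behind a proxy or CDN.
$is_https = !empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off';
if (!$is_https || $_SERVER['HTTP_HOST'] !== 'www.example.com') {
  header('HTTP/1.1 301 Moved Permanently');
  header('Location: https://www.example.com' . $_SERVER['REQUEST_URI']);
  exit();
}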

HTTP Strict Transport Security (HSTS)

The final level of assurance that all traffic uses HTTPS is to implement the HTTP Strict Transport Security (HSTS) header on the secured site. The HSTS header creates a browser policy to always use HTTPS for the specified domain. Redirects are good, but there is still the potential for a Man-in-the-Middle to intercept the HTTP communication before it gets redirected to HTTPS. With HSTS, after the first communication with a domain, the browser will always initiate communication over HTTPS. The HSTS header contains a max-age that determines when the policy expires, but the max-age is reset every time the user visits the domain, so the policy will never expire for a user who visits the site regularly, only for one who fails to visit within the max-age period.
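
If you manage your own server, the header itself is simple to emit. Here is a minimal sketch in PHP, assuming the site is already fully on HTTPS; the one-year max-age is a common starting point rather than a requirement, and includeSubDomains should only be added once every sub-domain supports HTTPS.

// Minimal sketch: send the HSTS policy only on HTTPS responses. The values
// shown are common choices, not requirements.
if (!empty($_SERVER['HTTPS']) && $_SERVER['HTTPS'] !== 'off') {
  header('Strict-Transport-Security: max-age=31536000; includeSubDomains');
}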

If you’re using Cloudflare’s SSL, as in my previous article, you can set the HSTS header in Cloudflare’s dashboard. It’s a configuration setting under the “Crypto” tab.

Local, Dev, and Stage Environments

A final consideration is whether or not to use HTTPS on all environments, including local, dev, and stage environments. That is truly HTTPS everywhere! If the live site uses HTTPS, it makes sense to use HTTPS in all environments for consistency.

HTTPS Is Important

Hopefully, this series of articles provides convincing evidence that it's important for sites of all sizes to start using the HTTPS protocol, and some ideas of how to make that happen. HTTPS Everywhere is a worthy initiative!

Jan 03 2017
Jan 03

A Github Pages Site

I started with the simplest possible example. I have a website hosted by a free, shared hosting service, Github Pages, which doesn’t directly provide SSL for custom domains. I have no shell access to the server, and I just wanted to get my site switched to HTTPS as easily and inexpensively as possible. I used an example from the Cloudflare blog about how to use Cloudflare SSL for a Github Pages site.

Services like Cloudflare can provide HTTPS for any site, no matter where it is hosted. Cloudflare is a Content Delivery Network (CDN) that stands in front of your website to catch traffic before it gets to your origin website server. A CDN provides caching and efficient delivery of resources, but Cloudflare also provides SSL certificates, and their free account option can add any domain to an existing SSL certificate at no charge. With this alternative there is no need to purchase an individual certificate, nor to figure out how to get it uploaded and signed. Everything is managed by Cloudflare. The downside of this option is that the certificate will be shared with numerous other unrelated domains. Cloudflare has higher-tier accounts with more options for the SSL certificates, if that’s important. But the free option is an easy and inexpensive way to get basic HTTPS on any site.

It’s important to note that adding another server to your architecture means that content makes another hop between servers. Now, instead of content going directly from your origin website server to the user, it goes from the origin website server to Cloudflare to the user. The default Cloudflare SSL configuration will encrypt traffic between end users and the Cloudflare server (front-end traffic), but not between Cloudflare and your origin website server (back-end traffic). They point out in their documentation that back-end traffic is much harder to intercept, so that might be an acceptable risk for some sites. But for true security you want back-end traffic encrypted as well. If your origin website server has any kind of SSL certificate on it, even a self-signed certificate, and is configured to manage HTTPS traffic, Cloudflare can encrypt the back-end traffic as well with the “Full SSL” option. If the web server has an SSL certificate that is valid for your specific domain, Cloudflare can provide even better security with the “Full SSL (strict)” option. Cloudflare can also provide you with an SSL certificate that you can manually add to your origin server to support Full SSL, if you need that.

The following screenshot illustrates the Cloudflare security options.

SSL types

Step 1. Add a new site to Cloudflare

I went to Cloudflare, clicked the button to add a site, typed in the domain name, and waited for Cloudflare to scan for the DNS information (that took a few minutes). Eventually a green button appeared that said ‘Continue Setup’.

Cloudflare add screen

Step 2. Review DNS records

Next, Cloudflare displayed all the existing DNS records for my domain.

Network Solutions is my registrar (the place where I bought and manage my domain). Network Solutions was also my DNS provider (nameserver) where I set up the DNS records that indicate which IP addresses and aliases to use for my domain. Network Solutions will continue to be my registrar, but this switch will make Cloudflare my DNS provider, and I’ll manage my DNS records on Cloudflare after this change.

I opened up the domain management screen on Network Solutions and confirmed that the DNS information Cloudflare had discovered was a match for the information in my original DNS management screen. I will be able to add and delete DNS records in Cloudflare from this point forward, but for purposes of making the switch to Cloudflare I initially left everything alone.

DNS configuration screen

Step 3. Move the DNS to Cloudflare

Next, Cloudflare prompted me to choose a plan for this site. I chose the free plan option. I can change that later if I need to. Then I got a screen telling me to switch nameservers in my original DNS provider.

Nameserver screen

On my registrar, Network Solutions, I had to go through a couple of screens, opting to “Change where domain points,” then “Domain Name Server,” and then “point domain to another hosting provider.” That finally got me to a screen where I could input the new nameservers for my domain name.

Network Solutions screen

Back on Cloudflare, I saw a screen like the following, telling me that the change was in progress. There was nothing to do for a while; I just needed to allow the change to propagate across the internet. The Cloudflare documentation assured me that the change should be seamless to end users, and that seemed logical since nothing had really changed so far except the switch in nameservers.

Cloudflare confirmation screen

Several hours later, once the status changed from Pending to Active, I was able to continue the setup and configure the SSL security level. There were three possible levels. The Flexible level was the default; it encrypts traffic between my users and Cloudflare, but not between Cloudflare and my site’s server. Further security is only possible if there is an SSL certificate on the origin website server as well as on Cloudflare. Github Pages has an SSL certificate on the server, since they provide HTTPS for non-custom domains. I selected the Crypto tab in Cloudflare to choose the SSL security level I wanted and changed the security level to Full.

Cloudflare crypto screen

Step 4. Confirm that HTTPS is Working Correctly

What I had accomplished at this point was to make it possible to access my site using HTTPS with the original HTTP addresses still working as before.

Next, it was time to check that HTTPS was working correctly. I visited the production site, and manually changed the address in my browser from HTTP://www.example.com to HTTPS://www.example.com. I checked the following things:

  • I confirmed there was a green lock displayed by the browser.
  • I clicked the green lock to view the security certificate details (see my previous article for a screenshot of what the certificate looks like), and confirmed it was displaying a security certificate from Cloudflare, and that it included my site’s name in its list of domains.
  • I checked the JavaScript console to be sure no mixed content errors were showing up. Mixed content occurs when you are still linking to HTTP resources on an HTTPS page, since that invalidates your security. I’ll discuss in more detail how to review a site for mixed content and other validation errors in the next article in this series.

Step 5. Set up Automatic Redirection to HTTPS

Once I was sure the HTTPS version of my site was working correctly, I could set up Cloudflare to handle automatic redirection to HTTPS, so my end users would automatically go to HTTPS instead of HTTP.

Cloudflare controls this with something it calls “Page Rules,” which are basically the kinds of logic you might ordinarily add to an .htaccess file. I selected the “Page Rules” tab and created a page rule that any HTTP address for this domain should always be switched to HTTPS.

Page rule screen

Since I also want to standardize on www.example.com instead of example.com, I added another page rule to redirect traffic from HTTPS://example.com to HTTPS://www.example.com using a 301 redirect.

WWW page rules screen

Finally, I tested the site again to be sure that any attempt to access HTTP redirected to HTTPS, and that attempts to access the bare domain redirected to the www sub-domain.

A Drupal Site Hosted on Pantheon

I also have several Drupal sites that are hosted on Pantheon and wanted to switch them to HTTPS, as well. Pantheon has instructions for installing individual SSL certificates for Professional accounts and above, but they also suggest an option of using the free Cloudflare account for any Pantheon account, including Personal accounts. Since most of my Pantheon accounts are small Personal accounts, I decided to set them up on Cloudflare as well.

The setup on Cloudflare for my Pantheon sites was basically the same as the setup for my Github Pages site. The only real difference was that the Pantheon documentation noted that I could make changes to settings.php that would do the same things that were addressed by Cloudflare’s page rules. Changes made in the Drupal settings.php file would work not just for traffic that hits Cloudflare, but also for traffic that happens to hit the origin server directly. Pantheon’s documentation notes that you don’t need to provide both Cloudflare page rules and Drupal settings.php configuration for redirects. You probably want to settle on one or the other to reduce future confusion. However, either, or both, will work.

These settings.php changes might also be adapted for Drupal sites not hosted on Pantheon, so I am copying them below.

// From https://pantheon.io/docs/guides/cloudflare-enable-https/#drupal
// Set the $base_url parameter to HTTPS:
if (defined('PANTHEON_ENVIRONMENT')) {
  if (PANTHEON_ENVIRONMENT == 'live') {
    $domain = 'www.example.com';
  }
  else {
    // Fallback value for development environments.
    $domain = $_SERVER['HTTP_HOST'];
  }
  # This global variable determines the base for all URLs in Drupal.
  $base_url = 'https://'. $domain;
}

// From https://pantheon.io/docs/redirects/#require-https-and-standardize-domain
//Redirect all traffic to HTTPS and WWW on live site:
if (isset($_SERVER['PANTHEON_ENVIRONMENT']) &&
  ($_SERVER['PANTHEON_ENVIRONMENT'] === 'live') &&
  (php_sapi_name() != "cli")) {
  if ($_SERVER['HTTP_HOST'] != 'www.example.com' ||
      !isset($_SERVER['HTTP_X_SSL']) ||
      $_SERVER['HTTP_X_SSL'] != 'ON' ) {
    header('HTTP/1.0 301 Moved Permanently');
    header('Location: https://www.example.com'. $_SERVER['REQUEST_URI']);
    exit();
  }
}

There was one final change I needed to make to my Pantheon sites that may or may not be necessary for other situations. My existing sites were configured with A records for the bare domain. That configuration uses Pantheon’s internal system for redirecting traffic from the bare domain to the www domain. But that redirection won’t work under SSL. Ordinarily you can’t use a CNAME record for the bare domain, but Cloudflare uses CNAME flattening to support a CNAME record for the bare domain. So once I switched DNS management to Cloudflare’s DNS service, I went to the DNS tab, deleted the original A record for the bare domain and replaced it with a CNAME record, then confirmed that the HTTPS bare domain properly redirected to the HTTPS www sub-domain.

Configuring a CNAME screen

Next, A Deep Dive

Now that I have basic SSL working on a few sites, it’s time to dig in and try to get a better understanding about HTTPS/SSL terminology and options and see what else I can do to secure my web sites. I’ll address that in my next article, HTTPS Everywhere: Deep Dive Into Making the Switch.

Dec 14 2016
Dec 14

HTTPS protects end users from eavesdroppers and other threats. Because of all the security ramifications of plain HTTP, Google is putting its considerable weight behind efforts to encourage websites to become more secure with an “HTTPS Everywhere” initiative:

HTTPS is also a requirement for some new interactive functionality, like taking pictures, recording audio, enabling offline app experiences, or geolocation, all of which require explicit user permissions. So, there are many reasons for website owners and users to pay attention to it.

What Does Insecurity Look Like?

As an experiment, to see exactly what level of security HTTPS gives the user, I visited two sites, one over HTTP and one over HTTPS. Our Senior Systems Administrator, Ben Chavet, acted as an eavesdropper. He wasn’t even sitting next to me; he was 800 miles away, watching my traffic over the VPN I was using. It took him just a few minutes to pick up what I was doing. What he did could have been done by someone in a coffee shop on a shared network, or by a “Man-in-the-Middle” somewhere between me and the sites I was accessing.

When I logged into the plain HTTP site, my “eavesdropper” could see everything I did, in plain text, including the full path I was visiting, along with my login name and password. He could even get my session cookie, which would allow him to impersonate me. Here’s a screen shot of some of the information he was able to view.

HTTP eavesdropping

But when I logged into a site protected by HTTPS, the only thing that was legible to my “eavesdropper” was the domain name of the site, and a couple of other bits of information from the security certificate as it was being processed. Everything else was encrypted. 

HTTPS eavesdropping

There are other problems with plain HTTP. An eavesdropper could steal session cookies to impersonate a legitimate user and gain access to information they shouldn’t be able to see. If an attacker has access to a plain HTTP page, they could change links on the page, perhaps to redirect a user to another site. And encrypting only the form submission (but not the page containing the form) doesn’t help, because an attacker can modify a form on an insecure page so it posts to a different URL. A valid HTTPS page is not vulnerable to these kinds of changes.

Clearly, HTTPS offers a huge security benefit!

What Does HTTPS Provide?

Let’s back up a bit. What exactly does HTTPS give us? It’s two things, really. First, it’s a way to ensure data integrity and make sure that traffic sent over the internet is encrypted. Secondly, it’s a system that provides authentication, meaning an assurance that the site a user is looking at is the site they think they are looking at.

In addition to obfuscating the user’s activity and data, HTTPS means the identity of the site is authenticated using a certificate which has been verified by a trusted third party.

If you get to a site using HTTPS instead of HTTP, you are accessing a site that purports to be secure. On an HTTPS connection, the browser you use (e.g., Internet Explorer, Safari, Chrome, or Firefox) and the site’s server will communicate with each other. The browser expects the server to provide a certificate of authenticity and a key the browser can use to encode and decode messages between the browser and the server. If the browser gets the information it requires from a secure site, it will display a safety lock in the address bar. If anything seems amiss, the browser will warn the user. Problems on an HTTPS page could be a missing, invalid, or expired certificate or key, or “mixed content” (HTTP content or resources that should never be included on an HTTPS page).

Identity, data integrity, and encryption are all important. A bogus site could still be encrypting its traffic, and a site that is totally legitimate might not be encrypting its traffic. A really secure site will both encrypt its traffic and also provide evidence that it is the site it purports to be.

How Do Users Know a Site is Secure?

Browsers provide messages for insecure sites. The specific messages vary from browser to browser, and depend on the situation, but might include text like “This page may not be secure.” or “The certificate is not trusted because it is self signed.” Most browsers display some color-coding that is expected to help convey the security status.  

If a site is rendered only over HTTP, browsers usually don’t indicate anything at all about the security of the site; they just display a plain URL without a lock. This provides no information, but also no assurance of any kind. And as noted above, unencrypted internet traffic over HTTP is still a potential security risk.

The following chart illustrates a range of possibilities for browser security status indicators (note that EV is a special type of HTTPS certificate that provides extra assurance, like for bank and financial sites, more about that later):

Security indicator comparisons

For more information about the HTTPS security, users can click on the lock icon. The specific details they see will vary from browser to browser, but generally, there is a link with text like “More details” or “View certificate” that will allow the user to see who owns the certificate and other details about it.

HTTPS lock details

Research about how well end users understand HTTPS security status and messages found that most users don’t understand and ultimately ignore security warnings. Users often miss the lock, or lack of a lock, and find the highly technical browser messages confusing. The focus on color to indicate security status is a problem for those who are color blind. Also, so many sites still use HTTP or are otherwise potentially insecure that it is easy for users to discount the risk and proceed regardless. The conclusion of all this research is that better systems need to be put in place to make it clear to users which sites are secure and which aren’t, and to encourage more sites to adhere to recommended security best practices.

A while ago, Chrome started making it easier for users to understand how secure a site is. These examples use a combination of color and shape to convey what’s secure and what isn’t. Currently, a plain HTTP site is flagged more noticeably as a security risk.

Current Chrome security warnings

Starting in January of 2017, they plan to add text saying ‘Secure’ or ‘Not secure’ for even more emphasis:

Future Chrome security warnings

Other browsers may follow suit to make plain HTTP look more noticeably insecure. Between the risk to user safety, the SEO hit, and the security warnings that may scare people away from sites using plain HTTP, no legitimate site can really afford to ignore the implications of not serving content over HTTPS.

What Do All the Terms Mean? 

HTTPS terminology is confusing. There is a lot of jargon and countless acronyms. If you read anything about HTTPS, you can quickly get lost in a sea of unfamiliar terminology. Here is a list of definitions to help make things more clear.

Secure Socket Layer (SSL)

SSL is the original protocol used to encrypt traffic sent over HTTP. It has actually been superseded by TLS, but the term is still used in a generic way to refer to either SSL or TLS.

Transport Layer Security (TLS)

TLS is the successor to SSL; it’s a newer, more stringent protocol. TLS is not just for web browsers and HTTP; it can also be used with non-HTTP applications, for instance, to provide secure email delivery. TLS is the layer where encryption takes place.

HTTPS

HTTPS is just a protocol that indicates that HTTP includes the extra layer of security provided by TLS.

Certificate Authority (CA)

A CA is an organization that provides and verifies HTTPS certificates. “Self-signed” certificates are not vouched for by any trusted third party, so there is no reliable indication of who they belong to. Certificates should be signed by a known third party.

Certificate Chain of Trust

There can be one or more intermediate certificates, creating a chain. This chain should take you from the current certificate all the way back to a trusted CA.

Domain Validation (DV)

A DV certificate indicates that the applicant has control over the specified DNS domain. DV certificates do not assure that any particular legal entity is connected to the certificate, even if the domain name may imply that. The name of the organization will not appear next to the lock since the controlling organization is not validated. DV certificates are relatively inexpensive, or even free. It’s a low level of authentication, but provides assurance that the user is not on a spoofed copy of a legitimate site.

Nov 29 2016
Nov 29

I ended up using the JSON API module, along with the REST modules in Drupal Core on the source site. On the target site, I used Migrate from  Drupal Core 8.2.3 along with Migrate Plus and Migrate Tools.

Why JSON API?

Drupal 8 Core ships with two ways to export JSON data. You can access data from any entity by appending ?_format=json to its path, but that means you have to know the path ahead of time, and you’d be pulling in one entity at a time, which is not efficient.

You could also use Views to create a JSON endpoint, but it might be difficult to configure it to include all the required data, especially all the data from related content, like images, authors, and related nodes. And you’d have to create a View for every possible collection of data that you want to make available. To further complicate things, there's an outstanding bug using GET with Views REST endpoints.

JSON API provides another solution. It puts the power in the hands of the data consumer. You don’t need to know the path of every individual entity, just the general path for an entity type and bundle. For example: /api/node/article. From that one path, the consumer can select exactly what they want to retrieve just by altering the URL. For example, you can sort and filter the articles, limit the fields that are returned to a subset, and bring along any or all related entities in the same query. Because of all that flexibility, that is the solution I decided to use for my example. (The Drupal community plans to add JSON API to Core in the future.)

There’s a series of short videos on YouTube that demonstrate many of the configuration options and parameters that are available in Drupal’s JSON API.

Prepare the Source Site

There is not much preparation needed for the source because of JSON API’s flexibility. My example is a simple Drupal 8 site with an article content type that has a body and field_image image field, the kind of thing core provides out of the box.

First, download and install the JSON API module. Then, create YAML configuration to “turn on” the JSON API. This could be done by creating a simple module that has YAML file(s) in /MODULE/config/optional. For instance, if you created a module called custom_jsonapi, a file that would expose node data might look like:

filename: /MODULE/config/optional/rest.resource.entity.node.yml:
id: entity.node
plugin_id: 'entity:node'
granularity: method
configuration:
  GET:
    supported_formats:
      - json
    supported_auth:
      - basic_auth
      - cookie
dependencies:
  enforced:
    module:
      - custom_jsonapi

To expose users or taxonomy terms or comments, copy the above file, and change the name and id as necessary, like this:

filename: /MODULE/config/optional/rest.resource.entity.taxonomy_term.yml:
id: entity.taxonomy_term
plugin_id: 'entity:taxonomy_term'
granularity: method
configuration:
  GET:
    supported_formats:
      - json
    supported_auth:
      - basic_auth
      - cookie
dependencies:
  enforced:
    module:
      - custom_jsonapi

That will support GET, or read-only access. If you wanted to update or post content you’d add POST or PATCH information. You could also switch out the authentication to something like OAuth, but for this article we’ll stick with the built-in basic and cookie authentication methods. If using basic authentication and the Basic Auth module isn’t already enabled, enable it.

Navigate to a URL like http://sourcesite.com/api/node/article?_format=api_json and confirm that JSON is being output at that URL.
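
If you would rather check from the command line than the browser, a quick throwaway PHP script like the following can confirm that the endpoint returns parseable JSON. The URL and credentials are placeholders, and the authentication context can be dropped if anonymous users are allowed to GET the resource.

// Throwaway check script: fetch the feed with basic auth and count the items.
// The URL and credentials are placeholders.
$context = stream_context_create([
  'http' => [
    'header' => 'Authorization: Basic ' . base64_encode('username:password'),
  ],
]);
$json = file_get_contents('http://sourcesite.com/api/node/article?_format=api_json', FALSE, $context);
$data = json_decode($json, TRUE);
print count($data['data']) . " articles returned\n";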

That's it for the source.

Prepare the Target Site

The target site should be running Drupal 8.2.3 or higher; there were changes to the way file imports work, and the approach described below won't work in earlier versions. The target should already have a matching article content type and field_image field ready to accept the articles from the other site.

Enable the core Migrate module. Download and enable the Migrate Plus and Migrate Tools modules. Make sure to get the versions that are appropriate for the current version of core. Migrate Plus had 8.0 and 8.1 branches that only work with outdated versions of core, so currently you need version 8.2 of Migrate Plus.

To make it easier, and so I don’t forget how I got this working, I created a migration example as the Import Drupal module on Github. Download this module into your module repository. Edit the YAML files in the /config/optional  directory of that module to alter the JSON source URL so it points to the domain for the source site created in the earlier step.

It is important to note that if you alter the YAML files after you first install the module, you'll have to uninstall and then reinstall the module to get Migrate to see the YAML changes.

Tweaking the Feed Using JSON API

The primary path used for our migration is (where sourcesite.com is a valid site):

http(s)://sourcesite.com/api/node/article?_format=api_json

This will display a JSON feed of all articles. The articles have related entities. The field_image field points to related images, and the uid/author field points to related users. To view the related images, we can alter the path as follows:

http(s)://sourcesite.com/api/node/article?_format=api_json&include=field_image

That will add an included array to the feed that contains all the details about each of the related images. This way we won’t have to query again to get that information; it will all be available in the original feed. I created a gist with an example of what the JSON API output at this path would look like.

To include authors as well, the path would look like the following. In JSON API you can follow the related information down through as many levels as necessary:

http(s)://sourcesite.com/api/node/article?_format=api_json&include=field_image,uid/author

Swapping out the domain in the example module may be the only change needed to the example module, and it's a good place to start. Read the JSON API module documentation to explore other changes you might want to make to that configuration to limit the fields that are returned, or sort or filter the list.

Manually test the path you end up with in your browser or with a tool like Postman to make sure you get valid JSON at that path.

Migrating From JSON

I had a lot of trouble finding any documentation about how to migrate into Drupal 8 from a JSON source. I finally found some in the Migrate Plus module. The rest I figured out from my earlier work on the original JSON Source module (now deprecated) and by trial and error. Here’s the source section of the YAML I ended up with, when migrating from another Drupal 8 site that was using JSON API.

source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json
  urls: http://sourcesite.com/api/node/article?_format=api_json
  ids:
    nid:
      type: integer
  item_selector: data/
  fields:
    -
      name: nid
      label: 'Nid'
      selector: /attributes/nid
    -
      name: vid
      label: 'Vid'
      selector: /attributes/vid
    -
      name: uuid
      label: 'Uuid'
      selector: /attributes/uuid
    -
      name: title
      label: 'Title'
      selector: /attributes/title
    -
      name: created
      label: 'Created'
      selector: /attributes/created
    -
      name: changed
      label: 'Changed'
      selector: /attributes/changed
    -
      name: status
      label: 'Status'
      selector: /attributes/status
    -
      name: sticky
      label: 'Sticky'
      selector: /attributes/sticky
    -
      name: promote
      label: 'Promote'
      selector: /attributes/promote
    -
      name: default_langcode
      label: 'Default Langcode'
      selector: /attributes/default_langcode
    -
      name: path
      label: 'Path'
      selector: /attributes/path
    -
      name: body
      label: 'Body'
      selector: /attributes/body
    -
      name: uid
      label: 'Uid'
      selector: /relationships/uid
    -
      name: field_image
      label: 'Field image'
      selector: /relationships/field_image


One by one, I’ll clarify some of the critical elements in the source configuration.

File-based imports, like JSON and XML, use the same pattern now. The main variation is the parser, and for JSON and XML, the parser is in the Migrate Plus module:

source:
  plugin: url
  data_fetcher_plugin: http
  data_parser_plugin: json

The urls value is the place where the JSON is being served. There could be more than one URL, but in this case there is only one. Reading through multiple URLs is still pretty much untested, but I didn’t need that:

  urls: http://sourcesite.com/api/node/article?_format=api_json

We need to identify the unique id in the feed. When pulling nodes from Drupal, it’s the nid:

  ids:
    nid:
      type: integer

We have to tell Migrate where in the feed to look to find the data we want to read. A tool like Postman (mentioned above) helps figure out how the data is configured. When the source is using JSON API, it’s an array with a key of data:

  item_selector: data/

We also need to tell Migrate what the fields are. In the JSON API, they are nested below the main item selector, so they are prefixed using an xpath pattern to find them. The following configuration lets us refer to them later by a simple name instead of the full path to the field. I think the label would only come into play if you were using a UI:

  fields:
    -
      name: nid
      label: 'Nid'
      selector: /attributes/nid

Setting up the Image Migration Process

For the simple example in the Github module we’ll just try to import nodes with their images. We’ll set the author to an existing author and ignore taxonomy. We’ll do this by creating two migrations against the JSON API endpoint: the first picks up the related images, and the second picks up the nodes.

Most fields in the image migration just need the same values they’re pulling in from the remote file, since they already have valid Drupal 8 values, but the uri value has a local URL that needs to be adjusted to point to the full path to the file source so the file can be downloaded or copied into the new Drupal site.

Recommendations for how best to migrate images have changed over time as Drupal 8 has matured. As of Drupal 8.2.3 there are two basic ways to process images, one for local images and a different one for remote images.  The process steps are different than in earlier examples I found. There is not a lot of documentation about this. I finally found a Drupal.org thread where the file import changes were added to Drupal core and did some trial and error on my migration to get it working.  

For remote images:

source:
  ...
  constants:
    source_base_path: 'http://sourcesite.com/'
process:
  filename: filename
  filemime: filemime
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
  uuid: id
  source_full_path:
    plugin: concat
    delimiter: /
    source:
      - 'constants/source_base_path'
      - url
  uri:
    plugin: download
    source:
      - '@source_full_path'
      - uri
    guzzle_options:
      base_uri: 'constants/source_base_path'

For local images change it slightly:

source:
  ...
  constants:
    source_base_path: 'http://sourcesite.com/'
process:
  filename: filename
  filemime: filemime
  status: status
  created: timestamp
  changed: timestamp
  uid: uid
  uuid: id
  source_full_path:
    plugin: concat
    delimiter: /
    source:
      - 'constants/source_base_path'
      - url
  uri:
    plugin: file_copy
    source:
      - '@source_full_path'
      - uri

The above configuration works because the Drupal 8 source uri value is already in the Drupal 8 format, public://image.jpg. If migrating from a pre-Drupal 7 or non-Drupal source, that uri won’t exist in the source. In that case you would need to adjust the process for the uri value to something more like this:

source:
  constants:
    is_public: true
  ...
process:
  ...
  source_full_path:
    -
      plugin: concat
      delimiter: /
      source:
        - 'constants/source_base_path'
        - url
    -
      plugin: urlencode
  destination_full_path:
    plugin: file_uri
    source:
      - url
      - file_directory_path
      - temp_directory_path
      - 'constants/is_public'
  uri:
    plugin: file_copy
    source:
      - '@source_full_path'
      - '@destination_full_path'

Run the Migration

Once you have the right information in the YAML files, enable the module. On the command line, type this:

drush migrate-status

You should see two migrations available to run.  The YAML files include migration dependencies and that will force them to run in the right order. To run them, type:

drush mi --all

The first migration is import_drupal_images. This has to be run before import_drupal_articles, because field_image on each article is a reference to an image file. This image migration uses the path that includes the related image details, and just ignores the primary feed information.

The second migration is import_drupal_articles. This pulls in the article information using the same url, this time without the included images. When each article is pulled in, it is matched to the image that was pulled in previously.
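
A minimal sketch of the pieces of the article migration’s YAML that tie the two together, assuming the migrations are named import_drupal_images and import_drupal_articles as described, and that the source exposes the related file’s id as fid (that field name is hypothetical):

process:
  # Map the article's image reference to the file entity created by the image migration.
  'field_image/target_id':
    plugin: migration
    migration: import_drupal_images
    source: fid
migration_dependencies:
  required:
    # Forces the image migration to run before the article migration.
    - import_drupal_images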

You can run one migration at a time, or even just one item at a time, while testing this out:

drush migrate-import import_drupal_images --limit=1

You can roll back and try again.

drush migrate-rollback import_drupal_images

If all goes as it should, you should be able to navigate to the content list on your new site and see the content that Migrate pulled in, complete with image fields. There is more information about the Migrate API on Drupal.org.

What Next?

There are lots of other things you could do to build on this. A Drupal 8 to Drupal 8 migration is easier than many other things, since the source data is generally already in the right format for the target. If you want to migrate in users or taxonomy terms along with the nodes, you would create separate migrations for each of them that would run before the node migration. In each of them, you’d adjust the include value in the JSON API path to pull the relevant information into the feed, then update the YAML file with the necessary steps to process the related entities.
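
For a single-value term reference field, the process mapping might look roughly like this, assuming a taxonomy term migration named import_drupal_terms and a source property named tid (both names are hypothetical placeholders):

process:
  # Match each incoming term id to the term entity created by the term migration.
  'field_tags/target_id':
    plugin: migration
    migration: import_drupal_terms
    source: tid
migration_dependencies:
  required:
    - import_drupal_terms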

You could also try pulling content from older versions of Drupal into a Drupal 8 site. If you want to pull everything from one Drupal 6 site into a new Drupal 8 site you would just use the built in Drupal to Drupal migration capabilities, but if you want to selectively pull some items from an earlier version of Drupal into a new Drupal 8 site this technique might be useful. The JSON API module won’t work on older Drupal versions, so the source data would have to be processed differently, depending on what you use to set up the older site to serve JSON. You might need to dig into the migration code built into Drupal core for Drupal to Drupal migrations to see how Drupal 6 or Drupal 7 data had to be massaged to get it into the right format for Drupal 8.

Jun 24 2016
Jun 24

The oddity of this field can create problems. The summary has no format of its own, it shares a format with the body. So you can't have a simple format for the summary and a more complex one for the body. The link to expose and hide the summary on the edit form is a little non-intuitive, especially since no other field behaves this way, so it's easy to miss the fact that there is a summary field there at all. If you are relying on the truncated text for the summary, there's no easy way to see in the node form what the summary will end up looking like. You have to preview the node to tell.

I wanted to move away from using the legacy body field in favor of separate body and summary fields that behave in a more normal way, where each is a distinct field, with its own format and no unexpected behavior. I like the benefits of having two fields, with the additional granularity that provides. This article describes how I made this switch on one of my own sites.

Making the Switch

The first step was to add the new fields to the content types where they will be used. I just did this in the UI by going to admin > structure > types. I created two fields, one called field_description for the full body text and one called field_summary for the summary. My plan was for the summary field to be a truncated, plain text excerpt of the body that I could use in metatags and in AMP metadata, as well as on teasers. I updated the Manage Display and Manage Form Display data on each content type to display my new fields instead of the old body field on the node form and in all my view modes.

Once the new fields were created I wanted to get my old body/summary data copied over to my new fields. To do this I needed an update hook. I used Drupal.org as a guide for creating an update hook in Drupal 8.

The instructions for update hooks recommend not using the normal entity API, like $node->save(), inside update hooks, and instead updating the database directly with a SQL query. But that would require understanding all the tables that need to be updated. This is much more complicated in Drupal 8 than it was in Drupal 7. In Drupal 7 each field has exactly two tables, one for the active values of the field and one with revision values. In Drupal 8 there are numerous tables that might be used, depending on whether you are using revisions and/or translations. There could be up to four tables that need to be updated for each individual field that is altered. On top of that, if I had two fields in Drupal 7 that had the same name, they were always stored in the same tables, but in Drupal 8 if I have two fields with the same name they might be in different tables, with each field stored in up to four tables for each type of entity the field exists on.

To avoid any chance of missing or misunderstanding which tables to update, I went ahead and used the $node->save() method in the update hook to ensure every table gets the right changes. That method is time-consuming and could easily time out for mass updates, so it was critical to run the updates in small batches. I then tested it to be sure the batches were small enough not to create a problem when the update ran.

The update hook ended up looking like this:


<?php

use Drupal\Core\Database\Database;

/**
 * Update new summary and description fields from body values.
 */
function custom_update_8001(&$sandbox) {

  // The content types to update.
  $bundles = ['article', 'news', 'book'];
  // The new field for the summary. Must already exist on these content types.
  $summary_field = 'field_summary';
  // The new field for the body. Must already exist on these content types.
  $body_field = 'field_description';
  // The number of nodes to update at once.
  $range = 5;

  if (!isset($sandbox['progress'])) {
    // This must be the first run. Initialize the sandbox.
    $sandbox['progress'] = 0;
    $sandbox['current_pk'] = 0;
    $sandbox['max'] = Database::getConnection()->query("SELECT COUNT(nid) FROM {node} WHERE type IN (:bundles[])", array(':bundles[]' => $bundles))->fetchField();
  }

  // Update in chunks of $range.
  $storage = Drupal::entityManager()->getStorage('node');
  $records = Database::getConnection()->select('node', 'n')
    ->fields('n', array('nid'))
    ->condition('type', $bundles, 'IN')
    ->condition('nid', $sandbox['current_pk'], '>')
    ->range(0, $range)
    ->orderBy('nid', 'ASC')
    ->execute();
  foreach ($records as $record) {
    $node = $storage->load($record->nid);

    // Get the body values if there is now a body field.
    if (isset($node->body)) {
      $body = $node->get('body')->value;
      $summary = $node->get('body')->summary;
      $format = $node->get('body')->format;

      // Copy the values to the new fields, being careful not to wipe out other values that might be there.
      $updated = FALSE;
      if (empty($node->{$summary_field}->getValue()) && !empty($summary)) {
        $node->{$summary_field}->setValue(['value' => $summary, 'format' => $format]);
        $updated = TRUE;
      }
      if (empty($node->{$body_field}->getValue()) && !empty($body)) {
        $node->{$body_field}->setValue(['value' => $body, 'format' => $format]);
        $updated = TRUE;
      }

      if ($updated) {
        // Clear the body values now that they have been copied.
        $node->body->setValue([]);
      }
    }

    // Force a node save even if there are no changes to force the pre_save hook to be executed.
    $node->save();

    $sandbox['progress']++;
    $sandbox['current_pk'] = $record->nid;
  }

  $sandbox['#finished'] = empty($sandbox['max']) ? 1 : ($sandbox['progress'] / $sandbox['max']);

  return t('All content of the types: @bundles were updated with the new description and summary fields.', array('@bundles' => implode(', ', $bundles)));
}
?>

Creating the Summary

That update would copy the existing body data to the new fields, but many of the new summary fields would be empty. As distinct fields, they won't automatically pick up content from the body field, and will just not display at all. The update needs something more to get the summary fields populated. What I wanted was to end up with something that would work similarly to the old body field. If the summary is empty I want to populate it with a value derived from the body field. But when doing that I also want to truncate it to a reasonable length for a summary, and in my case I also wanted to be sure that I ended up with plain text, not markup, in that field.

I created a helper function in a custom module that would take text, like that which might be in the body field, and alter it appropriately to create the summaries I want. I have a lot of nodes with html data tables, and I needed to remove those tables before truncating the content to create a summary. My body fields also have a number of filters that need to do their replacements before I try creating a summary. I ended up with the following processing, which I put in a custom.module file:


<?php
use Drupal\Component\Render\PlainTextOutput;

/**
 * Clean up and trim text or markup to create a plain text summary of $limit size.
 *
 * @param string $value
 *   The text to use to create the summary.
 * @param string $limit
 *   The maximum characters for the summary, zero means unlimited.
 * @param string $input_format
 *   The format to use on filtered text to restore filter values before creating a summary.
 * @param string $output_format
 *   The format to use for the resulting summary.
 * @param boolean $add_ellipsis
 *   Whether or not to add an ellipsis to the summary.
 */
function custom_parse_summary($value, $limit = 150, $input_format = 'plain_text', $output_format = 'plain_text', $add_ellipsis = TRUE) {

  // Remove previous ellipsis, if any.
  if (substr($value, -3) == '...') {
    $value = substr_replace($value, '', -3);
  }

  // Allow filters to replace values so we have all the original markup.
  $value = check_markup($value, $input_format);

  // Completely strip tables out of summaries, they won't truncate well.
  // Stripping markup, done next, would leave the table contents, which may create odd results, so remove the tables entirely.
  $value = preg_replace('/<table.*?<\/table>/si', '', $value);

  // Strip out all markup.
  $value = PlainTextOutput::renderFromHtml(htmlspecialchars_decode($value));

  // Strip out carriage returns and extra spaces to pack as much info as possible into the allotted space.
  $value = str_replace("\n", "", $value);
  $value = preg_replace('/\s+/', ' ', $value);
  $value = trim($value);

  // Trim the text to the $limit length.
  if (!empty($limit)) {
    $value = text_summary($value, $output_format, $limit);
  }

  // Add ellipsis.
  if ($add_ellipsis && !empty($value)) {
    $value .= '...';
  }

  return $value;
}
?>

Adding a Presave Hook

I could have used this helper function in my update hook to populate my summary fields, but I realized that I actually want automatic population of the summaries to be the default behavior. I don't want to have to copy, paste, and truncate content from the body to populate the summary field every time I edit a node, I'd like to just leave the summary field blank if I want a truncated version of the body in that field, and have it updated automatically when I save it.

To do that I used the pre_save hook. The pre_save hook will update the summary field whenever I save the node, and it will also update the summary field when the above update hook does $node->save(), making sure that my legacy summaries also get this treatment.

My pre_save hook, in the same custom.module file used above, ended up looking like the following:


<?php

use Drupal\Core\Entity\EntityInterface;

/**
 * Implements hook_entity_presave().
 *
 * Make sure summary and image are populated.
 */
function custom_entity_presave(EntityInterface $entity) {
  
  $entity_type = 'node';
  $bundles = ['article', 'news', 'book'];
  // The new field for the summary. Must already exist on these content types.
  $summary_field = 'field_summary';
  // The new field for the body. Must already exist on these content types.
  $body_field = 'field_description';
  // The maximum length of any summary, set to zero for no limit.
  $summary_length = 300;

  // Everything is an entity in Drupal 8, and this hook is executed on all of them!
  // Make sure this only operates on nodes of a particular type.
  if ($entity->getEntityTypeId() != $entity_type || !in_array($entity->bundle(), $bundles)) {
    return;
  }

  // If we have a summary, run it through custom_parse_summary() to clean it up.
  $format = $entity->get($summary_field)->format;
  $summary = $entity->get($summary_field)->value;
  if (!empty($summary)) {
    $summary = custom_parse_summary($summary, $summary_length, $format, 'plain_text');
    $entity->{$summary_field}->setValue(['value' => $summary, 'format' => 'plain_text']);
  }

  // The summary might be empty or could have been emptied by the cleanup in the previous step. If so, we need to pull it from description.
  $format = $entity->get($body_field)->format;
  $description = $entity->get($body_field)->value;
  if (empty($summary) && !empty($description)) {
    $summary = custom_parse_summary($description, $summary_length, $format, 'plain_text');
    $entity->{$summary_field}->setValue(['value' => $summary, 'format' => 'plain_text']);
  }
}  
?>

With this final bit of code I’m ready to actually run my update. Now whenever a node is saved, including when I run the update to move all my legacy body data to the new fields, empty summary fields will automatically be populated with a plain text, trimmed, excerpt from the full text.

Going forward, when I edit a node, I can either type in a custom summary, or leave the summary field empty if I want to automatically extract its value from the body. The next time I edit the node the summary will already be populated from the previous save. I can leave that value, or alter it manually, and it won't be overridden by the pre_save process on the next save. Or I can wipe the field out if I want it populated automatically again when the node is re-saved.

Javascript or Presave?

Instead of a pre_save hook I could have used javascript to automatically update the summary field in the node form as the node is being edited. I would only want that behavior if I'm not adding a custom summary, so the javascript would have to be smart enough to leave the summary field alone if I already have text in it or if I start typing in it, while still picking up every change I make in the description field if I don’t. And it would be difficult to use javascript to do filter replacements on the description text or have it strip html as I'm updating the body. Thinking through all the implications of trying to make a javascript solution work, I preferred the idea of doing this as a pre_save hook.

If I was using javascript to update my summaries, the javascript changes wouldn't be triggered by my update hook, and the update hook code above would have to be altered to do the summary clean up as well.
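
In that scenario the update hook’s loop could call the same helper directly. A rough sketch, reusing the variables already defined in custom_update_8001() (the 300 character limit is just an example value):

      // Hypothetical addition inside the foreach loop of custom_update_8001(),
      // only needed if a presave hook is not filling in summaries.
      if (empty($summary) && !empty($body)) {
        $new_summary = custom_parse_summary($body, 300, $format, 'plain_text');
        $node->{$summary_field}->setValue(['value' => $new_summary, 'format' => 'plain_text']);
      }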

Ta-dah

And that's it. I ran the update hook and then the final step was to remove my now-empty body field from the content types that I switched, which I did using the UI on the Content Types management page.

My site now has all its nodes updated to use my new fields, and summaries are getting updated automatically when I save nodes. And as a bonus this was a good exercise in seeing how to manipulate nodes and how to write update and pre_save hooks in Drupal 8.

Oct 26 2015
Oct 26

I’ve built and rebuilt many demo Drupal 8 sites while trying out new D8 modules and themes and experimenting with new functionality like migrations. After installing D8 manually from scratch so many times, I decided to sit down and figure out how to build a Drupal site using Composer to make it easier. The process is actually very handy, sort of the way we’ve used Drush Make in the past, where you don’t actually store all the core and contributed module code in your repository, you just record which modules and versions you’re using and pull them in dynamically.

I was a little worried about changing the process I’ve used for a long time, but my worries were for nothing. Anyone who’s used to Drush would probably find it pretty easy to get this up and running. 

TLDR: How to go from an empty directory to a fully functional Drupal site in two command lines:

sudo composer create-project drupal-composer/drupal-project:~8.0 drupal --stability dev --no-interaction

cd drupal/web
../vendor/bin/drush site-install --db-url=mysql://{username}:{password}@localhost/{database}

Install Composer

Let's talk through the whole process, step by step. The first step is to install Composer on your local system. See https://getcomposer.org/download/ for more information about installing Composer.

Set Up A Project With Composer

To create a new Drupal project using Composer, type the following on the command line, where /var/drupal is the desired code location:

cd /var
sudo composer create-project drupal-composer/drupal-project:~8.0 drupal --stability dev --no-interaction

The packaging process downloads all the core modules, Devel, Drush and Drupal Console, and then moves all the Drupal code into a ‘web’ subdirectory. It also moves the vendor directory outside of the web root. The new file structure will look like this:

[Screenshot in the original post: the resulting file directory structure]
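
Roughly, the layout ends up looking like this (a simplified sketch, not an exhaustive listing):

drupal/
  composer.json
  vendor/            (Composer-managed dependencies, outside the web root)
  web/
    core/
    modules/contrib/
    profiles/contrib/
    sites/
    themes/contrib/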

You will end up with a composer.json file at the base of the project that might look like the following. You can see the beginning of the module list in the ‘require’ section, and that Drush and Drupal Console are included by default. You can also see rules that move contributed modules into ‘/contrib’ subfolders as they’re downloaded.

{
    "name": "drupal-composer/drupal-project",
    "description": "Project template for Drupal 8 projects with composer",
    "type": "project",
    "license": "GPL-2.0+",
    "authors": [
        {
            "name": "",
            "role": ""
        }
    ],
    "repositories": [
        {
            "type": "composer",
            "url": "https://packagist.drupal-composer.org"
        }
    ],
    "require": {
        "composer/installers": "^1.0.20",
        "drupal/core": "8.0.*",
        "drush/drush": "8.*",
        "drupal/console": "~0.8"
    },
    "minimum-stability": "dev",
    "prefer-stable": true,
    "scripts": {
        "post-install-cmd": "scripts/composer/post-install.sh"
    },
    "extra": {
        "installer-paths": {
            "web/core": ["type:drupal-core"],
            "web/modules/contrib/{$name}": ["type:drupal-module"],
            "web/profiles/contrib/{$name}": ["type:drupal-profile"],
            "web/themes/contrib/{$name}": ["type:drupal-theme"],
            "web/drush/commands/{$name}": ["type:drupal-drush"]
        }
    }
}

That site organization comes from https://github.com/drupal-composer/drupal-project/tree/8.x. A README.md file there describes the process for doing things like updating core. The contributed modules are coming from a dedicated Drupal Packagist repository rather than directly from Drupal.org. That’s because the current Drupal versioning system doesn’t qualify as the semantic versioning Composer needs. There is an ongoing discussion at https://www.drupal.org/node/1612910 about how to fix that.

Install Drupal

The right version of Drush for Drupal 8 comes built into this package. If you have an empty database you can then install Drupal using the Drush version in the package:

cd drupal/web
../vendor/bin/drush site-install --db-url=mysql://{username}:{password}@localhost/{database}

If you don’t do the installation with Drush you can do it manually, but the Drush installation handles all this for you. The manual process for installing Drupal 8 is:

  • Copy default.settings.php to settings.php and unprotect it
  • Copy default.services.yml to services.yml and unprotect it
  • Create sites/files and unprotect it
  • Navigate to the new site in your browser to run the installer, provide the database credentials, and follow the instructions.

Add Contributed Modules From Packagist

Adding contributed modules is done a little differently. Instead of adding modules using drush dl, add additional modules by running composer commands from the Drupal root:

composer require drupal/migrate_upgrade 8.1.*@dev
composer require drupal/migrate_plus 8.1.*@dev

As you go, each module will be downloaded from Packagist and composer.json will be updated to add this module to the module list. You can peek into the composer.json file at the root of the project and see the ‘require’ list evolving.

Repeat until all desired contributed modules have been added. The composer.json file will then become the equivalent of a Drush make file, with documentation of all your modules.

For even more parity with Drush Make, you can add external libraries to your composer.json as well, and, with a plugin, you can also add patches to it. See more details about all these options at https://www.drupal.org/node/2471553.
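
For instance, with the cweagans/composer-patches plugin installed, a patch can be recorded in the composer.json "extra" section like this (the module name, description, and patch URL below are placeholders, not real values):

"extra": {
    "patches": {
        "drupal/some_module": {
            "Short description of the issue being patched": "https://www.drupal.org/files/issues/example.patch"
        }
    }
}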

Commit Files to the Repo

Commit the composer.json changes to the repo. The files downloaded by Composer do not need to be added to the repo. You’ll see a .gitignore file that keeps them out (this was added as a part of the composer packaging). Only composer.json, .gitignore and the /sites subdirectory (except /sites/default/files) will be stored in the git repository.

.gitignore

# Ignore directories generated by Composer
vendor
web/core
web/modules/contrib
web/themes/contrib
web/profiles/contrib

# Ignore Drupal's file directory
web/sites/default/files

Update Files

To update the files any time they might have changed, navigate to the Drupal root on the command line and run:

composer update

Add additional Drupal contributed modules, libraries, and themes at any time from the Drupal root with the same command used earlier:

composer require drupal/module_name 8.1.*@dev

That will add another line to the composer.json file for the new module. Then the change to composer.json needs to be committed and pushed to the repository. Other installations will pick this change up the next time they do git pull, and they will get the new module when they run composer update.

The composer update command should be run after any git pull or git fetch. So the standard routine for updating a repository might be:

git pull
composer update
drush updb
...

New Checkout

The process for a new checkout of this repository on another machine would simply be to clone the repository, then cd into it and run the following, which will then download all the required modules, files, and libraries:

composer install

That’s It

So that’s it. I was a little daunted at first but it turns out to be pretty easy to manage.  You can use the same process on a Drupal 7 site, with a few slight modifications.

Obviously the above process describes using the development versions of everything, since Drupal 8 is still in flux. As it stabilizes you’ll want to switch from using 8.1.*@dev to identifying specific stable releases for core and contributed modules.

See the links below for more information:

Sep 24 2015
Sep 24

But things have changed in Drupal 8. Hook_menu() is gone, and now all these tasks are managed separately using a system of YAML files that provide metadata about each item and corresponding PHP classes that provide the underlying logic.

The new system makes lots of sense, but figuring out how to make the switch can be confusing. To make things worse, the API has changed a few times over the long cycle of Drupal 8 development, so there is documentation out in the wild that is now incorrect. This article explains how things work now, and it shouldn't change any more.

I’m going to list some of the situations I ran into while porting a custom module to Drupal 8 and show before and after code examples of what happened to my old hook_menu items.

Custom Pages

One of the simplest uses of hook_menu is to set up a custom page at a given path. You'd use this for a classic "Hello World" module. In Drupal 8, paths are managed using a MODULE.routing.yml file to describe each path (or ‘route’) and a corresponding controller class that extends a base controller, which contains the logic of what happens on that path. Each controller class lives in its own file, where the file is named to match the class name. This controller logic might have lived in a separate MODULE.pages.inc file in Drupal 7.

In Drupal 7 the code might look like this:


function example_menu() {
  $items = array();
  $items['main'] = array(
    'title' => 'Main Page',
    'page callback' => 'example_main_page',
    'access arguments' => array('access content'),
    'type' => MENU_NORMAL_ITEM,
    'file' => 'MODULE.pages.inc'
  );
  return $items;
}

function example_main_page() {
  return t('Something goes here');
}

In Drupal 8 we put the route information into a file called MODULE.routing.yml. Routes have names that don’t necessarily have anything to do with their paths. They are just unique identifiers. They should be prefixed with your module name to avoid name clashes. You may see documentation that talks about using _content or _form instead of _controller in this YAML file, but that was later changed. You should now always use _controller to identify the related controller.


example.main_page_controller:
  path: '/main'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::mainPage'
    _title: 'Main Page'
  requirements:
    _permission: 'access content'


Note that we now use a preceding slash on paths! In Drupal 7 the path would have been main, and in Drupal 8 it is /main! I keep forgetting that and it is a common source of problems as I make the transition. It’s the first thing to check if your new code isn’t working!

The page callback goes into a controller class. In this example the controller class is named MainPageController.php, and is located at MODULE/src/Controller/MainPageController.php. The file name should match the class name of the controller, and all your module’s controllers should be in that /src/Controller directory. That location is dictated by the PSR-4 standard that Drupal has adopted. Basically, anything that is located in the expected place in the ‘/src’ directory will be autoloaded when needed without using module_load_include() or listing file locations in the .info file, as we had to do in Drupal 7.

The method used inside the controller to manage this route can have any name; mainPage is an arbitrary choice for the method in this example. The method name in the controller file should match the YAML file, where it is described as CLASS_NAME::METHOD. Note that the Contains line in the class @file documentation matches the _controller entry in the YAML file above.

A controller can manage one or more routes, as long as each has a method for its callback and its own entry in the YAML file. For instance, the core nodeController manages four of the routes listed in node.routing.yml.
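
For example, a second route could point at another method on the same controller, something like this (the route name, path, and method name here are made up for illustration):

example.about_page_controller:
  path: '/main/about'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::aboutPage'
    _title: 'About'
  requirements:
    _permission: 'access content'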

The controller should always return a render array, not text or HTML, another change from Drupal 7.

Translation is available within the controller as $this->t() instead of t(). This works because ControllerBase has added the StringTranslationTrait. There's a good article about how PHP Traits like translation work in Drupal 8 on Drupalize.Me.


/**
 * @file
 * Contains \Drupal\example\Controller\MainPageController.
 */
namespace Drupal\example\Controller;

use Drupal\Core\Controller\ControllerBase;

class MainPageController extends ControllerBase {
  public function mainPage() {
    return [
        '#markup' => $this->t('Something goes here!'),
    ];
  }
}

Paths With Arguments

Some paths need additional arguments or parameters. If my page had a couple extra parameters it would look like this in Drupal 7:


function example_menu() {
  $items = array();
  $items['main/first/second'] = array(
    'title' => 'Main Page',
    'page callback' => 'example_main_page',
    'page arguments' => array(1, 2),
    'access arguments' => array('access content'),
    'type' => MENU_NORMAL_ITEM,
  );
  return $items;
}

function example_main_page($first, $second) {
  return t('Something goes here');
}

In Drupal 8 the YAML file would be adjusted to look like this (adding the parameters to the path):


example.main_page_controller:
  path: '/main/{first}/{second}'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::mainPage'
    _title: 'Main Page'
  requirements:
    _permission: 'access content'

The controller then looks like this (showing the parameters in the function signature):


/**
 * @file
 * Contains \Drupal\example\Controller\MainPageController.
 */
namespace Drupal\example\Controller;

use Drupal\Core\Controller\ControllerBase;

class MainPageController extends ControllerBase {
  public function mainPage($first, $second) {
    // Do something with $first and $second.
    return [
        '#markup' => $this->t('Something goes here!'),
    ];
  }
}

Obviously anything in the path could be altered by a user so you’ll want to test for valid values and otherwise ensure that these values are safe to use. I can’t tell if the system does any sanitization of these values or if this is a straight pass-through of whatever is in the url, so I’d probably assume that I need to type hint and sanitize these values as necessary for my code to work.
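
As a precaution, the controller method can validate the raw values itself and return a 404 for anything unexpected. A minimal sketch, assuming (purely for illustration) that $first should be Y or N and $second should be numeric; it needs use Symfony\Component\HttpKernel\Exception\NotFoundHttpException; at the top of the controller file:

  public function mainPage($first, $second) {
    // Reject anything that doesn't look like the values this page expects.
    if (!in_array($first, ['Y', 'N'], TRUE) || !ctype_digit((string) $second)) {
      throw new NotFoundHttpException();
    }
    return [
      '#markup' => $this->t('Something goes here!'),
    ];
  }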

Paths With Optional Arguments

The above code will work correctly only for that specific path, with both parameters. Neither the path /main, nor /main/first will work, only /main/first/second. If you want the parameters to be optional, so /main, /main/first, and /main/first/second are all valid paths, you need to make some changes to the YAML file.

By adding the arguments to the defaults section you are telling the controller to treat the base path as the main route and the two additional parameters as path alternatives. You are also setting the default value for the parameters. The empty value says they are optional, or you could give them a fixed default value to be used if they are not present in the url.


example.main_page_controller:
  path: '/main/{first}/{second}'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::mainPage'
    _title: 'Main Page'
    first: ''
    second: ''
  requirements:
    _permission: 'access content'

Restricting Parameters

Once you set up parameters you probably should also provide information about what values will be allowed for them. You can do this by adding some more information to the YAML file. The example below indicates that $first can only contain the values ‘Y’ or ‘N’, and $second must be a number. Any parameters that don’t match these rules will return a 404. Basically the code is expecting to evaluate a regular expression to determine if the path is valid.

See Symfony documentation for lots more information about configuring routes and route requirements.


example.main_page_controller:
  path: '/main/{first}/{second}'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::mainPage'
    _title: 'Main Page'
    first: ''
    second: ''
  requirements:
    _permission: 'access content'
    first: Y|N
    second: \d+

Entity Parameters

As in Drupal 7, when creating a route that has an entity id you can set it up so the system will automatically pass the entity object to the callback instead of just the id. This is called ‘upcasting’. In Drupal 7 we did this by using %node instead of %. In Drupal 8 you just need to use the name of the entity type as the parameter name, for instance {node} or {user}.


example.main_page_controller:
  path: '/node/{node}'
  defaults:
    _controller: '\Drupal\example\Controller\MainPageController::mainPage'
    _title: 'Node Page'
  requirements:
    _permission: 'access content'
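
With upcasting in place, the controller method can type hint the loaded entity and use it directly. A minimal sketch (the method body is only an illustration):

namespace Drupal\example\Controller;

use Drupal\Core\Controller\ControllerBase;
use Drupal\node\NodeInterface;

class MainPageController extends ControllerBase {
  // The {node} value from the path arrives as a fully loaded node object.
  public function mainPage(NodeInterface $node) {
    return [
      '#markup' => $this->t('This is node @title.', ['@title' => $node->getTitle()]),
    ];
  }
}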

This obviously means you should be careful how you name your custom parameters to avoid accidentally getting an object when you didn’t expect it. Treat entity type names as reserved words that should not be used for other parameters. Or maybe even add a prefix to custom parameters to ensure they won’t collide with current or future entity types or other automatic mapping.

JSON Callbacks

All the above code will create HTML at the specified path. Your render array will be converted to HTML automatically by the system. But what if you wanted that path to display JSON instead? I had trouble finding any documentation about how to do that. There is some old documentation that indicates you need to add _format: json to the YAML file in the requirements section, but that is not required unless you want to provide alternate formats at the same path.

Create the array of values you want to return and then return it as a JsonResponse object. Be sure to add "use Symfony\Component\HttpFoundation\JsonResponse;" at the top of your controller file so the class will be available.


/**
 * @file
 * Contains \Drupal\example\Controller\MainPageController.
 */
namespace Drupal\example\Controller;

use Drupal\Core\Controller\ControllerBase;
use Symfony\Component\HttpFoundation\JsonResponse;

class MainPageController extends ControllerBase {
  public function mainPage() {
    $return = array();
    // Create key/value array.
    return new JsonResponse($return);
  }
}

Access Control

In Drupal 7, hook_menu() also manages access control. In Drupal 8, access control is handled by the MODULE.routing.yml file. There are various ways to control access:

Allow access by anyone to this path:


example.main_page_controller:
  path: '/main'
  requirements:
    _access: 'TRUE'

Limit access to users with ‘access content’ permission:


example.main_page_controller:
  path: '/main'
  requirements:
    _permission: 'access content'

Limit access to users with the ‘admin’ role:


example.main_page_controller:
  path: '/main'
  requirements:
    _role: 'admin'

Limit access to users who have ‘edit’ permission on an entity (when the entity is provided in the path):


example.main_page_controller:
  path: '/node/{node}'
  requirements:
    _entity_access: 'node.edit'

See Drupal.org documentation for more details about setting up access control in your MODULE.routing.yml file.

Hook_Menu_Alter

So what if a route already exists (created by core or some other module) and you want to alter something about it? In Drupal 7 that is done with hook_menu_alter, but that hook is also removed in Drupal 8. It’s a little more complicated now. The simplest example in core I could find was in the Node module, which is altering a route created by the System module.

A class file at MODULE/src/Routing/CLASSNAME.php extends RouteSubscriberBase and looks like the following. It finds the route it wants to alter using the alterRoutes() method and changes it as necessary. You can see that the values that are being altered map to lines in the original MODULE.routing.yml file for this entry.


/**
 * @file
 * Contains \Drupal\node\Routing\RouteSubscriber.
 */

namespace Drupal\node\Routing;

use Drupal\Core\Routing\RouteSubscriberBase;
use Symfony\Component\Routing\RouteCollection;

/**
 * Listens to the dynamic route events.
 */
class RouteSubscriber extends RouteSubscriberBase {

  /**
   * {@inheritdoc}
   */
  protected function alterRoutes(RouteCollection $collection) {
    // As nodes are the primary type of content, the node listing should be
    // easily available. In order to do that, override admin/content to show
    // a node listing instead of the path's child links.
    $route = $collection->get('system.admin_content');
    if ($route) {
      $route->setDefaults(array(
        '_title' => 'Content',
        '_entity_list' => 'node',
      ));
      $route->setRequirements(array(
        '_permission' => 'access content overview',
      ));
    }
  }

}

To wire up the menu_alter there is also a MODULE.services.yml file with an entry that points to the class that does the work:


services:
  node.route_subscriber:
    class: Drupal\node\Routing\RouteSubscriber
    tags:
      - { name: event_subscriber }

Many core modules put their RouteSubscriber class in a different location: MODULE/src/EventSubscriber/CLASSNAME.php instead of MODULE/src/Routing/CLASSNAME.php. I haven’t been able to figure out why you would use one location over the other.

Altering routes and creating dynamic routes are complicated topics that are really beyond the scope of this article. There are more complex examples in the Field UI and Views modules in core.

And More!

And these are still only some of the things that are done in hook_menu in Drupal 7 that need to be transformed to Drupal 8. Hook_menu is also used for creating menu items, local tasks (tabs), contextual links, and form callbacks. I’ll dive into the Drupal 8 versions of some of those in a later article.

More information about this topic:

Sep 02 2015
Sep 02

Watching the Drupal release cycle ebb and flow reminds me of sitting on the beach as the waves roll in. There is Drupal 5! It’s getting closer and closer! Finally it crashes on the beach in a splash of glory. But immediately, and initially imperceptibly, it starts to recede, making way for Drupal 6. And so the cycle goes, Drupal 5 recedes and Drupal 6 rushes in. Drupal 6 is overcome by Drupal 7. And now, as I write this, we’re watching as Drupal 7 washes back and Drupal 8 towers over the beach.

Each new version of Drupal is a huge improvement on the one before. But each version also introduces uncertainties. Is all that new functionality necessary? Has Drupal core become ‘bloated’? Does it do too much (or too little)? Will it be performant? How much work will it take to implement? Is it still buggy? And, arguably, the most important question of all, when will the contributed modules we need catch up?

So when is Drupal “ready” for our clients? If clients want a new site in this between-releases period, do we build it on the solid, safe, predictable older release? Or jump in with the shiny, new, improved release that is just over the horizon, or just released? Or do we wait for the new version to mature further and delay building a new site until it’s ready?

We’ve dealt with these questions over and over through the years. Knowing when to embrace and build on a new major release requires careful consideration along several axes, and making the right decision can be the difference between success and failure. Here are the guidelines I use.

How Complex is the Site?

If the site is simple and can be built primarily with Drupal core, then the shiny new version is likely a safe bet. Contributed modules may add nice features, but creating the site without many (or any) of them will mitigate your risk.

Each new Drupal release pulls into core some functionality that was previously only possible using contributed modules. Drupal 5 allowed you to create custom content types in the UI. Drupal 7 added custom fields to core. Drupal 8 brings Views into core. And every Drupal release makes some contributed modules obsolete. If the new core functionality is a good match for what the site needs, we’ll be able to build a new site without using (and waiting for) those contributed modules, which would be a good reason to build out on the frontier.

Correspondingly, if the site requires many contributed modules that are not included in core, we’ll have to wait for, and perhaps help port, those modules before we can use the new version. If we can’t wait or can’t help we may have no choice but to use the older version, or wait until contributed modules catch up.

How Tight is the Deadline?

It will probably take longer to build a site on a new version of Drupal that everyone is still getting familiar with than on an older version that is well understood. It always takes a little longer to do things using new processes than when repeating patterns you’ve used many times before.

Delays will also be introduced while waiting for related functionality to be ready. Perhaps there is a contributed module that solves a problem, but it hasn’t been ported yet, so we have to stop and help port it. Or there may not be any contributed module that does anything close to what we need, requiring us to plan and write custom code to solve the problem. Latent bugs in the code may emerge only when real world sites start to use the platform, and we might have to take time to help fix them.

In contrast, if we’re using the mature version of Drupal, odds are good that the bugs have been uncovered and there is code somewhere to do pretty much anything that needs to be done. It might be a contributed module, or a post with examples of how others solved the problem, or a gist or sandbox somewhere. Whatever the problem, someone somewhere probably has already run into it. And that code will either solve the problem, or at least provide a foundation for a custom solution, meaning less custom code.

Basically, if the deadline is a key consideration, stick with the tried and true, mature version of Drupal. There just may not be enough time to fix bugs and create custom code or port contributed modules.

How Flexible is the Budget?

This is a corollary to the previous question. For all the same reasons that a deadline might be missed, the budget may be affected. It takes more time (and money) to write custom code (or stop and port related contributed modules). So again, if budget is tight and inflexible, it might be a bad decision to roll out a site on a shiny new version of Drupal.

How Flexible is the Scope?

If we use the latest, greatest, version of Drupal, is the scope flexible enough to allow us to leverage the way the new code works out of the box? If not, if the requirements of the new site force us to bend Drupal to our will, no matter what, it will require custom code. If we build on a more mature version of Drupal we may have more existing modules and code examples to rely on for that custom functionality. If we build on the bleeding edge, we’ll be much more on our own.

Where is the Data Coming From?

If this is a new, from-scratch site, and there’s no need to migrate old data in, that would be a good use case for building this shiny new site with the latest, greatest version of Drupal.

But if there is an existing site, and we need to not only create a new site, but also migrate data from the old site to the new, the question of which version to use gets more complicated. If the source is another, older Drupal site, there will (eventually) be a supported method to get data from the old site to the new site. Even so, that may not be fully ready when the new version of Drupal is released. Drupal 8 uses Migrate module for data migration, but only the Drupal 6 to Drupal 8 migration path is complete, and that migration process will likely improve in future point releases. The upgrade path in previous versions of Drupal was often fraught with problems early on. It's something that never gets fully baked until the new version is in use and the upgrade process is tested over and over with complex, real-world sites. So the need to migrate data is another reason to use the older, more mature version of Drupal (or to wait until the new release is more mature).

How Important Is the New Hotness?

Every version of Drupal has a few things that just weren’t possible in previous versions. CMI (Configuration Management) in Drupal 8 provides a much more rational process for deploying code and configuration changes than Drupal 7 does. Drupal 7 requires banging your head against the limitations of the Features module, which in turn is hampered by the fact that Drupal 7 core just isn’t architected in a way that makes this easy. And Drupal 8 core has built-in support for functionality previously only possible by using one or more additional Services modules in Drupal 7.

If these new features are critical features, and if struggling to solve them in older versions has been a time sink or requires complex contributed modules, it makes sense to dive into the latest greatest version that has this new functionality built in.

How Long Should It Last?

A final question is how often the site gets re-built. If it is likely to be redesigned and re-architected every two or three years to keep it fresh, there should be little concern about rolling out on the older, mature version of Drupal. Drupal 7 will be supported until Drupal 9 is released, and that is likely to be a long time in the future. If it will be many years before there will be budget to re-build this site that might be a reason to build it on the latest version, delaying the project if necessary until the latest version is fully supported by contributed modules and potential problems have been worked out.

It’s Complicated!

The ideas above are just part of the thought process we go through in evaluating when to use which version of Drupal. It’s often a complex question with no black and white answers. But I take pride in our ability to use our long experience with Drupal to help clients determine the best path forward in these between-release periods.

Jul 15 2015
Jul 15

Drush is great! I can’t manage Drupal without it. But now that Drupal 8 is nearing release I’ve run into a big problem. Drupal 8 requires the bleeding edge version of Drush, but that version of Drush won’t work with older Drupal sites. In particular, I was running into problems trying to switch between a Drupal 6 site and the Drupal 8 site I’m trying to migrate it into. Drupal 6 works with nothing later than Drush version 5, but Drupal 8 requires a minimum of Drush version 8! And in the meantime I’m still working on several Drupal 7 sites which have Drush scripts that only work with Drush version 6 or 7. What I needed was an easy way to switch versions of Drush for the task at hand.

I combed the web for instructions on how to switch Drush versions on a Mac and didn’t find what I needed. But I did find several articles that had parts of the answer. So I stitched things together and came up with the following system based on Composer.

1) Install Composer

Composer is the recommended method of installing Drush these days, certainly for the bleeding edge version. I’ll need composer to work with Drupal 8, so this makes sense anyway. It’s pretty easy to install following the instructions at https://getcomposer.org/doc/00-intro.md#globally.

I previously had Drush installed with homebrew and wanted to get rid of that installation, so I had to do this:

brew remove --force drush

Then I installed a default version of Drush, Drush version 7, globally:

composer global require drush/drush:7.*

2) Pick a Location

I decided to go whole hog and create a way to switch between every version I might need, Drush 5, 6, 7, or 8, by creating directories for each of these. I could do this anywhere, but the most logical place seemed to be in my user directory.

3) Install Drush 8

mkdir ~/drush8
cd ~/drush8
composer require drush/drush:dev-master

4) Install Drush 7

mkdir ~/drush7
cd ~/drush7
composer require drush/drush:7.*

5) Install Drush 6

mkdir ~/drush6
cd ~/drush6
composer require "drush/drush:6.*"

6) Install Drush 5

cd ~
wget "https://github.com/drush-ops/drush/archive/5.10.0.zip"
unzip 5.10.0.zip
sudo mv drush-5.10.0 drush5

7) Alias The Directories

To make them switchable I created an alias for each. In my ~/.bash_profile I added the following:

alias drush5='~/drush5/drush'
alias drush6='~/drush6/vendor/bin/drush'
alias drush7='~/drush7/vendor/bin/drush'
alias drush8='~/drush8/vendor/bin/drush'

Since I installed Drush version 7 globally, anytime I type “drush” without a version modifier, it will default to using Drush 7. Because of that I could have skipped the installation of the Drush7 version above, but I decided I liked the idea of having both a global default (that I might change later) and a definite way to invoke version 7 that will work without knowing or caring what the global default is.

8) Test The Aliases

To test the finished system, I made sure the aliases work as designed. I opened a new terminal window (so it picks up the changes in the bash profile) and typed:

drush5 --version
drush6 --version
drush7 --version
drush8 --version

From this point on any time I need to run a drush script that uses a particular version of drush I just need to use my new aliases to do so:

drush5 status
drush6 cc all
drush7 sql-sync

9) Profit

That’s it. Now I know I can control the drush version by adjusting my commands to invoke the right version.

The following articles provided fodder for this solution:
