Jul 08 2020

Front-end development workflows have seen considerable innovation in recent years, with technologies like React popularizing revolutionary concepts like declarative components in JSX and more efficient document object model (DOM) diffing through virtual DOMs. Nonetheless, while this front-end revolution has transformed developer experiences across the JavaScript landscape and lent even more momentum to decoupled Drupal architectures in the Drupal community, many traditional CMSs have remained behind the curve when it comes to enabling true shared component ecosystems: developer experiences that facilitate shared development practices across the back end and front end.

At DrupalCon Amsterdam 2019, Fabian Franz (Senior Technical Architect and Performance Lead at Tag1) delivered a session entitled "Components everywhere: Bridging the gap between back end and front end" that delved into his ideal vision for enabling such shared components in Drupal's own native rendering layer. Fabian joined Michael Meyers (Managing Director at Tag1), and me (Preston So, Editor in Chief at Tag1; Senior Director, Product Strategy at Oracle; and author of Decoupled Drupal in Practice) for a Tag1 Team Talks episode highlighting the progress other ecosystems have made in the face of this problem space and how a hypothetical future Drupal could permit rich front-end developer experiences seldom seen in the CMS world. In this two-part blog series, a sequel to Fabian's DrupalCon session, we dive into some of his new conclusions and their potential impact on Drupal's future.

Components everywhere in Drupal

At the outset of our conversation, Fabian offered a quick summary of his idea behind components everywhere—i.e. shared across both client and server—within the Drupal context. The main thrust of Fabian's vision is that developers in Drupal ought to be able to implement a back-end application in a manner indistinguishable from how they would implement a front-end application. In other words, developers should not necessarily need to understand Drupal's application programming interfaces (APIs) or decoupled Drupal approaches. By decoupling Drupal within its own architecture (as I proposed with Lauri Eskola and with Sally Young and Matt Grill before that), we can enable the implementation of purely data-driven Drupal applications.

But what does this truly mean from the standpoint of Drupal developers? Fabian identifies the moment when components everywhere will truly have succeeded: when the same component can be leveraged on the front end and back end without any distinction in how data is handled. One of the key means of achieving this in a way that can be shared across client and server is through slots, which can contain additional data and provide the concept of component "children."

Because of how Drupal's front end was originally architected, there are significant gaps between how Drupal handles its "components" and how other technologies juggle theirs. For instance, while theme functions comprise an important foundation for how Drupal developers interact with the Drupal front end, there is no way to provide a slot for interior data or nested components. There is an analogous concept in the form of children in the render tree, but traversing it requires considerable knowledge of PHP. According to Fabian, though we have all of the elements needed for a component-based system available in Drupal, one of the primary challenges is that there are so many elements within Drupal that could lend themselves to such a component-based system.
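
To make that contrast concrete, here is a rough sketch of what such "children" look like in a Drupal render array today: nested arrays that must be traversed in PHP rather than passed declaratively into a named slot. The my_card theme hook is hypothetical; item_list is a real core theme hook.

    <?php

    // A hypothetical "card" built from render arrays. The nested 'body'
    // element plays the role of a child, but consumers must know the render
    // array structure in PHP to find or alter it.
    $build = [
      '#theme' => 'my_card',
      '#title' => t('Recent content'),
      'body' => [
        '#theme' => 'item_list',
        '#items' => [t('First item'), t('Second item')],
      ],
    ];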

Looking to Laravel for inspiration

Adhering to the open-source philosophy of "proudly found elsewhere," Fabian turned to other projects for inspiration as he began to articulate what it would take to implement the vision he presented in Amsterdam. After all, reinventing the wheel is usually an ill-advised approach when open-source solutions are available to be leveraged. For instance, Laravel contains templates but needed to introduce component tags to its templating system in order to capture generic slots. In Drupal, on the other hand, both theme functions and Twig templates can morphologically be considered components, but they lack certain key attributes most components today contain. Slots are implementable in Twig, but that is solely because all data is already available to Twig templates in Drupal.

Laravel 7 introduced BladeX to the Laravel ecosystem. BladeX provides a highly enjoyable developer experience by serving as a component handler for Laravel components. As long as developers prefix all components with x- in their custom element names (i.e. <x-component>), they no longer need to use a regular expression to find all possible component names in the component system; instead, they can simply search for all components whose names are prefixed with x-. And if the React developer experience is any indication, many modern front-end developers strongly prefer declarative HTML like the following:

    <x-alert prop="value"></x-alert>

BladeX first began as a contributed plugin for Laravel. Later, it was added to Laravel core due to its usefulness in enabling not only a graceful component system but also pleasant-to-use syntax for working with those components. Livewire, introduced below, also includes graceful capabilities enabling interactivity, a role played in Drupal today by the Drupal Ajax framework (which is difficult to use due to its tight coupling to Drupal's Form API).

More recently, Laravel introduced a tool known as Livewire, which makes it possible to implement a server-side document object model (DOM) but lacks the data input/output (I/O) necessary to enable full state management and interactivity. As such, Fabian extended the concept of a store from his DrupalCon session to include a provider that allows data retrieval and use in components. Fortunately, Livewire has a partial implementation of this, and it is possible to implement a server-side method that increments a counter and then to retrieve that counter value gracefully from the client side. Livewire automatically understands that it needs to update the server-side render of that counter and serve that updated value to the client.
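
To make this more concrete, here is a minimal sketch of a Livewire-style counter component in the spirit of the example Fabian describes; the class, property, and view names are illustrative rather than taken from the talk.

    <?php

    namespace App\Http\Livewire;

    use Livewire\Component;

    // Public properties are synchronized between server and client; actions
    // trigger a server-side re-render whose output is patched into the page.
    class Counter extends Component
    {
        public int $count = 0;

        // Invoked from the Blade template, e.g. <button wire:click="increment">.
        public function increment(): void
        {
            $this->count++;
        }

        public function render()
        {
            return view('livewire.counter');
        }
    }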

What about Web Components?

Fabian's thinking is by no means alone when it comes to enabling components everywhere in Drupal. Many other initiatives, including one that aimed to introduce Web Components into Drupal, have been down this road. But why are Web Components so compelling for this in the first place? By going a step further and introducing the Shadow DOM, Web Components can provide full encapsulation automatically, off the shelf.

And the Shadow DOM itself is a game changer because of the benefits provided by syntactic features like CSS scoping, in which styles contained in a Shadow DOM are unaffected by styles defined outside of it. Another way to accomplish such CSS scoping is through stringent class-based selector nomenclature or utilities like TailwindCSS that dispense with the traditional CSS cascade altogether. According to Fabian, many in the JavaScript world are increasingly moving in this direction and consider the cascade in CSS a suboptimal feature.

In other user interface (UI) systems, particularly in the mobile application development landscape, emerging approaches to styling applications can be seen in ecosystems like React Native and Flutter. These allow you to assemble compelling layouts without any CSS cascade at all; React Native, for instance, relies on CSS-in-JavaScript-style solutions to perform styling. Increasingly, these developments point to a landscape where developers eschew the cascade, long essential to writing CSS, in favor of a more atomic approach to styling components.

Conclusion

Components are difficult even in the best of times, not least because of their conceptual complexity and the differences in how components are defined from system to system. In the case of JavaScript technologies, approaches like React's declarative component syntax and Virtual DOM portend a world in which components are increasingly shared between client and server, and in which data is decoupled from the component at every stage of its life cycle, irrespective of whether it is rendered on the back end or front end. Complicating matters further is the fact that traditional content management systems like Drupal and WordPress have largely not kept pace with the dizzying innovation in the front-end development universe.

In this blog post, we examined some of the new conclusions Fabian has come to well after his DrupalCon presentation when it comes to enabling components everywhere in Drupal, particularly taking inspiration from other ecosystems like Laravel, React, and Web Components. In the second installment of this two-part blog series, we'll dive into how to define components in Drupal, how to offer a more declarative experience when working with them, and some of the other ways in which we can enable shared components across client and server and rich, immutable, data-driven state in a setting where these novelties have long seemed anathema or worlds removed: the Drupal front-end ecosystem.

Special thanks to Fabian Franz and Michael Meyers for their feedback during the writing process.

Photo by Tim Johnson on Unsplash

Jul 01 2020

Many front-end technologies, especially React, now consider the notion of declarative components to be table stakes. Why haven't they arrived in environments like the Drupal CMS's own front end? Many native CMS presentation layers tend to obsolesce quickly and present a scattered or suboptimal developer experience, particularly against the backdrop of today's rapidly evolving front-end development workflows. But according to Fabian Franz, there is a solution that allows for that pleasant front-end developer experience within Drupal itself without jettisoning Drupal as a rendering layer.

The solution is a combination of Web Components support within Drupal and intelligent handling of immutable state in data that allows Drupal to become a more JavaScript-like rendering layer. Rather than working with endless render trees and an antiquated Ajax framework, and instead of reinventing Drupal's front-end wheel from scratch, Fabian recommends adopting the best of both worlds: incorporating key aspects of Web Components, the Shadow DOM, and particularly syntactic sugar for declarative components that not only competes readily with wildly popular JavaScript technologies like React and Vue but also matches up to the emerging approaches seen in ecosystems like Laravel.

In this Tag1 Team Talks episode, join Fabian Franz (Senior Technical Architect and Performance Lead at Tag1), Michael Meyers (Managing Director at Tag1), and your host and moderator Preston So (Editor in Chief at Tag1; Senior Director, Product Strategy at Oracle; and author of Decoupled Drupal in Practice) for a wide-ranging technical discussion about how to enable declarative components everywhere for Drupal's front end out of the box. If you were interested in Fabian's "Components Everywhere" talk at DrupalCon Amsterdam last year, this is a Tag1 Team Talks episode you won't want to miss!

[embedded content]

Related Links

DrupalCon Amsterdam 2019: Components everywhere! - Bridging the gap between backend and frontend

Insider insights on rendering and security features: What the future holds for decoupled Drupal - part 2

Livewire

Laravel Blade Templates

Inertia.js

Mortenson's WebComponents server-side shim

AJAX API Guide on Drupal.org

Chat with the Drupal Community on Slack: https://www.drupal.org/slack

Vue.js

Lit-HTML

Other mentions:

Preston’s newsletter: Preston.so

Preact - Fast 3kB alternative to React with the same modern API https://preactjs.com/

Descript.com - Uses AI to transcribe audio (podcasts) and video into text, providing you with a transcript and closed captioning; edit the audio/video by editing the text!

Photo by Ren Ran on Unsplash.

Jun 24 2020

After four-and-a-half years of development, Drupal 9 was just released, a milestone in the evolution of the Drupal content management system. The Drupal Association has long played a critical role not only in supporting the advancement and releases of one of the world's largest and most active open-source software projects but also in contributing to the Drupal roadmap and driving its forward momentum in other important ways. In addition to maintenance releases for Drupal 7 and Drupal 8, the Drupal 9 release not only promises an easy upgrade for Drupal 8 users but also ushers in a new period of innovation for Drupal.

But that's not all. Drupal 9's release also means long-awaited upgrades to Drupal.org as well as some of the most essential infrastructure and services that underpin Drupal.org and its associated properties, like localize.drupal.org, groups.drupal.org, and api.drupal.org. Releases in Drupal have also garnered greater scrutiny from nefarious actors who target launch dates to seek security vulnerabilities. The Drupal Association works tirelessly to buttress all of these initiatives and responsibilities, with the support of Tag1 and other organizations.

In this Tag1 Team Talks episode, part of a special series with the engineering team at the Drupal Association, we discuss Drupal 9 and what it portends for Drupal's future with Tim Lehnen (Chief Technology Officer, Drupal Association), Neil Drumm (Senior Technologist, Drupal Association), Narayan Newton (Chief Technology Officer, Tag1 Consulting), Michael Meyers (Managing Director, Tag1 Consulting), and Preston So (Editor in Chief at Tag1 Consulting and author of Decoupled Drupal in Practice). We also dive into some of the nitty-gritty, day-in-the-life details of Drupal core committers and how Drupal is taking a uniquely new approach to tackling technical debt.

[embedded content]

---

Links

Photo by asoggetti on Unsplash

Jun 22 2020

Maintaining Drupal projects and managing Drupal modules can be challenging even for contributors with unlimited time. For nearly two decades, Drupal's ecosystem has cultivated a wide array of tools for contributors to create patches, report issues, collaborate on code, and perform continuous integration. But as many source control providers begin to release shiny new features like web IDEs and issue workspaces that aim to make open-source contributors' lives even easier, many are doubtless wondering how Drupal's own developer workflows figure in an emerging world of innovation in the space.

DrupalSpoons, created by Moshe Weitzman and recently released, is a special configuration of groups and projects in GitLab that provides a bevy of useful features and tools for Drupal contributors who are maintaining Drupal projects. A play on the word "fork," which refers to a separately maintained clone of a codebase that still retains a link to the prior repository, DrupalSpoons offers support for GitLab issues, merge requests (GitLab's analogue for GitHub's pull requests), and continuous integration on contributed Drupal projects in the ecosystem. It leverages zero custom code, apart from the issue migration process to aid DrupalSpoons newcomers, and outlines potential trajectories for Drupal contribution in the long term as well.

In this exciting episode of Tag1 Team Talks, Moshe Weitzman (Subject Matter Expert, Senior Architect, and Project Lead at Tag1) hopped on with Michael Meyers (Managing Director at Tag1) and your host Preston So (Editor in Chief at Tag1 and author of Decoupled Drupal in Practice) for a deep dive into what makes DrupalSpoons so compelling for Drupal contributors and the origin story that inspired Moshe to build it. Join us to learn how you can replace your existing Drupal contribution workflows with DrupalSpoons to get the most out of Drupal's recent migration to GitLab and the most modern capabilities in Drupal code management today.

[embedded content]

---

Links

Photo by Richard Iwaki on Unsplash

Jun 15 2020

Part 1 | Part 2 | Part 3

For several years now, decoupled Drupal has been among the topics that have most captivated members of the Drupal community. At present, there is no shortage of blog posts and tutorials about the subject, including my own articles, as well as a comprehensive book covering decoupled Drupal and an annual conference in New York City to boot. Now that JSON:API has been part of Drupal core for quite some time, some of the former obstacles to implementations of decoupled Drupal architectures have been lowered.

However, though we have seen a large upswing in the number of decoupled Drupal projects now in the wild, some areas of the decoupled Drupal ecosystem have not yet seen the spotlight afforded projects like JSON:API and GraphQL. Nonetheless, many of these contributed projects are critical to adding to the possibilities of decoupled Drupal and can abbreviate the often lengthy period of time it takes to architect a decoupled Drupal build properly.

In April of last year, this author (Preston So, Editor in Chief at Tag1 Consulting and author of Decoupled Drupal in Practice) spoke to a packed auditorium at DrupalCon Seattle about some of the lesser-known portions of the decoupled Drupal landscape. In this multi-part blog series, we survey just a few of these intriguing projects that can serve to accelerate your decoupled Drupal implementations with little overhead but with outsized results. In this third and final installment, we cover several projects that encompass some of the most overlooked requirements in decoupled Drupal, namely JSON-RPC, Schemata, OpenAPI, and Contenta.js.

Running Drupal remotely with JSON-RPC

Before we continue, it's important that you have read the earlier posts in this series for the most complete possible perspective on these projects that make decoupled Drupal even more compelling. This third and final installment in the blog series presumes knowledge already presented in the first and second installments, in particular the summary of motivations behind JSON-RPC provided in the installment immediately preceding this post.

Maintained by Mateu Aguiló Bosch (e0ipso) and Gabe Sullice (gabesullice), the mission of JSON-RPC is to serve as a canonical foundation for Drupal administrative actions that go well beyond the limitations and possibilities of RESTful API modules like core REST, Hypertext Application Language (HAL), and JSON:API. The JSON-RPC module also exposes certain internals of Drupal, including permissions and the list of enabled modules on a site.

To install JSON-RPC, use the following commands, which also enable JSON-RPC submodules.

    $ composer require drupal/jsonrpc
    $ drush en -y jsonrpc jsonrpc_core jsonrpc_discovery

Executing Drupal actions

To rebuild the cache registry, you can issue a POST request to the /jsonrpc endpoint with the following request body; JSON-RPC will respond with a 204 No Content response code.

    {
      "jsonrpc": "2.0",
      "method": "cache.rebuild"
    }
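
For illustration, here is one way a consumer might send that request from PHP using Guzzle; the base URL and credentials are placeholders, and your installation may require a different authentication scheme.

    <?php

    use GuzzleHttp\Client;

    // Hypothetical consumer issuing the cache.rebuild call shown above.
    $client = new Client(['base_uri' => 'https://example.com']);

    $response = $client->post('/jsonrpc', [
      'json' => [
        'jsonrpc' => '2.0',
        'method' => 'cache.rebuild',
      ],
      // Placeholder credentials; the account needs permission to use JSON-RPC.
      'auth' => ['api_user', 'api_password'],
    ]);

    // A JSON-RPC notification (no "id" member) yields 204 No Content.
    echo $response->getStatusCode();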

To retrieve a user's permissions, you can similarly issue a POST request to the same /jsonrpc endpoint, which will respond to your request with a 200 OK response code and a list of the user's permissions.

    {
      "jsonrpc": "2.0",
      "method": "user_permissions.list",
      "params": {
        "page": {
          "limit": 5,
          "offset": 0
        }
      },
      "id": 2
    }

All JSON-RPC methods

The table below shows some of the other common methods that you can execute by issuing requests to JSON-RPC. For a deeper explanation of JSON-RPC as well as a full account of what features JSON-RPC makes available to decoupled Drupal architectures, consult Chapter 23 of my book Decoupled Drupal in Practice.

[Image: table of common JSON-RPC methods]

Derived schemas and documentation with Schemata and OpenAPI

In API-first approaches, schemas are declarative descriptions that outline the shape of a JSON document, such as a typical entity response from a Drupal web service. In Drupal 8, the Schemata module, maintained by Adam Ross (grayside), is responsible for providing schemas that facilitate features that were previously impossible in Drupal such as generated API documentation and generated code, both of which we will examine shortly. To install the Schemata module, execute the following commands:

    $ composer require drupal/schemata
    $ drush en -y schemata schemata_json_schema

Navigating schemas

With Schemata, you can navigate a schema to learn more about how the API issues and handles data either by using the browser or by issuing GET requests against endpoints that are prefixed with /schemata. Consider, for instance, the following format for Schemata requests:

    /schemata/{entity_type}/{bundle}?_format={output_format}&_describes={described_format}

Here are two examples of schema navigation with regard to the possible URLs against which you can issue requests. Note that in the first example, we are requesting a description of the resource according to the JSON:API module, whereas in the second we are requesting it in the HAL format found in Drupal 8's core REST module.

    /schemata/node/article?_format=schema_json&_describes=api_json

    /schemata/user?_format=schema_json&_describes=hal_json

In the image below, you can see the result of a sample response from Schemata for the schema describing article data.

Schemata sample response

OpenAPI

OpenAPI is a separate project, formerly known as the Swagger specification, which describes RESTful web services based on a schema. The OpenAPI module, maintained by Rich Gerdes (richgerdes) and Ted Bowman (tedbow), integrates with both core REST and JSON:API to document available entity routes in both web services modules.

The unique value proposition for OpenAPI for decoupled Drupal practitioners is that it offers a full explorer to traverse an API schema to understand what requests are possible and what responses are output when the API issues a response. To install OpenAPI, execute the following commands, depending on whether you prefer to use ReDoc or Swagger UI, both of which are libraries that integrate with OpenAPI to provide styles for API documentation.

    # Use ReDoc.
    $ composer require drupal/openapi
    $ composer require drupal/openapi_ui_redoc
    $ drush en -y openapi openapi_ui_redoc

    # Use Swagger UI.
    $ composer require drupal/openapi
    $ composer require drupal/openapi_ui_swagger
    $ drush en -y openapi openapi_ui_swagger

One of the more exciting use cases that OpenAPI makes possible is the idea of generated code, which is dependent on the notion that generated API documentation based on derived schemas means that APIs are predictable. This opens the door to possibilities such as generated CMS forms with built-in validation based on what these schemas provide. For more information about generated code based on the advantages of derived schemas and generated API documentation, consult Chapter 24 of my book Decoupled Drupal in Practice.

Revving up with proxies: Contenta.js

One final project that we would be remiss not to cover as part of this survey of hidden treasures of decoupled Drupal is Contenta.js, authored by Mateu Aguiló Bosch (e0ipso), which addresses the pressing need for a Node.js proxy that acts as middleware between a Drupal content API layer exposing web services and a JavaScript application. As many decoupled Drupal practitioners have seen in the wild, a Node.js proxy is often useful for decoupling Drupal due to its value in offloading responsibilities normally assigned to Drupal.

Contenta.js integrates seamlessly with any Contenta CMS installation that exposes APIs, as long as the URI of the site is provided in that site's configuration. Many developers working with decoupled Drupal are knowledgeable about Contenta CMS, an API-first distribution for Drupal that provides a content repository optimized for decoupled Drupal while still retaining many of the elements that make Drupal great such as the many contributed modules that add to Drupal's base functionality. (Another similar project, Reservoir, has since been deprecated.)

One of the compelling selling points of Contenta.js is that Contenta installations that already have modules like JSON:API, JSON-RPC, Subrequests (covered in Chapter 23 of Decoupled Drupal in Practice), and OpenAPI enabled need no further configuration in order for Contenta.js to work with little customization out of the box. Contenta.js contains a multithreaded Node.js server, a Subrequests server facilitating request aggregation, a Redis integration, and a more user-friendly approach to cross-origin resource sharing (CORS). For more information about Contenta.js, consult Chapter 16 of my book Decoupled Drupal in Practice.

Conclusion

Decoupled Drupal is no longer theoretical or experimental. For many developers the world over, it is now not only a reality but also a bare minimum requirement for many client projects. But fortunately for decoupled Drupal practitioners who may be skittish about the fast-changing world of API-first Drupal approaches, there is a rapidly expanding and maturing ecosystem for decoupled Drupal that furnishes solutions for a variety of use cases. Most of these are described at length in my book Decoupled Drupal in Practice.

In this final installment, we covered some of the major modules—and one Node.js project—that you should take into consideration when architecting and building your next decoupled Drupal project, including JSON-RPC, Schemata, OpenAPI, and Contenta.js. And in this multi-part blog series, we summarized some of the most important projects in the contributed Drupal landscape that can help elevate your decoupled Drupal implementations to a new level of success, thanks to the accelerating innovation occurring in the Drupal community.

Special thanks to Michael Meyers for his feedback during the writing process.

Part 1 | Part 2 | Part 3

Photo by ASA Arts & Photography on Unsplash

Jun 08 2020

Though the biggest news this month is the release of Drupal 9, that doesn't mean big releases aren't happening on other versions of Drupal too. The milestone represented by Drupal 9 also welcomes new versions of both Drupal 7 and Drupal 8 to the Drupal ecosystem. It's been four-and-a-half years since Drupal 8 was released, and 54 months of development from scores of contributors around the world went into Drupal 9. And thanks to the indefatigable efforts of open-source contributors in the module ecosystem, there are already over 2,000 contributed modules ready to go, compatible with Drupal 9 out of the box.

Drupal 9 is a massive step for innovation in the Drupal community, thanks to the careful thought that went into how Drupal can continue to stay ahead of the curve. During the Drupal 9 development cycle, which was largely about deprecating and removing old code, the Drupal core committers laid the groundwork for the future and facilitated a more pleasant upgrade experience from Drupal 8 to Drupal 9 that should smooth over many of the hindrances that characterized the transition from Drupal 7 to Drupal 8. And there are already exciting new plans for Drupal 9, with coming releases consisting of even more refactoring and deprecations. With Drupal 9.1 in December, the focus will shift to new features and improvements, including user experience, accessibility, performance, security, privacy, and integrations.

In the second episode of our new monthly show Core Confidential, Fabian Franz (VP Software Engineering at Tag1) sat down with Michael Meyers (Managing Director at Tag1) and your host Preston So (Editor in Chief at Tag1 and author of Decoupled Drupal in Practice) for a quick but comprehensive survey of how Drupal 9 will change Drupal for the better. Beyond discussing the technical improvements and ecosystem advancements, this Core Confidential episode also dives into the anxieties, challenges, and concerns that core committers have about Drupal 9 moving forward.

[embedded content]

Links

Two moderately critical advisories that you need to be aware of and address:

Photo by Jingda Chen on Unsplash

Jun 01 2020

Part 1 | Part 2 | Part 3

Decoupled Drupal has been a hot topic in the Drupal community for several years now, and there are many projects implementing decoupled Drupal architectures, as well as a bevy of content (including my own articles on the subject). Nowadays, decoupled Drupal practitioners can benefit from the first-ever comprehensive book about decoupled Drupal as well as a yearly decoupled Drupal conference. Especially with the JSON:API module now available as part of Drupal core, getting started with decoupled Drupal has never been more accessible.

Nevertheless, there are still hidden areas of decoupled Drupal that have seldom seen much attention in the Drupal community for a variety of reasons. Some of these contributed Drupal modules have been around for quite some time and can help to shorten the amount of time you spend implementing a decoupled Drupal architecture, whether it comes down to a differing API specification or extending existing functionality.

Recently, your correspondent (Preston So, Editor in Chief at Tag1 Consulting and author of Decoupled Drupal in Practice) delivered a DrupalCon Seattle session about some of these lesser-known parts of the decoupled Drupal ecosystem. In this multi-part blog series, we embark on a tour through some of these exciting areas and dive into how these projects can accelerate your builds. In this second installment, we cover how you can leverage the RELAXed Web Services module for your own purposes and how you can extend existing features in the JSON:API module now incorporated into core.

Working with RELAXed Web Services

Before we proceed, be sure to read the first installment in this series for a quick introduction to decoupled Drupal and a taxonomy of the architecture involved. The first installment in this blog series also introduces RELAXed Web Services and how to install and configure the module. From this point forward, it is presupposed that you have a working Drupal 8 site with RELAXed Web Services installed and configured.

To verify that RELAXed Web Services is working properly, we can issue a GET request against the /relaxed endpoint (or whatever URL we configured in the previous installment of this blog series). The Drupal server should respond with a 200 response code and the following response body:

    {
      "couchdb": "Welcome",
      "uuid": "02286a1b231b68d89624d281cdfc0404",
      "vendor": {
        "name": "Drupal",
        "version": "8.5.6"
      },
      "version": "8.5.6"
    }
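
If you prefer to verify this from code rather than the browser, a quick sketch using Guzzle might look like the following; the site URL and credentials are placeholders.

    <?php

    use GuzzleHttp\Client;

    // Hypothetical smoke test for the /relaxed endpoint.
    $client = new Client(['base_uri' => 'https://example.com']);

    $response = $client->get('/relaxed', [
      // Placeholder credentials for the Replicator user configured earlier.
      'auth' => ['replicator', 'replicator_password'],
    ]);

    $body = json_decode((string) $response->getBody(), TRUE);

    // A healthy installation greets us the CouchDB way.
    assert($response->getStatusCode() === 200);
    assert($body['couchdb'] === 'Welcome');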

Retrieving data with RELAXed Web Services

The following table describes all of the GET requests that you can issue against a variety of resources provided by RELAXed Web Services.

[Image: table of GET requests supported by RELAXed Web Services]

The screenshot below demonstrates a sample RELAXed Web Services response to a GET request targeted to retrieve a single Drupal entity.

RELAXed Web Services response

Creating entities with RELAXed Web Services

To create documents, which in RELAXed Web Services parlance are equivalent to Drupal entities, you can issue a POST request to the /relaxed/live endpoint (or prefixed with the custom API root you have configured) with the following request body. The server will respond with a 201 Created response code.


    {
      "@context": {
        "_id": "@id",
        "@language": "en"
      },
      "@type": "node",
      "_id": "b6cea743-ba86-49b0-81ac-03ec728f91c4",
      "en": {
        "@context": {
          "@language": "en"
        },
        "langcode": [{ "value": "en" }],
        "type": [{ "target_id": "article" }],
        "title": [{ "value": "REST and RELAXation" }],
        "body": [
          {
            "value": "This article brought to you by a request to RELAXed Web Services!"
          }
        ]
      }
    }


Because a full description of RELAXed Web Services is well beyond the scope of this survey blog series, this section provided just a taste of some of the ways in which RELAXed Web Services differs from some of the other API approaches available in the decoupled Drupal ecosystem, including Drupal 8's native core REST and HAL (Hypertext Application Language).

Nonetheless, for developers looking for RESTful solutions that are more flexible than core REST and better-suited to offline solutions than JSON:API in many cases, RELAXed Web Services provides a powerful RESTful alternative. For more information about RELAXed Web Services and information about modifying and deleting individual documents remotely in Drupal, please consult Chapters 8 and 13 of my book Decoupled Drupal in Practice.

Extending JSON:API with Extras and Defaults

Oftentimes, when using modules like JSON:API, which is now available as part of Drupal 8 core for developers to leverage, we need to override the preconfigured defaults that accompany the module upon installation. Luckily, there are two modules available in Drupal's contributed ecosystem that can make this process much easier, especially given the fact that JSON:API aims to work out of the box as a zero-configuration module.

The JSON:API Extras module provides interfaces to override default settings and configure new ones that the resulting API should follow in lieu of what comes off the shelf in the JSON:API module. Some of the features contained in the module include capabilities such as enabling and disabling individual resources from the API altogether, aliasing resource names and paths, disabling individual fields within entity responses, aliasing constituent field names, and modifying field output through field enhancers in Drupal.

You can install both modules easily with Composer. JSON:API Defaults, which we cover later in this section, is available as a submodule of JSON:API Extras.

    # Install JSON:API Extras.
    $ composer require drupal/jsonapi_extras
    $ drush en -y jsonapi_extras

    # Install JSON:API Defaults.
    $ drush en -y jsonapi_extras jsonapi_defaults

In the following image, you can see how we can override certain preconfigured settings in JSON:API such as disabling the resource altogether, changing the name of the resource type, and overriding the resource path that follows the /jsonapi prefix.

Add JSON:API resource

In the image below, field customization is displayed in JSON:API Extras, a feature that allows you to alias fields and perform other actions that permit you to customize the response output in a more granular way. As you can see, one of the most compelling motivations for adopting JSON:API Extras in your own implementation is the notion of full customization of JSON:API's output not only at the resource level but at the field level as well.

Field customization

JSON:API Defaults

Formerly an entirely separate module maintained by Martin Kolar (mkolar), JSON:API Defaults allows you to set default includes and filters for resources in JSON:API. JSON:API Defaults is particularly useful when consumers prefer issuing slimmer requests without the parameters required to yield a response that includes relationships in the payload. In other words, you can issue a request without parameters and receive a response having predetermined defaults such as includes.
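
As a sketch of what this saves consumers from writing, compare the two requests below; the resource and field names are illustrative and assume that default includes have been configured for the article resource.

    <?php

    use GuzzleHttp\Client;

    $client = new Client(['base_uri' => 'https://example.com']);

    // Without defaults, the consumer must request related resources explicitly.
    $explicit = $client->get('/jsonapi/node/article?include=field_image,uid');

    // With JSON:API Defaults configured for articles, the bare request already
    // carries the configured includes in its payload.
    $implicit = $client->get('/jsonapi/node/article');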

Though a full discussion of JSON:API Defaults is outside the scope of this rapid-fire survey of the lesser-known parts of the decoupled Drupal ecosystem, I highly encourage you to check out Chapter 23 in my book Decoupled Drupal in Practice, which engages in an in-depth discussion of JSON:API Extras and JSON:API Defaults.

Running Drupal remotely

Sometimes, merely interacting with Drupal content through APIs in decoupled Drupal is insufficient for the use cases and requirements that our customers demand. Deeper functionality in Drupal is often required remotely for consumer applications to access, particularly actions such as performing a cache registry rebuild or running a cron job. But these do not necessarily fit neatly into the normal API-driven approaches for Drupal entities, because they are not part of the RESTful paradigms in which Drupal generally operates out of the box.

In decoupled Drupal and other software ecosystems, remote procedure calls (RPCs) are calls that execute a procedure on another system, written as if they were local actions, without direct code written against that other system. In short, in the decoupled Drupal context, they are a convenient way for consumer applications to perform tasks remotely without their developers needing to understand the nuts and bolts of the upstream system. In the next installment of this blog series, we'll cover Drupal's RPC approach for decoupled Drupal and how you can leverage it for a variety of tasks you need in your client.

Conclusion

In this blog post, we surveyed several of the major API-first solutions available for decoupled Drupal aficionados that have not received as much attention as of late, including RELAXed Web Services and supplementary modules that provide additional features like JSON:API Extras and JSON:API Defaults. Over the course of this post, we covered how to retrieve entities using RELAXed Web Services and how you can customize your JSON:API resources and fields to your heart's content.

In the following installment of this multi-part blog series, we dive into JSON-RPC, the RPC provider for decoupled Drupal, and discuss how to perform certain tasks using the JSON-RPC module. In addition, we'll cover derived schemas and API documentation, two of the most important concepts in the emerging API-first landscape that is beginning to gain significant attention in the headless CMS community.

Special thanks to Michael Meyers for his feedback during the writing process.

Part 1 | Part 2 | Part 3

Photo by Stefan Steinbauer on Unsplash

May 26 2020

Now that decoupled Drupal has permeated the Drupal community, even to the furthest extremes, articles (including my own) introducing concepts and how-to tutorials describing how you can build your first decoupled Drupal architecture are ubiquitous. As a matter of fact, decoupled Drupal now also has a book on the subject as well as an annual conference dedicated to the topic. Particularly with the JSON:API module in Drupal core as of 8.7.0, decoupled Drupal out of the box has never been easier.

But despite the brilliant spotlight shining on decoupled Drupal from all corners of the CMS industry, there are lesser-known secrets and hidden treasures that reflect not only the innovative character of the Drupal community but also some true gems that can accelerate your decoupled Drupal implementation. Whether in the area of web services or the category of Drupal modules that extend those same web services, there are myriad components of the decoupled Drupal experience that you may not have heard of before.

In this multi-part blog series, we’ll delve into a few of these concepts in this companion piece to the recent session I (Preston So, Editor in Chief at Tag1 Consulting and author of Decoupled Drupal in Practice) gave entitled “Secrets of the decoupled Drupal practitioner” at DrupalCon Seattle in April. We’ll first venture through a rapid reintroduction to decoupled Drupal before moving progressively up the stack, starting with web services and ending with some of the consumer tooling available to help you maintain high velocity.

A quick introduction to decoupled Drupal

In short, monolithic Drupal consists of a contiguous Drupal architecture that cannot be separated into distinct services. In other words, the default Drupal front end is inextricable from the larger Drupal monolith because of all the linkages that require the front end to remain coupled to the back end, including data references in the theme layer and other tools like the Form API, which allows for the rendering of forms in the Drupal presentation layer according to certain back-end logic.

Defining decoupled Drupal

The simplest definition of decoupled Drupal is also one that adheres to the larger definition of decoupled CMS (and an exhaustive definition is also available in my book Decoupled Drupal in Practice). In short, a decoupled CMS is a content or data service that exposes data for consumption by other applications, whatever these applications are built in, including native mobile, native desktop, and single-page applications. Whereas a single Drupal site could serve a single consumer, in today’s landscape, many practitioners are implementing a single Drupal site that simultaneously acts as a repository for a wide variety of consumers.

The original iteration of decoupled Drupal came in the mid-2010s with the advent of progressively decoupled Drupal. In this paradigm, rather than separating out the front end into a separate implementation, a JavaScript framework could be interpolated into the existing Drupal front end and have access not only to ES2015 capabilities but also certain data Drupal makes available to its presentation layer.

Universal JavaScript

With the proliferation of Node.js and the enablement of server-side JavaScript, Drupal began to be relegated more to concerns surrounding API provisioning and structured content management, while JavaScript application libraries and frameworks like React and Vue.js could take over for all rendering concerns, not only on the client side but also on the server (for both progressive enhancement and search engine optimization purposes).

A typical architecture that implements decoupled Drupal in conjunction with a universal JavaScript application (shared JavaScript code for rendering across both client and server) would facilitate the following interactions: During the initial server-side render executed by Node.js, the application fetches all data synchronously from Drupal to flesh out the render that will be flushed to the browser. Then, when the client-side bundle of the application initializes, the initial render is rehydrated with further asynchronous client-side renders that retrieve updated data from Drupal as needed.

There are many risks and rewards involved in implementing a decoupled architecture of this nature, especially in terms of architecture, developer experience, security and performance, and project management. For more information about these advantages and disadvantages as well as more detailed background on decoupled Drupal, consult my new book Decoupled Drupal in Practice (Apress, 2018).

An alternative API: RELAXed Web Services

While JSON:API and GraphQL have seemingly received all the airtime when it comes to web services available in Drupal, there is another web service implementation that not only adheres to a commonly understood specification, like JSON:API and GraphQL, but also enables a variety of new functionality related to content staging and offline-enabled website features. RELAXed Web Services, a module maintained by Tim Millwood and Andrei Jechiu, implements the Apache CouchDB specification and is part of the Drupal Deploy ecosystem, which provides modules that allow for rich content staging.

An implementation of CouchDB stores data within JSON documents (or resources) exposed through a RESTful API. And unlike Drupal’s own core REST API, now mostly superseded by the availability of JSON:API in core as of Drupal 8.7, CouchDB implementations accept not only the typical HTTP methods of GET, POST, and DELETE, but also PUT and COPY.

[Image: diagram of how Drupal's major web services modules relate to the REST and Serialization modules]

RELAXed Web Services occupies a relatively unique place in the Drupal web services ecosystem. The diagram above, which is not exhaustive, illustrates some of the ways in which Drupal’s major web services modules interact. Some depend on only the Serialization module, such as Drupal’s JSON:API implementation (prior to its entry into Drupal core), while others such as GraphQL rely on nothing at all. RELAXed Web Services relies on both REST and Serialization in order to provide its responses.

Thus, we can consider RELAXed Web Services part of the RESTful API segment of Drupal’s web services. The above Euler diagram illustrates how GraphQL, because it does not adhere to REST principles, remains uniquely distinct from other modules such as core REST, JSON:API, and RELAXed Web Services. While all RESTful APIs are web services, not all web services are RESTful APIs.

Installing and configuring RELAXed Web Services

To install RELAXed Web Services, you’ll need to use Composer to install both the relaxedws/replicator dependency and the module itself:

    $ composer require relaxedws/replicator:dev-master
    $ composer require drupal/relaxed
    $ drush en -y relaxed

Fortunately, RELAXed Web Services does not require you to use its content staging capabilities if you do not wish to, but you will need to configure the Replicator user and install the separate Workspaces module if you do. Without the Workspaces module enabled, the default workspace available in RELAXed Web Services is live, and we will see in the next installment of this blog series why that name is so important.

The screenshot below displays the RELAXed Web Services settings page, where you can configure information such as the Replicator user and customize an API root if you wish to prefix references to your resources with something different.

[Image: RELAXed Web Services settings page]

While covering the full range of RELAXed Web Services' capabilities is beyond the scope of this first installment, I strongly encourage you to take a look at what is available with the help of the Apache CouchDB specification, as some of the use cases that this approach can enable are unprecedented when it comes to the future of user experiences leveraging decoupled Drupal.

Conclusion

In this blog post, we embarked on a rapid-fire reintroduction to decoupled Drupal for those unfamiliar with the topic as well as a deep dive into one of the most fascinating and seldom-discovered modules in the decoupled Drupal space, RELAXed Web Services, which implements the Apache CouchDB specification. In the process, we covered how to install the module; in the next installment, we will turn to how to use RELAXed Web Services to implement a variety of data requirements in decoupled Drupal architectures.

In the next installment in this multi-part blog series, we'll cover how to employ RELAXed Web Services for common needs in decoupled Drupal architectures and some of the intriguing ways in which the module and its surrounding ecosystem enable not only content staging use cases but also offline-enabled features that satisfy the widening demands that many clients working with decoupled Drupal today have on a regular basis.

Special thanks to Michael Meyers for his feedback during the writing process.

Photo by Michael Dziedzic on Unsplash

May 18 2020

Of all the discussions in the Drupal community, few have generated such a proliferation of blog posts and conference sessions as decoupled Drupal, which is also the subject of a 2019 book and an annual New York conference—and has its share of risks and rewards. But one of the most pressing concerns for Drupal is how we can ensure a future for our open-source content management system (CMS) that doesn't relegate it to the status of a replaceable content repository. In short, we have to reinvent Drupal to provide not only the optimal back-end experience for developers, but also a front end that ensures Drupal's continued longevity for years to come.

A few months ago, Fabian Franz (Senior Technical Architect and Performance Lead at Tag1 Consulting) offered up an inspirational session that presents a potential vision for Drupal's front-end future that includes Web Components and reactivity in the mix. In Fabian's perspective, by adopting some of the key ideas that have made popular JavaScript frameworks famous among front-end developers, we can ensure Drupal's survival for years to come.

In this multi-part blog series that covers Fabian's session in detail from start to finish, we summarize some of the key ideas that could promise an exciting vision not only for the front-end developer experience of Drupal but also for the user experience all Drupal developers have to offer their customers. In this fifth installment in the series, we continue our analysis of some of the previous solutions we examined and consider some of the newfangled approaches made possible by this evolution in Drupal.

The "unicorn dream"

Before we get started, I strongly recommend referring back to the first, second, third, and fourth installments of this blog series if you have not already. They cover essential background information and insight into all of the key components that constitute the vision that Fabian describes. Key concepts to understand include Drupal's render pipeline, virtual DOMs in React and Vue, the growing Twig ecosystem, universal data stores, and how reactivity can be enabled in Drupal.

One of the final questions Fabian asks in his presentation is about the promise unleashed by the completion of work to enable shared rendering in Drupal, as well as reactivity and offline-enabled functionality. During his talk, Fabian recalls a discussion he had at DrupalCon Los Angeles with community members about what he calls the unicorn dream: an as-yet unfulfilled vision to enable the implementation of a Drupal site front end with nothing more than a single index.html file.

Slots

Fabian argues that the component-driven approach that we have outlined in this blog series makes this unicorn dream possible thanks to slots in Web Components. Because React, Vue, and Twig all have slots as part of their approaches to componentization, this dream is more attainable than ever before. Front-end developers can insert repeatable blocks with little overhead while still benefiting from configuration set by editors, who never touch a single line of code yet still influence the rendered output. Developers can then extend such a block rather than overriding it.

Consider, for instance, the following example that illustrates leveraging an attribute to override the title of a block:

    <sidebar type="left">
      <block slot="header" id="views:recent_content">
        <h2 slot="title">I override the title</h2>
      </block>
    </sidebar>

When Fabian attempted to do this with pure Twig, he acknowledges that the complexity quickly became prohibitive, and the prototype never reached core readiness. However, thanks to this approach using Web Components slots, one could create plugins for modern editors that would simply use and configure custom elements. Among editors that would support this hypothetical scenario are heavyweights like CKEditor 5, ProseMirror (which Tag1 recently selected following an evaluation of rich-text editors), and Quip.

Developer experience improvements

This means that we as developers no longer need to convert the display of tokens through a variety of complex approaches. Instead, we can simply render HTML and directly output the configured component; Drupal will handle the rest:

    <drupal-image id="123" />

Moreover, leveraging BigPipe placeholders with default content finally becomes simple thanks to this approach, argues Fabian. We can simply place default content within the component, and once the content arrives, it becomes available for use:

    <block id="views:recent_content" drupal-placeholder="bigpipe">
      I am some default content!
    </block>

In this way, we can take advantage of our existing work implementing BigPipe in Drupal 8 rather than resorting to other JavaScript to resolve this problem for us.

Performance improvements

Finally, some of the most important advancements we can make come in the area of performance. For front-end developers who need to serve the ever-heightening demands of customers needing the most interactive and reactive user experience possible, performance is perennially a paramount consideration. When using a universal data store, performance can be improved drastically, particularly when the store is utilized for as many data requirements as possible.

We can simply update the real-time key-value store, even if that store happens to live solely on Drupal. As Fabian argues, a data-driven mindset makes the problem of shared rendering and componentization in Drupal's front end much simpler to confront. Developers, contends Fabian, can export both the data and template to a service such as Amazon S3 and proceed to load the component on an entirely different microsite, thus yielding benefits not only for a single site but for a collection of sites all relying on the same unified component, such as <my-company-nav />.

Such an approach would mean that this company-wide navigation component would always be active on all sites requiring that component, simplifying the codebase across a variety of disparate technologies.

Editorial experience improvements

Nonetheless, perhaps some of the most intriguing benefits come from improvements to the editorial experience and advancements in what becomes possible despite the separation of concerns. One of the chief complaints about decoupled Drupal architectures, and indeed one of its most formidable disadvantages, is the loss of crucial in-context functionality that editors frequently rely on on a daily basis such as contextual links and in-place editing.

With Fabian's approach, the dream of achieving contextual administrative interfaces within a decoupled Drupal front end, which formerly seemed utterly impossible, becomes not only possible but realistic. We can keep key components of Drupal's contextual user interface, such as contextual links, as part of the data tree rather than admitting to our customers that such functionality would need to vanish in a scenario enabling greater reactivity and interactivity for users.

After all, one of the key critiques of decoupled Drupal and JavaScript approaches paired with Drupal, as I cover in my book Decoupled Drupal in Practice, is the lack of support for contextual interfaces and live preview, though I've presented on how Gatsby can mitigate some of these issues. Not only does this solution allow for contextual interfaces like contextual links to remain intact; it also means that solutions like progressive decoupling also become much more feasible.

Moreover, one of the key benefits of Fabian's approach is Drupal's capacity to remain agnostic to front-end technologies, which guarantees that Drupal is never coupled to a framework that could become obsolete in only a few years, without having to reinvent the wheel or create a Drupal-native JavaScript framework. And one of the key defenses of Fabian's vision is this rousing notion: We can potentially enable single-page applications with Drupal without having to write a single line of JavaScript.

Outstanding questions

Despite the rousing finish to Fabian's session, pertinent questions and concerns remain about the viability of his approach, as borne out during the question-and-answer period that followed. One member of the audience cited the large number of examples written in Vue and asked whether other front-end technologies could truly be used successfully to implement the pattern that Fabian prescribes. Fabian responded by stating that some work will be necessary to implement this in each framework's own virtual DOM, but in general the approach is possible, as long as a customizable render() function is available.

Another member of the audience asked how Drupal core needs to evolve in order to enable the sort of future Fabian describes. Fabian answered by recommending that more areas in Drupal responsible for rendering should be converted to lazy builders. This is because once no dependencies in the render tree are present, conversion to a component tree would be much simpler. Fabian also cited the need for a hook that would traverse the DOM to discover custom components after each rendering of the Twig template. Thus, the main difference would be writing HTML in lieu of a declaration in Twig such as {% include menu-item %}.
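
For readers unfamiliar with the pattern, here is a minimal sketch of what a lazy builder looks like in a render array; the service method and its argument are hypothetical.

    <?php

    // Delegate rendering of this area to a lazy builder callback, which runs
    // late and independently of the rest of the render tree, so BigPipe can
    // replace the placeholder once the content is ready.
    $build['recent_content'] = [
      '#lazy_builder' => ['my_module.builder:recentContent', [5]],
      '#create_placeholder' => TRUE,
    ];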

Conclusion

In this fifth and final installment of our multi-part blog series about a visionary future for Drupal's front end, we examined Fabian's rousing DrupalCon Amsterdam session to discuss some of the benefits that reactivity and offline-first approaches could have in Drupal, as well as a framework-agnostic front-end vision for components that potentially extends Drupal's longevity for many years to come. For more information about these concepts, please watch Fabian's talk and follow our in-house Tag1 Team Talks for discussion about this fascinating subject.

Special thanks to Fabian Franz and Michael Meyers for their feedback during the writing process.

Photo by Stephen Leonardi on Unsplash

May 13 2020

What is the day-to-day life of a Drupal core committer like? Besides squashing bugs and shepherding the Drupal project, the maintainers responsible for Drupal core are also constantly thinking of ways to improve the developer experience and upgrade process for novice and veteran Drupal users alike. With Drupal 9 coming just around the corner, and with no extended support planned for Drupal 8 thanks to a more seamless transition to the next major release, Drupal's core developers are hard at work building tools, approving patches, and readying Drupal 9 for its day in the spotlight. But Drupal 9 isn't the only version that requires upkeep and support—other members of the Drupal core team also ensure the continued longevity of earlier versions like Drupal 7.

The impending release of Drupal 9 has many developers scrambling to prepare their Drupal implementations and many module maintainers working hard to ensure their contributed plugins are Drupal 9-ready. Thanks to Gábor Hojtsy's offer of #DrupalCares contributions in return for Drupal 9-ready modules, there has been a dizzying acceleration in the number of modules that will be available as soon as Drupal 9 lands. In addition, the new Rector module allows Drupal contributors to access a low-level assessment of what needs to change in their code to be ready for the Drupal 9 launch.

In this inaugural episode of Core Confidential, the insider guide to Drupal core development and Tag1's new series, we dive into the day-to-day life of a core committer and what you need to know about Drupal 9 readiness with the help of Fabian Franz (VP of Software Engineering at Tag1), Michael Meyers (Managing Director at Tag1), and your host Preston So (Editor in Chief at Tag1 and author of Decoupled Drupal in Practice). Learn more about how Drupal's core team continues to support the Drupal project as it gets ready for the latest and greatest in Drupal 9, due to be released this summer for eager CMS practitioners worldwide.



May 12 2020

Load testing is one of the tools we leverage regularly at Tag1. It can help prevent website outages, stress test code changes, and identify bottlenecks. The ability to run the same test repeatedly gives critical insight into the impact of changes to the code and/or systems. Often -- as part of our engagements with clients -- we will write a load test that can be leveraged and re-used by the client into the future.

In some cases, our clients have extensive infrastructures and multi-layered caches, including CDNs, that also need to be load tested. In these instances, it can take a considerable amount of computing power to generate sufficient load to apply stress and identify bottlenecks. This ultimately led us to write and open source Goose, a new and powerful load testing tool.

Goose takes its inspiration from Locust, which had been our tool of choice. Discovering Locust was a breath of fresh air, solving many of the frustrations we used to have when load testing with jMeter. Instead of working with a clunky UI to build sprawling, bloated JMX configuration files, Locust allows truly flexible test plans to be written in pure Python. This allows code to be easily re-used between projects, and swarms of distributed Locusts can easily be spun up to apply distributed load during testing. Locust added considerable power and flexibility to our load testing capabilities, and made the entire process more enjoyable.

Though Python is a great language that allows for quickly writing code, it's not without flaws. Locust uses resources more efficiently than jMeter, but the Python GIL, or Global Interpreter Lock, locks Python to a single CPU core. Fortunately, you can work around this limitation by starting a "slave process" for each core, and then performing the load test with a "master process" all running on the same server. Locust is therefore able to work around some of Python's limitations thanks to its excellent support for distributed load testing.

Recently we've been hearing a lot about the Rust language, and were curious to see if it could improve some of our standard toolset. The language has a steep learning curve primarily due to its concept of ownership, an ingenious solution to memory management that avoids the need for garbage collection. The language focuses on correctness, trading slower compilation times for extremely performant and reliable binaries. And there's a lot of (well earned) hype about how easy it is to write safe multithreaded code in Rust. It seemed like an excellent way to increase our ability to load test large client websites with fewer load testing server resources.

The Rust ecosystem is still fairly young and evolving, but there are already fantastic libraries providing much flexibility when load testing. The compiler ensures that you're writing safe code, and the resulting binaries tend to be really fast without extra programming effort. For these reasons, it was looking like Rust would be an excellent language to use for load testing.

Indeed, once we had an early prototype of Goose, we were able to run some comparisons, and have seen amazing performance improvements compared to similar load tests run with Locust. With the same test plan, Goose is consistently able to generate over five times as much traffic as Locust using the same CPU resources on the test server. As you add more CPU cores to the testing infrastructure, Goose's multithreaded Rust implementation seamlessly takes advantage of the added resources without additional configuration.

When writing Goose, we were primarily interested in preserving specific functionality from Locust that we use regularly. We first identified the run-time options we depend on the most, and then used the Rust structopt library to add them as command line options to the then still non-functional Goose. We then worked option by option, studying how each is implemented in Locust and reimplementing it in Rust. The end result can be seen by passing the -h flag to one of the included examples.
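As a rough illustration of that approach, structopt lets run-time options be declared as an annotated struct and parsed automatically. The struct and option names below are illustrative only, not Goose's actual definitions:

    use structopt::StructOpt;

    // Illustrative only: a few options similar to those Goose exposes,
    // declared the way structopt expects them.
    #[derive(Debug, StructOpt)]
    #[structopt(name = "loadtest", about = "CLI options for a load test")]
    struct Configuration {
        /// Number of concurrent clients (defaults to available CPUs)
        #[structopt(short, long)]
        clients: Option<usize>,

        /// How many users to spawn per second
        #[structopt(short = "r", long, default_value = "1")]
        hatch_rate: usize,

        /// Prints stats in the console
        #[structopt(long)]
        print_stats: bool,
    }

    fn main() {
        // structopt generates the parser, the -h help text, and error handling.
        let configuration = Configuration::from_args();
        println!("{:?}", configuration);
    }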

CLI Options

The easiest way to develop Rust libraries and applications is with Cargo, the Rust package manager. Goose includes some example load tests to demonstrate how to write them, each of which can be run with Cargo. To compile and run the included simple example and pass the resulting application the -h flag, you can type:

$ cargo run --example simple --release -- -h
    Finished release [optimized] target(s) in 0.06s
     Running `target/release/examples/simple -h`
client 0.5.8
CLI options available when launching a Goose loadtest, provided by StructOpt

USAGE:
    simple [FLAGS] [OPTIONS]

FLAGS:
    -h, --help            Prints help information
    -l, --list            Shows list of all possible Goose tasks and exits
    -g, --log-level       Log level (-g, -gg, -ggg, etc.)
        --only-summary    Only prints summary stats
        --print-stats     Prints stats in the console
        --reset-stats     Resets statistics once hatching has been completed
        --status-codes    Includes status code counts in console stats
    -V, --version         Prints version information
    -v, --verbose         Debug level (-v, -vv, -vvv, etc.)

OPTIONS:
    -c, --clients <clients>            Number of concurrent Goose users (defaults to available CPUs)
    -r, --hatch-rate <hatch-rate>      How many users to spawn per second [default: 1]
    -H, --host <host>                  Host to load test in the following format: http://10.21.32.33 [default: ]
        --log-file <log-file>          [default: goose.log]
    -t, --run-time <run-time>          Stop after the specified amount of time, e.g. (300s, 20m, 3h, 1h30m, etc.)
                                       [default: ]

Statistics

Goose displays the same statistics as Locust, though we chose to split the data into multiple tables in order to make the tool more useful from the command line. The following statistics were displayed after running a load test using the included drupal_loadtest example with the following options (which should look familiar to anyone who has experience running Locust from the command line):


    cargo run --release --example drupal_loadtest --   --host=http://apache.fosciana -c 100 -r 10 -t 15m --print-stats --only-summary -v

The load test ran for fifteen minutes, then automatically exited after displaying the following statistics:


------------------------------------------------------------------------------ 
 Name                    | # reqs         | # fails        | req/s  | fail/s
 ----------------------------------------------------------------------------- 
 GET (Auth) comment form | 13,192         | 0 (0%)         | 14     | 0    
 GET (Auth) node page    | 43,948         | 0 (0%)         | 48     | 0    
 GET (Auth) login        | 20             | 0 (0%)         | 0      | 0    
 GET (Anon) user page    | 268,256        | 0 (0%)         | 298    | 0    
 GET static asset        | 8,443,480      | 0 (0%)         | 9,381  | 0    
 GET (Auth) user page    | 13,185         | 0 (0%)         | 14     | 0    
 GET (Anon) node page    | 894,176        | 0 (0%)         | 993    | 0    
 POST (Auth) login       | 20             | 0 (0%)         | 0      | 0    
 GET (Auth) front page   | 65,936         | 1 (0.0%)       | 73     | 0    
 POST (Auth) comment f.. | 13,192         | 0 (0%)         | 14     | 0    
 GET (Anon) front page   | 1,341,311      | 0 (0%)         | 1,490  | 0    
 ------------------------+----------------+----------------+--------+--------- 
 Aggregated              | 11,096,716     | 1 (0.0%)       | 12,329 | 0    
-------------------------------------------------------------------------------
 Name                    | Avg (ms)   | Min        | Max        | Median    
 ----------------------------------------------------------------------------- 
 GET (Auth) comment form | 108        | 16         | 6271       | 100       
 GET (Auth) node page    | 109        | 14         | 6339       | 100       
 GET (Auth) login        | 23147      | 18388      | 27907      | 23000     
 GET (Anon) user page    | 13         | 1          | 6220       | 4         
 GET static asset        | 4          | 1          | 6127       | 3         
 GET (Auth) user page    | 57         | 8          | 6205       | 50        
 GET (Anon) node page    | 13         | 1          | 26478      | 4         
 POST (Auth) login       | 181        | 98         | 234        | 200       
 GET (Auth) front page   | 83         | 16         | 6262       | 70        
 POST (Auth) comment f.. | 144        | 25         | 6294       | 100       
 GET (Anon) front page   | 5          | 1          | 10031      | 3         
 ------------------------+------------+------------+------------+------------- 
 Aggregated              | 6          | 1          | 27907      | 3         
-------------------------------------------------------------------------------
 Slowest page load within specified percentile of requests (in ms):
 ------------------------------------------------------------------------------
 Name                    | 50%    | 75%    | 98%    | 99%    | 99.9%  | 99.99%
 ----------------------------------------------------------------------------- 
 GET (Auth) comment form | 100    | 100    | 200    | 300    | 1000   |   1000
 GET (Auth) node page    | 100    | 100    | 200    | 300    | 1000   |   1000
 GET (Auth) login        | 23000  | 25000  | 27907  | 27907  | 27907  |  27907
 GET (Anon) user page    | 4      | 8      | 90     | 100    | 200    |    200
 GET static asset        | 3      | 6      | 10     | 10     | 30     |     30
 GET (Auth) user page    | 50     | 60     | 100    | 100    | 2000   |   2000
 GET (Anon) node page    | 4      | 7      | 200    | 200    | 300    |    300
 POST (Auth) login       | 200    | 200    | 200    | 200    | 200    |    200
 GET (Auth) front page   | 70     | 100    | 200    | 200    | 1000   |   1000
 POST (Auth) comment f.. | 100    | 200    | 300    | 300    | 400    |    400
 GET (Anon) front page   | 3      | 6      | 10     | 10     | 30     |     30
 ------------------------+--------+--------+--------+--------+--------+------- 
 Aggregated              | 3      | 6      | 40     | 90     | 200    |   4000

Reviewing the above statistics, you can see there was a single error during the load test. Looking in the apache access_log, we find that it was a 500 error returned by the server when loading the front page as a logged in user:

127.0.0.1 - - [07/May/2020:01:26:34 -0400] "GET / HTTP/1.1" 500 4329 "-" "goose/0.5.8"

Goose introduces per-status-code counts, something not available in Locust. This can be enabled by specifying the --status-codes flag when running a load test, providing more insight into which errors or other response codes the web server returned during the load test. During one round of testing, Goose generated the following warning:


    06:41:12 [ WARN] "/node/1687": error sending request for url (http://apache.fosciana/node/1687): error trying to connect: dns error: failed to lookup address information: Name or service not known

In this particular case, no request was made as a DNS lookup failed, and so there was no status code returned by the server. Goose assigns client failures such as the above a status code of 0, which shows up in the status code table as follows:


-------------------------------------------------------------------------------
 Name                    | Status codes            
 -----------------------------------------------------------------------------
 GET static asset        | 125,282 [200]          
 GET (Auth) comment form | 1,369 [200]            
 GET (Anon) user page    | 11,139 [200]           
 GET (Anon) front page   | 55,787 [200]           
 POST (Auth) login       | 48 [200]               
 GET (Auth) node page    | 4,563 [200]            
 GET (Auth) front page   | 6,854 [200]            
 GET (Anon) node page    | 37,091 [200], 1 [0]    
 GET (Auth) login        | 48 [200]               
 POST (Auth) comment f.. | 1,369 [200]            
 GET (Auth) user page    | 1,364 [200]            
-------------------------------------------------------------------------------
 Aggregated              | 244,914 [200], 1 [0]    

As with all other statistics tables, Goose breaks things out per request, as well as giving an aggregated summary of all requests added together.

Weights

Load tests are collections of one or more task sets, each containing one or more tasks. Each "client" runs in its own thread and is assigned a task set, repeatedly running all contained tasks. You can better simulate real users or your desired load patterns through weighting, causing individual tasks to run more or less frequently, and individual task sets to be assigned to more or fewer client threads.

When using Locust we’ve frequently found its heuristic style of assigning weights frustrating, as large weights mixed with small weights within a task set can lead to individual tasks never running. Goose is intentionally very precise when applying weights. If a task set has two tasks -- for example, task "a" with a weight of 1 and task "b" with a weight of 99 -- it will consistently run task "a" one time and task "b" ninety-nine times, each and every time it loops through the task set. The order of the tasks, however, is randomly shuffled each time the client thread loops through the task set.
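As a rough sketch, the 1/99 split described above might look like the following. This is written against the 0.5-era API discussed in this post; the exact type and method names (GooseState, GooseTaskSet, GooseTask, set_weight, and the task function signature) are assumptions based on that description, not a definitive reference:

    use goose::goose::{GooseClient, GooseTask, GooseTaskSet};
    use goose::GooseState;

    fn task_a(client: &mut GooseClient) {
        let _response = client.get("/a");
    }

    fn task_b(client: &mut GooseClient) {
        let _response = client.get("/b");
    }

    fn main() {
        // Each pass through this task set runs task_a once and task_b
        // ninety-nine times, in a freshly shuffled order.
        GooseState::initialize()
            .register_taskset(
                GooseTaskSet::new("WeightedTasks")
                    .register_task(GooseTask::new(task_a).set_weight(1))
                    .register_task(GooseTask::new(task_b).set_weight(99)),
            )
            .execute();
    }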

Sequences

A client is assigned one task set, and by default will run all contained tasks in a random order, shuffling the order each time it completes the running of all tasks. In some cases, it can be desirable to better control the order client threads run tasks. Goose allows you to optionally assign a sequence (any integer value) to one or more tasks in a task set, controlling the order in which client threads run the tasks. Tasks can be both weighted and sequenced at the same time, and any tasks with the same sequence value will be run in a random order, before any tasks with a higher sequence value. If a task set mixes sequenced tasks and unsequenced tasks, the sequenced tasks will always all run before the unsequenced tasks.

On Start

Tasks can also be flagged to only run when a client thread first starts. For example, if a task set is intended to simulate a logged in user, you likely want the user to log in only one time when the client thread first starts. For maximum flexibility, these tasks can also be sequenced and weighted if you want the tasks to run more than once, or multiple tasks to run in a specific order only when the client first starts.

On Stop

Similarly, Goose also allows tasks to be flagged to only run when a client thread is stopping. For example, you can have a client thread simulate logging out at the end of the load test. Goose client threads will only execute these tasks when a load test reaches the configured run time, or is canceled with control-c. As expected, these tasks can also be sequenced and weighted. You can also flag any task to run both at start time and at stop time.
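Putting weights, sequences, and the start/stop flags together, a task set simulating an authenticated user might log in once at start-up, log out at shutdown, and always load the front page before browsing nodes. As above, this is a sketch against the 0.5-era API described in this post, and method names such as set_sequence, set_on_start, and set_on_stop are assumptions based on that description:

    use goose::goose::{GooseClient, GooseTask, GooseTaskSet};
    use goose::GooseState;

    fn log_in(client: &mut GooseClient) {
        // A real login would POST the form, as the drupal_loadtest example does;
        // a simple GET keeps this sketch short.
        let _response = client.get("/user/login");
    }

    fn front_page(client: &mut GooseClient) {
        let _response = client.get("/");
    }

    fn node_page(client: &mut GooseClient) {
        let _response = client.get("/node/1");
    }

    fn log_out(client: &mut GooseClient) {
        let _response = client.get("/user/logout");
    }

    fn main() {
        GooseState::initialize()
            .register_taskset(
                GooseTaskSet::new("AuthenticatedUser")
                    // Runs one time when the client thread starts.
                    .register_task(GooseTask::new(log_in).set_on_start())
                    // Sequenced tasks run before unsequenced tasks, lowest value first.
                    .register_task(GooseTask::new(front_page).set_sequence(1))
                    .register_task(GooseTask::new(node_page).set_weight(9))
                    // Runs one time when the load test stops.
                    .register_task(GooseTask::new(log_out).set_on_stop()),
            )
            .execute();
    }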

Wait Time

If no wait time is assigned to a task set, any client threads running that set will execute tasks one after the other as rapidly as they can. This can generate large amounts of load, but it can also result in generating unrealistic loads, or it can bottleneck the load testing server itself. Typically you'd specify a wait time, which tells Goose client threads how long to pause after executing each task. Wait time is declared with a low-high integer tuple, and the actual time paused after each task is a randomly selected value from this range.
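Conceptually, the pause is just a uniformly random sleep drawn from that tuple, along the lines of the following sketch written with the rand crate's 0.8-style range syntax (in Goose itself the tuple is assigned to the task set rather than called directly):

    use rand::Rng;
    use std::{thread, time::Duration};

    // Sleep for a random whole number of seconds between `low` and `high`,
    // inclusive -- roughly what a wait time tuple such as (5, 15) means for a
    // client thread between tasks.
    fn pause_between_tasks(low: u64, high: u64) {
        let seconds = rand::thread_rng().gen_range(low..=high);
        thread::sleep(Duration::from_secs(seconds));
    }

    fn main() {
        pause_between_tasks(5, 15);
    }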

Clients

Rust has no global lock and thus is able to make far better use of available CPU cores than Python. By default Goose will spin up 1 client per core, each running in its own thread. You can use the --clients option to control how many total clients are launched, and the --hatch-rate option to control how quickly they are launched by specifying how many to launch per second. When you build more complex test plans and start launching thousands of concurrent clients, you’ll likely need to increase kernel level limits on the maximum number of open files. You'll also need to add some delays to the task set, by specifying a wait time as described above.

Run Time

If you don't specify a run time, Goose will generate load until you manually stop it. If you've enabled the display of statistics, they will be displayed as soon as you cancel the load test with control-c.

Naming Tasks

When using Goose's built-in statistics, by default each request is recorded and identified by the URL requested. As load tests get more complex, this can result in less useful statistics. For example, when load testing the Drupal Memcache module, one of our tasks loads a random node, which can generate up to 10,000 unique URLs. In this case, the Drupal-powered website follows the same code path to serve up any node, so we prefer that the statistics for loading nodes are all grouped together, instead of being broken out per node id.

This can be achieved by applying custom names at the task level, which causes all requests made within that task to be grouped together when displaying statistics. Names can also be specified at the request level, giving total flexibility over how statistics are grouped and identified. Naming tasks and requests is only relevant when displaying statistics.
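For example, a task that loads a random node might be registered with an explicit name so every request it makes is reported under a single row. This continues the hedged 0.5-era sketch from above; set_name in particular is an assumed builder method, following the description in this section rather than documented API:

    use goose::goose::{GooseClient, GooseTask, GooseTaskSet};
    use goose::GooseState;
    use rand::Rng;

    // Load a pseudo-random node: thousands of distinct URLs, one code path.
    fn load_random_node(client: &mut GooseClient) {
        let nid = rand::thread_rng().gen_range(1..=10_000);
        let _response = client.get(&format!("/node/{}", nid));
    }

    fn main() {
        // set_name (assumed) groups every request made by this task under one
        // statistics row, regardless of which node id it actually hits.
        GooseState::initialize()
            .register_taskset(
                GooseTaskSet::new("AuthenticatedUser")
                    .register_task(GooseTask::new(load_random_node).set_name("(Auth) node page")),
            )
            .execute();
    }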

Requests

The primary use-case of Goose is generating HTTP(S) requests. Each client thread initializes a Reqwest blocking client when it starts, and then this client is used for all subsequent requests made by that individual thread. And no, that's not a typo, the Rust library we're using is spelled "Reqwest". The Reqwest client automatically stores cookies, handles headers, and much more, simplifying the task of writing load test tasks. All available Reqwest functions can be called directly, but it's important to use the provided Goose helpers if you want accurate statistics, and if you want to be able to easily change the host the load test is applied against with a run-time flag.

Goose provides very simple GET, POST, HEAD, and DELETE wrappers, simplifying the most common request types. There are also two-part helpers that provide raw access to the underlying Reqwest objects, allowing more complex GET, POST, HEAD, DELETE, PUT, and PATCH requests.

By default, Goose will check the status code returned by the server, identifying 2xx codes as successes, and non-2xx codes as failures. It allows you to override this within your task if necessary, for example if you want to write a task that tests 404 pages and therefore considers a 404 status code as a success, and anything else including 2xx status codes as a failure. It can also be useful to review response bodies or headers and verify expected text or tags are where you expect them, flagging the response as a failure if not.
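To make the default concrete, the underlying check is essentially the one sketched below, written here directly against Reqwest's blocking client rather than Goose's wrappers; a 404-testing task simply inverts which status it treats as a success:

    use reqwest::blocking::Client;
    use reqwest::StatusCode;

    fn main() -> Result<(), reqwest::Error> {
        let client = Client::new();

        // Default behavior: any 2xx status code counts as a success.
        let response = client.get("http://apache.fosciana/").send()?;
        println!("front page success: {}", response.status().is_success());

        // A task written to verify 404 handling would instead treat only a 404
        // as a success, and anything else (2xx included) as a failure.
        let response = client.get("http://apache.fosciana/no-such-page").send()?;
        println!("404 test success: {}", response.status() == StatusCode::NOT_FOUND);

        Ok(())
    }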

Our first proof of concept for Goose was to load test a new version of the Drupal Memcache module. Years ago we started load testing each release with jMeter, an effective way to validate changes in low-level code that's trusted to help the performance of tens of thousands of Drupal websites. A few years ago these tests were rewritten in Python, as Locust had become our favored load testing tool at Tag1. Rewriting the tests once more in Rust thus seemed like an excellent place to start testing Goose, and offered a chance to make some early comparisons between the tools.

8-core Test System Running Against a 16-core Web Server

All of our current testing is being done on a single system with a 32-core AMD Threadripper, managed with Proxmox. We set up two VMs running Debian 10 during initial development, with a 16-core VM running Apache, PHP, and MySQL, and an 8-core VM running Goose. All server processes are restarted between tests, and the database is reloaded from a backup.

Once Goose supported all the functionality required by the Drupal Memcache load test, it was a good time to run some comparisons to better understand whether we were indeed benefiting from using Rust. To begin, we simply used the existing load testing VMs already set up for development. Of course, you generally wouldn't dedicate so many cores to the load testing tool relative to the web server.

Goose

Our old test plans "simulated" 100 users pounding the web pages as fast as possible (without any wait time), so we started with the same configuration for Goose. This is not a very realistic test, as real users would generally pause on each page, but we wanted to change as few variables as possible when getting started. And the primary intent of this load test is to put some stress on the Drupal Memcache module's code.

We launched the first Goose load test as follows:


    cargo run --release --example drupal_loadtest -- --host=http://apache.fosciana -c 100 -r 10 -t 1h --print-stats --only-summary -v

Surprisingly, this didn't put much strain on the load testing VM, consuming only about 40% of the available CPU resources. Goose creates a new thread for each client, and Rust has no global lock, so it should have been able to use all 8 cores available to it, yet clearly wasn't:

8 core Goose CPU

It generated nearly 60 Mbit/second of traffic for the duration of the test:

8 core Goose traffic

Further analysis revealed that the shared web and database server was the bottleneck. Specifically, several of the Goose task sets were logging in and posting comments so quickly that Drupal's caches were flushing 10 times a second, causing the MySQL database to bottleneck and slow everything down. This resulted in all the clients being blocked, waiting for the web page to return results.

The following are the Goose statistics output after one such load test run:


------------------------------------------------------------------------------ 
 Name                    | # reqs         | # fails        | req/s  | fail/s
 ----------------------------------------------------------------------------- 
 GET (Anon) user page    | 475,275        | 0 (0%)         | 132    | 0    
 POST (Auth) login       | 20             | 0 (0%)         | 0      | 0    
 GET (Auth) comment form | 37,295         | 0 (0%)         | 10     | 0    
 GET (Anon) front page   | 2,376,330      | 1 (0.0%)       | 660    | 0    
 GET (Auth) node page    | 124,327        | 0 (0%)         | 34     | 0    
 GET (Auth) login        | 20             | 0 (0%)         | 0      | 0    
 GET static asset        | 5,125,594      | 0 (0%)         | 1,423  | 0    
 GET (Auth) user page    | 37,293         | 0 (0%)         | 10     | 0    
 GET (Auth) front page   | 186,468        | 0 (0%)         | 51     | 0    
 POST (Auth) comment f.. | 37,295         | 0 (0%)         | 10     | 0    
 GET (Anon) node page    | 1,584,250      | 0 (0%)         | 440    | 0    
 ------------------------+----------------+----------------+--------+--------- 
 Aggregated              | 9,984,167      | 1 (0.0%)       | 2,773  | 0    
-------------------------------------------------------------------------------
 Name                    | Avg (ms)   | Min        | Max        | Median    
 ----------------------------------------------------------------------------- 
 GET (Anon) user page    | 75         | 1          | 6129       | 80        
 POST (Auth) login       | 149        | 68         | 309        | 100       
 GET (Auth) comment form | 189        | 23         | 6204       | 200       
 GET (Anon) front page   | 11         | 1          | 6043       | 7         
 GET (Auth) node page    | 190        | 17         | 6267       | 200       
 GET (Auth) login        | 49         | 3          | 140        | 40        
 GET static asset        | 2          | 1          | 100        | 1         
 GET (Auth) user page    | 93         | 7          | 6082       | 80        
 GET (Auth) front page   | 121        | 14         | 6115       | 100       
 POST (Auth) comment f.. | 281        | 40         | 1987       | 300       
 GET (Anon) node page    | 127        | 1          | 6280       | 100       
 ------------------------+------------+------------+------------+------------- 
 Aggregated              | 34         | 1          | 6280       | 4         
-------------------------------------------------------------------------------
 Slowest page load within specified percentile of requests (in ms):
 ------------------------------------------------------------------------------
 Name                    | 50%    | 75%    | 98%    | 99%    | 99.9%  | 99.99%
 ----------------------------------------------------------------------------- 
 GET (Anon) user page    | 80     | 100    | 200    | 200    | 300    |    300
 POST (Auth) login       | 100    | 200    | 300    | 300    | 300    |    300
 GET (Auth) comment form | 200    | 200    | 300    | 400    | 900    |    900
 GET (Anon) front page   | 7      | 10     | 30     | 40     | 100    |    100
 GET (Auth) node page    | 200    | 200    | 300    | 400    | 6000   |   6000
 GET (Auth) login        | 40     | 80     | 100    | 100    | 100    |    100
 GET static asset        | 1      | 2      | 10     | 20     | 30     |     30
 GET (Auth) user page    | 80     | 100    | 200    | 200    | 5000   |   5000
 GET (Auth) front page   | 100    | 100    | 200    | 200    | 6000   |   6000
 POST (Auth) comment f.. | 300    | 300    | 500    | 500    | 700    |    700
 GET (Anon) node page    | 100    | 200    | 400    | 400    | 500    |    500
 ------------------------+--------+--------+--------+--------+--------+------- 
 Aggregated              | 4      | 10     | 300    | 300    | 400    |   6000

The single failure above was a timeout, for which Goose displayed the following easy-to-understand error:

failed to parse front page: error decoding response body: operation timed out

Locust

We then configured Locust to generate the same load from the same 8-core VM. Python's Global Interpreter Lock quickly made an appearance, limiting how much traffic a single instance of Locust can generate.

The load test was launched with the following options:


    locust -f locust_testplan.py --host=http://apache.fosciana --no-web -c 100 -r 10 -t 1h --only-summary

Locust saturated a single core of the 8-core VM:

8 core Locust CPU

It also generated considerably less traffic, around 2.3 Mbit/second compared to the 58 Mbit/second generated by Goose:

8 core Locust network

The following are the complete Locust statistics output after one such load test run:


 Name                                                          # reqs      # fails     Avg     Min     Max  |  Median   req/s failures/s
--------------------------------------------------------------------------------------------------------------------------------------------
 GET (Anonymous) /node/[nid]                                    74860     0(0.00%)     296      21   11697  |     270   20.74    0.00
 GET (Anonymous) /user/[uid]                                    22739     5(0.02%)     297      14    8352  |     270    6.30    0.00
 GET (Anonymous) Front page                                    112904     0(0.00%)     312       4   12564  |     290   31.28    0.00
 GET (Auth) /node/[nid]                                         17768     0(0.00%)     296      24   10540  |     270    4.92    0.00
 GET (Auth) /user/[uid]                                          5200     1(0.02%)     293      15    6120  |     270    1.44    0.00
 GET (Auth) Comment form                                         5306     0(0.00%)     293      18    2330  |     270    1.47    0.00
 GET (Auth) Front page                                          26405     0(0.00%)     289      20   10137  |     260    7.32    0.00
 POST (Auth) Logging in: /user                                     20     0(0.00%)     370     105     706  |     350    0.01    0.00
 GET (Auth) Login                                                  20     0(0.00%)    2600     909    5889  |    2200    0.01    0.00
 POST (Auth) Posting comment                                     5306     0(0.00%)     448      34    5147  |     440    1.47    0.00
 GET (Static File)                                             835603     0(0.00%)     293       4   11965  |     270  231.51    0.00
--------------------------------------------------------------------------------------------------------------------------------------------
 Aggregated                                                   1106131     6(0.00%)     296       4   12564  |     270  306.46    0.00

Percentage of the requests completed within given times
 Type                 Name                                                           # reqs    50%    66%    75%    80%    90%    95%    98%    99%  99.9% 99.99%   100%
------------------------------------------------------------------------------------------------------------------------------------------------------
 GET                  (Anonymous) /node/[nid]                                         74860    270    330    370    390    440    480    530    570    810   5100  12000
 GET                  (Anonymous) /user/[uid]                                         22739    270    330    370    390    440    480    530    570    800   4800   8400
 GET                  (Anonymous) Front page                                         112904    290    350    390    410    450    490    530    560    790   5800  13000
 GET                  (Auth) /node/[nid]                                              17768    270    330    370    390    440    480    530    580    770   6600  11000
 GET                  (Auth) /user/[uid]                                               5200    270    330    370    390    440    480    540    580    840   6100   6100
 GET                  (Auth) Comment form                                              5306    270    330    370    390    440    480    530    580    750   2300   2300
 GET                  (Auth) Front page                                               26405    260    320    370    390    430    470    520    560    800   3600  10000
 POST                 (Auth) Logging in: /user                                           20    360    410    460    530    610    710    710    710    710    710    710
 GET                  (Auth) Login                                                       20   2400   3200   4000   4000   4400   5900   5900   5900   5900   5900   5900
 POST                 (Auth) Posting comment                                           5306    440    550    620    650    710    760    840    890   1200   5100   5100
 GET                  (Static File)                                                  835603    270    330    370    390    440    480    520    560    780   5200  12000
------------------------------------------------------------------------------------------------------------------------------------------------------
 None                 Aggregated                                                    1106131    270    330    380    400    440    480    530    570    800   5400  13000

Error report
 # occurrences      Error                                                                                               
--------------------------------------------------------------------------------------------------------------------------------------------
 5                  GET (Anonymous) /user/[uid]: "HTTPError('404 Client Error: Not Found for url: (Anonymous) /user/[uid]')"
 1                  GET (Auth) /user/[uid]: "HTTPError('404 Client Error: Not Found for url: (Auth) /user/[uid]')"      
--------------------------------------------------------------------------------------------------------------------------------------------

Distributed Locust

Fortunately, Locust has fantastic support for running distributed tests, and this functionality can also be utilized to generate more load from a multi-core server.

We first started the master Locust process as follows:


    locust -f locust_testplan.py --host=http://apache.fosciana --no-web -t 1h -c100 -r10 --only-summary --master --expect-slaves=8

We then launched eight more instances of Locust running in slave-mode, starting each one as follows:


    locust -f locust_testplan.py --host=http://apache.fosciana --no-web --only-summary --slave

The end result was eight individual Python instances all working in a coordinated fashion to generate load using all available CPU cores. The increased load is visible in the following CPU graph, where the load from a single Locust instance can be seen on the left, and the load from one master and eight slaves can be seen on the right:

8 core distributed Locust CPU

Perhaps more importantly, this resulted in considerably more network traffic, as desired:

8 core distributed Locust network

With both distributed Locust and standard Goose, we are using all available CPU cores, but our requests are being throttled by bottlenecks on the combined web and database server. In this distributed configuration, Locust was able to sustain a little over half as much network load as Goose.

1-core Testing System Running Against a 16-core Web Server with Varnish

From here we made a number of configuration changes, running new load tests after each change to profile the impact. Ultimately we ended up on a single-core VM for running the load tests, against a 16-core VM for running the LAMP stack. We also added Varnish to cache anonymous pages and static assets in memory, offloading these requests from the database.

We tuned the database for the most obvious gains, giving InnoDB more memory, disabling full ACID support to minimize flushing to disk, and turning off the query cache to avoid its global lock. We also configured Drupal to cache anonymous pages for a minimum of one minute. Our goal was to remove the server-side bottlenecks to better understand our load testing potential.


    [mysqld]
    innodb_buffer_pool_size = 1G
    innodb_flush_log_at_trx_commit = 0
    query_cache_size = 0

These combined changes removed the most extreme server-side bottlenecks.

Measuring Goose Performance on a Single Core VM

With the web server able to sustain more simulated traffic, we launched another Goose load test in order to see how much traffic Goose could generate from a single-CPU system. With a little trial and error, we determined that 12 clients loading pages as fast as they could produced the optimal load from a 1-core VM, initiating the test with the following options:


    cargo run --release --example drupal_loadtest --  --host=http://apache.fosciana -c 12 -r 2 -t1h --print-stats --only-summary -v

Goose was now bottlenecked only by running from a single CPU core, fairly consistently consuming 100% of its CPU resources:

1 core Goose CPU

And perhaps more importantly, Goose was able to generate 35Mbit/second of network traffic, all from a single process running on a single-core VM:

1 core Goose network

Using top to look at the server load, you can see that MySQL, Varnish, Memcached and Apache are all getting a healthy workout:

apache server top

And with varnishstat we can get some insight into where Varnish is spending its time. It's successfully serving most requests out of memory:

apache server varnishstat

Measuring Locust Performance on a Single Core VM

From the same single-core VM, we also ran the equivalent load test with Locust. We started it with similar command line options:


    locust -f locust_testplan.py --host=http://apache.fosciana --no-web -t 1h -c12 -r2 --only-summary

As seen below, Locust again pegs the single CPU at 100%. In fact, it's much more consistent about doing this than Goose is -- an apparent bug in Goose (see the dips and valleys on the left side of the chart below) -- something that still needs to be profiled and fixed:

1 core Locust CPU

However, while Locust produces steady load, it's only generating about 3Mbit/second of traffic versus Goose's 35Mbit/second. Now that there's no server bottleneck, Goose's true potential and advantages are far more visible. The following graph shows network traffic generated by Goose on the left side of the graph, and Locust on the right side. In both instances they are utilizing 100% CPU on the load test VM:

1 core Locust network

Speeding Up Locust through Optimization

We've used Locust enough to know it can generate significantly more load than this. Through profiling, we identified that the bottleneck was due to using Beautiful Soup to extract links from the pages. Parsing the HTML is really expensive! To solve this, we replaced the Beautiful Soup logic with a simple regular expression.

The load testing client continued to use 100% of the available CPU, but network traffic grew nearly four times, to 8 Mbit/second:

1 core Locust CPU

This was definitely a big step in the right direction! But the question remained, could we generate even more load from a single core?

Speeding Up Locust with FastHttpLocust

Locust includes an alternative HTTP client called FastHttp, which the documentation suggests can increase Locust's performance. We updated our test plan, switching from HttpLocust to FastHttpLocust. The defined tasks are simple enough that no other changes were necessary.

We then launched the load test again with the same parameters, and saw another dramatic improvement. Locust was now able to generate nearly 20 Mbit/second of sustained traffic.

1 core Locust CPU

Further optimizations, such as completely replacing Beautiful Soup with regular expressions, didn't produce any additional measurable gains.

On the web server, we see that Locust is doing a decent job of simulating load, putting some visible stress on server processes:

FastHttp Locust network

By comparison, recall from our earlier notes that Goose was able to generate over 35 Mbit/second. What's even more interesting is that it did this while leveraging heavy libraries to parse the HTML, extract links, and post comments. These libraries make writing load tests easier, but they lead to an obvious question: can we speed up Goose through the same optimizations we made to Locust?

Speeding Up Goose through Optimization

We did two rounds of optimizations on Goose. First, we replaced the select library with regular expressions, optimizing how we extract static assets from the page. Next, we also replaced the scraper library with regular expressions, optimizing how we log in and post comments.
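The static asset extraction, for example, went from a full DOM parse to something close to the following sketch using the regex crate (the pattern here is simplified; Goose's actual expression differs):

    use regex::Regex;

    // Collect the src attributes of img and script tags in a single pass,
    // instead of building a full DOM with a parser such as select or scraper.
    fn static_assets(html: &str) -> Vec<String> {
        let re = Regex::new(r#"src="([^"]+)""#).unwrap();
        re.captures_iter(html)
            .map(|capture| capture[1].to_string())
            .collect()
    }

    fn main() {
        let html = r#"<img src="/misc/druplicon.png"><script src="/misc/drupal.js"></script>"#;
        for asset in static_assets(html) {
            println!("static asset: {}", asset);
        }
    }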

As with Locust, we saw a considerable improvement. Goose was now able to generate 110 Mbit/second of useful network traffic, all from a single VM core!

1 core Optimized Goose Network

On the web server, Goose is giving all server processes a truly impressive workout:

1 core Optimized Goose Top

This additional load is consistent:

1 core Optimized Goose Network

And Varnish continues to serve most requests out of RAM:

1 core Optimized Goose Varnishstat

After an hour, Goose displayed the following statistics:


------------------------------------------------------------------------------ 
 Name                    | # reqs         | # fails        | req/s  | fail/s
 ----------------------------------------------------------------------------- 
 GET (Auth) node page    | 112,787        | 0 (0%)         | 31     | 0    
 GET (Anon) user page    | 416,767        | 0 (0%)         | 115    | 0    
 GET (Auth) login        | 3              | 0 (0%)         | 0      | 0    
 POST (Auth) login       | 3              | 0 (0%)         | 0      | 0    
 GET (Auth) front page   | 169,178        | 0 (0%)         | 46     | 0    
 GET static asset        | 13,518,078     | 0 (0%)         | 3,755  | 0    
 GET (Auth) comment form | 33,836         | 0 (0%)         | 9      | 0    
 GET (Anon) node page    | 1,389,225      | 0 (0%)         | 385    | 0    
 GET (Auth) user page    | 33,834         | 0 (0%)         | 9      | 0    
 GET (Anon) front page   | 2,083,835      | 0 (0%)         | 578    | 0    
 POST (Auth) comment f.. | 33,836         | 0 (0%)         | 9      | 0    
 ------------------------+----------------+----------------+--------+--------- 
 Aggregated              | 17,791,382     | 0 (0%)         | 4,942  | 0    
-------------------------------------------------------------------------------
 Name                    | Avg (ms)   | Min        | Max        | Median    
 ----------------------------------------------------------------------------- 
 GET (Auth) node page    | 27         | 10         | 5973       | 30        
 GET (Anon) user page    | 5          | 1          | 12196      | 1         
 GET (Auth) login        | 8899       | 6398       | 11400      | 9000      
 POST (Auth) login       | 64         | 57         | 74         | 60        
 GET (Auth) front page   | 22         | 14         | 6029       | 20        
 GET static asset        | 0          | 1          | 6030       | 1         
 GET (Auth) comment form | 27         | 10         | 5973       | 30        
 GET (Anon) node page    | 7          | 1          | 6038       | 1         
 GET (Auth) user page    | 13         | 6          | 6014       | 10        
 GET (Anon) front page   | 0          | 1          | 6017       | 1         
 POST (Auth) comment f.. | 38         | 20         | 265        | 40        
 ------------------------+------------+------------+------------+------------- 
 Aggregated              | 1          | 1          | 12196      | 1         
-------------------------------------------------------------------------------
 Slowest page load within specified percentile of requests (in ms):
 ------------------------------------------------------------------------------
 Name                    | 50%    | 75%    | 98%    | 99%    | 99.9%  | 99.99%
 ----------------------------------------------------------------------------- 
 GET (Auth) node page    | 30     | 30     | 50     | 50     | 70     |     70
 GET (Anon) user page    | 1      | 10     | 20     | 30     | 40     |     40
 GET (Auth) login        | 9000   | 9000   | 11000  | 11000  | 11000  |  11000
 POST (Auth) login       | 60     | 60     | 70     | 70     | 70     |     70
 GET (Auth) front page   | 20     | 20     | 40     | 40     | 50     |     50
 GET static asset        | 1      | 1      | 4      | 6      | 10     |     10
 GET (Auth) comment form | 30     | 30     | 50     | 50     | 70     |     70
 GET (Anon) node page    | 1      | 3      | 50     | 50     | 70     |     70
 GET (Auth) user page    | 10     | 10     | 20     | 30     | 40     |     40
 GET (Anon) front page   | 1      | 1      | 5      | 7      | 20     |     20
 POST (Auth) comment f.. | 40     | 40     | 60     | 70     | 100    |    100
 ------------------------+--------+--------+--------+--------+--------+------- 
 Aggregated              | 1      | 1      | 30     | 30     | 50     |     70

More optimizations are certainly possible. For example, just as Locust offers a faster FastHttp client, the Rust ecosystem also has HTTP clients faster than Reqwest. And as Goose is written in Rust, adding more cores to the load testing server gives it more power without any additional configuration.

Profiling


While Goose has proven quite capable at generating a lot of load, it's hard to miss the periodic dips visible in the Goose network traffic graphs. Some effort is required to profile the load testing tool under load, to understand what bottlenecks are causing this, and determine if it's fixable. Best case, the generated load should be steady, as is generally seen when load testing with Locust. Hopefully this issue can be fully understood and resolved in a future release.

Beyond that, this is a very early version of Goose, and as such is totally unoptimized. We are confident that with a little time and effort Goose's ability to generate load can be greatly improved.

Automated Testing

Cargo has built-in support for running tests, and Goose would benefit from considerably better test coverage. While there are already quite a few tests written, over time we aim to have nearly complete coverage.

More Examples

As of the 0.5.8 release, which was used to write this blog, Goose comes with two example load tests. The first, simple.rs, is a clone of the example currently found on the Locust.io website. It doesn't do much more than demonstrate how to set up a load test, including a simple POST task and some GET tasks. It is primarily useful to someone familiar with Locust who is looking to understand the differences in building a load test in Rust with Goose.

The second example, drupal_loadtest.rs, was previously discussed and is a clone of the load test Tag1 has been using to validate new releases of the Drupal Memcache module. It leverages much more Goose functionality, including weighting task sets and tasks, as well as parsing the pages that are loaded to confirm expected elements exist. Prior to our regular expression optimization, it leveraged the scraper library to extract the form elements required to log into a Drupal website and post comments. It also used the select library to extract links from returned HTML in order to load static elements embedded in image tags.

The plan is to add several other useful examples, providing additional recipes on how you might leverage Goose to load test your websites, API endpoints, and more. Contributed examples leveraging different libraries from the Rust ecosystem are very welcome!

API

Currently Goose is controlled entirely through run-time options specified on the command line. The development plan is to expose an API allowing the same functionality to be controlled and monitored in other ways.

Gaggles

The first intended use-case of the Goose API will be to add support for distributed load testing. Two or more Goose instances working together will be referred to as a Gaggle. A Goose Manager instance will be able to control one or more Goose Worker instances. If enabled, Goose Workers will also regularly send statistics data to the Goose Manager instance. We are also exploring the possibility of multi-tiered Gaggles, allowing a single Goose instance to be both a Worker and a Manager, making it possible to group together multiple Gaggles.

Web User Interface

The second intended use-case of the Goose API will be to add a simple UI for controlling and monitoring load tests from a web browser. As with everything else in Goose, the initial goal of this UI will be to clone the functionality currently provided in the Locust UI. Once that is working, we will consider additional functionality.

The web user interface will live in its own Cargo library for a couple of reasons. First, if you don't need the UI, you won't have to install it and its dependencies. Second, we hope eventually alternative UIs will be contributed by the open source community!

Async

Currently Goose uses Reqwest's blocking HTTP Client to load web pages. The Reqwest documentation explains:

"The blocking Client will block the current thread to execute, instead of returning futures that need to be executed on a runtime."

With each Goose client running in its own thread, blocking is likely the best simulation of a real user when building load tests. That said, as of Rust 1.39, released in November of 2019, Rust gained async-await syntax. We intend to explore adding support for Reqwest's default async-based Client as an optional alternative, as well as adding support for defining the tasks themselves as async. This should allow individual Goose client threads to generate much more network traffic.
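For reference, a request issued through Reqwest's default async Client looks like the following minimal sketch (this is plain Reqwest running on the Tokio runtime, not Goose code):

    use std::error::Error;

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn Error>> {
        // The async Client returns futures instead of blocking the thread.
        let client = reqwest::Client::new();
        let response = client.get("http://apache.fosciana/").send().await?;
        println!("status: {}", response.status());
        Ok(())
    }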

Multiple HTTP Clients

Related, we will also explore supporting completely different HTTP clients. There's nothing in Goose's design that requires it to work only with Reqwest. Different clients will have different performance characteristics, and may provide functionality required to load test your project.

The current intent is to keep Reqwest's blocking HTTP client as the default, and to make other clients available as compile-time Cargo features. If another client library proves to be more flexible or performant, it may ultimately become the default.

Macros

One of our favorite features of Locust is how easy it is to write load plans, partially thanks to their use of Python decorators. We hope to similarly simplify the creation of Goose load plans by adding macros, simplifying everything between initializing and executing the GooseState when writing a load plan. Our goal is that writing a load plan for Goose essentially be as simple as defining the individual tasks in pure Rust, and tagging them with one or more macros.

Though Goose is still in an early stage of development, it is already proving to be very powerful and useful. We're actively using it to prepare the next release of the Drupal Memcache module, ensuring there won't be unexpected performance regressions as mission critical websites upgrade to the latest release. We're also excited to leverage the correctness, performance, and flexibility provided by Rust and its ecosystem with future client load tests.

To get started using Goose in your own load testing, check out the comprehensive documentation. The tool is released as open source, contributions are welcome!

May 11 2020

Drupal is one of the largest and most active open-source software projects in the world. Behind the scenes is the Drupal Association, the non-profit organization responsible for enabling it to thrive by architecting and introducing new tooling and infrastructure to support the needs of the community and ecosystem. Many of us know the Drupal Association as the primary organizer of the global DrupalCon conference twice a year. But it's less common knowledge that the Drupal Association is actively engaged in Drupal development and maintains some of the most important elements of the Drupal project. This runs across the spectrum of software localizations, version updates, security advisories, dependency metadata, and other "cloud services" like the Drupal CI system that empower developers to keep building on Drupal.

With the ongoing coronavirus pandemic, the Drupal Association is in dire financial straits due to losses sustained from DrupalCon North America (one of the largest sources of funding) having to be held as a virtual event this year. As part of the #DrupalCares campaign, we at Tag1 Consulting implore organizations that use Drupal, companies that provide Drupal services, and even individuals who make their living off Drupal development to contribute in some shape or form to the Drupal Association in this time of need.

We are putting our money where our mouth is. For years we have donated at least eighty hours a month to support the DA and Drupal.org infrastructure and tooling. I’m proud to announce that we are expanding this commitment by 50%, to 120 hours a month of pro-bono work from our most senior resources, to help the DA offset some of its operating expenses. Furthermore, we contributed to help #DrupalCares reach its $100,000 goal, so that any donation you make is doubled in value.

To gain insights into building software communities at scale in open source, Michael Meyers (Managing Director at Tag1) and I (Preston So, Editor in Chief at Tag1 and author of Decoupled Drupal in Practice) recently kicked off a Tag1 Team Talks miniseries with the Drupal Association's engineering team, represented by Tim Lehnen (Chief Technology Officer at the Drupal Association) and Narayan Newton (Chief Technology Officer at Tag1), to examine all the ways in which the DA keeps the Drupal community ticking.

Why Tag1 supports the Drupal Association

Here at Tag1, we work with a diverse range of technologies, but Drupal has been our passion for many years. It's been a critical part of our business since Tag1's inception, and we're grateful to the Drupal Association for sustaining such an essential part of our work today. By no means is it an understatement to characterize the Drupal Association as the lifeblood of the Drupal ecosystem. Because of our appreciation for what Drupal has given us, we're committed to doing our part to giving back to Drupal, not only over the course of our many years working in concert with the Drupal Association but also right now during the #DrupalCares campaign.

How we contribute to Drupal

Though Tag1 is well-known for being the all-time number-two contributor to the Drupal project, with the largest concentration of core committers, branch managers, release managers, and core maintainers of any organization in the community, we're much less known for how we support the underlying foundations of the ecosystem. Beyond the more visible contributions of staff members like Moshe Weitzman, Nathaniel Catchpole (catch), Francesco Placella (plach), and Fabian Franz (fabianx), we also do much more than add our support to Drupal core development. After all, supporting Drupal requires more than just code; it also requires the tooling and infrastructure that keep the project's blood flowing.

During our Tag1 Team Talks episode with the Drupal Association, Tim Lehnen eloquently made the case for the non-profit that has driven Drupal's success for so many years: While the software makes up the bulk of open-source contributions, offering surrounding services that buttress the software's core is another key function that the Drupal Association performs. To that end, for many years, Tag1 has donated 80 hours of pro-bono work a month to ensure that Drupal.org and all the tooling the community relies on stays up and running. Tag1 is honored to increase our monthly contribution of pro-bono hours to the Drupal Association by 50% from 80 to 120 hours of expert work from our most senior resources. And now with our increased work hours and financial contributions, critical projects like the migration to GitLab can continue to move forward, even during a situation like the current pandemic.

Supporting Drupal's test infrastructure

In Drupal, a key aspect of code contribution is running tests that verify a patch will work against a massive variety of environments, be compatible with a spectrum of versions of Drupal, and not introduce any functional regressions in the code. One of the key questions many community members ask is why Drupal maintains its own testing infrastructure in lieu of a service such as TravisCI.

Unfortunately, whenever existing continuous integration solutions were tasked with running a Drupal core test for every Drupal patch, they would consistently time out, maxing out available resources. To solve the challenges associated with developing and testing at scale, the DA partnered with Tag1. We deployed our expertise in infrastructure, mission-critical application development, and performance and scalability to help run and maintain Drupal.org's servers and the DrupalCI test runner system. The CI system ensures that contributors have a reliable center for collaboration and a dependable test infrastructure for all of their patches and modules. Tag1's deep expertise has been critical to the success of the DrupalCI system, which we scaled dynamically to the extent that it is now concurrently running more than an entire decade's worth of testing in a single year.

The new testing infrastructure was an enormous undertaking for the Drupal Association due to its complexity. Narayan Newton opted from the early days to leverage standard Unix tools to build out the environments for testing targets. And rather than use Kubernetes for the orchestration of tests, the Drupal Association opted to use Jenkins and the EC2 Fleet plugin for DrupalCI. Jenkins manages the orchestration of virtual machines (VMs) and initializes them as test targets before running the tests themselves in a clean-room environment. As Narayan notes during our conversation, one of the most fascinating quirks of Drupal's infrastructure is that many of its core elements were installed before standardized tooling emerged to handle those use cases in a regimented way.

Supporting Drupal's migration to GitLab

In addition to our contributions to Drupal's underlying infrastructure, Tag1 also assists with key initiatives run by the Drupal Association such as the ongoing migration from Drupal's homegrown Git system to GitLab, a source control provider. According to Narayan, the migration to GitLab has been much more straightforward than previous migrations in Drupal's history, most notably the original migration from CVS to Git, which the project has used ever since. Code management in Drupal has long employed a bespoke Git approach, with a homegrown Git daemon written by the community and cgit as the web-based front end for Git repositories.

One of the key benefits GitLab provides to the Drupal Association is the fact that the DA is no longer responsible for building and supporting a source control system for Drupal at the scale at which it operates. After all, GitLab has a dedicated site reliability engineering (SRE) team focused on ensuring source availability even at high loads. And as Narayan notes, GitLab has been responsive to security issues, in addition to facilitating "one of the smoothest migrations I've been a part of." But this doesn't mean there weren't complications.

Because GitLab has a superset of features that include some existing Drupal.org functionality, the Drupal Association, supported by Tag1, worked closely with the GitLab team to ensure that certain features could be disabled for use with the Drupal project, avoiding many of the issues that have plagued the GitHub mirror of Drupal since its conception. Narayan contributed key features to ensure that GitLab's integration points could be toggled on and off in order to enable the unique needs and requirements of the Drupal community and ecosystem.

Tim adds that forklifting the entire Git code management infrastructure without downtime or disruption to the development community was a rousing success, especially given that there was no impact on a minor version release. In the process, the Drupal community has gained a number of key features that will enable accelerated development and conversation between contributors in ever-richer ways. In coming months, the Drupal Association will also facilitate the addition of GitLab's merge requests feature, which will introduce yet more efficiencies for those making code contributions.

Why #DrupalCares is so important

For us, Drupal is a key reason we exist, and the Drupal Association has done wonders to ensure the longevity of an open-source software project we hold dear. This is why, in these troubling times for the Drupal Association, it could not be more important to uphold the ideals of open source and ensure the survival of our beloved community and ecosystem. Over the course of the past month, we've witnessed an incredible outpouring of support from all corners of the community, amplified by matching donations from community members, including project lead Dries Buytaert himself. We at Tag1 Consulting have contributed toward #DrupalCares' $100,000 goal in order to multiply the impact of community donations and build on our existing support.

Without your support, whether as a company or an individual, we may never see another DrupalCon grace our stages or celebrate yet another major version release that introduces innovative features to the Drupal milieu. And it's not just about the more visible elements of the Drupal experience like DrupalCon. It's also about the invisible yet essential work the Drupal Association does to keep the Drupal project rolling along. Thanks to the innumerable contributions the Drupal Association has made to maintain DrupalCI, the GitLab migration, Composer Façade, and a host of other improvements to Drupal's infrastructure and tooling, with the support of Tag1, the Drupal project remains one of the most impressive open-source projects in our industry.

Conclusion

Here at Tag1, we believe in the enduring value of open source and its ability to enrich our day-to-day lives in addition to the way we do business. We're dedicated to deepening our already extensive support for the Drupal Association in ways both financial and technological. And now it's your turn to return the favor. If you're an individual community member, we strongly encourage you to start or renew a membership. If you're an organization or company in the Drupal space, we encourage you to contribute what you can to ensure the continued success of Drupal. Together, we can keep Drupal alive for a new era of contribution and community.

Special thanks to Jeremy Andrews and Michael Meyers for their feedback during the writing process.

Photo by Jon Tyson on Unsplash

May 06 2020
May 06

In recent years, it seems as if open source has taken the software world by storm. Nonetheless, many enterprise organizations remain hesitant to adopt open-source technologies, whether due to vendor lock-in or a preference for proprietary solutions. But open source can in fact yield substantial fruit when it comes to advancing your business in today’s highly competitive landscape. By leveraging and contributing back to open source, you can distinguish your business with open source as a competitive advantage.

A few years back, Michael Meyers (Managing Director at Tag1 Consulting) presented a keynote at Texas Camp 2016 about the individual and business benefits of open source. As part of that talk, he highlighted some of the best motivations for open-source adoption and the outsized benefits that open source delivers to not only individual developers but also businesses that are seeking to get ahead in recruiting, sales, and other areas. In this two-part blog series (read the first part), we analyze the positive effects of open source on everyone from individual developers to the biggest enterprises in the world, all of whom are benefitting from their adoption of open-source software.

In this second installment, we dive into some of the ways in which open-source technologies like Drupal can improve your bottom line, with the help of a hypothetical tale of two companies and real-world case studies that demonstrate that open source presents far more rewards than risks in the enterprise context.

A tale of two enterprises

As I wrote in the previous installment in this two-part series, individuals who participate in and contribute to Drupal garner immense benefits from open-source communities. Organizations can leverage these benefits themselves by encouraging their employees to attend open-source conferences and grow their expertise and knowledge.

Let’s consider a hypothetical scenario in which two enterprise organizations are attempting to outcompete others in their space. The first protagonist of our vignette is DIY Corporation (slogan: “reinventing the wheel since forever”), headquartered in the silos next to a nearby waterfall. The other is Collab Incorporated, which focuses on working with others.

Writing custom code vs. leveraging open source

In this hypothetical scenario, DIY Corporation downloads Drupal, one of the most commonly used open-source content management systems (CMS) in the world. However, it soon discovers that it needs to extend existing functionality to solve problems unique to its business requirements. DIY Corporation chooses to write its own code rather than leveraging others’ code, a common occurrence among organizations unaccustomed to open-source software. Writing custom code makes perfect sense in the moment, because it resolves the business need, but the trouble begins when developers leave and additional support is required. When DIY Corporation gets stuck, they have no one to turn to, because their code sits in a private repository.

Meanwhile, Collab Inc. first checks to see if a solution has already been committed to the open-source ecosystem in the form of a Drupal module or experimental sandbox project. The key distinction here is that only if no solution already exists does Collab Inc. decide to write one, and they choose to do so in public rather than in a silo. Too often, Drupal companies download software and write code in isolation rather than contributing that code back. If every organization opts to do this, then we negate the value of the open-source community in the first place.

A real-world example: Fivestar module

The key lesson from this hypothetical scenario is that sharing code from the beginning translates into better results for everyone across the board. By being open to contributions and ideas from others, we can resolve shared problems when we hit a wall. After all, other organizations will have a vested interest in your contributed code, because they are dependent on it and appreciative of the outcomes they have been able to achieve as a result.

A real-world example of this situation is Drupal’s Fivestar module, which does exactly what its name suggests: provide five-star ratings. Originally developed by Lullabot for Sony BMG, which needed to provide ratings on pages associated with the label’s musicians, it quickly found ubiquity across a variety of businesses leveraging Drupal. After the Fivestar module was released, Warner Music Group also contributed to the module’s codebase by upgrading it from Drupal 5 to Drupal 6. This illustrates an increasingly rare scenario in the hyper-competitive music landscape: two competitors helping each other for better results all around.

Thanks to Warner Music Group’s contributions, when Sony BMG finally needed to update all of their artist websites to Drupal 6, they simply employed the existing Drupal 6 module. Because of this strategic alliance, Sony BMG and Warner Music Group recognized that even as direct competitors in music, they were not competitors in the technology space, and they worked together to realize mutual benefits. In the end, technology is a commodity, and every dime spent duplicating code that already exists is money poorly spent. Their five-star rating systems are not a differentiator; instead of building separate code in that arena, they can focus on creating good music.

Recruiting talent in open source

Consider another scenario. Organizations are always looking to recruit the best talent, particularly in the Drupal ecosystem. Our hypothetical DIY Corporation posts to a job board, releases information about their opening, and wonders why their recruiting pipeline is running dry, especially since they are not a well-known household name. Because DIY Corporation has not focused on recruiting open-source developers within an open-source community, they have not attracted the interest they desire.

This brings us to a crucial point: locking up your code is a powerful disincentive for developers to join you. Developers who wish to grow their careers gravitate toward employers who have a vested interest in that growth as well. An employer that does not offer opportunities to engage with open-source communities will struggle to attract them. Thanks to open source, organizations can develop a bench of prospective hires who may be interested in joining in the future, thus expanding their recruiting pipelines.

The competitive advantage of open source

The benefits and advantages conferred by Drupal cannot be overstated. While the most overt benefit is Drupal’s cost-effectiveness, the more subtle, and perhaps more substantial, benefit is that you can participate in a global community with common methodologies and best practices that expand your sphere of knowledge and influence. Open source has been proven time and time again to be better, faster, and cheaper.

For agencies interested in getting involved in open source, there are huge opportunities. For instance, if a customer is looking to hire a consultancy to solve a particular problem, they have a clear choice between an agency that simply uses Drupal and one that actively and meaningfully contributes to the Drupal community.

Agencies can gain a significant advantage by contributing to open source. Granted, contributing to open source as a small agency can be difficult, and bench time can often be limited for developers not actively working on projects. However, organizations that do get involved and publicize their open-source contributions tend to get meaningfully more business as a result. For instance, prominent companies in the Drupal landscape such as Amazee Labs, Phase2, Chapter Three, and others with full-time Drupal contributors often have customers reaching out directly precisely because of their commitment to open source.

Conclusion

Getting involved in open source can yield substantial dividends for those who engage in it. Though there are thousands upon thousands of open-source projects in the wild that you can join, Drupal has an especially well-developed ecosystem for organizational contribution, including user groups and Drupal conferences around the world that are looking for sponsors interested in supporting open source. As a case in point, I organize a non-profit open-source conference in New York City called Decoupled Days, about decoupled Drupal (also the subject of my book), and we’re currently looking for more sponsors!

For businesses interested in contributing to open source, there are also business summits and events, such as Drupal Business Days, that can help you connect with other organizations exploring open-source software like Drupal. And there’s no need to be a developer to contribute to open source. In fact, among the most critical needs the Drupal community perpetually has are marketing and event support. That brings us to perhaps the most important message of open-source contributions: You, too, can contribute.

Special thanks to Michael Meyers for his feedback during the writing process.

Photo by Vincent van Zalinge on Unsplash

May 05 2020
May 05

Drupal is one of the largest and most active open-source projects in the world, and the Drupal Association is responsible for enabling it to thrive by creating and maintaining tooling and other projects that keep Drupal humming. Though many in the Drupal community and outside it see the Drupal Association only as the organizer of the global DrupalCon conferences each year, the Drupal Association is, in fact, responsible for some of the most critical elements that sustain Drupal as a software product, including localizations, updates, security advisories, metadata, and infrastructure. All of the "cloud services" that we work with on a daily basis in the Drupal ecosystem represent fundamental functions of the Drupal Association.

In recent years, the Drupal Association has launched several features that reinvent the way developers interact with Drupal as a software system, including DrupalCI (Drupal's test infrastructure), Composer Façade (in order to support Drupal's adoption of Composer), and Drupal's ongoing migration to GitLab for enhanced source control. For many years, Tag1 Consulting has supported and contributed to the Drupal Association not only as a key partner in visible initiatives but also in the lesser-known aspects of the Drupal Association's work that keep Drupal.org and the ecosystem running. Though we've long provided 80 free hours of work a month to the Drupal Association, we're proud to announce we are expanding this commitment by 50% to 120 pro-bono hours per month (75% of an FTE). In addition, we have made a donation toward #DrupalCares' $100,000 goal.

In this special edition of the Tag1 Team Talks show, we introduce a miniseries with the engineering team at the Drupal Association, including Tim Lehnen (Chief Technology Officer, Drupal Association) and Narayan Newton (Chief Technology Officer, Tag1), along with Michael Meyers (Managing Director, Tag1) and Preston So (Editor in Chief at Tag1 and Senior Director, Product Strategy at Oracle). In this first installment, we dive into some of the mission-critical work the Drupal Association performs for the Drupal community with the support of Tag1 and other organizations, and how that work represents the lifeblood of the Drupal project and sustains its continued longevity.


Apr 13 2017
Apr 13

Apache JMeter and I have a long and complicated relationship. It is definitely a trusted and valuable tool, but I am also quite confident that certain parts of it will make an appearance in my particular circle of hell. Due to this somewhat uncomfortable partnership, I am always interested in new tools for applying load to an infrastructure and monitoring the results. Locust.io is not exactly a new tool, but I have only recently begun to use it for testing.

What Is Locust?

Locust is a load-testing framework which allows you to write your load plan in regular Python. This is a welcome experience if you have ever had to manually edit a JMeter JMX file. Not only is it a more pleasant experience, but writing executors in Python makes it easy to create a very flexible load plan.

     Idea For A Circle Of Hell: Given a slightly corrupted JMX file that must be loadable and cannot easily be replaced, attempt to look through it to find the error preventing loading. Every time you save the file, some other tag corrupts slightly. Who needs eternal damnation, give me a large JMX file and some failing drives…

The other advantage of Locust is that it has quite a nice Flask-based UI (that you can extend fairly easily), and it is easy to distribute load generation among multiple Locust instances or servers.

Simple Load Plan

In the grand tradition of blog entries like this, let's build a completely impractical, simplistic example.

from locust import HttpLocust, TaskSet, task

class test_user(TaskSet):
    @task
    def front_page(self):
        self.client.get("/")
        
    @task
    def about_page(self):
        self.client.get("/about/")

class TestUserRunner(HttpLocust):
    task_set = test_user

The above code imports the required pieces to build a Locust test plan: the TaskSet class, the HttpLocust class, and the task decorator. The class you create by inheriting from TaskSet represents, for the most part, a type of user (anonymous, authenticated, editorial, etc.). In reality it is just a set of individual tasks and supporting methods, but that matches the reality of a user rather well, so in general I define separate task sets for different user types.

The majority of this code is fairly self-explanatory. You make requests via the client.get call, and individual tasks are marked with the @task decorator. These tasks are what the main testing routine executes, and you can weight each task differently if you choose to. For example, in the above code we might want to weight front_page higher than about_page, since the front page will likely see more traffic. You do this by passing the task decorator a weight (where a higher weight means an increased likelihood of running), like so:

@task(10)
def front_page(self):
    self.client.get("/")

Running Our Simple Load Plan

Executing our load plan is not difficult. We save the code to plan.py (or any other name that is NOT locust.py) and run:

locust -f plan.py --host=<our test target>

We then open a browser and go to localhost:8089. You will be prompted for the number of users to spawn and how many users to spawn per second. Once you fill this out you can start the test. You will see something like the following:

Locust dashboard

This screen will allow you to monitor your load test, download csv files containing results, stop the load test, and see failures. A quick note on ramp-up: You may notice that you get results up until the point where all of your requested users are launched, then the results are cleared. This is so your final results only include numbers from when all users were launched, but it can take you by surprise if you are not expecting it.

Fetching Static Assets

While the above example can test Drupal's ability to build and deliver a page, it doesn't do much to actually test the site or infrastructure. This is partly because we aren't fetching any static assets. This is where things get a bit interesting. In JMeter, you would check a box at this point. Locust on the other hand trusts you to handle this yourself. It has a very high opinion of you.

Fortunately, it isn’t that hard. There are a few different tools you can use to parse a returned page and pull out static resources. I am going to use BeautifulSoup because I find the name amusing.

     NOTE: It is decisions like this that make me think I need to start planning for which circle of hell I end up in.

For my load tests I wrote a helper function called “fetch_static_assets”. The function is below:

from bs4 import BeautifulSoup

def fetch_static_assets(session, response):
    resource_urls = set()
    soup = BeautifulSoup(response.text, "html.parser")

    # Collect the src attribute of every tag that has one (img, script, etc.).
    for res in soup.find_all(src=True):
        url = res['src']
        if is_static_file(url):
            resource_urls.add(url)
        else:
            print("Skipping: " + url)

The function is_static_file is quite important. BeautifulSoup is going to return all URLs to you; some of these may be broken, some may be off-site, etc. I recommend defining the is_static_file function so that it initially returns False, then looking at which URLs are being skipped and slowly adding patterns that match your static files and/or the URLs you want to fetch as sub-requests. In particular, on a staging site you don’t necessarily want to apply load to everything linked from your page. Here is an example of a very simplistic is_static_file function:

def is_static_file(f):
    if "/files" in f:
        return True
    else:
        return False

The rest of the fetch_static_assets function is below:

    for url in set(resource_urls):
        if "amazonaws.com" in url:
            session.client.get(url, name="S3 Static File")
            print("S3: " + url)
        else:
            session.client.get(url, name="Static File")
            print("Regular: " + url)

You can pass these static files into client.get for monitoring, but I would recommend setting the name to something consistent or else it is going to make your results quite messy. As you can see, I am tagging S3 URLs separately from regular static files in this example. Since you're defining all of this yourself, you have the flexibility to do basically whatever you want when you are parsing the page response and requesting sub-resources.

Below is an example of using this static asset function:

@task(10)
def frontpage(self):
    response = self.client.get("/")
    fetch_static_assets(self, response)

Logging Into Drupal

So, our load test can now fetch static assets. It can even fetch static assets of our choice and tag them as we would like. However, we are basically just testing the Drupal page cache at this point, or perhaps Varnish, NGINX, or even a CDN. Could be useful... probably isn’t, though. To really be useful, we are going to have to log in to the site. Fortunately, this isn’t that difficult with Locust, and we can use BeautifulSoup again. We are going to use the on_start method now. This is a special method of a Locust TaskSet that gets called at the start of the task set. It is not creatively named. Our example on_start is below:

def on_start(l):
    # Fetch the login page and scrape the form_build_id that Drupal's Form API expects.
    response = l.client.get("/user/login", name="Login")
    soup = BeautifulSoup(response.text, "html.parser")
    drupal_form_id = soup.select('input[name="form_build_id"]')[0]["value"]
    # Post the credentials along with the scraped form_build_id.
    r = l.client.post("/user/login", {"name":"nnewton", "pass":"hunter2", "form_id":"user_login_form", "op":"Log+in", "form_build_id":drupal_form_id})

And there it is. Once this TaskSet logs in, Locust will keep the session cookie for the duration of that run. All requests from this TaskSet will be treated as coming from a logged-in user. It is not uncommon for a test plan to have two TaskSets in total, one covering the anonymous use case and one for logged-in users, as sketched below.
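
A minimal sketch of that structure, assuming the same pre-1.0 Locust API used throughout this post; the class names, tasks, and weights here are illustrative rather than taken from a real plan:

from locust import HttpLocust, TaskSet, task

class anonymous_user(TaskSet):
    @task
    def front_page(self):
        self.client.get("/")

class logged_in_user(TaskSet):
    def on_start(self):
        # Reuse the login routine shown above (omitted here for brevity).
        pass

    @task
    def my_account(self):
        self.client.get("/user")

class AnonymousRunner(HttpLocust):
    task_set = anonymous_user
    weight = 3  # roughly three anonymous users for every logged-in user

class LoggedInRunner(HttpLocust):
    task_set = logged_in_user
    weight = 1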

Locust is definitely a bit more difficult to approach than the JMeter GUI, but I have found it to be much easier to deal with when you are attempting to represent a somewhat complicated user pattern. In our next blog on this topic, we will be discussing how to use SaltStack and EC2 to automate Locust testing from multiple endpoints, i.e. spinning up a group of test VMs to run the Locust test plan and report back to a reporting node.

Download the demo load test described in this post.

Take the headache out of Drupal security with Tag1 Quo - Drupal Monitoring Made Easy.

Mar 15 2017
Mar 15

Though it came and went largely unnoticed, February 24th, 2017 marked an important anniversary for tens of thousands of Drupal website owners: the one-year anniversary of Drupal 6's End of Life (EOL), the point at which it was no longer supported by the Drupal community.

It is widely known that major Drupal version upgrades require non-trivial resources. Not only do they require significant planning, technical expertise, and budget, but the path is often determined by funding and availability of maintainers of popular contributed functionality (modules). Add the complexity of upgrading custom development, and the conditions create significant challenges for small to medium websites without large operating budgets. As evidence of this, our research indicates there are at least 150,000 publicly accessible sites still running Drupal 6.

Tag1 Quo is the only Drupal monitoring solution that supports Drupal 6 LTS, Drupal 7, and Drupal 8 under one dashboard.

For most D6 site managers, the most critical (and stressful) impact of EOL is the discontinuation of Drupal 6 security patches by the Drupal security team. When a major version reaches EOL, the Drupal security team ceases to release patches, or serve public Security Advisories for that version. Unless those sites are maintained by a skilled developer with the time to monitor upstream projects, review patches, and backport them by hand, Drupal 6 site managers find themselves in a vulnerable spot: ongoing, publicly announced vulnerabilities may be directly exposed on their site.

To its credit, the Drupal security team developed a plan so as not to abandon D6 site owners to the wilderness. Under the Long Term Support (LTS) initiative, they selected Tag1 and other qualified vendors to provide Drupal 6 patches as a paid service to site owners, under the condition that those patches also be made available to the public.

Tag1 Quo: A Year of LTS

With the EOL deadline rapidly approaching, Tag1, like many Drupal consulting firms, had clients still on Drupal 6. We were happy to sign on as an LTS provider to support our clients formally under the LTS initiative. It didn’t take us long to decide on automating patch delivery and empowering customers with some useful management tools. A few months into EOL, Tag1 Quo was launched with automated detection and notification of outstanding patches, and a unified view of security updates across all of a customer’s Drupal websites.

The vision was simple:

Tag1 Quo Security Patch Notification

  • Provide D6 sites with a dashboard to quickly assess the status of their modules and themes, with automated patches and pre-packaged releases delivered to their inbox, tested by our team of Drupal experts.
  • Make it platform and hosting-agnostic to provide maximum flexibility to the varied workflows and infrastructure of our customers.
  • Make it simple to set up and run from any site, in any environment, returning clear status reports and ongoing updates with the install of one module and a few clicks.
  • Price it aggressively: for less than the cost of 1 hour of senior developer time per month, a D6 customer could lease Quo as their security concierge, monitoring for patches around the clock.

Thanks to the customers of Tag1 Quo and the LTS program, we’ve delivered on that vision. Paying LTS customers have financed 25 Drupal 6 patches, written and contributed back to the community. While Drupal 8 continues to mature and add contributed functionality, the D6 LTS initiative is still going strong, giving site managers breathing room to fundraise, budget, and plan for their next big upgrade.

Enterprise Security with Tag1 Quo

Like many power users of Drupal, at Tag1 we maintain internal Drupal 6, 7, and 8 sites, as well as client sites on all of those versions. As we began designing and building Tag1 Quo, we quickly realized that the tools we wanted and needed for managing updates across sites were tools that would come in handy for other enterprise users:

  • agencies
  • universities
  • large corporations, and
  • infrastructure providers

In January 2017, we launched support for Drupal versions 7 and 8 on Tag1 Quo. With discounted rates for multiple sites, Tag1 Quo customers can now manage multiple sites via a centralized security dashboard with email notifications, across Drupal versions 6 through 8.

Tag1 Quo dashboard

Quo also provides powerful filtering tools across all sites. Filter by site, project, or module version to see all instances of a particular module across all of your sites. At-a-glance status color-coding tells you whether your module has available updates, security-related or otherwise.

Filtering on modules in Tag1 Quo

Click on a module to get a direct link to the latest release and to view project metadata such as package, info, schema, and dependencies.

Tag1 Quo module details

In managing our own sites, we’ve found that combining these tools in one central system has rapidly increased our turnaround on identifying and patching vulnerabilities, while lowering our management overhead. Eating our own dogfood has been satisfying: Tag1 Quo has freed up valuable developer time and budget to focus on feature development. If you are an agency maintaining client sites, or an IT department managing multiple corporate properties, you must have a security updates monitoring strategy and we’re confident that enterprise Tag1 Quo provides a solution.

Making Drupal maintenance easy forever

For years, the community has wrestled with the problem of expensive upgrades referenced at the beginning of this post. How can Drupal continue to be a leader in innovation without becoming cost-prohibitive for non-enterprise users? Last week, Dries published an important blog post, Making Drupal upgrades easy forever, announcing an exciting new approach to Drupal upgrades based on a policy change initiated by Tag1’s Nat Catchpole.

Writes Dries:

we will continue to introduce new features and backwards-compatible changes in Drupal 8 releases. In the process, we sometimes have to deprecate old systems. Instead of removing old systems, we will keep them in place and encourage module maintainers to update to the new systems. This means that modules and custom code will continue to work. The more we innovate, the more deprecated code there will be in Drupal 8...Eventually, we will reach a point where we simply have too much deprecated code in Drupal 8. At that point, we will choose to remove the deprecated systems and release that as Drupal 9. This means that Drupal 9.0 should be almost identical to the last Drupal 8 release, minus the deprecated code.

For site owners and decision makers, this change is potentially earth-shattering. It replaces the monumental major version upgrade with incremental minor-version updates. Drupal 6 sites planning a Drupal 7 upgrade might want to revisit that plan. Drupal 7 sites waiting to upgrade directly to Drupal 9 may also want to reconsider. Site managers will need to invest more time on planning around minor releases: contributed code they rely on will be ported more frequently (though less dramatically). These changes are good for the Drupal ecosystem but issues of backward compatibility, legacy APIs, and deprecated code will likely require additional diligence.

We’ve built Tag1 Quo with an eye to this new future, with current and upcoming features to help site owners manage this complexity. If you are still on Drupal 6, Tag1 Quo has your back. If you are still on Drupal 7 when it goes EOL, Tag1 Quo will be there. And if you are somewhere in between D8.7 and D8.11, Tag1 Quo will be there for you, too.

In March 2017, get $50 credit towards your subscription with

Oct 25 2016
sam
Oct 25

When we left off last time, we’d assembled a definition of what versions are. Now, we’re going to dive into how we use them in Tag1 Quo: comparing them to one another!

The general goal is straightforward enough: we want to know if, say, 6.x-1.0 is less than 6.x-1.1. (Yup!) Or if 6.x-1.0-alpha1 is less than 6.x-1.0. (Also yup!) Let’s rewrite these two examples as tuple comparisons:

{6,1,0,4,0,0} < {6,1,1,4,0,0} = TRUE
{6,1,0,1,1,0} < {6,1,0,4,0,0} = TRUE

To determine if one tuple is less than the other, we proceed pairwise through the tuple’s values, comparing the integers at the same position from each, until we find different values. Whichever tuple’s value at that position is less is considered to be the lesser version. (Uniformity in this comparison operation is why the mapping for prerelease types assigns unstable to 0, rather than 4.)
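
As a quick aside, Python’s built-in tuple comparison already behaves exactly this way (pairwise, left to right), so a sketch of the rule is almost trivially short; the tuples below are the same examples as above:

# Python tuples compare pairwise from left to right, matching the rule above.
assert (6, 1, 0, 4, 0, 0) < (6, 1, 1, 4, 0, 0)  # 6.x-1.0 < 6.x-1.1
assert (6, 1, 0, 1, 1, 0) < (6, 1, 0, 4, 0, 0)  # 6.x-1.0-alpha1 < 6.x-1.0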

However, this simple comparison operation doesn’t actually meet Quo’s requirements. Remember, Quo’s crucial question is not whether there are any newer versions, but whether there are newer security releases that are likely to apply to the version we’re investigating.

So, say we’re looking at 6.x-1.1 for a given extension, and there exists a 7.x-2.2 that’s a security release. While the former is obviously less than the latter:

{6,1,1,4,0,0} < {7,2,2,4,0,0} = TRUE

We don’t care, because these releases are on totally different lines of development.

...right? I mean, it’s probably true that whatever security hole existed in 7.x-2.1 doesn’t exist in 6.x-1.1. Maybe? Sort of. Certainly, you can't upgrade to 7.x-2.2 directly from 6.x-1.1, as that means changing Drupal core versions. But Quo came to be as part of the D6LTS promise - that IF there are security holes in later versions, we'll backport them to 6.x - so it's certainly possible that the problem might still exist. It all depends on what you take these version numbers to mean.

Yeah, we need to take a detour.

Versions are meaningless

As you become accustomed to a version numbering scheme - Drupal, semver, rpm, whatever - the meanings of the version components gradually work their way to the back of your mind. You don’t really “read” versions, so much as “scan and decode” them, according to these osmosed semantics. This peculiar infection of our subconscious makes it far too easy to forget a simple fact:

Version numbers have absolutely no intrinsic meaning. They have no necessary relationship to the code they describe.

Maybe this is obvious. Maybe it isn’t. If not, consider: what would prevent you from writing a module for Drupal 7 APIs, but then tagging and releasing it as 8.x-1.0? Or, for that matter, writing a module with no functions that prints “spork” whenever its .module file is included? (Answer: nothing.) Also, Donald Knuth uses a π-based numbering system for TeX’s versions, adding one more digit with each successive release. The version looks like a number, but the only property that matters is its length. Versions are weird.

This nebulous relationship is both the blessing and curse of versions. The curse is obvious: we can’t actually know anything with certainty about code just by looking at, or comparing, version numbers. But the blessing is more subtle: a well-designed version numbering system provides a framework for consistently encoding all of our intended semantics, together. Both of those words have specific meaning here:

  • “Together,” as in, it combines all the different aspects of changes to code that are important for Quo’s purposes: independent lines of development, Drupal core version compatibility, D6LTS’ own patch addenda, etc.

  • “Consistent,” as in, a numerical coördinate system - rather than an ad-hoc collection of flags, strings, and numbers - is a formal mathematical system without weird, nasty combinations of states.

The blessing outweighs the curse because, even if versions may lie to us about what the code actually is, they provide a formal structure in which it’s easy to understand what it should be. And, in the wild west of organic open source software growth, knowing with certainty about what things should be is a pretty good goal. It makes tasks concrete enough that you can actually build a business and product - like Tag1 Quo! Which takes us back to the main road after our detour - what’s the answer to this question?

{6,1,1,4,0,0} < {7,2,2,4,0,0}

The strictly mathematical answer is “yes.” But for the question we’re actually interested in, we generally assume that security releases are only applicable when they’re on both the same core version and the same line of development (major version). So, we say “no” here. And we’d also say “no” if the core version were the same but the major version differed:

{6,1,1,4,0,0} < {6,2,2,4,0,0}

This one is a little iffier, though. While porting from one Drupal core version to the next almost always involves a significant rewrite, that’s not necessarily the case for major versions. The security release may actually apply. It’s the kind of thing we need to investigate when deciding whether or not to release our D6LTS patch versions.

Today, Quo assumes security releases on different lines of development aren’t applicable to one another, but what’s important is that we know that’s an assumption. By representing the versions in a fully abstracted coördinate system as we have, rather than (for example) formally encoding assumptions about e.g. “lines of development” into the data itself, we allow ourselves the flexibility of changing those assumptions if they turn out to be wrong. Being that Quo’s entire business proposition turns on answering questions about versions correctly, it pays to build on a clear, flexible foundation.
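
To make that rule concrete, here is a minimal sketch (not Quo’s actual code; the function name and tuple layout are illustrative) of the applicability check under the assumption described above:

def newer_security_release_applies(installed, candidate):
    # Both arguments are 6-tuples: (core, major, minor, prerelease_type, prerelease_num, lts_patch).
    # Under today's assumption, a newer security release only matters on the
    # same core version and the same major version (the same line of development).
    same_line = installed[0] == candidate[0] and installed[1] == candidate[1]
    return same_line and installed < candidate

assert newer_security_release_applies((6, 1, 1, 4, 0, 0), (6, 1, 2, 4, 0, 0))      # same line: applies
assert not newer_security_release_applies((6, 1, 1, 4, 0, 0), (7, 2, 2, 4, 0, 0))  # different core: skipped
assert not newer_security_release_applies((6, 1, 1, 4, 0, 0), (6, 2, 2, 4, 0, 0))  # different major: skipped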

This post rounded out the broader theory and big-picture considerations for versions in Quo. In the next post, I’ll get more into the nitty gritty - how Quo implements these ideas in a way that is practical, fast, and scalable.

 

Drupal 6 reached end of life, but Quo has security patches your site needs to remain in the game.

Oct 20 2016
sam
Oct 20

When Tag1 decided to build Tag1 Quo, we knew there was one question we’d have to answer over, and over, and over again: is there a security update available for this extension? Answering that question - at scale, for many websites, across many extensions, through all the possible versions they might have - is the heart of what Quo does.

The problem seems simple enough, but doing it at such scale, for “all” versions, and getting it right, has some deceptive difficulties. Given a site with an extension at a particular version, we need to know where it sits on the continuum of all versions that exist for that extension (we often refer to that as the “version universe”), and whether any of the newer versions contain security fixes.

There are a few different approaches we could’ve taken to this problem. The one we ultimately settled on was a bit more abstracted than what might initially seem necessary. It was also not a “typical” Drupal solution. In this blog series, I’ll cover both the theoretical foundation of the problem, and the approach we took to implementation.

What’s a version?

Let's start at the beginning.

Quo works by having existing Drupal 6 (for now!) sites install an agent module, which periodically sends a JSON message back to the Quo servers indicating what extensions are present on that site, and at what versions. Those “versions” are derived from .info files, using functions fashioned after the ones used in Drupal core. Once that JSON arrives at Quo’s servers, we have to decide how to interpret the version information for each extension.
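
To make the shape of that message concrete, here is a purely hypothetical sketch of the kind of payload the agent might report; the field names and values are invented for illustration and are not Quo’s actual wire format:

import json

# Hypothetical agent payload: every field name and value here is illustrative only.
payload = {
    "site": "https://example.org",
    "core": "6.43",
    "extensions": [
        {"name": "views", "type": "module", "version": "6.x-2.16"},
        {"name": "cck", "type": "module", "version": "6.x-2.9"},
    ],
}

print(json.dumps(payload, indent=2))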

All versions arrive as strings, and the first decision Quo has to make is whether that string is “valid” or not. Validation, for the most part, means applying the same rules as applied by the Drupal.org release system. Let’s demonstrate that through some examples:

6.x-1.0

Drupal extension versions are a bit different than most software versions in the wild, because the first component, 6.x, explicitly carries information about compatibility, not with itself, but with its ecosystem: it tells us the version of Drupal core that the extension is supposedly compatible with.

The next part, 1.0, holds two bits of information: the major (1) and minor (0) versions. It’s wholly up to the author to decide what numbers to use there. The only additional meaning layered on is that releases with a 0 major version, like 6.x-0.1, are considered to be prerelease, and thus don’t receive coverage from the Drupal.org security team. Of course, Quo’s raison d’etre is that Drupal 6 is no longer supported by the security team, so that distinction doesn’t really matter anymore.

Now, 6.x-1.0 is an easy example, but it doesn’t cover the range of what’s possible. For example:

6.x-1.0-alpha1

This adds the prerelease field - alpha1. This field is optional, but if it does appear, it indicates the version to be some form of prerelease - and thus, as above, not covered by the security team. Drupal.org’s release system also places strong restrictions on the words that can appear there: alpha, beta, rc, and unstable. Additionally, the release system requires that the word must be accompanied by a number - 1, in this case.

There are a couple of notably different forms that valid Drupal versions can come in. There can be dev releases:

6.x-1.x-dev

Dev releases can only have the compatibility version and a major version, and cannot have a prerelease field. They’re supposed to represent a “line” of development, where that line is then dotted by any individual releases with the same major version.

And of course, Drupal core itself has versions:

6.43

Core versions have the same basic structure as the major version/minor version structure in an extension. Here, the 43 is a minor version - a dot along the development line of the 6 core version.

These examples illustrate what’s allowed by the Drupal.org release system, and thus, the shape that all individual extensions’ version universe will have to take. All together, we can say there are five discrete components of a version:

  • Core version
  • Major version
  • Minor version
  • Prerelease type
  • Prerelease number

Viewed in this way, it’s a small step to abstracting the notion of version away from a big stringy blob, and towards those discrete components. Specifically, we want to translate these versions into a 5-dimensional coördinate system, or 5-tuple: {core, major, minor, prerelease_type, prerelease_num}, where each of these dimensions has an integer value. Four of the components are already numbers, so that’s easy, but prerelease type is a string. However, because there’s a finite set of values that can appear for prerelease type, it’s easy to map those strings to integers:

  • Unstable = 0
  • Alpha = 1
  • Beta = 2
  • Rc = 3
  • (N/A - not a prerelease) = 4

With this mapping for prerelease types, we can now represent 6.x-1.0-alpha1 as {6,1,0,1,1}, or 6.x-2.3 as {6,2,3,4,0}.
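
As a rough illustration, a conversion along these lines might look like the following; this is a sketch rather than Quo’s implementation, and the function name and regular expression are purely illustrative (dev releases and core-only versions are not handled here):

import re

# 4 = "not a prerelease", as in the mapping above.
PRERELEASE_MAP = {"unstable": 0, "alpha": 1, "beta": 2, "rc": 3}

def to_tuple(version):
    # Convert e.g. "6.x-1.0-alpha1" to (6, 1, 0, 1, 1) and "6.x-2.3" to (6, 2, 3, 4, 0).
    m = re.match(r"^(\d+)\.x-(\d+)\.(\d+)(?:-(unstable|alpha|beta|rc)(\d+))?$", version)
    if not m:
        raise ValueError("not an individual extension release: " + version)
    core, major, minor, pre_type, pre_num = m.groups()
    if pre_type is None:
        return (int(core), int(major), int(minor), 4, 0)
    return (int(core), int(major), int(minor), PRERELEASE_MAP[pre_type], int(pre_num))

assert to_tuple("6.x-1.0-alpha1") == (6, 1, 0, 1, 1)
assert to_tuple("6.x-2.3") == (6, 2, 3, 4, 0)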

However, that’s not quite the end of the story. The Quo service is all about delivering on the Drupal 6 Long Term Support (D6LTS) promise: providing and backporting security fixes for Drupal 6 extensions, now that they’re no longer supported by the security team.

Because such fixes are no longer official, we can’t necessarily expect there to be proper releases for them. At the same time, it’s still possible for maintainers to release new versions of 6.x modules, so we can’t just reuse the existing numbering scheme - the maintainer might later release a conflicting version.

For example, if D6LTS providers need to patch version 6.x-1.2 of some module, then we can’t release the patched version as 6.x-1.3, because we don’t have the authority to [get maintainers to] roll official releases (we’re not the security team, even though several members work for Tag1), and the maintainer could release 6.x-1.3 at any time, with or without our fix. Instead, we have to come up with some new notation that works alongside the existing version notation, without interfering with it.

Converting to the coördinate system gives us a nice tip in the right direction, though - we need a sixth dimension to represent the LTS patch version. And “dimension” isn’t metaphorical: for any given version, say {6, 1, 0, 1, 1} (that is, 6.x-1.0-alpha1), we may need to create an LTS patch to it, making it {6, 1, 0, 1, 1, 1}. And then later, maybe we have to create yet another: {6, 1, 0, 1, 1, 2}.

Now, we also have to extend the string syntax to support this sixth dimension - remember, strings are how the agent reports a site’s extension versions! It’s easy enough to say “let’s just add a bit to the end,” like we did with the coördinates, but we’re trying to design a reliable system here - we have to understand the implications of such changes.

Fortunately, this turns out to be quite easy: {6, 1, 0, 1, 1, 1} becomes 6.x-1.0-alpha1-p1; {6, 2, 3, 4, 0, 1} becomes 6.x-2.3-p1. This works well specifically because the strings in the prerelease type field are constrained to unstable, alpha, beta, and rc - unlike in semver, for example:

     A pre-release version MAY be denoted by appending a hyphen and a series
     of dot separated identifiers immediately following the patch version.
     Identifiers MUST comprise only ASCII alphanumerics and hyphen
     [0-9A-Za-z-].

     ...identifiers consisting of only digits are compared numerically and
     identifiers with letters or hyphens are compared lexically in ASCII
     sort order.

In semver, prerelease information can be any alphanumeric string, can be repeated, and is compared lexicographically (that is, alpha < beta < rc < unstable). If Drupal versions were unbounded in this way, then a -p1 suffix would be indistinguishable from prerelease information, creating ambiguity and making conflicts possible. But they’re not! So, this suffix works just fine.
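
A small sketch of that suffix handling, again purely illustrative rather than Quo’s actual code, shows how the sixth coordinate can be split off and re-attached without touching the rest of the version string:

import re

def split_lts_suffix(version):
    # "6.x-1.0-alpha1-p2" -> ("6.x-1.0-alpha1", 2); "6.x-2.3" -> ("6.x-2.3", 0).
    m = re.match(r"^(.*?)-p(\d+)$", version)
    if m:
        return m.group(1), int(m.group(2))
    return version, 0

def with_lts_suffix(version, patch):
    # Append "-pN" only when an LTS patch level is present.
    return version + "-p" + str(patch) if patch else version

assert split_lts_suffix("6.x-1.0-alpha1-p1") == ("6.x-1.0-alpha1", 1)
assert with_lts_suffix("6.x-2.3", 1) == "6.x-2.3-p1"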

Now, a coordinate system is fine and dandy, but at the end of the day, it’s just an abstracted system for representing the information in an individual version. That’s important, but the next step is figuring out where a particular version sits in the universe of versions for a given extension. Specifically, the question Quo needs to ask is if there’s a “newer” version of the component available (and if so, whether that version includes security fixes). Basically, we need to know if one version is “less” than another.

And that’s where we’ll pick up in the next post!

Oct 20 2016
sam
Oct 20

When Tag1 decided to build Tag1 Quo, we knew there was one question we’d have to answer over, and over, and over again: is there a security update available for this extension? Answering that question - at scale, for many websites, across many extensions, through all the possible versions they might have - is the heart of what Quo does.

The problem seems simple enough, but doing it at such scale, for “all” versions, and getting it right, has some deceptive difficulties. Given a site with an extension at a particular version, we need to know where it sits on the continuum of all versions that exist for that extension (we often refer to that as the “version universe,”), and whether any of the newer versions contain security fixes.

There are a few different approaches we could’ve taken to this problem. The one we ultimately settled on was a bit more abstracted than what might initially seem necessary. It was also not a “typical” Drupal solution. In this blog series, I’ll cover both the theoretical foundation of the problem, and the approach we took to implementation.

What’s a version?

Let's start at the beginning.

Quo works by having existing Drupal 6 (for now!) sites install an agent module, which periodically sends a JSON message back to the Quo servers indicating what extensions are present on that site, and at what versions. Those “versions” are derived from .info files, using functions fashioned after the ones used in Drupal core. Once that JSON arrives at Quo’s servers, we have to decide how to interpret the version information for each extension.

All versions arrive as strings, and the first decision Quo has to make is whether that string is “valid” or not. Validation, for the most part, means applying the same rules as applied by the Drupal.org release system. Let’s demonstrate that through some examples:

6.x-1.0

Drupal extension versions are a bit different than most software versions in the wild, because the first component, 6.x, explicitly carries information about compatibility, not with itself, but with its ecosystem: it tells us the version of Drupal core that the extension is supposedly compatible with.

The next part, 1.0, holds two bits of information: the major (1) and minor (0) versions. It’s wholly up to the author to decide what numbers to use there. The only additional meaning layered on is that releases with a 0 major version, like 6.x-0.1, are considered to be prerelease, and thus don’t receive coverage from the Drupal.org security team. Of course, Quo’s raison d’etre is that Drupal 6 is no longer supported by the security team, so that distinction doesn’t really matter anymore.

Now, 6.x-1.0 is an easy example, but it doesn’t cover the range of what’s possible. For example:

6.x-1.0-alpha1

This adds the prerelease field - alpha1. This field is optional, but if it does appear, it indicates the version to be some form of prerelease - and thus, as above, not covered by the security team. Drupal.org’s release system also places strong restrictions on the words that can appear there: alpha, beta, rc, and unstable. Additionally, the release system requires that the word must be accompanied by a number - 1, in this case.

There are a couple of notably different forms that valid Drupal versions can come in. There can be dev releases:

6.x-1.x-dev

Dev releases can only have the compatibility version and a major version, and cannot have a prerelease field. They’re supposed to represent a “line” of development, where that line is then dotted by any individual releases with the same major version.

And of course, Drupal core itself has versions:

6.43

Core versions have the same basic structure as the major version/minor version structure in an extension. Here, the 43 is a minor version - a dot along the development line of the 6 core version.

These examples illustrate what’s allowed by the Drupal.org release system, and thus the shape that each extension’s version universe will have to take. Altogether, we can say there are five discrete components of a version:

  • Core version

  • Major version

  • Minor version

  • Prerelease type

  • Prerelease number

Viewed in this way, it’s a small step to abstracting the notion of version away from a big stringy blob, and towards those discrete components. Specifically, we want to translate these versions into a 5-dimensional coördinate system, or 5-tuple: {core, major, minor, prerelease_type, prerelease_num}, where each of these dimensions has an integer value. Four of the components are already numbers, so that’s easy, but prerelease type is a string. However, because there’s a finite set of values that can appear for prerelease type, it’s easy to map those strings to integers:
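
One possible mapping, as a minimal sketch in PHP (not the actual Quo code; the exact integer values are an assumption, chosen only to be consistent with the tuples shown below):

    <?php

    // Minimal sketch: the bounded set of prerelease words maps naturally to
    // integers, ordered unstable < alpha < beta < rc, with "stable" (no
    // prerelease field at all) sorting above them all.
    const PRERELEASE_TYPES = [
      'unstable' => 0,
      'alpha'    => 1,
      'beta'     => 2,
      'rc'       => 3,
      'stable'   => 4,
    ];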

With this mapping for prerelease types, we can now represent 6.x-1.0-alpha1 as {6,1,0,1,1}, or 6.x-2.3 as {6,2,3,4,0}.

However, that’s not quite the end of the story. The Quo service is all about delivering on the Drupal 6 Long Term Support (D6LTS) promise: providing and backporting security fixes for Drupal 6 extensions, now that they’re no longer supported by the security team.

Because such fixes are no longer official, we can’t necessarily expect there to be proper releases for them. At the same time, it’s still possible for maintainers to release new versions of 6.x modules, so we can’t just reuse the existing numbering scheme - the maintainer might later release a conflicting version.

For example, if D6LTS providers need to patch version 6.x-1.2 of some module, then we can’t release the patched version as 6.x-1.3, because we don’t have the authority to [get maintainers to] roll official releases (we’re not the security team, even though several members work for Tag1), and the maintainer could release 6.x-1.3 at any time, with or without our fix. Instead, we have to come up with some new notation that works alongside the existing version notation, without interfering with it.

Converting to the coördinate system gives us a nice tip in the right direction, though - we need a sixth dimension to represent the LTS patch version. And “dimension” isn’t metaphorical: for any given version, say {6, 1, 0, 1, 1} (that is, 6.x-1.0-alpha1), we may need to create an LTS patch to it, making it {6, 1, 0, 1, 1, 1}. And then later, maybe we have to create yet another: {6, 1, 0, 1, 1, 2}.

Now, we also have to extend the string syntax to support this sixth dimension - remember, strings are how the agent reports a site’s extension versions! It’s easy enough to say “let’s just add a bit to the end,” like we did with the coördinates, but we’re trying to design a reliable system here - we have to understand the implications of such changes.

Fortunately, this turns out to be quite easy: {6, 1, 0, 1, 1, 1} becomes 6.x-1.0-alpha1-p1; {6, 2, 3, 4, 0, 1} becomes 6.x-2.3-p1. This works well specifically because the strings in the prerelease type field are constrained to unstable, alpha, beta, and rc - unlike in semver, for example:

     A pre-release version MAY be denoted by appending a hyphen and a series
     of dot separated identifiers immediately following the patch version.
     Identifiers MUST comprise only ASCII alphanumerics and hyphen
     [0-9A-Za-z-].

     ...identifiers consisting of only digits are compared numerically and
     identifiers with letters or hyphens are compared lexically in ASCII
     sort order.

In semver, prerelease information can be any alphanumeric string, can contain multiple dot-separated identifiers, and is compared lexicographically (that is, alpha < beta < rc < unstable). If Drupal versions were unbounded in this way, then a -p1 suffix would be indistinguishable from prerelease information, creating ambiguity and making conflicts possible. But they're not! So, this suffix works just fine.
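
Putting the pieces together, here is a hypothetical sketch in PHP (not the actual Quo code) of parsing an extension version string into the six-dimensional tuple; dev releases and core versions would need their own handling and are omitted:

    <?php

    /**
     * Hypothetical sketch: parse an extension version string such as
     * "6.x-1.0-alpha1" or "6.x-2.3-p1" into the six-dimensional tuple
     * {core, major, minor, prerelease_type, prerelease_num, lts_patch}.
     * Dev releases (6.x-1.x-dev) and core versions (6.43) are omitted here.
     */
    function parse_extension_version($version) {
      // core .x- major . minor [- prerelease_type prerelease_num] [- p lts_patch]
      $pattern = '/^(\d+)\.x-(\d+)\.(\d+)(?:-(unstable|alpha|beta|rc)(\d+))?(?:-p(\d+))?$/';
      if (!preg_match($pattern, $version, $m)) {
        return NULL;
      }
      $types = ['unstable' => 0, 'alpha' => 1, 'beta' => 2, 'rc' => 3];
      return [
        (int) $m[1],                                      // core
        (int) $m[2],                                      // major
        (int) $m[3],                                      // minor
        isset($m[4]) && $m[4] !== '' ? $types[$m[4]] : 4, // prerelease type (4 = stable)
        isset($m[5]) && $m[5] !== '' ? (int) $m[5] : 0,   // prerelease number
        isset($m[6]) && $m[6] !== '' ? (int) $m[6] : 0,   // LTS patch ("-pN")
      ];
    }

    // parse_extension_version('6.x-1.0-alpha1-p1') => [6, 1, 0, 1, 1, 1]
    // parse_extension_version('6.x-2.3-p1')        => [6, 2, 3, 4, 0, 1]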

Now, a coordinate system is fine and dandy, but at the end of the day, it’s just an abstracted system for representing the information in an individual version. That’s important, but the next step is figuring out where a particular version sits in the universe of versions for a given extension. Specifically, the question Quo needs to ask is if there’s a “newer” version of the component available (and if so, whether that version includes security fixes). Basically, we need to know if one version is “less” than another.

And that’s where we’ll pick up in the next post!

 

Tag1 Quo is the only Drupal monitoring solution that supports Drupal 6 LTS, Drupal 7, and Drupal 8 under one dashboard.

Aug 30 2016
Aug 30

Long Term Support for Drupal 6 might be my favorite new feature included in Drupal 8. (I know, that might be stretching things for the fundamentally awesome step forward that Drupal 8 is, but bear with me.)

Long Term Support for Drupal 6

If you're like me, you have loved the power of building websites that help people expose their ideas or services to the world. If you're like me, you've ended up "owning" a number of these websites that you somehow ended up supporting along the way too. And if you're like me, you've ended up with lots of Drupal 6 websites to support, even though D6 hit End-of-Life on February 24th, 2016.

There are a lot of D6 sites out there with no money for an upgrade, but which still have a niche to fill or useful information for the world. Those can be an albatross around our necks and a time sink. We don't have the resources to update them (and neither do their owners), but we can't set the site owners adrift.

When previous versions of Drupal hit end-of-life, it was always a catastrophe for those of us with sites out there. Upgrade or else. Very costly in time and effort or money. But this time, with the release of Drupal 8, the Drupal Security Team and a number of vendors teamed up to actually offer real commercial support for Drupal 6. Yay!

One of those vendors is Tag1 Consulting with our slick new Tag1 Quo service, which monitors both Drupal.org security releases and your D6 site to make sure you know what the security issues are. We even work with the community and the Security Team to backport discovered D7 security fixes. (Full disclosure: I got to work on development of the very nice D8 Quo site.)

The Tag1 Quo Dashboard

I got to beta test Tag1 Quo with several of my old sites this year, and it was surprisingly easy. I just installed the D6 Tag1 Quo module on each D6 site, and immediately got a summary (on the Tag1 Quo site) of the security status of each site. Then, as Drupal 7 security advisories have gone out in recent months, I've gotten an email with a patch or even a tarball for the affected module.

Drupal 6 Long Term Support is a huge step forward for our community, and great kudos are in order for the Security Team for arranging and supporting it and the vendors for providing it.

Tag1 has you covered with our Drupal security monitoring solution, Tag1 Quo.

Aug 22 2016
Aug 22

Tag1 Quo: Drupal 6 Long Term Support

It’s been an exciting summer, building our first product with Drupal 8. When we originally made the decision to offer Long Term Support for Drupal 6, we were thinking about a few of our clients that were a little behind on their upgrade plans, and had envisioned a mostly manual process. However, once we took the plunge and signed up new clients, we had more modules and themes to track than could easily be done manually, and it remained critically important that we never miss an upstream release.

The Tag1 Quo Dashboard, managing multiple websites.

This ultimately led to building Tag1 Quo, a product built on top of Drupal 8 that automatically tracks upstream releases and security advisories and compares them against subscriber code bases to determine which need to be backported. We combined this automation with an administrative dashboard and an email notification system, resulting in a service that quickly delivers all applicable patches to new and existing customers, ensuring everyone stays up to date while also making it easy for us to track ongoing security issues.

Why Drupal 8

The first step was architecting the central service where all this information was going to be collected, stored, parsed and shared. After a little debate, we ultimately decided on Drupal 8 for a variety of reasons: there are huge improvements in hosting web services, we have an unparalleled team of Drupal 8 experts, and we generally wanted more real world Drupal 8 experience.

There have been days (and occasionally weeks) when I’ve regretted the decision. For developers intimately familiar with earlier versions of Drupal, taking the plunge into 8 can feel intimidating and overwhelming. Fortunately, it quickly becomes familiar and you realize it’s still very much Drupal; you’re just working with a more powerful set of the same essential building blocks. Quite frequently the same problems are solved differently, which doesn’t always mean better, but nor does it always mean worse. I found I occasionally had to remind myself to maintain a good attitude, and I ultimately learned a lot in the process and more often than not found myself preferring the Drupal 8 way.

Ultimately, it’s been a fantastic experience. I mean, I wouldn’t have wanted to do it with any other team. There were weeks we spent tracking down and fixing core bugs, many of them non-trivial yet affecting basic, common functionality. We carefully maintain an ever-growing directory of core patches waiting to get committed upstream. We also found that a number of key contrib modules weren’t quite stable, leading us to help fix bugs and add the features we need, always sharing them upstream. Through our development cycle, many of our patches have already been merged, benefiting us and anyone else using Drupal 8.

Once I got used to the Drupal 8 file structure, and began to wrap my head around the object-oriented paradigm, I found a number of wonderful improvements that make developing with Drupal 8 a joy. I’ve personally loved managing the site with the new configuration management system -- all configuration changes are made locally in a private development environment, reviewed, merged, and then flow upstream in a controlled and auditable fashion to the shared development server, the staging server, and finally the production server.

There is a certain irony in building this Drupal 8 website, which is parsing data from the Drupal 7 powered Drupal.org, and ingesting JSON data sent from and ultimately for supporting Drupal 6 websites. Nat Catchpole, one of the Drupal 8 branch maintainers involved in building Tag1 Quo, stated this eloquently in a tweet:

First proper 8.x project involves parsing HTML from 7.x d.o for 6.x LTS support. I might be stuck in a loop.

— catch (@catch56) May 20, 2016

Drupal 6 Long Term Support

Getting Started With Tag1 Quo Getting started with Tag1 Quo

At this point, it’s time to explain just exactly what Drupal 6 Long Term Support is. The idea is simple: Drupal is an open source project and as the project moves forward community volunteers simply can’t support all old versions. Shortly after the release of Drupal 8, Drupal 6 was “end of lifed”, which means Drupal’s volunteer security team is no longer actively reviewing, patching, or maintaining it in any way. That’s where we come in at Tag1: we monitor Drupal 7 and Drupal 8 security releases for core and contrib modules, and backport them to Drupal 6 if they affect any of our clients.

Simple, right? Except not really: it can quickly become complicated to figure out who has installed which version of each module. Is the 6.x-1.x branch affected, or the 6.x-3.x branch? Was the module renamed when it was removed from core in Drupal 7? Or when it was merged into core in Drupal 8? And so on.

So, we automated it. We wrote and maintain a simple tag1quo Drupal module whose configuration is just a single token; once configured, it securely pushes information about your website to our central server. At the same time, we track upstream security releases and security advisories from Drupal.org, both by parsing RSS feeds and by scraping web pages.

And at the heart of all of this, we created a special field type that strictly parses version strings in a way that allows reliable and quick comparisons. This lets us both flag upstream releases that need review and determine when we need to notify users of patches that affect them.
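
For illustration, here is a hypothetical sketch in PHP (not the actual field type code), assuming each version string has already been reduced to an equal-length list of integers -- core, major, minor, prerelease type, prerelease number, LTS patch -- in which case a reliable, quick comparison is just a walk through the components in order:

    <?php

    /**
     * Hypothetical sketch: compare two equal-length integer version tuples.
     * Returns -1 if $a is older than $b, 0 if equal, 1 if newer.
     */
    function version_tuple_compare(array $a, array $b) {
      foreach ($a as $i => $value) {
        if ($value !== $b[$i]) {
          return $value < $b[$i] ? -1 : 1;
        }
      }
      return 0;
    }

    // For example, 6.x-1.0-alpha1 sorts before 6.x-1.0:
    // version_tuple_compare([6, 1, 0, 1, 1, 0], [6, 1, 0, 4, 0, 0]) === -1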

Example Tag1 Quo security advisory, received as email.

As upstream security releases are made, we review them to determine if they also apply to the Drupal 6 version of the code. When they do and we have subscribers using the module, we carefully backport patches, test them, share them with other D6 LTS providers for additional testing, and ultimately manage a coordinated release.

For the end user, all of this effort is hidden. Notification emails show up in your inbox with simple and clear instructions on how to apply an update.

Tag1 Quo: D6 LTS And Then Some

While we automated our D6 LTS support offering, we realized we had something that was useful for far more than just Drupal 6 Long Term Support. While the core functionality remains about keeping your website secure, we also highlight modules needing non-security updates, or those missing version information (such as those installed directly from source). We take the guesswork out of which updates affect you and simplify it with a pretty, graphical dashboard allowing you to quickly monitor all your websites from a single overview page. More complex searching and filtering across sites and projects is also provided.

At this time, we offer three levels of service:

Pro is our recommended option, as it includes our pro-active Drupal 6 Long Term Support. We monitor all your modules and themes for upstream Drupal 7 and Drupal 8 security releases as described earlier. Our expert engineers carefully review each upstream security release to determine whether or not your Drupal 6 code is also vulnerable. If it is, we backport, test, and deliver patches to fix all identified security issues. You’re covered.

Viewing one website's details with Tag1 Quo.

Users desiring more direct support from Tag1 and an adaptable pricing structure for larger numbers of websites will be interested in our Enterprise level offering.

Finally, we also developed an option for those on a tighter budget that can’t afford to subscribe to Drupal 6 Long Term Support but still want to keep as up-to-date and secure as possible. Our Basic offering delivers all patches affecting your website that were paid for by our Pro and Enterprise subscribers. We don’t monitor all of your modules and themes, but your more popular modules can still be kept up to date and secure.

What’s next?

We have big plans for Tag1 Quo. We’ll continue to revisit our roadmap in future blogs, but briefly, this includes adding support for monitoring Drupal 7 and Drupal 8 websites. And when Drupal 9 is released and Drupal 7 is end of lifed, we’ll be there to support it. But before that, Tag1 Quo is still a hugely useful tool for proactively keeping your site up to date. Coming soon is a feature to help with planning upgrades, tracking your modules against what’s been ported to the version of Drupal you’re looking to migrate to. We’re also working to support other open source CMSs, starting with WordPress 4.5 and 4.6.

Tag1 Consulting

While we’re very proud of Tag1 Quo, we remain a consulting company at heart, and we’d love to hear about how we can help you with your project. Whether you’re building your own Drupal 8 website (or product), upgrading your Drupal 6 website, or making improvements to your existing website, we’d love to be involved! Our specialities include Security, Performance and Scalability, and Architecture. We’ve always preferred to do it right the first time, but we can also help get you out of a jam.

It’s exciting to have put so much effort into a product over the summer, and to finally have something we’re proud of that we can share. If you have a Drupal 6 website you should sign up today, affordably keeping your website up to date and secure!

May 21 2015
ChX
May 21

Drupal 7

In Drupal 7, a hook_node_access implementation could return NODE_ACCESS_IGNORE, NODE_ACCESS_ALLOW, or NODE_ACCESS_DENY. If any implementation returned NODE_ACCESS_DENY then access was denied. If none did, but one returned NODE_ACCESS_ALLOW, then access was allowed. If neither of these values was returned by any implementation, then the decision was made based on other rules, but at the end of the day some code needed to grant access explicitly or access was denied. Other entities didn’t have access control.

Blocks also had some sort of access control, implemented in a very strange way: hook_block_list_alter was used -- even by core -- to remove the non-visible blocks.

Drupal 8

Drupal 8 brings a unified entity API -- even blocks become entities. It also uses many of the same objects and concepts for routing access. Instead of constants, we now use objects implementing the AccessResultInterface. You can get an instance by calling the rather self-descriptive AccessResult::allowed(), AccessResult::forbidden(), and AccessResult::neutral() methods. If you are handed an AccessResultInterface object, you can figure out which one it is by calling the isAllowed(), isForbidden(), and isNeutral() methods on it. Only one of them can return TRUE. Access results can be cached and so have relevant cache contexts and tags -- this is why we bother with Neutral. This caching metadata is properly merged when doing various operations on access results.

These objects are returned by hook_entity_access implementations for entity access checks, and by the check method of services tagged with access_check, which are used for routing access.

Let’s first consider a node access case. The entity access checker fires hook_entity_access and hook_ENTITY_TYPE_access (in this case hook_node_access) and tries to put together the results in a sane way. If we do not want to change the behavior from D7, then any D7-deny (now called Forbidden) should result in a Forbidden end result. If there are no Forbiddens, then any Allowed should result in a final Allowed return value. Finally, if there were neither Forbiddens nor Alloweds, then only Neutrals were present (if anything at all), so return a Neutral. This is called the orIf operation: if you have two result objects, run $result1->orIf($result2) to get the end result. The order doesn’t matter; $result2->orIf($result1) is the same.
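
In code, the orIf rules look like this (a short illustration using the Drupal 8 AccessResult API described above):

    <?php

    use Drupal\Core\Access\AccessResult;

    // Illustration of the orIf() rules described above.
    $allowed   = AccessResult::allowed();
    $neutral   = AccessResult::neutral();
    $forbidden = AccessResult::forbidden();

    $allowed->orIf($neutral)->isAllowed();     // TRUE: an Allowed beats Neutral.
    $neutral->orIf($neutral)->isNeutral();     // TRUE: nobody formed an opinion.
    $allowed->orIf($forbidden)->isForbidden(); // TRUE: any Forbidden wins.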

Let’s take a look at the AccessResult::allowedIfHasPermission method. What results do we want? Obviously, if the permission is present we want an Allowed to be returned. But what if the permission is not there? It cannot result in Forbidden, because then any hook_node_access implementation simply returning the result of this method would immediately deny access to the node. It really can only return Neutral: the lack of a permission means this method can’t form an opinion about access. (Using this method is strongly preferred to if ($user->hasPermission()) { return AccessResult::allowed(); } because it adds the right caching metadata.)

Let’s say we want to determine whether a field is editable! This requires the entity to be editable and the field to be editable. For the sake of simplicity, let’s presume both are controlled by a permission, and that the user has the "edit entity" permission but doesn’t have the "edit field" permission. According to the previous paragraph, AccessResult::allowedIfHasPermission($account, 'edit entity') is Allowed while AccessResult::allowedIfHasPermission($account, 'edit field') is Neutral. We cannot return Allowed in this case! So we can’t use orIf -- we need another operation: this is called andIf. Much like orIf, we want any Forbidden input to result in a Forbidden output, and, again like orIf, we want two Alloweds to result in Allowed. The only difference is in the case detailed above: when one is Allowed and the other Neutral, orIf results in Allowed but andIf results in Neutral.
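
Here is the same example in code ('edit entity' and 'edit field' are hypothetical permission names used only for illustration, and $account is assumed to be the account being checked):

    <?php

    use Drupal\Core\Access\AccessResult;

    // Assume $account has the 'edit entity' permission but not 'edit field'.
    $entity_access = AccessResult::allowedIfHasPermission($account, 'edit entity'); // Allowed
    $field_access  = AccessResult::allowedIfHasPermission($account, 'edit field');  // Neutral

    $entity_access->orIf($field_access)->isAllowed();  // TRUE -- too permissive here.
    $entity_access->andIf($field_access)->isNeutral(); // TRUE -- andIf needs both Allowed.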

If you want to use your knowledge and familiarity with AND/OR, then consider first the iron rule of "any Forbidden input results in Forbidden," and only if that rule doesn’t apply, treat Allowed as TRUE and Neutral as FALSE and apply the normal AND/OR to these values. This logic is Kleene's weak three-valued logic, where the "third value" is Forbidden. Most misunderstanding and confusion results from trying to treat Forbidden as FALSE instead of the contagious "third value" it is. The name Neutral might make you think "oh, three values are not a problem, I will just erase any N I see and the resulting two values can be evaluated like normal AND/OR" -- this is absolutely incorrect! In fact, if you have two variables and both are Allowed or Forbidden (with no Neutral), then the result of $x->orIf($y) will be the exact same as $x->andIf($y)! The outcome will only differ if either $x or $y is Neutral (and the other is Allowed).
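
Spelled out, the two operations behave as follows (A = Allowed, N = Neutral, F = Forbidden; the table simply restates the rules above):

    $x    $y    $x->orIf($y)    $x->andIf($y)
    -----------------------------------------
    A     A     A               A
    A     N     A               N
    A     F     F               F
    N     N     N               N
    N     F     F               F
    F     F     F               F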

Routing vs Entity Access

We have clarified the necessity for two operators, and we have two subsystems, each using one of them: routing uses andIf, entity access uses orIf. The difference is subtle -- as we have seen, the only difference is the end result of two access checks where one is Neutral and the other is Allowed. This becomes a Neutral result in routing and an Allowed result in entity access.

Making a Decision

All this three value logic is nice but at the end of the day, this is all internal. The user wants to see a page, can we show it to them or do we show a 403 page? Can we show this entity? The answer cannot be “I do not know”, it must be yes or no. So the system looks at the result and says yes if it is explicitly allowed, otherwise says no. For routing (remember, andIf) this is very simple: if every access checker answered Allowed then and only then we can answer yes. For entity access (remember, orIf) there should be no Forbidden answers and at least one Allowed to say yes.

Apr 22 2015
Apr 22

Nedjo Rogers is a Senior Performance Engineer with Tag1 based out of Victoria, Canada. He’s been an active Drupal contributor since 2003, has served as an advisory board member of the Drupal Association, and has led Drupal development projects for clients including Sony Music, the Smithsonian Institution, the Linux Foundation, and a number of nonprofit organizations. He’s also the co-founder of Chocolate Lily, where he builds web tools for nonprofits, including the Drupal distribution Open Outreach.

Nedjo and I chatted shortly after he returned from a bike trip, touching on a range of topics including his entry into Drupal, working at Tag1, his nonprofit distribution, Drupal 8, corporate influence, CMI, Backdrop, and mining activism.

Dylan: How did you get started in web development and Drupal?

Nedjo: In 1996 I was working at an environmental group focused on the impacts of mining. My first work was mapping--providing communities with information about proposed mining projects in their area and the impacts of existing mines. We also had an international project, working in solidarity with organizations and communities impacted by the activities of Canadian mining companies in Peru and other Latin American countries.

In 1997 I started building an organizational website using Microsoft FrontPage. Then, working with an international network of mining activist organizations, I produced a complex relational database application to track the worldwide activities of mining companies. Our ambitious aim was to enable communities to pool their information, supporting each other. I built the database using the software I had installed on my work computer: Microsoft Access.

In short: I was trying to build liberatory, activist internet resources, but using proprietary tools. I didn't know better. I'd never heard of free software.

By 2003, when I was working at a small technical support NGO, I'd had my introduction to the free software community, thanks initially to Paul Ramsay of PostGIS renown. At the time, Drupal was a small and relatively little known project. The available contributed modules numbered in the low dozens. Compared to other options I evaluated, the core software itself did very little. But what it did do, it did well and cleanly. I set about writing and contributing modules and dabbling in Drupal core.

Dylan: Your Drupal bio says you have a master's degree in geography. When you started working in technology was it an explicit part of your job responsibilities or was it more of something you fell into because someone needed to do it?

Nedjo: Decidedly the latter. In my MA, I took exactly one course in anything related to technology: geographic information systems (GIS). I was teased initially when I started hacking code. "Nerdjo" was my nickname of that time.

I've always had a love-hate relationship with the idea of working in IT. In many ways, I love the work. At its best, it's a constant source of intellectual challenge. But technology per se I don't care about at all. Outside of the Drupal community, few of my friends know in any detailed sense what I do.

Dylan: Yet based on what I’ve read about you, I'd venture a guess that your education, your intellectual interests, your views have had a big influence on the work you’ve done in the Drupal community. Is that true?

Nedjo: Very much so. Software, code, documentation--my sense is that all of it begins with the basic questions we ask ourselves.

Like many others, I came to Drupal with a purpose: to build tools that empower local, change-making organizations and movements. To put emerging communication technologies into the hands of activists and community builders. To do so, we need to work with those communities to shape technology to the needs and abilities of those who will use it.

I believe Drupal retains, at least latently, a lot of the original principles of its genesis as a student project: a core focus on communities and collaboration. So, for me, working to bring a grounded and critical perspective to topics like structure and architecture has been a core part of my service to the project.

Working at Tag1

Dylan: Can you tell me about how you started working with Tag1 and what your experience working here has been like?

Nedjo: Tag1 partner Jeremy Andrews and I were longtime colleagues at CivicSpace, and I worked closely with Tag1 manager Peta Hoyes when we were both at CivicActions, so when Peta reached out to me with the opportunity to work with Tag1 I already knew it was going to be a great place to work.

My first assignment was on archetypes.com, a complex custom build that included some intriguing challenges. The project came to Tag1 through a route that’s now become familiar. We were brought on to work on performance issues on the site, which was built originally by a different team. We recommended architectural improvements that would address performance bottlenecks and then were engaged to guide that process, working with the in house Archetypes dev team.

It was my first chance to work with Tag1 engineer Fabian Franz, and I was immediately struck by his creative genius. I also got to collaborate again with Károly Négyesi (ChX), who years earlier was an important mentor for me, encouraging me to participate in the planning sprint for fields in Drupal core. Károly was also on hand to witness my near arrest by a half dozen of Chicago’s finest on Larry Garfield’s mother’s back porch while attending a Star Trek meetup, but that’s a longer story ;)

Recent highlights at Tag1 include collaborating with staff at the Linux Foundation to upgrade and relaunch their identity management system and working with Mark Carver and others to complete a redesign for Tag1’s own Drupal Watchdog site.

Tag1 is a good fit with my interests and commitments and I love the chance to work as part of a creative team where I’m always learning.

Nonprofit Distribution

Dylan: You’ve developed a Drupal distribution for nonprofits, Open Outreach, and a set of features called Debut. Tell me about those projects.

Nedjo: In 2010 or so, while working at CivicActions, I completed six months of intensive work as tech lead of a project for the Smithsonian, a site on human origins for the US National Museum of Natural History.

We'd repeatedly worked 12- to 16-hour days to complete a grueling series of sprints. The site was a resounding success, but for me it sparked a crisis of faith. If we were here for free software, what were we doing pouring our efforts into what was essentially closed, proprietary development?

More troubling for me personally was the realization that I'd steadily lost touch with the reasons that brought me to Drupal in the first place: the small, activist groups I care about and am part of. If I wasn't going to focus solidly on building technology specifically for them, exactly who was?

I left CivicActions and soon afterwards, my partner, Rosemary Mann, also left her job of ten years as the executive director of a small group working with young, often marginalized parents. With our two kids, we pulled up roots and spent five months in Nicaragua. While there, I volunteered with an activist popular health organization.

When we returned, both at a crossroads, we decided to start up a Drupal distribution for the activist and community groups that we cared about. Groups that, we felt, were increasingly being left behind as Drupal shops pursued high-paying client contracts.

When we started into the work, via a small partnership, Chocolate Lily, several people said they didn’t understand our business model. By focusing on small, low resourced organizations, we'd never generate the lucrative contracts that Drupal shops depend on.

And we said, yeah, that's kind of the point.

Our first Open Outreach site was for the organization that Rosemary worked for for ten years, the Young Parents Support Network.

When building Open Outreach, we put a lot of energy into interoperability. At that time, the Kit specification, aimed at enabling the features of different distributions to work together, was a recent initiative. We very intentionally divided our work into two parts. Everything that was general beyond our specific use case (small NGOs and activist groups) we put into a generic set of features, Debut. The idea was that these features could be used independently of our distribution.

We also put energy into various other efforts that seemed to be oriented towards interoperability. This included the Apps module, which, at least in theory, included interoperability in its mandate. Why put energy in this direction?

Features itself - focused on capturing configuration in a form that can be reused on multiple sites - can be seen as extending the "free software" ethos from generic platform tools (modules) to applied use cases. But so long as features were limited to a particular distribution, they were in essence crippled. They were proprietary, in the sense that they were tied to a given solution set. Only by explicitly tackling the problem of interoperability could we truly bring the powerful tools being developed in Drupal to organizations other than those that could pay tens or hundreds of thousands of dollars on custom development.

The original authors of Features at Development Seed recognized this problem and took steps to address it. But when they largely left Drupal development, their insights and efforts languished.

I won’t make any great claims for what we achieved with Debut. To say that it was not broadly adopted is definitely an understatement ;) But in my more optimistic moments I like to think of it as a step in a broader process that will take time to bear fruit.

As Victor Kane recently put it, "We truly need open source propagation of reusable sets of Drupal configuration, and to reclaim the power of generically abstracting solutions from single site projects with the aim of re-use, with the creation of install profiles and distributions."

Drupal 8

Dylan: You are getting at some issues that I want to circle back to shortly. I’d like to shift to Drupal 8. What do you find most exciting or interesting about Drupal 8?

Nedjo: I've written or drafted several contributed modules now with Drupal 8. Coming from previous Drupal versions, I found it challenging at first to wrap my head around the new approaches. But as I got comfortable with them, I gained a lot of appreciation for the basic architecture and the flexibility and power that it enables.

Only a couple of months in, I already feel like I'm writing my cleanest Drupal code ever, thanks to the solid framework. While the previous procedural code left a lot up to the individual developer in terms of how best to structure a project, Drupal 8's well developed design patterns provide a lot of guidance.

One key feature of Drupal 8's architecture is "dependency injection", which roughly means that the elements (classes) that a particular solution needs are passed in dynamically as customizable “services” rather than being hard-coded. This design choice means a considerable leap in abstraction. At the same time, it opens the way for very specific and efficient customizations and overrides.

A concrete example, returning to the question of interoperability. A key barrier to features being interoperable is conflicting dependencies. This issue arises from the fact that some types of components need to be supplied once and used in many different features. Examples are user roles and "field bases" in Drupal 7. (In Drupal 8, what was a field base is now a "field storage".) In practice, most distributions put these shared components into a common feature that becomes a dependency for most or all other features in the distribution. For example, two different distributions might both include a core feature that provides a “body” field and an “administrator” role and that is a dependency of other features in the respective distributions.

This means you can’t easily mix features from one distribution with those of another, because their dependencies conflict.

In Drupal 7, I wrote the Apps Compatible module to try to address this problem. Apps Compatible makes it possible for multiple features and distros to share a certain limited set of shared configuration without conflicting dependencies. However, due to architectural shortcomings in Drupal 7, it does so via a bunch of awkward workarounds, and only for a very limited set of types of components (entities).

In Drupal 8, I've drafted an equivalent, Configuration Share. Using the dependency injection system, I was able to very lightly override a single method of the 'config.installer' service. In doing so, I achieved the same aim I’d attempted in Apps Compatible--except that this time it was done cleanly, in a few short lines of code, and for all configuration types at once.

Wow.
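
For a flavor of how light such an override can be, here is a hypothetical sketch of a module service provider that swaps the class behind the 'config.installer' service; the namespace and class names are illustrative, not the actual Configuration Share code:

    <?php

    namespace Drupal\config_share;

    use Drupal\Core\DependencyInjection\ContainerBuilder;
    use Drupal\Core\DependencyInjection\ServiceProviderBase;

    /**
     * Illustrative sketch only: point the 'config.installer' service at a
     * subclass that overrides a single method.
     */
    class ConfigShareServiceProvider extends ServiceProviderBase {

      public function alter(ContainerBuilder $container) {
        $container->getDefinition('config.installer')
          ->setClass('Drupal\config_share\ConfigShareInstaller');
      }

    }

The hypothetical ConfigShareInstaller class would then extend core's ConfigInstaller and override just the one method in question, leaving everything else untouched.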

I also wrote the initial architecture of the Drupal 8 Features module, which - thanks to some fresh approaches building on Drupal 8 architectural improvements - promises to be a vast improvement over previous versions. Mike Potter at Phase2 is taking my start and adding on a lot of great ideas and functionality, while respecting and extending the basic architecture.

So, yeah, so far I’ve loved working in Drupal 8. I honour the creative energy that's gone into it. At the same time, I have a critical perspective on some directions taken in the version and the project as a whole. To me, those perspectives are not in any way contradictory. In fact, they stem from the same set of values and analysis.

Corporate Influence

Dylan: That’s a good segue back to an earlier point you made that is worth revisiting. A not-insignificant portion of early adoption and funding of Drupal work was driven by small, scrappy nonprofits, working with dev shops with social change in their missions. Many of those early adopting orgs were attracted to Drupal not only for the economic accessibility of an open source CMS but for the philosophical values of community and collaboration and mutual empowerment through shared interest. You’ve been vocal about your concerns regarding enterprise influence. Assuming that Drupal does not want to lose these users, what can or should be done?

Nedjo: You're correct to highlight the important and ongoing contributions of contributors and groups motivated by the ideals and principles of free software and more broadly of progressive social activism. We can think here of CivicSpace, a social change project that was a key incubator for a lot of what shaped Drupal core.

Back in 2009, working at CivicActions, I had the rare opportunity of doing six months of solid work on Drupal core and contrib for a corporate sponsor, Sony Music.

As technical lead of a fantastic team including Nathaniel Catchpole and Stella Powers, and in close collaboration with great folks at Sony Music and Lullabot, I got to dig into practically every area of Drupal code related to language support and internationalization. At the end of the project, on my own time, I wrote a detailed case study crediting Sony Music with their amazing contribution.

So, when it comes to corporate sponsorship, you could say I'm an early adopter and promoter.

Just not an uncritical one.

In a thoughtful 2008 post, Acquia co-founder Jay Batson wondered whether the balance between volunteer and commercially sponsored development “will change for Drupal as Acquia appears.” His starting point was a blog post on “The effects of commercialization on open source communities”.

Jay bet that sponsored development would not become “the most”, while, commenting on Jay’s post, Drupal founder and Acquia co-founder Dries Buytaert was “100% confident that the Drupal community will continue to do the bulk of the work.”

Dries noted though that “What is important to company A, is not necessarily important to company B. … At Acquia, we have our own itch to scratch, and we'll contribute a lot of code to scratch it properly ;-)”

I recently read a relevant article from 2011, “Control in Open Source Software Development”, in an industry journal. The authors outlined a series of strategies that companies could follow to ensure they achieved significant control over open source projects and thus "help companies achieve their for-profit objectives using open source software."

Looking at the strategies examined, we can see many of them at work in the Drupal project, including sponsoring the work of lead developers.

The question is not whether the “itch scratching” Dries described affects the functionality and focus of the project. Of course it does. That’s the whole point.

It’s how.

Besides sponsorship, there are also indirect but potentially equally powerful influences. Through our work, we're exposed to certain types of problems. If, as our skill level and reputation rise, we end up working on larger budget projects for large corporations, governments, and enterprise-level NGOs, we develop expertise and solutions that address those clients' priorities. This focus can carry over in both explicit and more subtle ways to our work in other areas, even if we're doing that work as volunteers.

I see this influence on a daily basis in my own work. I couldn’t count the number of customizations I’ve done for paid client projects and then carried forward as a volunteer, putting in the unpaid time to turn them into a contributed patch or module.

So, seven years on, it’s worth returning to the questions Jay raised. Have we as contributors in effect been relegated to a role of outsourced and unpaid bugfixers, or do we retain a central role in shaping our collective project? What is our project? What do we want to achieve with it--and ultimately, who is it for?

CMI

Dylan: You’ve raised some concerns about the direction that the Configuration Management Initiative (CMI) in D8 is going...

Nedjo: Yes, I think we can see a lot of these questions - of influence, target users, and enterprise interests - at play in Drupal 8 configuration management approaches.

When he introduced Drupal 8 initiatives in his 2011 keynote address at DrupalCon Chicago, Dries explained how he’d consulted twenty of the largest companies and organizations using Drupal about their pain points, and configuration management was one of two things that stood out. So from the start, the focus of the CMI was, as Dries put it, to fix “the pains of large organizations adopting Drupal”.

The CMI began with a particular limited use case: the staging or deployment problem. Initiative lead Greg Dunlap was explicit about this focus. Greg hoped that further use cases - like those of distributions - could and would be added as we moved forward.

Recently, I began tackling in a serious way the possibilities and limitations of Drupal 8 configuration management when it comes to distributions. My starting place is a very practical one: I co-maintain a distribution, Open Outreach, that's aimed at small, low-resourced organizations. I need to know: will Drupal 8 be a sound basis for a new version of our distribution?

As I dug into the code, two things happened. First, as I mentioned before, I was impressed with the architecture of Drupal 8. Second, I was concerned about implications for small sites generally and for distributions in particular.

Briefly (and please bear with me, this gets a bit technical): In Drupal 7, the system of record for exportable configuration like views or image styles provided by a module is, by default, the module itself. If a site builder has explicitly overridden an item, the item’s system of record becomes the site (in practice, the site database).

An item that has not been overridden is on an update path and automatically receives improvements as they are added to the original module. An overridden item is in effect forked--it no longer receives updates and is maintained on a custom basis. This model prioritizes the free software sharing of updatable configuration items among multiple sites, while enabling customization via explicit overriding (forking).

In Drupal 8 as currently built, the system of record for all configuration is the site. There is no support for receiving configuration updates from modules. In effect, the free software use case of sharing configuration among many sites has, for the moment, lost out to the use case that's built into Drupal core: staging configuration between different versions of the same site.

While staging can, in theory, benefit anyone, in practice, there are pretty steep technical hurdles. To fully use the staging system as currently built into Drupal 8 core, you need to have access to a server that supports multiple databases and multiple install directories; know how to set up and manage multiple versions of a single site; move configuration and modules and files between different versions of the site; and make sense of a UI that presents machine names of configuration items and, even more dauntingly, line-by-line diffs of raw data representations of arbitrary types of configuration.

In other words, the configuration management system - which includes a lot of great ideas and improvements - serves a staging use case that has a lot of built-in assumptions about the technical capacity of the target users, and so far lacks support for the more broadly accessible use case of receiving updates to configuration supplied by extensions (modules and themes).

It looks to me like a case where focusing too closely on a particular use case (staging) has raised some unanticipated issues for other use cases, including distributions.

None of this is in any way an irreversible problem. The use case problem is one that Greg Dunlap pretty much foresaw when he commented on his initial limited focus. He noted it was only a starting place, with lots of followup to do. The contributed Configuration Update Manager module provides a good start. I've sketched in potential next steps in Configuration Synchronizer.

But my view is all of this belongs in core. Only there can we ensure that the free software use case of sharing configuration among multiple sites is on at least an equal footing with that of staging config on a single site.

Backdrop

Dylan: Is Backdrop part of the answer?

Nedjo: It's a welcome addition and a healthy sign. For Open Outreach, we're evaluating Backdrop alongside Drupal. We're even considering building versions for both.

Backdrop inherits some of the same issues as Drupal 8 when it comes to configuration management. But there are encouraging signs that the lead Backdrop developers are both aware of these issues and open to fixing them in core.

Mining Activism

Dylan: You mentioned that mining activism was your entry point into web technologies. Are you still involved in activist work?

Nedjo: I was out of the loop for several years in terms of mining activism, but recently I've gotten involved again. Three years ago I co-founded a small organization in my hometown of Victoria, the Mining Justice Action Committee (MJAC). Last month I coordinated a visit by a Guatemalan indigenous leader, Angelica Choc, whose husband was killed in connection with his resistance to a Canadian mining company's operations. Angelica is part of a groundbreaking legal case to hold the company HudBay Minerals accountable for impacts related to its presence in Guatemala. She was a focus of the film Defensora.

At the event, I provided interpretation from Spanish to English as Angelica shared her story with the more than 100 people who filled a church hall. As I spoke her words, I was in tears. From the pain that she and her family and community have suffered, but just as much from her courage, vision, and resilience.

My partner, Rosemary, is also an active member of MJAC, and last year she built its organizational website using our distribution, Open Outreach. In small ways, circles are completed.

Dylan: Last question. How was your bike ride?

Nedjo: My younger child is eighteen now and plans to leave home this summer, so these last months with them at home have a special intensity. Ardeo and I biked along a former rail line dotted with trestles that carried us high above the Cowichan River, swollen with spring runoff. In several places, winter storms had brought down trees that blocked the trail and we had to unhitch our panniers and work together to hoist bikes over the sprawling trunks. Dusk was falling as we reached our campground beside the river.

These times of connection - personal and ecological - are so essential. They remind us that technology, or our work in the world, is at most a detail, a small means to help us realize what really matters. What truly has value.

You seem to have caught me in a philosophical mood. We'd better stop here before I sound more sappy than I already do!

Dylan: Thanks for your time, Nedjo!

Apr 14 2015
Apr 14

I want to share two stories with you.

I started with Drupal in 2005. I started my first Drupal job in 2006 at $40/hr, which was a pay cut. I quickly got a raise to $50/hr. I released Coder module in late 2006 and talked at OSCMS (the predecessor to DrupalCon) in 2007, and I began to be known in the Drupal community. Sometime in 2008 I started working on search. And because of my contributed work and reputation I was offered a two-week job. I was asked how much I wanted. I remember I was making $50/hr. I said I'd like $100/hr. The client was the New York Observer and the person hiring me was Jeff Robbins of Lullabot. He knew my ask was low. He said, how about $110? Wow!

The second story is from DrupalCon San Francisco. In Dries' keynote he said, "if Drupal has changed your life, please stand up." I stood up and so did hundreds of others. I felt chills. Drupal changed my life because of this great career I now have. I work from home. I work with teams of smart, kind, diverse people. I work for interesting and great clients. And I make a nice paycheck.

So here I am to say something about Drupal 8 and make a challenge to you. Through the Drupal Association the community is trying to raise $250,000 to use as grants and bounties to help finish Drupal 8. I've worked many years full time with Drupal 7 and made a nice paycheck during that time. I also contributed heavily to Drupal 6 and Drupal 7.

But here we are in 2015 and I have contributed very little to Drupal 8. I want D8 released for selfish reasons. I want to start using it in my day job and secure the next several years of work.

So I contributed $1000 and have challenged others that are like me, that work in Drupal full time, who make a good paycheck from Drupal, and who have contributed less to Drupal core than I have to contribute one day's pay to #d8accelerate. Will you stand up with me and take my challenge?

Donate here.

Feb 26 2015
Feb 26

This is the third in a series of blog posts about the relationship between Drupal and Backdrop CMS, a recently-released fork of Drupal. The goal of the series is to explain how a module (or theme) developer can take a Drupal project they currently maintain and support it for Backdrop as well, while keeping duplicate work to a minimum.

  • In part 1, I introduced the series and showed how for some modules, the exact same code can be used with both Drupal and Backdrop.
  • In part 2, I showed what to do when you want to port a Drupal module to a separate Backdrop version and get it up and running on GitHub.
  • In part 3 (this post), I'll wrap up the series by explaining how to link the Backdrop module to the Drupal.org version and maintain them simultaneously.

Linking the Backdrop Module to the Drupal.org Version and Maintaining Them Simultaneously

In part 2 I took a small Drupal module that I maintain (User Cancel Password Confirm) and ported it to Backdrop. In the end, I wound up with two codebases for the same module, one on Drupal.org for Drupal 7, and one on GitHub for Backdrop.

However, the two codebases are extremely similar. When I fix a bug or add a feature to the Drupal 7 version, it's very likely that I'll want to make the exact same change (or at least an extremely similar one) to the Backdrop version. Wouldn't it be nice if there were a way to pull in changes automatically without having to do everything twice manually?

If you're a fairly experienced Git user, you might already know that the answer is "yes". But if you're not, the process isn't necessarily straightforward, so I'm going to document it step by step here.

Overall, what we're doing is simply taking advantage of the fact that when we imported the Drupal.org repository into GitHub in part 2, we pulled in the entire history of the repository, including all of the Drupal commits. Because our Backdrop repository knows about these existing commits, it can also figure out what's different and pull in the new ones when we ask it to.

In what follows, I'm assuming a workflow where changes are made to the Drupal.org version of the module and pulled into Backdrop later. However, it should be relatively straightforward to reverse these instructions to do it the other way around (or even possible, but perhaps less straightforward, to have a setup where you can do it in either direction).

  1. To start off, we need to make our local clone of the Backdrop repository know about the Drupal.org repository. (A local clone is obtained simply by getting the "clone URL" from the GitHub project page and copying it locally, for example with the command shown below.)
    git clone git@github.com:backdrop-contrib/user_cancel_password_confirm.git
    

    First let's check what remote repositories it knows about already:

    $ git remote -v
    origin    git@github.com:backdrop-contrib/user_cancel_password_confirm.git (fetch)
    origin    git@github.com:backdrop-contrib/user_cancel_password_confirm.git (push)
    

    No surprise there; it knows about the GitHub version of the repository (the "origin" repository that it was cloned from).

    Let's add the Drupal.org repository to this list and check again:

    $ git remote add drupal http://git.drupal.org/project/user_cancel_password_confirm.git
    $ git remote -v
    drupal    http://git.drupal.org/project/user_cancel_password_confirm.git (fetch)
    drupal    http://git.drupal.org/project/user_cancel_password_confirm.git (push)
    origin    git@github.com:backdrop-contrib/user_cancel_password_confirm.git (fetch)
    origin    git@github.com:backdrop-contrib/user_cancel_password_confirm.git (push)
    

    The URL I used here is the same one I used in part 2 to import the repository to GitHub (that is, it's the public-facing Git URL of my project on Drupal.org, available from the "Version control" tab of the drupal.org project page, after unchecking the "Maintainer" checkbox - if it’s present - so that the public URL is displayed). I've also chosen to give this repository the name "drupal". (Usually the convention is to use "upstream" for something like this, but in GitHub-land "upstream" is often used in a slightly different context involving development forks of one GitHub repository to another. So for clarity, I'm using "drupal" here. You can use anything you want to.)

  2. Next let's pull in everything from the remote Drupal repository to our local machine:
    $ git fetch drupal
    remote: Counting objects: 4, done.
    remote: Compressing objects: 100% (2/2), done.
    remote: Total 3 (delta 0), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From http://git.drupal.org/project/user_cancel_password_confirm
    * [new branch]          7.x-1.x -> drupal/7.x-1.x
    * [new branch]          master  -> drupal/master
    * [new tag]             7.x-1.0-rc1 -> 7.x-1.0-rc1
    

    You can see it has all the branches and tags that were discussed in part 2 of this series. However, although I pulled the changes in, they are completely separate from my Backdrop code (the Backdrop code lives in "origin" and the Drupal code lives in "drupal").

    If you want to see a record of all changes that were made to port the module to Backdrop at this point, you could run git diff drupal/7.x-1.x..origin/1.x-1.x to examine them.

  3. Now let's fix a bug on the Drupal.org version of the module. I decided to do a simple documentation fix: Fix documentation of form API functions to match coding standards

    I made the code changes on my local checkout of the Drupal version of the module (which I keep in a separate location on my local machine, specifically inside the sites/all/modules directory of a copy of Drupal so I can test any changes there), then committed and pushed them to Drupal.org as normal.

  4. Back in my Backdrop environment, I can pull those changes in to the "drupal" remote and examine them using git log:
    $ git fetch drupal
    remote: Counting objects: 5, done.
    remote: Compressing objects: 100% (3/3), done.
    remote: Total 3 (delta 2), reused 0 (delta 0)
    Unpacking objects: 100% (3/3), done.
    From http://git.drupal.org/project/user_cancel_password_confirm
      7a70138..997d82d  7.x-1.x     -> drupal/7.x-1.x
    
    $ git log origin/1.x-1.x..drupal/7.x-1.x
    commit 997d82dce1a4269a9cee32d3f6b2ec2b90a80b33
    Author: David Rothstein 
    Date:   Tue Jan 27 13:30:00 2015 -0500
    
            Issue #2415223: Fix documentation of form API functions to match coding standards.
    

    Sure enough, this is telling me that there is one commit on the Drupal 7.x-1.x version of the module that is not yet on the Backdrop 1.x-1.x version.

  5. Now it's time to merge those changes to Backdrop. We could just merge the changes directly and push them to GitHub and be completely done, but I'll follow best practice here and do it on a dedicated branch with a pull request. (In reality, I might be doing this for a more complicated change than a simple documentation fix, or perhaps with a series of Drupal changes all at once rather than a single one. So I might want to formally review the Drupal changes before accepting them into Backdrop.)

    By convention I'm going to use a branch name ("drupal-2415223") based on the Drupal.org issue number:

    $ git checkout 1.x-1.x
    Switched to branch '1.x-1.x'
    
    $ git checkout -b drupal-2415223
    Switched to a new branch 'drupal-2415223'
    
    $ git push -u origin drupal-2415223
    Total 0 (delta 0), reused 0 (delta 0)
    To git@github.com:backdrop-contrib/user_cancel_password_confirm.git
    * [new branch]          drupal-2415223 -> drupal-2415223
    Branch drupal-2415223 set up to track remote branch drupal-2415223 from origin.
    
    $ git merge drupal/7.x-1.x
    Auto-merging user_cancel_password_confirm.module
    Merge made by the 'recursive' strategy.
    user_cancel_password_confirm.module |   10 ++++++++--
    1 file changed, 8 insertions(+), 2 deletions(-)
    

    In this case, the merge was simple and worked cleanly. Of course, there might be merge conflicts here or other changes that need to be made. You can do those at this time, and then git push to push the changes up to GitHub.

  6. Once the changes are pushed, I went ahead and created a pull request via the GitHub user interface, with a link to the Drupal.org issue for future reference (I could have created a corresponding issue in the project's GitHub issue tracker also, but didn't bother):
    • Fix documentation of form API functions to match coding standards (pull request) (diff)

    Merging this pull request via the GitHub user interface gets it onto the official 1.x-1.x Backdrop branch, and into the Backdrop version of the module.

    Here's the commit for Drupal, and the same one for Backdrop:

    http://cgit.drupalcode.org/user_cancel_password_confirm/commit/?id=997d8...
    https://github.com/backdrop-contrib/user_cancel_password_confirm/commit/...

Using the above technique, it's possible to have one main issue (in this case on Drupal.org) for any change you want to make to the module, do essentially all the work there, and then easily and quickly merge that change into the Backdrop version without the hassle of repeating lots of manual, error-prone steps.

Hopefully this technique will be useful to developers who want to contribute their work to Backdrop while also continuing their contributions to Drupal, and will help the two communities continue to work together. Thanks for reading!

Further Backdrop Resources

Do you have any thoughts or questions, or experiences of your own trying to port a module to Backdrop? Leave them in the comments.

Feb 17 2015

This is the second in a series of blog posts about the relationship between Drupal and Backdrop CMS, a recently-released fork of Drupal. The goal of the series is to explain how a module (or theme) developer can take a Drupal project they currently maintain and support it for Backdrop as well, while keeping duplicate work to a minimum.

  • In part 1, I introduced the series and showed how for some modules, the exact same code can be used with both Drupal and Backdrop.
  • In part 2 (this post), I'll explain what to do when you want to port a Drupal module to a separate Backdrop version and get it up and running on GitHub.
  • In part 3, I'll explain how to link the Backdrop module to the Drupal.org version and maintain them simultaneously.

Porting a Drupal Module to Backdrop and Getting it Up and Running on GitHub

For this post I’ll be looking at User Cancel Password Confirm, a very small Drupal 7 module I wrote for a client a couple years back to allow users who are canceling their accounts to confirm the cancellation by typing in their password rather than having to go to their email and click on a confirmation link there.

We learned in part 1 that adding a backdrop = 1.x line to a module’s .info file is the first (and sometimes only) step required to get it working with Backdrop. In this case, however, adding this line to the .info file was not enough. When I tried to use the module with Backdrop I got a fatal error about a failure to open the required includes/password.inc file. What's happening here is simply that Backdrop (borrowing a change that's also in Drupal 8) reorganized the core directory structure compared to Drupal 7 to put most core files in a directory called "core". When my module tries to load the includes/password.inc file, it needs to load it from core/includes/password.inc in Backdrop instead.

This is a simple enough change that I could just put a conditional statement into the Drupal code so that it loads the correct file in either case. However, over the long run this would get unwieldy. Furthermore, if I had chosen a more complicated module to port, one which used Drupal 7's variable or block systems (superseded by the configuration management and layout systems in Backdrop) it is likely I'd have more significant changes to make.
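For illustration only, the kind of conditional I mean would look roughly like this (a sketch, not what the ported module actually does; it assumes DRUPAL_ROOT is usable on both platforms thanks to Backdrop's Drupal compatibility layer):

// Sketch only: load password.inc from whichever location exists, so the
// same code could run on Drupal 7 or Backdrop.
if (file_exists(DRUPAL_ROOT . '/core/includes/password.inc')) {
  // Backdrop moved most core files under core/.
  require_once DRUPAL_ROOT . '/core/includes/password.inc';
}
else {
  // Drupal 7 layout.
  require_once DRUPAL_ROOT . '/includes/password.inc';
}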

So, this seemed like a good opportunity to go through the official process for porting my module to Backdrop.

Backdrop contrib modules, like Backdrop core, are currently hosted on GitHub. Even if you're already familiar with GitHub from other projects, there are a few Backdrop-specific steps to follow to make sure your module's repository is set up properly and, ultimately, to get it included on the official list of Backdrop contributed projects.

Importing to GitHub

The best way to get a Drupal module into GitHub is to import it; this preserves the pre-Backdrop commit history which becomes important later on.

Before you do this step, if you're planning to port a Drupal module that you don't maintain, it's considered best practice to notify the current maintainer and see if they'd like to participate or lead the Backdrop development themselves (see the "Communicating" section of the Drupal 7 to Backdrop conversion documentation for more information). In my case I'm already the module maintainer, so I went ahead and started the import:

  1. Go to the GitHub import page and provide the public URL of the Drupal project's Git repository (which I got from going to the project page on Drupal.org, clicking the "Version control" tab, and then - assuming you are importing a module that you maintain - making sure to uncheck the "Maintainer" checkbox so that the public URL is displayed). Drupal.org gives me this example code:


    git clone --branch 7.x-1.x http://git.drupal.org/project/user_cancel_password_confirm.git

    So I just grabbed the URL from that.

  2. Where GitHub asks for the project name, use the same short name (in this case "user_cancel_password_confirm") that the Drupal project uses.
  3. Import the project into your own GitHub account for starters (unless you're already a member of the Backdrop Contrib team - more on that later).

Here's what it looks like:
GitHub import

Submitting this form resulted in a new GitHub repository for my project at https://github.com/DavidRothstein/user_cancel_password_confirm.

As a final step, I edited the description of the GitHub project to match the description from the module's .info file ("Allows users to cancel their accounts with password confirmation rather than e-mail confirmation").

Cleaning Up Branches and Tags

Next up is some housekeeping. First, I cloned a copy of the new repository to my local machine and then used git branch -r to take a look around:


$ git clone git@github.com:DavidRothstein/user_cancel_password_confirm.git
$ git branch -r
origin/7.x-1.x
origin/HEAD -> origin/master
origin/master

Like many Drupal 7 contrib projects, this has a 7.x-1.x branch where all the work is done and a master branch that isn't used. When I imported the repository to GitHub it inherited those branches. However, for Backdrop I want to do all work on a 1.x-1.x branch (where the first "1.x" refers to compatibility with Backdrop core 1.x).

  1. So let's rename the 7.x-1.x branch:


    $ git checkout 7.x-1.x
    Branch 7.x-1.x set up to track remote branch 7.x-1.x from origin.
    Switched to a new branch '7.x-1.x'
    $ git branch -m 7.x-1.x 1.x-1.x
    $ git push --set-upstream origin 1.x-1.x
    Total 0 (delta 0), reused 0 (delta 0)
    To git@github.com:DavidRothstein/user_cancel_password_confirm.git
    * [new branch] 1.x-1.x -> 1.x-1.x
    Branch 1.x-1.x set up to track remote branch 1.x-1.x from origin.

  2. And delete the old one from GitHub:


    $ git push origin :7.x-1.x
    To git@github.com:DavidRothstein/user_cancel_password_confirm.git
    - [deleted] 7.x-1.x

  3. We want to delete the master branch also, but can't do it right away since GitHub treats that as the default and doesn't let you delete the default branch.

    So I went to the module's GitHub project page, where (as the repository owner) I have a "Settings" link in the right column; via that link it's possible to change the default branch to 1.x-1.x through the user interface.

    Now back on my own computer I can delete the master branch:


    $ git push origin :master
    To git@github.com:DavidRothstein/user_cancel_password_confirm.git
    - [deleted] master

  4. On Drupal.org, this module has a 7.x-1.0-rc1 release, which was automatically imported to GitHub. This won't be useful to Backdrop users, so I followed the GitHub instructions for deleting it.
  5. Finally, let's get our local working copy somewhat in sync with the changes on the server. The cleanest way to do this is probably just to re-clone the repository, but you could also run git remote set-head origin 1.x-1.x to make sure your local copy is working off the same default branch.

The end result is:


$ git branch -r
origin/1.x-1.x
origin/HEAD -> origin/1.x-1.x

Just what we wanted, a single 1.x-1.x branch which is the default (and which was copied from the 7.x-1.x branch on Drupal.org and therefore contains all its history).

Updating the Code for Backdrop

Now that the code is on GitHub, it's time to make it Backdrop-compatible.

To do this quickly, you can just make commits to your local 1.x-1.x branch and push them straight up to the server. In what follows, though, I'll follow best practices and create a dedicated branch for each change (so I can create a corresponding issue and pull request on GitHub). For example:


$ git checkout -b backdrop-compatibility
$ git push -u origin backdrop-compatibility

Then make commits to that branch, push them to GitHub, and create a pull request to merge it into 1.x-1.x.

  1. To get the module basically working, I'll make the simple changes discussed earlier:
    • Add basic Backdrop compatibility (issue) (diff)

    If you look at the diff, you can see that instead of simply adding the backdrop = 1.x line to the .info file, I replaced the core = 7.x line with it (since the latter is Drupal-specific and does not need to be in the Backdrop version).

    With that change, the module works! Here it is in action on my Backdrop site:

    Cancel account using password

    (Also visible in this screenshot is a nice effect of Backdrop's layout system: Editing pages like this one, even though they are using the default front-end Bartik theme, have a more streamlined, focused layout than normal front-end pages of the site, without the masthead and other standard page elements.)

  2. Other code changes for this small module weren't strictly necessary, but I made them anyway to have a fully-compatible Backdrop codebase:
    • Replace usage of "drupal" with "backdrop" in the code (issue) (diff)
    • Use method on the user account object to determine its ID (issue) (diff)
  3. Next up, I want to get my module listed on the official list of Backdrop contributed projects (currently this list is on GitHub, although it may eventually move to the main Backdrop CMS website).

    I read through the instructions for applying to the Backdrop contributed project group. They're relatively simple, and I've already done almost everything I need above. The one thing I'm missing is that Backdrop requires a README.md file in the project root with some standard information in it (I like that they're enforcing this; it should help developers browsing the module list a lot), and it also requires a LICENSE.txt file. These were both easy to create following the provided templates and copying some information from the module's Drupal.org project page:

    Once that's done, and after reading through the rest of the instructions and making sure I agreed with them, I proceeded to create an issue:

    Application to join contrib team

    In my case it was reviewed and approved within a few hours (perhaps helped by the fact that I was porting a small module), and I was given access to the Backdrop contributed project team on GitHub.

  4. To get the module transferred from my personal GitHub account to the official Backdrop contrib list, I followed GitHub's instructions for transferring a repository.

    They are mostly straightforward. Just make sure to use "backdrop-contrib" as the name of the new owner (who you are transferring the repository to):

    Transfer repository to backdrop-contrib

    And make sure to check the box that gives push access to your repository to the "Authors" team within the Backdrop Contrib group (if you leave it as "Owners", you yourself wouldn't be able to push to it anymore):

    Grant access to the Authors team

    That's all it took, and my module now appears on the official list.

    You'll notice after you do this that all the URLs of your project have changed, although the old ones redirect to the new ones. That's why if you follow many of the links in this post, which point to URLs like https://github.com/DavidRothstein/user_cancel_password_confirm, you'll see that they actually redirect you to https://github.com/backdrop-contrib/user_cancel_password_confirm.

    For the same reason, you can keep your local checkout of the repository pointed to the old URL and it will still work just fine, although to avoid any confusion you might want to either do a fresh clone at this point, or run a command like the following to update the URL:

    git remote set-url origin git@github.com:backdrop-contrib/user_cancel_password_confirm.git
    

With the above steps, we’re all set; the module is on GitHub and can be developed further for Backdrop there.

But what happens later on when I make a change to the Drupal version of the module and want to make the same change to the Backdrop version (certainly a common occurrence)? Do I have to repeat the same changes manually in both places? Luckily the answer is no. In part 3 of this series, I’ll explain how to link the Backdrop module to the Drupal.org version and maintain them simultaneously. Stay tuned!

Further Backdrop Resources

Do you have any thoughts or questions, or experiences of your own trying to port a module to Backdrop? Leave them in the comments.

Feb 09 2015
ChX

While coding the MongoDB integration for Drupal 8, I first hit a wall with the InstallerKernel, which was easy to remedy with a simple core patch, but then a similar problem occurred with the TestRunnerKernel, and that one is not so simple to fix: these things were not made with extensibility in mind. You might hit some other walls -- the code below is not MongoDB specific. But note how unusual this is: you won't hit similar problems often. Drupal 8 is very extensible, but it has its limits. When you need to bypass those limits, PHP gives you a way to extend classes (to some extent) that were not meant to be extended. Yes, it's a hack. But when all else fails...

Be very careful because the next version of Drupal core or other software you are commando-extending might not work the same internally and so your hack will break.

With that caveat, let me introduce you to reflection. While some might think it is only meant for inspecting objects, PHP 5 already had a ReflectionProperty::setValue method, and PHP 5.3 added ReflectionProperty::setAccessible.

$r = new \ReflectionObject($kernel);
// Get a handle on the protected serviceYamls property.
$services = $r->getProperty('serviceYamls');
$services->setAccessible(TRUE);
// Read the current value from $kernel, append our services file, and write it back.
$value = $services->getValue($kernel);
$value['app'][] = 'modules/mongodb/mongodb.services.yml';
$services->setValue($kernel, $value);

Let’s investigate carefully what happens here, because it is not evident and the documentation is lacking. $services is a ReflectionProperty object, and while it was created by a getProperty call on a ReflectionObject encapsulating the $kernel object, the $services object is completely ignorant of the fact that it belongs to the $kernel object. We could have created it from the InstallerKernel class with the same result; it doesn’t matter.

Once this is understood, there are few surprises in this block of code: the property is protected so we need to make it accessible. The setAccessible() call will not make $kernel->serviceYamls public because, again, the $services ReflectionProperty object is ignorant of the $kernel object it was created from. It will, however, enable us to call getValue and setValue on it. But again, we need to pass $kernel to both the getValue and the setValue call.

So this way and only this way we can change a protected object property.
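Distilled into a generic helper, the pattern looks like this (purely illustrative):

/**
 * Sets a protected property on an object via reflection.
 *
 * Illustrative helper only -- use sparingly, for the reasons above.
 */
function set_protected_property($object, $property, $value) {
  $r = new \ReflectionObject($object);
  $p = $r->getProperty($property);
  $p->setAccessible(TRUE);
  $p->setValue($object, $value);
}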

Even better, this works for private properties as well. So if you need to extend a class which has private properties (many Symfony classes do this) then instead of extending, wrap the object (this is called the delegate pattern) and use this method to access its private properties. As an example, let’s extend Symfony\Component\DependencyInjection\Scope (this is the shortest class I could find with a private property and an interface, the example is totally nonsensical and has nothing to do with the Scope class, it’s just an example):

class MyScope implements ScopeInterface {
  /**
   * The wrapped Scope object that calls are delegated to.
   */
  protected $scope;

  public function __construct($name, $parentName = ContainerInterface::SCOPE_CONTAINER) {
    $this->scope = new Scope($name, $parentName);
  }
  public function getName() {
    // Awful hack breaking encapsulation.
    $reflection_scope = new \ReflectionObject($this->scope);
    $name = $reflection_scope->getProperty('name');
    $name->setAccessible(TRUE);
    $value = 'my' . $name->getValue($this->scope);
    $name->setValue($this->scope, $value);
    return $value;
  }
  public function getParentName() {
    // This is proper: delegate the call.
    return $this->scope->getParentName();
  }
}
Feb 03 2015

Part 1 - Reuse the Same Code

In mid-January, the first version of Backdrop CMS was released. Backdrop is a fork of Drupal that adds some highly-anticipated features and API improvements to the core Drupal platform while focusing on performance, usability, and developer experience.

When an open-source fork makes the news, it's often because it was born from a fierce, acrimonious battle (example: Joomla forking from Mambo); the resulting projects compete with each other on the exact same turf and developers are forced to choose sides. Backdrop's goal, however, is not to destroy or replace the original Drupal project. Rather, it aims to be a "friendly fork" that focuses on Drupal's traditional audience of site builders and developers, an audience which the Backdrop founders believe is being slowly left behind by the Drupal project itself.

Because of this, I expect that many existing Drupal developers will not want to choose between the platforms, but instead will continue working with Drupal while also beginning to use Backdrop. In this series of blog posts, I will explain how a module (or theme) developer can take a Drupal project they currently maintain and support it for Backdrop as well, while keeping duplicate work to a minimum.

  • In part 1 (this post), I'll show how for some modules, the exact same code can be used with both Drupal and Backdrop.
  • In part 2, I'll explain what to do when you want to port a Drupal module to a separate Backdrop version and get it up and running on GitHub.
  • In part 3, I'll explain how to link the Backdrop module to the Drupal.org version and maintain them simultaneously.

Sometimes the Same Exact Code can be Used With Both Drupal and Backdrop

To start things off let's look at Field Reference Delete, a Drupal 7 module I maintain which does some behind-the-scenes cleanup in your database when entities such as nodes or taxonomy terms are deleted on the site. It's a moderately-complex module which makes heavy use of Drupal's field and entity systems.

To make this or any Drupal module work with Backdrop, there is one step that is always required: Adding a backdrop = 1.x line to the .info file to inform Backdrop core that the code is Backdrop-compatible.

Easy enough. The big question is what changes are required beyond that?

Checking for Changes

Although Backdrop is not 100% compatible with Drupal 7 (due to the features and API improvements it adds) it aims to be backwards-compatible as much as possible, for example by including a compatibility layer to allow Drupal functionality that is deprecated in Backdrop to still work correctly.

The Backdrop change records and module conversion guide provide technical advice for developers on how and when to upgrade their code. The biggest changes are probably the configuration management system (Backdrop’s replacement for Drupal 7’s variable API and other configuration that was previously stored in the database) and layout system (which removes much of the functionality of the Drupal 7 Block module in favor of a newer, more powerful Layout module).

If your Drupal 7 module makes heavy use of these systems, it’s likely you’ll want to make some changes in order to work with Backdrop. However, the compatibility layer means that you might not actually need to. For example, Backdrop retains Drupal 7’s variable API (although it is marked as deprecated and is not as powerful as the configuration system which replaces it). So your code might still work even if it uses this system. It really depends on the details of how your module works, so the best advice is to test it out and see what (if anything) is broken.

It’s also worth noting that because Backdrop was forked from an early version of Drupal 8 (not from Drupal 7) it inherited a smattering of changes that were added to Drupal 8 early in the release cycle. Not all of these have made it into the list of Backdrop change records yet, although work is ongoing and people are adding them as they are noticed.

In the case of Field Reference Delete, I tested the module on Backdrop and it worked fine. I also skimmed the change records and module conversion guide mentioned above and didn't see anything that obviously needed to change. Even though entities in Backdrop have been converted to classed objects and the field API has been converted to use the configuration management system, Field Reference Delete’s heavy use of the entity and field systems still didn’t require that any changes be made. All I had to do to get the module working was add that one backdrop = 1.x line to the .info file.

Adding the One Line of Code on Drupal.org

Interestingly enough, since Drupal will happily ignore a backdrop = 1.x line in a module's .info file, it's possible to simply add that code to the Drupal.org version of the module and use the same version of the module for either Drupal or Backdrop. I did that in this issue; the resulting diff is simply this:


diff --git a/field_reference_delete.info b/field_reference_delete.info
...
name = Field reference delete
description = Immediately removes references to a deleted entity from fields stored in an SQL database.
core = 7.x
+backdrop = 1.x

Drupal uses the core = 7.x line to determine Drupal compatibility, and Backdrop uses the backdrop = 1.x line to determine Backdrop compatibility. The 7.x-1.0-beta1 release of Field Reference Delete contains the above change and can be used equally well on a Drupal or Backdrop site. Simple!

There are some downsides to doing this, however:

  • Although no changes to the module code may be strictly required, there are usually optional (and non-backwards-compatible) changes that can be made to better integrate with new Backdrop features.
  • It is hard for Backdrop users and developers to find the module and know that it's compatible with Backdrop. I tried to improve its discoverability by adding a "Backdrop compatibility" tag to the above-mentioned issue, and I also put a note on the project page explaining that Backdrop is supported. These aren't ideal, but should help; perhaps a standard will eventually catch on but there isn't a formal one yet.

Despite these disadvantages, for the time being I'd like to just have one copy of the code for this particular module (hosted in one place), and it's nice to know that's possible.

In part 2 of this series, I’ll take a look at a different module I maintain where it isn’t possible to use the same exact code for Drupal and Backdrop, and I’ll show how I went through the official process for porting my module to Backdrop and getting it hosted on GitHub. Stay tuned!

Further Backdrop Resources

Do you have any thoughts or questions, or experiences of your own trying to port a module to Backdrop? Leave them in the comments.

Dec 15 2014

Earlier this year we undertook a project to upgrade a client's infrastructure to all new servers including a migration from old Puppet scripts which were starting to show their age after many years of server and service changes. During this process, we created a new set of Puppet scripts using Hiera to separate configuration data from modules. The servers in question were all deployed with CentOS, and it soon became obvious that we needed a modular way in Puppet to install and configure yum package repositories from within our various Puppet modules.

Searching through the Puppet Forge uncovered a handful of modules created to deal with yum repos; however, most were designed to implement a single repository or were not easily configurable via Hiera, which was one of our main goals for the new Puppet scripts. So, we decided to create our own module, and the yumrepos Puppet module was born.

The idea behind the module is to provide a clean and easy way to pull in common CentOS/RHEL yum repos from within Puppet. By wrapping each repo with its own class (e.g. yumrepos::epel and yumrepos::ius), we gain the ability to override default class parameters with Hiera configuration, making the module easy to use in most any environment. For example, if you have your own local mirror of a repo, you can override the default URL parameter either in Hiera, or from the class declaration without having to directly edit any files within the yumrepos module. This is as easy as:

  1. In your calling class, declare the yumrepo class you need. In this example, we'll use EPEL: class { 'yumrepos::epel': }
  2. In your Hiera configuration, you can configure the repo URL with: yumrepos::epel::epel_url: http://your.local.mirror/path/to/epel/
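Alternatively, the same override can be passed directly in the class declaration. A sketch (the epel_url parameter name is taken from the Hiera example above; the mirror URL is obviously a placeholder):

    class { 'yumrepos::epel':
      epel_url => 'http://your.local.mirror/path/to/epel/',
    }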

Currently the yumrepos module provides classes for the following yum repos:

  • Drupal Drush 5
  • Drupal Drush 6
  • EPEL
  • IUS Community (Optionally: IUS Community Archive)
  • Jenkins
  • Jpackage
  • Percona
  • PuppetLabs
  • RepoForge
  • Varnish 3
  • Zabbix 2.4

Each repo contains the GPG key for the repo (where available) and defaults to enabling GPG checks. Have another repo you'd like to see enabled? Feel free to file an issue or pull request at https://github.com/tag1consulting/puppet-yumrepos

Additionally, yumrepos classes accept parameters for package includes or excludes so that you can limit packages on a per-repo basis. These translate to the includepkgs and exclude options within a yum repo configuration. Similar to overriding the default repo URL, these options can be overridden by passing parameters within the class declaration or by setting the appropriate variables within Hiera.
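As a sketch of what that looks like in Hiera (the parameter names below are hypothetical -- I'm assuming they mirror the yum option names, so check the class in question for the exact names):

    yumrepos::epel::includepkgs: 'nginx varnish libmemcached'
    yumrepos::epel::exclude: 'php*'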

Once we had the yumrepos module in place, we had an easy way to configure yum repos from within our other modules. Stay tuned to the blog; we'll have more information about the overall Puppet scripts coming soon.

Dec 02 2014

I was drawn to Behavior Driven Development the moment I was pointed toward Behat not just for the automation but because it systematized and gave me a vocabulary for some things I already did pretty well. It let me teach some of those skills instead of just using them. At DrupalCon Amsterdam, Behat and Mink architect Konstantin Kudryashov gave a whole new dimension to that. His presentation, Doing Behaviour-Driven Development with Behat, was a straightforward, well-sequenced, and insightful explanation about why automation is so not the point, even though he began by writing automation tools.

Many Drupal shops have found themselves battered in the bizarre project management triangle, where clients push for more and more features within a fixed budget, often on wildly unrealistic timelines. Based on research conducted by Inviqa, where he works as the BDD Practice Manager, as well as a now-classic study by the Standish Group showing that for all that pressure for more and more features, nearly 50% are never used, he makes the following claim:

Your clients don’t care about your scope. Your clients don’t care about their budget. They care about their business. And the reason they try to get so many features on a finite budget is that they don’t believe that you care about their business.

They don’t know how what you do will help them, yet they take a chance. But because they don’t know and they don’t believe, when they come to you they’re really saying, “I don’t know if you can help me or not, but I’m willing to spend this amount of money to see if you can. And I don’t know which of the things you do will really make a difference, so I want everything I can get in hopes that something will help."

The talk focuses on quality vs. quantity in software development and how BDD methodology and tools are intended to drive a quality-first development process.

(Embedded video: Doing Behaviour-Driven Development with Behat)

The research was interesting, for sure, but for me, not entirely necessary. I already had a big case of confirmation bias.

I entered web development in the mid-90s, the flat-file days before Content Management Systems were popular. I had pretty okay Unix system administration experience and was developing standards-oriented HTML 2.0 skills, but at heart I was what we now call a “content strategist.” My formal, pre-technical background is in literature and education, so it makes a certain amount of sense that I entered the tech world content-first.

From the very beginning of the Web, it seemed clear to me that the point of sites was to communicate. There are significant technical considerations about how to support the communication, but ultimately the goals and missions should guide what is built, when it’s built and how it’s constructed.

I emphasized content above all back then, but after my first update of a large site from Drupal 4.7 to 5, I started to explain to new clients that Drupal is free as in puppies, and you’d better plan a maintenance budget. They would need to either trust or understand the kind of ongoing consequences of technical choices. You don’t just take home a new web site, you have to care for it, feed it, make arrangements for its care when you plan to go out of town. Despite good intentions, I had implemented my share of those unused features for a variety of reasons (all of which made sense at the time).

Still, it wasn’t until the late 00s, when I started to interact with Drupal agencies, that I realized achieving business goals and improving organizational communication weren’t even goals for some projects. As Drupal rescue projects began to come in, I realized that if Drupal was free as in puppies, then clients needed to steer clear of the puppy mills. I was feeling that pressure, especially when bidding on a project, to promise more and more features for a site without regard to the consequences - for delivery now and for maintenance later.

At one point, I toyed with addressing that pressure by joining a larger team. I remember interviewing for a job with a Drupal agency that involved immersive on-site consulting with clients. I asked during that interview how much they interfaced with their clients’ business process. My interviewer responded, “We don’t have the luxury of understanding our clients’ business.” I appreciated the honesty and withdrew my application.

I knew the value I offered my client was less about technical prowess and more about listening well to their needs and desires, then translating those into deliverables. That’s what the Behavior Driven Development methodology helps me with - ways to keep the whole team focused so they work on what really matters.

This image is from a great blog write-up by agile coach Dave Moran about where unused features come from, features which lead to considerable baggage as people implement, maintain, and sometimes even port features which are never used.

Sep 24 2012

Drupal’s highly dynamic and modular nature means that many of the central core and contrib subsystems and modules need to maintain a large amount of meta-data.

Rebuilding the data every request would be very expensive, and usually when one part of the data is needed during part of the request, another part will be needed later during the same request. Since just about every request needs to check variable_get(), load entities with fields attached etc., the meta-data needs to be loaded too.

The pattern followed by most subsystems is to put the data into a single large cache item and statically cache it. The more modules you have, the larger these cache items become — since more modules mean more variables, hook_schema() and hook_theme() implementations, etc. And the same happens via configuration with field instances, content types and default views.

This affects many of the central core subsystems — without which it’s impossible to run a Drupal site — as well as some of the most popular contrib modules. The theme, schema, path alias, variables, field API and modules system all have similar issues in core. Views and CCK have similar issues in contrib.

With just a stock Drupal core install, none of this is too noticeable, but once you hit 100 or 200 installed modules, suddenly every request needs to fetch and unserialize() potentially dozens of megabytes of data. Some of the largest cache items like the theme registry can grow too large for MAX_ALLOWED_PACKET or the memcache default slab size. Since the items are statically cached, these caches can easily add 30MB or 40MB to PHP memory usage combined.

The full extent of this problem became apparent when I profiled the WebWise Symantec Connect site (Drupal.org case study). Symantec Connect currently runs on Drupal 6 and, as a complex site with a lot of social functionality, has a reasonably large number of installed modules.

To find the size of these cache items, the simplest method is to profile the request with XHProf. Here’s a detail of the _theme_load_registry() function from Symantec Connect before this issue was dealt with. As you can see from this screenshot, the theme registry is taking up over 3MB of memory, and it takes over 6ms to pull it from cache.

XHProf screenshot

When reviewing Drupal 7 code it was found that in all cases the problems with oversized cache items were still there.

To tackle this and similar large cache items, I started working on a solution with the following requirements:

  • Keep the single round trip to the cache backend.
  • Massively reduce the amount of data loaded on each request.
  • Avoid excessive cache writes or segmentation.
  • Fix the performance issue without breaking APIs.

Webwise agreed to fund core patches for the most serious issues; between that funding, my own time, and plenty of Drupal community review we eventually ended up with CacheArray as a new core API - merged into Drupal 8 and also backported to Drupal 7.

CacheArray allows for the following:

  • Implements ArrayAccess to allow (mostly) backwards compatibility with the vast Arrays of Doom™.
  • Begins empty and progressively fills in array keys/values as they’re requested.
  • Caches any newly found array keys at the end of the request with one write to the cache system, so they’re available for the next one.
  • Items that would have been loaded into the ‘real’ array every request, but are never actually used, skip being loaded, resulting in a much smaller cache item.

For many systems such as the theme registry and the schema cache, much less than half of the data is actually used during most requests to the site. Theme functions may be defined for an admin page which is never visited. Often the schema definition for a table is only required for a handful of tables on a site, for example if drupal_write_record() is used to write to them. By excluding these unused items from the runtime caching, we’re able to significantly reduce the amount of data that needs to be lugged around every request.
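To make the mechanics concrete, here is a minimal sketch of a subclass of DrupalCacheArray (the Drupal 7 name of the API). The class and method names are based on the core implementation, but treat the details as illustrative and check includes/bootstrap.inc for the exact API; example_rebuild_item() is a hypothetical, expensive rebuild function:

/**
 * Illustrative sketch of a DrupalCacheArray subclass (Drupal 7).
 */
class ExampleMetadataCache extends DrupalCacheArray {
  public function __construct() {
    // Cache ID and cache bin to persist into.
    parent::__construct('example_metadata', 'cache');
  }

  /**
   * Called only when a requested key is not already in the cached array.
   */
  protected function resolveCacheMiss($offset) {
    // The expensive rebuild only runs for keys that are actually requested.
    $value = example_rebuild_item($offset);
    $this->storage[$offset] = $value;
    // Queue the new key to be written back to the cache at the end of the request.
    $this->persist($offset);
    return $value;
  }
}

Core's own implementations, such as the theme registry, follow this general shape.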

Revisiting the XHProf screenshots and looking at Connect with the theme registry replaced by a CacheArray, we can see that the time spent fetching the cache item was reduced to 1.7ms and the memory usage to 360KB (although actual site traffic determines how large the cache item will eventually grow).

XHProf screenshot

So far, the following core issues have been fixed in Drupal 8, most of these are already backported to Drupal 7 or are in the process of being so:

Introduce DrupalCacheArray and use it for drupal_get_schema()

Theme registry can grow too large for MySQL max_allowed_packet and memcache default slab size

Be more sparing with locale cache clears

system_list() memory usage

Still in progress (reviews/re-rolls welcome!):

variable_set() should rebuild the variable cache, not just clear it

Avoid slow query for path alias whitelists

And related issues — not using CacheArray but resulting from the same research:

_content_type_info() memory usage improvements (CCK - fixed)
_field_info_collate_fields() memory usage (core, in progress).
views_get_default_view() - race conditions and memory usage.
views_fetch_data() cache item can reach over 10mb in size

Since CacheArray implements ArrayAccess, which requires PHP 5 (and Drupal 6 core still supports PHP 4), it’s unfortunately not possible to backport it to Drupal 6 core. However, Tag1 Consulting maintains a fork of Pressflow which contains many of these improvements. Since Pressflow core requires PHP 5.2, these can also be submitted as pull requests for the main distribution.

Apr 26 2012

At Tag1 Consulting we do a lot of work on increasing web site performance, especially around Drupal sites. One of the common tools we use is memcached combined with the Drupal Memcache module. In Drupal, there are a number of different caches which are stored in the (typically MySQL) database by default. This is good for performance as it cuts down on potentially large/slow SQL queries and PHP execution needed to display content on a site. The Drupal Memcache module allows you to configure some or all of those caches to be stored in memcached instead of MySQL; typically these cache gets/puts are much faster in memcached than they would be in MySQL, and at the same time this decreases the workload on the database server. This is all great for performance, but it involves setting up an additional service (memcached) as well as adding a PHP extension in order to communicate with memcached. I've seen a number of guides on how to install these things on Fedora or CentOS, but so many of them are outdated or give instructions which I wouldn't suggest, such as building things from source, installing with the 'pecl' command (not great on a package-based system), or using various external yum repositories (some of which don't mix well with the standard repos). What follows is my suggested method for installing these needed dependencies in order to use memcached with Drupal, though the same process should be valid for any other PHP script using memcache.

PECL Packages

For the Drupal Memcache module, either the PECL memcache or PECL memcached (note the 'd'!) extensions can be used. While PECL memcached is newer and has some additional features, PECL memcache (no 'd'!) tends to be better tested and supported, at least for the Drupal Memcache module. Yes, the PECL extension names are HORRIBLE and very confusing to newcomers! I almost always use the PECL memcache extension because I've had some strange behavior in the past using the memcached extension; likely those problems are fixed now, but it's become a habit and personal preference to use the memcache extension.

Installing and Configuring memcached

The first step is to get memcached installed and configured. CentOS 5 and 6 both include memcached in the base package repo, as do all recent Fedora releases. To install memcached is simply a matter of:
# yum install memcached

Generally, unless you really know what you're doing, the only configuration option you'll need to change is the amount of memory to allocate to memcached. The default is 64MB. That may be enough for small sites, but for larger sites you will likely be using multiple gigabytes. It's hard to recommend a standard size to use as it will vary by a large amount based on the site. If you have a "big" site, I'd say start at 512MB or 1GB; if you have a smaller site you might leave the default, or just bump it to 512MB anyway if you have plenty of RAM on the server. Once it's running, you can watch the memory usage and look for evictions (removal of a cache item once the cache is full) to see if you might want to increase the memory allocation.

On all Fedora / CentOS memcached packages, the configuration file is stored in /etc/sysconfig/memcached. By default, it looks like this:

PORT="11211"
USER="memcached"
MAXCONN="1024"
CACHESIZE="64"
OPTIONS=""

To increase the memory allocation, adjust the CACHESIZE setting to the number of MB you want memcached to use.

If you are running memcached locally on your web server (and only have one web server), then I strongly recommend you also add an option for memcached to listen only on your loopback interface (localhost). Whether or not you make that change, please consider locking down the memcached port(s) with a firewall. In order to listen only on the 127.0.0.1 interface, you can change the OPTIONS line to the following:

OPTIONS="-l 127.0.0.1"

See the memcached man page for more info on that or any other settings.

Once you have installed memcached and updated the configuration, you can start it up and configure it to start on boot:

# service memcached start
# chkconfig memcached on

PECL Module Install

Fedora

If you are on Fedora and using PHP from the base repo in the distribution, then installation of the PECL extension is easy. Just use yum to install whichever PECL extension you choose:

# yum install php-pecl-memcache

Or

# yum install php-pecl-memcached

CentOS 5 / RHEL 5

CentOS and RHEL can be a bit more complicated, especially on EL5 which ships with PHP 5.1.x, which is too old for most people. Here are the options I'd suggest for EL5:

  • If you are OK using the PHP provided with EL5, then you can get the PECL extensions from EPEL. Once you've enabled the EPEL repository (instructions), you can install either PECL extension by using the same yum commands outlined above in the Fedora section.
  • If you want to use PHP 5.2 or PHP 5.3 with EL5, I suggest using the IUS repositories (IUS repo instructions). Note that IUS provides the PECL memcache extension, but not the PECL memcached extension. Based on which PHP version you decide to use, you can install the PECL memcache extension with either:

    # yum install php52-pecl-memcache

    Or

    # yum install php53u-pecl-memcache

CentOS 6 / RHEL 6

EL6 ships with PHP 5.3, though it is an older version than is available for EL6 at IUS. If you are using the OS-provided PHP package, then you can install the PECL memcache extension from the base OS repo. If you want the PECL memcached extension, it is not in the base OS repo, but is available in EPEL. See the instructions linked from the CentOS 5 section above if you need to enable the EPEL repo.

# yum install php-pecl-memcache

Or, enable EPEL and then run:

# yum install php-pecl-memcached

As with EL5, some people running EL6 will also want the latest PHP packages and can get them from the IUS repositories. If you are running PHP from IUS under EL6, then you can install the PECL memcache extension with:

# yum install php53u-pecl-memcache

Similar to EL5, the IUS repo for EL6 does not include the PECL memcached module.

PECL Memcache Configuration

If you are using PECL memcache extension and will be using the clustering option of the Drupal Memcache module which utilizes multiple memcached instances, then it is important to set the hash strategy to "consistent" in the memcache extension configuration. Edit /etc/php.d/memcache.ini and set (or un-comment) the following line:

memcache.hash_strategy=consistent

If you are using the PECL memcached module, this configuration is done at the application level (e.g. in your Drupal settings.php).
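As an illustration, that application-level configuration would go in settings.php and look something like the sketch below. The $conf['memcache_options'] variable name is an assumption on my part (verify it against the memcache module's README); the constants themselves are standard PECL memcached options.

$conf['memcache_options'] = array(
  // Use consistent hashing when distributing keys across servers.
  Memcached::OPT_DISTRIBUTION => Memcached::DISTRIBUTION_CONSISTENT,
);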

Once you've installed the PECL memcache (or memcached) extension, you will need to reload httpd in order for PHP to see the new extension. You'll also need to reload httpd whenever you change the memcache.ini configuration file.

# service httpd reload

SELinux

If you have SELinux enabled (you should!), I have an older blog post with instructions on configuring SELinux for Drupal.

That's it, you're now good to go with PHP and memcache!

Dec 22 2011

I see a lot of people coming by #centos and similar channels asking for help when they’re experiencing a problem with their Linux system. It amazes me how many people describe their problem, and then say something along the lines of, “and I disabled SELinux...”. Most of the time SELinux has nothing to do with the problem, and if SELinux is the cause of the problem, why would you throw out the extra security by disabling it completely rather than configuring it to work with your application? This may have made sense in the Fedora 3 days when selinux settings and tools weren’t quite as fleshed out, but the tools and the default SELinux policy have come a long way since then, and it’s very worthwhile to spend a little time to understand how to configure SELinux instead of reflexively disabling it. In this post, I’m going to describe some useful tools for SELinux and walk through how to configure SELinux to work when setting up a Drupal web site using a local memcached server and a remote MySQL database server -- a pretty common setup for sites which receive a fair amount of traffic.

This is by no means a comprehensive guide to SELinux; there are many of those already!
http://wiki.centos.org/HowTos/SELinux
http://fedoraproject.org/wiki/SELinux/Understanding
http://fedoraproject.org/wiki/SELinux/Troubleshooting

Too Long; Didn’t Read Version

If you’re in a hurry to figure out how to configure SELinux for this particular type of setup, on CentOS 6, you should be able to use the following two commands to get things working with SELinux:
# setsebool -P httpd_can_network_connect_db 1
# setsebool -P httpd_can_network_memcache 1

Note that if you have files existing somewhere on your server and you move them to the webroot rather than untar them there directly, you may end up with SELinux file contexts set incorrectly on them which will likely deny access to apache to read those files. If you are having a related problem, you’ll see something like this in your /var/log/audit/audit.log:
type=AVC msg=audit(1324359816.779:66): avc: denied { getattr } for pid=3872 comm="httpd" path="/var/www/html/index.php" dev=dm-0 ino=549169 scontext=root:system_r:httpd_t:s0 tcontext=root:object_r:user_home_t:s0 tclass=file

You can solve this by resetting the webroot to its default file context using the restorecon command:
# restorecon -rv /var/www/html

Server Overview

I’m going to start with a CentOS 6 system configured with SELinux in targeted mode, which is the default configuration. I’m going to be using httpd, memcached, and PHP from the CentOS base repos, though the configuration wouldn’t change if you were to use the IUS PHP packages. MySQL will be running on a remote server which gives improved performance, but means a bit of additional SELinux configuration to allow httpd to talk to a remote MySQL server. I’ll be using Drupal 7 in this example, though this should apply to Drupal 6 as well without any changes.

Initial Setup

Here we will setup some prerequisites for the website. If you already have a website setup you can skip this section.

We will be using tools such as audit2allow which is part of the policycoreutils-python package. I believe this is typically installed by default, but if you did a minimal install you may not have it.
# yum install policycoreutils-python

Install the needed apache httpd, php, and memcached packages:
# yum install php php-pecl-apc php-mbstring php-mysql php-pecl-memcache php-gd php-xml httpd memcached

Startup memcached. The CentOS 6 default configuration for memcached only listens on 127.0.0.1, this is great for our testing purposes. The default of 64M of RAM may not be enough for a production server, but for this test it will be plenty. We’ll just start up the service without changing any configuration values:
# service memcached start

Startup httpd. You may have already configured apache for your needs, if not, the default config should be enough for the site we’ll be testing.
# service httpd start

If you are using a firewall, then you need to allow at least port 80 through so that you can access the website -- I won’t get into that configuration here.

Install Drupal. I’ll be using the latest Drupal 7 version (7.9 as of this writing). Direct link: http://ftp.drupal.org/files/projects/drupal-7.9.tar.gz
Download the tarball, and expand it to the apache web root. I also use the --strip-components=1 argument to strip off the top level directory, otherwise it would expand into /var/www/html/drupal-7.9/
# tar zxf drupal-7.9.tar.gz -C /var/www/html --strip-components=1

Also, we need to get the Drupal site ready for install by creating a settings.php file writable by apache, and also create a default files directory which apache can write to.
# cd /var/www/html/sites/default/
# cp default.settings.php settings.php
# chgrp apache settings.php && chmod 660 settings.php
# install -d -m 775 -g apache files

Setup a database and database user on your MySQL server for Drupal. This would be something like this:
mysql> CREATE DATABASE drupal;
mysql> GRANT ALL ON drupal.* TO 'drupal_rw'@'webhost' IDENTIFIED BY 'somepassword';

Test this out by using the mysql command line tool on the web host.
# mysql -u drupal_rw -p -h drupal

That should connect you to the remote MySQL server. Be sure that is working before you proceed.

Now for the Fun Stuff

If you visit your new Drupal site at http://your-hostname-here, you’ll be presented with the Drupal installation page. Click ahead a few times, setup your DB info on the Database Configuration page -- you need to expand “Advanced Options” to get to the hostname field since it assumes localhost. When you click the button to proceed, you’ll probably get an unexpected error that it can’t connect to your database -- this is SELinux doing its best to protect you!

Allowing httpd to Connect to a Remote Database

So what just happened? We know the database was setup properly to allow access from the remote web host, but Drupal is complaining that it can’t connect. First, you can look in /var/log/audit/audit.log which is where SELinux will log access denials. If you grep for ‘httpd’ in the log, you’ll see something like the following:
# grep httpd /var/log/audit/audit.log
type=AVC msg=audit(1322708342.967:16804): avc: denied { name_connect } for pid=2724 comm="httpd" dest=3306 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket

That is telling you, in SELinux gibberish language, that the httpd process was denied access to connect to a remote MySQL port. For a better explanation of the denial and some potential fixes, we can use the ‘audit2why’ utility:
# grep httpd /var/log/audit/audit.log | audit2why
type=AVC msg=audit(1322708342.967:16804): avc: denied { name_connect } for pid=2724 comm="httpd" dest=3306 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:mysqld_port_t:s0 tclass=tcp_socket

Was caused by:
One of the following booleans was set incorrectly.
Description:
Allow HTTPD scripts and modules to connect to the network using TCP.

Allow access by executing:
# setsebool -P httpd_can_network_connect 1
Description:
Allow HTTPD scripts and modules to connect to databases over the network.

Allow access by executing:
# setsebool -P httpd_can_network_connect_db 1

audit2why will analyze the denial message you give it and potentially explain ways to correct it if it is something you would like to allow. In this case, there are two built in SELinux boolean settings that could be enabled for this to work. One of them, httpd_can_network_connect, will allow httpd to connect to anything on the network. This might be useful in some cases, but is not very specific. The better option in this case is to enable httpd_can_network_connect_db which limits httpd generated network connections to only database traffic. Run the following command to enable that setting:
# setsebool -P httpd_can_network_connect_db 1

It will take a few seconds and not output anything. Once that completes, go back to the Drupal install page, verify the database connection info, and click on the button to continue. Now it should connect to the database successfully and proceed through the installation. Once it finishes, you can disable apache write access to the settings.php file:
# chmod 640 /var/www/html/sites/default/settings.php

Then fill out the rest of the information to complete the installation.

Allowing httpd to connect to a memcached server

Now we want to setup Drupal to use memcached instead of storing cache information in MySQL. You’ll need to download and install the Drupal memcache module available here: http://drupal.org/project/memcache
Install that into your Drupal installation, and add the appropriate entries into settings.php. For this site, I did that with the following:
# mkdir /var/www/html/sites/default/modules
# tar zxf memcache-7.x-1.0-rc2.tar.gz -C /var/www/html/sites/default/modules

Then edit settings.php and add the following two lines:
$conf['cache_backends'][] = 'sites/default/modules/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';

Now if you reload your site in your web browser, you’ll likely see a bunch of memcache errors -- just what you wanted! I bet it’s SELinux at it again! Check out /var/log/audit/audit.log again and you’ll see something like:
type=AVC msg=audit(1322710172.987:16882): avc: denied { name_connect } for pid=2721 comm="httpd" dest=11211 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:memcache_port_t:s0 tclass=tcp_socket

That’s very similar to the last message, but this one is for a memcache port. What does audit2why have to say?
# grep -m 1 memcache /var/log/audit/audit.log | audit2why
type=AVC msg=audit(1322710172.796:16830): avc: denied { name_connect } for pid=2721 comm="httpd" dest=11211 scontext=unconfined_u:system_r:httpd_t:s0 tcontext=system_u:object_r:memcache_port_t:s0 tclass=tcp_socket

Was caused by:
One of the following booleans was set incorrectly.
Description:
Allow httpd to act as a relay

Allow access by executing:
# setsebool -P httpd_can_network_relay 1
Description:
Allow httpd to connect to memcache server

Allow access by executing:
# setsebool -P httpd_can_network_memcache 1
Description:
Allow HTTPD scripts and modules to connect to the network using TCP.

Allow access by executing:
# setsebool -P httpd_can_network_connect 1

Again, audit2why gives us a number of options to fix this. The best bet is to go with the smallest and most precise change for our needs. In this case there’s another perfect fit: httpd_can_network_memcache. Enable that boolean with the following command:
# setsebool -P httpd_can_network_memcache 1

Success! Now httpd can talk to memcache. Reload your site a couple of times and you should no longer see any memcache errors. You can be sure that Drupal is caching in memcache by connecting to the memcache CLI (telnet localhost 11211) and typing ‘stats’. You should see some number greater than 0 for ‘get_hits’ and for ‘bytes’.

What are all these booleans anyway?

Now we’ve used a couple SELinux booleans to allow httpd to connect to memcached and MySQL. You can see a full list of booleans which you can control by using the command ‘getsebool -a’. They are basically a preset way for you to allow/deny certain pre-defined access controls.
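For example, to see just the httpd-related booleans and their current state (output abbreviated; note the two booleans we enabled above now show as on):

# getsebool -a | grep httpd
httpd_can_network_connect --> off
httpd_can_network_connect_db --> on
httpd_can_network_memcache --> on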

Restoring default file contexts

As I mentioned briefly in the ‘TL;DR’ section, another common problem people experience is with file contexts. If you follow my instructions exactly, you won't have this problem, because we untar the Drupal files directly into the webroot, so they inherit the default file context for /var/www/html. If, however, you untar the files in your home directory and then use ‘mv’ (or a context-preserving ‘cp’) to place them in /var/www/html, they will keep the user_home_t context, which apache won't be able to read by default. If this is happening to you, you will see the file denials logged in /var/log/audit/audit.log -- something like this:
type=AVC msg=audit(1324359816.779:66): avc: denied { getattr } for pid=3872 comm="httpd" path="/var/www/html/index.php" dev=dm-0 ino=549169 scontext=root:system_r:httpd_t:s0 tcontext=root:object_r:user_home_t:s0 tclass=file

The solution in this case is to use restorecon to reset the file contexts back to normal:
# restorecon -rv /var/www/html

Update: It was noted that I should also mention another tool for debugging audit messages, 'sealert'. This is provided in the setroubleshoot-server package and will also read in the audit log, similar to what I described with audit2why.
# sealert -a /var/log/audit/audit.log

Oct 28 2011
ChX

DISQUS is a popular "social commenting" platform. It is integrated with many hosted blog platforms and open source CMSes, including Drupal. A client of ours exported the comments from their old WordPress blog and then imported them into DISQUS. The problem was that the comments showed up in the DISQUS dashboard, but when you visited their corresponding URLs, the imported comments did not appear in Drupal. While the Drupal module looks for comments on the node/X URLs, DISQUS was storing them under the old WordPress URLs, which in this case were implemented as Drupal path aliases.

Fixing this was relatively easy, as DISQUS has a mapping tool available: you can download the URLs it knows about and then upload a very simple CSV file to change them. To generate the CSV file after you have saved the DISQUS URLs into disqus-urls.csv, just run the following script with drush php-script disqusmap.php > map:

<?php
// Base URL of the site, including the trailing slash.
$base = 'http://www.example.com/';
$n = strlen($base);
// disqus-urls.csv holds the URLs DISQUS already knows about, one per line.
foreach (file('disqus-urls.csv') as $url) {
  $path = substr(trim($url), $n);
  // Resolve the path alias back to its node/X system path.
  $source = drupal_get_normal_path($path);
  if ($source != $path) {
    // Print one "old URL, new URL" pair per line for the DISQUS mapping tool.
    echo "$base$path, $base$source\n";
  }
}

After you have uploaded the file, there is nothing to do but wait. As far as I can see, there are no logs, progress reports, or anything else that provides status. In this client's case the re-mapping solved the missing comments, so hopefully it will work for you as well.

Oct 25 2011

Tag1 Consulting is sponsoring my work on Drupal.org Infrastructure. What this means is that instead of working on drupal.org whenever I can, I get to spend 20 paid hours per week on drupal.org infrastructure. In return for this, I have agreed to write a blog entry per month describing some of my work in detail. These will be entries covering security, performance, high-availability configuration and anything else interesting in my work on drupal.org. Hopefully these will be useful.

I look forward to spending more time securing and improving the performance of drupal.org and would like to thank everyone at Tag1 and our clients for this opportunity.

About Narayan Newton
Partner/CTO
Narayan is a co-owner of Tag1 Consulting who joined the team in 2008. He was introduced to Drupal during his tenure at the Oregon State University Open Source Lab, where he was the co-lead system administrator and served as the Database Administrator for over 180 MySQL databases, the Freenode server administrator, and the Drupal.org server administrator. He is a permanent member of the Drupal Association as their Systems Coordinator, and is a member of the Drupal.org infrastructure team. He is also a co-maintainer of the Pressflow high-performance Drupal distribution.

Outside of Drupal, Narayan has been deeply involved in the FreeBSD, Gentoo and Slackware communities. He was a contributing editor and system administrator of a popular Linux community website. As a Google Summer of Code student he developed the KDE front-end to the SUSE System Update server, and became a KDE core committer in 2007. He also ported and updated the XML-RPC system to KDE3 and KDE4.

More recently, he acted as infrastructure lead for the examiner.com 2.0 re-launch and infrastructure lead for the drupal.org redesign launch. Narayan is currently Chief Technology Officer at Tag1 Consulting.

Jul 09 2011

During performance and scalability reviews of sites, we regularly find ourselves submitting patches to contrib modules and core to resolve performance issues.

Most large Drupal installations we work with use a variation of this workflow to track patches:

  • Upload the patch to an issue on Drupal.org if it's not already there
  • Add the patch to a /patches folder in revision control
  • Document what the patch does, the Drupal.org nid, and a reference to the ticket in the client's issue tracker in /patches/README.txt
  • Apply the patch to the code base

(Drush make seems like a popular alternative to this, but most of the sites we work on predate that workflow so I've not actually tried it).

Applying patches is usually straightforward enough if you're only changing PHP code; however, you can run into trouble with this method if you're changing the schema of a database table. In our case, 99% of the time when we need to make schema changes we're adding or modifying indexes.

Recently on a client site, I found they'd applied a core patch adding an index to the comment module (this one: http://drupal.org/node/336483). I discovered this when checking their comment module schema while reviewing another core patch, which adds a different index to the same table (http://drupal.org/node/243093).

With schema changes, the ordering of hook_update_N() matters greatly.

If you apply a patch that adds an update to a module, let's say comment_update_7004(), you have no guarantee that your patch will get committed before anyone else adds a comment_update_7004() of their own to that module. In this case, with two patches competing for the same update number, that seems relatively likely.

Drupal will not run the same update number twice, so this risks missing schema updates to the module entirely. This could put you in the position of having to run those updates manually yourself, or of building your own upgrade path to get your site back into sync -- not fun.

To avoid this, recently we've been using this workflow:

  • Upload the patch to an issue on Drupal.org
  • Add a hook_schema_alter() in a custom module, with a @see to the Drupal.org issue nid
  • Add a custom update handler: mycustommodule_update_N() to add the index

This will avoid running into conflicts with update numbering, so that when other schema changes are added to the module, you'll still have those updates run.
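
To make that concrete, here is a minimal sketch of what the custom module might contain. The module name, index name, and columns are placeholders; match them to whatever the patch you are carrying actually adds (for example, the comment index from http://drupal.org/node/336483):

<?php
/**
 * Implements hook_schema_alter().
 *
 * Documents the index we add by patch so the schema definition matches
 * what is actually in the database.
 *
 * @see http://drupal.org/node/336483
 */
function mycustommodule_schema_alter(&$schema) {
  // Placeholder index name and columns; use whatever the patch adds.
  $schema['comment']['indexes']['comment_nid_status'] = array('nid', 'status');
}

/**
 * Add the index from the core patch we are carrying in /patches.
 */
function mycustommodule_update_7100() {
  db_add_index('comment', 'comment_nid_status', array('nid', 'status'));
}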

Health warning
While this approach will ensure that you don't skip any core or contrib updates, it will not prevent updates being run twice - once from your custom module, once from the module itself.

If you're only adding indexes this is usually OK, but if in doubt, http://drupal.org/project/schema can show you any discrepancies. Drupal 7 also has the handy db_index_exists() function, which should help when resolving index mismatches.
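
Building on the sketch above (same placeholder module and index names), a more defensive version of that update would guard the db_add_index() call with db_index_exists(), so nothing breaks if the index is already there, whether from the module's own update or from a previous run:

<?php
/**
 * Defensive version of the earlier update: add the comment index only if
 * it does not already exist.
 */
function mycustommodule_update_7100() {
  if (!db_index_exists('comment', 'comment_nid_status')) {
    db_add_index('comment', 'comment_nid_status', array('nid', 'status'));
  }
}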

It should go without saying that you should test this all on dev and staging servers before making any changes to your live site. Additionally, indexes can usually be added or removed with less impact than other schema changes - other kinds of changes can get you into a lot more trouble.

Mar 03 2011

We're excited to have 7 members of the Tag1 Consulting team attending the DrupalCon in Chicago next week. We are all looking forward to participating in another fantastic Drupal Conference. If you've not already bought your tickets, it's still not too late! Don't miss this one!

In Chicago, Tag1 will be passing out copies of Drupal Watchdog, participating in training courses, sessions, and BoFs, and generally enjoying the two-way sharing of knowledge with our fellow Drupal developers and users.

We will be sporting fancy new t-shirts in Chicago designed by Codename Design, incorporating our new logo designed by Candice Gansen. Find a Tag1 team member and ask us about Drupal performance and scalability, and you might get one yourself! :)

Stephanie modeling Tag1 t-shirt
Jeremy Andrews
Jeremy, Tag1's CEO and founding partner, spent much of this year focused on getting Drupal Watchdog issue #1 out the door. He'll be circulating the DrupalCon looking for feedback that will impact future issues of Drupal's only dedicated print magazine.

Narayan Newton
Narayan, Tag1's CTO and partner, will be one of the teachers at the Drupal Performance and Scalability pre-conference training course. As a member of the Drupal infrastructure team he was heavily involved in the recent CVS -> Git migration and the drupal.org redesign.

Rudy Grigar
Rudy, Tag1's Director of Operations, is part of the Drupal infrastructure team and was heavily involved in the Drupal Redesign effort. You can read more about the infrastructure improvements required for the Redesign effort in Rudy's article in Drupal Watchdog issue #1.

Other members of the Tag1 team that will be present at the DrupalCon include Brandon Bergren, Nat Catchpole, Jason Markantes and Bob Williams. With 77% of our virtual office physically attending the DrupalCon, we'll return to our normal schedule and availability the following week, March 14th.
