Jan 15 2021

Last year, Mariano had a proposal: let’s try to deploy automatically after a successful test run on Travis. We had never had anything like that before; all we had was a bunch of shell scripts that assisted the process. TL;DR: upgrading to continuous delivery was easier than we thought, and we have since introduced it on more and more client projects.

This is a deep dive into the journey we had, and where we are now. But if you’d like to jump right into the code, our Drupal Starter Kit has everything already wired in, so you can use it yourself without too much effort.


At the moment the site is assembled and tests are run on every push. To be able to perform a deployment as well, a few pre-conditions must be met:

  • a solid hosting architecture: either a PaaS solution such as Pantheon or Platform.sh, to name two that are popular in the Drupal world, or of course a custom Kubernetes cluster would do the job as well
  • a well-defined way to execute a deployment

Let me elaborate a bit on these points. In our case, since we use fully managed hosting tailored to Drupal, a git push is enough to perform a deployment, and doing that from a shell script on Travis is not especially complicated. The second point is much trickier. It means that you must not rely on manual steps during deployments, and that can be hard. For example, while Drupal 8 manages configuration well, sometimes there’s a need to alter content right after a code change. The same is true for translations; they can be tricky as well. For every single non-trivial thing you’d like to manage, you need a fully automated way to do it, so Travis can execute it for you.
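For instance, the "fully automated way" for those post-deploy steps can be a small script that Travis runs after pushing the code. Here is a minimal sketch, assuming the usual Drush commands for updates, configuration and translations; the `DRUSH` indirection defaults to a dry run that only prints the commands, purely so the sketch is safe to try:

```shell
#!/usr/bin/env bash
set -e

# DRUSH defaults to a dry run that prints the commands; set
# DRUSH="drush" (or "ddev drush") to execute them for real.
DRUSH="${DRUSH:-echo drush}"

post_deploy() {
  $DRUSH updb -y          # apply pending database updates
  $DRUSH config:import -y # import the exported configuration
  $DRUSH locale:update    # refresh interface translations
  $DRUSH cache:rebuild    # rebuild caches last
}

post_deploy
```

The point is that every step a human used to perform after a release has to appear here, in order, with no prompts.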

In our case, even before we did any deployments from Travis, we used semi-automatic shell scripts to do deployments by hand. Still, we had exceptions, so we prepared “deployment notes” from release to release.

Authorization & Authentication

What works from your computer, for instance pushing to a Git repository, won’t work from Travis or from any empty containerized environment. These days, whatever hosting you use, password-less authentication means a public/private key pair, so you need a new one, dedicated to your new buddy: the deployment robot.

ssh-keygen -f deployment-robot-key -N "" -C "[email protected]"

Then you can add the public key to your hosting platform, so the key is authorized to perform deployments (if you want to raise the bar, even to the live environment). Ideally you should create a dedicated dummy user on the hosting platform, so it’s evident in the logs that a particular deployment comes from a robot, not from a human.

So what’s next? Copying the private key into the Git repository? I hope not, as you probably wouldn’t want to open a security hole and allow anyone to perform a deployment at any time. Travis, like most CI platforms, provides a nice way to encrypt secrets that are not meant for every coworker. So bootstrap the Travis CLI tool and, from the repository root, issue:

travis encrypt-file deployment-robot-key

Then follow the instructions on what to commit to the repository and what not.

Now you can deploy both from localhost and from Travis as well.

Old-Fashioned, Half-Automated Deployments

Let’s see a sample snippet from a Drupal 7 project that has existed since 2007:

echo -e "${GREEN}Git commit new code.${NORMAL}\n"
git add . --all

echo -e "${YELLOW}Sleeping for 5 seconds, you can abort the process before push by hitting Ctrl-C.${NORMAL}\n"
git status
sleep 5
git commit -am "Site update from $ORIGIN_BRANCH"
git push

A little background to understand what’s going on above: for almost all of our projects, there are two repositories. One is hosted on GitHub and has all the bells and whistles: the CI integration, all the side scripts, and the whole source code, but typically not the compiled parts. The Pantheon Git repository, on the other hand, can be considered an artefact repository, where all the compiled things are committed, like the CSS compiled from the SCSS. On that repo we also have some non-site-related scripts.

So a human being sits in front of their computer with working copies of the two repositories, and the script deploys to the proper environment based on the branch. After the git push, it’s up to Pantheon to do the heavy lifting.
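The branch-to-environment mapping can be sketched as a small shell function. This is a hypothetical example; the branch and environment names are illustrative, not the project’s actual configuration:

```shell
#!/usr/bin/env bash
set -e

# Map the source branch to the Pantheon environment it should deploy to.
target_env_for_branch() {
  case "$1" in
    master)    echo "qa" ;;    # every master build goes to QA
    live)      echo "live" ;;  # explicit releases go to live
    feature/*) echo "dev" ;;   # feature branches land on dev
    *)         echo "Unknown branch: $1" >&2; return 1 ;;
  esac
}
```

The push step then becomes something along the lines of git push pantheon "HEAD:$(target_env_for_branch "$ORIGIN_BRANCH")".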

We would like to do the same, minus having a human in the middle of the process.

Satisfying the Prerequisites

All the developers (on that project) had various tools installed, like Robo (natively or in a Docker container), the Pantheon repository cloned locally, and the SSH keys configured and tested. Inside Travis, however, you have nothing more than the working copy of the GitHub repository and an empty Ubuntu image.

We ended up with a few shell scripts, for instance ci-scripts/prepare_deploy.sh:


#!/usr/bin/env bash
set -e

cd "$TRAVIS_BUILD_DIR" || exit 1

# Make Git operations possible.
cp travis-key ~/.ssh/id_rsa
chmod 600 ~/.ssh/id_rsa

# Authenticate with Terminus.
ddev auth pantheon "$TERMINUS_TOKEN"

ssh-keyscan -p 2222 "$GIT_HOST" >> ~/.ssh/known_hosts

git clone "$PANTHEON_GIT_URL" .pantheon

# Make the DDEV container aware of your ssh.
ddev auth ssh

And another one that installs DDEV inside Travis.

That’s right, we rely heavily on DDEV to be able to use Drush and Terminus cleanly. It also ensures that what Travis does is identically replicable on localhost. The trickiest part is doing an ssh-keyscan before the cloning; otherwise Git would complain about the authenticity of the remote party. But how do you ensure authenticity this way? One possible improvement is to use the https protocol, so the root certificates would provide some sort of check. For the record, it took quite a while to figure out that the private key of our “robot” was exposed correctly, and that the cloning failed because the known_hosts file was not yet populated.

Re-Shape the Travis Integration

Let’s say we’d like to deploy to the qa Pantheon environment every time master is updated. First of all, only successful builds should be propagated to deployment. Travis offers build stages that we can use to accomplish that; in practice, in our case: if linting fails, don’t waste time on the tests. If the tests fail, don’t propagate the bogus state of the site to Pantheon. Here’s Drupal-starter’s .travis.yml file.
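In .travis.yml terms, the stages boil down to something like this trimmed sketch (the stage names and Robo commands here are illustrative; see the real file in Drupal-starter):

```yaml
# Stages run one after another; a stage starts only when every job
# in the previous stage has passed.
jobs:
  include:
    - stage: lint
      script: ddev robo phpcs
    - stage: test
      script: ddev robo test
    - stage: deploy
      # Deploy only on direct pushes to master, never on pull requests.
      if: branch = master AND type = push
      script: ddev robo deploy:pantheon qa
```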

We execute linting first, for coding standards and for clean SCSS, then our beloved PHPUnit tests follow, and finally we deploy to Pantheon for QA.

Machine-Friendly, Actualized Script

So what’s inside that Robo command in the end? More or less the same as before. The shell script became tidier and smarter, and after a while we migrated it to Robo to get managed command execution, formatted output, and error handling out of the box. Just as important, all PHP devs can feel comfortable with that code, even if they are not Bash gurus. Here’s the deploy function. Then your new robot buddy can join the deployment party:

An automated deploy in the Pantheon dashboard

Do It Yourself

If you would like to try this out in action, just follow these steps - this is essentially how we set up a new client project these days:

Do You Need It?

Making Drupal deployments work in a fully automated way takes time. We invested about 80 hours polishing our deployment pipeline, but we estimate that it has saved us about 700 hours of deployment time. Should you invest that time? Dan North, in his talk Decisions, Decisions, says:

Do not automate something until it gets boring…

So, automate your deployments! But don’t rush until you have learned how to do it perfectly. And if you do decide to automate, we encourage you to build on top of our work, and save yourself a lot of time!

May 01 2020

Static sites are the best. They are the most secure and the fastest of sites. They are perfect for anonymous traffic, when you want content editors to have a secure, hidden backend where they can administer the content, while the content itself is served elsewhere.

Having search on top of that can be a bit more challenging. There are solutions for a local search, like lunr.js (and a Drupal module to integrate with it), but they are quite limited. That is, lunr.js creates a local index that some JS can look into, but that is no match for full-blown search engines such as Elasticsearch.

In this blog post I will share a demo website we’ve built as a proof of concept for a much larger site. I’m not going to dwell on the advantages of static sites; instead I’m going to share the high-level concepts that guided us, along with some technical tips. The specific details have nothing to do with Drupal: our client’s site is in Drupal, so it was convenient to build around it, but you could really do this with any language.

Here is the demo repo, and that’s how it looks:


With static sites, deploying and reverting deploys is easy; it’s not much more than git push, or git revert if something went wrong. But what about search? As mentioned, we want to keep using Elasticsearch for things like aggregations (a.k.a. facets), spell checking, etc. But how can we support, for example, a rollback of a deploy, making sure that search only ever covers the content that exists in the deployed static site? Thankfully, Elasticsearch supports index cloning, so we could have something like this:

  1. We would have a single Elasticsearch instance, that will have multiple indices.
  2. The default index is the one Drupal connects to, and it is read-write. You can think of it as the “master” index, from which snapshots are taken.
  3. When we want to create a new static site, we would also create a clone of the default index. That will be a read-only index.
  4. Our JS app, which is in charge of calling Elasticsearch, should know the name of the read-only index, so if we roll back a deploy to a previous version, the JS code connects to the right index.
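The clone step itself is just a couple of Elasticsearch API calls: cloning requires the source index to be write-blocked first. Here is a sketch of those calls; the host and index names are examples, and CURL defaults to a dry run that prints the requests instead of sending them:

```shell
#!/usr/bin/env bash
set -e

ES_HOST="${ES_HOST:-http://localhost:9200}"
SOURCE_INDEX="default"
SNAPSHOT_INDEX="snapshot_$(date +%Y%m%d%H%M%S)"

# Set CURL=curl to execute against a real Elasticsearch instance.
CURL="${CURL:-echo curl}"

# Step 1: cloning requires the source index to be write-blocked.
$CURL -X PUT "$ES_HOST/$SOURCE_INDEX/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.write": true}'

# Step 2: clone it into the read-only snapshot index the JS app will query.
$CURL -X POST "$ES_HOST/$SOURCE_INDEX/_clone/$SNAPSHOT_INDEX"

# Step 3: re-enable writes on the source so Drupal can keep indexing.
$CURL -X PUT "$ES_HOST/$SOURCE_INDEX/_settings" \
  -H 'Content-Type: application/json' \
  -d '{"index.blocks.write": false}'
```

The generated snapshot index name is exactly what gets baked into the static site’s JS, which is how a rollback keeps pointing at the matching index.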

A word about the Elasticsearch instance and its exposure to the outside world. In a similar way to how we place Drupal in the backend, away from the public eye, we can do the same with Elasticsearch. For example, when we host Elasticsearch instance(s) on Google Cloud, we can use a load balancer to provide a public-facing, SSL-terminated URL, which in turn calls Elasticsearch with whatever firewall rules are needed, such as preventing any admin-type requests from hitting the instance, or rejecting requests to the default index that don’t originate from the Drupal backend.

HTTrack vs wget

We actually started our research by looking into Tome, a Drupal module that allows exporting a Drupal site as a static site. While it did work nicely, and I’m sure there can be benefits to using it, I didn’t see any specific need in our case for Drupal itself to provide the export. We might as well use other open source tools, which have been around for quite a few years.

Then began a longer-than-anticipated trial comparing HTTrack and wget. While we did eventually go with wget, it’s worth sharing some of our experience. First we tried HTTrack. We had already used it to help us turn some old sites static, and it did the job well. The amazing Karen Stevenson also wrote a great post about it that goes into more detail.
My impression of HTTrack is that it works surprisingly well for a tool whose site looks as outdated as it does. Documentation is pretty good, although at times lacking concrete examples.

One important thing we did from the get-go was to use a real client site as our example. This immediately surfaced the biggest challenge: exporting a static site quickly. Normally, when archiving a site, it’s totally fine for the export to take even an hour. But if we’re thinking of a live site that is regularly updated as content in the backend changes, having to wait that long is problematic.

This is where we jumped back and forth between HTTrack and wget. Let’s see some of the results we got. I’m going to hide the real URL, so you won’t abuse our client’s site as much as we did! :)

httrack https://example.com -O . -N "%h%p/%n/index%[page].%t" -WqQ%v --robots=0 --footer ''

HTTrack shows download stats.

37 minutes for about 450 MB of saved content (HTML along with all the assets, images, and files).

That attempt was on my local computer, so we spun up a Google Cloud instance to see if executing this from a server with a more powerful internet connection would be much faster, but it wasn’t.

So we decided to exclude all user-generated files (images, PDFs, etc.). The idea is that files would be uploaded to some storage service like AWS or Google Cloud and served from there. That is, when a content editor uploads a new file in Drupal, instead of saving it on the server, it is saved “remotely.”

httrack https://example.com -N "%h%p/%n/index%[page].%t" --display --robots=0 "-*.pdf" "-*.zip" "-*.jpg" "-*.gif" "-*.png"

Doing so shaved quite a lot, and got us down to 20 minutes with 120 MB.

Then we checked whether we could increase the number of concurrent connections. HTTrack has a few options, notably the scarily-named --disable-security-limits, which should allow us to lightly DDoS our own site, or setting a specific number of connections (e.g. -c16). However, we didn’t see a change in the number of connections.

Then we tried updating an existing static site by executing httrack --update. This was already considerably faster, getting us down to 4.5 minutes, and this time we also saw multiple active connections.

HTTrack update with multiple connections.

After that I started looking into wget, running wget --recursive --page-requisites --adjust-extension --span-hosts --convert-links --restrict-file-names=windows --domains example.com --no-parent --reject jpg,jpeg,png,zip,pdf,svg https://example.com/ (see gist for explanation on each option).

The first grab took only 8 minutes, about three times faster than HTTrack! An update took about 5 minutes.

After spending many hours digging in different forums and StackOverflow, I’ve decided to go with wget, but I do think HTTrack should be considered if you go with a similar solution. I believe it’s a “your mileage may vary” situation.

One shortcoming I found with wget was that, unlike HTTrack, I couldn’t tell it under what name to save the HTML files. Instead of /content/foo/index.html I got the file as /content/foo.html. Do you know how I solved it? With a hack as nice as it is ugly: in Drupal I changed the path alias pattern to /content/[node:title]/index.html. I hope someone will point me to a way of getting wget to do it for us.

Drupal's path alias settings.
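An alternative to the alias hack would be post-processing wget’s output, turning every foo.html into foo/index.html. A sketch:

```shell
#!/usr/bin/env bash
set -e

# Move every foo.html (except existing index.html files) to foo/index.html.
fix_html_paths() {
  local root="$1"
  find "$root" -name '*.html' ! -name 'index.html' | while read -r file; do
    dir="${file%.html}"
    mkdir -p "$dir"
    mv "$file" "$dir/index.html"
  done
}
```

The catch is that links already rewritten by --convert-links would still point at foo.html, so this is only half the story, which is why we stuck with the alias pattern.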

Bundle Everything Together

My personal belief is that technical posts should put into their readers’ hands the means to get their hands dirty, experiment, and collaborate. So here you go; here’s the recipe for the demo repo:

It was scaffolded from our Drupal 8 & 9 Starter, and as such you can easily download it and try it yourself, as ddev is the only requirement. All commands are executed from inside the ddev container, so no special installation is required.

Follow the installation steps, and you should end up with a Drupal site and dummy content. The installation should give you an admin link, but you can always grab a new one with ddev . drush uli.

Next, let’s create a static site with ddev robo snapshot:create. The Robo command is responsible for clearing Drupal’s cache, wget-ing the site and massaging it a bit, and cloning the Elasticsearch index; it is even nice enough to ask if you’d like to run a local HTTP server to see your new static site. Notice how the search still works, but if you look inside the network panel, you’ll see it calls a specific index. That’s our “snapshot” read-only index.

Elm app is performing search on the correct Elasticsearch index.

Go ahead and make some changes to your site. Add or delete content, or even switch the theme to Bartik. Run ddev robo snapshot:create once more. Note that you might need to hard-refresh the page in the browser to see the changes.

Re-exported static site, using the Bartik theme.

Elm Application

I’ve written the JS application for the search in Elm. We fetch data from the correct index URL, and then show it in a paginated list. Like any app we’ve written in Elm, It Simply Works™.

As Elm is already bundled with the ddev container, we have a simple Robo command to watch for changes and compile them to the right place inside Drupal.

We do two important things when creating a static site. The first is to find and replace a JS variable that determines whether Elm is operating in a Drupal context or a static-site context. With this knowledge we can enable different functionality; for example, in the Drupal context we could show an Edit link next to each result. The second is to change the URL of the index to the cloned one, so our JS app always calls the right URL.
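That find and replace step can be as simple as a sed invocation over the compiled JS. A sketch, where the variable names and file path are hypothetical, not the actual ones from the repo:

```shell
#!/usr/bin/env bash
set -e

# Flip the compiled JS into static-site mode and point it at the
# snapshot index that was just cloned.
prepare_static_js() {
  local js_file="$1" snapshot_index="$2"
  sed -i.bak \
    -e "s/isStaticSite = false/isStaticSite = true/" \
    -e "s|elasticsearchIndex = \"default\"|elasticsearchIndex = \"$snapshot_index\"|" \
    "$js_file"
  rm -f "$js_file.bak"
}
```

The -i.bak form keeps the sed call portable between GNU and BSD sed, which matters when the same Robo-invoked script runs on Linux CI and macOS laptops.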


I believe our attempts have been successful, and the requirements we had were well answered. Going with a static site provides considerable security, performance, and stability gains. Having JS communicate with a snapshot-specific Elasticsearch index wasn’t too hard, and it’s working nicely. Naturally, in the demo repo the app is quite basic, but it lays the foundation for fancier features.

The biggest challenge, exporting the static site in a speedy manner, boils down to excluding user-generated assets and pages; still, some export time is always to be expected. Finally, having Robo commands executed inside ddev is a convenient way to automate and to share with other devs on the team.

Feb 20 2020

My appreciation for Form API in Drupal is on the same level as my effort to avoid it when it comes to user-facing forms. Both are pretty high. The reasons I love it are that it’s extensible and that security is built in. I’ve worked with a few other frameworks in different languages, and my impression is that Drupal’s Form API is significantly more advanced than any other solution I’ve seen.

The reason I try to avoid it, on the other hand, is mainly that it’s hard to create forms that satisfy end users and meet their expectations. Basically, forms are bulky, and going with a mix of JS/Ajaxy solutions is often a pain. Having a JS form (i.e. some JS widget that builds and controls the entire form) that POSTs to a RESTful endpoint takes more code, but often provides a more streamlined user experience.

Not sure why or how, but over the years we’ve been tasked quite a few times with creating form wizards. They’re frequently used for more complex registrations, for example students or alumni applying to different programs. In the early Drupal 7 days we went with CTools’ wizard, and then switched to Elm (i.e. a JS form) along with RESTful endpoints. Drupal 8, however, has one major feature that makes it very appealing to work once more with Form API: form modes.

This post has an example repo that you should be able to reliably download and run locally thanks to DDEV. I will not try to cover every single line of code, but rather share the concepts behind our implementation, with some references to the code. The audience is intermediate-level Drupal developers, who can expect to have a good sense of how to use form modes to build wizards after reading this post and going over the code.

Before diving in, it’s important to recognize that “wizards” come in many flavors. I personally hold the opinion that a generic module cannot be (easily) built to accommodate all cases. Instead, I look at Drupal 8, with its core and a couple of contrib modules, as the “generic” solution for building complex wizards sprinkled with lots of custom business logic.

In our case, and for demonstration purposes, we’ve built a simplified “visa application” wizard, the kind you may find when visiting other countries. As an aside, I do wish some of the countries I’ve visited lately had implemented a similar solution. The experience of their wizards did bring me to tear out some of the remaining hair I’ve still got.

Our objectives are:

  1. A user can have only a single Visa application.
  2. Users should have an easy overview of the state of their application.
  3. Sections (i.e. the wizard pages) must be completed before the application can be submitted for final processing; however, it doesn’t have to happen in a single go. That is, a user can save a partial draft and return to it later.
  4. After a user “signs” the last section, the application is locked. The user can still see what they have submitted, but cannot edit it.
  5. A site admin should have easy access to view and edit existing applications.

With the above worthy goals, we were set for the implementation. Here are the two main concepts we’ve used.

Sections as Form Modes

In Drupal 7 we could define view modes for a node (well, for any entity, but let’s be node-centric for now). For example, a “Teaser” view mode would show only the title and a trimmed body field, and a “Full” view mode would show the entire content. The same concept now applies to node forms. That is, the node add/edit form we know is in fact a “Default” form mode, and with Drupal 8 we can define multiple form modes.

That’s pretty big, because it means we can have multiple forms, showing different fields, all accumulated under a single node.

For each wizard “page” we’ve added a section form mode: section_1, section_2, etc. If you have the example repo running locally, you can see it in:


Form modes configuration page.

Next we have to enable those form modes for our Visa application content type.


Enable form modes for the content type.

Then we can see those form modes enabled, allowing us to set up fields under each section and disable others.


Configure fields under each form mode.

It’s almost enough. However, Drupal still doesn’t really know about those sections, so we have to explicitly register them.

The next ingredient in our recipe is having our wizard page controller recognize which section needs to be rendered. Drupal 8 has made this quite elegant, and it requires only a few lines. So now, based on the URL, Drupal serves the node form with the correct form mode.

Faux Required is Not Required

One of our goals, as often requested by clients, was to allow saving the application in a draft state. We can easily do that by not checking the “required” option in the field settings, but from a UX perspective, how would a user know the field will eventually be required? So we decided to mark the “required” fields with the usual red asterisk, along with a message indicating they can still save a draft.

As for the field config, there is a question: should the fields be marked as required and then made un-required in the sections, or the other way around? We decided to make them optional, which has the advantage that a site admin can edit a node via the usual node edit form, in case of some troubleshooting, without being required to fill in all the fields (don’t worry, we’ll cover the fact that only admins can directly access the node view/edit/delete in the “Access” section).

So to reconcile the opposing needs, we came up with a “faux-required” setting. I entertained the idea of calling it “non-required required” just to see how that would go, but yeah… Faux-required is a third-party setting, in field config lingo.


Faux required setting in the field configuration page.

By itself it doesn’t do much; it’s just a hook for our custom code. In fact, we have a whole manager class that helps us manage the logic and provides a proper API to determine, for example, the status of a section: is it empty, partially filled, or completed? To do that we basically ask Drupal for a list of all the faux-required fields that appear under a given form mode.

Access & Application Status

We need to make sure only site admins have direct access to the node, and we don’t want applicants to be able to edit the node directly. So we have a route subscriber that redirects to our own access callback, which in turn relies on our implementation of hook_node_access.

As for knowing which status the application is in, we have a single required field (really required, not faux-required) called “Status,” with New, Locked, Accepted, and Rejected options. Those are probably the most basic ones, and I can easily imagine how they could be extended.

With this status, we can control whether the form is editable by the user, or disabled and without submit buttons. Being able to write $form['#disabled'] = TRUE; and know it will disable every element on the form is one of the prettiest parts of Form API.

Theming that Freaking Required Symbol

The subtitle says it all. Theming, as always and forever, is one of the hardest parts of the task. That is, unlike writing some API functions and having the feeling “that’s the cleanest way possible,” with theming I often have the feeling of “it works, but I wonder if that’s the best way.” I guess the reason is that theming is objectively hard to get “right.”

Anyway, with a bunch of form alters, and preprocess and process callbacks, we were able to tame the beast.

Field Groups

Many fields across different sections can make the job of site admins (or whoever is responsible for processing or supporting the applications) quite hard. Field group is a nice way to mimic the structure of the sections in both the default node form and the node view.


Wizard sections appear as fieldsets in the regular node edit page.


Form modes in Drupal 8 are a very useful addition, and a cornerstone of our implementation of wizards. I encourage you to jump into the code of the example repository and tinker with it. It has lots of comments, and if you ignore some of the ugly parts, you might find some pretty ones hidden there: extending the complex inline entity form widget with custom logic to be more user-friendly, using Form API states for the “other” option and getting that to work with faux-required, implementing a theme negotiator, our DDEV config, and more…

And finally, if you spot something you have a better idea for, I’d love to get PRs!

Jan 02 2020

A recent trip to Rwanda led me to build a POC (proof of concept) with a familiar stack, only in a very different structure.

To better understand why the POC was built that way, I should give you the backstory.

The Backstory

In 2016, I was invited to present about Drupal & Elm at DrupalCamp Tokyo. I always like to bring this fact up in any kind of conversation, but this time there’s even a reason beyond my usual bragging rights: the flight from Israel to Tokyo is terribly long. Twenty-four hours door-to-door kind of long.

As it happened, a short time before my flight, Adam, Gizra’s US Director, had virtually dropped a PDF on my table. I was bracing myself for yet another long RFP (request for proposal) with an impossible spec and an even less possible timeline. I was surprised to find that was not the case. The PDF was forty-something pages, with a wireframe per page and some notes, showing the flow of a rather interesting app.

Wireframe from the spec

Three years later I still refer to those pages as the best spec we’ve ever received. The people behind those wireframes were Dr. Wendy Leonard and her Ihangane team. They were planning an electronic medical record for an HIV prevention program in Rwanda.

I was really impressed. Sure, the wireframes were rougher than usual, but they did exactly what they were supposed to do. The team was smart enough not to rush into development; in fact, they even printed out the spec pages, went into the field, sat with nurses, and let them click through the screens. The printed screens. They clicked on paper!

Anyway, did I ever mention I was invited to Tokyo in 2016?

That long 24-hour flight. I finished my book (“Ancillary Justice”), watched a movie (“Wreck-It Ralph,” since for some reason I love watching cartoons on planes), and there were still many hours before my arrival. So I took out my laptop, spun up a Drupal backend and an Elm frontend, and the first POC for Ihangane’s app, called “E-Heza,” was born in the sky.

The Good Ol’ Days

And so began the development: a decoupled architecture with a RESTful server in the backend and Elm in the front. Like any project, we had different challenges, deadlines, deadlines we missed, scope creep, managing the project, managing the client, managing expectations, delivering, hot-fixing, being proud of what we accomplished, accruing technical debt. Rinse and repeat.

While we were developing, the Ihangane team was hard at work rolling out our work in the field. In fact, they got it going in a dozen health centers in Rwanda. And while it worked for the most part, there was one challenge we knew was lurking and waiting: offline.

Have you ever worked with non profits, or NGOs, and they tell you they have a need for a site, and some of its users are in developing countries where infrastructure isn’t always good, internet is slow, and so we must take care of a light theme and shave every single byte possible, but then in the design they insist on hero banners with huge images, and every possible JS transition?

Well, we have. But this was not such a case. This time it was for real: all health centers had internet, but it was not rare for connections to drop for days on end. And so we decided it was time to implement proper offline support. The reason I say “proper” is that we actually had offline support from the get-go, but it was very crude. Only after the nurses had checked all the patients and a “session” was done did the nurses manually start a sync process. That wasn’t very nice, so we decided to go with “opportunistic syncing,” where data is pushed and pulled whenever an internet connection is available.

Offline & Service Workers

I’m going to sidestep and go technical here: getting offline support was admittedly a bit easier than I had feared. It is surely the most complicated part of the system, but we were able to keep it confined to the edges of the front-end application.

Elm, in all its greatness, has already taught us to think about interacting with JS the same way we would with a remote server. So instead of having Elm contact the remote server directly, we placed a service worker in between as a middleware.

That means Elm keeps using HTTP requests the same way it always did, but now the service worker intercepts those requests and decides: are we online? Then call the server and sync the data. Or are we offline, and should use the data from IndexedDB? To make wiring the offline part to Elm even less disruptive, we structured the responses from the service worker to be similar to the ones we used to get from the server. In other words, the change was minimal, and it looks beautiful.

But still, as one can imagine, adding this layer brings a new set of challenges. Our app isn’t a news reader, where the worst-case scenario is that you cannot read your news. No, our app is all about creating new content, and the worst-case scenario for us is losing patient data.

The Rwanda Visit

Back to the main narrative. I was invited to attend a workshop that Ihangane organized in Rwanda (you know, like that one time I was invited to Tokyo in 2016). It was a two-day workshop where different stakeholders came and had discussions, not only around Ihangane’s solutions but also in general: how to improve health care in Rwanda.

Some parts were boring - the usual amount of suits talking about things they don’t necessarily know much about. But there was one topic that I found especially interesting: the Government of Rwanda is spearheading a “Health Cloud” effort. This project is a huge undertaking. To give just one example of the many challenges involved - currently in Rwanda, when a baby is born, they don’t always get a national ID. In other words, there’s no unique identifier for every person - so the challenges are somewhat larger than just which methods are best for building this platform.

But what really caught my attention is how dedicated they were to making sure no vendor lock-in would happen under their supervision. I really appreciated it. I can only imagine how many private, for-profit organizations are trying to get a foot in the door in those emerging markets. I’m not saying all for-profit organizations are evil, I’m just not naive.

Rural Rwanda

I love saying “Rural Rwanda.” With my Israeli accent, it’s a real tongue twister for me. I also loved having a one-day visit in the outskirts of Kigali, visiting nurses in the Muhondo health center and seeing our system in action.

Spent a whole day in rural Rwanda, watching - in action - an @elmlang web-app with a #Drupal backend we've built for @ihanganeproject.

No marketing landing pages, no sale funnels, no targeted ads. Just a web-app that helps provide a better health care to mothers and their babies pic.twitter.com/PnLqV0vpSQ

— Amitai Burstein (@amitaibu) November 14, 2019

Make sure your speakers are on. That’s what I call true “stress testing” - when you have a room with about a hundred mothers and their (cute, yet screaming) babies, every extra second the app takes, is another second people are waiting.

What a great day it was for us - watching with our own eyes, seeing the good parts of the app, finding a couple of bugs, but mainly spotting workflows that could be improved, to benefit the team operating it as well as the patients. When you see these women who have walked for hours with their babies on their backs to reach the health center, “try to refresh the page” or “let’s wait a couple minutes for the system to fully sync” becomes less acceptable. I was very proud to see how we were able to be part of a thing that really does good for much less fortunate people.

Letting it All Sink In

  • We have mothers with HIV or high risk for HIV.
  • We have their babies, that need to be examined, and be checked for different signs of malnutrition.
  • We have no internet connection in some areas.
  • We need to support Android devices, as the measurements taken in the field are being registered in those devices.
  • We have this “Health Cloud” idea, that we want to help think about. We don’t want to follow a typical enterprise kind of thinking, but rather use our own more radical one.

Around the time of my visit to Rwanda, I started my affair with plain text accounting with hledger. I think this quote sums it up nicely:

Accounting data is valuable; we want to know that it will be accessible for ever - even without software. We want to know when it changes, and revision-control it. We want to search and manipulate it efficiently.

It took my brain about a month to switch the word “Accounting” with “Medical”, as in:

Medical data is valuable; we want to know that it will be accessible for ever …

Then another thing happened. Did I ever tell you about my invitation to DrupalCamp Tokyo in 2016? Well, while I was in Rwanda I was invited to DrupalCamp Tokyo 2019.

Once again I was on a plane; I finished my book (“The Sudden Appearance of Hope”), watched a movie (“Aladdin”, because cartoons are for kids!), and still had many hours ahead of me. Up in the sky - and for the sake of a good story I’d like to believe it was in the exact same spot where the first POC was conceived - a new one started to form.

Same Ingredients, Different Mix

What if we didn’t have a DB, and stored medical records in plain text, the same as we’ve been doing with our accounting data?

What if Git, faithful and efficient Git, was taking care of syncing and working offline?

What if we used Termux - to have a terminal on Android devices, and thus open the door to running a server locally on each device?

I felt those questions demanded answers. The result is in this repo, and the most important part of it is the data. All the information we have is stored in YAML files. For example, here’s how we define the information of a mother, and the relation to her children:

id: 000f8e43-d638-49fa-8c9a-3429bb819f21
first_name: Aurelia
last_name: Sherlyn
children:
    - ad1bce42-69c7-4d55-82e1-129fb4b91e87
    - b16df0be-ad66-4901-b7a1-2aa720c9968e
    - 0c9a59e4-dd51-40d6-8978-157fc9b65909

That’s it really. Thank you for reading, bye bye.

No, seriously, there’s some more about the software that helps view and manipulate the data (which is probably what you are interested in), but I think that having the data in a human-readable format (as opposed to inside DB tables) is the biggest thing in this POC.

Think about how you build a feature in a traditional website:

  1. The client has a feature idea.
  2. You sketch out some wireframes, the client approves them, and they will be out of the loop until step (6).
  3. You do a technical design.
  4. You add a table in your DB.
  5. You write some front end code.
  6. You deploy to TEST - and there the client sees the outcome for the first time.

But with plain text medical records, the “client” can write for themselves how the result of measurements should look. Any kind of conversation would be around that piece of data. Even if there’s no software on top of it, you know what the measurements are just by reading it.

version: 1
group_meeting: 2020-01-30-morning
# We have a single config file in the root, that has the measurements' units.
height: 125
weight: 12
# Mid-Upper Arm Circumference, known as MUAC.
muac: 30

For me, that’s the biggest win. In fact, maybe that’s the “no vendor lock-in” solution for Health Cloud, we can start building on top!
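To make that point concrete, here is a toy illustration (in JavaScript, and not part of the actual POC) of how little software it takes to read such a flat record - the data stays usable even with an almost trivial layer on top:

```javascript
// Toy parser for a flat "key: value" record like the measurement file
// shown in this post. Deliberately naive: no nesting, no lists - just
// enough to show that human-readable files need very little software.
function parseFlatRecord(text) {
  const record = {};
  for (const line of text.split('\n')) {
    const trimmed = line.trim();
    // Skip blank lines and comments.
    if (!trimmed || trimmed.startsWith('#')) continue;
    const idx = trimmed.indexOf(':');
    if (idx === -1) continue;
    const key = trimmed.slice(0, idx).trim();
    const value = trimmed.slice(idx + 1).trim();
    // Keep whole numbers as numbers, everything else as strings.
    record[key] = /^\d+$/.test(value) ? Number(value) : value;
  }
  return record;
}
```

Even if this tiny layer disappeared tomorrow, the records would still be readable by a nurse, a doctor, or the next tool someone writes.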

The Disposable Software

I wanted to build the local server in Haskell, but getting it running on Termux seemed more complicated than a simple pkg install php. So I decided to go with Symfony, and it was indeed a good decision. Setting everything up worked nicely, with a lot of copy & paste from the docs, which are great. I’d say that only the Normalizer part (the equivalent of encoders/decoders in Elm and Haskell) wasn’t fun to work with. Maybe because I’m so used to the guarantees of Elm and Haskell, working with a non-typed language felt less reliable.

Since PHP has its own web server, all we have to do is spin up the web server, and then use the browser to connect to that local server. The server is not meant for production, but in our case we have only a single client, so it actually works great. We can view data, we can add or change data, and we can even upload images directly from the device.

Adding measurements via Browser

Looking at the Git diff, we’d see the commit that resulted from submitting the form.

version: 1
group_meeting: 2020-01-30-morning
- height: 125
+ height: 130
weight: 12

I also wanted to show how the fact that the app is in PHP doesn’t mean we are tightly coupled with it. We only care about the data, and that is language agnostic. So I came up with a tiny Haskell utility: you give it a path to a data folder, and it creates 100 fake mothers and children, so we could test the app with more data.

Furthermore, since we’ve changed the structure of the stack, and no longer need to think about a server as the entry point for getting or manipulating data, the next thing I did was create a console command to show a listing of mothers participating in a single group meeting. Below we can see the fake data we created with the Haskell utility.

A list of mothers shown in the terminal

Final Thoughts

I think the biggest accomplishment here is stepping outside of the usual stack structure, and recognizing the unique situation that we are facing. This structure isn’t going to be right for the majority of cases, but I’d argue that there is a good portion of problems that could be solved with such a stack. Obviously it’s not a panacea, and it brings different problems; on the other hand, I find it solves some existing ones quite nicely.

In terms of syncing and access control - that’s baked deep inside Git, not to mention that having it as distributed, version controlled, and (sorry for the buzzword) “blockchain”-ed has many merits. In terms of security of running a local PHP server - since it’s running locally, and it’s not open to the outside world, it’s nothing more than a nicer interface to editing files. But obviously, by using Symfony and applying best practices - we make sure we write secure code.

I don’t know what’s next for this POC, as it’s not only for me to decide. But I do hope we’ll get a chance to move forward some more with it, to reach a point where we can test it out in the field. If one day a nurse in a tiny room with many crying babies tells me it improves her day-to-day work - that would be well worth it.

Dec 23 2019

Making multilingual sites is hard. I’ve been using Drupal since version 5 and I can say a few things about the evolution of Drupal multilingual capabilities:

  • First, Drupal 8 is – in my opinion – the first version of Drupal where someone could say that multilingual works, pretty much out of the box.
  • Second, the documentation about how to deal with different scenarios is quite good.
  • And third, from a user experience perspective, translating the user interface of a site is really hard.

In this post we will talk about the third point and what we did to manage that complexity.

Our Current Scenario

We are building a really complex site, and the main challenges we faced regarding multilingual are:

  • The site is a multisite-style architecture: one database, using Organic Groups.
  • Each group represents a country, and each country needs its own site in one or more languages.
  • We have several variations of the same language depending on the region this language is spoken in.
  • We don’t want to let content editors translate the site directly from the UI.
  • We don’t speak all the languages the site is in.

The last item is quite relevant: when you don’t speak a language, you cannot even be sure that the string you are copying into a textbox says what it should.

The First Attempt

We started with a translation matrix to store all the translations. A simple Google drive spreadsheet to track each string translation in each language.

Each column uses the language code as a header.

Using a tool to convert spreadsheets into PO files, we get each translation file: fr.po, es.po, pt.po.

We used wichert/po-xls to achieve this with good results.
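For reference, the generated files follow the standard gettext PO format; a fragment of a hypothetical fr.po (strings invented for illustration) would look roughly like this:

```
msgid "Contact us"
msgstr "Contactez-nous"

msgid "Downloads: @count"
msgstr "Téléchargements : @count"
```

Each spreadsheet row becomes one msgid/msgstr pair per language file.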

Not So Fast

This initial, somewhat naive, approach had a few problems.

  • Drupal string translations are case sensitive. This means that if you made a typo and wrote Photo instead of photo the translation will fail.
  • Some strings are the result of a calculation. For example, Downloads: 3 is actually managed by Drupal as Downloads: @count.

But the more complex item is that Drupal 8 has two ways to translate strings. The first one is inherited from Drupal 7: the one that makes use of the well-known t function, for example t('Contact us.').

The other one is a new way that allows site builders to translate configuration entities.

The two scenarios that allow translation of a Drupal site.

Translating Configuration Entities is Really Hard

To translate configuration entities, you need to identify which configuration needs translation, and find the exact part relevant to you. For complex configuration entities like views, this could be quite hard to understand.

Even for an experienced site admin this can be hard to understand.

Another problem that we had to solve was the vast amount of configuration alternatives you have when dealing with a medium-size Drupal site.

Each category has a lot of items to translate.

It was clear to us that in order to translate all those items we needed to find another way.

More problems… Identifying Which Strings to Translate is Hard

One thing to consider when dealing with Drupal translations is that it’s not easy to identify if a string is displayed somewhere in the frontend or if it is only a backend string.

Translating the entire codebase may not be a viable option if you want to keep a short list of translations reviewed by a group of people. In our case, it was important to make sure that translations are accurate, and that translators do not feel overwhelmed.

We don’t have a great solution to this problem yet. One of the strategies we used was to search for all the strings in twig templates and custom modules code using a grep search.

egrep -hro "[\>, ]t\('.*'\)" . | cut -c 5-   # Get strings inside ->t(...) and t(...)
egrep -hro "{{ '.*'\|\t" .                   # Get twig strings '....'|t
egrep -hro " trans .*" .                     # Get twig strings inside trans

However, as we figured out later by reading the documentation, twig strings cannot be used as a source for translations. Internally, Drupal maps those strings back to the regular use of t('strings').

This means that strings like:

{% trans %}Copyright {{ year }}{% endtrans %}

Are actually converted to

t('Copyright @year')

And that last string is the one you should use as source of the translation.

At the end, we cleaned up the spreadsheet list using visual inspection, and so far it has been working fine.

How We Solved the Problems

To recap the problems we had:

  • We did not want to translate all the available strings.
  • We did not know all the languages, therefore copy and pasting was a risk.
  • Translators were expecting to have a reduced number of strings to translate.
  • Configuration translations are quite complex to track.

As we mentioned before, using the xls-to-po tool we were able to obtain the PO files for part of the strings we needed to translate.

We also used drush_language to automate the process.

drush language-import --langcode=fr path/to/po_files/fr.po

This little snippet iterates over all of the po files in the po_files directory and imports the language using the drush command mentioned above.

find po_files -type f -name '*.po' | xargs basename --suffix=.po | \
xargs -I@ drush language-import --langcode=@ po_files/@.po

The spreadsheet has the message ID in the first column, followed by the language codes of the system

By using conditional cell colors, we can quickly identify which translations are pending.

Solving the Configuration Translation Problem

The second part of our problem was a bit more tricky to fix.

We used a custom script to get all the config entity strings that were relevant to us.

Here is a simplified version of the script.

$prefix = 'views.view.custom_view';
$key = 'display.default.display_options.exposed_form.options.reset_button_label';

$configFactory = \Drupal::service('config.factory');
$list = $configFactory->listAll($prefix);

$rows = [];

foreach ($list as $config_name) {
  $columns = [];
  // Add the unique identifier for this field.
  $columns[] = $config_name . ':' . $key;

  // Get the untranslated value from the config.
  $base_config = $configFactory->getEditable($config_name);
  $columns[] = $base_config->get($key);

  $rows[] = $columns;
}

If you wonder how to get the $prefix and $key, they are obtained by inspecting the name of the field we want to translate in the Configuration Translation UI.

You need to inspect the HTML of the page, see the name attribute.

We print the result of the script to obtain a new CSV file that looks like this

The first column is a unique id that combines the prefix and the key.

Then, we copy and paste this CSV file as a new tab in the general translation matrix, and complete the header with the rest of the languages’ translations.

Finally we use a spreadsheet formula to find the translation we want for the languages we are interested in.
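The exact formula was in a screenshot; a hypothetical equivalent (assuming a Strings tab with message IDs in column A and one language per column) could be something along these lines:

```
=IFERROR(VLOOKUP($A2, Strings!$A:$Z, MATCH(B$1, Strings!$1:$1, 0), FALSE), "")
```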


This will search for a match in the Strings matrix, and provide a translation.

Spreadsheet magic.

Final step: Importing the Configuration Strings Translation Back to Drupal

Once we have all the translations we need, we export the CSV file again and use this other script (simplified version) to do the inverse process:

use Symfony\Component\Serializer\Serializer;
use Symfony\Component\Serializer\Encoder\CsvEncoder;
use Symfony\Component\Serializer\Normalizer\ObjectNormalizer;

$filename = 'path/to/config_translations.csv';

$serializer = new Serializer([new ObjectNormalizer()], [new CsvEncoder()]);
$languageManager = \Drupal::service('language_manager');

$data = $serializer->decode(file_get_contents($filename), 'csv');

foreach ($data as $row) {
  $name_key = array_values($row)[0];
  list($name, $key) = explode(':', $name_key);

  // The languages we care about start after the second column.
  $languages = array_filter(array_slice($row, 2));

  foreach ($languages as $langcode => $translation) {
    $config_translation = $languageManager
                            ->getLanguageConfigOverride($langcode, $name);
    $config_translation->set($key, $translation);
    $config_translation->save();
  }
}

Some Other Interesting Problems We Had

Before finishing the article, we would like to share something interesting regarding translations with contexts. As you may know, context allows you to have variations of the same translation depending on, well… context.

In our case, we needed context to display different variations of a French translation. In particular, this is the string in English that we needed to translate to French:

Our organization in {Group Name}

In France, this translates into Notre organisation en France. But if you want to say the same for Canada, due to French grammatical rules you need to say Notre organisation au Canada (note the change from en to au).

We decided to create a context variation for this particular string using context with twig templating.

{% trans with {'context': group_iso2_code} %}
Our organization in {{ group_name }}
{% endtrans %}

This worked ok-ish, until we realized that it affected all the other languages. So we needed to specify the same translation for each group even if the language was not French.

This is not what we want...

After some research we found the translation_fallback module but unfortunately it was a Drupal 7 solution.

Long story short, we ended up with this solution.

{% if group_uses_language_context %}
  {% trans with {'context': country_iso2_code} %}
    Our organization in {{ group_name }}
  {% endtrans %}
{% else %}
  {% trans %}Our organization in {{ group_name }}{% endtrans %}
{% endif %}

This basically provides two versions of the same string; if the group needs some special treatment, we have the chance to override it. Luckily for us, xls-to-po has support for strings with context. This is how we structured the translations for strings that require context:

CA, in this case, is the ISO code for Canada


For us, this is still a work in progress. We will have to manage around 20 or more languages at some point in the project. By that point, having everything in a single spreadsheet may not be maintainable anymore. There are other tools that could help us to organize source strings. But so far a shared Google Sheet worked.

We still use configuration management to sync the strings in production. The snippets provided in this post are run against a backup database, so we can translate all the entities with more confidence. Once we run the script, we use drush config:export to save all the translations to the filesystem.

Nov 15 2019

Recently we got an exciting task: to scrape job descriptions from various web pages. There’s no API, no RSS feed, nothing else, just a bunch of different websites, all using different backends (some of them run on Drupal). Is this task even time-boxable? Or just too risky? From the beginning, we declared that the scraping tool we would provide works on a best-effort basis. It will fail eventually (network errors, website changes, application-level firewalls, and so on). How do you communicate that nicely to the client? That’s maybe another blog post; here, let’s discuss how we approached the problem to make sure that the scraping tool degrades gracefully, and that in case of an error, debugging is as simple and as quick as possible.


The two most important factors for the architectural considerations were the error-prone nature of scraping and the time-intensive process of downloading something from an external source. On the highest level, we split the problem into two distinct parts. First, we process the external sources and create a coherent set of intermediate XMLs for each of the job sources. Then, we consume those XMLs and turn them into content in Drupal.

So from a bird’s-eye view, it looks like this:

Phase I: scheduling and scraping
Phase II: content aggregation

That was our plan before the implementation started; let’s see how well it worked.



You just need to fetch some HTML and extract some data, right? Almost correct. Sometimes you need to use a fake user-agent. Sometimes you need to download JSON (that’s rendered on a website later). Sometimes, if you cannot pass cookies in a browser-like way, you’re out of luck. Sometimes you need to issue a POST to view a listing. Sometimes you have an RSS feed next to the job list (lucky you, but mind the pagination).

For the scraping, we used Drupal plugins, so all the different job sites are discoverable and manageable. That went well.

What did not go so well: we originally planned to use custom traits, like JobScrapeHtml and JobScrapeRss, but with all the vague factors enumerated above, there was a better alternative. Goutte was written to handle all the various gotchas related to scraping. It was a drop-in replacement for our custom traits, even for sites that expected more browser-like behavior (i.e. no more strange error messages in response to the HTTP request).

Content synchronization

For this matter, we planned to use Feeds, as it’s a proven solution for this type of task. But wait, only an alpha version for Drupal 8? Don’t get scared, we didn’t even need to patch the module.

Feeds took care of the fetching part, using Feeds Ex, we could parse the XMLs with a bunch of XPath expressions, and Feeds Tamper did the last data transformations we needed.

Feeds and its ecosystem are still the best way to do recurring, non-migrate-like data transfers into Drupal. The only downside is the lack of extensive documentation, so check our sample in case you would like to do something similar.


This system has been in production for six months already. The architecture proved to be suitable. We even got a request to include the same jobs on another Drupal site. It was a task with a small time box, as we could just consume the same set of XMLs with an almost identical Feeds configuration. That was a big win!

Where we could have done a bit better was the logging. The first phase, where we scraped the sites and generated the XML, initially contained only minimal error handling and logging. Gradually, over the weeks, we added more and more calls to record entries in the Drupal log, as network issues can even be environment-specific. It’s not always an option to simply replicate the environment on localhost and give it a go to debug.

Also, in such cases you should be the first one who knows about a failure, not the client, so central log handling (like Loggly, Rollbar, or you name it) is vital. You can then configure various alerts for any failure related to your scraping process.

However, when we got a ticket that a job was missing from the system, the architecture again proved useful. Check the XMLs first: if the job is in the XML, we know it’s somehow a Feeds-related issue; if it’s not, let’s dive deep into the scraping.

The code

In the days of APIs and the web of things, sometimes it’s still relevant to use the antique, not-so-robust method of scraping. The best thing that can happen is to re-publish the data in a fully machine-consumable form publicly (long live the open data movement).

The basic building blocks of our implementation (ready to spin up locally using DDEV) are available at https://github.com/Gizra/web-scraping; forks, pull requests, and comments are welcome!

So start the container (cd server && ./install -y) and check for open positions at Gizra by invoking the cron (ddev . drush core-cron)!

Oct 18 2019

This post is going to be somewhat unusual, as it’s being written by four people. It’s going to be on a single topic (kind of): having groups functionality under a single code base and database.

It’s notably about Organic Groups for Drupal 8, but not only. In fact, I think that at this point in time, my story about it is just one out of three. Pieter Frenssen and Maarten Segers, the co-maintainers, have their own voice.

But to make this one even more complete (and interesting), we should hear also Kristiaan Van den Eynde’s voice – the author of Group – a competing/complementing module.

Coincidentally, and apart from me, all these people heavily invested in group functionality are Flemish. So, if you ever wondered how the story of “Three Flemish and an Israeli walk into a bar” goes… here’s your chance.

Amitai Burstein

Ten years ago, I did my very first DrupalCon presentation about how the UN approached Gizra, my company, to ask us to build them a Drupal site, powered by OG to serve all UN country members.

The presentation video is of poor quality, which is in line with my poor acting and low talent for imitations. I watched it just recently, and while I did occasionally laugh at my own past-me jokes, I was cringing at the rest. I sure had a lot of nerve at that time. I remember that I came to it with a “make it or break it” attitude.

I think it succeeded, in the sense that overnight I – along with Gizra – became better known in the community. It surely helped land a few big contracts. But still, in one of those “I didn’t see this coming” cases reality is so good at, in 2019 Gizra is really building that UN site, and it’s really powered by OG8. So yeah, now in “Truth or Lie” games I can say “I’ve built a site for Kim Jong-un” and get away with it.

Here are some of my thoughts about Drupal, OG8 and being a maintainer of a popular module:

  1. My concept of group functionality has not changed much in the past ten years. Framing the problem is quite simple: have permissions and membership on the group level, similar to how one has it on the site as a whole. The implementation, or rather the flexible implementation is where it gets complex.

I’d argue that one can build very hard-coded group functionality in a short time, in a similar way that I’m constantly tempted to think I could build an e-commerce site with hard-coded and limited functionality in a short time. Maybe I could. But nowadays I know it often ends badly. OG’s value (or, for that matter, Group’s as well) isn’t just the code itself. That could change from one version to another. The biggest value is what I like to call a “codified knowledge base.”

Almost everything I and the rest of the co-maintainers know, is somewhere in the code. It’s not just about having the right IF and ELSE. It’s making sure all those pieces of knowledge and lessons learned, are dumped from our brains into a more Git-friendly version.

  2. Drupal – the CMS is quite amazing. It’s surely one of the best CMSs out there. But like every big project, it has many moving parts.

I believe that my biggest concern has always been trying to avoid breaking stuff once it is released. Automatic testing is a big part of that, and having quite extensive test coverage is what allows us to code and refactor. Over the years, I have grown more and more fond of and familiar with statically-typed languages and friendly compilers. As a result, I’ve become more and more afraid of doing those refactors in PHP.

Since I’m not a purist, and rewriting Drupal to use [enter-your-favorite-statically-typed-language] is not an option, I’ve also learned to overcome these fears. Write code; Write tests; Cross fingers; Rinse and Repeat.

  3. Drupal – the community is quite amazing. It’s surely one of the best communities out there. No buts.

While I have been less involved with the actual coding of OG8 in the past couple of years, I keep reading almost every single PR. This is such a great learning experience for me: from going over Pieter’s well-thought-out, well-implemented, and crazily-tested PRs, to learning Maarten’s techniques for code review – manual and automatic ones.

It’s also interesting to see what I’ve felt for some time now, being put into numbers in Dries’s post Who sponsors Drupal development? :

Most contributions are sponsored, but volunteer contributions remains very important to Drupal’s success.

OG8, unlike in the days of OG7, isn’t being built by someone(s) spending their evenings writing features and fixing bugs for community “strangers.” It’s done by senior developers working for organizations that sponsor their work (and of course, some of their spare time, which I value even more!)

This means that currently OG8 is a snapshot of different needs, and almost nothing more. As crazy as it may sound, the UI for setting permissions for OG roles isn’t there yet. It’s surely needed by others, but not enough for a big organization to sponsor it. I suspect this means that while in the past OG was adopted by site builders, nowadays it is more oriented towards developers. I’d like it to eventually allow site builders to set it up – but evidently I don’t want it enough to do it in my spare time.

  4. I’ve been with OG for a long time – since Drupal 4.7. First as a site builder; then I slowly started to learn how to code and submitted patches; then I became the maintainer; and then I let others join the ride.

I love the fact that I was able to evolve inside this little island, and it’s still an interesting journey. Before Drupal 8 was released, for a year or two I felt a bit bad for not having it in me to give OG8 more love. I was noticing how the Group module was gaining momentum, while OG wasn’t advancing as fast as I was used to. It was probably one of the very few times that contributing to open source (or the lack of it) caused me some inconvenience. I wasn’t burned out or too stressed; thinking about OG just didn’t give me as much pleasure as it once did.

But during those two years I also made some more significant changes in my life/work style: starting to work from home (which I love); working fewer hours (which I adore); and training 5-6 times a week (which I can’t stop talking about).

I think that every maintainer of a popular module reaches this crossroads at a certain point, asking themselves how to continue. Finding the spot I currently have – which is more of “let’s make sure the ship is heading the right way, and not falling into pitfalls I’ve known for so long” – is, I believe, valuable for the project, but also for me as a person.

Pieter Frenssen

If I made a list of all the skills I have learned in a decade of working with Drupal, then “contributing to open source” would be right at the top. I have been an enthusiastic user of open source software for many years, but I thought it would be impossible for a self-taught programmer like me to make any meaningful contributions to the software packages I was using every day. Among a bunch of other software, I used Drupal to build my websites, but – apart from reporting the occasional bug in the issue queue – I did not make any contributions. That changed at my first Drupal conference.

Like many people in our field, I originally worked in a different industry. I am an audio engineer by trade, and I spent the first 15 years of my career in recording studios and on concert stages. Programming was a fun hobby and a great way to pass the long waiting periods between concerts. Over time I got more and more requests to create websites, and at some point I realized it was time to take my hobby more seriously. I needed to get in touch with other programmers, and a conference seemed like a great way to do this.

Some unit tests are done by grabbing a microphone and shouting “Test 1, 2, 3!”

DrupalCon Copenhagen was the first tech conference I attended, and it was an eye-opening experience. Here I met the people whose names I saw in the issue queue, and not only were they giving presentations to share their knowledge, they were also actively working on building the software. And they were very happy to include newcomers. I was suddenly taking part in discussions and was invited to help fix some bugs in the upcoming Drupal 7. It was very interesting and tons of fun.

Back home I started to regularly contribute small fixes for bugs that I stumbled upon. It was in this period that I made my first contributions to Organic Groups: I fixed some small bugs like PHP notices and mistakes in the help texts. But apart from these my involvement with OG in the Drupal 7 era was as an end user. I made several projects with it, and interestingly enough one of those projects was done in collaboration with Kristiaan Van den Eynde who later went on to create the Group module based on his vision of how groups were supposed to work.

When Drupal 8 came around, my team was evaluating the modules we needed to have for porting a large Drupal 6 based project to Drupal 8. There were a few modules that already had a stable Drupal 8 version, but most were either in alpha or beta, or did not even have a release yet. One of those modules was Organic Groups. At the time it was not more than a proof of concept and most of the functionality that was used in our D6 project was not there yet. We decided to go ahead with it, and implement any functionality that we needed from OG in scope of our project. There were a few factors that made this not as crazy as it might seem:

  • There was a large time frame involved. The project is large and very complex, having been in continuous development for 8 years. It was estimated that the D8 port – including a completely new backend architecture – would take around 18 months.

  • Any work that we would put towards improving Organic Groups on Drupal 8 would benefit our organisation in the future. We have a large number of websites using Organic Groups on Drupal 7 which will need to be migrated to D8 at some point. If we were only considering a single site, it would have been more economical to migrate to Group instead.

Organic Groups is one of the most complex modules in the Drupal ecosystem, and has evolved over a period of 15 years, being first conceived in 2004. It implements many edge cases that came out of real life use of the module in tens of thousands of different websites. And now it needed to be ported to Drupal 8. How do you approach a task like this? Our project manager did not like the idea of having a Jira ticket called “Port OG to D8”, estimated to 500 hours, and locking away a developer for a couple of months. So we decided on the following:

  • We would contribute work in small, well-scoped PRs. Small PRs are easier for Amitai and his team of co-maintainers to review and accept. We considered that this would reduce the risk of our sprint goals being affected by a PR not being accepted in time. There was always a direct link between a PR in the Organic Groups issue queue and a ticket on our Jira board regarding some business objective.

  • Try to be as clear as possible in the description of the PR about what the change is intended to achieve. If there was any doubt about the approach to take, we would create a discussion issue before starting any development. On our side, the Jira ticket describing the business need could then be postponed to the next sprint and blocked on the result of the discussion. Since we are working for a paying client we want to avoid spending time developing a solution that would end up being rejected by the module maintainers.

  • To further assist the speedy acceptance of PRs, we assigned members of our own team to do code review on the upstream work. This meant we could already do a couple of iterations and improve the code quality to a state that could hopefully be accepted right away, or only require minor modifications. We also worked with clear labels on the PRs to effectively communicate the state of the code. A label like “Ready for review” is very helpful and avoids people wasting time reviewing unfinished work.

  • Provide extensive and well-documented tests in every PR. It cannot be overstated how much a good set of tests helps module maintainers to quickly understand the goal of a PR, and to trust that the code does what it is supposed to do and will not break existing functionality.

An important rule to follow when making contributions to an open source project is to always put the requirements of the project first. It is tempting when working against a sprint deadline to propose a solution that provides some narrow functionality that your project needs, but often these are not aligned with the needs of the project’s wider user base. It is understandable in this case that a PR will be rejected. Often a Drupal module can be configured to cater to different use cases that are needed for different kinds of projects. This flexibility comes at the cost of added development time. One example in OG is the cleanup of orphaned group content when a group is deleted: for small sites this is done immediately, but for large sites OG offers an option to do this in a cron job.

As a developer working for a paying customer it can be difficult to justify why we are spending time to develop certain use cases which are not directly needed by the project, but are needed upstream. This means we need to pick our battles carefully. Often the additional work is not substantial, and will be compensated by the fact that the code will be maintained upstream or can be used by other projects within our organisation. But in other cases the work simply cannot be justified from the customer’s viewpoint. One example is the Organic Groups UI module. The way groups are created and managed in our project differed so much from the way this is done in OG that we decided to implement our group administration UI in our own custom module.

The decision to only work on what is strictly needed for the project also meant that we had to accept the fact that our project would probably go to production before the module would be completely finished. We mitigated this risk by implementing our own set of functional tests that fully cover our functionality and UI. We run these in addition to the test coverage of OG. This turned out to be a very good plan, since at the time of launch not only was OG still unstable, but so were 20 of the other modules we were using.

From our perspective the collaboration was a great success. We were able to contribute a significant part of the D8 port, the communication with the other OG contributors in the issue queue went super smooth, and we paved the way for using OG in our other websites. Since our project went live the nature of our contributions has changed – our grouping functionality is feature complete so we no longer need to make any sweeping changes, but we still discover bugs from time to time and were able to identify some performance bottlenecks. Along the way I also got promoted to co-maintainer, and whenever I am at a code sprint I make time to review PRs.

Maarten Segers

The first project where I used Organic Groups was a platform for the European Commission that enables learning and knowledge exchange through online groups, where members can keep in touch with each other and share their work. At that time the platform was using Drupal 6, and we didn’t have a lot of the cool things we now have in Drupal 8, like configuration management, services and dependency injection, and the new and improved APIs for caching, entities, plugins, etc. On an earlier project I had used another module (Domain Access) to solve the problem of having different subsites, but I ran into a lot of bugs related to its access layer, and it was primarily built to support different domain names, which we didn’t need for the EC platform.

What I liked, and still like, most about OG was that it was really well tested. If you make any changes to the default access system you’d better make sure you have tests for it because:

  • The Access API is one of the toughest APIs in Drupal
  • Bugs with access can quickly become security issues (e.g. unauthorized access)

Developing this platform taught me that adding a “group” feature can quickly make your website very complex from a functional perspective, as it can affect other existing or new features. Let’s say we have a feature to show the latest news articles; once we have groups, there might be a feature for:

  • The latest news articles for a given group
  • The latest news articles for your own groups
  • The latest news articles for a given user’s groups

If we have comments there might be:

  • Comments in a given group
  • Comments in your own groups
  • Comments for a given user’s groups

Now replace “comments” with “most recent comments” or “most active comments” and repeat. If we have a site search there will likely be a group search or filter and now imagine some more complex features like multilingual or content moderation and all the variations groups may introduce on them.

Before I met Amitai at DrupalCon Amsterdam in 2014, I wasn’t contributing very actively to Drupal but as Dries Buytaert puts it, I was a taker.

In my early days as a developer back in the ‘90s I used to build client-server solutions and desktop applications. I built the software on Windows and Linux and assembled the hardware. Years later, I discovered these inexpensive microcomputers and microcontrollers that enabled me to build IoT solutions as a hobby. I’m still amazed how much more bang for the buck we get these days when it comes to hardware. But it wasn’t only the cheap hardware that enabled people to build IoT solutions. In the ‘90s most of the software was still proprietary. Although I did use SourceForge (again, as a taker), it wasn’t until I discovered GitHub that I really started contributing myself. I couldn’t believe how much was already out there: if you need something, anything, someone has probably already written a library for it. Also, the ease of use and the low threshold for adding or changing a line of code made me very active in contributing to a lot of different IoT libraries. Note that a lot of bug-fix commits are one-liners, and a lot of PRs for new features on a well-maintained project are just a few lines of code.

When Amitai asked me why I wasn’t so visible on drupal.org, I did some reflection, and it came down to two reasons. First, I didn’t like attaching .patch files to issues in the issue queue. I still don’t like it, but now I try to convince myself that it’s worth it, and I hope that the GitLab integration will arrive soon! In the meantime I’ve been contributing most to modules that also live on GitHub, like OG and Search api solr. The second reason was that most of our solutions were “tailored fit” custom modules.

Once you start contributing to an open source project, you quickly start to understand how you benefit from it in turn. I learned that by open sourcing it’s possible to reduce the total cost of ownership of a project. I won’t go into too much detail here on the advantages of doing open source right, but I’ll list a few:

  • with public scrutiny, others can identify and fix possible issues for you
  • you learn a lot
  • you may receive new features
  • your own contributions may be further improved
  • you avoid reinventing the wheel

In my opinion, collaboration tools like GitHub have drastically improved open source workflows over the years. Did you know that you can now put multi-line or inline comments? Don’t hesitate to open an issue or a pull request on GitHub, or join the Organic Groups Slack channel if you want to get updates, have questions, or want to contribute yourself; it will be appreciated!

When developing the website for the European Year of Development (in Drupal 7) we used Organic Groups to model the countries and organisations. When designing the website’s features we made sure that the group feature didn’t have a huge impact on the complexity as we only had a few months to build. For instance, with Organic Groups it would have been possible to manage a hierarchical editorial workflow but instead we decided to use the existing platform to prepare and discuss the initial content.

When building websites using Drupal 6 and 7 I never really hesitated to use Organic Groups as the module had proved it was stable and well maintained. There were some performance issues but nothing we couldn’t handle. It wasn’t until we started building Drupal 8 sites that I started looking for alternatives as the Drupal 8 release for OG wasn’t in sight.

I met Kristiaan at DrupalCon in Dublin in 2016 and we discussed the Group module. By that time a lot of work had already gone into both the “Organic Groups” as well as the “Group” module. While both have a different architecture they try to solve issues in the same problem space. For example, each module has its own submodule to create group navigation (Group menu and OG menu).

A similar situation existed in the search problem space, more specifically for the Apache Solr integration in Drupal: there was the Apachesolr project and the Search api solr project. Both had a huge number of submodules that tried to solve the same problems: sorting search results, autocompletion, geospatial search, etc. This meant that two groups of developers were spending time solving the same issues. The maintainers succeeded in joining their efforts in Search api solr, which also led to a single Search API ecosystem.

Perhaps one day we can define one Group API as a basis for Drupal 10 and build our own Group ecosystem!

Kristiaan Van den Eynde

I’ve never really gone into the specifics of why I started developing Group and how I ended up in a position where I am invited to add my thoughts to a blog post written by three people working on a competing/complementing module. Maybe it’s time I cross that bridge publicly once and for all, because in my heart and mind I’ve already left that part of my past far behind me.

Way back in 2010, I started my first official IT job at a university college in Antwerp, Belgium. I had been tinkering with websites many years before that, but this time it was the real deal. After a round of headaches using closed source applications, we managed to convince the higher-ups to allow us to use Drupal for our websites.

One of the first Drupal 7 projects we built was a student platform which needed the functionality people have grown to expect from OG or Group, except there was only OG at the time. Having just had my first taste of what Drupal 7 development was like, I was still in this phase where everything is both daunting and complicated, but the solutions offered almost never seem to suit your needs. OG was not an exception to this rule.

This often triggered the naive, energetic caffeine junkie that I was at the time to do all the things maintainers hate to see from their users: I complained in issues about how broken things were, about patches not being accepted, about the general approach of the module, and so on.

And thus came the fateful day when, after a round of “frustrating” issues I encountered with OG, I went to see my boss and pitched him the idea that I could write this better and more tailored to our needs. The fact that he was a very understanding boss and we had way more time on our hands than we needed led him to say yes, and in a single instant my career changed in the most unimaginable way.

When I initially started Group development, my main focus wasn’t to reinvent the functionality OG offered but rather to come up with a different data architecture, and as a result, a different developer and user experience. I must have struck gold somehow, because over the next few months of development, I got a lot of positive feedback and was therefore even more motivated to spend my every waking moment polishing my module.

Over the years to come, I went overboard in my enthusiasm and added far more features to Group than I cared to maintain. I was starstruck by the sudden popularity and wanted to appease everyone to make sure I kept that momentum going. This was a really bad idea.

Around the end of 2015, the fact that I was the maintainer of Group landed me a job at a prestigious Drupal agency in the UK: Deeson. My personal life had seen a few drastic changes which meant I no longer had all the spare time in the world to work on Group. So I brokered a deal where I got to work one paid day a week on Group. This is when I was first encouraged to begin development on the Drupal 8 version of Group and to start speaking at events.

I have since grown a lot both as a developer and a person and learned a few valuable lessons, the most important being to respect anyone willing to invest such a huge amount of time into writing a module the size of OG or Group. Next on that list is to first try to collaborate on a project before deciding to reinvent the wheel. Even if I have my career to thank for it, writing Group from scratch was an undertaking that should have definitely crashed and burned, yet somehow didn’t.

So here we are: A rookie turned senior who has come to respect and like Amitai, even if many years ago I strongly disliked his product. Talking to Amitai now, I realize that we both share the same knowledge and that we both want to fix this problem space the best way possible.

It is my hope that one day we can combine OG and Group into a single module that uses the best from both worlds and can be as well written and tested as Commerce. While that ship has definitely sailed for Drupal 8 and we might still have some minor disagreements about how to best approach certain functionalities, I hope to sit down with Amitai, Pieter and Maarten one day to make this happen for Drupal 9 or, more realistically, Drupal 10. In the meantime, I’ll just keep spending my one day a week (now at Factorial) to work on Group and Core.

And who knows: if our “burying the hatchet” someday leads to a single module whose approach people come to disagree with over time, we might see another person like me step up and try to do it differently. I would certainly welcome the competition as, in the end, it improves the product for both parties involved. I would like to leave that person one piece of advice, though: do not get dragged down the rabbit hole, keep a healthy work-life balance, and try to respect the people who came before you. I know I should have.

For those wondering how "competing" module maintainers (og and group) get along in #Drupal.#DrupalCon pic.twitter.com/rCXJkW97dL

— Amitai Burstein (@amitaibu) September 27, 2016
Jun 12 2019

Some years ago, a frontend developer colleague suggested that we introduce SASS, as it requires almost no preparation to start using; then, as we progressed, we could adopt more and more of it. He proved to be right. A couple of months ago, our CTO Amitai made a similar move: he suggested using ddev as part of rebuilding our starter kit for Drupal 8 projects. I had the same feeling; even though I did not know all the details of the tool, introducing it felt right, and it quickly became evident that it would be beneficial.

Here’s the story of our affair with it.

For You

After the installation, a friendly command-line wizard (ddev config) asks you a few questions:

The configuration wizard holds your hand

It gives you an almost perfect configuration, and in the .ddev directory you can review the YAML files. In .ddev/config.yaml, pay attention to router_http_port and router_https_port: these ports must be free, but the default port numbers are almost certainly occupied already by a local Nginx or Apache on your development system.
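For example, if the defaults are taken, you can move the router to free ports in .ddev/config.yaml; the port numbers below are just an illustration:

```yaml
# Illustrative values: any two free ports on your machine will do.
router_http_port: "8080"
router_https_port: "8443"
```

After the next ddev start, the site is served on those ports instead.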

After the configuration, ddev start creates the Docker containers you need, nicely pre-configured according to your selections. Even if your site was installed previously, you’ll be greeted by the installation process when you try to access the URL, as the database inside the container is empty, so you can install there (again) by hand.

You have a site inside ddev, congratulations!

For All of Your Coworkers

So now ddev serves the full stack under your site, but is it ready for teamwork? Not yet.

You probably have your own automation that bootstraps the local development environment (site installation, specific configurations, theme compilation, just to name a few); now it’s time to integrate that into ddev.

The config.yaml provides various directives to hook into the key processes.

A basic Drupal 8 example in our case looks like this:

hooks:
  post-start:
    # Install dependencies on the host.
    - exec-host: "composer install"
    # Install Drupal after start (db/db@db are ddev's default DB credentials).
    - exec: "drush site-install custom_profile -y --db-url=mysql://db:db@db/db --account-pass=admin --existing-config"
    - exec: "composer global require drupal/coder:^8.3.1"
    - exec: "composer global require dealerdirect/phpcodesniffer-composer-installer"
    # Sanitize email addresses
    - exec: "drush sqlq \"UPDATE users_field_data SET mail = concat(mail, '.test') WHERE uid > 0\""
    # Enable the environment indicator module
    - exec: "drush en -y environment_indicator"
    # Clear the cache, revert the config
    - exec: "drush cr"
    - exec: "drush cim -y"
    - exec: "drush entup -y"
    - exec: "drush cr"
    # Index content
    - exec: "drush search-api:clear"
    - exec: "drush search-api:index"

After the container is up and running, you might like to automate the installation. In some projects, that’s just the dependencies and the site installation, but sometimes you need additional steps, like theme compilation.

In a development team, you will probably have dev, stage and live environments that you would like to routinely sync to local for debugging and more. In this case, there are integrations with hosting providers, so all you need is a ddev pull and a short configuration in .ddev/import.yaml:

provider: pantheon
site: client-project
environment: test

After the files and database are in sync, everything in post-import-db will be applied, so we can drop the existing scripts we had for this purpose.
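Those post-import steps are declared alongside the other hooks in .ddev/config.yaml. A minimal sketch (the hook name is ddev’s; the drush commands are just an example of what such a step might run):

```yaml
hooks:
  post-import-db:
    # Re-apply exported configuration and clear caches on the synced copy.
    - exec: "drush cim -y"
    - exec: "drush cr"
```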

We still prefer to have a shell script wrapper in front of ddev, so we have even more freedom to tweak things and keep them automated. Most notably, ./install does a regular ddev start, which results in a fresh installation, while ./install -p saves the time of a full install by fetching a copy of a Pantheon environment.
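Such a wrapper can be tiny. Here is a hypothetical sketch of the flag handling only; the real script would run the ddev commands indicated in the comments:

```shell
#!/usr/bin/env bash
set -e

# Hypothetical ./install wrapper: the default is a fresh install,
# -p reuses a copy of a Pantheon environment.
MODE="fresh"
while getopts "p" opt; do
  case "$opt" in
    p) MODE="pantheon" ;;
    *) echo "usage: ./install [-p]" >&2; exit 1 ;;
  esac
done

echo "mode: $MODE"
# fresh    -> ddev start (the post-start hooks perform the clean install)
# pantheon -> ddev start && ddev pull -y
```

Keeping the dispatch in one place means the team only ever types ./install, whatever the environment.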

For the Automated Testing

Now that the team is happy with the new tool (they might hit some issues, but for us nothing was a blocker), the next step is to make sure that the CI also uses the same environment. Before doing that, you should decide whether it’s more important to match the production environment or to make Travis easily debuggable. If you execute realistic, browser-based tests, you might want to go with the first option and leave ddev out of the testing flow; for us, it was desirable to spin up a local site identical to the one inside Travis. And unlike our old custom Docker image, maintenance of the image is solved for us.

Here’s our shell script that spins up a Drupal site in Travis:

#!/usr/bin/env bash
set -e

# Load helper functionality.
source ci-scripts/helper_functions.sh

# -------------------------------------------------- #
# Installing ddev dependencies.
# -------------------------------------------------- #
print_message "Install Docker Compose."
sudo rm /usr/local/bin/docker-compose
curl -s -L "https://github.com/docker/compose/releases/download/1.22.0/docker-compose-$(uname -s)-$(uname -m)" > docker-compose
chmod +x docker-compose
sudo mv docker-compose /usr/local/bin

print_message "Upgrade Docker."
sudo apt -q update -y
sudo apt -q install --only-upgrade docker-ce -y

# -------------------------------------------------- #
# Installing ddev.
# -------------------------------------------------- #
print_message "Install ddev."
curl -s -L https://raw.githubusercontent.com/drud/ddev/master/scripts/install_ddev.sh | bash

# -------------------------------------------------- #
# Configuring ddev.
# -------------------------------------------------- #
print_message "Configuring ddev."
mkdir ~/.ddev
cp "$ROOT_DIR/ci-scripts/global_config.yaml" ~/.ddev/

# -------------------------------------------------- #
# Installing Profile.
# -------------------------------------------------- #
print_message "Install Drupal."
ddev auth-pantheon "$PANTHEON_KEY"

cd "$ROOT_DIR"/drupal || exit 1
if [[ -n "$TEST_WEBDRIVERIO" ]]; then
  # As we pull the DB always for WDIO, here we make sure we do not do a fresh
  # install on Travis.
  cp "$ROOT_DIR"/ci-scripts/ddev.config.travis.yaml "$ROOT_DIR"/drupal/.ddev/config.travis.yaml
  # Configures the ddev pull with Pantheon environment data.
  cp "$ROOT_DIR"/ci-scripts/ddev_import.yaml "$ROOT_DIR"/drupal/.ddev/import.yaml
fi

ddev start

if [[ -n "$TEST_WEBDRIVERIO" ]]; then
  ddev pull -y
fi
As you see, we even rely on the hosting provider integration, but of course that’s optional. All you need to do after setting up the dependencies and the configuration is to ddev start, then you can launch the tests of any kind.

All the custom bash functions above are adapted from https://github.com/Gizra/drupal-elm-starter/blob/master/ci-scripts/helper_functions.sh, and we are in the process of ironing out a starter kit for Drupal 8, needless to say with ddev.

One key step is to make ddev non-interactive, see global_config.yaml that the script copies:

APIVersion: v1.7.1
omit_containers: []
instrumentation_opt_in: false
last_used_version: v1.7.1

So it does not ask about data collection opt-in, as it would break the non-interactive Travis session. If you are interested in using the ddev pull as well, use encrypted environment variables to pass the machine token securely to Travis.
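With the Travis CLI, `travis encrypt PANTHEON_KEY=<machine token> --add` appends the secret to .travis.yml; the result looks roughly like this (the secure string below is only a placeholder):

```yaml
env:
  global:
    - secure: "PLACEHOLDER_ENCRYPTED_VALUE"
```

Travis decrypts it at build time and exposes PANTHEON_KEY to the scripts above.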

The Icing on the Cake

ddev has a welcoming developer community. We got a quick and meaningful reaction to our first issue, and by the time of writing this blog post, we have an already merged PR to make ddev play nicely with Drupal-based webservices out of the box. Contributing to this project is definitely rewarding – there are 48 contributors and it’s growing.

The Scene of the Local Development Environments

Why ddev? Why not the most popular choices, Lando or Drupal VM? For us, the main reasons were the Pantheon integration and the pace of development; it definitely has momentum. In 2018 it was the 13th most popular local development environment amongst Drupal developers; in 2019 it is in 9th place, according to the 2019 Drupal Local Development survey. This openness and activity is what you sense when you try to contribute. What’s certain, based on the survey, is that Docker-based environments are now the most popular. And with a frontend that hides all the pain of working with raw Docker/docker-compose commands, it’s clear why. Try it (again) these days: you can really forget the hassle and enjoy the benefits!

Feb 25 2019

Earlier we wrote about stress testing, featuring Blazemeter, where you could learn how to crash your site without worrying about the infrastructure. So why did I even bother writing this post about the do-it-yourself approach? We have a complex frontend app, where it would be nearly impossible to simulate all the network activity faithfully over a long period of time. We wanted to use a browser-based testing framework, namely WebdriverIO with some custom Node.js packages, on Blazemeter, and it proved to be quicker to manage the infrastructure ourselves and have full control of the environment. What happened in the end? Using a public cloud provider (in our case, Linode), we programmatically launched the needed number of machines temporarily, provisioned them with the proper stack, and executed the WebdriverIO test. With Ansible, the Linode CLI and WebdriverIO, the whole process is repeatable and scalable; let’s see how!

Infrastructure phase

Any decent cloud provider has an interface to provision and manage cloud machines from code. Given this, if you need an arbitrary number of computers to launch the test, you can have them for 1-2 hours (100 endpoints for the price of a coffee, how does that sound?).
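The coffee claim is easy to sanity-check. A back-of-the-envelope sketch, assuming a small instance at $0.0075/hour (an illustrative rate; check the provider’s current pricing):

```shell
#!/usr/bin/env bash
# Estimate the bill for a short-lived fleet of test machines.
MACHINES=100
HOURS=2
RATE="0.0075"  # assumed hourly USD rate for the smallest instance

TOTAL=$(awk -v m="$MACHINES" -v h="$HOURS" -v r="$RATE" \
  'BEGIN { printf "%.2f", m * h * r }')
echo "estimated cost: \$$TOTAL"
```

At these assumptions, the whole 100-machine fleet for two hours costs about a dollar and a half.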

There are many options to dynamically and programmatically create virtual machines for stress testing. Ansible offers dynamic inventory; however, our cloud provider of choice wasn’t included in the latest stable version of Ansible (2.7) at the time of this post. Also, the solution below makes the infrastructure phase independent: any kind of provisioning (pure shell scripts, for instance) is possible with minimal adaptation.

Let’s follow the steps in the Linode CLI installation guide. The key is to have the configuration file at ~/.linode-cli with the credentials and the machine defaults. Afterwards, you can create a machine with a one-liner:

linode-cli linodes create --image "linode/ubuntu18.04" --region eu-central --authorized_keys "$(cat ~/.ssh/id_rsa.pub)"  --root_pass "$(date +%s | sha256sum | base64 | head -c 32 ; echo)" --group "stress-test"

Given the specified public key, password-less login is possible. However, this is far from enough to start provisioning: booting takes time, the SSH server is not available immediately, and our particular requirement is that after the stress test we want to drop the instances immediately, together with the test execution, to minimize costs.

Waiting for the machine to boot takes a slightly longer snippet, but the CSV output is robustly parsable:

## Wait for boot, to be able to SSH in.
while linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'status' --no-headers | grep -v running; do
  sleep 2
done

However, an SSH connection is likely not yet possible; let’s wait for the port to be open:

for IP in $(linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'ipv4' --no-headers); do
  while ! nc -z "$IP" 22 < /dev/null > /dev/null 2>&1; do
    sleep 1
  done
done
You may realize that this overlaps with the wait for booting. The only benefit of separating the two is that it allows more sophisticated error handling and reporting.
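For instance, error handling could mean bounding each wait. A hypothetical variant using bash’s built-in /dev/tcp (so it works even without nc) that gives up after a number of retries:

```shell
#!/usr/bin/env bash

# Wait for a TCP port with a retry limit instead of looping forever.
# Returns 0 once the port accepts a connection, 1 on timeout.
wait_for_port() {
  local host="$1" port="$2" retries="${3:-60}"
  while ! (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null; do
    retries=$((retries - 1))
    if [ "$retries" -le 0 ]; then
      echo "error: $host:$port did not open in time" >&2
      return 1
    fi
    sleep 1
  done
  return 0
}
```

A stuck boot then surfaces as a reported error instead of a hung job, and the script can decide to delete the broken linode and retry.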

Afterwards, deleting all machines in our group is trivial:

for ID in $(linode-cli linodes list --group=stress-test --text --delimiter ";" --format 'id' --no-headers); do
  linode-cli linodes delete "$ID"
done

So after packing everything in one script, also to put an Ansible invocation in the middle, we end up with stress-test.sh:



if ! [[ $NUMBER_OF_VISITORS =~ $NUM_RE ]] ; then
  echo "error: Not a number: $NUMBER_OF_VISITORS" >&2; exit 1

if (( $NUMBER_OF_VISITORS > 100 )); then
  echo "warning: Are you sure that you want to create $NUMBER_OF_VISITORS linodes?" >&2; exit 1

echo "Reset the inventory file."
cat /dev/null > hosts

echo "Create the needed linodes, populate the inventory file."
for i in $(seq $NUMBER_OF_VISITORS);
  linode-cli linodes create --image "linode/ubuntu18.04" --region eu-central --authorized_keys "$(cat ~/.ssh/id_rsa.pub)" --root_pass "$(date +%s | sha256sum | base64 | head -c 32 ; echo)" --group "$LINODE_GROUP" --text --delimiter ";"

## Wait for boot.
while linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'status' --no-headers | grep -v running; do
  sleep 2
done

## Wait for the SSH port.
for IP in $(linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'ipv4' --no-headers); do
  while ! nc -z $IP 22 < /dev/null > /dev/null 2>&1; do
    sleep 1
  done
  ### Collect the IP for the Ansible hosts file.
  echo "$IP" >> hosts
done
echo "The SSH servers became available"

echo "Execute the playbook"
ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3' -T 300 -i hosts main.yml

echo "Cleanup the created linodes."
for ID in $(linode-cli linodes list --group="$LINODE_GROUP" --text --delimiter ";" --format 'id' --no-headers); do
  linode-cli linodes delete "$ID"
done

Provisioning phase

As written earlier, Ansible is just one option, albeit a popular one, for provisioning machines. For such a test, even a bunch of shell commands would be sufficient to set up the stack. However, once you have tasted working with infrastructure in a declarative way, it becomes the first choice.

If this is your first experience with Ansible, check out the official documentation. In a nutshell, we just declare in YAML how the machine(s) should look, and what packages it should have.

In my opinion, a simple playbook like the one below is readable and understandable as-is, without any prior knowledge. So our main.yml is the following:

- name: WDIO-based stress test
  hosts: all
  remote_user: root
  tasks:

    - name: Update and upgrade apt packages
      become: true
      apt:
        upgrade: yes
        update_cache: yes
        cache_valid_time: 86400

    - name: WDIO and Chrome dependencies
      apt:
        name: "{{ item }}"
        state: present
      with_items:
        - unzip
        - nodejs
        - npm
        - libxss1
        - libappindicator1
        - libindicator7
        - openjdk-8-jre

    - name: Download Chrome
      get_url:
        url: "https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb"
        dest: "/tmp/chrome.deb"

    - name: Install Chrome
      shell: "apt install -y /tmp/chrome.deb"

    - name: Get Chromedriver
      get_url:
        url: "https://chromedriver.storage.googleapis.com/73.0.3683.20/chromedriver_linux64.zip"
        dest: "/tmp/chromedriver.zip"

    - name: Extract Chromedriver
      unarchive:
        remote_src: yes
        src: "/tmp/chromedriver.zip"
        dest: "/tmp"

    - name: Start Chromedriver
      shell: "nohup /tmp/chromedriver &"

    - name: Sync the source code of the WDIO test
      synchronize:
        src: "wdio"
        dest: "/root/"

    - name: Install WDIO
      shell: "cd /root/wdio && npm install"

    - name: Start date
      command: date

    - name: Execute
      shell: 'cd /root/wdio && ./node_modules/.bin/wdio wdio.conf.js --spec specs/stream.js'

    - name: End date
      command: date

We install the dependencies for Chrome, Chrome itself, WDIO, and then we can execute the test. For this simple case, that’s enough. As I referred to earlier:

ansible-playbook -e 'ansible_python_interpreter=/usr/bin/python3' -T 300 -i hosts main.yml

What’s the benefit over shell scripting? For this particular use case, mostly that Ansible makes sure everything happens in parallel and that we have sufficient error handling and reporting.

Test phase

We love tests. Our starter kit has WebdriverIO tests (among many other types of tests), so we picked it to stress test the full stack. If you are familiar with JavaScript or Node.js, the test code will be easy to grasp:

const assert = require('assert');

describe('podcasts', () => {
    it('should be streamable', () => {
        $('.contact .btn').click();

        // Open the menu to reach the podcast player.
        const menu = $('.header.menu .fa-bars');
        menu.click();

        $('#mep_0 .mejs__controls').waitForDisplayed();
        $('#mep_0 .mejs__play button').click();

        // Simulate a visitor listening to the stream for 10 seconds.
        browser.pause(10000);
    });
});

This is our spec file, which is the essence, alongside the configuration.

Could we do it with a bunch of requests in JMeter or Gatling? Almost. The icing on the cake is that we also stress test the streaming of the podcast: we simulate a user who listens to it for 10 seconds. For any frontend-heavy app, realistic stress testing requires a real browser, and WDIO provides us exactly that.

The WebdriverIO test execution - headless mode deactivated

Test execution phase

After making the shell script executable (chmod 750 stress-test.sh), we are able to execute the test either:

  • with one visitor from one virtual machine: ./stress-test.sh 1
  • with 100 visitors, one virtual machine for each: ./stress-test.sh 100

with the same simplicity. However, for very large-scale tests, you should think about bottlenecks such as the capacity of the datacenter on the testing side. It might make sense to randomly pick a datacenter for each testing machine.
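To sketch that last idea — the region names below are purely illustrative; `linode-cli regions list` shows the real ones — the hardcoded `--region` argument could be randomized per machine:

```shell
# Hypothetical tweak: pick a region pseudo-randomly for each machine,
# instead of hardcoding --region eu-central. The list is illustrative.
REGIONS="eu-central eu-west us-east ap-south"
INDEX=$(( RANDOM % 4 + 1 ))
REGION=$(echo $REGIONS | cut -d ' ' -f $INDEX)
echo "Creating the next linode in: $REGION"
# linode-cli linodes create --region "$REGION" ...
```

Dropping this into the creation loop spreads the simulated visitors geographically, which also makes the traffic pattern more realistic.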

The test execution consists of two main parts: bootstrapping the environment and executing the test itself. If bootstrapping the environment takes too large a share of the total runtime, one strategy is to prepare a Docker image and, instead of creating the environment again and again, just use the image. In that case, it’s a great idea to look for a container-specific hosting solution instead of standalone virtual machines.

Would you like to try it out now? Just do a git clone https://github.com/Gizra/diy-stress-test.git!

Result analysis

For such a distributed DIY test, analyzing the results can be challenging. For instance, how would you measure requests per second for a browser-based test like this WebdriverIO one?

In our case, the analysis happens on the other side. Almost all the hosting solutions we encounter support New Relic, which helps a lot with such an analysis. Our test was DIY, but the result handling was outsourced. The icing on the cake is that it helps to track down the bottlenecks too, and a similar solution can likely be applied on your hosting platform as well.

However, what if you’d like to gather the results together after such a distributed test execution?

Without going into detail, you may study the fetch module of Ansible, so you can gather a result log from all the test servers and have it locally in a central place.
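As a hedged sketch — the log path and destination directory below are made up for illustration — an extra task at the end of main.yml could pull each machine’s log into a local results/ directory:

```yaml
# Hypothetical addition to main.yml: collect a result log from every host.
# fetch stores each file locally under dest/<inventory_hostname>/<src path>.
- name: Gather the WDIO result logs
  fetch:
    src: "/root/wdio/wdio.log"
    dest: "results/"
```

After the playbook finishes, every host’s log sits in its own subdirectory, ready for aggregation.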


It was a great experience: after we faced some difficulties with a hosted stress test platform, in the end we were able to recreate a solution from scratch without much more development time. If your application also needs special, unusual tools for stress testing, you might consider this approach. All the chosen components, such as Linode, WebdriverIO or Ansible, are easily replaceable with your favorite solution. Geographically distributed stress testing, fully realistic website visitors with heavy frontend logic, low-cost stress testing – it seems now you’re covered!

Oct 26 2018

After almost one year, and that $1.6M for a single item, we had a couple more (big) sales that are worth talking about.

If you expect this to be a pat on the shoulder kind of post, where I’m talking about some hyped tech stack, sprinkled with the words “you can just”, and “simply” - while describing some incredible success, I can assure you it is not that.

It is, however, also not a “we have completely failed” self-reflection.

Like every good story, it’s somewhere in the middle.

The Exciting World of Stamps

Many years ago, when Brice and I founded Gizra, we decided “No gambling, and no porn.” It’s our “Do no evil” equivalent. Throughout Gizra’s life we have always had at least one entrepreneurial project going on, in different fields and areas. On all of them we just lost money. I’m not saying that as necessarily a bad thing - one needs to know how to lose money; but obviously it would be hard to call it a good thing.

Even in the beginning days, we knew something that we know also now - as a service provider there’s a very clear glass ceiling. Take the number of developers you have, multiply by your hourly rate and the number of working hours, and that’s your optimal revenue. Reduce it by at least 15% (and probably way more, unless you are very minded about efficiency) and now you have a realistic revenue. Building websites is a tough market, and it never gets easier - but it pays the salaries and, all things considered, I think it’s worth it.

While we are blessed with some really fancy clients, and we are already established in the international Drupal & Elm market, we wanted to have a product. I tend to joke that I already know all the pain points of being a service provider, so it’s about time I got to know the ones of having a product as well.

Five years ago Yoav entered our door with the idea of CircuitAuction - a system for auction houses (the “going once… going twice…” type). Yoav was born into a family of stamp collectors and was also a Drupaler. He knew building the system he dreamed of was above his pay grade, so he contacted us.

Boy, did we suck. Not just Gizra. Also Yoav. There was a really good division between us in terms of suckiness. If you think I’m harsh with myself, picture yourself five years ago, and tell yourself what you think of past you.

I won’t go much into the history. Suffice to say that my belief that only on the third rewrite of any application do you start getting it right was finally put to the test (and proved itself right). Also, it’s important to note that at some point we turned from service providers into partners, and now CircuitAuction is owned by Brice, Yoav, and myself. This part will be important when we reach the “Choose your partners right” section.

So the first official sale, along with the third version of CircuitAuction, happened in Germany in March 2017. I’ve never had a more stressful time at work than in the weeks before and during the sale. I was completely exhausted. If you have ever heard me preaching about work-life balance, you would probably understand how it took me by surprise that I worked 16 hours a day, weekdays and weekends, for six weeks straight.

I don’t regret doing so. Or more precisely, I would probably really regret it if we had failed. But we were equipped with a lot of passion to nail it. Still, when I think of those pre-sale weeks I cringe.

Stamp Collections & Auction Houses 101

Some people, very few (and unfortunately for you, the reader, you are probably not one of them), are very, very (very) rich. They are rich to the point that, for them, buying a stamp for thousands or even hundreds of thousands of euros is just not a big deal.

Some people, very few (and unfortunately for you the reader, you are probably not one of them), have stamp collections or just a couple of valuable stamps that they want to sell.

This is where the auction house comes in. They are not the ones that own the stamps. No, an auction house’s reputation is determined by the two Rolodexes they have: the one with the collectors, and the one with the sellers. Privacy and confidentiality, along with honesty, are obviously among the most important traits for an auction house.

So, you might think “They just need to sell a few stamps. How hard can that be?”

Well, there are probably harder things in life, but our path led us here, so this is what we’re dealing with. The thing is that along those five days of a “live sale” there are about 7,000 items (stamps, collections, postcards, etc.) that beforehand need to be categorized, arranged, curated, and passed through an extensive and rigorous workflow (if you were to buy these 4 stamps for 74,000 euros, you’d expect them to be carefully handled, right?).

Screenshot of the live auction webapp, built with Elm. A stamp is being sold in real time for lots of Euros!

Now mind you that handling stamps is quite different from coins, and both are completely different from paintings. To the unprofessional eye these are “just” auctions, but when dealing with such expensive items, and such specific niches, each one has different needs and jargon.

We Went Too Far. Maybe.

  • Big stamp sales bring in a few million euros; big coin sales bring in hundreds of millions.
  • The logic for stamp auctions is usually more complex than that of coins.
  • Heinrich Koehler, our current biggest client and one of the most prestigious stamp auction houses in the world, has an even crazier logic. Emphasis on the crazier. Being such a central auction house, every case that would normally be considered an edge case manifests itself on every sale.

So, we went with a “poor” vertical (may we all be as poor as this vertical), and with a very complex system. There are a few reasons for that, although only time will tell if it was a good bet:

Yoav, our partner, has a lot of personal connections in this market - he literally played as a kid, or had weekend barbecues, with many of the existing players. Auction houses by nature rely heavily on those relationships, so having a foothold in this niche market is an incredible advantage.

Grabbing the big player was really hard. Heinrich Koehler requires a lot of care and an enormous amount of development. But once we got there, we had one hell of a bragging right.

There’s also an obvious reason that is often not mentioned - we didn’t know better. Until very late in the process, we never asked those questions, as we were too distracted with chasing the opportunities that kept popping up.

But the above distracts from probably the biggest mistake we made over the years: not building the right thing.

If you are in the tech industry, I would bet you have seen this in one form or another. The manifestation of it is the dreaded “In the end we don’t need it” sentence floating in the air, and a team of developers and project managers face-palming. Developers are cynical for a reason. They have seen this one too many times.

I think that developing something that is only 90% correct is much worse than not developing it at all. When you don’t have a car, you don’t go out of town for a trip. When you do, but it constantly breaks down or doesn’t really get you where you wanted to go, you also don’t get to hike - only now you are super frustrated at the expense of the misbehaving car, and at the fact that it’s, well, not working.

We were able to prevent that from happening to many of our clients, but fell into the same trap ourselves. We assumed some features were needed. We thought we should build things a certain way. But we didn’t know. We didn’t always have a real use case, and we ended up rewriting parts over and over again.

The biggest change, and what put us on the right path, was when we stopped developing on assumptions and moved one line of code at a time, only when it was backed up by real use cases. Sounds trivial? It is. Unfortunately, doing the opposite - “developing by gut feeling” - is trivial too. I find that staying on the right path requires more discipline.

Luckily, at some point we have found a superb liaison.

The Liaison, The Partners, and the Art of War

Tobias (Tobi) Huylmans, our liaison, is a person who really influenced things for the better and helped shape the product into what it is.

He’s a key person at Heinrich Koehler, dealing with just about every aspect of their business: getting the stamps, describing them, expertizing them (i.e. being the professional who gives the seal of approval that an item is genuine), teaching the team how to work with technology, getting every possible complaint from every person nearby, opening issues for us on GitHub, getting filled with pure rage when a feature is not working, getting excited when a feature is working, being the auctioneer at the sale, helping accounting with the bookkeeping, and last, but not least, being a husband and a father.

There are quite a few significant things I’ve learned working with him. The most important is: have someone close to the team who really knows what they are talking about when it comes to the problem area. That is, I don’t think his solutions are always the best ones, but he definitely understands the underlying problem.

It’s probably ridiculous how obvious the above resolution is, and yet I suspect we are not the only ones in the world who didn’t grasp it fully. If I had to make it an actionable item for any new entrepreneur, I’d call it: “Don’t start anything unless you have an expert in the field who is in daily contact with you.”

Every field has a certain amount of logic that you only really get once you immerse yourself in it. For me personally, it took almost four months of daily work to “get it” when it came to how bids should be allowed to be placed. Your brain might tell you it’s a click of a button, but my code, with 40+ different exceptions that can be thrown along a single request, says differently.

We wouldn’t have gotten there without Tobi. It’s obvious that I have enormous respect for him, but at the same time he can drive me crazy.

I need a calm atmosphere in order to be productive. However, Tobi is all over the place. I can’t blame him - you’ve just read how many things he’s dealing with. But in times of pressure he sometimes expects THINGS TO BE FIXED IMMEDIATELY!!!
You probably get my point. I’m appreciative of all his input, but I need it to be filtered. Luckily, my partners’ personalities and mine are on slightly different spectrums that (usually) complement each other:

I can code well in short sprints, where the scope is limited. I’m slightly obsessed with clean code and automatic testing, but I can’t hold it for super long periods.

Brice hardly ever gets stressed and can manage huge scopes. He’s more of a “if it works, don’t fix it” person, while I have a tendency to want to polish existing (ugly) code when I come across it. His “pragmatic” level is set all the way to maximum. So while I don’t always agree with his technical decisions, one way or another the end result is a beast of a system that allows managing huge collections of items, with their history, along with their accounting, invoicing and much more. In short, he delivers.

Yoav knows the ins and outs of the auction field. On top of that, his patience is only slightly higher than a Hindu cow’s. One can imagine the amount of pressure he went through in those first sales, when things were not as smooth as they should have been. I surely would have cracked.

This mix of personalities isn’t something we’re hiding. In fact, it’s what allows us to manage this battlefield called auction sales. Sometimes the client needs good old tender loving care, with a “Yes, we will have it”; sometimes they need to hear a “No, we will not get to it on time” in a calm voice; and sometimes they need to see me about to start foaming at the mouth when I feel our efforts are not appreciated.

Our Stack

Our Elm & Drupal stack is probably quite unique. After almost 4 years with this stack, I feel very strongly about the following universal truths:

Elm is absolutely awesome. We would not have had such a stable product with JS in such a short time. I’m not saying others could not do it in JS. I’m saying we couldn’t, and I wouldn’t have wanted to bother trying. In a way I feel I have reached a point where I see people writing apps in JS and can’t understand why they are interacting with that language directly. If there is one technical tip I’d give someone looking into frontend work and feeling burned by JS, it is “try Elm.”

Drupal is also really great. But it’s built on a language without a proper type system and a friendly compiler. On any other day I’d tell you how, nowadays, I think that’s a really bad idea. However, I won’t do it today, because we have one big advantage in using Drupal - we master it. This cannot be overestimated: even though we have rewritten CircuitAuction “just” three times, we have in fact built many (many) other websites and web applications with Drupal and learned almost everything that can be learned. I am personally very eager to get Haskell officially into our stack, but the business-oriented me doesn’t allow it yet. I’m not saying Haskell isn’t right. I’m just saying that for us it’s still hard to justify it over Drupal. Mastery takes many years, and is worth a lot of hours and dollars. I still choose to believe that we’ll get there.

On Investments, Cash Flow, and Marketing

We have a lot more work ahead of us. I’m not saying it in that extra cheerful and motivated tone one hears in cheesy movies about startups. No, I’m saying it in the “Shit! We have a lot more work ahead of us.” tone.
Ok, maybe a bit cheerful, and maybe a bit motivated - but I’m trying to make a point here.

For the first time in our Gizra life we have received a small investment ($0.5M). It’s worth noting that we deliberately sought a small investment. One of the advantages of building a product only after establishing a steady income is that we can invest some of our revenues in our entrepreneurial projects. But still, we are in our early days, and there is just about one way to measure whether we’ll be successful: will we have many clients.

We now have some money to buy us a few months without worrying about cash flow, but we know the only way to keep telling the CircuitAuction story is by selling. Marketing was done before, but now we’re really stepping on it, in Germany, the UK, the US and Israel. I’m personally quite optimistic, and I’m really looking forward to the upcoming months, to see for real whether our team is as good as I think and hope, and to be able to simply say “We deliver.”

Jul 31 2018

Everything was working great… and then all the tests broke.

This is the story of how adding a single feature to an app can break all of your tests, and the lessons that can be learned from it.

The Feature that Introduced the Chaos

We are working on a Drupal site that makes use of a multisite approach. In this case, it means that different domains point at the same web server, and the site reacts differently depending on which domain you are visiting.

We have a lot of features covered by automatic tests in WebdriverIO – an end-to-end framework that tests things using a real browser. Everything was working great, but then we added a new feature: a content moderation system defined by the Workflow module recently introduced in Drupal 8.

The Problem

When you add the Workflow module to a site – depending on the configuration you choose – each node is no longer published by default; it stays unpublished until a moderator decides to publish it.

So as you can imagine, all of the tests that were expecting to see a node published after clicking the save button stopped working.

A Hacky Fix

To fix the failing test using Webdriver you could:

  1. Login as a user A.
  2. Fill in all the fields on your form.
  3. Submit the node form.
  4. Logout as user A.
  5. Login as user B.
  6. Visit the node page.
  7. Publish the node.
  8. Logout as user B.
  9. Login back as user A.
  10. And make the final assertions.

Here’s a simpler way to fix the failing test:

You maintain your current test that fills the node form and save it. Then, before you try to check if the result is published, you open another browser, login with a user that can publish the node, and then with the previous browser continue the rest of the test.

Multiremote Approach

To achieve this, Webdriver IO has a special mode called multiremote:

WebdriverIO allows you to run multiple Selenium sessions in a single test. This becomes handy when you need to test application features where multiple users are required (e.g. chat or WebRTC applications). Instead of creating a couple of remote instances where you need to execute common commands like init or url on each of those instances, you can simply create a multiremote instance and control all browser at the same time.

The first thing you need to do is change the configuration of your wdio.conf.js to use multiple browsers.

exports.config = {
    // ...
    capabilities: {
        myChromeBrowser: {
            desiredCapabilities: {
                browserName: 'chrome'
            }
        },
        myFirefoxBrowser: {
            desiredCapabilities: {
                browserName: 'firefox'
            }
        }
    },
    // ...
};

With this config, every time you use the browser variable, the actions are repeated on each browser.

So, for example, this test:

    var assert = require('assert');

    describe('create article', function() {
        it('should be possible to create articles.', function() {
            browser.login('some user', 'password');

            browser.setValueSafe('#edit-title-0-value', 'My new article');
            browser.setWysiwygValue('edit-body-0-value', 'My new article body text');
        });
    });


will be executed multiple times with different browsers.

Each step of the test is executed for all the browsers defined.

Instead of using browser you can make use of the keys defined in the capabilities section of the wdio.conf.js file. Replacing browser with myFirefoxBrowser will execute the test only in the Firefox instance, allowing you to use the other browser for other types of actions.

Using the browser name, you can specify where to run each step of the test.

The Custom Command Problem

If you take a deeper look at the previous code, you will notice three special commands that are not part of the WebdriverIO API. login, setValueSafe and setWysiwygValue are custom commands that we attach to the browser object.

You can see the code of some of those commands in the drupal-elm-starter code.

The problem is – as @amitai realized some time ago – that custom commands don’t play very well with the multiremote approach. A possible solution to keep the custom commands available in all of the browsers is to use some sort of class that wraps the browser object, similar to the PageObject pattern.

An example of the code is below:

    class Page {

        constructor(browser = null) {
            this._browser = browser;
        }

        get browser() {
            if (this._browser) {
                return this._browser;
            }
            // Fallback to some browser.
            return myChromeBrowser;
        }

        visit(path) {
            this.browser.url(path);
        }

        setWysiwygValue(field_name, text) {
            this.browser.execute(
                'CKEDITOR.instances["' + field_name + '"].insertText("' + text + '");'
            );
        }

        login(user, password) {
            this.visit('/user/login');
            this.browser.setValue('#edit-name', user);
            this.browser.setValue('#edit-pass', password);
            // Submit the login form (selector assumed from Drupal's default).
            this.browser.click('#edit-submit');
        }
    }

    module.exports = Page;

So now, you have a wrapper class that you can use in your tests. You can create multiple instances of this class to access the different browsers while you are running a test.

    var assert = require('assert');
    var Page = require('../page_objects/page');

    describe('create article', function() {
        it('should be possible to create articles.', function() {
            let chrome = new Page(myChromeBrowser);
            let firefox = new Page(myFirefoxBrowser);

            chrome.login('some user', 'password');
            firefox.login('admin', 'admin');

            chrome.setValueSafe('#edit-title-0-value', 'My new article');
            chrome.setWysiwygValue('edit-body-0-value', 'My new article body text');

            // Here is where the second browser starts to work.
            // This clicks the publish button of the workflow module.

            // Once the node was published by another user in another browser,
            // you can run the final assertions.
        });
    });

What About Automated Tests?

You may also be wondering: does this work seamlessly for automated tests? And the answer is: yes. We have only tried it using the same browser version in different instances. This means that we trigger several Chrome browser instances that act as independent browsers.

If you have limitations on how many cores are available to run tests, that should not limit how many browsers you can spawn. They will just wait their turn until a core becomes available. You can read more on how we configure Travis to optimize resources.

As you can see, having multiple browsers available to run tests simplifies their structure. Even if you know that you will not need a multiremote approach at first, it may be a good idea to structure your tests using this browser wrapper, as you don’t know if you will need to refactor all of your tests to run things differently in the future.

This approach can also help refactor the ideas from one of our prior posts, Using JSON API with WebdriverIO Tests, so you don’t need to worry about logging in with the right user to make the JSON requests.

Jul 02 2018

In Drupal, you can write automated tests at different levels of complexity. If you need to test a single function or method of a class, you will probably be fine with a unit test. When you need to interact with the database, you can create kernel tests. And finally, if you need access to the final HTML rendered by the browser, or to play with some JavaScript, you can use functional tests or JavaScript tests. You can read more about this in the Drupal.org documentation.

So far this is what Drupal provides out of the box. On top of that, you can use Behat or WebDriver tests. These types of tests are usually easier to write and are closer to the user’s needs. As a side note, they are usually slower than the previous methods.

The Problem.

In Gizra, we use WebdriverIO for most of our tests. This allows us to test useful things that add value for our clients. But these sorts of tests, where you only interact with the browser output, have some disadvantages.

Imagine you want to create an article and check that this node is unpublished by default. How do you check this? Remember, you only have the browser output…

One possible way could be this: log in, visit the article creation form, fill in the fields, click submit, and then… maybe search for some unpublished class in the HTML:

    var assert = require('assert');

    describe('create article', function() {
        it('should be possible to create articles, unpublished by default', function() {
            browser.loginAs('some user');

            browser.setValueSafe('#edit-title-0-value', 'My new article');
            browser.setWysiwygValue('edit-body-0-value', 'My new article body text');
            browser.click('#edit-submit');

            // Rely on the theme printing an "unpublished" class on the node.
            browser.waitForVisible('.node-unpublished');
        });
    });



This is quite simple to understand, but it has some drawbacks.

For one, it depends on the theme to expose the status of the node. You could take another approach: instead of looking for a .node-unpublished class, you could log out from the current session and then try to visit the URL, looking for an access denied message.

Getting Low-Level Information from a Browser Test

So the problem boils down to this:

How can I get information about internal properties from a browser test?

The new age of decoupled Drupal brings an answer to this question. It could be a bit counterintuitive at first, so just try it and see if it fits your project.

The idea is to use the new modules that expose Drupal internals through JSON endpoints, and use JavaScript together with a high-level testing framework to get the info you need.

In Gizra we use WDIO to write end-to-end tests; we have some articles about this topic. We also wrote about a new module called JsonAPI that exposes all the information you need to enrich your tests.

The previous test can be rewritten: by making use of the JsonAPI module, you can get the status of a specific node by parsing a JSON document:

var assert = require('assert');

describe('create article', function() {
    it('should be possible to create articles, unpublished by default', function() {
        browser.loginAs('some user');

        browser.setValueSafe('#edit-title-0-value', 'My unique title');
        browser.setWysiwygValue('edit-body-0-value', 'My new article body text');
        browser.click('#edit-submit');

        // Use JSON API to get the internal data of the node.
        let query = '/jsonapi/node/article'
            + '?fields[node--article]=status'
            + '&filter[status]=0'
            + '&filter[node-title][condition][path]=title'
            + '&filter[node-title][condition][value]=My unique title'
            + '&filter[node-title][condition][operator]=CONTAINS';

        browser.url(query);
        browser.waitForVisible('body pre');
        let json = JSON.parse(browser.getHTML('body pre', false));

        assert.equal(false, json.data[0].attributes.status);
    });
});

In case you skipped the code, don't worry: it's quite simple to understand. Let's analyze it:

1. Create the node as usual:

This is the same as before:

browser.setValueSafe('#edit-title-0-value', 'My unique title');
browser.setWysiwygValue('edit-body-0-value', 'My new article body text');


2. Ask JsonAPI for the status of an article with a specific title:

Here you see the two parts of the request and the parsing of the data.

let query = '/jsonapi/node/article'
          + '?fields[node--article]=status'
          + '&filter[status]=0'
          + '&filter[node-title][condition][path]=title'
          + '&filter[node-title][condition][value]=My unique title'
          + '&filter[node-title][condition][operator]=CONTAINS';


3. Make assertions based on the data:

Since JsonAPI exposes, well, JSON data, you can convert the JSON into a JavaScript object and then use the dot notation to access a specific level.

This is how you can identify a section of a JSON document:

browser.url(query);
browser.waitForVisible('body pre');
let json = JSON.parse(browser.getHTML('body pre', false));
assert.equal(false, json.data[0].attributes.status);
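To make the dot notation concrete, here is a minimal sketch of the document shape JsonAPI returns (illustrative data, not a real Drupal response):

```javascript
// Illustrative JSON API document: resources live under `data`,
// and entity fields live under `attributes`.
const raw = '{"data":[{"type":"node--article","id":"article-uuid","attributes":{"status":false}}]}';

// Convert the JSON into a JavaScript object...
const doc = JSON.parse(raw);

// ...and use the dot notation to reach a specific level.
const status = doc.data[0].attributes.status;
```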

A Few Enhancements

As you can see, you can parse the output of a json request directly from the browser.

browser.waitForVisible('body pre');
let json = JSON.parse(browser.getHTML('body pre', false));

The json object now contains the entire response from JsonAPI that you can use as part of your test.

There are some drawbacks to the previous approach. First, this only works for Chrome, which wraps the raw JSON response inside an HTML document. This is the reason why you need to get the HTML from body pre.

The other problem is this somewhat cryptic section:

let query = '/jsonapi/node/article'
          + '?fields[node--article]=status'
          + '&filter[status]=0'
          + '&filter[node-title][condition][path]=title'
          + '&filter[node-title][condition][value]=My unique title'
          + '&filter[node-title][condition][operator]=CONTAINS';

The first problem can be fixed using a conditional to check which type of browser you are using to run the tests.
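A minimal sketch of such a conditional (the helper name is hypothetical, not part of WDIO):

```javascript
// Hypothetical helper: pick the selector that holds the raw JSON response,
// depending on which browser runs the tests. Chrome renders raw JSON inside
// a <pre> element, so we read from `body pre`; other drivers may differ.
function jsonSelectorFor(browserName) {
  return browserName === 'chrome' ? 'body pre' : 'body';
}

const selector = jsonSelectorFor('chrome');
```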

The second problem can be addressed using the d8-jsonapi-querystring package, which allows you to write an object that is automatically converted into a query string.
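As a hand-rolled sketch of the idea behind such a package (this is not the package's actual API, just an illustration of turning a nested object into the bracketed query string JsonAPI expects):

```javascript
// Flatten a nested object of JSON API options into a query string:
// {filter: {status: 0}} becomes filter[status]=0.
function toJsonApiQueryString(options) {
  const parts = [];
  function walk(prefix, value) {
    if (value !== null && typeof value === 'object') {
      Object.keys(value).forEach(function (key) {
        walk(prefix ? prefix + '[' + key + ']' : key, value[key]);
      });
    } else {
      parts.push(prefix + '=' + encodeURIComponent(value));
    }
  }
  walk('', options);
  return parts.join('&');
}

// The cryptic query above, expressed as a readable object.
const query = '/jsonapi/node/article?' + toJsonApiQueryString({
  fields: { 'node--article': 'status' },
  filter: {
    status: 0,
    'node-title': {
      condition: {
        path: 'title',
        value: 'My unique title',
        operator: 'CONTAINS',
      },
    },
  },
});
```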

Other Use Cases

So far, we used JsonAPI to get information about a node. But there are other things that you can get from this API. Since all configuration is exposed, you could check if some role has a specific permission. To keep the tests shorter, we skipped the describe and it sections.

browser.loginAs('some user');

let query = '/jsonapi/user_role/user_role'
          + '?filter[is_admin]=null';

browser.url(query);
browser.waitForVisible('body pre');
let json = JSON.parse(browser.getHTML('body pre', false));

json.data.forEach(function(role) {
    assert.ok(role.attributes.permissions.indexOf("bypass node access") == -1);
});

Or if a field is available in some content type, but is hidden from the end user:

browser.loginAs('some user');

let query = '/jsonapi/entity_form_display/entity_form_display?filter[bundle]=article';

browser.url(query);
browser.waitForVisible('body pre');
let json = JSON.parse(browser.getHTML('body pre', false));


Or if some specific HTML tag is allowed in an input format:

let query = '/jsonapi/filter_format/filter_format?filter[format]=filtered_html';

browser.url(query);
browser.waitForVisible('body pre');
let json = JSON.parse(browser.getHTML('body pre', false));

let tag = '';

assert.ok(json.data[0].attributes.filters.filter_html.settings.allowed_html.indexOf(tag) > -1);

As you can see, there are several use cases. The benefits of being able to explore the API by just clicking the different links sometimes make this much easier to write than a kernel test.

Just remember that this type of test is a bit slower to run, since it requires a full Drupal instance running. But if you have some continuous integration in place, it could be an interesting approach to try, at least for some specific tests.

We have found this quite useful, for example, to check that a node can be referenced by another in a reference field. To check this, you need the node ids of all the nodes created by the tests.
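A sketch of that collection step, using illustrative response data (in a real JSON API document the resources sit under data):

```javascript
// Gather the ids of every resource in a JSON API listing so they can be
// used later, e.g. to fill an entity reference field in a follow-up check.
function collectIds(document) {
  return document.data.map(function (resource) {
    return resource.id;
  });
}

// Illustrative document, shaped like a /jsonapi/node/article listing.
const ids = collectIds({
  data: [
    { type: 'node--article', id: 'uuid-1' },
    { type: 'node--article', id: 'uuid-2' },
  ],
});
```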

A tweet by @skyredwang is a fitting way to close this post:

Remember how cool Views have been since Drupal 4.6? #JSONAPI module by @e0ipso is the new "Views".

— Jingsheng Wang (@skyredwang) January 9, 2018
May 23 2018

But I just want to upload images to my site…

There is a clear difference between what a user expects from a CMS when they try to upload an image, and what they get out of the box. This is something that we hear all the time, and yet we, as a Drupal community, struggle to do it right.

There are no simple answers as to why Drupal has issues regarding media management. As technology evolves, newer and simpler tools raise the bar on what users expect to see in their apps. Take Instagram for example: an entire team of people (not just devs) is focused on making the experience as simple as possible.

Therefore it’s normal to expect this type of simplicity everywhere. However, implementing these solutions is not always trivial, as you will see.

We are working on a new project that needs some image management capabilities, and just to avoid reinventing the wheel, we looked at the solutions provided by two mature distributions for Drupal 8: Thunder and Lightning.

Thunder seemed more aligned with what we were looking for, so we tried to replicate its features on our platform. But there was a catch: Thunder is still using Media from a contrib module, and we wanted to stay as close to core as possible.

After spending a few hours replicating most of the functionality, it became evident that there are a lot of interesting concepts under the hood to explore. Just to warn you: this article is not about how to create a media management library. Instead, we will focus on understanding what we are configuring and why.

The Journey Begins.

Before Media entities were a thing, this is how a media gallery was designed with Drupal: you would basically add an image field to a content type.

A simple content type with an image field.

Drupal 8.5 introduces the concept of Media entities. This is an important concept because each time you upload media into a site you may want to associate some metadata to be able to search them later. For example, if you want to categorize images, you need to add a vocabulary to an image. The media entity acts as a bridge between your assets and the fields that enrich them.

After installing the Media module you can create Media entity types. That is, if you have images, you may want to have Media Images; the same applies to videos and audio.

We wanted to create an image gallery, therefore we created a content type called Image Gallery, which has an entity reference field that references… that’s right, media entities.

A media entity linked to a content type.

So far this is quite simple: just core modules, and two entity types connected by a reference field.

Making It Usable

The regular entity reference widget.

Now the challenge is to make this easy to use. The first step is to replace the entity reference with something much more flexible. Here is where our first contributed module comes in. Meet the Entity Browser module.

The Goal of this module is to provide a generic entity browser/picker/selector. It can be used in any context where one needs to select few entities and do something with them.

There is a great article that explains a lot of the details of this module. Let’s keep this simple so we can see the full picture. Just make sure to use the 8.x-2.0 branch that is compatible with the Media module provided by core.

The widget of the entity reference is what you are configuring.

The entity browser, as we said, allows you to replace the entity reference widget with something fancier. It also allows you to create – in place – a new media entity. But you will need an extra pair of modules to provide the fanciness you need.

The Dropzone module allows you to upload multiple media items in a single upload. One of the main differences between Media and Drupal core is that the media name is now required, so you may need some custom code to auto-populate this field in case you want to hide it.

Another module you will need is Views, which fortunately is in core now, so you don’t need to download it. The Views module is used to generate a view that lists media entities; there is a special field you need to attach to this view: the Media: Entity browser bulk select form field.

Entity Browser in action configured with Dropzone.

Customizing the Rest

So let’s recap: we have two entity types (Media Image and Image Gallery), connected by an entity reference field. The widget that we are using for the entity reference is an Entity Browser widget, which allows you not only to select existing media (by using a view) but also to upload new images using Dropzone.

If a new image is uploaded, a new Media entity is created and the image is attached to it automatically. If an image is selected, an existing entity will be referenced by the Gallery content type using the entity reference field. All these steps are handled by the Entity Browser module.

A picture is worth a thousand words.

Another feature usually expected by clients is the ability to select which part of an image is the important one. You can use the Focal Point module, which allows you to specify a focal point to focus and crop the image.

To use focal points you need to configure the widget of the image field.

If we take a look at the selection of images done by Thunder, we can see there is a green indicator that shows we chose an image. This is done by custom code. But don’t worry: if you use the same names defined by Thunder for the views and fields (as we did), you can borrow the code from media_thunder that adds the magic.

You may want to copy media_thunder/css, media_thunder/img and media_thunder/js as well.

/**
 * Implements hook_preprocess_views_view().
 */
function custom_module_media_preprocess_views_view(&$variables) {
  // The IDs of the views used as media browsers.
  $custom_module_media_browser = [
    // ...
  ];

  if (in_array($variables['view']->id(), $custom_module_media_browser)) {
    $variables['view_array']['#attached']['library'][] = 'custom_module_media/entity_browser_view';
  }
}
By default you see a lot of fields that are not relevant when you upload the image. Let’s see how we can configure the form to make it easier to use.

Configurable concepts for entity types and fields.

What you see in the image is how the media entity form mode is configured. This is the UI that you can use to hide the things you don’t need.

You may want to do the same thing with the display mode of the media entity to indicate what to show once the image is uploaded.

The thunder_admin theme provides some nice theme enhancements.

The entity browser allows you to select which display mode to use after selecting an image. That is defined in the entity reference widget settings.

Here we are configuring the entity reference widget of the Gallery content type.

But the elements rendered in the Media entity are configured as part of the display mode of the Media entity.

Display mode configuration of the Media Entity.

And here’s some good news: we are experimenting with the new Layout Builder module, and we are happy to confirm this is working fine within the media ecosystem.

Configuring the thumbnail view mode for the entity Media type.

Configuring the thumbnail view mode to show only what you need makes the form really easy to use.

Here you configure what you want to display when a field item is selected. It also works with the new Layout Builder module.


As you can see, there is a good balance between core modules and contributed modules. This is the result of several years of work by dozens of developers around the world.

There are still a lot of things that can be improved and polished. Even more, the recently committed “Allow creation of file entities from binary data via REST requests” functionality in core, and modules like JsonAPI, open the door to replacing this solution with something more decoupled from Drupal.

The trick to getting what you want is to play with the form modes and display modes of each entity involved. It’s a bit of trial and error, but you will gain a lot of understanding of the Drupal basics.

And in the future? Who knows, maybe future versions of Drupal will include this feature ready to use out of the box. Until then, have fun configuring your own set of building blocks.

May 22 2018

This is going to be a simple exercise to create a decoupled site using Drupal 8 as the backend and an Elm app in the frontend. I pursue two goals with this:

  • Evaluate how easy it will be to use Drupal 8 to create a restful backend.
  • Show a little bit how to set up a simple project with Elm.

We will implement a very simple functionality. On the backend, just a feed of blog posts with no authentication. On the frontend, we will have a list of blog posts and a page to visualize each post.

Our first step will be the backend.

Before we start, you can find all the code I wrote for this post in this GitHub repository.

Drupal 8 Backend

For the backend, we will use Drupal 8 and the JSON API module to create the API that we will use to feed the frontend. The JSON API module follows the JSON API specification and currently can be found in a contrib project. But as has been announced in the last DrupalCon “Dries”-note, the goal is to move it to an experimental module in core in the Drupal 8.6.x release.

But even before that, we need to set up Drupal in a way that is easy to version and to deploy. For that, I have chosen to go with the Drupal Project composer template. This template has become one of the standards for site development with Drupal 8 and it is quite simple to set up. If Composer is already installed, then it is as easy as this:

composer create-project drupal-composer/drupal-project:8.x-dev server --stability dev --no-interaction

This will create a folder called server with our code structure for the backend. Inside this folder, we now have a web folder, which is where we have to point our webserver. It is also inside this folder that we have to put all of our custom code. For this case, we will try to keep the custom code as minimal as possible. Drupal Project also comes with the two best friends for Drupal 8 development: drush and drupal console. If you don’t know them, Google them to find out more about what they can do.

After installing our site, we need to install our first dependency, the JSON API module. Again, this is quite easy, inside the server folder, we run the next command:

composer require drupal/jsonapi:2.x

This will accomplish two things: it will download the module and it will add it to the composer files. If we are versioning our site on git, we will see that the module does not appear on the repo, as all vendors are excluded using the gitignore provided by default. But we will see that it has been added to the composer files. That is what we have to commit.

With the JSON API module downloaded, we can move back to our site and start with site building.

Configuring Our Backend

Let’s try to keep it as simple as possible. For now, we will use a single content type that we will call blog and it will contain as little configuration as possible. As we will not use Drupal to display the content, we do not have to worry about the display configuration. We will only have the title and the body fields on the content type, as Drupal already holds the creation and author fields.

By default, the JSON API module already generates the endpoints for the Drupal entities and that includes our newly created blog content type. We can check all the available resources: if we access the /jsonapi path, we will see all the endpoints. This path is configurable, but it defaults to jsonapi and we will leave it as is. So, with a clean installation, these are all the endpoints we can see:

JSON API default endpoints

But, for our little experiment, we do not need all those endpoints. I prefer to only expose what is necessary, no more and no less. The JSON API module provides zero configurable options on the UI out of the box, but there is a contrib module that allows us to customize our API. This module is JSON API Extras:

composer require drupal/jsonapi_extras:2.x

JSON API Extras offers us a lot of options, from disabling the endpoint to changing the path used to access it, renaming the exposed fields or even the resource. Quite handy! After some tweaking, I disabled all the unnecessary resources and most of the fields from the blog content type, reducing it to just the few we will use:

JSONAPI blog resource

Feel free to play with the different options. You will see that you are able to leave the API exactly as you need.

Moving Our Configuration to Version Control

If you have experience with Drupal 7, you probably used the Features module to export configuration to code. But one of the biggest improvements of Drupal 8 is the Configuration Management Interface (CMI). This system provides a generic engine to export all configuration to YAML files. But even if this system works great, it is still not the most intuitive or easiest way to export the config. Using it as a base, there are now several options that expand the functionality of CMI and provide an improved developer experience. The two biggest players in this game are Config Split and the good old Features.

Both options are great, but I decided to go with my old friend Features (maybe because I’m used to its UI). The first step is to download the module:

composer require drupal/features:3.x

One of the really cool functionalities that the Drupal 8 version of the Features module brings is that it can instantly create an installation profile with all our custom configuration. With just a few clicks we have exported all the configuration we did in previous steps; but not only that, we have also created an installation profile that will allow us to replicate the site easily. You can read more about Features in the [official documentation on drupal.org](https://www.drupal.org/docs/8/modules/features/building-a-distribution-w...).

Now, we have the basic functionality of the backend. There are some things we should still do, such as restricting access to the backend interface to prevent login or registration to the site, but we will not cover that in this post. Now we can move to the next step: the Elm frontend.


I used Features in this project to give it a try and play a bit. If you are trying to create a real project, you might want to consider other options. Even the creators of the Features module suggest not to use it for these kinds of situations, as you can read here.

The Frontend

As mentioned, we will use Elm to write this app. If you do not know what it is, Elm is a pure functional language that compiles into JavaScript, and it is used to create reliable webapps.

Installing Elm is easy. You can build it from source, but the easiest and recommended way is to just use npm. So let’s do it:

npm install -g elm

Once we install Elm, we get four different commands:

  • elm-repl: an interactive Elm shell, that allows us to play with the language.
  • elm-reactor: an interactive development tool that automatically compiles our code and serves it on the browser.
  • elm-make: to compile our code and build the app we will upload to the server.
  • elm-package: the package manager to download or publish elm packages.

For this little project, we will mostly use elm-reactor to test our app. We can begin by starting the reactor and accessing it on the browser. Once we do that, we can start coding.

Elm Reactor

Our First Elm Program

If you wish to make an apple pie from scratch, you must first invent the universe. — Carl Sagan

We start by creating a src folder that will contain all our Elm code, and here we start the reactor with elm reactor. If we go to our browser and access http://localhost:8000, we will see our empty folder. Time to create a Main.elm file in it. This file will be the root of our codebase and everything will grow from here. We can start with the simplest of all Elm programs:

module Main exposing (main)

import Html exposing (text)

main =
    text "Hello world"

This might seem simple, but when we access the Main.elm file in the reactor, there will be some magic going on. The first thing we will notice is that we now have a page working. It is simple, but it is an HTML page generated with Elm. But that’s not the only thing that happened. In the background, elm reactor noticed we imported the Html package, created an elm-package.json file, added it as a dependency and downloaded it.

This might be a good moment to do the first commit of our app. We do not want to include the vendor packages from Elm, so we create a .gitignore file and add the elm-stuff folder there. Our first commit will include only three things: the Main.elm file, the .gitignore and the elm-package.json file.

The Elm Architecture

Elm is a language that follows a strict pattern called [The Elm Architecture](https://guide.elm-lang.org/architecture/). We can summarize it in these three simple components:

  • Model, which represents the state of the application.
  • Update, how we update our application.
  • View, how we represent our state.
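For readers more at home in JavaScript, the same loop can be sketched roughly like this (illustrative only; this is not how Elm compiles its programs):

```javascript
// Model: the state of the application.
const init = { activePage: 'BlogList' };

// Update: how we produce a new state from a message.
function update(msg, model) {
  switch (msg.type) {
    case 'NavigateTo':
      // Return a new model instead of mutating the old one.
      return Object.assign({}, model, { activePage: msg.page });
    default:
      return model;
  }
}

// View: how we represent the state (in Elm this returns virtual DOM;
// here we just return a string).
function view(model) {
  return 'Current page: ' + model.activePage;
}

const next = update({ type: 'NavigateTo', page: 'Blog' }, init);
```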

Given our small app, let’s try to represent our code with this pattern. Right now, our app is static and has no functionality at all, so there is not a lot to do. But, for example, we could start by moving the text we show on the screen to the model. The view will be the content we have in our main function, and as our page has no functionality, the update will do nothing at this stage.

type alias Model
    = String

model : Model
model = "Hello world"

view : Model -> Html Msg
view model =
    text model

main =
    view model

Now, for our blog, we need two different pages. The first one will be the listing of blog posts and the second one, a page for the individual post. To simplify, let’s keep the blog entries as just a string for now. Our model will evolve into a list of Posts. In our state, we also need to store which page we are on. Let’s create a variable to store that information and add it to our model:

type alias Model =
    { posts : List Post
    , activePage : Page
    }

type alias Post
    = String

type Page
    = BlogList
    | Blog

model : Model
model =
    { posts = [ "First blog", "Second blog" ]
    , activePage = BlogList
    }

And we need to update our view too:

view : Model -> Html Msg
view model =
    div []
        (List.map viewPost model.posts)

viewPost : Post -> Html Msg
viewPost post =
    div []
        [ text post ]

We now have the possibility to create multiple pages! We can create our update function, which will modify the model based on the different actions we perform on the page. Right now, our only action will be navigating the app, so let’s start there:

type Msg
    = NavigateTo Page

And now, our update will update the activePage of our model, based on this message:

update : Msg -> Model -> (Model, Cmd Msg)
update msg model =
    case msg of
        NavigateTo page ->
            ( {model | activePage = page}, Cmd.none )

Our view should be different now depending on the active page we are viewing:

view : Model -> Html Msg
view model =
    case model.activePage of
        BlogList ->
            viewBlogList model.posts

        Blog ->
            div [] [ text "This is a single blog post" ]

viewBlogList : List Post -> Html Msg
viewBlogList posts =
    div []
        (List.map viewPost posts)

Next, let’s wire the update into the rest of the code. First, we fire the message to change the page from the views:

viewPost : Post -> Html Msg
viewPost post =
    div
        [ onClick <| NavigateTo Blog ]
        [ text post ]

And as a last step, we replace the main function with a more complex function from the Html package (but still a beginner program):

main : Program Never Model Msg
main =
    Html.beginnerProgram
        { model = model
        , view = view
        , update = update
        }
But we still have not properly represented the single blogs on their individual pages. We will have to update our model once again along with our definition of Page:

type alias Model =
    { posts : Dict PostId Post
    , activePage : Page
    }

type alias PostId =
    Int

type Page
    = BlogList
    | Blog PostId

model : Model
model =
    { posts = Dict.fromList [ ( 1, "First blog" ), ( 2, "Second blog" ) ]
    , activePage = BlogList
    }

And with some minor changes, we have the views working again:

view : Model -> Html Msg
view model =
    case model.activePage of
        BlogList ->
            viewBlogList model.posts

        Blog postId ->
            div
                [ onClick <| NavigateTo BlogList ]
                [ text "This is a single blog post" ]

viewBlogList : Dict PostId Post -> Html Msg
viewBlogList posts =
    div []
        (Dict.map viewPost posts |> Dict.values)

viewPost : PostId -> Post -> Html Msg
viewPost postId post =
    div
        [ onClick <| NavigateTo <| Blog postId ]
        [ text post ]

We do not see any change on our site yet, but we are ready to replace the placeholder text of the individual pages with the content from the real Post. And here comes one of the cool functionalities of Elm, and one of the reasons why Elm has no runtime exceptions. We have a postId and we can get the Post from the list of posts we have in our model. But when getting an item from a Dict, we always risk trying to get a non-existing item. Calling a function on this non-existing item usually causes errors, like the infamous undefined is not a function. In Elm, if a function may or may not return a value, it returns a special type called Maybe.

view : Model -> Html Msg
view model =
    case model.activePage of
        BlogList ->
            viewBlogList model.posts

        Blog postId ->
            let
                -- This is our Maybe variable. It could be annotated as `Maybe Post` or a full definition as:
                -- type Maybe a
                --   = Just a
                --   | Nothing
                post =
                    Dict.get postId model.posts
            in
                case post of
                    Just aPost ->
                        div
                            [ onClick <| NavigateTo BlogList ]
                            [ text aPost ]

                    Nothing ->
                        div
                            [ onClick <| NavigateTo BlogList ]
                            [ text "Blog post not found" ]

Loading the Data from the Backend

We have all the functionality ready, but we have to do something else before loading the data from the backend. We have to update our Post definition to match the structure of the backend. On the Drupal side, we left a simple blog data structure:

  • ID
  • Title
  • Body
  • Creation date

Let’s update the Post, replacing it with a record containing those fields. After the change, the compiler will tell us where else we need to adapt our code. For now, we will not care about dates and we will just treat the created field as a string.

type alias Post =
    { id : PostId
    , title : String
    , body : String
    , created : String
    }

model : Model
model =
    { posts = Dict.fromList [ ( 1, firstPost ), ( 2, secondPost ) ]
    , activePage = BlogList
    }

firstPost : Post
firstPost =
    { id = 1
    , title = "First blog"
    , body = "This is the body of the first blog post"
    , created = "2018-04-18 19:00"
    }

Then, the compiler shows us where we have to change the code to make it work again:

Elm compiler helps us find the errors
-- In the view function:
case post of
    Just aPost ->
        div []
            [ h2 [] [ text aPost.title ]
            , div [] [ text aPost.created ]
            , div [] [ text aPost.body ]
            , a [ onClick <| NavigateTo BlogList ] [ text "Go back" ]
            ]

-- And improve a bit the `viewPost`, becoming `viewPostTeaser`:
viewBlogList : Dict PostId Post -> Html Msg
viewBlogList posts =
    div []
        (Dict.map viewPostTeaser posts |> Dict.values)

viewPostTeaser : PostId -> Post -> Html Msg
viewPostTeaser postId post =
    div
        [ onClick <| NavigateTo <| Blog postId ]
        [ text post.title ]

As our data structure now reflects the data model we have on the backend, we are ready to import the information from the web service. For that, Elm offers us a system called Decoders. We will also add a contrib package to simplify our decoders:

elm-package install NoRedInk/elm-decode-pipeline

And now, we add our Decoder:

postListDecoder : Decoder PostList
postListDecoder =
    dict postDecoder

postDecoder : Decoder Post
postDecoder =
    decode Post
        |> required "id" string
        |> required "title" string
        |> required "body" string
        |> required "created" string
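If decoders are new to you, the underlying idea can be sketched in JavaScript: instead of trusting the parsed JSON blindly, validate its shape before it enters the program (a hypothetical helper mirroring the fields our postDecoder expects):

```javascript
// Hypothetical JavaScript analogue of postDecoder: fail loudly if the raw
// object does not carry the fields the Post record expects.
function decodePost(raw) {
  const required = ['id', 'title', 'body', 'created'];
  const missing = required.filter(function (field) {
    return !(field in raw);
  });
  if (missing.length > 0) {
    throw new Error('Decoding failed, missing fields: ' + missing.join(', '));
  }
  return { id: raw.id, title: raw.title, body: raw.body, created: raw.created };
}

const post = decodePost({
  id: 1,
  title: 'First blog',
  body: 'This is the body of the first blog post',
  created: '2018-04-18 19:00',
});
```

In Elm, this failure path is not an exception but an explicit value the compiler forces you to handle.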

As our data will now come from a request, we need to update our Model again to represent the different states a request can have:

type alias Model =
    { posts : WebData PostList
    , activePage : Page
    }

type WebData data
    = NotAsked
    | Loading
    | Error
    | Success data

In this way, the Elm language will protect us, as we always have to consider all the different ways the data request can fail. We now have to update our view to work based on this new state:

view : Model -> Html Msg
view model =
    case model.posts of
        NotAsked ->
            div [] [ text "Loading..." ]

        Loading ->
            div [] [ text "Loading..." ]

        Success posts ->
            case model.activePage of
                BlogList ->
                    viewBlogList posts

                Blog postId ->
                    let
                        post =
                            Dict.get postId posts
                    in
                        case post of
                            Just aPost ->
                                div []
                                    [ h2 [] [ text aPost.title ]
                                    , div [] [ text aPost.created ]
                                    , div [] [ text aPost.body ]
                                    , a [ onClick <| NavigateTo BlogList ] [ text "Go back" ]
                                    ]

                            Nothing ->
                                div
                                    [ onClick <| NavigateTo BlogList ]
                                    [ text "Blog post not found" ]

        Error ->
            div [] [ text "Error loading the data" ]
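The same four request states can be sketched as a JavaScript tagged union (illustrative only; the point is that Elm checks at compile time that every branch is handled):

```javascript
// Illustrative tagged union for the request lifecycle. In Elm, the compiler
// rejects any `case` expression that forgets one of these branches.
function renderPosts(webData) {
  switch (webData.state) {
    case 'NotAsked':
    case 'Loading':
      return 'Loading...';
    case 'Success':
      return webData.data.length + ' posts loaded';
    case 'Error':
      return 'Error loading the data';
  }
}

const rendered = renderPosts({ state: 'Success', data: ['First blog', 'Second blog'] });
```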

We are ready to decode the data; the only thing left is to do the request. Most of the requests on a site happen when clicking a link (usually a GET) or when submitting a form (POST / GET). Then, when using AJAX, we do requests in the background to fetch data that was not needed when the page was first loaded, but is needed afterwards. In our case, we want to fetch the data at the very beginning, as soon as the page is loaded. We can do that with a command or, as it appears in the code, a Cmd:

fetchPosts : Cmd Msg
fetchPosts =
    let
        -- The URL of the backend endpoint.
        url =
            "..."
    in
        Http.send FetchPosts (Http.get url postListDecoder)

But we have to use a new program function to pass the initial commands:

main : Program Never Model Msg
main =
    Html.program
        { init = init
        , view = view
        , update = update
        , subscriptions = subscriptions
        }

Let’s forget about the subscriptions, as we are not using them:

subscriptions : Model -> Sub Msg
subscriptions model =
    Sub.none

Now, we just need to update our initial data; our init variable:

model : Model
model =
    { posts = NotAsked
    , activePage = BlogList
    }

init : ( Model, Cmd Msg )
init =
    ( model
    , fetchPosts
    )

And this is it! When the page is loaded, the program will use the command we defined to fetch all our blog posts! Check it out in the screencast:

Screencast of our sample app

If at some point, that request is too heavy, we could change it to just fetch titles plus summaries or just a small amount of posts. We could add another fetch when we scroll down or we can fetch the full posts when we invoke the update function. Did you notice that the signature of the update ends with ( Model, Cmd Msg )? That means we can put commands there to fetch data instead of just Cmd.none. For example:

update : Msg -> Model -> ( Model, Cmd Msg )
update msg model =
    case msg of
        NavigateTo page ->
            let
                command =
                    case page of
                        Blog postId ->
                            fetchPost postId

                        BlogList ->
                            Cmd.none
            in
                ( { model | activePage = page }, command )

But let’s leave all of this implementation for a different occasion.

And that’s all for now. I might have missed something, as the frontend part grew a bit more than I expected, but check the repository, as the code there has been tested and is working fine. If you have any questions, feel free to add a comment and I will try to reply as soon as I can!

End Notes

I did not dwell too much on the syntax of Elm, as there is already plenty of documentation on the official page. The goal of this post is to understand how a simple app is created from the very start, and to see a simple example of the Elm Architecture.

If you try to follow this tutorial step by step, you may find an issue when trying to fetch the data from the backend while using elm-reactor. I had that issue too; it is a browser defense against cross-site request forgery. If you check the repo, you will see that I replaced the default function for GET requests, Http.get, with a custom function to prevent this.

I also didn’t add any CSS styling because the post would be too long, but you can find plenty of information on that elsewhere.

Feb 16 2018

Chances are that you already use Travis or another cool CI to execute your tests, and everyone politely waits for the CI checks before even thinking about merging, right? More likely, waiting your turn becomes a pain, and you click merge anyway: it’s a trivial change and you need it now. If this happens often, it’s the responsibility of those who maintain the scripts Travis crunches to make some changes. There are some trivial and not-so-trivial options to make the team always willing to wait for completion.

This blog post is for you if you have a project with Travis integration and you’d like to maintain and optimize it, or you’re just curious about what’s possible. Users of other CI tools, keep reading - many areas may apply in your case too.

Unlike other performance optimization areas, doing before-after benchmarks is not so crucial here, as Travis already collects the data for you - you just have to do the math and present the numbers proudly.


To start, if your .travis.yml lacks the cache: directive, then the easiest place to start is caching dependencies. For a Drupal-based project, it’s a good idea to cache all the modules and libraries that must be downloaded to build the project (it uses a build system, doesn’t it?). So even a simple variant like:

cache:
  directories:
    - $HOME/.composer/cache/files

or for Drush

cache:
  directories:
    - $HOME/.drush/cache

It’s explained well in the verbose documentation at travis-ci.com. Before your script is executed, Travis automatically populates the cache directories from a previous successful build. If your project has only a few packages, it won’t help much, and it can actually make things slower. What’s critical is to cache materials that are slow to generate but fast to download. Caching one large ZIP file would not make sense, for example; caching many small files fetched from multiple origin servers is more beneficial.

From this point, you could just read the standard documentation instead of this blog post, but we also have icing on the cake for you. A Drupal installation can take several minutes, initializing all the modules, executing the logic of the install profile and so on. Travis is kind enough to provide a bird’s-eye view on what eats up build time:

Execution speed measurements built in the log

Mind the bottleneck when making a decision on what to cache and how.

For us, it means caching the installed, initialized Drupal database and the full document root. Cache invalidation is hard - we can’t change that - but this turned out to be a good compromise between complexity and execution speed gain; check our examples:
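As a hedged sketch of the pattern (the paths are made up and the drush invocations are left as comments, since the real scripts depend on the project):

```shell
# Sketch of the DB-cache pattern: reuse a cached SQL dump when Travis
# restored one from a previous build, otherwise do the slow install
# and create the dump for the next build.
CACHE_DIR="$HOME/.drupal-db-cache"   # listed under cache: directories:
mkdir -p "$CACHE_DIR"

if [ -f "$CACHE_DIR/db.sql" ]; then
  CACHE_STATE="hit"
  echo "Cache hit - importing the dump instead of installing Drupal."
  # drush sql-cli < "$CACHE_DIR/db.sql"
else
  CACHE_STATE="miss"
  echo "Cache miss - installing Drupal and saving a dump for next time."
  # drush site-install standard -y && drush sql-dump > "$CACHE_DIR/db.sql"
fi
```

The dump file itself must live in a directory listed under the cache: directive, so Travis persists it between builds.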

Do your homework and cache whatever is the most resource-consuming to generate - an SQL database, built source code, or a compiled binary; Travis is here to assist with that.

Software Versions

There are two reasons to pay attention to software versions.

Use Pre-installed Versions

Travis uses containers of different distributions. Let’s say you use trusty, the default one these days: if you choose PHP 7.0.7, it’s pre-installed; if you choose 7.1, it needs to be fetched separately, and that takes time for every single build. When you have production constraints, matching the production version is almost certainly more important, but in some cases, using a pre-installed version can speed things up.

Moreover, let’s say you prefer MariaDB over MySQL: don’t sudo and install it with the package manager, as there is an add-on system to make it available. The same goes for Google Chrome, and so on. Stick to what’s inside the image already if you can, and exploit what Travis can fetch via the YML definition!
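For instance, a sketch of the add-on approach in .travis.yml (the versions here are illustrative) might look like:

```yaml
addons:
  # Travis provisions these before the build scripts run,
  # so no manual apt-get/sudo steps are needed.
  mariadb: '10.2'
  chrome: stable
```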

Use the Latest and (or) Greatest

If you ever read an article about the performance gain from migrating to PHP 7, you sense the importance of selecting the versions carefully. If your build is PHP-execution heavy, fetching PHP 7.2 (it’s another leap, but mind the backward incompatibilities) could totally make sense and it’s as easy as can be after making your code compatible:

language: php
php:
  - '7.2'

Almost certainly, a similar thing could be written about Node.js, relational databases, etc. If you know what the bottleneck in your build is and find the best-performing versions – newer or older – it will improve your speed. Does that conflict with the previous point about pre-installed versions? Not really; just measure which one helps your build the most!

Make it Parallel

When a Travis job is running, 2 cores and 4 GBytes of RAM are available – that’s something to rely on! Downloading packages should happen in parallel. drush make, gulp, and other tools like that might use it out of the box: check your parameters and config files. However, on a higher level, let’s say you’d like to execute a unit test and a browser-based test as well. You can ask Travis to spin up two (or more) containers concurrently. In the first, you can install the unit testing dependencies and execute them; the second one can take care of only the functional test. We have a fine-grained example of this approach in our Drupal-Elm Starter, where 7 containers are used for various testing and linting. In addition to the great reduction in execution time, the benefit is that the result is also more fine-grained: instead of a single boolean value, just by checking the build you have an overview of what might be broken.
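As a hedged sketch (the variable and script names are made up), such a build matrix could look like:

```yaml
env:
  # Each line spawns a separate container, all running concurrently.
  - TEST_SUITE=wdio
  - TEST_SUITE=unit
  - TEST_SUITE=lint

script:
  # A single dispatcher script picks the right suite per container.
  - ./ci-scripts/test.sh "$TEST_SUITE"
```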

All in all, it’s a warm fuzzy feeling that Travis is happy to create so many containers for your humble project:

If it's independent, no need to serialize the execution

Utilize RAM

The available memory is currently between 4 and 7.5 GBytes, depending on the configuration, and it should be used as much as possible. One example is moving the database’s main working directory to a memory-based filesystem. For many simpler projects that’s absolutely doable, and at least for Drupal, a solid speedup. Needless to say, we have an example, and on client projects we saw a 15-30% improvement in SimpleTest execution. For a traditional RDBMS, you can give it a try; if your DB cannot fit in memory, you can still ask InnoDB to fill memory.
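A minimal sketch of the tmpfs trick in .travis.yml, assuming the stock Ubuntu image layout (the paths may differ on your setup):

```yaml
before_install:
  # Move MySQL's data directory onto the RAM-backed /dev/shm,
  # then point the old path at the new location.
  - sudo service mysql stop
  - sudo mv /var/lib/mysql /dev/shm/mysql
  - sudo ln -s /dev/shm/mysql /var/lib/mysql
  - sudo service mysql start
```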

Think about your use case – even moving the whole document root there could be legitimate. Also if you need to compile a source code, doing it there makes sense as well.

Build Your Own Docker Image

If your project is really exotic or a legacy one, it potentially makes sense to maintain your own Docker image and then download and execute it in Travis. We did this in the past and later moved away from it. Maintaining your own image means recurring effort - fighting outdated versions and unavailable dependencies is what to expect. Still, it can be a kind of performance optimization if you have lots of software dependencies that are hard to install on the current Travis container images.

+1 - Debug with Ease

To work on various improvements in the Travis integration for your projects, it’s a must to spot issues quickly. What worked on localhost might or might not work on Travis – and you should know the root cause quickly.

In the past, we advocated video recording; now I’d recommend something else. You have a web application: for all the backend errors there’s a tool to access the logs - with Drupal, you can use Drush. But what about the frontend? Headless Chrome is neat and has built-in debugging capability, and the best part is that you can break out of the box using Ngrok. Without any X11 forwarding (which is not available) or a local hack that tries to mimic Travis, you can play with your app running in the Travis environment. All you need to do is execute a debug build, run the installation part (travis_run_before_install, travis_run_install, travis_run_before_script), start Headless Chrome (google-chrome --headless --remote-debugging-port=9222), download Ngrok, start a tunnel (ngrok http 9222), visit the exposed URL from your local Chrome, and have fun with inspection, the debugger console, and more.


Working on such improvements has benefits of many kinds. The entire development team can enjoy the shorter queues and faster merges, and you can go ahead and apply part of the enhancements to your local environment, especially if you dig deep into database performance optimization and make things parallel. Even more, clients love to hear that you are going to speed up their sites, as this mindset should also be applied in production.

Jan 02 2018

I tell my kids all the time that they can’t have both - whether it’s ice cream and cake or pizza and donuts - and they don’t like it. It’s because kids are uncorrupted, and their view of the world is pretty straightforward - usually characterized by a simple question: why not?

And so it goes with web projects:

Stakeholder: I want it to be like [insert billion dollar company]’s site where the options refresh as the user makes choices.

Me: [Thinks to self, “Do you know how many millions of dollars went into that?”] Hmm, well, it’s complicated…

Stakeholder: What do you mean? I’ve seen it in a few places [names other billion dollar companies].

Me: [Gosh, you know, you’re right] Well, I mean, that’s a pretty sophisticated application, and well, your current site is Drupal, and well, Drupal is in fact really great for decoupled solutions, but generally we’d want to redo the whole architecture… and that’s kind of a total rebuild…

Stakeholder: [eyes glazed over] Yeah, we don’t want to do that.

But there is a way.

Have your cake and eat it too - ©Leslie Fay Richards (CC BY 2.0)

Elm in Drupal Panels

Until recently, we didn’t have a good, cost-effective way of plugging a fancy front-end application into an existing Drupal site. The barrier to develop such a solution was too high given the setup involved and the development skills necessary. If we were starting a new project, it would be a no-brainer. But in an “enhancement” scenario, it was never quite worth the time and cost. Over time, however, our practiced approach to decoupled solutions has created a much lower barrier of entry for these types of solutions.

We now have a hybrid approach, using Elm “widgets” nested inside of a Drupal panel. Our reasons for using Elm - as opposed to some of the other available front-end frameworks (React, Angular, Ember) - are well-documented, but suffice it to say that Drupal is really good at handling content and its relationships, and Elm is really good at displaying that content and allowing a user to interact with it. But add to that the fact that Elm has significant guarantees (such as no runtime exceptions) and gives us the ability to do unit tests that we could never do with jQuery or Ajax, and all of a sudden, we have a solution that is not only more slick, but more stable and cost efficient.

A nifty registration application with lots of sections and conditional fields built as an Elm widget inside a Drupal Panel. Also shown, the accompanying tests that ensure the conditions yield the proper results. Oh how we miss the Drupal Form API!

In these cases, where our clients have existing Drupal websites that they don’t want to throw away, and big ideas for functionality that their users have come to expect, we can now deliver something better. This is groundbreaking in particular for our non-profit clients, as it gives them an opportunity to have “big box” functionality at a more affordable price point. Our clients can have their proverbial cake and eat it too.

What’s more, it helps us drive projects even further using our “Gizra Way” mindset: looking at a project as the sum of its essential parts. Because - in these scenarios - we don’t need to use Drupal for everything (and likewise, we don’t need to use Elm for everything either), we can pick and choose between them, and mix and match our approach depending upon what a particular function requires. In a way, we can ask: would this piece work nicely as a single-page application (SPA)? Yes? Let’s drop an Elm widget into a Panel. Is this part too tied to the other Drupal parts of the apparatus? Fine, let’s use Drupal.

Building a Summer Planner

FindYourSummer.Org is a program operated by the Jewish Education Project in New York (jointly funded by UJA-Federation of New York and the Jim Joseph Foundation) and is dedicated to helping teens find meaningful Jewish summer experiences. They have amassed a catalogue of nearly 400 summer programs, and when they decided to expand their Drupal site into a more sophisticated tool for sorting options, comparing calendars, and sharing lists, the expected functionality exceeded Drupal’s ability to deliver.

Separating out the functional components into smaller tasks helped us to achieve what we needed without going for a full rebuild.

For instance, some of the mechanisms we left to Drupal entirely:

Adding a program to the summer planner (an action similar to adding an item to a shopping cart) is a well-known function on commerce sites, and in Drupal it can be handled by Ajax pretty well. Just provide an indication that the item was added, increment the shopping cart, and we’re all set.

Ajax does just fine for adding a program to the summer planner.

The new feature set also required more prompts for users to login (because only a logged-in user can share their programs), and again, Drupal is up for the task. Dropping the user login/registration form into a modal provides a sophisticated and streamlined experience.

Login prompts provided by a Ctools Modal lets a user know that to continue using the planner, they need to login. A key performance indicator for the project was to increase signups in order to track users who actually register for summer experiences.

When a user gets into the planner (the equivalent of the shopping cart on a commerce site), the team had big ideas for how users would interact with the screen: things like adding dates and labels, sharing programs with friends and families, and removing items from the planner altogether.

Drupal could certainly handle those actions, but given the page refreshes that would be needed, the resulting interface would be sluggish, prone to error, and not at all in line with users’ expectation of a modern “shopping” experience. But, because we could define all of the actions that we wanted on one screen, we began to think of the “cart” page as a SPA. As such, it was a perfect opportunity to use Elm inside a Panel and provide a robust user experience.

Fast and stable (and well-tested) performance, slick user experience, and cost efficiency using Elm.

While the biggest benefit to the user is the greatly enhanced interaction, perhaps the biggest benefit to the client was the cost. Handling this feature with an Elm application was only marginally more costly than it would have been in Drupal alone. The most significant extra development is providing the necessary data to the Elm app via RESTful endpoints. Everything else - from the developer experience perspective - is vastly improved, because Elm is so much easier to deal with and provides so many guarantees.

Elm Apps Everywhere!

Maybe not. Sometimes - with new projects or in cases where the functionality can’t be boiled down into a single page - it’s more beneficial to start fresh with a fully decoupled solution. In these cases though, where there’s an existing Drupal site, and the functionality can be easily segmented, projects can have it both ways. It’s not surprising that we’ve been using this technique quite a bit lately, and as we get more adept, it only means the barrier to cost effectiveness is getting lower.

Dec 25 2017

If you happen to know Brice - my colleague and Gizra’s CEO - you probably have picked up that he doesn’t get rattled too easily. While I find myself developing extremely annoying ticks during stressful situations, Brice is a role model for stoicism.

Combine that with the fact that he knows I dislike speaking on the phone, let alone at 6:53pm, almost two hours after my work day is over, you’d probably understand why I was surprised to get a call from him. “Surprised” as in, immediately getting a stomach ache.

The day I got that call from him was a Sale day. You see, we have this product we’ve developed called ״Circuit Auction״, which allows auction houses to manage their catalog and run live, real-time, auction sales - the “Going once, Going twice” type.

- “Listen Bruce,” (that’s what I call him) “I’m on my way to working out. Did something crash?” I don’t always think that the worst has happened, but you did just read the background.
- “No.”

I was expecting a long pause. In a way, I think he kind of enjoys those moments, where he knows I don’t know if it’s good or bad news. In a way, I think I actually do somehow enjoy them myself. But instead he said, “Are you next to a computer?”

- “No. I’m in the car. Should I turn back? What happened?”

I really hate to do this, but in order for his next sentence to make sense I have to go back exactly 95 years, to 1922 Tokyo, Japan.

Professor Albert Einstein was visiting there, and the story goes that he scribbled a note in German and handed it to a bellboy after he did not have cash for a tip:

“A calm and modest life brings more happiness than the pursuit of success combined with constant restlessness,” it reads.

I wonder if it’s really the way it went. I’d like to believe it is. Seriously, just imagine that event!

Anyway, back to late October of 2017. Professor Einstein is long dead. The bellboy, even if still alive, is surely no longer a bellboy. Me, in my car, waiting for the light to turn Green - it’s either a left to go workout, or a u-turn back home. And the note. The note!

That note was up for sale that day. The opening price was $2,000, and it was estimated to be sold between $5,000 to $8,000.

- “It’s just passed a million dollars!”

That’s what he said next. Mind the exclamation mark. Brice almost never pronounces it, but this time I could swear I heard it. Heck, if we were next to each other we might have ended up hugging and crying together, and marvelling at how something we’ve created ended up selling a note for $1.6M!

Yes, the same note that reads “A calm and modest life brings more happiness than the pursuit of success combined with constant restlessness” was finally purchased after a hectic thirty-minute bidding war for lots and lots of money. I always enjoy good irony as much as I enjoy a good story. And by the way - it totally happened.

Screenshot of the live sale

We’re now launching a new version of the webapp. It has Headless Drupal in the backend, Elm in the client, and it’s sprinkled with Pusher and Serverless for real-time response.


Even after almost three years, Elm doesn’t cease to amaze me. I honestly don’t get why people are still directly JSing without at least TypeScript, to get a taste of something better and move on to a better solution. For our needs, Elm is definitely the right solution. If rewriting 60 files with zero bugs once it compiles doesn’t impress you, then probably nothing I’ll present here will.

There are many advantages to Elm, and one of the bigger ones is how we can help the compiler help us using types. Here’s an example of how we model the notion of an Item status. When selling an item it transitions through many different states. Is it open for sale? Is it the currently selected item? Was it withdrawn by the auctioneer? Is it Active, Going, Gone?

Below is our way of telling the compiler what the allowed states are. You cannot have a Going status while the Item is actually Withdrawn, as that would be an “impossible state”. Making impossible states unrepresentable is the holy grail of webapps: if you don’t allow certain states to happen, you simply don’t have to think about certain edge cases or bugs, as they cannot be written!
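As a hedged sketch (the type and constructor names are illustrative, not the actual Circuit Auction code), such a status could be modeled like this:

```elm
-- Illustrative only: a Withdrawn item has no way to carry a sale
-- state, so "Withdrawn but Going" simply cannot be constructed.
type SaleState
    = Active
    | Going
    | Gone


type ItemStatus
    = OpenForSale SaleState
    | Selected SaleState
    | Withdrawn
```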


We decided to go with a super complex Drupal 8 setup. The kind that you at home probably don’t have, and never will. It’s a super secret branch that …

No, just kidding. It’s Drupal 7, with RESTful 1.x and just the custom code we need, along with some key modules such as Entity API, Message, and Features - and it is all tested with a large number of SimpleTests.

Here is a short Q&A to questions no one really asked me, probably because I offer an answer before they are asked:

Q: Why not Drupal 8?
A: Could you also ask me why not Haskell? It would be easier to answer both these questions together.

Q: Why not Haskell?
A: Great questions! I’ll start with the latter. We’ve been dabbling with Haskell for some time now, and after doing Elm for so long we can definitely appreciate the language. However, our team was missing two important things: experience and mastery.

I think that oftentimes, in the (silly) arguments about which framework or system is best, we are presented only with the language’s features. But we also need to take experience into account. After 10 years with Drupal, there are very few problems we haven’t encountered, and we have had a chance to iterate on and improve those solutions. We have the manpower at Gizra that is most experienced with Drupal, so scaling the dev team is easier. Combine it with a big ecosystem - such as Pantheon for hosting and Blackfire.io integrated into our CI to prevent performance regressions - and Drupal was, in the end, the correct choice budget-wise.

So back to Drupal 8. I’ve never been too shy with my opinion of Drupal 8. It’s probably the best CMS out there, but from my Drupal 7 perspective and current needs, it doesn’t offer anything that is much better. For example, we’ve never had config problems in Drupal 7 we couldn’t solve, so Drupal 8’s config feature that everybody raves about isn’t that appealing. Also, Drupal 8’s DX is indeed better, but at the cost of way more complexity. In the end, the way I see it - if you scratch Drupal 8 in some part, you will find Drupal 7 buried underneath.

So Drupal 8 is, on one hand, not so far from Drupal 7, and on the other, not radically different enough to be worth the learning curve.

Don’t get me wrong, we do develop on Drupal 8 for clients at Gizra. But for new projects, we still recommend starting with Drupal 7. And for non-CMS projects (similar to our Circuit Auction webapp), we’re looking to start using Yesod - a Haskell framework.

If I had to choose one topic I’m proud of in this project on the Drupal side, I’d have to pick our attention to docs and automatic testing.

The PHPDocs are longer than the actual code

CI testing is extensive

Static data with S3 & Lazy Loading with Pusher

Drupal with PHP 7 isn’t slow; it actually performs quite well. But it is probably not as scalable as what we’d get from Haskell. Yet even if we went with a super fast solution, we realized that all clients hitting the server at the same time could and should be avoided.

As we’re dealing with a real-time app, we know all the bidders are looking at the same view – the live sale. So, instead of having to load the items ahead of time, we’ve taken a different path. The item is actually divided into static and dynamic info. The static info holds the item’s name, uuid, image, description, etc. We surely can generate it once, upload it to S3, and let Amazon take the hit for pennies.

As for the calculated data (minimum price, starting price, etc.), Drupal serves it via RESTful. However, the nifty feature we’ve added is that once a Clerk hits the Gone button on an item and the sale jumps to the next item, we don’t let the clients ask the server for the item; rather, the server sends a single message to Pusher, which in turn distributes it to all the clients. Again, Drupal is not taking the hit, and the cost is low.

It’s actually a bit more complicated than that, as different people should see slightly different data. A winning bidder should see different info - for example, a “You are the highest bidder” message - but that winning bidder can change any second, so caching or Varnish wouldn’t cut it. Instead, we’re using Pusher’s public and private channels and make sure to send the right messages to the right people. It’s working really fast, and the Drupal server stays calm.

Keen.io & Serverless

We’re using Keen.io to get analytics. It’s pretty exciting to see the reactions of the clients - the auction house owners - when we tell them about this, because they can suddenly start getting answers to questions they didn’t know they could ask.

- “Which user hovered over the Place Bid button but didn’t press it?”
- “Who used the carousel, and to which item did they scroll?”
- “Do second time bidders bid more?”
- “When do most bidders join the sale?”

Keen.io is great, since it allows us to create analytics dashboards per auction house, without giving any of them access to other auction houses’ data.

Showing the number of hovers over the `Place bid` button a user did

Serverless is useful when we want to answer some of those questions in real time. That is, the question “Which user hovered over the Place Bid button but didn’t press it?” is useful for some post-sale conclusions, but what if we wanted to ask “Which user is currently hovering over the Place Bid button?”, so the auctioneer could stall the sale, giving a hesitating bidder a chance to place their bid?

Even though the latency of Keen is quite low (about 30 seconds), it’s not good enough for a real-time experience – certainly when each item’s sale can last less than a minute. This is where Serverless comes in. It acts as a proxy server: each client sends MouseIn and MouseOut events, and Serverless is responsible for broadcasting them via Pusher to the auctioneers’ private channel.
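As a hedged sketch (the channel and event names are made up, and the injected client stands in for the real Pusher SDK), such a proxy handler might look like:

```javascript
// Illustrative Lambda-style proxy: relay hover events from bidders'
// browsers to the auctioneers' private channel, bypassing Drupal.
// The Pusher client is injected so the routing logic stays testable.
function makeHoverProxy(pusher) {
  return function handler(event) {
    const { type, userId, itemId } = JSON.parse(event.body);
    if (type !== 'MouseIn' && type !== 'MouseOut') {
      // Reject anything that is not a hover event.
      return { statusCode: 400, body: 'unknown event type' };
    }
    // Broadcast to the auctioneers; the Drupal server is never hit.
    pusher.trigger('private-auctioneers', 'bid-button-hover', {
      type: type,
      userId: userId,
      itemId: itemId,
    });
    return { statusCode: 200, body: 'ok' };
  };
}
```

Injecting the Pusher client also means the handler can be exercised with a mock, without real credentials.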

Setting up Serverless was lots of fun, and knowing we need to give zero thought to the infrastructure - along with its low cost - made it fit nicely into our product.

Jun 01 2017

In case you haven’t noticed, we are on the verge of a new era - the bot era!!

[Dramatic music in the background]

Sort of… Mark Zuckerberg showed us what it could really be. Though we need to understand that, for now, bots are more like what Dries showed us in his keynote at DrupalCon Baltimore.

Ro-bot 101

So what is a bot? If you’ve ever said Hey Siri, OK Google, Hey Cortana, or Alexa, then you have interacted with a bot. That bot received an input and returned an output - it woke you up, reminded you of something, sent an email, or ordered a new book from Amazon - and if you’re lucky, a drone delivered it!

Platforms and technologies

Now that we understand what a bot is, let’s see how we can write one. Every chat platform provides bot integrations, and there are two ways for bots to communicate:

  1. Web sockets - Much like pushing events to a user’s browser, the platform provides a web socket channel and pushes events when something happens in a channel your bot is listening to.
  2. Web hooks - In this case, you get HTTP requests carrying the information about each event.

Slack, Facebook Messenger, Skype, and Telegram all provide integrations with bots in one way or another, but in this post we will focus on Slack.

Writing Your First Bot is Easier than You Think

First, create a bot in your team. You can do it under http://yourteam.slack.com/apps by creating a custom bot integration.

Next you’ll need a library. There are something like a quadrillion libraries - PHP, NodeJS, Python, and a couple in Go. For our purposes, we need easy setup, listening to events and acting on them, understanding from the text what kind of task the user requires, and even more (cron tasks for reminders, incoming web hooks, a DB layer, etc.). Sounds a bit daunting, no? You’re right!

When I started to write the first task, I saw that analyzing the text is more than just matching a function to the text. When there are a lot of tasks, the code gets long and messy. That’s why I created Nuntius: a PHP framework based on Symfony components that helps me organize the code.

Introducing Nuntius

Though Nuntius is well-documented, let’s see how easy it is to set up a task. After setting up Nuntius, we need to write the first task.

Hooks? Event dispatching?

No and no. To keep things easy, Nuntius does not use hooks or event dispatching to integrate with custom code. Instead, all of the integrations are defined in a YML file.

In our hooks.local.yml we will add a custom task:

Our task will be located at src/Custom/LookForAPicture.php:

After implementing the task, we can start to code. So what should this task do? Get a keyword, look for an image that relates to that keyword, and send it as an attachment to the message. When you are feeling down, you could ask for a picture of a cute kitten to take away your sorrow.

We will need to search for pictures via a REST request. I found a nice service for that: pixabay. You’ll need to register and get an API key.

After acquiring the access token, we need to store it somewhere. The best place is credentials.local.yml, which is located under the settings library:

Let’s have a look at the code to get the picture:
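As an illustrative, framework-agnostic sketch - the function name is made up, while the Pixabay endpoint and the webformatURL response field come from its public API:

```php
<?php

/**
 * Illustrative sketch: return the URL of the first Pixabay image
 * matching a keyword, or NULL when nothing matched.
 * Not the actual Nuntius task code.
 */
function getPictureUrl($keyword, $apiKey) {
  $url = 'https://pixabay.com/api/?' . http_build_query([
    'key' => $apiKey,
    'q' => $keyword,
  ]);
  $response = json_decode(file_get_contents($url), TRUE);
  if (empty($response['hits'])) {
    return NULL;
  }
  // 'webformatURL' points to a medium-sized rendition of the image.
  return $response['hits'][0]['webformatURL'];
}
```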

All we need to do is return the image URL, and that’s it! This is the full code:

And this is the result:

An embedded kitten image.

But Wait, There’s More!

Slack was kind and embedded the picture for us, but it’s not the best practice. Using attachments makes the message much more readable and gives us this:

Attaching an image by Slack best practice.

The code is a bit more complex than the simple URL we returned:

External Services

Your mind was blown, I know. But what’s next? If you thought this branch of development was left without any SaaS solutions, you were wrong. Bots need to interact with people, which requires deep learning and natural language analysis. There are two famous players in the market (for now) - api.ai and wit.ai.

In a nutshell, they give you an informative object to interact with and train through deep learning.

No. Skynet is not around the corner, and HAL 9000 isn’t going to be in NASA’s new spaceships. This is just the beginning. Unless you are a big company like Apple, Google, or Facebook you probably won’t provide a solution like Siri and other bots.

Bots, as I can see it, will be another interaction with the product and can provide a simple way to get information like:

  • When are you going launch the new version?
  • Let users know that the site is under maintenance for now, and when it’s back.
  • Notify the user when a new season of a TV show is going to launch (I was informed about Silicon Valley’s new season via Facebook Messenger.)

Don’t forget that QA was once done pretty much manually. Now QA is automated, but it still needs to be written and maintained. We might be killing one field, but we are creating a new one.

May 23 2017

Chances are that you are already using Travis or another CI to execute your tests. Very often, getting boolean or textual output from the execution is enough, because knowing which tests are failing is a good starting point for debugging the problematic code. In our case, with WebdriverIO (WDIO) and with an architecture where the frontend and backend are decoupled, it’s much more complicated.

It might be that the browser could not click on an element, or the frontend could not contact the backend, or the frontend has a runtime error (you might be faced with that, but at Gizra we use Elm, where it is practically impossible). Who knows - even the browser could crash due to lack of memory, and the same applies to Travis too. One solution is to manually reproduce what Travis does. It’s fun the first time, but doing it again and again is just a waste of time. Recently, our CTO, Amitai, gave excellent pointers about dockerized Selenium and insisted that having video recordings is much better than simple static screenshots - and it was so true.

These days at Gizra - on client projects - we benefit from knowing exactly how and why our browser-based tests failed. The fact that we already used Docker inside Travis helped a lot, and the additional video recording of the browser-based tests makes developers’ lives much easier.


Let’s go over what’s bundled into Drupal Elm Starter, and who is responsible for what.

  • Upon a push, GitHub invokes Travis to start a build, that’s just the standard for many projects on GitHub for a long time.

  • Travis executes a set of shell scripts according to the build matrix. The only noteworthy thing is that the build matrix, driven by environment variables, lets us run things in parallel - one element of the matrix is the WDIO test, while another could be any kind of lint to scrutinize code quality.

  • From this point, we only focus on one element of the build matrix. Docker Compose launches two containers, one with the application and the test code, the other with a Selenium Grid. It also helps the containers talk to each other via expressive hostnames.

  • WDIO executes our test suites, but the Selenium host is not localhost - it is the address of the other Docker container. That container runs Zalenium, which hosts the browser, the Selenium Grid, and ffmpeg to encode the movie on-the-fly, and is thus able to record a video of the WDIO tests.

  • Google Drive hosts the videos of the failed tests. To use Google Drive programmatically, several steps are needed, but the gdrive uploader tool has excellent documentation.

  • In the very end, Gizra Robot posts a comment on the conversation thread of the pull request. Adding a robot user to GitHub is not different from adding a human - you can create a new GitHub user and dedicate it to this purpose. The exact process is documented in the repository.
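The two-container setup from the steps above can be sketched in a docker-compose.yml like this (service names and options are illustrative, not the project’s exact file):

```yaml
version: '2'
services:
  # The application plus the WDIO test code.
  app:
    build: .
    environment:
      # WDIO connects here instead of localhost.
      - SELENIUM_HOST=zalenium
    depends_on:
      - zalenium
  # Selenium Grid + browser + ffmpeg, recording each session.
  zalenium:
    image: dosel/zalenium
    command: start
    volumes:
      # Zalenium spawns browser containers via the Docker socket.
      - /var/run/docker.sock:/var/run/docker.sock
```

Because Compose puts both services on one network, the `app` container reaches the grid simply by the hostname `zalenium`.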

The result

You can see an example video of the test on a recent pull request. The icing on the cake is that if you receive the GitHub notification email in your Gmail inbox, you can launch the video straight from there via a YouTube player!

WebdriverIO in action

Lessons learned

I joined Gizra three months ago, and the Gizra Way‘s time-box/escalation system helped a lot to accomplish this task, where many layers of the CI stack were new to me. Needless to say, debugging Travis is hard. And then, you need to wait. And wait. A lot. Then your issue has a timebox on it, so hard things must be done quickly and by following best practices.

Seems impossible, right?

My experience is that this rigorous workflow helped me find creative ways to solve the problems (not talking about ugly hacks here - merely changing the way I search for proper solutions). If the complexity is adequately calibrated to the developer, it triggers the good kind of stress that helps in problem solving and contributes to work satisfaction.

Let’s see how I was led to make it happen.

Dissect steps

It seems obvious that you need to break the problem into smaller chunks, but when testability is so problematic, you must follow this principle very strictly. In this case, the most helpful approach was to test the different units in the simplest environment possible. For instance, there’s a Bash script that’s responsible for the video upload. Instead of launching the script via Travis or a Travis-like local environment, running it directly in the native local environment, feeding it the same environment variables Travis would provide, sped up the process to almost real-time debuggability.

Even a small Bash construct can be extracted and tested separately. The same goes for the curl invocation that posts a comment on GitHub. So in the end, I enjoyed the efficiency that came from testing everything with the minimally needed context - without all the hassle.
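As an illustration, the comment-posting piece boils down to a single call against GitHub’s issue-comments endpoint. In this sketch the repository name and message are placeholders, and the command is only printed rather than executed - which is exactly what makes the piece testable without a network:

```shell
# Placeholders throughout; the real script reads its values from Travis
# environment variables (including the GitHub token).
PR_NUMBER="${PR_NUMBER:-1}"
API_URL="https://api.github.com/repos/example-org/example-repo/issues/${PR_NUMBER}/comments"
BODY='{"body": "Video of the failed WDIO run: <link>"}'
# Print the curl invocation instead of running it.
echo curl -s -X POST -H "Authorization: token \$GITHUB_TOKEN" -d "$BODY" "$API_URL"
```

Swapping `echo` away turns the dry run into the real call.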

Invest in easy troubleshooting

We deliberately invested a significant amount of effort in having this functionality in our project template, Elm Starter, just to help future work. Similarly, on a lower level, at some point it was mandatory to be able to SSH into the Travis build. It’s enabled by default for private repositories; for our public repository, we had to write to Travis support to have this functionality enabled. It helped a lot to understand why the process behaved differently than in the local environment.

Contributing what you can

During the implementation, there were some issues with Zalenium, the side container that provides the Selenium Grid and the video recording (https://github.com/zalando/zalenium/pull/92). The fix got merged upstream after 12 days - mostly the time the maintainer spent waiting for my answer. It is just a little documentation fix, but it might save fellow developers some frustration. On my side, I got confirmation from the most capable person that I should not try to use --abort-on-exit with that container. Such scenarios reinforce the best practice: give back what you have, whether it is knowledge, a patch or a full-blown solution.


The solution, publicly available in the repository, is easy to re-use in any project with similar browser-based tests. The only criteria are that the test suite must support execution on a Selenium Grid, and that you can run the Zalenium Docker container alongside. You could capture videos of your Simpletest, Behat, WDIO or Nightwatch.js suite (and who knows what other test frameworks are out there in the wild) and port this code from Drupal Elm Starter to easily understand why a test fails. Pull requests are more than welcome to make the process more robust or even sleeker!

Sep 22 2016

At Gizra, we run an unusual stack. Drupal, our bread and butter, serves as the backend and is complemented by Elm for the frontend and, as of recently, Yesod - a Haskell framework.
Before Yesod, we were running Node.js as our proxy server for light tasks and real-time messages.

But ever since I came across Elm, JavaScript and I are hardly friends. I respect it, but I can no longer turn a blind eye to its shortcomings. Functional programming has caught my heart and mind.

In this post I’m not going to try and present strong points on why you should adopt this stack, but rather share with you the unique path we are paving.


Elm was my first dip into the functional programming (FP) world. I recommend starting from there. It’s way more gentle than Haskell, and it has, of course, one important advantage - it helps you build rock solid, crazily fast, joy-filling, web apps.

Maybe this [post and video]({{ site.url }}/content/faithful-elm-amazing-router/) will get you excited as well.

Gentle Intro to Haskell

A good way to start explaining what’s so special about Haskell is to dive directly into Haskell. While PHP is defined (by some) as a “productive” language, Haskell is often blamed (by some) as being an “academic language”. Both statements are probably wrong.

Often when mentioning Haskell, we talk about the compiler and the type system. They are truly marvelous tools. For the type system, I think the Elm post I linked above should be enough to get the hang of it. But I’d like to go further and reveal a bit more, by highlighting mini examples of how Haskell makes us approach development tasks differently.


Below are examples from the REPL (the interactive shell for Haskell)

> [1, 2] ++ [3, 4]

[1, 2, 3, 4]

What we have here are two lists of integers that are appended to each other. A list of integers is represented as [Int] in Haskell. The ++ is the operation that causes those two lists to be grouped into a single one.

Haskell comes with some handy shortcuts.

> [1..5] ++ [6, 7]

[1, 2, 3, 4, 5, 6, 7]

And even ones that can save a few lines of foreach:

> [x * 2 | x <- [1..3]]

[2, 4, 6]

The above is asking Haskell to generate a list of Int from 1 to 3, and feed it into x * 2. We can also do fancier stuff here, like take a list from 1 to 10, but use only the even numbers:

> [x | x <- [1..10], rem x 2 == 0]

[2 ,4 ,6, 8, 10]

A String in Haskell is actually a list of Char ([Char] in Haskell talk). So this may seem a little surprising, but we can actually act on the list of chars in the same way we did with numbers.

> ['a'..'e']

"abcde"

> ['a'..'e'] ++ ['f'..'h']

"abcdefgh"

> "Hello " ++ ['w', 'o', 'r', 'l', 'd']

"Hello world"


So far, nothing here is overly complicated or life changing. When I gave this talk in Gizra, even the non-devs weren’t alarmed at this point. No glazing eyes (yet).

Now we can start looking at how Haskell brings a different approach to programming. One that, as Drupal developers, we are not too familiar with.

The fine folks of Haskell looked at the appending of the lists, and probably told themselves “Hey, there’s a pattern here, and it can be satisfied by a set of rules”. The pattern they recognized is the fact that we need to append variables of the same type. That is, two lists of integers can be added together and form a longer list of integers. The type doesn’t change, just the number of members inside of it.

The ++ operation we saw can be generalized and called mappend. All the following will have the same result:

> [1, 2] ++ [3, 4]
> mappend [1, 2] [3, 4]

-- Backticks (`) let us use mappend as an infix operator between the arguments.
> [1, 2] `mappend` [3, 4]

[1, 2, 3, 4]

So, trick question: with the above abstraction in mind, what would you expect the result of appending two integers be?

> 5 `mappend` 6

Should it be 11 (5 + 6)? In that case why wouldn’t it be 30 (5 * 6)?
Or maybe even 56 (smashing the two digits together)?

It’s unclear. However, Haskell has a solution. We can use types to explain to the computer what our intention is. So we have two types Sum and Product:

> Sum 5 `mappend` Sum 6

Sum {getSum = 11}

> Product 5 `mappend` Product 6

Product {getProduct = 30}


Hope you are not too afraid of the above title. Monoid is the name of the abstraction we just briefly went over. It should not be confused with the dreadful Monad term, which I have no intention to cover.

In PHP talk, a Monoid class is like an interface that has two methods: mappend and mempty.

The mempty defines an “empty” value. What is the empty value in the case of the above Sum?

Sum 5 `mappend` mempty = Sum 5

mempty would be Sum 0 (because 5 + 0 = 5).

And what would it be in the Product case?

Product 5 `mappend` mempty = Product 5

mempty would be Product 1 (because 5 * 1 = 5).

For completeness - the second and last rule for a Monoid is that mappend must be associative: mappend x (mappend y z) equals mappend (mappend x y) z ((5 + 6) + 7 == 5 + (6 + 7)). Note that commutativity is not required.

That is it. You are on the path to Monoid zen. Now, let’s put our knowledge to use, and try to implement an example.

12 Hour Clock

Here’s a nice Monoid example I came across. Let’s say we have an old clock with 12 hours. No AM or PM. Something like this

In case you missed out life in the past century, this is a clock

So let’s say that time now is 10. What time will it be in 4 hours from now?

We cannot say 14, because inside the clock there are only 12 hours. We already know what will happen - it simply passes the 12 and wraps around. That is, the answer is 2.

Inside this clock we can continue adding numbers, but our logic will never fail. Any number we add will never fall outside of the clock, thus causing the universe to collapse into itself.

Monoid can help us accomplish this behavior because we are able to define our own completely arbitrary new type, and have a standardized way to explain to the computer how numbers are to be appended.

data Clock12Hours = Clock12Hours Int
    deriving (Show)

instance Monoid Clock12Hours where
    mappend (Clock12Hours x) (Clock12Hours y) = Clock12Hours $ (x + y) `mod` 12
    mempty = Clock12Hours 12

Don’t bother too much with the above code’s syntax. What’s important here is to understand that after defining this Haskell instance - or in PHP talk say that Clock12Hours is implementing the Monoid interface - we can use the same mappend we used above with integers and chars.

> Clock12Hours 4 `mappend` Clock12Hours 10

Clock12Hours 2

So What?

How would we do this in PHP? That is, if we had something like:

$hour1 = new Clock12Hour(4);
$hour2 = new Clock12Hour(10);

How can we add 10 + 4 in this PHP example? The idea of appending items, as seen in this example, was not generalized in PHP, so we are bound to think about this task differently.

My point here is that the language we’re using for development dictates the way we articulate the problem and the way we model the solution. I don’t think, necessarily, that one is better than the other (well, actually I do, but I’m not here for the flame wars), but rather I want to emphasize that Haskell is different enough from most other languages that just learning it - similar to learning Elm - can have a positive impact on our developers’ skills.


Ok, back to web-development. That Haskell thingy we just saw? Yesod is built on top of it.

Haskell has a big learning curve. Admittedly, it took me more months than I’d be happy to share to understand some of the basic concepts. And I still don’t understand many.

However, even though it’s based on Haskell, in the day to day tasks, you don’t really need to deal with all those abstractions. Your route will respond to some arguments, get data from the DB, massage the data, and send it to the templating system to be rendered.

It’s hard to explain why a framework is great without providing lots of examples, so I won’t try. Instead, I’ll give an anecdotal example that I love, because it illustrates the degree of guarantees - that is, shifting runtime errors to compile errors - that Yesod has reached.

In our Yesod website if we want to add Bootstrap CSS file, we write addStylesheet $ StaticR css_bootstrap_css.

We don’t have a static folder in Drupal, but it can be roughly translated into drupal_add_css('static/css/bootstrap.css'); which isn’t something to brag about. We’re just loading the CSS.

Well, not really. You see, in Yesod, if you notice, we don’t write StaticR css/bootstrap.css but rather use css_bootstrap_css. That string, as the docs explain, is a reference to the actual file.

This means that if the CSS file doesn’t exist, compilation will fail! You will get a big fat error message telling you to go and add the missing file, because you probably don’t want to deploy your website without its CSS.

Pushing code to production without all the necessary assets isn’t something I’ve done many times, but it has happened in the past. I wouldn’t mind a guarantee that it never happens again, without me needing to put any cognitive effort into it.

Unfair Comparisons

Coming from a full blown CMS, frameworks tend to look appealing at times. They are so light, and so much faster.

Then, I find myself hand-coding flag-like functionality, which in Drupal is just a matter of downloading a well-tested module, enabling it, and configuring it.

In Yesod, I had to write it myself. There are some Yesod packages out there, but nowhere near what Drupal offers.

On the other hand, a lot of Drupal’s modules are of low quality or very specific, so quantity doesn’t necessarily mean much. We usually need just a tiny subset of them.

And that flag functionality I wrote - well, my local server spit the JSON response back within 6ms.

Flagging a user

6ms! The same JSON response took about 200ms on my local machine coming from Drupal.

“But that is not fair!", you may say, “the comparison isn’t right. Drupal does so much more out of the box!”

You are, of course, right. But on the other hand - 6ms. When I clicked the flag link, I didn’t need the other stuff Drupal brought (and bootstraps on each request). I just needed my item to be flagged. And it was. In 6ms.

Our users don’t care about the underlying technology. They just want it fast. But then again, it’s very hard to compete with Drupal’s community, maturity, eco-system, hosting providers, etc.

It’s up to us to select the right tool for the job, and that’s why I love web development and my job so much.

How to start

If I got you interested, here are some good resources:

Sep 15 2016

Creating a plain text email with Drupal is a simple task. However, when you want to send a nicely designed message with a logo, an article, a linked button or any other unique design with dynamic content personalized for each user, it can get complicated.

The complication stems not from the dynamic content, but rather from the fact that the CSS that can be applied inside email templates is limited. In general, targeting multiple email clients can be worse than getting your CSS to work on IE9!

This post will walk you through a solution we use to address these requirements, but before jumping in, let me first explain Gizra’s approach to themes. We don’t use custom CSS in Drupal themes. When we start building a website, we divide our work into several milestones; the first is creating clean and static markup using Jekyll. At Gizra, we take pixel perfect very seriously, and by doing the markup first, we can concentrate on building our app pages exactly the way they are supposed to look, test their responsiveness, show our clients a first draft, and fix bugs before getting into the logic. We use gulp to compile the Sass files into one CSS file, and then copy that CSS file to the theme’s folder. Then we take our static pages, cut them into pieces, and use them in Drupal themes and plugins.

By doing this, we can focus on our logic without worrying about how it may look with different dynamic content. Treating frontend and backend as separate tasks makes building websites easier and faster, and bugs discovered while implementing dynamic content can now be easily fixed. Our No more CSS in your Drupal Theme! blog post talks more extensively about the way we work with Drupal themes.

The same approach is implemented when we create an email template. We first build the email markup with static content, and then we use it to create dynamic content messages. Oh, and we cheat, because we don’t write a single line of HTML or CSS!

A demo email template created for this post

Creating an email template

When we build a website we need to take into consideration that our users use different browsers and adjust our CSS rules so that our website will look pretty much the same in all of them. Achieving this is more difficult when it comes to emails.
Our users use different email services and view their emails in different browsers, and each email may look a bit different in each of them. Some email services do not support all HTML tags and CSS rules, so we can’t use everything we use on our website. For example, Gmail and Outlook have poor support for float, margin and padding. Also, some email services may override our design and replace it with their defaults, such as link colors or image visibility. Another issue is screen width: mobile and tablet users view emails very differently.

Our way to overcome this problem is to design our emails with nested tables, since they are supported by most email services. However, this is still not enough. To make sure that our email will look the way we want, we need to set a specific width for each table cell, and that’s a lot of work. After creating our email template, we need a way to test it and make sure it looks the way we intended in every client and mail service.

This is when we decided to take advantage of Mailchimp’s WYSIWYG editor. In the editor we can build the static version of the email, which will include links, images, videos, etc.

We use the editor to create the email friendly HTML and CSS for us, and later use the export functionality to grab it and move it into Drupal.

Mailchimp WYSIWYG editor
Mailchimp content items

Behind the scenes, Mailchimp converts the design into nested tables using the most widely supported CSS rules for email. There is also an option to view the source in the Mailchimp editor and make my own adjustments. I upload my images to Mailchimp’s cloud, so I won’t have to worry about my users’ email software blocking images attached to the email.

Mailchimp also gives us the opportunity to test our email template on desktop, mobile, and inbox software such as different versions of Outlook, Gmail, and Yahoo on different browsers. After finishing the email template, I go to my templates list and export it as an HTML file with the CSS inlined. Next, we need to wire it into Drupal.

From Static to Dynamic

The idea is now to take the static HTML and slowly introduce the dynamic content.

Instead of calling drupal_mail() directly we usually use the Message and Message notify modules to send the emails. It has the advantage that each mail is saved in the system, so it’s easier to track (and find bugs, if there are any).

One way to do this is to create a theme function with a tpl file, where we have our HTML. Then, when creating the message, we can replace a dynamic token with the content of theme('my_email_template', $variables).

But in order to convert the static parts to dynamic, we need to see the result. Sending an email on every change can be time consuming and make debugging harder, so we start by seeing the email inside our site, while we develop it. To do that, we can define in hook_menu() a debug path like message-email-debug/% where the argument will be the message ID.

However, we’ll need to make sure to disable our site’s CSS before we view our message, because it might change the way the email looks. We can safely remove all CSS, since the email’s CSS is inlined.

/**
 * Implements hook_css_alter().
 *
 * Remove all CSS on the email debug page.
 */
function example_css_alter(&$css) {
  $item = menu_get_item();
  if ($item['path'] != 'message-email-debug/%') {
    return;
  }
  $css = array();
}

We can also go ahead and override page.tpl.php, so on message-email-debug it will print out only the message, without any other page elements.

At this point, we can start converting our static HTML into dynamic HTML, while being sure that the final HTML is guaranteed to work in email clients.

Sep 07 2016

As OG8 is steadily being built, I have noticed a certain pattern - or a mindset - we’ve been following which I think is worth sharing.

OG8 is the third time I’ve written OG for Drupal. The first OG7 version was a head jump into the Entity goodness that Drupal 7 brought along with Entity API. The second version was taking a small step back away from the Entity fiesta, but took two steps forward into the field API.

I think that as a developer I have matured since then. Edge cases are no longer my concern. I mean, I’m making sure that edge cases can be done and the API will cater to it, but I won’t go too far and implement them. It’s not that in OG7 we tried to tackle all of the edge cases, but in OG8 it’s even less.

In fact, although we write a lot of new code as part of the porting, as the way we write modules for Drupal 8 has changed considerably, writing OG8 feels like… Well, it feels like I’m mostly deleting files.

Removing lines of code is so much better than adding

Myths Debunked

It’s not too rare to hear rants about OG. Often they are not backed by actual data, or even refer to older versions.

I was even quite surprised to find out in one of the DrupalCon BOFs that an “OG alternative” module (which now seems to have been without much activity for the past year) was created by an author who never bothered to check OG 7.x-2.x. They knew OG 6 and kind of knew OG 7.x-1.x, and yet they used to bash OG pretty badly.

(And just to prevent any mistake, but still not call out the module by name - I am not referring to the Group module. The above section is about another module. You can read about my positive and negative critique of Group here.)

Being in that BOF was both funny, and a little sad at the same time.

Now, don’t get me wrong. There’s nothing wrong with alternatives. In fact, the Message and RESTful modules have grown as alternatives to existing solutions, but they all grew out of a deep understanding of all the existing solutions.

So, just for the fun, here are the rants ordered by popularity:

OG is complicated

It is. After all, it’s dealing with a complicated problem. Just like many other important contrib modules, it does the heavy lifting so that you and I won’t have to do it when we build our sites. OG is dealing mostly with access - so no easy shortcuts can be taken.

With that said, the concept itself along with the implementation is quite easy to explain. In fact, in OG8 we’ve simplified it even more. That is, we’ve somewhat reduced the flexibility in order to reduce the complexity; but while doing so, made sure edge cases can still hook into the process.

I always find that doing sketches by hand can show that ideas are actually easier than they might seem. Here’s OG in free-hand format:

Concepts should be expressed as easily as possible. Circles made with a small glass, and straight lines with a business card that was on my desk

Seriously, I can’t think of a simpler solution that will still allow a robust group site:

  1. The reference between a group content to a group is done by core’s entity reference.
  2. The reference between a user and a group is done by the OgMembership entity, that can hold aside from the association, also metadata such as the created time and the state of the membership (active, pending, or blocked).
  3. An OgMembership can also reference an OgRole, which is a role that applies inside the group.

OG adds a lot of overhead

I used Blackfire.io to check the performance of a clean Drupal 8 installation with 1000 users and 5000 nodes. Then I ran the same test on the nodes being also an OG “group” (i.e. OG uses it as the “container” of other “group content”). Profiling was done on an out of the box Basic page node view. When OG was enabled, it was tested with a user that had 15 groups (which is more than the typical use case).

| | Clean Drupal | Drupal with OG | Difference |
|---|---|---|---|
| Time | 440 ms | 468 ms | +28.3 ms (+6.43%) |
| I/O Wait | 21.1 ms | 21.1 ms | +44.2 µs (+0.21%) |
| CPU Time | 419 ms | 447 ms | +28.2 ms (+6.74%) |
| Memory | 36.5 MB | 39.5 MB | +2.98 MB (+8.16%) |

The gist of it: OG added merely 28 ms to the request, and 3 MB more in memory. And for the record, we have not started doing optimizations yet!

Module X is much simpler/ faster/ betterer/ awesomer

Does that module work for you, and you are happy with it? Is it well tested and the maintainer does a good job?

Awesome, stay with that module. OG is just a tool - not a life choice :)


I have a healthy obsession with quality and the idea of “correctness”: how can you move forward quickly with your software, yet have enough guarantees that you are not breaking existing code? Since PHP lacks a compiler, we are left with a few other good options.

Data integrity

Here’s one of my favorite images, one a Drupal 8 developer sees on a daily basis.

Better have an error than wrong data

It’s an exception thrown by code that was not satisfied with the data it received. It’s not a notice, appearing in a red box on top of your page, which your brain has learned to ignore. It’s an “in your face” error message that makes sure you stop being lazy, and go fix your code.

OG8 is applying this same approach. Are you trying to create an OG permission with illegal value? Sorry, this won’t work, and we make sure you know about it. Silent errors are risky and can be easily overlooked and pushed to production.
Are you trying to save an OG membership for an anonymous user? Once again, we will throw an exception. This will make sure you won’t have a security issue, where suddenly anonymous users might get too much access on your site.

Automatic Testing

As good as exceptions are, they are just safeguards. Tests are what will make you and I sleep better at night. But of course, this really depends on the quality and quantity of your tests.

If I told you that OG8 has about 50 files, you might refer me to the “OG is complicated” section above, and gently imply that it sounds like a lot of files.

But sorry, I lied: in fact, OG8 currently has 120 files. However, 50 of those files are under the tests folder.

You see, OG, like any other good module out there, has a life above the surface and below it. As a site builder or a dev, you interact with the part above. The side below is responsible for asserting that we are not breaking existing functionality or opening up security holes. That’s the side we - the OG developers - interact with locally or on Travis-CI.

As you can imagine, this is very time consuming. In fact, it’s not rare for developing the tests to take more time than the actual code. Just look at this example: the unsubscribe controller, responsible for checking if a user can unsubscribe from a group, is about 15 LOC (lines of code). The unit test that covers this method has 230 LOC. In fact, it’s not even the only test that covers this functionality, as there is also a functional test to assert it.

That’s a lot of tests! And even though it’s time consuming, it actually allows us to move fast and save time in the long run. Because when you have the confidence that the system is well tested, you are not afraid to continuously iterate, rewrite, and polish existing work.

I think there is another hidden merit in tests. By taking the time to carefully go over your own code - and using it - you give yourself some pause to think about the necessity of your recently added code. Do you really need it? If you are not afraid of writing code and then throwing it out the window, and you are true to yourself, you can create a better, less complex, and polished module.

Simplifying and hiding advanced features

One of the mistakes I feel I made in OG7 was exposing a lot of the advanced functionality in the UI. It’s not a terrible mistake (ponder the amount of complex stuff Views UI allows you to do), but I think it contributed to the feeling people had that things are complex.

This notorious administration page allowed you to add OG related fields to different entities. It also allowed you to add different field instances of the same type, as for example you can have multiple OG audience fields on the same bundle.

Don't worry kids, the beast is gone.

But these are all advanced use cases. When thinking about how to port them to OG8, I think we found the perfect solution: we didn’t port them. It might sound a bit funny, but I think there are important advantages in doing so:

  1. Less code to write and maintain.
  2. Less complexity in the system.
  3. Lower the barrier for site builders. They will have just a single page to set a bundle as related to OG.
Adding any OG related field will be done via a simple UI

Obviously, the more advanced features (such as the above-mentioned multiple OG audience fields) remain in the code, so advanced developers can use them when needed via code:

// Make bundle a "group".
\Drupal\og\Og::addGroup('node', 'page');

// Add OG audience field to a "group content"
\Drupal\og\Og::createField(\Drupal\og\OgGroupAudienceHelper::DEFAULT_FIELD, 'node', 'article');

Excited? So are we! Come and join us, we have low-hanging-fruit issues you can start with, and you’ll find yourself writing features and tests in no time!

Aug 20 2016
Aug 20
Driesnote where GraphQL was featured. Picture from Josef Jerabek

After some time contributing to the Drupal project in different ways, I finally decided to step up and get involved in one of the Core Initiatives. I was on IRC when I saw an announcement about the JSON API / GraphQL initiative weekly meeting and it seemed like a great chance to join. So, this blog post is about how you can get involved in a Core Initiative and, more specifically, how you can get involved in the JSON API / GraphQL Initiative.

The JSON API / GraphQL Initiative

This initiative appears as part of a larger one: the API First Initiative. The goal is to fully integrate the JSON API and GraphQL specifications into Drupal Core. This fits perfectly into the API First Initiative, as the goal of the parent issue is to make the data stored and managed by Drupal available to other software.

What happens if I’m still stuck with Drupal 7? While these initiatives are focused on Drupal 8 and will not have a direct backport to Drupal 7, there are a few contrib solutions that implement most of this functionality, like the RESTful module.

Why is this initiative important?

All the improvements on web services are a big step forward for Drupal when it is used as a backend for other applications like mobile apps, or for architectures like a Headless Drupal (Drupal used as backend and a totally decoupled frontend done in a different technology). It also helps to better connect with other services or applications, resulting in a more powerful tool.

The initiatives also use common standards, which will help people who have less knowledge of Drupal to still take advantage of it.

Now you know where you want to work so, where should you start?

Getting info

Before diving into the issue queue or the code, you should know well what is involved in the chosen initiative. The different initiatives usually have a central point where all the related information and issues are gathered. This can be a meta issue or a drupal.org project.

For the API First Core Initiative there is an issue you can find here. It gives us a general overview of the whole initiative and shows where the JSON API and GraphQL initiative fits within it. In that issue, we can find links to the specific JSON API and GraphQL projects. Those pages and their respective issue queues are where the development of the initiatives takes place. You should read them carefully.

Mateu, maintainer of the JSON API project, has also prepared a very useful set of videos explaining the features of the JSON API module and the different changes that are occurring during development.

In addition, you can also get the most updated news on the meetings that are happening related to the core initiatives. There is a public Google Calendar with all the meetings. Some initiatives meet weekly, others are every two weeks or even monthly. My first contact with the initiative was in one of these meetings. I wasn’t able to participate a lot, but at the end of the meeting, I could get some helpful hints on where to start, and I had the idea of what could be my first contribution: this post!

I also learned that this initiative hangs out on the #drupal-wscci IRC channel, where you can ask for help or get the meeting hangout link. Just ping the people involved in the initiative, like Mateu Aguilo (e0ipso) from the JSON API initiative, Daniel Wehner (dawehner) and Wim Leers (WimLeers), maintainers of the REST module in core, or Sebastian Siemssen (fubhy) from the GraphQL Initiative.

Next steps

Get used to the code! Download and set up the Drupal 8 environment with the modules from the initiatives. Test them, and play a bit with them to know how they work. Subscribe to some issues to see how the development goes. Talk to the people involved in the initiatives and don’t be afraid to ask them.

And let’s start contributing!

Jun 08 2016
Jun 08

Drupal-8-manina is at its highest. Modules are being ported, blog posts are being written, and new sites are being coded, so we at Gizra decided to join the party.

We started with a simple site that will replace an existing static site. But we needed to migrate node attachments, and we just couldn’t find an existing solution. Well, it was time to reach out to the community.

Any example of #Drupal 8 migration of files/ images out there? (including copy from source into public:// )

— Amitai Burstein (@amitaibu) April 8, 2016

A few minutes after the tweet was published, we received a great hint from the fine folks at Evolving Web. They were already migrating files from Drupal 7 into Drupal 8, and were kind enough to blog about it.

However, we were still missing another piece of the puzzle, as we also wanted to migrate files from an outside directory directly into Drupal. I gave my good friend @jsacksick a poke (it’s easy, as he sits right in front of me), and he gave me the answer on a silver platter.

The post has a happy ending - we were able to migrate files and attach them to a node!

An example for super heroes

For this blog post I created a dummy Drupal 8 installation profile with way too much information about super heroes. The migration module can migrate some images along with some CSV data about them.

If you look closely you can see that I’ve attached an SQL dump with raw tables. This raw table will be the source that will eventually be migrated into nodes, and you can read here how it was created with csv2sql.

Basic structure of migration

The description of the mapping between the source table and the destination node type is in a yaml file.

Let’s go over the interesting parts of the process:

  type:
    plugin: default_value
    default_value: super_heroes
  uid:
    plugin: default_value
    default_value: 1
  title: _title
  field_image:
    source: _image
    plugin: file_import
  field_alter_ego: _alter_ego
  'body/value': _description

In Drupal 7, because we wanted to prepare the value before populating the entity fields, we did it in a prepare method. In Drupal 8 we have process plugins.

For example, the default_value plugin will populate the (configurable) field of the entity with a raw value like the name of a content type or a user ID, in case we are migrating all the nodes with the same author (e.g. user ID 1).

But we can, of course, have our own logic. In the transform method of the process plugin we can massage our data and return any value which will eventually populate the field.

In our case, the transform method is responsible for adding the new file into Drupal using file_unmanaged_copy and friends.
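To illustrate the idea outside of Drupal, here is a minimal Python sketch (not the actual PHP plugin - the function name and arguments are made up) of what such a transform conceptually does: copy the source file into the site’s files directory and return a value the migration can store in the field.

```python
import shutil
from pathlib import Path

def transform(source_path: str, files_dir: str) -> str:
    """Conceptual analogue of a file-import transform: copy the
    source file into the destination directory (what Drupal's
    file_unmanaged_copy() does under the hood) and return the new
    location, which then populates the file field."""
    destination_dir = Path(files_dir)
    destination_dir.mkdir(parents=True, exist_ok=True)
    destination = destination_dir / Path(source_path).name
    shutil.copy(source_path, destination)
    return str(destination)
```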


Some of the know-how and best practices are still missing from Drupal 8. But they can and should be re-learned and re-published. So remember kids, if you ever feel frustrated about not finding a solution, always reach out to smart community members and then write a post about it, so everybody can benefit.

Jun 07 2016
Jun 07

Today we held our inaugural #TheGizraWay webinar. The web series is intended to showcase some of “The Gizra Way” principles - a set of best practices and methodologies, borrowed from the Open Source development world and applied to operations, workflow, and overall company culture.

For the first in the series, we chose the topic of price estimations because it provides a real - and perhaps radical - example of how transparent communication from the beginning about a project’s needs alongside its budget can turn the process on its head. In the video below Amitai Burstein discusses how a budget- and time-driven discovery process gets a project off on the right foot.

[embedded content]

The next webinar - to take place in July 2016 - will be announced shortly. If you have any suggestions for topics or an idea that you would like to present in a future session let us know.

May 17 2016
May 17

For years I have been hearing about DrupalCon from Brice and Amitai. Every six months they would send me a massive group photo and challenge me to locate them among the crazy mustaches, viking helmets, and identical t-shirts. Needless to say, I failed every time, and the number of people in those pictures grew every year. I’m also happy to say that the last group photo - from a week ago - included me as well (bonus points if you can spot me).

2016 DrupalCon Group Photo.

My first Con was an overwhelmingly great experience and I learned a ton of new things. Here are the top 12:

1) Count down from 100 if you can’t fall asleep at night

DrupalCon’s sessions and keynotes are diverse and engaging. For instance, the Community Keynote by @schnitzel (Michael Schmid) was full of tips to keep your brain ready and aware, such as: start your day doing things you dislike, drink plenty of water (which will force you to take a lot of pee breaks), and play with kids to clear your mind.

The enormous number of people and ideas exchanged at DrupalCon is so invigorating that you might find it hard to sleep at night. Try counting backwards slowly from a hundred to zero. I have already put it to the test and it works - that tip alone was worth the trip.

Michael Schmid (@schnitzel) delivers the Community Keynote

2) Gator omelette for breakfast

New Orleans is a seafood and meat town. Crab, crawfish, sausages, BBQ, and alligator - the Crescent City is not known for its consumption of vegetables.

Breakfast portions are huge and everything is golden-brown. But in New Orleans there is a special name for that little strip of green ground in the middle of a boulevard - “neutral ground” (thanks Trivia Night!). Perhaps they can grow fresh vegetables there!

A typical three-person breakfast. We never finished it!

3) Trivia Night is fun, especially in the World War II Museum

I was slightly afraid of joining a team in Trivia Night, but it seems the fine lords of Trivia have made newbies a wanted commodity - just by sitting at the table my team got 14 points because I was a first-timer both in Trivia Night and at DrupalCon.

Despite that bonus, and the fact that my Metal Toad teammates really know their Drupal, we did not win. But still, for me, Trivia Night was the essence of the whole week: smart, funny, friendly, welcoming people having fun while geeking out till midnight. Loved it!

World War II plane hovering above Trivia night

4) Contribution can come in many forms, including opera singing and dressing as Sriracha sauce

It was on the last day that I told Amitai I felt bad about having 0 credit points. Coding isn’t really my strong suit and my tenor voice range is limited.

At Gizra we are all encouraged to contribute to the community on company time; this is a core practice of The Gizra Way.

Yes, I attend local meetups and camps, but seeing all the different forms of contribution at DrupalCon, and the enthusiasm around them, has inspired me to write this blog post as my way of contributing to the community - many Drupalers could not attend, and hopefully they will get a bit of the experience from this post. Or I can always hop onto the issue queue, as Leigh Carver suggested in her session.

The funny pre-keynote. Even men in tights are considered a form of contribution in DrupalCon land

5) Wi-fi is created by a Roomba vacuuming robot.

Not really a Roomba vacuuming robot

6) Amanda G from the emails is a real person

And so are all the rest of the Drupal Association’s nicks. Despite presumably being human, they seemed to work nonstop, 24 hours a day, with zero sleep during DrupalCon. I saw them everywhere at all hours of the day. You can’t help but appreciate the enormous effort of producing this huge event, as well as maintaining a community year-round.

Don't ask - it's hard to explain

7) There is such a thing as an inflatable beard of bees

And Dries used it as an example in his keynote to describe something that has a single use case and very limited reach - exactly the opposite of Drupal’s future: a versatile platform with very high usage. Drupal has already been used in non-traditional environments such as the Tesla car dashboard and the NYC Subway information kiosks, and our benevolent dictator shared with us his personal vision for building cross-channel customer experiences that span various devices powered by Drupal.

Unfortunately, he did not wear the beard, but I bet there will be a few of those at the upcoming Dublin DrupalCon.

Not Dries

8) The halls and sprints are where the real con is happening

BOFs, sprints, hallways, parties etc. - all the places where there are no video recordings - are where the real DrupalCon is happening. There’s so much value in face to face meetings, where people can tackle the problems and discuss the features that make Drupal great.

And if counting down from a hundred to zero does not help - the sprint lounge is open 24 hours!

9) On a ghost tour in New Orleans, you do not see ghosts. So what’s the point?

Thanks to Amazee Labs for sponsoring the haunted tour that I experienced on my first night. I had no idea that social events of all kinds are such a huge part of the Con. From a cheerful funeral procession, a laid-back Lullabot party, or just a friendly Women in Drupal mingle event, there was something every night in a different location around the city, making DrupalCon four continuous days with very little sleep in between.

According to our ghost tour guide, the locals call it Touchdown Jesus

10) Your servers should be “cattle not pets”

Do not get attached to your web servers. Do not name them or treat them like pets. Otherwise it will feel weird when traffic volume goes down and you have to let them go. Think of them as cattle: have just the amount you need and can feed. I learned that lesson from Steven Merrill’s great session on “How Major League Soccer Scores Superior Digital Experiences”.

11) You can use Drupal to build a social network

Well, I guess you already knew that. But you might not have to do it yourself, as others are already working on it. Open Social, a distribution for building social communities and intranets, will be released this summer. It looks great and it is the de facto successor of Commons for Drupal 8. GoalGorilla is leading the project and using it as a use case for commercial distribution. Taco Potze talked about it in his session “Selling Drupal modules & Distros.”

Open Social devs are excited.

12) Gizra now has a US office

This last one is actually to myself. As the director of Gizra US, it’s there to remind me that I’ll get another chance to experience the same great stuff next year at DrupalCon Baltimore!

Apr 01 2016
Apr 01

The wonderful Migrate module is used in every one of our projects, whether we have actual legacy content, or “just” want to create dummy and [XSS content]({{ site.url }}/content/xss-attack/).

So you’ve received a scary-looking SQL dump or Excel file with old website data from your client. Where should you start?

Here are some of the best practices we apply in Gizra to ensure a smooth migration project, with the least possible amount of surprises or bugs.

Discovery phase

A good discovery phase before doing any actual coding is key to the success of the project. We believe it’s a good idea to do it as part of the [price estimation]({{ site.url }}/content/budget-goggles/).

Understand the data

The first step is to understand how to map the old data into Drupal’s entities - which data should be a node and which one a taxonomy term. As always, we try to map the “real world” into Drupal entities. An article, for example, is obviously a piece of content, so it needs to be a node, and not a taxonomy term. However there are more aspects to consider like the number of fields, semantic meaning of the content, and whether hierarchy is needed or not. It’s also worth noting that this is the exact time to think about improvements to the data structure.

There are some cases where content types were separate in the old website for historical reasons that can be merged and vice versa. Our rule of thumb is that the content type should try to map as much as possible the reality. That is, a “case study” content type might have very similar fields to an “article” content type but semantically they are different, thus they should be two different content types.

After you decide what fields your content type will have, pay careful attention to the data itself. If an article “topic” is a set of values that are constantly repeating, then naturally you would prefer to convert the field to a select list or even to a proper taxonomy term. Notice variants in the same data (“Politics” vs “politics”) and make the extra effort to sanitize and clean the data.
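As a sketch of that cleanup step, here is a small Python helper (the sample values are invented) that collapses case and whitespace variants of the same value before creating taxonomy terms from them:

```python
def normalize_terms(values):
    """Collapse case/whitespace variants of the same term so that
    'Politics' and ' politics ' end up as one canonical taxonomy
    term instead of two near-duplicates."""
    canonical = {}
    for raw in values:
        key = raw.strip().lower()
        # The first spelling seen wins as the canonical label.
        canonical.setdefault(key, raw.strip())
    return sorted(canonical.values())
```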

Don’t migrate what you don’t need

Not all of the existing data really needs to be migrated. For example, past events may not add much, if any, value to the new site. It’s up to you to present to the client the possibility of saving some money on a low-impact migration and shifting resources to something more important.

Don’t go fully automatic for nothing

Doing manual migration is fine. As part of the discovery phase, figure out how many items per content type need to be migrated. If it is fewer than 50, offer the client the option of doing it manually. Yes, you may “lose” some billable hours, but you gain the client’s trust.

Development phase

Convert to SQL

If you received your data in CSV format, it is advisable to convert it to SQL. Your migration may work fine with a few hundred lines, but it will choke if you have more - there’s no primary key column in CSV, so it basically loops over the same rows again and again.

SQL also provides another layer of safety, since it won’t accept wrong data. For example strings cannot be inserted into Int columns, so if you get an SQL error, you can easily find where the data is corrupted.

We got you covered with our csv2sql.
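csv2sql is our own tool, but the core idea can be sketched in a few lines of Python with the standard library (the table and column names below are just examples): load the CSV into an SQL table and give every row an auto-increment primary key - exactly what the bare CSV lacks.

```python
import csv
import sqlite3

def csv_to_table(csv_path, db_path, table):
    """Minimal csv2sql-style import: load a CSV into an SQL table
    with an auto-increment primary key, so the migration has a
    stable unique ID per row."""
    conn = sqlite3.connect(db_path)
    with open(csv_path, newline="") as fh:
        reader = csv.reader(fh)
        header = next(reader)
        cols = ", ".join(f'"{h}" TEXT' for h in header)
        conn.execute(
            f'CREATE TABLE "{table}" '
            f"(__id INTEGER PRIMARY KEY AUTOINCREMENT, {cols})"
        )
        placeholders = ", ".join("?" for _ in header)
        quoted = ", ".join(f'"{h}"' for h in header)
        conn.executemany(
            f'INSERT INTO "{table}" ({quoted}) VALUES ({placeholders})',
            reader,
        )
    conn.commit()
    return conn
```

This sketch stores every column as TEXT; a real conversion would also infer column types, which is what gives you the extra layer of safety against corrupt data.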

Content during development

While you are developing the platform and constantly rebuilding the site from scratch, you obviously don’t want to wait hours for all the data to import. You could add some dummy data, but a better approach is to take about 50 rows from each table/content type. However, don’t take 50 consecutive rows; take random rows to increase the chances of hitting data corruption or plain bugs in development instead of in production.
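A quick sketch of that sampling step, assuming you already have the list of source row IDs (the names here are illustrative):

```python
import random

def sample_ids(all_ids, n=50, seed=None):
    """Pick up to n random row IDs for the development subset,
    instead of the first n consecutive rows, so corrupt or odd
    rows have a chance of surfacing early."""
    rng = random.Random(seed)
    if len(all_ids) <= n:
        return sorted(all_ids)
    return sorted(rng.sample(all_ids, n))
```

In MySQL you can get a similar effect directly with something like ORDER BY RAND() LIMIT 50.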

Migrations testing

Automatic tests are great, right? They catch bugs, and make you feel more confident about your code base, right?

So write automatic tests for your migration scripts, and wire them to Travis CI!

It’s obvious that when you have a huge amount of content you can’t check every single piece of it. But even a little coverage is better than none.

Start writing tests during development against a smaller subset of content (e.g. those 50 rows mentioned earlier). There is no need to create complicated scenarios for testing migrated content; simply check that the fields contain the correct data (texts, images), and that references are set correctly.

The crucial part of the tests requires your QA person to visually compare the old data with the new data. Once this is done, your automatic tests make sure there is no regression.

And please, don’t think writing those tests is a waste of time. On the contrary, it saves you so much effort chasing horrible regressions. Here’s an example of a [properly written]({{ site.url }}/content/behat-the-right-way/) Behat test.

  Scenario Outline: Login to site, and check an article content.
    Given I login with user "test"
    When  I visit "" node of type "article"
    Then  I should see the title "<Title>"
    And   I should see the "description" field "<Description>"
    And   I should see the "tag" field "<Tag>"
    And   I should see "<Number of pictures>" pictures in "images" section

    Examples:
    | Title       | Description              | Tag           | Number of pictures |
    | Curabitur   | Nam sed ex vitae arcu    | Education     | 1                  |
    | Quisque     | Praesent maximus a mi si | Science       | 0                  |
    | Lorem ipsum | Aenean sem lectus, porta | Entertainment | 2                  |
Of course, don’t forget that when the migration is ready, you should run some final tests against the site with all the content.

Getting the data for testing

Since you will migrate data from SQL tables (thanks to csv2sql), it should be easy to get data already prepared for tests using MySQL queries in phpMyAdmin.

Here is an example of an SQL query which takes the _title and _description from every 100th line of the _raw_article table. The query selects the fields you want to check in your automatic test, and adds empty placeholders (i.e. leftPipe and rightPipe) that will be converted to pipes at the beginning and end of each line during the export.

  SELECT '' as leftPipe, `_title`, `_description`, '' as rightPipe FROM `_raw_article` WHERE (`_raw_article`.`__id` % 100) = 0;

Now you can use phpMyAdmin to export the result table to CSV format. You can download the data as a CSV file or directly copy it.

Export result of the query. Use custom settings. Use CSV format, separate columns with pipes.

Here’s the output you will get, ready to be added to your Behat test:

  | Ludwig Blum    | The Israeli Ambassador |
  | Julia Lagusker | Julia Lagusker         |
  | Noah Heymann   | Interview the artist   |

Known pitfalls

  - Some images may be missing. A bash script, for example, can help you identify which.
  - The text of the content may be polluted with messy HTML. Worse, it may even have broken links. Don’t leave it to the end - deal with this early on, as it can be a tough one.
  - If your data should be translated, decide in advance what to do if some translated items are missing.

That’s it, I’d love to hear other best practices devs are applying to their migration projects.

Helena Eksler, 01 April 2016
Mar 30 2016
Mar 30

If you have ever had to setup Travis-CI, I’m sure you know where the biggest pain point is - debugging the machine.

We have lost so many hours on intermediate commits, waiting for Travis to spin up a new box and run the scripts, only to see it fail on the last command. Not to mention how hard it is to debug a web app on a machine you don’t have direct access to, with no browser.

But there’s a new and better way - we can use Docker inside Travis.

What does that mean? In short, it means that setting up Travis, and debugging your test scripts has just become much easier. You can do all the setup, and the debugging from your own local computer!

Behat tests running from within the container

In fact, even our .travis.yml has been reduced to just a few lines - docker build or docker pull an existing image, and then just docker run it, as all the setup and tests are now controlled by your docker scripts:

sudo: false

language: php

services:
  - docker

before_install:
  # Replace with your image name.
  - docker pull gizra/drupal-lamp

script:
  - docker run -it -p 8080:80 gizra/drupal-lamp

You want it too?

Why am I even asking? Of course you want it! So I’ve created a base image for all the Drupal devs to use (or fork) and it’s on docker hub (and on GitHub). The base image will give you Drupal, Apache2, MySQL, Git, Composer, Drush, etc.

I also wrapped a Hedley skeleton project and it’s available here.

Next time you need to setup Travis, don’t try to do it the “old” way - just remember it’s easier to debug on your own machine.

Mar 17 2016
Mar 17

By default, IT projects will fail. They may still launch, but fail to meet any one of the following requirements of a successful project: on time, within budget, with happy users.

Gizra addresses this by defining the ambiguous with the simple but powerful principle that the budget dictates the project. It all begins (and ends) at the price estimation. Our budget breakdown creates a common language for all stakeholders and sets expectations early on.

After a discovery phase, the project is broken down into timeboxed tasks - each with a clear deadline and cost to produce the deliverable. We set the maximum time to complete a task at 12 hours because this number is easily grasped by all stakeholders. Every feature and bug has a price tag, and we encourage our clients not to try to buy them all (at least not at first).

An example of our budget breakdown:

The awareness of time increases productivity. Deliverables and deadline expectations are clear, and the risk of scope creep and over-engineering (perfect is the enemy of good) is mitigated. Awareness of cost encourages thinking about value - does it make sense for our clients to pay this much for this task?

In the rare case that a task exceeds the timebox, a review process is triggered to identify the cause and is quickly remedied. Identifying an issue - such as a wrong estimate, design problems, lack of understanding of a task, or missing critical information - at the task level, before it compounds, ensures a low cost of mistakes.

Determining success at the task level - was the task completed on time and in budget - provides a real-time measurement of success. At any stage of the project we have a measurable way to identify where we stand and how aligned we are with the set expectations.

In the case of lean demonstrations of a product idea (Proof-of-Concept, Prototype, Minimal Viable Product) the scope is not yet defined so the uncertainty and inherent risk can be immediately addressed by first timeboxing at the project-wide level. Web based product ideas can be successfully demonstrated within 3-4 months of development and with 1-3 developers which is easily translated into a budget range. With this anchor, clients can move forward confidently with the project.

The beauty of this budget driven approach is that it works for most type and size projects. Gizra has found this to be an invaluable premise in our efforts to execute successful projects.

Feb 29 2016
Feb 29

When we develop a website we should take care of many things like design, responsiveness, speed, QA - and of course, security.

One of the major security concerns in websites and web applications is Cross Site Scripting (XSS). You definitely don’t want somebody to run their own malicious code in your website. And to avoid it - you would like to have some kind of “vaccine” from such “disease”.

We all know about Drupal’s check_plain(), filter_xss(), and similar functions that sanitize user-generated text, but unless we are actively checking for XSS, how can we be sure we’ve added them in all the right places?
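check_plain() and filter_xss() are PHP, but the principle is language-neutral. As a rough illustration (this is not Drupal code - the function name is made up), escaping turns an XSS probe into inert text:

```python
import html

def check_plain_like(text: str) -> str:
    """Rough analogue of what Drupal's check_plain() does: encode
    HTML special characters so user input renders as text, not as
    markup the browser would execute."""
    return html.escape(text, quote=True)
```

Running a probe string like `<script>alert('XSS')</script>` through it yields only escaped entities, so the browser displays the string instead of running it.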

Well, we found a nice solution for it that can be easily applied in your projects as well.

XSS on the title and body fields of an article

Assuming you are creating some dummy data during the development - you should create an “XSS Example” for each content type, vocabulary etc.

But creating dummy data after each re-install during development is annoying. All of Gizra’s projects are scaffolded from our home grown hedley generator, which comes with migration files that help with adding dummy and real content to your Drupal site. When we add our own dummy migration data, we include deliberate XSS in the title, body and text fields, like in the image above.
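A dummy-content row for such a migration might look like this (field names and payloads are illustrative - the point is that every text field carries a deliberate script tag, so any spot that renders it unsanitized pops a very visible JS alert):

```php
<?php

// Illustrative dummy-content rows for a migration: each text field
// carries a deliberate XSS payload so unsanitized output is caught early.
$dummy_articles = [
  [
    'title' => 'XSS article <script>alert("xss-title")</script>',
    'body'  => 'Some body text <script>alert("xss-body")</script>',
  ],
];

foreach ($dummy_articles as $article) {
  // In a real project these rows would be fed to the Migrate module.
  echo $article['title'] . "\n";
}
```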

XSS properly sanitized on the node view

If you are following this best practice of using the Migration module to import content into your platform, you should add “XSS Examples” for each migration. If you start from Hedley, you are lucky, as I’ve just added such an example there.

Of course it’s not a panacea, but at least you reduce the chance someone will be able to inject dangerous script into your website.

For the most part Drupal core will already take care of preventing XSS, but for contrib modules and your custom code extra attention is required.
In fact, we’ve found that one of the most widely used contrib modules doesn’t sanitize page titles properly in a specific scenario involving i18n (obviously we opened a security issue, and a patch was already created).

JS alert popping up is the sign for XSS in action

The point is, if we hadn’t worked with infected data, we wouldn’t have noticed the security hole. By having this in place, we further minimized our site’s security threat.

Give it a try, and notice how a JS alert page pops up in unexpected places from time to time. And then be grateful you found it, before it reached your production.

Jan 24 2016

We are often challenged with the maintenance of existing projects that were developed by other agencies, or a new developer arrives and we need to quickly bring them on board. The complexity of legacy projects can be very high and the risk of breaking existing logic is something we want to avoid.

One way we like to look at a project before diving into the code is through its data structure. The different entities and their relations can tell us a lot about the site’s business logic and internal workings. We assumed that if we could easily generate a graph with all the bundles, entities, and their relations, this complex task would be easier.

Having done this for a while now, I believe our assumption was right. Taking our open-source Productivity project (Gizra’s internal ERP/Project management system) as an example, it’s much easier to look at the following graph and understand that Work session, Time tracking, Payment, and Github Issue bundles are pointing at a Project, which in turn points to an Account bundle. Github Issue can also reference itself.

Thanks to Damien Tournoud’s previous work, getting the graph out wasn’t too complicated. His sandbox module already outputs the entities and bundles. It was merely missing the relationships between them, and a UI to display the graph.

So, standing on the shoulders of giants, we’ve adapted the code and now have an Entities Diagram module.


There are two interfaces to get the graph: a Drush command and an admin UI.


When using Drush we need the Graphviz command line library to generate the graph, since the module only generates a text file in the [DOT language](https://en.wikipedia.org/wiki/DOT_(graph_description_language)).

Use this from the command line to get an image:
drush entitiesdiagram | dot -Gratio=0.7 -Eminlen=2 -T png -o ./test.png
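The intermediate DOT text for the Productivity example above would look roughly like this (simplified, fields omitted; bundle names follow that project):

```dot
digraph entities {
  // Bundles pointing at Project via entity reference fields.
  work_session  -> project;
  time_tracking -> project;
  payment       -> project;
  github_issue  -> project;
  github_issue  -> github_issue; // self reference
  project       -> account;
}
```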


Craft your URL according to your needs:

  • All the entities at /admin/entities-diagram
  • Only the node’s bundles /admin/entities-diagram/node
  • A simplified graph of the node’s bundles, with no fields at /admin/entities-diagram/node/false

Simple, but powerful.

Jan 17 2016

You might have heard of Burning Man. Basically it’s a lot of hippies settling down in the desert for a few days, setting up small camps with different themes that make up a big, temporary city.

It’s not for me.

Radical Self-expression is one of Midburn’s ten principles. © Eyal Levkovich.

And yet, I found myself going to the hackathon of our local Burning Man community as an enthusiastic Drupal developer willing to solve any Drupal issue (and you can assume they had a few). My part was to write the backend, and the obvious choice was using the RESTful module.

Soon I came across a big problem: How can we manage 3rd party applications and make sure they can’t access resources which they shouldn’t have access to?
How can we prevent the Secret Santa application (an app that provides addresses of other Burning Man attendees so they could receive gifts) from accessing a user’s medical qualification data? Or prevent the Midburn questionnaire application from accessing attendees’ private data?

Apps Entity Restrictions in action

Apps Entity Restrictions is our answer to the problem.

With this module, which I developed, you can create a 3rd party application representation and determine which fields or properties each registered application can access on each entity. You can even restrict the allowed CRUD operations.

While working on Gizra’s modules and projects over the past years, I came to realize that a good API and good DX are what make a good module. By default, any app is restricted from doing any operation - you know, security.

Programmatically creating an application with an allowed GET operation on the body and the node ID looks as you might expect:

$app
  ->setTitle('Demo application')
  // Allow only GET operations.
  ->allow('node', 'methods', 'get')
  // Explicitly allow access to both properties/fields.
  ->allow('node', 'properties', 'nid')
  ->allow('node', 'properties', 'body');

Checking those restrictions via code is easy:

// Check property access.
if (!$app->entityPropertyAccess('get', 'node', 'field_address')) {
  throw new \Exception("This app has no GET access to the address field.");
}

RESTful Integration and Other Bonuses

There is a cool and easy RESTful integration. The module provides a set of APIs for developers to apply this restriction validation on their endpoints. If you’re interested in Decoupled Drupal, you should probably take a look at this.

The next step is baking in some more statistics. Wouldn’t it be great if you could know the usage stats for each application? Apart from revealing usage patterns, information such as the number of invalid requests might help in detecting intrusion attempts.

Requests graphs. Cool, right?
Nov 19 2015
Nov 19

The vast majority of our projects at Gizra use Bootstrap for layout. We spend a lot of time and effort creating the perfect responsive layout and UX across all breakpoints. As Bootstrap comes with four breakpoints by default, we naturally implemented them all, until we started asking ourselves:

Q: Is responsive really needed? A: Yes, of course.

Q: Do we always need so many breakpoints? A: No. Or, to say it differently: Yes. But not necessarily immediately.

Don’t get me wrong. I’m not against responsive design. I’m just saying each breakpoint has an impact on the project length and budget. It’s up to us to help the client decide how many breakpoints are right for them. As you know, Bootstrap can have custom breakpoints.
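In Bootstrap 3, for instance, the breakpoints are plain LESS variables, so trimming or moving them is a small change (the values below are the defaults from variables.less):

```less
// Bootstrap 3 breakpoint variables; override them before compiling
// to change where the layout snaps.
@screen-sm: 768px;  // small - tablets
@screen-md: 992px;  // medium - desktops
@screen-lg: 1200px; // large - large desktops
```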

The “Bootstrap got it right, don’t mess with it” Approach

The project adapts itself to support multiple devices and screen sizes (large desktop, desktop, tablet, mobile).

Bootstrap provides us with a powerful responsive layout and a matching grid system. But with great power comes great responsibility! Our responsibility is making sure the UI is consistent. Making the layout look perfect across all of the breakpoints isn’t always an easy task, and above all, it consumes a considerable amount of project time.

The “Bootstrap is awesome, but default is just a default” Approach

In this approach we define the project responsiveness needs based on quantitative data, from Google Analytics and similar resources, and decide on the breakpoints accordingly. For example, with this approach, we may start only with the large desktop and mobile breakpoints.

This approach doesn’t mean we cannot change or add more breakpoints in the future, however it allows us to concentrate on the vital elements in order to get the project out the door. In fact, there is a good chance we will find out there is a need for all Bootstrap’s default breakpoints, or even more - but at least we have actively, and mindfully, decided to do it.

Discovery Stage

The discovery stage of our projects is so important it probably deserves its own blog post, but let’s focus only on the breakpoints aspect:

Project Audience

In our case the audience is devices (phones, tablets, desktops). Concentrate on the currently most important audience. If possible, try to analyze based on real data (e.g. Google Analytics) to get a better idea of the kind of traffic we have, and adjust accordingly.

Time & Budget

Our project’s time estimation grows, and with it the budget, with every extra breakpoint. We often like to put it into numbers for the client and say (amounts may vary, but it helps get the point across): “Each breakpoint will cost you $2,500. Let’s skip two of them, and invest that $5,000 in your core features.”


To better understand the impact of this approach you can check the demo site. Notice how for example in col-md size, the above layout seems cut.

Wonder what normal people do when they see their site cut?

They simply make the browser’s window bigger :)

Oct 01 2015

Part of my job is to get my hands dirty with technologies I stumble upon. I’ve decided to have a go at React. Well, one thing led to the other and it seems I went down the client side rabbit hole. I’d like to share with you my path - watch out though, it’s a slippery slope.

“Hello World” in Elm

It all started with this Thinking Flux video which explains the problems Facebook faced in its front-end and the new application architecture they are now using.

Once the Flux concept was out, different libraries were written implementing it, but in my view Redux is the winner in terms of simplicity, popularity, docs and community. I really recommend going over it - at least the intro and basics. You might be tempted to actually learn a bit of React (tutorial) to follow the examples more easily.

Then I saw Redux was crediting Elm for some of its inspiration, so I decided to give it a quick look. I was immediately blown away by Elm. The syntax is weird (unless you know Haskell), it has a crazy learning curve, but a lot of it makes so much sense.
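For a taste of that syntax, here is a minimal “Hello World” (written in current Elm syntax; the 2015-era standard library looked slightly different):

```elm
-- The entire program: render a single text node.
import Html exposing (text)

main =
  text "Hello World"
```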

The following recording is a presentation I first gave internally for Gizra devs, then as a BoF in DrupalCon Barcelona, and finally recorded to share it with everyone.

[embedded content]

My goal is to get more people excited about Elm so the community and contributions grow. I feel it is now very much like Drupal about 10 years ago - a small community, far from mainstream, but with a lot of potential.

Maybe if we’ll draw from Drupal’s experience in building and cultivating a community we’ll be able to bring this awesome tool closer to the mainstream.

Aug 04 2015
Complex dynamic page

A well known DrupalCon fact is that the action mostly takes place in the hallways and social gatherings. The logic is that the sessions are recorded, and the rest isn’t.

At DrupalCon L.A. I spent most of my time in the hallways stalking people to show them the newly born Shoov and ask for their feedback. One of those people was my good friend Mike Anello, who later expressed his feelings about Shoov in this fun DrupalEasy podcast.

A few months later, with the next DrupalCon already around the corner and Shoov maturing every day, I contacted him to get his feedback on what we have so far, and I got this:

I think one that could help me and other developers is almost something like a glossary. I’ve heard of many of the technologies in your visual monitoring “stack”, but not entirely clear on what the purpose of each one is (Yeoman, WebdriverCSS, mocha, etc… - Behat I know!)

Let’s start from the end, because those three words made an impact on me - “Behat I know”.

As a reader of this post you are probably familiar with Behat or at least know what functional testing frameworks are.

Do you remember the time when you didn’t?
Do you remember when functional testing was just a nice-to-have for you, and not the life saver it is nowadays?

I believe visual monitoring is at the same place Behat was three years ago when we started using it. It was a thing we heard could maybe help us with the biggest pain points in development - bugs and regressions - but we didn’t realize how much it would change the way we work (hint: the deploy cycle became much shorter and more solid).

Three years later down that road, I cannot imagine a project without automatic tests.

So to answer Mike’s question I started writing a glossary; which evolved into basic setup and installation notes; which in turn evolved into a series of complete step-by-step tutorials (and some more chapters are in the making). In fact, we’ve published a dummy site just so everybody could easily swallow the blue pill and start adding visual regression tests to their sites.

I bet that after you complete the tutorials and have your first tests committed to your GitHub repository, you will have a similar reaction to mine - you will ask yourself how you and your team had the audacity to work on your websites’ CSS without having such a tool in place.

Jul 20 2015

In recent months I’ve been demoing visual monitoring to many developers. The reaction was always positive, but I’ve realized that not enough people have taken the step from recognizing the need to actually implementing it on their own projects.

If you have been following my recent blog posts or tweets you’ve probably noticed we are trying to bring visual monitoring along with Shoov to the masses. To do so we’re trying to reduce the complexity and codify our “lessons learned”.

Yeoman generators are one way to achieve this. With the new yo shoov, a single command makes sure all the files needed for visual monitoring are immediately scaffolded in your repository. In fact, it also sets up Behat tests along with a .shoov.yml that will allow Shoov to run your visual monitoring tests periodically.

Since visual monitoring might be new for a lot of people, the generator not only scaffolds the files but also attempts to check if your system is properly installed, and tells you how to fix it if not.

Shoov generator in action.

What’s cool about using libraries such as WebdriverCSS in conjunction with 3rd party services such as BrowserStack or Sauce Labs is that you can write once, and execute multiple times in different environments.

# Run on Chrome 43.0 on Mac X 10.10
PROVIDER_PREFIX=saucelabs SELECTED_CAPS=chrome mocha

# Run on Internet Explorer on Windows 7 (capability name illustrative)
PROVIDER_PREFIX=saucelabs SELECTED_CAPS=ie11 mocha

Another nifty feature of WebdriverCSS is the ability to test the same pages on multiple viewports with a single line of code:

.webdrivercss(testName + '.homepage', {
  name: '1',
  // ...
  // Test on multiple view ports.
  screenWidth: selectedCaps == 'chrome' ? [320, 640, 960, 1200] : undefined,
}, shoovWebdrivercss.processResults)

Best Practices

Visual monitoring best practices are beyond lines of code - they are about the mindset needed when approaching this task. Here are the important ones:

  1. Don’t try to achieve 100% test coverage. Assuming that up until now you had 0% test coverage, it’s probably fine to reach 40% - so don’t feel bad when you exclude or remove the dynamic parts. Their functionality can be complemented by functional testing with Behat or CasperJS.
  2. Don’t compromise by using PhantomJS. Your sites are being consumed by real people on real browsers. Use 3rd party services, or your own Selenium cloud to run tests. It’s well worth the money.
  3. Given the last point - the time the tests take to run should be considered a resource. Having a gazillion tests would eat much of that resource. Try to find the balance: as little effort as possible, with as much gain as possible.
  4. Look (and contribute) to examples. The Drupal.org monitoring example is there for the community to learn. While the repo has been “Shoovified”, there’s actually zero vendor lock-in - the WebdriverCSS code is valid with or without Shoov.
  5. Start simple. Visual monitoring is very powerful, but requires time to master. If you overdo it with lots of baseline images from the beginning, you might need to adapt them very often until you get the hang of it.

Shoov’s Next Steps

Apart from adding a nicer design to the application, Pusher integration for real time notifications, and lots of smaller features, the Gizra devs are constantly trying to push code to other projects, namely RESTful and yo hedley. We hope you can enjoy and use that work on your own Headless Drupal projects, and of course hope the community will contribute back.

Shoov's dashboard.

We’ve also begun work on a getting started tutorial, which will help guide developers through examples to reach visual monitoring best practices nirvana.

Jun 01 2015

Irony presents itself in many forms. Not being able to login to a site that is responsible for testing that the login is working properly on other live sites, is one of them.

As much as I’d like to say I was able to enjoy the irony, the six hours I spent tracking the bug were slightly frustrating.

One of the things Shoov is built for is assisting us with a quick configuration of live site monitoring using your preferred functional testing tool (e.g. Behat or CasperJS). As awesome as services like Pingdom are, they still provide very little insight into what’s actually going on in the site. In fact, according to Pingdom, Shoov was up and running, even though no user could have logged in.

Shoov login now working. When it wasn't, the fish were sad

Post Mortem

At this point in time I think very few people care about the “post mortem” of this incident, since Shoov is still a work in progress; however, some interesting lessons were learned, and some contributions were made.

First, a quick overview of the problem. Shoov is a fully decoupled Drupal, where the only way to register and log in is through GitHub login. Upon login, after the user is redirected back from GitHub’s own authentication, the client sends the one-time access code it got from GitHub to our Drupal server to do a final validation and register the user.

The login worked perfectly, until suddenly it didn’t. To make things even harder my local installation worked fine.

After more hours than I’m willing to admit and a good pointer from @izuzak from GitHub’s support team, we figured out that my local was sending the following POST:

POST /login/oauth/access_token HTTP/1.0
User-Agent: Drupal (+http://drupal.org/)
Host: github.com
Content-Length: 119


While the live server sent this:

POST /login/oauth/access_token HTTP/1.0
User-Agent: Drupal (+http://drupal.org/)
Host: github.com
Content-Length: 111


As you might have noticed (and I can’t blame you if you didn’t), the & char was escaped, and GitHub decided it no longer liked that.
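The post elides the request bodies, but the symptom is easy to reproduce in plain PHP: http_build_query() accepts an argument separator (and falls back to the arg_separator.output ini setting), so an “&amp;” separator silently produces exactly this kind of subtly different body. The parameter names below are illustrative:

```php
<?php

$params = [
  'code' => 'one-time-code',
  'client_id' => 'abc',
];

// Explicit '&' separator: the body GitHub expects.
$plain = http_build_query($params, '', '&');

// An '&amp;' separator (e.g. inherited from arg_separator.output)
// yields a slightly longer body with HTML-escaped separators.
$escaped = http_build_query($params, '', '&amp;');

echo $plain . "\n";   // code=one-time-code&client_id=abc
echo $escaped . "\n"; // code=one-time-code&amp;client_id=abc
```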

Solve and Prevent Regressions

After quickly solving the error by hardcoding the & char, I decided to spend some time figuring out how I could prevent this from happening again. (Remember: Shoov means “again” in Hebrew for this very reason…)

I noticed that even though RESTful had thrown an exception when it got the result from GitHub, and even though the site is piping the logs to Loggly, I wasn’t notified about it.

So, the first thing I did was write a pull request to RESTful to make sure that exceptions are registered in the watchdog. This means we now have that part covered not just for Shoov, but for all users of RESTful!

Next was writing a Behat test to be executed by Shoov every few minutes, constantly verifying that Shoov’s login is working properly. At least this unfortunate bug will not go unnoticed should it return some day (as unfortunate bugs tend to do from time to time).

Having to do all that, along with wanting to have it as a public repository, gave me the push to finally introduce the concept of encrypted keys. Since we don’t want the credentials of the dummy GitHub user we’ve created for the test to be exposed, Shoov will now have a secret private key that can be used for AES encryption. Shoov makes sure those encrypted variables will be available in our tests.
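The mechanics are the standard symmetric-encryption dance; here is a minimal OpenSSL sketch (Shoov’s actual key handling and cipher choice may differ):

```php
<?php

// Symmetric AES-256-CBC round trip: the server holds $key privately,
// while the encrypted blob can safely live in a public .shoov.yml.
$key = random_bytes(32); // the secret private key
$iv  = random_bytes(16); // per-message initialization vector

$credential = 'dummy-github-password';

$cipher = openssl_encrypt($credential, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);
$plain  = openssl_decrypt($cipher, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);

// Only a holder of $key can recover the credential.
echo $plain . "\n"; // dummy-github-password
```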

Encrypted keys using the same syntax as Travis in .shoov.yml file

And you know what else is great? Since all the different components of Shoov are open source, we can enjoy sharing the code that does the decryption with everyone.

May 23 2015

In this guest post, Luke Herrington shares his experience with integrating an existing Drupal backend with a Backbone.Marionette Todo app.

If you’re reading this, you probably already know about all of the great work that Gizra has done in the Drupal/REST space. If you haven’t, I highly recommend you check out their github repo. Also see the RESTful module.

One of the projects that Amitai has contributed is Todo Restful. It shows an Angular implementation of the canonical TodoMVC Javascript app connecting to a headless Drupal backend. It’s a great demonstration of how easy exposing Drupal content with the RESTful module is. It also shows that when a RESTful API adheres to best practices, connecting it with clients that follow the same best practices is like a nice handshake.

I saw the Todo Restful project and it got me thinking, “If Amitai did this right (hint: he did), then I should be able to get this working with Backbone pretty easily”. I was pleasantly surprised!

Todo app with a Drupal backend

Here’s a simplified list of everything I had to do to get it working:

  1. Fork the repo
  2. Delete everything in the client/app directory. (the Angular TodoMVC stuff)
  3. Put Backbone.Marionette implementation of TodoMVC into client/app directory.
  4. Change the API endpoint in the Backbone code, and override the parse functions on the Todo model/collection to dig into the data portion of the response from Drupal. This was necessary because Backbone expects the response for a collection to be an array of models; the RESTful module sends back other data as well, so it places the models inside an array named data. All I had to do was tell Backbone where to look.
  5. Edit Grunt file to work with the new app code. This was the hardest part because it was specific to the Angular app.
  6. Test, Commit, Deploy. Amitai set up a Grunt task to deploy the client side code to a GitHub project page.

Note: There are detailed instructions on how to get the app running locally in the repo readme.

Notice I didn’t have to touch any code on the Drupal side. Amitai’s amazing installation script spun up the Drupal site for me with the todo content type created and exposed with RESTful. It just worked. In fact, the Backbone.Marionette demo app points to the same backend as the Angular app!

Also notice, except for steps 4 and 5 I didn’t have to touch the Backbone code!

Now imagine applying this to your project… If your requirements demand a lot of interactivity and slick UI elements, go for it! You don’t even have to go fully headless if you don’t want to. See the RESTful module documentation on how to expose your content and start innovating. The possibilities are endless!


Amitai and Josh Koenig (Pantheon), in their talk at DrupalCon LA, spoke about Headless Drupal and showed how Drupal can still be relevant in an age where client side frameworks (Angular, Ember, React, Backbone…) rule. It made me excited about Drupal again and showed a whole new way that Drupal can further “Get off the island” and start to play nicely with other technologies.

In the end, that’s why I went through this exercise: To show that Drupal is a viable CMS backend for not just Angular, but also Backbone.Marionette or any other front end framework for that matter! There’s a ton of front-end developers out there that don’t know what Drupal is capable of. I hope that by showing how easy this was, front-end devs can see that Drupal is more relevant than ever and that it makes their life really easy. I also did this to show current Drupal devs that with Headless Drupal we can imagine displaying and interacting with our content in ways we never have before.

So who’s next? React? Ember?

May 20 2015

As we dive deeper into visual regression testing in our development workflow we realize a sad truth: on average, we break our own CSS every week and a half.

Don’t feel bad for us, as in fact I’d argue that it’s pretty common across all web projects - they just don’t know it. It seems we all need a system that will tell us when we break our CSS.

While we don’t know of a single (good) system that does this, we were able to connect together a few (good) systems to get just that, with the help of: Travis-CI, webdriverCSS, Shoov.io, BrowserStack/Sauce Labs, and ngrok. Oh my!

Don’t be alarmed by the long list. Each one of these does one thing very well, and combining them together was proven to be not too complicated, nor too costly.

You can jump right into the .travis file of the Gizra repo to see its configuration, or check the webdriverCSS test. Here’s the high level overview of what we’re doing:

Gizra.com is built on Jekyll, but visual regression can be executed on any site, regardless of the underlying technology. Travis is there to help us build a local installation. Travis also allows adding encrypted keys, so even though the repo is public, we were able to add our Shoov.io and ngrok access tokens in a secure way.

We want to use services such as BrowserStack or Sauce-Labs to test our local installation on different browsers (e.g. latest chrome and IE11). For that we need to have an external URL accessible by the outside world, which is where ngrok comes in: ngrok http -log=stdout -subdomain=$TRAVIS_COMMIT 9000 from the .travis.yml file exposes our Jekyll site inside the Travis box to a unique temporary URL based on the Git commit (e.g. https://someCommitHash.ngrok.io).
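Stitched together, the relevant .travis.yml steps look roughly like this (a simplified, illustrative fragment; the real file lives in the Gizra repo):

```yaml
before_script:
  # Serve the built Jekyll site locally on port 9000.
  - jekyll serve --detach --port 9000
  # Expose it to the outside world on a per-commit subdomain.
  - ./ngrok http -log=stdout -subdomain=$TRAVIS_COMMIT 9000 &
script:
  # Run the webdriverCSS tests against the ngrok URL.
  - mocha
```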

WebdriverCSS tests are responsible for capturing the screenshots, and comparing them against the baseline images. If a regression is found, it will be automatically pushed to Shoov, and a link to the regression would be provided in the Travis log. This means that if a test was broken, we can immediately see where’s the regression and figure out if it is indeed a bug - or, if not, replace the baseline image with the “regression” image.

Visual regression found and uploaded to shoov.io


Some gotchas to be aware of:

Even though visual regression testing with BrowserStack or Sauce Labs takes more time than running it on PhantomJS, it’s recommended to use such tools, since they test your site against real browsers.

Those tools cost money, but we find that it’s well worth it. We are currently using BrowserStack ($99/month), though we’re running into some issues with it not having an internal queue system - so if you reach your limit on virtual host concurrency, your tests will simply fail. For that reason we might switch to Sauce Labs ($149/month), which also provides more concurrent VMs.

Blog post page tested on IE11, Windows 7

Travis is limited to 50 minutes of execution time. Capturing each image might take about 30-90 seconds, so once you have lots of tests, you should probably split them.

The free plan of ngrok allows only a single concurrent tunnel to be opened. Even though BrowserStack and Sauce Labs provide their own tunneling solutions, we decided to go with ngrok, in order to provide a more generic solution. We happily upgraded to the $25/month business account following our excellent experience with the free account.

May 07 2015

Monitoring your live site is a pretty good idea - that’s generally agreed. Same goes for visual regression testing. Doing it, however, is hard. Enough so that very few companies actually do visual regression testing/monitoring, so don’t feel bad if you haven’t either until now. But after reading this post you should seriously consider doing it. Or at least give it a try.

For example, here’s an overview of how we could monitor Twitter, if someone would actually ask us to (as always you can jump right into the repository):

Visual regression on a Twitter page. So much functionality has been asserted in this simple screenshot

Visual Regression is Hard

Getting a baseline screenshot of a page requires some thinking, and a bit of trial and error. Twitter’s main page has lots of dynamic content: the user’s tweets count, followers, trending topics, actual tweets etc. Obviously our screenshot cannot include that info.

Luckily webdriverCSS already comes with some commands that help us to exclude (place a black rectangle) or even remove (completely hide) an element.

In the test file you can see we selected only the CSS selectors that need to be excluded or removed, leaving our page with some data, but not all of it. Don’t be discouraged by not covering 100% though. Even with some parts hidden we have already asserted so much functionality - certainly more than having no visual test in place.
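A sketch of how that looks in a test, following the same chained style as the repo’s tests (the selectors here are illustrative):

```js
.webdrivercss(testName + '.homepage', {
  name: '1',
  // Black-box dynamic counters instead of comparing their pixels.
  exclude: ['.ProfileNav-item--tweets a'],
  // Drop the tweets and trends from the screenshot entirely.
  remove: ['.stream .tweet', '.trends'],
}, shoovWebdrivercss.processResults)
```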

Functional testing for dynamic parts

A screenshot is a powerful tool, but not the only one. We still have functional testing frameworks in our toolbelt. Behat can easily be used to assert that the number of tweets feature is working.

At the time of writing @gizra_drupal has about 340 tweets, so we could write a simple test asserting a minimum of 300 tweets (giving us the flexibility to delete a few without breaking the tests).

The Behat PHP code to implement this is fairly straightforward. It finds the value in the HTML, converts it into an integer, and asserts it has a minimum value.
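A hedged sketch of that step, outside of Behat (the markup and selector are illustrative, not Twitter’s actual classes):

```php
<?php

// Pull the tweet count out of profile markup and assert a floor,
// mirroring the Behat step described above.
$html = '<span class="ProfileNav-value">340</span>';

preg_match('/ProfileNav-value">([\d,]+)</', $html, $matches);
$count = (int) str_replace(',', '', $matches[1]);

if ($count < 300) {
  throw new \Exception("Expected at least 300 tweets, found $count.");
}
echo "Tweet count OK: $count\n";
```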


Up until now we didn’t need Shoov, which is good, since Shoov is agnostic to which tools you use - it’s only there to help you deal with the accumulated images and regressions.

Before we've hidden the spinner, Shoov showed us the regression

It’s important to realize that once you go down the visual regression road it’s hard to stop. Suddenly PhantomJS isn’t good enough when you can use BrowserStack or Sauce Labs to validate the site on many platforms and browsers. Testing your site on just one screen size isn’t enough either.
webdriverCSS comes with the powerful screenWidth property, which should make it super simple to test multiple viewports as soon as this issue is fixed.

New stuff

  • We now have a Shoov.io page, which is updated from time to time with new info.
  • app.shoov.io is now the app site, sitting on Amazon S3 and CloudFront with SSL certificate, so you should feel safer using it.
  • “Dev Tips” is a new concept we’ve introduced. Since the site and all of its sub-components are open sourced, we have added a few tips in each page to help you - the developer - to better understand how the system works, and maybe even encourage you to contribute.
Login to Shoov.io to learn more from the dev tips

