Upgrade Your Drupal Skills

We trained 1,000+ Drupal Developers over the last decade.

See Advanced Courses NAH, I know Enough
Apr 10 2015
Apr 10

A Silex and Elasticsearch app powered by Drupal 7 for content management

In the previous article I started exploring the integration between Drupal 7 and the Elasticsearch engine. The goal was to see how we can combine these open source technologies to achieve a high performance application that uses the best of both worlds. If you’re just now joining us, you should check out this repository which contains relevant code for these articles.


We’ll now create a small Silex application that reads data straight from Elasticsearch and returns it to the user.

Silex app

Silex is a great PHP micro framework developed by the same people that are behind the Symfony project. It is in fact using mainly Symfony components but at a more simplified level. Let’s see how we can get started really quickly with a Silex app.

There is more than one way. You can add it as a dependency to an existent composer based project:

"silex/silex": "~1.2",

Or you can even create a new project using a nice little skeleton provided by the creator:

composer.phar create-project fabpot/silex-skeleton

Regardless of how your project is set up, in order to access Elasticsearch we’ll need to use its PHP SDK. That needs to be added to Composer:

"elasticsearch/elasticsearch": "~1.0",

And if we want to use Twig to output data, we’ll need this as well (if not already there of course):

"symfony/twig-bridge": "~2.3"

In order to use the SDK, we can expose it as a service to Pimple, the tiny Silex dependency injection container (much easier than it sounds). Depending on how our project is set up, we can do this in a number of places (see the repository for an example). But basically, after we instantiate the new Silex application, we can add the following:

$app['elasticsearch'] = function() {
  return new Client(array());

This creates a new service called elasticsearch on our app that instantiates an object of the Elasticsearch Client class. And don’t forget we need to use that class at the top:

use Elasticsearch\Client;

Now, wherever we want, we can get the Elasticsearch client by simply referring to that property in the $app object:

$client = $app['elasticsearch'];

Connecting to Elasticsearch

In the previous article we’ve managed to get our node data into the node index with each node type giving the name of an Elasticsearch document type. So for instance, this will return all the article node types:


We’ve also seen how to instantiate a client for our Elasticsearch SDK. Now it’s time to use it somehow. One way is to create a controller:


namespace Controller;

use Silex\Application;
use Symfony\Component\HttpFoundation\Response;

class NodeController {

   * Shows a listing of nodes.
   * @return \Symfony\Component\HttpFoundation\Response
  public function index() {
    return new Response('Here there should be a listing of nodes...');
   * Shows one node
   * @param $nid
   * @param \Silex\Application $app
   * @return mixed
  public function show($nid, Application $app) {
    $client = $app['elasticsearch'];
    $params = array(
      'index' => 'node',
      'body' => array(
        'query' => array(
          'match' => array(
            'nid' => $nid,
    $result = $client->search($params);
    if ($result && $result['hits']['total'] === 0) {
      $app->abort(404, sprintf('Node %s does not exist.', $nid));
    if ($result['hits']['total'] === 1) {
      $node = $result['hits']['hits'];
      return $app['twig']->render('node.html.twig', array('node' => reset($node)));

Depending on how you organise your Silex application, there are a number of places this controller class can go. In my case it resides inside the src/Controller folder and it’s autoloaded by Composer.

We also need to create a route that maps to this Controller though. Again, there are a couple of different ways to handle this but in my example I have a routes.php file located inside the src/ folder and required inside index.php:


use Symfony\Component\HttpFoundation\Response;

 * Error handler
$app->error(function (\Exception $e, $code) {
  switch ($code) {
    case 404:
      $message = $e->getMessage();
      $message = 'We are sorry, but something went terribly wrong. ' . $e->getMessage();

  return new Response($message);

 * Route for /node
$app->get("/node", "Controller\\NodeController::index");

 * Route /node/{nid} where {nid} is a node id
$app->get("/node/{nid}", "Controller\\NodeController::show");

So what happens in my example above? First, I defined an error handler for the application, just so I can see the exceptions being caught and print them on the screen. Not a big deal. Next, I defined two routes that map to my two controller methods defined before. But for the sake of brevity, I only exemplified what the prospective show() method might do:

  • Get the Elasticsearch client
  • Build the Elasticsearch query parameters (similar to what we did in the Drupal environment)
  • Perform the query
  • Check for the results and if a node was found, render it with a Twig template and pass the node data to it.
  • If no results are found, abort the process with a 404 that calls our error handler for this HTTP code declared above.

If you want to follow this example, keep in mind that to use Twig you’ll need to register it with your application. It’s not so difficult if you have it already in your vendor folder through composer.

After you instantiate the Silex app, you can register the provider:

$app->register(new TwigServiceProvider());

Make sure you use the class at the top:

use Silex\Provider\TwigServiceProvider;

And add it as a service with some basic configuration:

$app['twig'] = $app->share($app->extend('twig', function ($twig, $app) {
  return $twig;
$app['twig.path'] = array(__DIR__.'/../templates');

Now you can create template files inside the templates/ folder of your application. For learning more about setting up a Silex application, I do encourage you to read this introduction to the framework.

To continue with our controller example though, here we have a couple of template files that output the node data.

Inside a page.html.twig file:

        <!DOCTYPE html>
            {% block head %}
                <title>{% block title %}{% endblock %} - My Elasticsearch Site</title>
            {% endblock %}
        <div id="content">{% block content %}{% endblock %}</div>

And inside the node.html.twig file we used in the controller for rendering:

{% extends "page.html.twig" %}

{% block title %}{{ node._source.title }}{% endblock %}

{% block content %}

        <h1>{{ node._source.title }}</h1>

        <div id="content">

            {% if node._source.field_image %}
                <div class="field-image">
                    {% for image in node._source.field_image %}
                        <img src="http://www.sitepoint.com/integrate-elasticsearch-silex//{{ image.url }}" alt="img.alt"/>
                    {% endfor %}
            {% endif %}

            {% if node._source.body %}
                <div class="field-body">
                    {% for body in node._source.body %}
                        {{ body.value|striptags('<p><div><br><img><a>')|raw }}
                    {% endfor %}
            {% endif %}

{% endblock %}

This is just some basic templating for getting our node data printed in the browser (not so fun otherwise). We have a base file and one that extends it and outputs the node title, images and body text to the screen.

Alternatively, you can also return a JSON response from your controller with the help of the JsonResponse class:

use Symfony\Component\HttpFoundation\JsonResponse;

And from your controller simply return a new instance with the values passed to it:

return new JsonResponse($node);

You can easily build an API like this. But for now, this should already work. By pointing your browser to http://localhost/node/5 you should see data from Drupal’s node 5 (if you have it). With one big difference: it is much much faster. There is no bootstrapping, theming layer, database query etc. On the other hand, you don’t have anything useful either out of the box except for what you build yourself using Silex/Symfony components. This can be a good thing or a bad thing depending on the type of project you are working on. But the point is you have the option of drawing some lines for this integration and decide its extent.

One end of the spectrum could be building your entire front end with Twig or even Angular.js with Silex as the API backend. The other would be to use Silex/Elasticsearch for one Drupal page and use it only for better content search. Somewhere in the middle would probably be using such a solution for an entire section of a Drupal site that is dedicated to interacting with heavy data (like a video store or something). It’s up to you.


We’ve seen in this article how we can quickly set up a Silex app and use it to return some data from Elasticsearch. The goal was not so much to learn how any of these technologies work, but more of exploring the options for integrating them. The starting point was the Drupal website which can act as a perfect content management system that scales highly if built properly. Data managed there can be dumped into a high performance data store powered by Elasticsearch and retrieved again for the end users with the help of Silex, a lean and fast PHP framework.

Apr 03 2015
Apr 03

A Silex and Elasticsearch app powered by Drupal 7 for content management

In this tutorial I am going to look at the possibility of using Drupal 7 as a content management system that powers another high performance application. To illustrate the latter, I will use the Silex PHP microframework and Elasticsearch as the data source. The goal is to create a proof of concept, demonstrating using these three technologies together.


The article comes with a git repository that you should check out, which contains more complete code than can be presented in the tutorial itself. Additionally, if you are unfamiliar with either of the three open source projects being used, I recommend following the links above and also checking out the documentation on their respective websites.

The tutorial will be split into two pieces, because there is quite a lot of ground to cover.

In this part, we’ll set up Elasticsearch on the server and integrate it with Drupal by creating a small, custom module that will insert, update, and delete Drupal nodes into Elasticsearch.

In the second part, we’ll create a small Silex app that fetches and displays the node data directly from Elasticsearch, completely bypassing the Drupal installation.


The first step is to install Elasticsearch on the server. Assuming you are using Linux, you can follow this guide and set it up to run when the server starts. There are a number of configuration options you can set here.

A very important thing to remember is that Elasticsearch has no access control so, once it is running on your server, it is publicly accessible through the (default) 9200 port. To avoid having problems, make sure that in the configuration file you uncomment this line:

network.bind_host: localhost

And add the following one:

script.disable_dynamic: true

These options make sure that Elasticsearch is not accessible from the outside, nor are dynamic scripts allowed. These are recommended security measures you need to take.


The next step is to set up the Drupal site on the same server. Using the Elasticsearch Connector Drupal module, you can get some integration with the Elasticsearch instance: it comes with the PHP SDK for Elasticsearch, some statistics about the Elasticsearch instance and some other helpful submodules. I’ll leave it up to you to explore those at your leisure.

Once the connector module is enabled, in your custom module you can retrieve the Elasticsearch client object wrapper to access data:

$client = elastic_connector_get_client_by_id('my_cluster_id');

Here, my_cluster_id is the Drupal machine name that you gave to the Elasticsearch cluster (at admin/config/elasticsearch-connector/clusters). The $client object will now allow you to perform all sorts of operations, as illustrated in the docs I referenced above.

Inserting data

The first thing we need to do is make sure we insert some Drupal data into Elasticsearch. Sticking to nodes for now, we can write a hook_node_insert() implementation that will save every new node to Elasticsearch. Here’s an example, inside a custom module called elastic:

 * Implements hook_node_insert().
function elastic_node_insert($node) {
  $client = elasticsearch_connector_get_client_by_id('my_cluster_id');
  $params = _elastic_prepare_node($node);

  if ( ! $params) {
    drupal_set_message(t('There was a problem saving this node to Elasticsearch.'));

  $result = $client->index($params);
  if ($result && $result['created'] === false) {
    drupal_set_message(t('There was a problem saving this node to Elasticsearch.'));

  drupal_set_message(t('The node has been saved to Elasticsearch.'));

As you can see, we instantiate a client object that we use to index the data from the node. You may be wondering what _elastic_prepare_node() is:

 * Prepares a node to be added to Elasticsearch
 * @param $node
 * @return array
function _elastic_prepare_node($node) {

  if ( ! is_object($node)) {

  $params = array(
    'index' => 'node',
    'type' => $node->type,
    'body' => array(),

  // Add the simple properties
  $wanted = array('vid', 'uid', 'title', 'log', 'status', 'comment', 'promote', 'sticky', 'nid', 'type', 'language', 'created', 'changed', 'revision_timestamp', 'revision_uid');
  $exist = array_filter($wanted, function($property) use($node) {
    return property_exists($node, $property);
  foreach ($exist as $field) {
    $params['body'][$field] = $node->{$field};

  // Add the body field if exists
  $body_field = isset($node->body) ? field_get_items('node', $node, 'body') : false;
  if ($body_field) {
    $params['body']['body'] = $body_field;

  // Add the image field if exists
  $image_field = isset($node->field_image) ? field_get_items('node', $node, 'field_image') : false;
  if ($image_field) {
    $params['body']['field_image'] = array_map(function($img) {
      $img = file_load($img['fid']);
      $img->url = file_create_url($img->uri);
      return $img;
    }, $image_field);

  return $params;

It is just a helper function I wrote, which is responsible for “serializing” the node data and getting it ready for insertion into Elasticsearch. This is just an example and definitely not a complete or fully scalable one. It is also assuming that the respective image field name is field_image. An important point to note is that we are inserting the nodes into the node index with a type = $node->type.

Updating data

Inserting is not enough, we need to make sure that node changes get reflected in Elasticsearch as well. We can do this with a hook_node_update() implementation:

 * Implements hook_node_update().
function elastic_node_update($node) {
  if ($node->is_new !== false) {

  $client = elasticsearch_connector_get_client_by_id('my_cluster_id');
  $params = _elastic_prepare_node($node);

  if ( ! $params) {
    drupal_set_message(t('There was a problem updating this node in Elasticsearch.'));

  $result = _elastic_perform_node_search_by_id($client, $node);
  if ($result && $result['hits']['total'] !== 1) {
    drupal_set_message(t('There was a problem updating this node in Elasticsearch.'));

  $params['id'] = $result['hits']['hits'][0]['_id'];
  $version = $result['hits']['hits'][0]['_version'];
  $index = $client->index($params);

  if ($index['_version'] !== $version + 1) {
    drupal_set_message(t('There was a problem updating this node in Elasticsearch.'));
  drupal_set_message(t('The node has been updated in Elasticsearch.'));

We again use the helper function to prepare our node for insertion, but this time we also search for the node in Elasticsearch to make sure we are updating and not creating a new one. This happens using another helper function I wrote as an example:

 * Helper function that returns a node from Elasticsearch by its nid.
 * @param $client
 * @param $node
 * @return mixed
function _elastic_perform_node_search_by_id($client, $node) {
  $search = array(
    'index' => 'node',
    'type' => $node->type,
    'version' => true,
    'body' => array(
      'query' => array(
        'match' => array(
          'nid' => $node->nid,

  return $client->search($search);

You’ll notice that I am asking Elasticsearch to return the document version as well. This is so that I can check if a document has been updated with my request.

Deleting data

The last (for now) feature we need is the ability to remove the data from Elasticsearch when a node gets deleted. hook_node_delete() can help us with that:

 * Implements hook_node_delete().
function elastic_node_delete($node) {
  $client = elasticsearch_connector_get_client_by_id('my_cluster_id');

  // If the node is in Elasticsearch, remove it
  $result = _elastic_perform_node_search_by_id($client, $node);
  if ($result && $result['hits']['total'] !== 1) {
    drupal_set_message(t('There was a problem deleting this node in Elasticsearch.'));

  $params = array(
    'index' => 'node',
    'type' => $node->type,
    'id' => $result['hits']['hits'][0]['_id'],

  $result = $client->delete($params);
  if ($result && $result['found'] !== true) {
    drupal_set_message(t('There was a problem deleting this node in Elasticsearch.'));

  drupal_set_message(t('The node has been deleted in Elasticsearch.'));

Again, we search for the node in Elasticsearch and use the returned ID as a marker to delete the document.

Please keep in mind though that using early returns such as illustrated above is not ideal inside Drupal hook implementations unless this is more or less all the functionality that needs to go in them. I recommend splitting the logic into helper functions if you need to perform other unrelated tasks inside these hooks.

This is enough to get us started using Elasticsearch as a very simple data source on top of Drupal. With this basic code in place, you can navigate to your Drupal site and start creating some nodes, updating them and deleting them.

One way to check if Elasticsearch actually gets populated, is to disable the remote access restriction I mentioned above you need to enable. Make sure you only do this on your local, development, environment. This way, you can perform HTTP requests directly from the browser and get JSON data back from Elasticsearch.

You can do a quick search for all the nodes in Elasticsearch by navigating to this URL:


…where localhost points to your local server and 9200 is the default Elasticsearch port.

For article nodes only:


And for individual articles, by the auto generated Elasticsearch ids:


Go ahead and check out the Elasticsearch documentation for all the amazing ways you can interact with it.


We’ve seen in this article how we can start working to integrate Elasticsearch with Drupal. Obviously, there is far more we can do based on even the small things we’ve accomplished. We can extend the integration to other entities and even Drupal configuration if needed. In any case, we now have some Drupal data in Elasticsearch, ready to be used from an external application.

That external application will be the task for the second part of this tutorial. We’ll be setting up a small Silex app that, using the Elasticsearch PHP SDK, will read the Drupal data directly from Elasticsearch. As with part 1, above, we won’t be going through a step-by-step tutorial on accomplishing a given task, but instead will explore one of the ways that you can start building this integration. See you there.

Oct 13 2014
Oct 13

At the recent DrupalCon Amsterdam sprints something amazing happened, people from all corners of the globe assembled to sprint on DrupalCI. DrupalCI is an initiative born out of the requirement for new testbot infrastructure. Our goal is to implement a brand new Continuous Integration (CI) workflow that can not only be used for Drupal but anyone wishing to run a CI infrastructure / Automated tasks. Until this point we had only corresponded via a weekly hangout and IRC.

While this was keeping us on track with building out some of the components, the conference gave us an opportunity to sit down in the same room and perform an end-to-end architectural review to ensure we didn't have any gaps. A modular design approach has been used to ensure that many of the following components could be used as a standalone entity in any infrastructure.

The components

Drupal.org integration module

This component will be used to trigger job runs on the DrupalCI infrastructure, in response to new patches and/or commits occuring on drupal.org (or git.drupal.org).


At the top of the DrupalCI stack is a Silex application which acts as an interface into the infrastructure . The API is responsible for passing build requests into the dispatcher, and build outcomes back to the results site. Since this component is abstracted away from the dispatcher, we can swap out the backend with whatever new technology comes out in the coming years. Long term this component could also provide functionality such as "streaming build output" so you can see the console output in real time.


The dispatcher is in charge of handling the resources for each build. For this role we have chosen Jenkins. Jenkins is a well known CI project with may plugins available for easy extensibility.  One plugin of note is the AWS EC2 plugin, which we use to spin up AWS AMI images with a preloaded testbot runner, containers and the results site console application.

Job Runner

The Job Runner is the core of the build as it orchestrates what steps are performed.  The job runner comes in the form of a number of job-specific scripts, and a  Symfony Console application which serves as both the automated test runner and manual end-user interface for standalone local testing.  For standalone use, the job runner was also "in the wild" at DrupalCon Amsterdam sprints. Huge kudos to Ricardo Amaro with his work on this component!


One component which had not recieved much attention until the DrupalCon Amsterdam sprints is the results server, responsible for archiving and exposing test results and build artifacts for all DrupalCI jobs.  This server will be a Drupal 8 site with a single content type, allowing us to push artifacts and render them in interesting ways to the end user. Our base development server has been built out on Drupal.org infrastructure, and we are now ready to begin prototyping the actual build. We are still working on a backlog, but given it's simplicity it won't be hard to prototype out in the mean time.

Metrics (monitoring / logging)

Last but not least is the Metric component. We want to ensure our CI infrastructure is up and running, as well as pulling some interesting statistics (want to know how many builds etc we got at a Drupalcon?). We will use this component for both monitoring the availablity of the architecture and keeping logs for a set period of time.

Putting it all together

Call to arms

With all of these components identified, and the outstanding gaps now mapped, there is no better time to join in on all the fun. The goal is to have a maintainer for each of these components, that way we can distribute out the effort while also giving everyone a chance to "own" there portion of the infrastructure. With that in mind please don't hestitate to contact myself, Jeremy, or Ricardo if you want to get involved in this very interesting project.

Contact details

Weekly Hangout: http://bit.ly/1sBysAN

Timezone for the Weekly Hangout: http://www.timeanddate.com/worldclock/fixedtime.html?msg=Modernizing+Testbot+Hangout&iso=20141012T14&p1=210&ah=1

The following are our usernames IRC, we can be found in the channel #drupal-testing.

Jeremy Thorson - jthorson

Nick Schuch - nick_schuch

Ricardo Amaro - ricardoamaro

DrupalCI DrupalCon Amsterdam Silex jenkins Drupal 8
Feb 08 2014
Feb 08

A few days ago, while I was writing a bit of Silex code and grumbling at Doctrine DBAL's lack of support for a SQL Merge operation, I wondered if it wouldn't be possible to use DBTNG without the rest of Drupal.

Obviously, although DBTNG is described as having been designed for standalone use: DBTNG should be a stand-alone library with no external dependences other than PHP 5.2 and the PDO database library, in actual use, the Github DBTNG repo has seen no commit in the last 3 years, and the D8 version is still not a Drupal 8 "Component" (i.e. decoupled code), but still a plain library with Drupal dependencies. How would it fare on its own ? Let's give it a try...

Bring DBTNG to a non-Drupal Composer project

Since Composer does not support sparse checkouts (yet ?), the simplest way to do bring in DBTNG for this little test is to just import the code manually and declare it to the autoloader manually. Let's start by getting just DBTNG out of the latest Drupal 8 checkout:

# Create a fresh repository to hold the example
mkdir dbtng_example
cd dbtng_example
git init

# Declare a remote
git remote add drupal http://git.drupal.org/project/drupal.git

# Enable sparse checkouts to just checkout DBTNG
git config core.sparsecheckout true

# Declare just what we want to checkout:
# 1. The DBTNG classes
echo core/lib/Drupal/Core/Database >> .git/info/sparse-checkout
# 2. The procedural wrappers, for simplicity in an example
echo core/includes/database.inc >> .git/info/sparse-checkout

# And checkout DBTNG
git pull drupal 8.x

ls -l core/
total 8
drwxrwxr-x 2 marand www-data 4096 févr.  8 09:32 includes
drwxrwxr-x 3 marand www-data 4096 févr.  8 09:32 lib
That's it: DBTNG classes are now available in core/lib/Drupal/Core/Database. We can now build a Composer file with PSR-4 autoloading on that code:
  "autoload": {
    "psr-4": {
      "Drupal\\Core\\Database\\": "core/lib/Drupal/Core/Database/"
  "description": "Drupal Database API as a standalone component",
  "license": "GPL-2.0+",
We can now build the autoloader:
php composer.phar install

Build the demo app

For this example, we can use the traditional settings.php configuration for DBTNG, say we store it in app/config/settings.php and point to a typical Drupal 8 MySQL single-server database:

// app/config/settings.php$databases['default']['default'] = array (
'database' => 'somedrupal8db',
'username' => 'someuser',
'password' => 'somepass',
'prefix' => '',
'host' => 'localhost',
'port' => '',
'namespace' => 'Drupal\\Core\\Database\\Driver\\mysql',
'driver' => 'mysql',

At this point, our dependencies are ready, let's build some Hello, world in app/hellodbtng.php. Since this is just an example, we will just list a table using the DBTNG Select query builder:

// app/hellodbtng.php

// Bring in the Composer autoloader.

require_once __DIR__ . '/../vendor/autoload.php';// Bring in the DBTNG procerural wrappers.
// @see https://wiki.php.net/rfc/function_autoloading
require_once __DIR__ . '/../core/includes/database.inc';// Finally load DBTNG configuration.
require_once __DIR__ . '/config/settings.php';$columns = array(
$result = db_select('key_value', 'kv')
fields('kv', $columns)
condition('kv.collection', 'system.schema')
range(0, 10)

foreach (

$result as $v) {
$v = (array) $v;
$value = print_r(unserialize($v['value']), true);
printf("%-32s %-32s %s\n", $v['collection'], $v['name'], $value);

Enjoy the query results

php app/hellodbtng.php

system.schema                    block                            8000
system.schema                    breakpoint                       8000
system.schema                    ckeditor                         8000
system.schema                    color                            8000
system.schema                    comment                          8000
system.schema                    config                           8000
system.schema                    contact                          8000
system.schema                    contextual                       8000
system.schema                    custom_block                     8000
system.schema                    datetime                         8000

Going further

In real life:

  • a production project would hopefully not be built like this, by manually extracting files from a repo
  • ... and it would probably not use the procedural wrappers, but wrap DBTNG in a service and pass it configuration using a DIC
  • I seem to remember a discussion in which the full decoupling of DBTNG for D8 was considered but postponed as nice-to-have rather than essential for Drupal 8.0.
  • Which means that a simple integration would probably either
    • use the currently available (but obsolete) pre-7.0 version straight from Github (since that package is not available on Packagist, just declare it directly in composer.json as explained on http://www.craftitonline.com/2012/03/how-to-use-composer-without-packagi... ),
    • or (better) do the required work to decouple DBTNG from D8 core and submit a core patch for that decoupled version, and use it from the newly-independent DBTNG Component.

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web