Oct 24 2016


You know the old saying: "This is how the world ends: not with a bang, but with a misplaced DROP TABLE." Working directly with Drupal 7's database is an arduous task at best. It's a sprawling relational system, and it uses many space- and memory-saving tricks to be as speedy as possible. Thankfully, there is a robust system of functions built into Drupal to help you change almost any setting from code, which is perfect if you want to automate changes upstream and Features doesn't do it for you. Let's go over a situation in which you may have been utilizing some of these functions.

Let's say you finished your product (congratulations!), launched, and are on to fixing bugs and planning exciting new features for the future. You're knocking out bugs left and right like some high-flying Drupal ninja, and you discover that using a field collection with conditional fields causes the field collection data to not save, and all of your metadata is getting erased when certain conditions are fired. With Cthulhu's hot breath on your neck, you talk to the client and realize a ray of hope: you don't actually need a field collection there; a normal set of Drupal fields will do. How do we go about creating the new fields, copying existing data, and deleting the old fields?

The first thing we do is create the new fields and attach them. For this, we'll need two functions: 'field_create_field()' and 'field_create_instance()'. Both of these take an array of settings: field_name and type are what we need for creating the field (plus cardinality if you want the field to hold multiple values); field_name, entity_type, and bundle are required for creating the instance, though you will likely also want label, which otherwise defaults to the machine name. So, we should have something that looks like this:

$name = array(
  'field_name' => 'photographer_name',
  'type' => 'text',
);
field_create_field($name);

$instance = array(
  'field_name' => $name['field_name'],
  'entity_type' => 'node',
  'bundle' => 'article',
  'label' => 'Name',
);

field_create_instance($instance);

If you go check out node/add/article, you should see your new text field there. Congrats! Next, we need to get the data from the old fields and copy it into our new field. For this, we'll rely on the nifty function 'entity_load()'. This takes two arguments: an entity type and an array of IDs. Since we are loading field collection items, we know the entity type is 'field_collection_item'. We'll need the IDs, but we'll also need the field collection value that references the fields in each collection for later, so we'll get them both at once. It might be tempting to use 'entity_load()' to get them, but in this case you are quite safe using straight SQL, which also happens to be significantly faster. That looks like this:

$entity_ids = array();
$field_collection_ids = array();
// Select the field collection ID and the attached entity ID from the database.
$query = db_query('SELECT field_scald_producer_value, entity_id FROM {field_data_field_scald_producer}');
$results = $query->fetchAll();
// Separate the IDs.
foreach ($results as $result) {
  $field_collection_ids[] = $result->field_scald_producer_value;
  // Key the entity IDs by the field collection value for easy lookup later.
  $entity_ids[$result->field_scald_producer_value] = $result->entity_id;
}
// It's possible to get duplicate field collection IDs, so make sure they are all unique.
$field_collection_ids = array_unique($field_collection_ids);
// Load all of the field collection entities.
$field_collection_results = entity_load('field_collection_item', $field_collection_ids);

Now that we have all of the entity IDs and field collection IDs, we can get to the fun part: copying data! (You know you have been doing this too long when that is exciting.) What we want to do is loop through the field collection IDs, load the entity (that has the new field on it) by the ID associated with the collection, copy the data from the collection to the new field, and save. It seems like a lot, but it's fairly simple:

foreach ($field_collection_ids as $field_collection_id) {
  $entity_id = $entity_ids[$field_collection_id];
  // Load the entity the field collection is attached to.
  $entities = entity_load('node', array($entity_id));
  // Copy the data from the collection field to the new field.
  $entities[$entity_id]->photographer_name['und'][0]['value'] =
    $field_collection_results[$field_collection_id]->field_producer_name['und'][0]['value'];
  // Save!
  entity_save('node', $entities[$entity_id]);
}

A word of warning: depending on how many entities you are processing, this could take a long time. As of Drupal 7.34, there is a memory leak in entity_save(), which means that each save will take slightly longer than the last. This is not a problem if you have only a few hundred entities, but when you get up into five and six digits, this script will take many hours. At that point, unless you have the time (and/or can run the script as a background process), you might want to consider investigating other options.
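
One mitigation, sketched here as my own suggestion rather than something from the original post, is to process the nodes in batches and reset Drupal's static entity caches between batches so memory growth stays bounded:

$batch_size = 200;
foreach (array_chunk($field_collection_ids, $batch_size) as $batch) {
  foreach ($batch as $field_collection_id) {
    // ... the same copy-and-save logic as above ...
  }
  // Drop the statically cached entities before the next batch.
  entity_get_controller('node')->resetCache();
  entity_get_controller('field_collection_item')->resetCache();
}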

Okay, so the data is copied, the nodes are saved, and the elder gods have hit the snooze button. The last thing to do is delete the old field. We're not going to do that, at least not yet. Instead, we're going to delete the instances of the fields. This preserves the old field collection data but removes the fields from the edit forms. That way, if something goes wrong, you don't lose the data in the old fields and can try again if needed. You can go back later, after you have confirmed that everything is correct, and delete the fields themselves. Luckily, this is the easy part:

$instance = array(
  'field_name' => 'field_scald_producer',
  'entity_type' => 'node',
  'bundle' => 'article',
);
field_delete_instance($instance);

And that’s it, crisis averted!  You no longer lose data and no longer have to worry about supernatural madness and death!  All you need to do now is run your script upstream with 'drush php-script' and watch the magic.

This sort of scripting can be daunting at first glance, but Drupal's rich entity API can keep you from pulling out your hair or inadvertently causing an otherworldly alien intelligence to rise from the deep. There are many more functions to take advantage of, and just about anything you can set with a click in the interface you can set in code, which is perfect for automation or locked-down production environments.

Happy Drupaling!

Jun 17 2016
This post covers the next steps in getting my module working in Drupal 8. In the previous post, I got the block setup working, but it's just a generic "Hello World!" output. The block gets the most recent entries from a database table, so my next step is getting the database set up.

Databases

This doesn't seem to have changed from Drupal 7 to 8: you basically copy over the .install file and hook_schema() takes care of it. BTW, one thing that is different is that if you need to uninstall/reinstall the module, you just need drush pm-uninstall MODULE vs. drush dis MODULE ; drush pm-uninstall MODULE.

Once I had the database tables in place, I dumped my database tables from D7 and imported them into D8. NOTE: I probably need to either codify this (perhaps there's a way to hook into the migration system?) or keep notes about what tables to dump/import once everything's in place.

Now I needed to update my block to query the most recent entries in the table. db_query() is still supported in D8; however, it looks like it's going away for D9, so if I wanted to future-proof my module even more, I could convert it to the new injection technique. So I created a storage class in my src folder. I downloaded the Examples module and looked at the dbtng_example submodule and its DbtngExampleStorage class. I also found this answer on StackOverflow and the example posted in the select doc helpful. I ended up with a hybrid approach:
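
(The original snippet didn't survive the trip to this page, so here is a minimal sketch of what such a storage class could look like, based on the description below; the table and column names and the default row count are my assumptions.)

<?php

namespace Drupal\gated_content;

use Drupal\Core\Database\Connection;

/**
 * Storage sketch for the gated content reports.
 */
class GCStorage {

  protected $connection;

  public function __construct(Connection $connection = NULL) {
    // Fall back to the default connection so "new GCStorage()" works too.
    $this->connection = $connection ?: \Drupal::database();
  }

  public function get_reports(array $configuration) {
    // post_date is a legacy datetime column, so convert the current
    // timestamp into the format MySQL expects before comparing.
    $now = \Drupal::service('date.formatter')
      ->format(REQUEST_TIME, 'custom', 'Y-m-d H:i:s');

    $count = isset($configuration['count']) ? $configuration['count'] : 5;
    return $this->connection->select('gated_content', 'gc')
      ->fields('gc', array('id', 'title'))
      ->condition('gc.post_date', $now, '<=')
      ->orderBy('gc.post_date', 'DESC')
      ->range(0, $count)
      ->execute()
      ->fetchAllAssoc('id', \PDO::FETCH_ASSOC);
  }

}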

So a few things there. First, my post_date field was a legacy datetime field, so I had to convert the current time from a timestamp to the date format that MySQL expected. format_date() is deprecated, so they recommend using \Drupal::service('date.formatter'). Also, I made my __construct() method a little dynamic so that I don't have to pass in a connection, but it can support one if it is passed in.

Then my block's build() method uses my storage class and calls the get_reports() method, passing in the block's configuration array. First, I add use Drupal\gated_content\GCStorage; to the top of my block plugin. Then my build() looks like this:

  public function build() {
    $storage = new GCStorage();
    $reports = $storage->get_reports($this->configuration);
    $build = array();
    // Debug output via devel's kint() while I develop.
    kint($reports, $build);
    return $build;
  }

Block View

So now that my block has the right set of reports to show, I need to show them. This takes me to the whole theming and templates stuff. The D7 version called theme() directly, there was a hook_theme, and I had a template. The end result was a simple <ul> and, for each report, an <li> with an l() call to link to the report. Kind of reminds me of item_list a bit, so I think I'll start there.

Ok, it looks like item_list works about the same: my render array just needs '#theme' => 'item_list'. But l() is gone. This comment pointed me in the right direction. So now it looks like this:

    $items = array();
    foreach ($reports as $report) {
      // Needs "use Drupal\Core\Url;" at the top of the file.
      $url = Url::fromRoute('<front>');
      $items[] = \Drupal::l(t($report['title']), $url);
    }
    $build = array(
      'report_list' => array(
        '#theme' => 'item_list',
        '#items' => $items,
        '#attributes' => array(
          'class' => 'report-block',
        ),
      ),
    );

Success!

Of course, these all link to just the homepage, which isn't very helpful.

Menu

So now I'm looking down another rabbit hole. The URL to my report is reports/[ID#]. But I use this module on separate domains, so the reports part is configurable, and I need that configuration setting to get the first bit of the URL. Actually, I think I can skip that part of the module, since we'll be doing something different in our migration plans. [postnote: if I do end up needing this, it looks like this shows how]

Also, I had this module all set up with pathauto to get nice SEO-friendly public URLs. Ok, so I need to set up the menu routing so the user can get to /gated_content and /gated_content/[ID#] and /gated_content/[ID#]/download. And I need to set up pathauto again.

First, all of those URLs were set up in D7 with a type of MENU_CALLBACK, so according to this doc, I need to set them up in a .routing.yml file. This wasn't something DMU set up for me, so I did it from scratch. This was a great doc that showed some examples going from D7 to D8. A first pass might look like the sketch below.
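
(The route names, controller class, and permission here are my guesses, not part of the original module.)

gated_content.list:
  path: '/gated_content'
  defaults:
    _controller: '\Drupal\gated_content\Controller\GCController::listReports'
    _title: 'Reports'
  requirements:
    _permission: 'access content'

gated_content.view:
  path: '/gated_content/{report}'
  defaults:
    _controller: '\Drupal\gated_content\Controller\GCController::view'
  requirements:
    _permission: 'access content'

gated_content.download:
  path: '/gated_content/{report}/download'
  defaults:
    _controller: '\Drupal\gated_content\Controller\GCController::download'
  requirements:
    _permission: 'access content'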

My first question: I was familiar with named placeholders in D7 and how the name would be used to call a NAME_load() function to put together the argument that gets passed into the callback. But these examples use book, which relies on node and entity, so I need something that will work with my own data type.

Stuff for the next post!

Jun 14 2016

Google Summer of Code (GSoC '16) is entering its midterm evaluation stage. I have been working on porting the Search Configuration module for Drupal for the past three weeks.

The Search Configuration module helps configure the search functionality in Drupal. This is a really important feature for a content management system like Drupal. I am almost midway through the port process, as indicated in the timeline of Google Summer of Code.

It has been a great experience learning Drupal concepts this summer, and I would like to take this opportunity to share some key aspects I dealt with in the past week.

Once a module is installed and some changes are later needed, we need not rewrite the Drupal code. Instead, Drupal gives us the option of a hook, hook_update_N(), which lets us write the update needed; the database schema then gets updated accordingly. Currently, since my module is just going into its first release after this port, I need not write an update function; I just have to make the change in the existing code. The same hook works for both Drupal 7 and 8.
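
Since I haven't needed one yet, here is only an illustrative sketch of what a D8 hook_update_N() could look like (the module, table and column names are hypothetical):

/**
 * Add a weight column to the example data table.
 */
function search_config_update_8001() {
  $spec = array(
    'type' => 'int',
    'not null' => TRUE,
    'default' => 0,
    'description' => 'Weight used for ordering.',
  );
  \Drupal::database()->schema()->addField('search_config_data', 'weight', $spec);
}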

Another feature is hook_node_insert(), which is invoked when a new node is inserted into the database; here we are writing a Drupal record into the database. In Drupal 7, this insertion was done with drupal_write_record(), but in D8 it has been replaced by the merge query and the entity API. Merge queries support both insert and update operations on the database.
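
As a minimal sketch (the table and column names are hypothetical):

// e.g. inside hook_node_insert($node): insert the row if the key
// doesn't exist yet, otherwise update it.
\Drupal::database()->merge('search_config_data')
  ->key(array('nid' => $node->id()))
  ->fields(array('status' => 1))
  ->execute();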

In connection with nodes, another hook is hook_node_update(), which is invoked when node contents already inserted into the Drupal record (database) are updated. It takes one argument: obviously, the node being updated has to be passed in.

hook_uninstall() gets invoked, as the name suggests, during the uninstall process of a module. The hook removes the variables used by the module so as to free the memory, and it also modifies existing tables if required. Drupal 7 used variable_del() for removing the variables.

For instance:

variable_del($nameOfVariable);
// Drupal 7 code.

This has been replaced by delete() in the configuration API, i.e.:

\Drupal::service('config.factory')->getEditable('search_config.settings')->delete();

search_config.settings is the default configuration file.

I will post updates on this port process regularly. Stay tuned for future posts.

Mar 09 2016

Migrate is one of the most established modules in the Drupal ecosystem. So much so that with Drupal 8, a decision was made to port some of its functionality into Drupal core. An important reason was that the traditional upgrade between major releases was replaced with a migration of Drupal 6 or 7 content and configuration to Drupal 8.


Not all the functionality of the Migrate module has been moved into core though. Rather, we have the basic framework within the core migrate module and the functionality serving the upgrade path from Drupal 6 and 7 within the core migrate_drupal module. What’s left can be found in a host of contrib modules. The most important is Migrate Tools which contains, among other things, drush commands and the UI for running migrations. Additionally, the Migrate Source CSV, Migrate Source XML and Migrate Source JSON modules provide plugins for the most used types of migration sources.

In this article we are going to look at how migration works in Drupal 8 by migrating some content into node entities. For simplicity, the data we play with resides in tables in the same database as our Drupal installation. If you want to see the final code, you can check it out in this repository.

Drupal 8 Migration Theory

The structure of a migration in Drupal 8 is made up of three main parts: the source, the process and the destination, all three being built using the new Drupal 8 plugin system. These three parts form a pipeline. The source plugin represents the raw data we are migrating and is responsible for delivering individual rows of it. This data is passed to one or more process plugins that perform any needed manipulation on each row field. Finally, once the process plugins finish preparing the data, the destination plugins save it into Drupal entities (either content or configuration).

Creating a migration involves using such plugins (by either extending or directly using core classes). All of these are then brought together into a special migration configuration entity. This is shipped as module config that gets imported when the module is first enabled, or it can be constructed using a template (a case we won't be exploring today). The migration configuration entity provides references to the main plugins used, plus additional metadata about the migration.

Movie Migration

Let us now get down to it and write a couple of migrations. Our data model is the following: we have two tables in our database, movies and movies_genres. The former has an ID, name and description, while the latter has an ID, name and movie ID that maps to a movie from the first table. For the sake of simplicity, this mapping is one-to-one, to avoid the complication of a third table. Here is the MySQL script you can use to create these tables and populate them with a few test records (if you want to follow along):

CREATE TABLE `movies` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `description` text,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;

CREATE TABLE `movies_genres` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  `movie_id` int(11) DEFAULT NULL,
  `name` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=5 DEFAULT CHARSET=utf8;

INSERT INTO `movies` (`id`, `name`, `description`)
VALUES
        (1, 'Big Lebowsky', 'My favorite movie, hands down.'),
        (2, 'Pulp fiction', 'Or this is my favorite movie?');

INSERT INTO `movies_genres` (`id`, `movie_id`, `name`)
VALUES
        (1, 1, 'Comedy'),
        (2, 1, 'Noir'),
        (3, 2, 'Crime');
        

What we want to achieve is migrate the movies into basic Drupal Article nodes and the genres into taxonomy terms of the Tags vocabulary (which the Article nodes reference via a field). Naturally, we also want the migration to mirror the association between the movies and the genres.

Genres

Let us first take care of the genre migration because in Drupal, the movies will depend on them (they will reference them).

Inside the config/install folder of our module, we need the following configuration entity file called migrate.migration.genres.yml:

id: genres
label: Genres
migration_group: demo
source:
  plugin: genres
  key: default
destination:
  plugin: entity:taxonomy_term
process:
  vid:
    plugin: default_value
    default_value: tags
  name: name

The first three elements of this configuration are the migration ID, label and group it belongs to. The following three keys are responsible for the three main parts of the migration pipeline mentioned earlier. Let’s talk about the latter three separately.

Source

Under the source key we specify which plugin the migration should use to represent and deliver the source data (in our case the plugin with the ID of genres). The key key is used to specify which database our source query should use (that is where our data is).

So in the Plugin/migrate/source folder of our module, let’s create our SQL based source plugin for our genres:

Genres.php

namespace Drupal\demo\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SqlBase;

/**
 * Source plugin for the genres.
 *
 * @MigrateSource(
 *   id = "genres"
 * )
 */
class Genres extends SqlBase {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = $this->select('movies_genres', 'g')
      ->fields('g', ['id', 'movie_id', 'name']);
    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    $fields = [
      'id' => $this->t('Genre ID'),
      'movie_id' => $this->t('Movie ID'),
      'name' => $this->t('Genre name'),
    ];

    return $fields;
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    return [
      'id' => [
        'type' => 'integer',
        'alias' => 'g',
      ],
    ];
  }
}

Since we are using a SQL source, we need to have our own source plugin class to provide some information as to how the data needs to be read. It’s not enough to use an existing one from core. The query() method creates the query for the genre data, the fields() method defines each individual row field and the getIds() method specifies the source row field that acts as the unique ID. Nothing complicated happening here.

We are extending from SqlBase which, among other things, looks for the plugin configuration element named key to learn which database it should run the query on. In our case, the default one, as we detailed in the migration configuration entity.

And this is all we need for our simple source plugin.

Destination

For the migration destination, we use the default core taxonomy term destination plugin which handles everything for us. No need to specify anything more.

Process

Under the process key of the migration, we list each destination field we want populated and one or more process plugins that transform the source row fields into data for the destination fields. Since we want the genres to be all terms of the Tags vocabulary, for the vid field we use the default_value plugin which accepts a default_value key that indicates the value each record will have. Since all will be in the same vocabulary, this works well for us.

Lastly, for the term name field we can simply specify the source row field name without an explicit plugin name. This will, under the hood, use the get plugin that simply takes the data from the source and copies it over unaltered to the destination.
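
Spelled out with the plugin named explicitly, that shorthand is equivalent to:

name:
  plugin: get
  source: name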

For more information on how you can chain multiple process plugins in the pipeline or what other such plugins you have available from core, I recommend you check out the documentation.

Movies

Now that our genres are importable, let’s take a look at the movies migration configuration that resides in the same folder as the previous (config/install):

migrate.migration.movies.yml

id: movies
label: Movies
migration_group: demo
source:
  plugin: movies
  key: default
destination:
  plugin: entity:node
process:
  type:
    plugin: default_value
    default_value: article
  title: name
  body: description
  field_tags:
    plugin: migration
    migration: genres
    source: genres
migration_dependencies:
  required:
    - genres

We notice the same metadata as before, the three main parts of the migration (source, process and destination) but also the explicit dependency which needs to be met before this migration can be successfully run.

Source

Like before, let's take a look at the movies source plugin, located in the same place as the genres source plugin (Plugin/migrate/source):

Movies.php:

namespace Drupal\demo\Plugin\migrate\source;

use Drupal\migrate\Plugin\migrate\source\SqlBase;
use Drupal\migrate\Row;

/**
 * Source plugin for the movies.
 *
 * @MigrateSource(
 *   id = "movies"
 * )
 */
class Movies extends SqlBase {

  /**
   * {@inheritdoc}
   */
  public function query() {
    $query = $this->select('movies', 'd')
      ->fields('d', ['id', 'name', 'description']);
    return $query;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {
    $fields = [
      'id' => $this->t('Movie ID'),
      'name' => $this->t('Movie Name'),
      'description' => $this->t('Movie Description'),
      'genres' => $this->t('Movie Genres'),
    ];

    return $fields;
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    return [
      'id' => [
        'type' => 'integer',
        'alias' => 'd',
      ],
    ];
  }

  /**
   * {@inheritdoc}
   */
  public function prepareRow(Row $row) {
    $genres = $this->select('movies_genres', 'g')
      ->fields('g', ['id'])
      ->condition('movie_id', $row->getSourceProperty('id'))
      ->execute()
      ->fetchCol();
    $row->setSourceProperty('genres', $genres);
    return parent::prepareRow($row);
  }
}

We have the same three required methods as before, doing the same things: querying for and defining the data. However, here we also use the prepareRow() method in order to alter the row data and available fields. The purpose is to select the IDs of the movie genres that match the current row (movie). Those values are populated into a new source field called genres, and we will see in a minute how that field is used to save the Tags taxonomy term reference.

Destination

In this case, we use the node entity destination plugin and we need nothing more.

Process

There are four fields on the Article node we want populated with movie data. First, for the node type we use the same technique as before for the taxonomy vocabulary and set article to be the default value. Second and third, for the title and body fields we map the movie name and description source fields unaltered.

Lastly, for the tags field we use the migration process plugin that allows us to translate the ID of the genre (that we added earlier to the genres source row field) into the ID of its corresponding taxonomy term. This plugin does this for us by checking the migration mapping of the genres and reading these IDs. And this is why the genres migration is also marked as a dependency for the movies import.

Activating and Running the Migration

Now that we have our two migration configuration entities and all the relevant plugins, it’s time to enable our module for the first time and have the configuration imported by Drupal. If your module was already enabled, uninstall it and then enable it again. This will make the config import happen.

Additionally, in order to run the migrations via Drush (which is the recommended way of doing it), install the Migrate Tools module. Then all that’s left to do is to use the commands to migrate or rollback the movies and genres.

To see the available migrations and their status:

drush migrate-status

To import all migrations:

drush migrate-import --all

To roll all migrations back:

drush migrate-rollback --all

Conclusion

And there we have it – a simple migration to illustrate how we can now import, track and roll back migrations in Drupal 8. We’ve seen how the plugin system is used to represent all these different components of functionality, and how the migration definition has been turned into configuration that brings these elements together.

There is much more to learn, though. You can use different source plugins, such as for data in CSV or JSON, complex process plugins (or write your own), or even custom destination plugins for whatever data structure you may have. Good luck!

Daniel Sipos

Meet the author

Daniel Sipos is a Drupal developer who lives in Brussels, Belgium. He works professionally with Drupal but likes to use other PHP frameworks and technologies as well. He runs webomelette.com, a Drupal blog where he writes articles and tutorials about Drupal development, theming and site building.
Apr 03 2015

Fields have been part of Drupal core since version 7. They extend Drupal's ability to build different kinds of systems, and since they are the basic units of each entity, they are one of the most important parts of the software. But when it comes to the efficiency of the field SQL storage engine, there is still room for improvement, and I sincerely believe we cannot afford to ignore it. Let's put field SQL storage under a microscope and have a close look.

Case study:

I built a patient scheduling system for a couple of clinic offices. The project itself is not complicated; I have attached the patient profile form to this article. We built a patient profile node type. It is not a complicated form, but there are over 40 fields, and it is not difficult to set up a nice patient profile node form. I also created an appointment node type that connects the patient profile and doctor profile with entity reference fields, and used Views with exposed filters for the various reports.

It was on this project that I found the issue. I became a little uncomfortable after taking a close look at the database: each field has two almost identical tables, and I think fields take up too much unnecessary database space. I have dumped one field's database information to explain my concern.

1) Base table: field_data_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | YES  | MUL | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Base table SQL script:

CREATE TABLE `field_data_field_initial` (
`entity_type` varchar(128) NOT NULL DEFAULT '',
`bundle` varchar(128) NOT NULL DEFAULT '',
`deleted` tinyint(4) NOT NULL DEFAULT '0',
`entity_id` int(10) unsigned NOT NULL,
`revision_id` int(10) unsigned DEFAULT NULL,
`language` varchar(32) NOT NULL DEFAULT '',
`delta` int(10) unsigned NOT NULL,
`field_initial_value` varchar(255) DEFAULT NULL,
`field_initial_format` varchar(255) DEFAULT NULL,
PRIMARY KEY (`entity_type`,`entity_id`,`deleted`,`delta`,`language`),
KEY `entity_type` (`entity_type`),
KEY `bundle` (`bundle`),
KEY `deleted` (`deleted`),
KEY `entity_id` (`entity_id`),
KEY `revision_id` (`revision_id`),
KEY `language` (`language`),
KEY `field_initial_format` (`field_initial_format`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

2) Revision table: field_revision_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | NO   | PRI | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Revision table SQL script:

CREATE TABLE `field_revision_field_initial` (
  `entity_type` varchar(128) NOT NULL DEFAULT '',
  `bundle` varchar(128) NOT NULL DEFAULT '',
  `deleted` tinyint(4) NOT NULL DEFAULT '0',
  `entity_id` int(10) unsigned NOT NULL,
  `revision_id` int(10) unsigned NOT NULL,
  `language` varchar(32) NOT NULL DEFAULT '',
  `delta` int(10) unsigned NOT NULL,
  `field_initial_value` varchar(255) DEFAULT NULL,
  `field_initial_format` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`entity_type`,`entity_id`,`revision_id`,`deleted`,`delta`,`language`),
  KEY `entity_type` (`entity_type`),
  KEY `bundle` (`bundle`),
  KEY `deleted` (`deleted`),
  KEY `entity_id` (`entity_id`),
  KEY `revision_id` (`revision_id`),
  KEY `language` (`language`),
  KEY `field_initial_format` (`field_initial_format`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Here are my concerns.

1) Normalization.

Here is one of the field's data records.

+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| entity_type | bundle          | deleted | entity_id | revision_id | language | delta | field_initial_value | field_initial_format |
+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| node        | patient_profile |       0 |      1497 |        1497 | und      |     0 | w                   | plain_text           |
+-------------+-----------------+---------+-----------+-------------+----------+-------+---------------------+----------------------+

We have value "W" in the Initial field. One character took 51 bytes for storage that had not included index yet. It took another 51 byte in the revision table and more for index. In this case here, only less than two percents of space are used for real data the initial 'W', and over 98% of space is for other purposes.

For the sake of space, I think we should not use varchar for the entity_type, bundle, language and field_format columns. Use SMALLINT, TINYINT or INT [1], which take only one to four bytes. The field is the basic unit of a Drupal website, and a medium-sized website can hold millions of fields. One byte saved is equal to multiple megabytes in a precious MySQL database.
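
To make the idea concrete, here is a sketch of a lookup table (hypothetical; not something Drupal core provides):

-- Store each entity type once and reference it by a one-byte TINYINT
-- instead of repeating a varchar value on every field row.
CREATE TABLE `entity_type_lookup` (
  `id` tinyint(3) unsigned NOT NULL AUTO_INCREMENT,
  `name` varchar(128) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;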

2) Too complicated primary key

Each field table has a complicated primary key. The base table uses `entity_type`, `entity_id`, `deleted`, `delta` and `language` as its primary key; the revision table uses `entity_type`, `entity_id`, `revision_id`, `deleted`, `delta` and `language`. "In InnoDB, having a long PRIMARY KEY wastes a lot of disk space because its value must be stored with every secondary index record." [2] It may be worthwhile to add an auto-incrementing INT as the primary key instead.

3) Not needed field column

I found that the bundle column is not necessary; the system can run well without it. In my clinic project, I named the node type "patient profile". The machine name patient_profile appears in each field record's bundle column, and as a varchar(255) value it uses 16 bytes per record. Let's do a quick calculation: if there are 100,000 nodes and each node has 40 fields, then 100,000 x 40 x 2 x 16 = 122MB is taken up by this column alone. At the very least, we could use a 2-byte SMALLINT, which would take only one-eighth of the space.

4) Just use revision table.

Remove one of the field's two data tables. It may take a little more query power to get field data back out, but it saves time when we insert, update and delete field data. By doing so, we maintain one less table per field and edit content faster, which helps bring a better editor experience and saves database storage space.

The contributed module field_sql_lean [3] addresses some of the concerns here. It still needs a lot of work, both on the module itself and on making other contributed modules compatible with it, since it changes the field table structure.

Reference:

1: http://dev.mysql.com/doc/refman/5.1/en/integer-types.html
2: http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html
3: Field SQL storage lean solution
4: Patient profile form: medical form

Jan 16 2013

Today we've got two FREE videos for your viewing enjoyment: one about a killer community tool, and the other about a handy module to help you out when working with Drupal databases. In Using IRC (Internet Relay Chat) we look at one of the most overlooked but most useful community tools available, IRC. IRC is a way to chat online, in real time, with other people, and it is used heavily by the Drupal community. We'll sort out what IRC is, how to get connected, and walk you through some of the basics to get you started.

Our second lesson, Using Schema Module, introduces you to a great way to get information about your database, and a nice, quick way to get Drupal to write some code for you, when it comes to creating your module's install file.

We hope you enjoy these videos, and we'll have some cool new stuff next week, walking you through working with lightboxes in Drupal.

Nov 28 2012

This video looks at the basics of working with MySQL from the command line. We get into the mysql environment and look at databases, tables and fields. We cover creating and deleting databases, creating a user, and querying within a particular database.
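
For reference, a few of the basics covered look like this at the mysql prompt (the database and user names here are just examples):

CREATE DATABASE drupal_demo;
CREATE USER 'drupal'@'localhost' IDENTIFIED BY 'secret';
GRANT ALL PRIVILEGES ON drupal_demo.* TO 'drupal'@'localhost';
USE drupal_demo;
SHOW TABLES;
DROP DATABASE drupal_demo;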

Mar 14 2012

To get things started in this series we'll install Drupal 7. We start at Drupal.org to download the code, review some important points to watch for when copying the files, then show how to set up a database using phpMyAdmin before we dive into the Drupal installer.

Jan 10 2011

Welcome to the 5th installment of the Drupal 7 Line by Line series of articles.

In the first four installments I covered index.php, the basics of the drupal_bootstrap() function and the first two phases of the Drupal bootstrap process: DRUPAL_BOOTSTRAP_CONFIGURATION and DRUPAL_BOOTSTRAP_PAGE_CACHE. (There are links to those articles found at the bottom of this article if you want to catch up.)

In the previous article about the page cache bootstrap process I noted that unless your site is configured in a specific way, both the DRUPAL_BOOTSTRAP_DATABASE (phase 3) and DRUPAL_BOOTSTRAP_VARIABLES (phase 4) bootstrapping phases will be completed before the page cache (phase 2) bootstrap process can itself finish. As a result, I am technically covering the database bootstrapping process out of order and am not literally following Drupal's line-by-line code execution. I hope you can forgive me.

_drupal_bootstrap_database()

The third Drupal bootstrap phase begins with a call to the _drupal_bootstrap_database() function.

The first few lines of this function can put to rest any argument that Drupal is an application development framework (like Zend Framework, CakePHP, or Symfony). In its current state it simply isn't. Here, on virtually every page load, Drupal checks to see if it is in an installation state.

<?php
// Redirect the user to the installation script if Drupal has not been
// installed yet (i.e., if no $databases array has been defined in the
// settings.php file) and we are not already installing.
if (empty($GLOBALS['databases']) && !drupal_installation_attempted()) {
  include_once DRUPAL_ROOT . '/includes/install.inc';
  install_goto('install.php');
}
?>

If Drupal decides that it is in an installation state, it loads install.inc and redirects the request to install.php. With a specific installer baked into its core, Drupal has no separation between the framework its functionality is built on and the product that exposes that functionality. Please don't misunderstand my point here: Drupal is a very flexible, extensible and developer-friendly CMS product. It is very much like a framework. It looks like a duck and quacks like a duck, but it might be a swan or some other mixed metaphor.

But, I digress.

Drupal Testing System

Drupal 7 comes with a testing suite that allows developers to write functional tests and run them to make sure that whatever they are working on doesn't break the rest of their Drupal site. Part of that testing system is a fake 'user agent' / fake browser that makes page requests and makes the results available for testing in an automated way.

Here Drupal checks to see if the request is being made by this testing user agent with a call to drupal_valid_test_ua(). If it is a request from the testing user agent, Drupal overrides the database settings defined in settings.php. The reason for this is that if you are running tests (which require reading and writing bogus test data to the database), you don't want to impact any real data on your site. So, if you are running tests, they are run on a different set of tables.

The Database System

Finally, Drupal includes /includes/database/database.inc.

<?php
// Initialize the database system. Note that the connection
// won't be initialized until it is actually requested.
require_once DRUPAL_ROOT . '/includes/database/database.inc';
?>

This file contains the core of the new Drupal 7 database abstraction layer.

Here is a part of the code comments from database.inc:

<?php
/* Drupal provides a database abstraction layer to provide developers with
 * the ability to support multiple database servers easily. The intent of
 * this layer is to preserve the syntax and power of SQL as much as possible,
 * but also allow developers a way to leverage more complex functionality in
 * a unified way. It also provides a structured interface for dynamically
 * constructing queries when appropriate, and enforcing security checks and
 * similar good practices.
 *
 * The system is built atop PHP's PDO (PHP Data Objects) database API and
 * inherits much of its syntax and semantics.
 */
?>

The workings of the new database abstraction layer are enough material for several articles. You may want to do some reading on your own. I would suggest:

  • The main API documentation: http://api.drupal.org/api/group/database/7
  • Developing with the database API: http://drupal.org/developing/api/database

Autoloaders

Last but not least, Drupal registers two autoload functions. PHP provides a class autoloading mechanism: if it encounters the instantiation of a previously undefined class, or the implementation of a previously undefined interface, it can be told where to look in the file system for the file that contains the class or interface definition.

<?php
// Register autoload functions so that we can access classes and interfaces.
// The database autoload routine comes first so that we can load the database
// system without hitting the database. That is especially important during
// the install or upgrade process.
spl_autoload_register('drupal_autoload_class');
spl_autoload_register('drupal_autoload_interface');
?>

These two lines tell PHP to use the drupal_autoload_class() and drupal_autoload_interface() functions for loading undefined classes and interfaces. Both of these functions are actually wrappers for the _registry_check_code() function.

The name of the function implies that Drupal now has a code registry, and indeed it does. What that means is that modules now have a way to tell Drupal where to find certain include files so that, when they are needed, they can be 'lazy loaded'. In D7 a module feeds the registry through the files[] entries in its .info file, as sketched below. To find out more I suggest reading: Drupal's code registry.
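
A hypothetical .info file (the module name and file paths are made up for illustration):

name = Example module
description = Demonstrates registry-based lazy loading.
core = 7.x
; Classes in these files are indexed by the code registry and lazy loaded.
files[] = includes/example.controller.inc
files[] = tests/example.test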

Summary

DRUPAL_BOOTSTRAP_DATABASE is the third bootstrap phase. It checks to see if the site is currently in an installation state (rarely) and whether it is in an automated testing state (rarely). Most of the time it simply does two things: load database.inc and register the autoload functions.

As always feel free to leave comments, corrections or questions.


Dec 23 2010

Drush + Bash + Cron: Database Backup Goals

  • Scan sites directory for a given drupal install
  • Find all multisite folders/symlinks
  • For each multisite:
  • Use Drush to clear cache - we don't want cache tables bloating up the MySQL dump file
  • Use Drush to delete watchdog logs - we don't want the watchdog table bloating up the MySQL dump file
  • Use Drush to back up the database to a pre-assigned folder
  • Use tar to compress and timestamp the Drush-generated SQL file
  • Set up crontab to run periodically with the above commands as a bash file

Assumptions and Instructions

You will need to adjust the Bash file if any of these are not the same on your server:

  • Drupal is installed in /var/www/html/drupal
  • Multisites are setup in the /var/www/html/drupal/sites folder
  • Backup folder exists in /var/www/backup/sqldumps
  • Drush is already installed in /root/drush/drush. If drush is not installed follow this Drush installation guide
  • AWK is already installed, if not, type: sudo yum install gawk

Drush Backup BASH file

Copy and paste the code below into a new bash file, ideally in your /root home folder, and make the bash file executable.
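
For example, assuming the filename used in the crontab section below:

chmod +x /root/drush_backup.sh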

#!/bin/bash
#
 
# Adjust to match your system settings
DRUSH=/root/drush/drush
ECHO=/bin/echo
FIND=/usr/bin/find
AWK=/bin/awk
 
# Adjust to match your system settings
docroot=/var/www/html/drupal
backup_dir=/var/www/backup/sqldumps
 
multisites=$1
 
START_TIME=$(date +%Y%m%d%H%M);
 
# Add all multisites for a given docroot into a list. Detects all web addresses which are a directory which isn't named all, default or ends in .local.
if [ "${multisites}" = "all" ];then
        # If multisites are folders change -type d
        # If multisites are symlinks change -type l
        # Adjust $8 to match your docroot depth; it needs to be the name of the multisite folder/symlink
        multisites_list="`$FIND ${docroot}/sites/* -type l -prune | $AWK -F \/ '$8!="all" && $8!="default" && $8!~/\.local$/ { print $8 }'`"
else
        multisites_list=$multisites
fi
 
 
# Must be in the docroot directory before proceeding.
cd $docroot
 
for multisite in $multisites_list
do
        # Echo to the screen the current task.
        $ECHO
        $ECHO "##############################################################"
        $ECHO "Backing up ${multisite}"
        $ECHO
        $ECHO
 
        # Clear Drupal cache
        $DRUSH -y -u 1 -l "${multisite}" cc all
 
        # Truncate Watchdog
        $DRUSH -y -u 1 -l "${multisite}" wd-del all
 
        # SQL Dump DB
        $DRUSH -u 1 -l "${multisite}" sql-dump --result-file="${backup_dir}"/"${multisite}".sql
 
        # Compress the SQL Dump
        tar -czv -f "${backup_dir}"/"${START_TIME}"-"${multisite}".tar.gz -C "${backup_dir}"/ "${multisite}".sql
 
        # Delete original SQL Dump
        rm -f "${backup_dir}"/"${multisite}".sql
 
        $ECHO
        $ECHO
        $ECHO "Finished backing up ${multisite}"
        $ECHO
        $ECHO "##############################################################"
 
done

Setup Crontab

Assuming your bash file containing the code above is saved as /root/drush_backup.sh, you can set up a crontab entry for the root user. Note that the script expects the multisite name (or "all") as its first argument.

crontab -e
1 1 * * * /root/drush_backup.sh all

Oct 06 2010

With Drupal 7 right around the corner, I have recently put myself to the task of ramping up on what's new, what's changed, and what I, as a module developer, need to know when I sit down to code my first D7 module (or upgrade one of my D6 modules *shameless plug*). I've spent the last few weeks scouring the D7 core and API documentation, and let me be the first to tell you if you haven't heard it yet (unlikely): better times are ahead. For all of us.

The amount of API changes from D6 to D7 is broad and sweeping. D7 is certainly still Drupal, meaning that it still feels and acts like Drupal. Nodes are still nodes that can have taxonomy and comments, users are still users with roles and permissions, so on and so forth. But nearly everything under the hood, and in the UI for that matter, has been improved upon and/or refactored. From a coder's perspective, I will soon be in developer's heaven when D7 goes stable. From the standpoint of being a framework, D7's API is more mature and modern. You could almost think of D6 as a kid in high school, and with D7 that kid has graduated and is now ready for the real world. The refactoring that has taken place over the last couple of years and all of the new features now available will be a boon for module and theme developers alike. This also explains just why we have had to wait so long for a stable release. It's almost as if you took your grocery-getter car to Q from James Bond for an "upgrade". But, I digress.

This article will be the first in a series exploring D7's API. For this article (itself a two-part series), I want to talk about what I feel, as a developer, is the most substantial and important change to the Drupal core. The new database abstraction layer (or DBTNG, Database: The Next Generation) is a complete overhaul of the database layer we have grown accustomed to in D6 and before. Built on PDO, the database layer in Drupal is now truly abstract, meaning that it is database agnostic. This gives Drupal the flexibility to run on any number of different platforms without having to write new code or change any existing code that interacts with the database. This also means that we can expect to see a lot more enterprise-level Drupal sites once D7 hits stable, as its footprint in the corporate world will now be able to grow substantially. Gravy. So, let's see what all this means.

First and foremost, gone are the days of having to write ludicrously long and complicated MySQL- or Postgres-dependent SQL statements to fetch or write data. Gone are the days of having to define your placeholders' types. And gone are the days of having to worry about what would happen to your data if an insert or update query failed in the middle of executing (yes, we finally get to use transactions!).

Let’s jump right in and start with basic SELECT queries in D7’s new database layer. D7 introduces us to the concepts of static and dynamic queries. Static queries are queries that get passed to the database as is, meaning you query the database with a SQL string you write. Dynamic queries are built by Drupal using a query object and that query object’s methods. A few basic examples may help illustrate the difference.

Example of a static query in D7:

<?php
  $result = db_query("SELECT nid FROM {node}");
?>

Well, that should look familiar to anyone who has done any Drupal module development in the past. Our old friend db_query() is still here but is much more powerful now. How so? Glad you asked. First, the way we define placeholders has changed. We no longer have to type hint them with %d or %s, and we also don’t have to worry about remembering to surround string placeholders with quotes. Let’s see the difference:

Example of using placeholders in D6:

<?php
  $result = db_query("SELECT mail FROM {users} WHERE name = '%s'", "Bob");
?>

Example of using placeholders in D7:

<?php
  $result = db_query("SELECT mail FROM {users} WHERE name = :name", array(':name' => 'Bob'));
?>

At first, this may seem a little convoluted, as we have to write a little more code to accomplish the same task, but the benefits far outweigh the few extra characters I have to type. Placeholders, as seen above, are now defined using the convention :identifier. No need for quotes, no need to worry about the type. db_query()'s second parameter is an associative array whose keys are the placeholder identifiers, each assigned its corresponding value. It's basically the same principle that the Drupal function t() uses with its placeholders. You can also assign a placeholder value to be an array.

An example taken from the official handbook page for D7’s database layer:

<?php
  // This code:
  db_query("SELECT * FROM {node} WHERE nid IN (:nids)", array(':nids' => array(13, 42, 144));

  // Will get turned into this prepared statement equivalent automatically:
  db_query("SELECT * FROM {node} WHERE nid IN (:nids_1, :nids_2, :nids_3)", array(
    ':nids_1' => 13,
    ':nids_2' => 42,
    ':nids_3' => 144,
  ));

  // Which is equivalent to the following literal query:
  db_query("SELECT * FROM {node} WHERE nid IN (13, 42, 144)";
?>

Sweet. Another goodie that db_query() now has, thanks to PDO, is the ability to define how our results will be retrieved. You can provide db_query() with a third parameter, an options array whose 'fetch' key takes a PDO-specific constant that defines the retrieval type.

The types we can define are:

  • PDO::FETCH_OBJ — fetches results into an object
  • PDO::FETCH_ASSOC — fetches results into an associative array
  • PDO::FETCH_NUM — fetches results into an indexed array
  • PDO::FETCH_BOTH — fetches result into both an indexed and associative array

By default, results are fetched into objects. Here's what an example of fetching our results into an associative array looks like:

<?php
  $result = db_query("SELECT mail FROM {users} WHERE name = :name", array(':name' => 'Bob'), array('fetch' => PDO::FETCH_ASSOC));
?>

As you can see, db_query()’s third parameter is an associative array with the key ‘fetch’ where we assign the retrieval type. This array can also have another key defined: ‘target’, which tells Drupal where to retrieve the results from. As of the writing of this article, this is limited to only 2 possible values: “default” and “slave”. Still, though, the potential here makes me giddy.
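
For instance, a query can be pointed at the slave server like this (a sketch; Drupal falls back to the default connection if no slave is configured):

<?php
  $result = db_query("SELECT nid FROM {node}", array(), array('target' => 'slave'));
?>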

Additionally, we can also pass the name of a class we want our results to be retrieved in. Say what?!? Oh yes. If we have defined this class, and we pass db_query() the name of our class, results will be retrieved and assigned as properties of that class. If that class has a constructor method, by default it will run after the results are retrieved and the properties defined, but we can also tell db_query() to run the class' constructor method before the properties are assigned.

Another example from the official handbook page:

<?php
class ExampleClass {
  function __construct() {
    // Do something
  }
}

  $result = db_query("SELECT id, title FROM {example_table}", array(), array(
    'fetch' => 'ExampleClass',
  ));
?>

Dynamic queries in D7, on the other hand, are a different creature. We have a new function that we will use for dynamic queries: db_select(). We can pass db_select() three parameters. The first is a database table name. The second is an alias for that table name. The third parameter is identical to db_query()’s third parameter. db_select() returns a query object which has some methods defined that we can use to construct our dynamic query.

An example of a dynamic query in D7:

<?php
  $query = db_select('users', 'u');
  $query
    ->condition('u.name', 'Bob')
    ->fields('u', array('mail'));
  
  $result = $query->execute();
?>

Need to join in another table? No sweat. As I mentioned earlier in this article, gone are the days of having to write cumbersome SQL queries. Here's how easy it is to do a join with dynamic queries in D7:

<?php
  $table_alias = $query->join('users', 'u', 'n.uid = u.uid AND u.uid = :uid', array(':uid' => 5));
?>

Our query object's join method takes a table name as the first parameter and an alias for that table as the second parameter. The third parameter is the "ON" condition of the join, and the last parameter is for placeholder replacement. The join method returns the alias assigned to the table. We can also pass a SELECT query object to the join method as the first argument.

An example from the handbook:

<?php
  $myselect = db_select('mytable')
    ->fields('mytable')
    ->condition('myfield', 'myvalue');
  $alias = $query->join($myselect, 'myalias', 'n.nid = myalias.nid');
?>

I don't have nearly enough room to go into all the features and capabilities of dynamic queries in this article; suffice it to say, as you can see from the few simple examples above, they are extremely powerful and flexible, and you'll be using them in your own modules a lot very soon. Less code and more bang for the buck is always fine by me.

In the second part of this article, I will go over insert, update, and delete queries, and introduce you to merge queries.

Until next time, happy coding.


Jul 24 2010

If you're like most people who build using Drupal, you want to build your site around nodes that have titles, descriptions, tags, comments, etc. It's no surprise, since those types of content-oriented sites are the ones that make the world of the web go round. But what about another type of site that focuses not on nodes but on data contained in the Drupal database? Can you do it? Yes. Should you do it? Only if that sort of thing excites you, or perhaps if a client requests it.

I'll give you a little background on where I'm coming from and what made me decide to play around with Drupal as a way to expose data sets. About a decade ago I was part of a group of people who built a first of its kind application to send and receive data from wholesale power markets in North America. The application was built on three tiers: a client, an application server, and a database server. The client and application server were written in Java and the database relied on Oracle technology. It was about as far away from open source as you could get. The product was successful though and we did hundreds of installs all across North America. What the product basically did for its owners was suck in data from a central server then expose that data to the user via tables and charts visible from the user interface. At some point last year I started wondering if I could do something similar with Drupal. So I started investigating, tried out different modules and realized that I could do even more with a Drupal install than we did with that product.

There are four tasks that need to be accomplished if you want to use Drupal as an effective data platform.

  1. Capture data into Drupal database tables
  2. Set the data types (text, numeric, etc.)
  3. Expose the data to the user using tables and charts
  4. Offer simplified download format options

Capturing Data

I have been capturing data using the Table Wizard module. Table Wizard allows you to create a database table by uploading a delimited (tab- or comma-separated) file. This works pretty well; I recently uploaded a file with over 300,000 records and it was in the database within a few minutes. After you upload your file, you click on the table name to "analyze" the table. Table Wizard will tell you what columns you have in the table and their data types, and will identify the primary key. When you first upload your data you will want to switch over to your database to change data types to the appropriate ones and identify a primary key. Once you have done that, you can come back and re-analyze the table.

Setting the Data Types

This is probably the step that will be the least welcome to those who are not familiar with databases. If you want to do extensive work with data in Drupal, then you will want to become familiar enough with phpMyAdmin and MySQL data types to be able to navigate to your database and apply the appropriate settings for each type of data. You should also indicate which field contains the primary key for the table. The primary key is a unique value that identifies each record in a table. If you have a dataset that does not have a column with unique values, then consider adding a column with a unique numeric code in each row. You can name the column something like record_id and then add values like 100001, 100002, 100003 and so on.
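
If you'd rather work from the MySQL command line than phpMyAdmin, the statements look something like the sketch below. The table name (tw_my_dataset) and column name (price) are hypothetical, and note that an auto-increment column will number existing rows from 1 rather than from 100001:

-- Change a text column to a numeric type (hypothetical names):
ALTER TABLE tw_my_dataset
  MODIFY price DECIMAL(10,2);

-- Add a record_id column and make it the primary key;
-- AUTO_INCREMENT fills in a unique value for every existing row:
ALTER TABLE tw_my_dataset
  ADD COLUMN record_id INT UNSIGNED NOT NULL AUTO_INCREMENT,
  ADD PRIMARY KEY (record_id);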

Expose the Data Using Tables and Charts

If you have been a good boy or girl and added a single-field primary key, then the Table Wizard module will let you check a box labeled "Provide default view." This option automatically creates a view that you can expand upon as you see fit. Auto-created Table Wizard views will have the tag "tw" added to them. You can also create a view by navigating to admin/build/views/add and looking for the radio button in the 'View Type' section that starts with Database table. At this point you need to know a thing or two about the Views module to go further. There's not enough space to cover that here, so I'm going to skip over the details and point you to the key modules and settings.

The standard Views module will let you create a table from your data. You just have to choose a Page display and the Table style. You can then choose your fields, which will appear as columns within the table. Give the page a Path, and perhaps add an Exposed Filter or two to allow users to narrow down the data they are looking at. If you want to get a little fancy, you can use Views Calc to create tables that include calculations like min, max or average on a set of numeric data.

Charts are a little trickier, but they can be done with the assistance of Charts and Graphs and Views Charts. Charts and Graphs allows you to integrate a number of different free charting solutions (including Google Charts) into your site. Views Charts makes those features available as a Style (called "Charts") within the Views interface. The chart gets exposed on a page based on the Path that you define. You can also create a chart within the Block display type. Other charting modules exist for Drupal. All of them have some drawbacks, but so far the Charts and Graphs/Views Charts combo has worked the best for me.

Offering A Download Option

The ability to download data probably won't be a must-have feature for every site. After all, someone could probably just copy the table and paste it into their favorite spreadsheet program. It's a nice feature to offer, though, and you can do it pretty easily with the help of the Views Bonus Pack module. With Views Bonus Pack installed, you get the option to add a Feed display within the Views interface. With the Feed display selected, you then get the option of setting a Style that includes CSV and other popular formats like TXT, DOC, XLS and XML. You can then attach that display to the Table display, and an image will appear at the bottom of the table page that generates a file download when clicked. Be sure to add the same filters to the CSV display that you have added to the Table display so the downloaded file reflects what the user is seeing with any filters selected.

The usual caveats apply to the ideas and tips that I've offered here. At any time a better module could come along, or an existing module could change and alter the process a bit. So tread lightly and do some exploration before you commit to building a data-driven site using Drupal. In fact, even though there was a very recent release of Table Wizard, its development is being deprecated in favor of the Data module. I'll continue to upload using Table Wizard for now, since I think it works well, but I also plan on giving the Data module a shot in the near future.

As always I'd love to hear your thoughts and tips on modules and workflows in the comments. If you have built or know of a good data access website built using Drupal feel free to share those links as well.

Video Links

YouTube Version

Flash Version

Quicktime Version

Sep 14 2009
Sep 14

I had to move 264 databases from one Linux server (Fedora) to another (CentOS). Hard-copying /var/lib/mysql wasn't a good idea because the MySQL versions on the two servers were different and I would have had to repair all the tables after the move, so I decided to write a very short and useful bash script.

It dumps all the databases, writing each database to a different file:

for I in `echo "show databases;" | mysql -u myuser --password="" | grep -v Database`; do mysqldump -u myuser --password="" $I > "$I.sql"; done

As you can see, you can put each .sql file into a different folder if you want, and it's possible to compress each file using piping :)

Apr 21 2006
Apr 21

This article explains a practical implementation of a technique outlined in the article "Sharing Drupal tables between databases using MySQL5 Views".

Problem

You have multiple (multisite) Drupal sites and you would like to manage the content for all of these sites through a single interface. Depending on the nature of a given piece of content, you may want the content published on one, several or all of your subsites, but you do not want to have to create copies of the same content for each site.

Solution

Taxonomy plus MySQL5 views. (NOTE: this solution will not work with versions of MySQL prior to 5.)

Assuming you have your subsites properly set up and running, the first step is to create a special vocabulary which you will use to target content.

Go to [your site's baseurl]/admin/taxonomy/add/vocabulary and create a vocabulary. We'll call it simply "sites".

Next, go back to your taxonomy page (/admin/taxonomy) and select "edit vocabulary" for the "sites" vocabulary.

Add a name for each of the subsites you would like to manage. For our example, we'll have two subsites, "foo" and "bar", and one master site, "master".

Now add at least three pieces of test content: target one at foo, one at bar, and one at both.

Next, we're going to create a node view for each of our subsites that we'll use to replace the actual node table.

The SQL is as follows:

CREATE VIEW [subsite, eg. "foo"]_node AS
SELECT n.*
FROM node n, term_data td, term_node tn, vocabulary v
WHERE v.name = '[vocabulary name, eg. "sites"]'
  AND td.vid = v.vid
  AND td.name = '[subsite vocab term, eg. "foo"]'
  AND td.tid = tn.tid
  AND n.nid = tn.nid;

Because the terms that serve as our subsite labels may very well exist within other vocabularies, we also need to join on the vocabulary table to ensure our solution works reliably.
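
Filled in with this article's example names, the view for the "foo" subsite would look something like this (rewritten with explicit JOINs, which are equivalent to the comma-separated form above):

CREATE VIEW foo_node AS
SELECT n.*
FROM node n
JOIN term_node tn ON n.nid = tn.nid
JOIN term_data td ON tn.tid = td.tid
JOIN vocabulary v ON td.vid = v.vid
WHERE v.name = 'sites'
  AND td.name = 'foo';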

Finally, we need to have our subsites use the views we have created instead of our master nodes table, which only the "master" site will have access to directly.

In your Drupal installation's sites directory, you should have directories that correspond to each of your Drupal sites (both master and subsites). Edit the settings.php file for each of your subsites, and use the db_prefix variable to point the site at your view. So sites/foo.example.com/settings.php would contain the following:

$db_prefix = array( 'node' => 'foo_', );

At this point, you'll want to disable creation of content from within each of your subsites. You can do this from the admin/access page. If you attempt to create content from within the subsites, you'll likely get a 'duplicate key' error.

I hope that explanation is clear. These articles are written rather hastily. If you have questions or suggestions regarding this solution, please leave a comment.

Mar 24 2006
Mar 24

Problem:

You have a "master" database that contains data from multiple Drupal sites and you want to share it among them. Normally, you could use table prefixing to allow each of your sites to point to a single table. But what if you do not want content from one site to "bleed" across to the other sites? Let's say you have a network of Drupal sites sharing a user database. You want to share user information across your entire network of sites, but only make the information visible from those sites to which the user has subscribed. Or maybe you want to populate baz.com with users who meet some arbitrary criteria. There are lots of possibilities here, but nobody's paying me to write this, so let's get on with it.

Solution:

Use MySQL5 views. For our example, we'll use the users table. Our "master" users table (the table that contains all users for all of our sites) resides in a database called "master". The database of our example site that will have restricted access to the master users table is called "banana". Assuming you're starting with a fresh instance of the Drupal schema in your database "banana", do this:

USE banana;
DROP TABLE users;
CREATE VIEW users AS
SELECT *
FROM master.users
WHERE uid IN (
  SELECT uid
  FROM somedatabase.sometable
  WHERE uid = 0
    OR label = 'foo'
);

Drupal will use banana.users just as it would a normal users table. No other modifications are necessary. Now only "foo" users will be included in the users table for your banana website. Note: Drupal has a dependency that is not really documented. Every users table must have an entry that contains uid=0. It's a "stub" entry that Drupal needs to function properly when a user is anonymous. A workaround for this dependency is to include "user 0" in the results set that defines your view. See, wasn't that easy? Please post questions here, and I'll update these instructions as needed.
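
As a quick sanity check (assuming master.users contains the usual uid=0 stub row), you can confirm that the anonymous-user entry is visible through the view:

USE banana;
SELECT uid FROM users WHERE uid = 0;
-- This should return exactly one row with uid = 0.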
