Upgrade Your Drupal Skills

We trained 1,000+ Drupal Developers over the last decade.

See Advanced Courses NAH, I know Enough
Oct 24 2016
Oct 24

code

You know the old saying: “This is how the world ends: not with a bang, but with a misplaced DROP TABLE.” Working directly with Drupal 7’s database is an arduous task at best.  It’s a sprawling relational system and it uses many space and memory saving tricks to be as speedy as possible.  Thankfully, there is a robust system of functions built into Drupal to help you change almost any setting from code–perfect if you want to automate changes upstream and features doesn’t do it for you.  Let’s go over a situation in which you may have been utilizing some of these functions.

Let’s say you finished your product (congratulations!), launched, and are onto fixing bugs and planning exciting new features for the future.  You’re knocking out bugs left and right like some high-flying Drupal ninja and you discover that using a field collection with conditional fields causes the field collection data to not save and all of your metadata is getting erased when certain conditions are fired.  With Cthulhu’s hot breath on your neck, you talk to the client and realize a ray of hope: you don’t actually need a field collection there, a normal set of Drupal fields will do.  How do we go about creating the new fields, copying existing data, and deleting the old fields?

The first thing we do is create the new fields and attach them.  For this, we’ll need two functions: ‘field_create_field()’ and ‘field_create_instance()’.  Both of these take an array of settings: field_name and type are we need for creating the field (also cardinality if you want to have multiple values for the field), field_name, entity_type, and bundle are required for creating the instances, though you will likely also want label, or it will otherwise default to the machine name.  So, we should have something that looks like this:

$name = [ ‘field_name’ => 'photographer_name', ‘type’ => ‘text’, ]; field_create_field($name); $instance = array( 'field_name' => $name['field_name'], 'entity_type' => node, 'bundle' => article, 'label' => 'Name', ); field_create_instance($instance);

1

2

3

4

5

6

7

8

9

10

11

12

13

14

$name = [

  field_name’ => 'photographer_name',

  type=>text,

];

field_create_field($name);

$instance = array(

  'field_name' => $name['field_name'],

  'entity_type' => node,

  'bundle' => article,

  'label' => 'Name',

);

field_create_instance($instance);

If you go check out node/add/article, you should see your new text field there.  Congrats!  Next, we need to get the data from the old fields and copy it into our new field.  For this, we’ll rely on the nifty function ‘entity_load()’.  This takes two arguments, bundle name and an array of ids.  Since we are getting field collection items, we know the bundle name is ‘field_collection_item’.  We’ll need the IDs, but we’ll also need the field collection value that references the fields in each collection for later, so we’ll get them both at once.  It might be tempting to use ‘entity_load()’ to get them, but in this case you are quite safe using straight SQL, which also happens to be significantly faster.  That looks like this:

$entity_ids = array(); $field_collection_ids = array(); // Select the field collection id and the attached entity id from the db. $query = db_query('SELECT field_producer_value, entity_id FROM field_data_field_producer'); $results = $query->fetchAll(); // Separate the ids</span> foreach ($results as $result) { $field_collection_ids[] = $result->field_scald_producer_value; // We need to reference the entity ID by the field collection value for simplicity later $entity_ids[$result-&gt;field_scald_producer_value] = $result-&gt;entity_id; } // It’s possible that you might get duplicate Field Collection IDs, so we make sure they are all unique $field_collection_ids = array_unique($field_collection_ids); // Load all of the field collection entities. $field_collection_results = entity_load('field_collection_item', $field_collection_ids);

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

$entity_ids = array();

$field_collection_ids = array();

// Select the field collection id and the attached entity id from the db.

$query = db_query('SELECT field_producer_value, entity_id FROM field_data_field_producer');

$results = $query->fetchAll();

// Separate the ids</span>

foreach ($results as $result) {

  $field_collection_ids[] = $result->field_scald_producer_value;

  // We need to reference the entity ID by the field collection value for simplicity later

  $entity_ids[$result-&gt;field_scald_producer_value] = $result-&gt;entity_id;

}

// It’s possible that you might get duplicate Field Collection IDs, so we make sure they are all unique

$field_collection_ids = array_unique($field_collection_ids);

// Load all of the field collection entities.

$field_collection_results = entity_load('field_collection_item', $field_collection_ids);

Now that we have all of the entity ids and field collection ids, we can get to the fun part: copying data! (You know you have been doing this too long when that is exciting.) What we want to do is loop through the field collection ids, load the the entity (that has the new field on it) by the id associated with the collection, copy the data from the collection to the new field, and save.  It seems like a lot, but it’s fairly simple:

foreach ($field_collection_ids as $field_collection_id) { // Load the entity the field collection is attached to $entity = entity_load('node', array($entity_ids[$field_collection_id])); // Copy the data from the collection field to the new field $entity[$entity_ids[$field_collection_id]]->photographer_name['und'][0]['value'] = $field_collection_results[$field_collection_id]->field_producer_name['und'][0]['value']; // Save! entity_save('node', $entity[$entity_ids[$field_collection_id]]); }

1

2

3

4

5

6

7

8

9

foreach ($field_collection_ids as $field_collection_id) {

  // Load the entity the field collection is attached to

  $entity = entity_load('node', array($entity_ids[$field_collection_id]));

  // Copy the data from the collection field to the new field

  $entity[$entity_ids[$field_collection_id]]->photographer_name['und'][0]['value'] =

  $field_collection_results[$field_collection_id]->field_producer_name['und'][0]['value'];

  // Save!

  entity_save('node', $entity[$entity_ids[$field_collection_id]]);

}

A word of warning: depending on how many entities you are processing, this could take a long time.  As of Drupal 7.34, there is a memory leak in entity_save()–this means that each save will take slightly longer than the last. This is not a problem if you have only a few hundred fields, but when you get up into five and six digits, this script will take many hours. At that point, unless you have the time (and/or can run the script as a process in the background), you might want to consider investigating other options.

Okay, so the data is copied, the nodes are saved, and the elder gods have hit the snooze button.  Last thing you have to do is delete the old field.  We’re not going to do that, at least not yet. Instead, we’re going to delete the instances of the fields.  This will preserve the old field collection data, but remove the fields from the edit forms. This way, if something goes wrong, you don’t lose the data in the old fields and can try again if needed. You can go back at a later time, if you wish, after you have confirmed that everything is correct and delete the fields. Luckily, this is the easy part:

$instance = array( 'field_name' =&gt; 'field_scald_producer', 'entity_type' =&gt; node, 'bundle' =&gt; article ); field_delete_instance($instance);

1

2

3

4

5

6

$instance = array(

  'field_name' =&gt; 'field_scald_producer',

  'entity_type' =&gt; node,

  'bundle' =&gt; article

);

field_delete_instance($instance);

And that’s it, crisis averted!  You no longer lose data and no longer have to worry about supernatural madness and death!  All you need to do now is run your script upstream with ‘drush php-script’ and watch the magic.

This sort of scripting can be daunting at first glance, but Drupal’s rich entity API can keep you from pulling out your hair or inadvertently causing an otherworldly alien intelligence from rising from the deep.  There are many more functions to take advantage of, and just about anything you can set with a click in the interface you can set in code, perfect for automation or locked down production environments.

Happy Drupaling !

Jun 17 2016
Jun 17
This is the next steps in getting my module working in Drupal 8. In the previous post, I got the block setup working, but it's just a generic "Hello World!" output. The block gets the most recent entries from a database table, so my next step is getting the database setup.

Databases

This doesn't seem to have changed from Drupal 7 to 8. You basically copy over the .install file. The hook_schema() takes care of it. BTW, one thing that is different is if you need to uninstall/reinstall the modules, you just need to drush pm-uninstall MODULE vs. drush dis MODULE ; drush pm-uninstall MODULE.

Once I had the database tables in place, I dumped my database tables from D7 and imported them into D8. NOTE: I probably need to either codify this (perhaps there's a way to hook into the migration system?) or keep notes about what tables to dump/import once everything's in place.

Now I needed to update my block to query the most recent entries in the table. db_query() is still supported in D8, however it looks like it's going away for D9, so if I wanted to future-proof my module even more, I could convert it to the new injection technique. So I created a storage class in my src folder. I downloaded the examples module and looked at the dbtng_example submodule and its DbtngExampleStorage class. I also found this answer on StackOverflow and the example posted in the select doc helpful. I ended up with a hybrid approach:

So a few things there. First, my post_date field was a legacy datetime field, so I had to convert the current time from a timestamp to the date format that mysql expected. format_date() is deprecated, so they recommend using Drupal::service('date.formatter'). Also, I make my __construct() method a little dynamic so that I don't have to pass in a connection, but it can support it if it is passed in.

Then my block build() method uses my storage class and calls the get_reports() method, passing in the block's configuration array. First, I add use Drupal\gated_content\GCStorage; to the top of my block plugin. Then my build() looks like this:

  public function build() {
    $storage = new GCStorage;
    $reports = $storage->get_reports($this->configuration);
    $build = array();
    kint($reports, $build);
    return $build;
  }

Block View

So now that my block has the right set of reports to show, I need to show them. This takes me to the whole theming and templates stuff. The D7 version called theme() directly and there was a hook_theme and I had a template. The end result was a simple
and foreach report, a
and l(); call to link to the report. Kind of reminds me of item_list a bit, so I think I'll start there.

Ok, it looks like item_list looks about the same. My render array just needs to '#theme' => 'item_list'. But l() is gone. This comment pointed me in the right direction. So now it looks like this:

    $items = array();
    foreach ($reports as $report) {
      $url = Url::fromRoute('');
      $items[] = \Drupal::l(t($report['title']), $url);
    }
    $build = array(
      'report_list' => array(
        '#theme' => 'item_list',
        '#items' => $items,
        '#attributes' => array(
          'class' => 'report-block',
        ),
      ),
    );

Success!

Of course, these all link to just the homepage, which isn't very helpful.

Menu

So now I'm looking down another rabbit hole. The URL to my report is reports/[ID#]. But I use this module on separate domains, so the reports part is configurable. So I need to get that configuration setting to get the first bit of the URL. Actually, I think I can skip that since we'll be doing something different in our migration plans, so I can skip that part of the module. [postnote: if I do end up needing this, it looks like this shows how]

Also, I had this module all setup with pathauto to get a nice SEO-friendly public URL. Ok, so I need to setup the menu routing so the user can get to /gated_content and /gated_content/[ID#] and /gated_content/[ID#]/download. And I need to setup pathauto again.

First, all of those URL's were setup in D7 with type of MENU_CALLBACK, so according to this doc, I need to set them up in a .routing.yml file. This wasn't something DMU setup for me, so I did this from scratch. This was a great doc that showed some examples going from D7 to D8.

My first question is that I was familiar with named placeholders and how the name would be used to call a NAME_load() function to put together the argument that gets passed into the callback. But these examples are using book, which uses node and entity, so I need something that will use my own data type.

Stuff for the next post!

Jun 14 2016
Jun 14

Google Summer of Code (GSoC’ 16) is entering into the mid-Term evaluation stage. I have been working on the porting search configuration module for Drupal for the past three weeks.

Search configuration module helps to configure the search functionality in Drupal. This is really an important feature when it comes to a content management system like Drupal. I am almost mid-way in the port process as indicated in the timeline of Google Summer of Code.

It is really a great feeling to learn the Drupal concepts this summer. I would like to take this opportunity to share with you some key aspects I had to deal with in the past week.

Once a module is installed and later on if some changes are amended, we need not rewrite the Drupal code. Instead, Drupal gives up the option to make use of a hook, function hook_update_N which helps us to write the update needed and the database schema gets updated accordingly. Currently, since my module is just going into its first release after this port process, I need not write an update function. I just have to make this update in connection with the earlier code. The same hook works for both Drupal 7 and 8.

Another feature is the hook_node_insert, this hook is invoked to insert a new node into the database. So, here we are writing into the database a drupal record. In Drupal 7, this insertion was done by the drupal_write_record(). But,in D8, it has been replaced by the merge query and the entity API. The merge queries support the insert and update options to the database.

In connection with the nodes, another hook function is the hook_node_update. This functionality updates the node contents which has been inserted into the Drupal record (database). This takes in an argument, obviously,  the node has to be passed, for which the updating is intended.

The hook_uninstall gets invoked as the name suggests, in the uninstall process of the modules. The hook removes the variable used by the module so as to free the memory. It also modifies the existing tables if required. The D7 version of Drupal used the  variable_del for removing the variables.

For instance,

variable_del($nameOfVariable);

// Drupal 7 code

This has been replaced by the delete() of the configuration API.

i.e,

\Drupal::service('config.factory')->getEditable('search_config.settings')->delete();

search_config.settings is the default configuration file.

I will post the updates on this port process regularly. Please wait for the future posts.

Apr 03 2015
Apr 03

Drupal field was part of the Drupal core since version 7. The Field extends her ability to build different kinds of systems. Since it is basic units of each entity, it is one of the most important parts of the open source software. But, when it comes to the efficiency of using SQL storage engine, the field can still do better with efficiency. I sincerely believe that we may not afford to ignore it. Let put it under a microscope had a close look at field SQL storage.

Case study:

I had built a patient scheduling system for a couple clinic offices. The project itself is not complicated. I have attached the patient profile picture on this article. We built a patient profile node type on the form. It is not a complicated form, but there are over 40 fields. It is not difficult to set up a nice patient profile node form. I also created appointment node type that connected patient profile and doctor profile with entity reference fields. Using views with exposed filter for the various reports.

It was the project where I find the issue. I am a little bit uncomfortable after I take a close look at the database. Each field has two almost identical tables. I think fields took too much unnecessary database space. I have dumped one of the fields database information to explain my concern.

1) Base table: field_data_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | YES  | MUL | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Base table SQL script:

CREATE TABLE `field_data_field_initial` (
`entity_type` varchar(128) NOT NULL DEFAULT '',
`bundle` varchar(128) NOT NULL DEFAULT '',
`deleted` tinyint(4) NOT NULL DEFAULT '0',
`entity_id` int(10) unsigned NOT NULL,
`revision_id` int(10) unsigned DEFAULT NULL,
`language` varchar(32) NOT NULL DEFAULT '',
`delta` int(10) unsigned NOT NULL,
`field_initial_value` varchar(255) DEFAULT NULL,
`field_initial_format` varchar(255) DEFAULT NULL,
PRIMARY KEY (`entity_type`,`entity_id`,`deleted`,`delta`,`language`),
KEY `entity_type` (`entity_type`),
KEY `bundle` (`bundle`),
KEY `deleted` (`deleted`),
KEY `entity_id` (`entity_id`),
KEY `revision_id` (`revision_id`),
KEY `language` (`language`),
KEY `field_initial_format` (`field_initial_format`)

2) Revision table: field_revision_field_initial

+----------------------+------------------+------+-----+---------+-------+
| Field                | Type             | Null | Key | Default | Extra |
+----------------------+------------------+------+-----+---------+-------+
| entity_type          | varchar(128)     | NO   | PRI |         |       |
| bundle               | varchar(128)     | NO   | MUL |         |       |
| deleted              | tinyint(4)       | NO   | PRI | 0       |       |
| entity_id            | int(10) unsigned | NO   | PRI | NULL    |       |
| revision_id          | int(10) unsigned | NO   | PRI | NULL    |       |
| language             | varchar(32)      | NO   | PRI |         |       |
| delta                | int(10) unsigned | NO   | PRI | NULL    |       |
| field_initial_value  | varchar(255)     | YES  |     | NULL    |       |
| field_initial_format | varchar(255)     | YES  | MUL | NULL    |       |
+----------------------+------------------+------+-----+---------+-------+

Revision table SQL script:

CREATE TABLE `field_revision_field_initial` (
  `entity_type` varchar(128) NOT NULL DEFAULT '',
  `bundle` varchar(128) NOT NULL DEFAULT '',
  `deleted` tinyint(4) NOT NULL DEFAULT '0',
  `entity_id` int(10) unsigned NOT NULL,
  `revision_id` int(10) unsigned NOT NULL,
  `language` varchar(32) NOT NULL DEFAULT '',
  `delta` int(10) unsigned NOT NULL,
  `field_initial_value` varchar(255) DEFAULT NULL,
  `field_initial_format` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`entity_type`,`entity_id`,`revision_id`,`deleted`,`delta`,`language`),
  KEY `entity_type` (`entity_type`),
  KEY `bundle` (`bundle`),
  KEY `deleted` (`deleted`),
  KEY `entity_id` (`entity_id`),
  KEY `revision_id` (`revision_id`),
  KEY `language` (`language`),
  KEY `field_initial_format` (`field_initial_format`)

Here are my concerns.

1) Normalization.

Here is one of the fields' data record.

+-------------+--------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| entity_type | bundle | deleted | entity_id | revision_id | language | delta | field_initial_value | field_initial_format |
+-------------+--------+---------+-----------+-------------+----------+-------+---------------------+----------------------+
| node        | patient_profile      |       0 |      1497 |        1497 | und      |     0 | w                   | plain_text        |
+-------------+--------+---------+-----------+-------------+----------+-------+---------------------+----------------------+

We have value "W" in the Initial field. One character took 51 bytes for storage that had not included index yet. It took another 51 byte in the revision table and more for index. In this case here, only less than two percents of space are used for real data the initial 'W', and over 98% of space is for other purposes.

For the sake of space, I think we should not use varchar for entity_type, bundle, language, field_format column. Use small int, tiny int or intÎÎ that will only take one to four bytes. The field is a basic unit of a Drupal website. A medium website can hold millions of fields. Saved one byte is equal to multiple megabytes in precious MySQL database.

2) Too complicated primary key

Each field table has a complicated primary key. Base table use `entity_type`, `entity_id`, `deleted`, `delta`, `language` as primary key. Revision table use `entity_type`, `entity_id`, `revision_id`, `deleted`, `delta`, `language` as primary key. "In InnoDB, having a long PRIMARY KEY wastes a lot of disk space because its value must be stored with every secondary index record."ÎÎÎ. It may be worthy to add an auto increasing int as a primary key.

3) Not needed field column

I found bundle type column is not necessary. We can have the system running well without bundle type column. In my clinic project, I named the node type "patient profile". The machine name patient_profile appears in each field record's bundle type column. As varchar (255), it uses 16 bytes for each table record. Let do a quick calculation. if there are 100, 000 nodes and each node have 40 fields, 100,000 x 40 x 2 x 16 = 122MB are taken for this column. Or at least, we use 2 bytes small int that will take only one-eighth of the spaces.

4) Just use revision table.

Remove one of the field's data tables. It may need a little bit more query power to get field data, but it save time when we insert, update and delete field's data. By doing so, we maintain one less table per field, edit content faster. It helps to bring better editor experience and to save on database storage space.

A contributed module field_sql_leanÎÎ addressed some of the concerns here. It still needs a lot of work on itself and if we want other contributed modules compatible with it. After all, it changed the field table structure.

Reference:

1: http://dev.mysql.com/doc/refman/5.1/en/integer-types.html
2: http://dev.mysql.com/doc/refman/5.0/en/innodb-tuning.html
3: Field SQL storage lean solution
4: Patient profile form:medical form

Jan 16 2013
Jan 16

Today we've got two FREE videos for your viewing enjoyment, one about a killer community tool, and the other about a handy module to help you out when working with Drupal databases. In Using IRC (Internet Relay Chat) we look at one of the most over-looked, but most useful community tools available, IRC. IRC is a way to chat online, in real-time with other people, and it is used heavily by the Drupal community. We'll sort out what IRC is, how to get connected, and walk you through some of the basics to get you started.

Our second lesson, Using Schema Module, introduces you to a great way to get information about your database, and a nice, quick way to get Drupal to write some code for you, when it comes to creating your module's install file.

We hope you enjoy these videos, and we'll have some cool new stuff next week, walking you through working with lightboxes in Drupal.

Nov 28 2012
Nov 28

This video looks at the basics of working with MySQL from the command line. We get into the mysql environment and look at databases, tables and fields. We cover creating and deleting databases, creating a user, and querying within a particular database.

Mar 14 2012
Mar 14

To get things started in this series we'll install Drupal 7. We start in Drupal.org to download the code, review some important points to watch for when copying the files, then we show how to set up a database using PHPMyAdmin before we dive into the Drupal installer.

Jan 10 2011
Jan 10

Welcome to the 5th installment of the Drupal 7 Line by Line series of articles.

In the first four installments I covered index.php, the basics of the drupal_bootstrap() function and the first two phases of the Drupal bootstrap process: DRUPAL_BOOTSTRAP_CONFIGURATION and DRUPAL_BOOTSTRAP_PAGE_CACHE. (There are links to those articles found at the bottom of this article if you want to catch up.)

In the previous article about the page cache bootstrap process I noted that unless your site is configured in a specific way, both the DRUPAL_BOOTSTRAP_DATABASE (phase 3) and DRUPAL_BOOTSTRAP_VARIABLES (phase 4) bootstrapping phases will be completed before the page cache (phase2) bootstrap process can itself finish. As a result, I am technically covering the database bootstrapping process out of order and am not literally following the line by line code execution of Drupal exactly. I hope you can forgive me.

_drupal_bootstrap_database()

The third Drupal bootstrap phase begins with a call to the _drupal_bootstrap_database() function.

The first few lines of this function can put to rest any argument that Drupal is an application development framework (like Zend framework, or CakePHP, or symfony). In its current state it simply isn't. Here, on virtually every page load, Drupal checks to see if it in an installation state.

<?php
 
// Redirect the user to the installation script if Drupal has not been
  // installed yet (i.e., if no $databases array has been defined in the
  // settings.php file) and we are not already installing.
 
if (empty($GLOBALS['databases']) && !drupal_installation_attempted()) {
    include_once
DRUPAL_ROOT . '/includes/install.inc';
   
install_goto('install.php');
  }
?>

If Drupal decides that it is in an installation state it would load install.inc and redirect the request to install.php. With a specific installer baked into its core Drupal has no separated between the framework that its functionality is built on and the product that exposes that functionality. Please don't misunderstand my point here. Drupal is a very flexible, extensible and developer friendly CMS product. It is very much like a framework. It looks like a duck and quacks like a duck, but it might be a swan or some other mixed metaphor.

But, I digress.

Drupal Testing System

Drupal 7 comes with a testing suite that allows developers to write functional tests and run them to make sure that whatever they are working on doesn't break the rest of their Drupal site. Part of that testing system is a fake 'user agent' / fake browser that makes page requests and makes the results available for testing in an automated way.

Here Drupal checks to see if the request is being made by this testing user agent with a call to drupal_valid_test_ua(). If it is a request from the testing user agent Drupal overrides the database settings defined in settings.php. The reason for this is that if you are running tests (which require reading and writing of bogus test data to the database), you don't want to impact any real data on your site. So, if you are running tests they are run on a different set of tables.

The Database System

Finally, Drupal includes /database/database.inc.

<?php
 
// Initialize the database system. Note that the connection
  // won't be initialized until it is actually requested.
 
require_once DRUPAL_ROOT . '/includes/database/database.inc';
?>

This file contains the core of the new Drupal 7 database abstraction layer.

Here is a part of the code comments from database.inc:

<?php
/* Drupal provides a database abstraction layer to provide developers with
* the ability to support multiple database servers easily. The intent of
* this layer is to preserve the syntax and power of SQL as much as possible,
* but also allow developers a way to leverage more complex functionality in
* a unified way. It also provides a structured interface for dynamically
* constructing queries when appropriate, and enforcing security checks and
* similar good practices.
*
* The system is built atop PHP's PDO (PHP Data Objects) database API and
* inherits much of its syntax and semantics.
*/
?>

The workings of the new database abstraction layer is enough material for several articles. You may want to do some reading on your own. I would suggest:
The main API documentation: http://api.drupal.org/api/group/database/7
Developing with the database API: http://drupal.org/developing/api/database

Autoloaders

Last but not least Drupal registers two autoloading functions. PHP provides a file autoloading mechanism so that if it encounters the instantiation of a previously undefined Class or encounters the implementation of a previously undefined Interface it can be told where to look in the file system to load the file that contains the Class or Interface definition.

<?php
 
// Register autoload functions so that we can access classes and interfaces.
  // The database autoload routine comes first so that we can load the database
  // system without hitting the database. That is especially important during
  // the install or upgrade process.
 
spl_autoload_register('drupal_autoload_class');
 
spl_autoload_register('drupal_autoload_interface');
?>

These two lines tell php to use the drupal_autoload_class() and drupal_autoload_interface() functions for loading undefined classes and interfaces. Both of these functions are actually a wrapper for the _registry_check_code() function.

The name of the function implies that Drupal now has a code registry. And indeed it does. What that means is that modules now have a way to tell Drupal where to find certain include files so that when they are needed they can be 'lazy loaded'. To find out more I suggest reading: Drupal's code registry.

Summary

DRUPAL_BOOTSTRAP_DATABASE is the third bootstrap phase. It checks to see if the site is currently in an installation state (rarely) and whether it is in a automated testing state (rarely). Most of the time it simply does two things: load database.inc and register autoloader functions.

As always feel free to leave comments, corrections or questions.

See Also: 

Dec 23 2010
Dec 23

Drush + Bash + Cron: Datbase Backup Goals

  • Scan sites directory for a given drupal install
  • Find all multisite folders/symlinks
  • For each multisite:
  • Use Drush to clear cache - we dont want cache table bloating up the MySQL dump file
  • Use Drush to delete watchdog logs - we dont want watchdog table bloating up the MySQL dump file
  • Use Drush to backup the database to pre-assigned folder
  • Use tar to compress and timestamp the Drush generated sql file
  • Setup Crontab to run perodically with the above commands as a bash file

Assumptions and Instructions

You will need to adjust the Bash file if any of these are not the same on your server

  • Drupal is installed in /var/www/html/drupal
  • Multisites are setup in the /var/www/html/drupal/sites folder
  • Backup folder exists in /var/www/backup/sqldumps
  • Drush is already installed in /root/drush/drush. If drush is not installed follow this Drush installation guide
  • AWK is already installed, if not, type: sudo yum install gawk

Drush Backup BASH file

Copy paste the code below and create a new bash file ideally in your/root home folder. Make the Bash file executable.

#!/bin/bash
#
 
# Adjust to match your system settings
DRUSH=/root/drush/drush
ECHO=/bin/echo
FIND=/usr/bin/find
AWK=/bin/awk
 
# Adjust to match your system settings
docroot=/var/www/html/drupal
backup_dir=/var/www/backup/sqldumps
 
multisites=$1
 
START_TIME=$(date +%Y%m%d%H%M);
 
# Add all multisites for a given docroot into a list. Detects all web addresses which are a directory which isn't named all, default or ends in .local.
if [ "${multisites}" = "all" ];then
        # If multisites are folders change -type d
        # If multisites are symlinks change -type l
        # Adjust $8 to match your docroot, it $x needs to be the name of the multisite folder/symlink
        multisites_list="`$FIND ${docroot}/sites/* -type l -prune | $AWK -F \/ '$8!="all" && $8!="default" && $8!~/\.local$/ { print $8 }'`"
else
        multisites_list=$multisites
fi
 
 
# Must be in the docroot directory before proceeding.
cd $docroot
 
for multisite in $multisites_list
do
        # Echo to the screen the current task.
        $ECHO
        $ECHO "##############################################################"
        $ECHO "Backing up ${multisite}"
        $ECHO
        $ECHO
 
        # Clear Drupal cache
        $DRUSH -y -u 1 -l "${multisite}" cc all
 
        # Truncate Watchdog
        $DRUSH -y -u 1 -l "${multisite}" wd-del all
 
        # SQL Dump DB
        $DRUSH -u 1 -l "${multisite}" sql-dump --result-file="${backup_dir}"/"${multisite}".sql
 
        # Compress the SQL Dump
        tar -czv -f "${backup_dir}"/"${START_TIME}"-"${multisite}".tar.gz -C "${backup_dir}"/ "${multisite}".sql
 
        # Delete original SQL Dump
        rm -f "${backup_dir}"/"${multisite}".sql
 
        $ECHO
        $ECHO
        $ECHO "Finished backing up ${multisite}"
        $ECHO
        $ECHO "##############################################################"
 
done

Setup Crontab

Assuming your bash file containing the code above is saved as /root/drush_backup.sh, you can setup a crontab for root user.

crontab -e
1 1 * * * /root/drush_backup_db.sh

Further Reading and Links

Related blog posts: 


Bookmark and Share
Oct 06 2010
Oct 06
Printer-friendly version

With Drupal 7 right around the corner, I have recently put myself to the task of ramping up on what's new, what's changed, and what do I – as a module developer – need to know when I sit down to code my first D7 module (or upgrade one of my D6 modules *shameless plug*). I've spent the last few weeks scouring over the D7 core and API documentation, and let me be the first to tell you if haven't heard it yet (unlikely): better times are ahead. For all of us.

The amount of API changes from D6 to D7 are broad and sweeping. D7 is certainly still Drupal, meaning that it still feels and acts like Drupal. Nodes are still nodes that can have taxonomy and comments, users are still users with roles and permissions, so on and so forth. But nearly everything under the hood and in the UI, for that matter, has been improved upon and/or refactored. From a coder’s perspective, I am soon to be in developer’s heaven when D7 goes stable. From the standpoint of being a framework, D7’s API is more mature and modern. You could almost think of D6 as a kid in highschool and with D7 that kid has graduated and is now ready for the real world. The refactoring that has taken place over the last couple of years and all of the new features now available will be a boon for module and theme developers alike. This also explains just why we have had to wait so long for a stable release. It’s almost as if you took your grocery-getter car to Q from James Bond for an “upgrade”. But, I digress.

This article will be the first in a series exploring D7’s API. For this article (itself a two-part series), I want to talk about what I feel, as a developer, is the most substantial and important change to the Drupal core. The new database abstraction layer (or DBTNG Database: The Next Generation) is a complete overhaul of the database layer we have grown accustomed to in versions D6 and before. Built on PDO, the database layer in Drupal is now truly abstract, meaning that it is database agnostic. This gives Drupal the flexibility to run on any number of different platforms without having to write new code or change any existing code that interacts with the database. This also means that we can expect to see a lot more enterprise-level Drupal sites once D7 hits stable as it’s footprint in the corporate world will now be able to grow substantially. Gravy. So, let’s see what all this means.

First and foremost, gone are the days of having to write ludicrously long and complicated MySQL or Postgres dependent SQL statements to fetch or write data. Gone are the days of having to define your placeholders’ type. And gone are the days of having to worry about what would happen to your data if an insert or update query failed in the middle of executing (yes, we finally get to use transactions!).

Let’s jump right in and start with basic SELECT queries in D7’s new database layer. D7 introduces us to the concepts of static and dynamic queries. Static queries are queries that get passed to the database as is, meaning you query the database with a SQL string you write. Dynamic queries are built by Drupal using a query object and that query object’s methods. A few basic examples may help illustrate the difference.

Example of a static query in D7:

<?php
  $result = db_query(“SELECT nid FROM {node}”);
?>

Well, that should look familiar to anyone who has done any Drupal module development in the past. Our old friend db_query() is still here but is much more powerful now. How so? Glad you asked. First, the way we define placeholders has changed. We no longer have to type hint them with %d or %s, and we also don’t have to worry about remembering to surround string placeholders with quotes. Let’s see the difference:

Example of using placeholders in D6:

<?php
  $result = db_query(“SELECT mail FROM {users} WHERE name = ‘%s’”, “Bob”);
?>

Example of using placeholders in D7:

<?php
  $result = db_query(“SELECT mail FROM {users} WHERE name = :name”, array(‘:name’ => ‘Bob’));
?>

At first, this may seem a little convoluted as we have to write a little more code to accomplish the same task, but the benefits far outweigh the few extra characters I have to type. Placeholders, as seen above, are now defined using the convention :identifier. No need for quotes, no need to worry about the type. The function db_query()’s second parameter is an associative array where the keys are the placeholder identifiers and then assigned their corresponding values. It’s basically the same principle that the Drupal function t() uses with its placeholders. You can also assign a placeholder value to be an array.

An example taken from the official handbook page for D7’s database layer:

<?php
  // This code:
  db_query("SELECT * FROM {node} WHERE nid IN (:nids)", array(':nids' => array(13, 42, 144));

  // Will get turned into this prepared statement equivalent automatically:
  db_query("SELECT * FROM {node} WHERE nid IN (:nids_1, :nids_2, :nids_3)", array(
    ':nids_1' => 13,
    ':nids_2' => 42,
    ':nids_3' => 144,
  ));

  // Which is equivalent to the following literal query:
  db_query("SELECT * FROM {node} WHERE nid IN (13, 42, 144)";
?>

Sweet. Another goodie that db_query() now has thanks to PDO is the ability to define how our results will be retrieved. You can provide db_query() with a third parameter which is a PDO-specific constant that defines the retrieval type.

The types we can define are:

  • PDO::FETCH_OBJ — fetches results into an object
  • PDO::FETCH_ASSOC — fetches results into an associative array
  • PDO::FETCH_NUM — fetches results into an indexed array
  • PDO::FETCH_BOTH — fetches result into both an indexed and associative array

By default, results are fetched into an object. Here’s what an example of fetching our results into an associative array looks like:

<?php
  $result = db_query(“SELECT mail FROM {users} WHERE name = :name”, array(‘:name’ => ‘Bob’), array(‘fetch’=> PDO::FETCH_ASSOC));
?>

As you can see, db_query()’s third parameter is an associative array with the key ‘fetch’ where we assign the retrieval type. This array can also have another key defined: ‘target’, which tells Drupal where to retrieve the results from. As of the writing of this article, this is limited to only 2 possible values: “default” and “slave”. Still, though, the potential here makes me giddy.

Additionally, we can also pass the name of a class we want are results to be retrieved in. Say what?!? Oh yes. If we have defined this class, and we pass db_query() the name of our class, results will be retrieved and assigned as properties of that class. If that class has a constructor method, by default it will run after the results are retrieved and the properties defined, but we can also tell db_query() to run the class’ constructor method before the properties are assigned.

Another example from the official handbook page:

<?php
class exampleClass {
  function __construct() {
    // Do something
  }
}

  $result = db_query("SELECT id, title FROM {example_table}", array(), array(
    'fetch' => 'ExampleClass',
  ));
?>

Dynamic queries in D7, on the other hand, are a different creature. We have a new function that we will use for dynamic queries: db_select(). We can pass db_select() three parameters. The first is a database table name. The second is an alias for that table name. The third parameter is identical to db_query()’s third parameter. db_select() returns a query object which has some methods defined that we can use to construct our dynamic query.

An example of a dynamic query in D7:

<?php
  $query = db_select(‘users’, ‘u’);
  $query
    ->condition(‘u.name’, ‘Bob’)
    ->fields(‘u’, array(‘mail’));
  
  $result = $query->execute();
?>

Need to join in another table? No sweat. Like I mentioned earlier in this article gone are the days of having to write cumbersome SQL queries. Here’s how easy it is to do a join with dynamic queries in D7:

<?php
  $table_alias = $query->join('user', 'u', 'n.uid = u.uid AND u.uid = :uid', array(':uid' => 5));
?>

Our query object’s join method takes a table name as the first parameter, and an alias for that table as the second parameter. The third parameter is the “ON” condition of the join and the last parameter is for placeholder replacement. The join method returns the table alias of the table assigned. We can also pass SELECT query object to the join method as the first argument.

An example from the handbook:

<?php
  $myselect = db_select('mytable')
    ->fields('mytable')
    ->condition('myfield', 'myvalue');
  $alias = $query->join($myselect, 'myalias', 'n.nid = myalias.nid');
?>

I don’t have nearly enough room to go into all the features and capabilities of dynamic queries in this article, suffice to say as you can see from the few simple examples above they are extremely powerful and flexible and you’ll be using them in your own modules a lot very soon. Less code and more bang for the buck is always fine by me.

In the second part of the this article, I will go over insert, update, and delete queries and will introduce you merge queries.

Until next time, happy coding.

Further reading:

Jul 24 2010
Jul 24
Printer-friendly versionPDF version

If you're like most people who build using Drupal you want to build your site around nodes, that have titles, descriptions, tags, comments, etc. It's no surprise since those types of content oriented sites are the ones that make the world of the web go round. But what about another type of site that focuses not on nodes but on data contained in the Drupal database? Can you do it? Yes. Should you do it? Only if that sort of thing excites you or perhaps if a client requests it.

I'll give you a little background on where I'm coming from and what made me decide to play around with Drupal as a way to expose data sets. About a decade ago I was part of a group of people who built a first of its kind application to send and receive data from wholesale power markets in North America. The application was built on three tiers: a client, an application server, and a database server. The client and application server were written in Java and the database relied on Oracle technology. It was about as far away from open source as you could get. The product was successful though and we did hundreds of installs all across North America. What the product basically did for its owners was suck in data from a central server then expose that data to the user via tables and charts visible from the user interface. At some point last year I started wondering if I could do something similar with Drupal. So I started investigating, tried out different modules and realized that I could do even more with a Drupal install than we did with that product.

There are four tasks that need to be accomplished if you want to use Drupal as an effective data platform.

  1. Capture data to the Drupal database tables
  2. Setting the data types (text, numeric, etc.)
  3. Expose the data to the user using tables and charts
  4. Offer simplified download format options

 Capturing Data

I have been capturing data using the Table Wizard module. Table Wizard allows you to create a database table by uploading a delimited (like tab or comma separated) file. This works pretty well. I recently uploaded a file with over 300,000 records and it was in the database within a few minutes. After you upload your file you click on the table name to "analyze" the table. Table Wizard will let you know what columns you have in the table, the data types and identify the primary key. When you first upload your data you will want to go switch over to your database to change data types of the appropriate ones and identify a primary key. Once you have done that you can come back and re-analyze the table. 

Setting the Data Types

This is probably the step that will be the least welcome to those who are not familiar with databases. If you want to do extensive work with data in Drupal then you will want to become familiar enough with phpMyAdmin and MySQL data types to be able to navigate to your database and apply the appropriate settings for each type of data. You should also indicate which field contains the primary key for the table. The primary key is a unique value that identifies each record in a table. If you have a dataset that does not have a column with unique values then consider adding a column unique numeric code in each row. You can name the column something like record_id and then add values like 100001, 100002, 100003 and so on.

Expose the Data Using Tables and Charts

If you have been a good boy or girl and added a single field primary key then the Table Wizard module will let you check a box that is labeled, "Provide default view." This option automatically creates a view that you can expand upon as you see fit. Auto created table wizard views will have the tag "tw" added to them. You can also create a view by navigating to /build/views/add and looking for the radio button in the 'View Type' section that starts with Database table. At this point you need to a thing or two about the Views module to go further. There's not enough space to cover that here so I'm going to skip over the details and point you to the key modules and settings.

The standard views module will let you create a table from your data. You just have to choose a Page display and choose the Table style. You can then choose your fields which will appear as columns within the table. Give the page a Path and perhaps add an Exposed Filter or two to allow users to narrow down the data they are looking at. If you want to get a little fancy you can choose use Views Calc to create tables that include calculations like min, max or average on a set of numeric data.

Charts are a little more tricky but it can be done with the assistance of Charts and Graphs and Views Charts. Charts and Graphs allows you to integrate a number of different free charting solutions (including Google Charts) into your site. View Charts makes those features available as a Style (called "Charts") within the Views interface. The chart gets exposed on a page based on the Path that you define. You can also create a chart within the Block display type. There are other charting modules that exist for Drupal. All of them have some drawbacks but so far the Charts and Graphs/Views Charts combo has worked the best for me.

Offering A Download Option

The ability to download data probably won't be a must have feature for every site. After all someone could probably just copy the table and paste it into their favorite spreadsheet program. It's a nice feature to offer though and you can do it pretty easily with the help of the Views Bonus Pack module. With Views Bonus Pack installed you get the option to add a Feed display within the Views interface. With the Feed display selected you then get the option of setting a Style that includes the option for CSV and other popular formats like TXT, DOC, XLS and XML. You can then attach that display to the Table display and there will be an image that appears at the bottom of the table page that will generate a file download when clicked. Be sure to add the same filters to the CSV display that you have added to the Table display so the downloaded file reflects what the user is seeing with any filters selected. 

The usual caveats apply to the ideas and tips that I've offered here. At any time a better module could come along or an existing module could change and alter the process a bit. So tread lightly and do some exploration before you commit to building a data driven site using Drupal. In fact, even though there was a very recent release of Table Wizard the development is being deprecated in favor of the Data module. I'll continue to upload using Table Wizard for now since I think it works well but also plan on giving the Data module a shot in the near future. 

As always I'd love to hear your thoughts and tips on modules and workflows in the comments. If you have built or know of a good data access website built using Drupal feel free to share those links as well.

Video Links

YouTube Version

Flash Version

Quicktime Version

Sep 14 2009
Sep 14

I had to move 264 databases from one Linux server (Fedora) to another (CentOS), hard copying /var/lib/mysql wasn't a good idea because version of two MySQL databases were different and i had to repair all the tables after move so i decided to write a very short and usefull bash script.

It dumps all the databases but each database into a different file :

for I in `echo "show databases;" | mysql -u myuser --password="" | grep -v Database`; do mysqldump -u myuser --password="" $I > "$I.sql"; done

As you can see you can put each .sql file into a different folder if you want, and it's possible to compress each file using piping :)

Apr 21 2006
Apr 21

This article explains a practical implementation of a technique outlined in the article "Sharing Drupal tables between databases using MySQL5 Views".

Problem

You have multiple (multisite) Drupal sites and you would like to manage the content for all of these sites through a single interface. Depending on the nature of a given piece of content, you may want the content published on one, several or all of your subsites, but you do not want to have to create copies of the same content for each site.

Solution

Taxonomy plus MySQL5 views. (NOTE: this solution will not work with versions of MySQL prior to 5.)

Assumming you have your subsites properly set up and running, the first step is to create a special vocabulary which you will use to target content.

Go to [your site's baseurl]/admin/taxonomy/add/vocabulary and create a vocabulary. We'll call it simply "sites".

Next, go back to your taxonomy page (/admin/taxonomy) and select "edit vocabulary" for the "sites" vocabulary.

Add a name for each of the subsites you would like to manage. For our example, we'll have two subsites, "foo" and "bar", and one master site, "master".

Now add at least three pieces of test content. Target one piece of content for each of foo, bar and both.

Next, we're going to create a node view for each of our subsites that we'll use to replace the actual node table.

The SQL is as follows:

CREATE VIEW [subsite, eg. "foo"]_node AS SELECT n.* FROM node n, term_data td, term_node tn, vocabulary v WHERE v.name = '[vocabulary name, eg. "sites"]' AND td.vid = v.vid AND td.name = '[subsite vocab term, eg. "foo"]' AND td.tid = tn.tid AND n.nid = tn.nid ;

Because the terms that serve as our subsite labels may very well exist within other vocabularies, we also need to join on the vocabulary table to ensure our solution works reliabley.

Finally, we need to have our subsites use the views we have created instead of our master nodes table, which only the "master" site will have access to directly.

In your drupal's sites directory, you should have directories that correspond to each of your drupal sites (both master and subsites). Edit the settings.php file for each of your subsites, and use the db_prefix variable to point the site to your view. So sites/foo.example.com/settings.php would contain the following:

$db_prefix = array( 'node' => 'foo_', );

At this point, you'll want to disable creation of content from within each of your subsites. You can do this in the from the admin/access page. If you attempt to create content from within the subsites, you'll likely get a 'duplicate key' error.

I hope that explanation is clear. These articles are written rather hastily. If you questions or suggestions regarding this solution, please leave a comment.

Mar 24 2006
Mar 24

Problem:

You have a "master" database that contains data from multiple Drupal sites and you want to share it among them. Normally, you could use table prefixing to allow each of your sites to point to a single table. But what if you do not want content from one site to "bleed" across to the other sites? Let's say you have a network of Drupal sites sharing a user database. You want to share that user's information across your entire network of sites, but only make the information visible from those sites to which the user has subscribed. Or maybe you want to populate baz.com with users who meet some arbitrary criteria. There are lots of possibilities here, but nobody's paying me to write this, so let's get on with it.

Solution:

Use MySQL5 Views. For our example, we'll use the users table. Our "master" users table (the table that contains all users for all of our sites) resides in a database called "master". The database of the our example site that will have restricted access to our masters users table is called "banana". Assuming you're starting with a fresh instance of the Drupal schema in your database "banana", do this:

use banana;
drop table users;
CREATE VIEW users AS
SELECT *
FROM master.users
WHERE uid IN (
SELECT uid
FROM somedatabase.sometable
WHERE uid = 0
OR label = 'foo'
)

Drupal will use banana.users just as it would a normal users table. No other modifications are necessary. Now only "foo" users will be included in the users table for your banana website. Note: Drupal has a dependency that is not really documented. Every users table must have an entry that contains uid=0. It's a "stub" entry that Drupal needs to function properly when a user is anonymous. A workaround for this dependency is to include "user 0" in the results set that defines your view. See, wasn't that easy? Please post questions here, and I'll update these instructions as needed.

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web