Nov 12 2021

A couple of years ago I was asked to take a look at a poorly performing Drupal 7 site, where a colleague had spotted a strange function call in an Application Performance Management (APM) system.

The APM traces we were looking at included a __lambda_func, under which was a class called Ratel. Under those were some apparent external calls to some dodgy-looking domains.

One of my very excellent colleagues had done some digging and found some more details about the domains which confirmed their apparent dodginess.

They had also come across a GitHub gist which looked relevant - it had the PHP source code for a Ratel class which appeared to be an SEO spam injection tool:

https://gist.github.com/isholgueras/b373c73fa1fba1e604124d48a7559436

This gist included encoded versions of the dodgy URLs we'd seen when trying to analyse what was slowing the site down.

However it wasn't immediately obvious how this code was running within the infected Drupal site.

We'd grepped the file system and not found any signs of this compromise. One trick that's sometimes useful is to search a recent database dump.

Doing so turned up a reference to the Ratel class within the cache tables, but when we took a closer look inside the cache there wasn't much more info to go on:

$ drush ev 'print_r(cache_get("lookup_cache", "cache_bootstrap"));'
stdClass Object
(
    [cid] => lookup_cache
    [data] => Array
        (
 
[...snip...]
 
            [cRatel] => 
            [iRatel] => 
            [tRatel] => 

So this was more evidence that the malicious code had been injected into Drupal, but didn't tell us how.

I took a closer look at the malicious source code and noticed something it was doing to try and hide from logged in users:

  if (function_exists('is_user_logged_in')) {
    if (is_user_logged_in()) {
      return FALSE;
    }
  }

Being so used to reading Drupal code, I'd initially assumed this was a Drupal API call.

However, on closer inspection I realised it's actually a very similarly named WordPress function.

That meant that the function almost certainly would not exist in this Drupal site, and that gave me a way to hook into the malicious code and find out more about how it had got into this site.

I temporarily added a definition for this function to the site's settings.php within which I output some backtrace information to a static file - something like this:

function is_user_logged_in() {
  $debug = debug_backtrace();
  file_put_contents('/tmp/debug.txt', print_r($debug, TRUE), FILE_APPEND);
  return FALSE;
}

This quickly yielded some useful info - along the lines of:

$ cat /tmp/debug.txt
Array
(
    [0] => Array
        (
            [file] => /path/to/drupal/sites/default/files/a.jpg(9) : runtime-created function
            [line] => 1
            [function] => is_user_logged_in
            [args] => Array
                (
                )
 
        )
 
    [1] => Array
        (
            [file] => /path/to/drupal/sites/default/files/a.jpg
            [line] => 10
            [function] => __lambda_func
            [args] => Array
                (
                )
 
        )
 
    [2] => Array
        (
            [file] => /path/to/drupal/includes/bootstrap.inc
            [line] => 2524
            [args] => Array
                (
                    [0] => /path/to/drupal/sites/default/files/a.jpg
                )
 
            [function] => require_once
        )

Wow, so it looked like the malicious code was hiding inside a fake jpg file in the site's files directory.

Having a look at the fake image, it did indeed contain code very similar to what we'd been looking at in the gist, albeit further wrapped in obfuscation.

$ file sites/default/files/a.jpg    
sites/default/files/a.jpg: PHP script, ASCII text, with very long lines, with CRLF line terminators

The malicious Ratel code had been encoded and serialized, and the fake image file was turning that obfuscated string back into executable code and creating a dynamic function from it:

$serialized = '** LONG STRING OF OBFUSCATED CODE **';
$rawData = array_map("base64_decode", unserialize($serialized));
$rawData = implode($rawData);
$outputData = create_function(false, $rawData);
call_user_func($outputData);

That's where the lambda function we'd been seeing had come from.
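When analysing a file like this, it helps to unwrap the payload without running it. Here's a minimal sketch, assuming the same "serialized array of base64 chunks" wrapping as the snippet above. The decode_payload name and the sample data are hypothetical, and a real sample may well have further layers of obfuscation:

```php
<?php
// Hypothetical helper: unwrap a serialized array of base64 chunks into
// source text WITHOUT executing it (no eval, no create_function).
function decode_payload(string $serialized): string {
  // allowed_classes => false stops unserialize() from instantiating objects.
  $chunks = unserialize($serialized, ['allowed_classes' => false]);
  if (!is_array($chunks)) {
    throw new InvalidArgumentException('Expected a serialized array.');
  }
  $decoded = '';
  foreach ($chunks as $chunk) {
    // Strict mode rejects characters outside the base64 alphabet.
    $part = base64_decode((string) $chunk, true);
    if ($part === false) {
      throw new InvalidArgumentException('Chunk is not valid base64.');
    }
    $decoded .= $part;
  }
  return $decoded;
}

// Harmless stand-in for the real obfuscated string.
$sample = serialize(array_map('base64_encode', ['echo "hel', 'lo";']));
print decode_payload($sample) . "\n"; // echo "hello";
```

Printing the decoded text rather than passing it to create_function() means you can read what the code would have done from a safe distance.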

The final piece of the puzzle was how this fake image file was actually being executed during the Drupal bootstrap.

The backtrace we'd extracted gave us the answer; the require_once call on line 2524 of bootstrap.inc was this:

2523         case DRUPAL_BOOTSTRAP_SESSION:
2524           require_once DRUPAL_ROOT . '/' . variable_get('session_inc', 'includes/session.inc');
2525           drupal_session_initialize();
2526           break;

So the attacker had managed to inject the path to their fake image into the session_inc Drupal variable.

This was further confirmed by the fact that the malicious code in the fake image actually included the real Drupal session code itself, so as not to interfere with Drupal's normal operation.

require_once('includes/session.inc');

So although the Ratel class had perhaps initially been put together with WordPress in mind, the attacker had tailored the exploit very specifically to Drupal 7.

Drupal has a mechanism to disallow uploaded files from being executed as PHP but that didn't help in this case as the code was being included from within Drupal itself.

At some point there must have been something like a Remote Code Execution or SQL Injection vulnerability on this site which allowed the attacker to inject their variable into the database.

It's possible that was one of the notorious Drupal vulnerabilities often referred to as Drupalgeddon 1 and 2, but we don't know for sure. We believe that the site was most likely infected while at a previous host.

On the other hand, perhaps it was as simple as a poorly protected phpMyAdmin or something similar which allowed the attacker to manipulate the variables table.

This technique doesn't represent a vulnerability in itself, as the attacker needed to be able to upload the fake image and (most importantly) inject their malicious variable into the site.

It was, however, quite an interesting technique for achieving persistence within the Drupal site.

Once we'd uncovered all of these details, cleaning up the infection was as simple as deleting the injected variable and removing the malicious fake image file.

What could the site have done to defend itself against this attack?

Well, the injection of the variable may have been via an exploit of an unpatched vulnerability on the site. Keeping up-to-date with patches from the Drupal Security Team is always advisable.

I'd certainly recommend against having a tool like phpMyAdmin publicly accessible (although we don't know for sure that's what had happened in this case).

Other than that, something like the mimedetect module might have been able to prevent the upload of the fake image file. Note that newer versions of Drupal have this capability built-in.

A manual review of the variables in the site's database could have caught this. There are a handful of variables that provide "pluggability" in D7, but session_inc is probably one of the most attractive from an attacker's point of view because, unlike some of the others, it's typically invoked on most bootstraps:

drupal-7.x$ grep -orh "variable_get.*\.inc')" includes modules | sort | uniq
 
variable_get('lock_inc', 'includes/lock.inc')
variable_get('menu_inc', 'includes/menu.inc')
variable_get('password_inc', 'includes/password.inc')
variable_get('path_inc', 'includes/path.inc')
variable_get('session_inc', 'includes/session.inc')

A simple drush command can show whether any of these variables are set:

$ drush vget _inc
No matching variable found.
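The same review can be sketched in code. This assumes you already have the site's variables as a name => value array (in D7 the values in the variable table are stored serialized, so they'd need unserializing first). The find_overridden_inc_variables name and the sample data are hypothetical:

```php
<?php
// The pluggable include variables and their core defaults, as listed above.
const PLUGGABLE_INC_DEFAULTS = [
  'lock_inc' => 'includes/lock.inc',
  'menu_inc' => 'includes/menu.inc',
  'password_inc' => 'includes/password.inc',
  'path_inc' => 'includes/path.inc',
  'session_inc' => 'includes/session.inc',
];

// Hypothetical helper: $variables maps variable name => unserialized value.
// Returns every pluggable include variable that differs from its default.
function find_overridden_inc_variables(array $variables): array {
  $overridden = [];
  foreach (PLUGGABLE_INC_DEFAULTS as $name => $default) {
    if (array_key_exists($name, $variables) && $variables[$name] !== $default) {
      $overridden[$name] = $variables[$name];
    }
  }
  return $overridden;
}

// Sample data modelled on the compromised site described above.
$variables = [
  'site_name' => 'Example',
  'session_inc' => 'sites/default/files/a.jpg',
];
print_r(find_overridden_inc_variables($variables));
```

An override isn't necessarily malicious, as some modules legitimately swap in their own include files, but anything pointing into the files directory deserves a very close look.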

Once we knew what had happened to the site we found a couple of references online to similar exploits:

Oct 14 2021

My colleagues and I in the Drupal Security Team recently became aware of a Zero Day RCE vulnerability in Ghostscript. This was later assigned CVE-2021-3781.

At least one viable Proof of Concept (PoC) was made public not long after the Zero Day which illustrated Scalable Vector Graphics (SVG) handling in Imagemagick being used as an attack vector.

Drupal core doesn't use Ghostscript directly, but it's fairly common for Drupal sites to use Imagemagick in some form.

As such, we began to look at how an attacker might try to exploit the Ghostscript vulnerability via SVG and Imagemagick on a Drupal site.

Our goal in such an investigation is to determine whether it would be sufficiently easy, with a common Drupal configuration, that we ought to issue a Public Security Announcement (PSA) warning Drupal users and providing any mitigation steps they might be able to take until an upstream fix was available.

Here's a quick write-up of some of the investigation I did.

We'd determined that SVG is not in the default list of permitted image extensions in Drupal.

However, the PoC write up showed Imagemagick being tricked into parsing an SVG with a fake jpg extension.

I verified that in Drupal 9 the built-in file type detection prevented a malicious SVG from being smuggled into an upload with a permitted file extension.

Drupal 7 core by itself doesn't have this protection, although modules are available that add e.g. mime-type sniffing such as https://www.drupal.org/project/mimedetect.
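To illustrate the general idea of content sniffing (this is not how Drupal 9 or mimedetect actually implement it), here's a minimal sketch that compares a file's leading "magic bytes" with its claimed extension. The function names are hypothetical and only three image formats are covered:

```php
<?php
// Hypothetical helper: recognise a few image formats by their leading bytes.
function sniff_image_type(string $bytes): ?string {
  if (strncmp($bytes, "\xFF\xD8\xFF", 3) === 0) {
    return 'jpg';
  }
  if (strncmp($bytes, "\x89PNG\r\n\x1A\n", 8) === 0) {
    return 'png';
  }
  if (strncmp($bytes, 'GIF87a', 6) === 0 || strncmp($bytes, 'GIF89a', 6) === 0) {
    return 'gif';
  }
  return null;
}

// Hypothetical helper: does the content match what the extension claims?
function extension_matches_content(string $filename, string $bytes): bool {
  $ext = strtolower(pathinfo($filename, PATHINFO_EXTENSION));
  if ($ext === 'jpeg') {
    $ext = 'jpg';
  }
  return sniff_image_type($bytes) === $ext;
}

$svg = '<svg xmlns="http://www.w3.org/2000/svg"></svg>';
var_dump(extension_matches_content('avatar.jpg', $svg));                         // bool(false)
var_dump(extension_matches_content('avatar.jpg', "\xFF\xD8\xFF\xE0" . 'data')); // bool(true)
```

A check along these lines rejects an SVG wearing a jpg extension before any image toolkit gets to see it.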

How might an attacker try to take advantage of this?

Drupal core has built-in support for "image styles" which can perform preset transformations on uploaded images. For example, a thumbnail image is often prepared to show in previews of articles.

So a Drupal site which uses Imagemagick to handle image processing (GD is the default in core but alternative "image toolkits" are available) might be exploited if an attacker could upload a malicious SVG and have the site try to perform an image manipulation on this, such as resizing it to prepare a thumbnail. This sort of functionality is quite common - for example - when a newly registered user uploads a profile picture. That would make a good target for an attacker.

I ran through what happens if you smuggle a malicious SVG masquerading as a permitted image type into a D7 upload field. As Drupal tries to record the metadata about the image, it makes a series of API calls which end up invoking image_get_info() which in turn calls image_toolkit_invoke('get_info', $image).

At that point both the default GD and the alternative Imagemagick toolkits call PHP's built in getimagesize() on the image. If the image in question is actually an SVG, this will typically return FALSE.

That means that Drupal will not attempt to perform a transformation on the image - for example to create a thumbnail - because it has not been able to derive the image's metadata including the dimensions of the original.
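The gatekeeping effect of getimagesize() is easy to try outside Drupal. This is a minimal sketch with a hypothetical can_derive_dimensions helper standing in for the toolkit logic. What PHP returns for SVG content may depend on the PHP version and build, so the SVG result is printed rather than assumed:

```php
<?php
// Hypothetical stand-in for the check described above: only consider
// creating a derivative if getimagesize() could read the dimensions.
function can_derive_dimensions(string $path): bool {
  // @ suppresses notices PHP may raise for unrecognised data.
  $info = @getimagesize($path);
  return is_array($info) && $info[0] > 0 && $info[1] > 0;
}

$dir = sys_get_temp_dir();

// A minimal GIF header declaring a 1x1 image.
$gif = tempnam($dir, 'img');
file_put_contents($gif, "GIF89a\x01\x00\x01\x00\x00\x00\x00");

// SVG content; whatever the file is called, it is still XML text.
$svg = tempnam($dir, 'img');
file_put_contents($svg, '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>');

$gif_ok = can_derive_dimensions($gif);
$svg_ok = can_derive_dimensions($svg);
var_dump($gif_ok);
var_dump($svg_ok);

unlink($gif);
unlink($svg);
```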

Even on a site which explicitly supports SVG uploads (which you might argue would be quite unusual for e.g. user profile pics), the nature of SVGs means that Drupal's default behaviour of deriving image metadata and then deciding whether to perform image transformations doesn't work as it does for e.g. jpg and png files.

At least one popular SVG contrib module tries to derive an SVG image's dimensions by parsing it as XML, but the malicious files produced by the PoC are not correctly formed for this.

It doesn't really make sense to try and run an SVG through Imagemagick's convert to create a thumbnail or other derivative of a different size - that's sort of the whole point of Scalable Vector Graphics.

This investigation had focused on Drupal 7, but it looked like Drupal 9 would be - if anything - better protected because of its built-in file type detection.

One popular configuration to support SVGs in D8/9 uses a vendored library to validate the XML, and this rejected the PoC's malicious SVG during upload validation.

There are a handful of different ways (as is often the case in Drupal) to set up SVG support in Drupal 9, but in one discussion about SVG support, phenaproxima (one of the media module's developers) stated:

The Media module itself has no opinion about SVG images. The Image module, on the other hand, doesn't normally allow SVG to be uploaded into any image field, due to the various security and integration issues (i.e., image styles).

So it really didn't look great for our potential attacker trying to exploit Drupal's image styles (automatic image transformations) via SVG and an Imagemagick toolkit to take advantage of the Ghostscript Zero Day.

As it looked like it wouldn't be easy to exploit this in what we'd consider a common configuration of Drupal, we decided against issuing a PSA.

This is not to say it's inconceivable that the vulnerability could be exploited on a Drupal site out there somewhere, but it would likely be considered an "edge case".

It's also possible that I got some of this wrong. If you know of any significant details I missed which mean that exploitation might be easier or more likely than my analysis suggested, please contact me or the Drupal Security Team privately in the first instance. We will credit any researchers who provide information that leads to us issuing a Security Advisory.

Thanks to Greg Knaddison (greggles) for (suggesting and) reviewing this post.

Mar 28 2018

This was originally posted on the dev.acquia.com blog.

Easy-to-guess passwords are all too often the means by which intruders gain unauthorised access. It's useful to be able to audit the passwords in use on your site - especially for user accounts with administrative privileges.

Ideally your Drupal site should have a robust (but user friendly) password policy (see my previous post: Password Policies and Drupal). However, this is not always possible.

The problem with checking your users' passwords is that Drupal doesn't actually know what they are; rather than storing the plaintext password, a cryptographic (salted) hash is stored in the database. When a user logs in, Drupal runs the supplied password through its hashing algorithm and compares the result with the hash stored in the database. If they match, the user has successfully authenticated themselves.

The idea is that even if the hashes stored in the database are compromised somehow, it should be very difficult (if not infeasible) to derive the original passwords from them.
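As an illustration of salted hashing in general, here's a short sketch using PHP's built-in password_hash() and password_verify() rather than Drupal's own hashing code. The sample password is made up, and the bcrypt cost is set to its minimum purely to keep the demo quick:

```php
<?php
// Illustration only: PHP's built-in functions, not Drupal's hashing code.
$options = ['cost' => 4]; // lowest bcrypt cost, just to keep the demo fast

$hash_a = password_hash('correct horse', PASSWORD_BCRYPT, $options);
$hash_b = password_hash('correct horse', PASSWORD_BCRYPT, $options);

// Same password, different random salts, so the stored hashes differ.
var_dump($hash_a === $hash_b); // bool(false)

// Login check: re-hash the supplied password using the salt embedded in the
// stored hash, then compare the results.
var_dump(password_verify('correct horse', $hash_a)); // bool(true)
var_dump(password_verify('wrong guess', $hash_a));   // bool(false)
```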

So how can a site check whether users have chosen bad passwords?

One method is to check the password against a repository of known-compromised passwords while we (briefly) have it in plaintext; that is, when the user has just submitted it. That's how the Password Have I Been Pwned? module works.

However, if you wish to conduct an audit of many passwords in your system, it's not very convenient to have to wait for users to type those passwords in. It would be better to be able to check the hashes.

Tools such as John the Ripper (John) take a list of possible passwords (usually referred to as the wordlist) and compare each against stored hashes. John supports Drupal hashes, but to use it you need to take the hashes from the database and put them in a text file. That's not convenient, and may introduce unwanted risks; a text file containing password hashes should itself be treated as very sensitive information.
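The core loop of a wordlist check is simple enough to sketch. This is not John's implementation; the audit_hashes name and the sample accounts are hypothetical, and password_verify() stands in for whatever routine matches the stored hash format:

```php
<?php
// Hypothetical helper: try each wordlist candidate against each stored hash.
// $hashes maps username => stored hash; $verify takes (candidate, hash).
function audit_hashes(array $hashes, array $wordlist, callable $verify): array {
  $matches = [];
  foreach ($hashes as $name => $hash) {
    foreach ($wordlist as $candidate) {
      if ($verify($candidate, $hash)) {
        $matches[$name] = $candidate;
        break; // first hit is enough for this account
      }
    }
  }
  return $matches;
}

$options = ['cost' => 4]; // minimum bcrypt cost, to keep the demo fast
$hashes = [
  'fred' => password_hash('qwerty', PASSWORD_BCRYPT, $options),
  'sally' => password_hash('a much longer passphrase', PASSWORD_BCRYPT, $options),
];
$wordlist = ['123456', 'password', 'qwerty'];

print_r(audit_hashes($hashes, $wordlist, 'password_verify'));
```

Note that the cost is users multiplied by candidates multiplied by the (deliberately slow) hash function, which is why these audits are usually run against a short list of the very worst passwords.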

Drop the Ripper

Another option is the drush module Drop the Ripper, which is inspired by John. Drop the Ripper (DtR, which I created and maintain) comes with a default wordlist (curated by John the Ripper's maintainers) and uses Drupal's own code to check the hashes stored in the database.

It's fairly safe to use on production sites; it does not need to be installed as a module, but it will use some resources if you are running a lot of checks.

The default options have DtR check the passwords for all users on the site against the top 25 "bad passwords" in the wordlist (along with a few basic guesses based on the user's details). Here's an example of that:

$ drush dtr
Match: uid=2 name=fred status=1 password=qwerty              [success]
Match: uid=4 name=marvin status=1 password=123456            [success]
Ran 65 password checks for 4 users in 2.68 seconds.          [success]

In that case, two of the users had pretty bad passwords!

You can narrow the check down by role, but roles are arbitrary in Drupal; how do you know which ones grant "administrative" privileges? There's an option to check all users with a role that includes any "restricted" permissions (those which show "Give to trusted roles only; this permission has security implications" in the admin interface). This is a good way of checking the accounts that could do serious damage if they were compromised:

$ drush dtr --restricted
Match: uid=1 name=admin status=1 password=admin              [success]  
Match: uid=3 name=sally status=1 password=password           [success]  
Match: uid=4 name=marvin status=0 password=abc123            [success]  
Ran 24 password checks for 3 users in 1.04 seconds.          [success]

You can target one or more specific users by their uid:

$ drush dtr --uid=11 --top=100
Match: uid=11 name=tom status=1 password=changeme            [success]
Ran 47 password checks for 1 users in 3.85 seconds.          [success]

This can be useful if - for example - you notice something in your logs which suggests a particular account may have been subject to a brute force login attack.

Check the command's built-in help for details of more options, and several examples.

So isn't this dangerous? Can hackers use it?

Well, you can only run DtR if you can run drush commands on a site, in which case you can already log in as any user you want (drush uli) and/or change any user's password (drush upwd). However, it should be used carefully and responsibly; you should treat the output of the command as sensitive data in itself. There is an option to hide actual passwords, but consider that if a user came up as a "Match" with the default options, we can infer that their password is very obvious or high up on the wordlist.

Keep in mind also that people have a bad habit of using the same password everywhere. If DtR reveals that an account whose username is an email address has the password "abc123", we'd hope that's not also their gmail password. But it could be.

This tool should typically be used by site admins to check that their users - especially those with administrative super powers - have chosen passwords that are not trivial for bad actors to guess. If it turns out that there are bad passwords in place, one option is to use drush to set a hard-to-guess password for the account(s) in question, and then politely suggest that they reset their password to something better.

Drop the Ripper supports both Drupal 7 and 8/9, via both Drush 8 and 11.

Mar 04 2017

This was originally posted on the dev.acquia.com blog.

People tend to choose bad passwords if they are allowed to.

By default Drupal provides some guidance about how to "make your password stronger," but there's no enforcement of any particular password policy out of the box. As usual, there's a module for that. More than one in fact.

Thinking on password policies has evolved over the years. The United States National Institute of Standards and Technology (NIST) has been working for some time on a new set of guidelines which are a good basis on which to formulate your own password policy.

Security is always a compromise between mitigating risk and convenience. A fairly recent piece of research on password strategies from Microsoft, for example, suggested that users should use simple memorable passwords for "low-risk" sites, reserving more complex passwords for sites where the risk warrants the higher effort involved. Not everybody will agree with this suggestion, but it illustrates the tradeoff.

In other words, if your site is all about users sharing gifs of their cats, you may choose to make your password policy somewhat more lenient than that for a site where users access sensitive financial or healthcare information.

The NIST guidelines emphasize user-friendliness, and point out that excessively onerous password policies often have negative effects on security in terms of users' behavior. Forcing users to change their passwords every week, for example, is likely to lead to many choosing worse passwords than they would otherwise have done.

We'll now go through the main points of the NIST guidelines and look at how they relate to Drupal. What modules and/or configuration can be used to implement a policy based on these guidelines?

Size: Minimum 8 characters, maximum length of at least 64

Drupal password policy allows you to set a minimum length for all passwords.

As (any current version of) Drupal does salting and hashing of passwords, there's effectively no constraint on the maximum length imposed by storage in the database. However, there are practical limits imposed by the Form API. The password form input will typically have a maxlength of 128 characters (based on the database schema, although as noted it's not the cleartext password which will go into the database).

It's possible to set passwords longer than 128 characters (e.g. with drush) but users won't actually be able to submit these passwords through Drupal's forms to login. It would also be possible to increase that 128-character limit imposed by the combination of the database schema and the Form API, if that was a strict requirement.

Composition:

Do allow all printable ASCII characters, including spaces, and accept all Unicode characters too, including emoji. Do not prescribe composition rules (e.g. at least 2 of numbers, lower and upper case etc.)

Drupal has had fairly decent Unicode support for a long time (e.g. https://www.drupal.org/node/26688), but support for most emojis (and 4-byte UTF-8 characters in general) in the database came relatively late in the Drupal 7 development cycle, and often requires some tweaking of the database settings (see: https://www.drupal.org/node/2754539). However, as a salted hash rather than the password itself is stored in the database, passwords with emojis and other Unicode characters typically work fine in Drupal even without the database support. Depending on how your PHP is set up (e.g. whether you have the mbstring extension), it may be worth testing this.

The Password Policy module allows quite complex composition rules to be set up for passwords, but the NIST guidelines reflect substantial research which suggests these often do more harm than good.

The Password Strength module "provides realistic password strength measurement and server-side enforcement," which is a user-friendly alternative to prescriptive composition rules.

The NIST guidelines suggest that spaces are allowed in passwords, which arguably makes for more user friendly policies when it comes to pass phrases. Drupal will allow spaces in passwords out-of-the-box.
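A length-only check in the spirit of these guidelines might look like the sketch below. The password_length_ok name is hypothetical, and the 8 and 128 defaults simply echo the figures discussed above. It counts Unicode code points using PCRE's /u flag, so it doesn't depend on mbstring:

```php
<?php
// Hypothetical length-only check: no composition rules, and spaces and
// Unicode (including emoji) are allowed.
function password_length_ok(string $password, int $min = 8, int $max = 128): bool {
  // With the /u flag each match of "." is one Unicode code point;
  // preg_match_all() returns false if the input is not valid UTF-8.
  $length = preg_match_all('/./su', $password);
  if ($length === false) {
    return false;
  }
  return $length >= $min && $length <= $max;
}

var_dump(password_length_ok('correct horse battery staple')); // bool(true)
var_dump(password_length_ok('short'));                        // bool(false)
var_dump(password_length_ok(str_repeat("\u{1F431}", 8)));     // bool(true): 8 emoji, 32 bytes
```

Counting code points rather than bytes matters here: a byte-based strlen() would report 32 for the eight cat emoji.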

Screen for known bad passwords

The Drupal 7 version of Password Policy module has (fairly simple) support for a blacklist rule. This blacklist can be populated with a dictionary of known-bad passwords, for example this list of the 500 worst passwords, or many alternatives. See the Openwall wordlists collection for more details.

Blacklist functionality is being worked on for the Drupal 8 version: https://www.drupal.org/node/2678578

Correct Horse Battery Staple

This is a relatively complex subject. Although very simple phrases based on only a few dictionary words may not make great passwords (see: Bad Passwords Are Not Fun and Cracking The 12+ Character Password Barrier, Literally), it's perfectly possible to create good high entropy passwords using only dictionary words (see The Diceware Passphrase Home Page), especially given a sufficiently high maximum password length.

The dictionary checking/blacklisting capabilities of the Password Policy module are fairly basic. It would almost certainly not be a good idea to use it to check passwords against a wordlist of ordinary dictionary words.

Avoid using hints or reminders

Drupal doesn't implement password hints or security questions out-of-the-box. There are contrib modules such as: Security Questions.

However the NIST guidelines recommend against using these. There are many recent examples where security questions with fairly easy to guess or research answers have been used to compromise user accounts.

Although not strictly the same issue: Username Enumeration Prevention is worth a mention here. It aims to make it more difficult for potential attackers to find usernames which they can then use to attempt to login.

Use TFA / MFA

Two Factor or Multi Factor Authentication is an effective way to prevent unauthorised logins in the event of credential compromise or successful brute force attack.

Combining something you know (password) with something you have (TFA/MFA token) greatly reduces the risk of unauthorized access.

Drupal has a mature Two-factor Authentication (TFA) module available.

Plugins are available to integrate this with various libraries and services, such as Google Authenticator and SMS providers (e.g. Twilio).

A Drupal 8 version of this module is being worked on (details on the project page).
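To give a feel for what a "something you have" token computes, here's a minimal sketch of the time-based one-time password (TOTP) algorithm from RFC 6238, which is what authenticator apps typically implement. This is the generic algorithm, not the TFA module's code. The totp function name is hypothetical, the secret is the RFC's published test key, and real secrets are usually exchanged base32-encoded, which this sketch doesn't handle:

```php
<?php
// Hypothetical helper implementing RFC 6238 TOTP with HMAC-SHA1.
// $secret is the raw shared key (not base32-encoded).
function totp(string $secret, int $time, int $digits = 6, int $step = 30): string {
  $counter = intdiv($time, $step);
  // The counter is hashed as a 64-bit big-endian integer; the high 32 bits
  // are zero here, which assumes the counter fits in 32 bits.
  $message = pack('N2', 0, $counter);
  $hash = hash_hmac('sha1', $message, $secret, true);

  // Dynamic truncation: the low 4 bits of the last byte select a 4-byte
  // window, whose top bit is masked off.
  $offset = ord($hash[19]) & 0x0F;
  $code = ((ord($hash[$offset]) & 0x7F) << 24)
    | (ord($hash[$offset + 1]) << 16)
    | (ord($hash[$offset + 2]) << 8)
    | ord($hash[$offset + 3]);

  return str_pad((string) ($code % (10 ** $digits)), $digits, '0', STR_PAD_LEFT);
}

// Test vector from RFC 6238 (SHA-1, time = 59 seconds).
print totp('12345678901234567890', 59, 8) . "\n"; // 94287082
print totp('12345678901234567890', 59) . "\n";    // 287082
```

Because the code depends on both the shared secret and the current time, a stolen password alone isn't enough to log in.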

Implement rate-limiting

This is important for guarding against brute force attacks.

Drupal does rate-limiting out of the box, referred to internally as 'flood control'. However, there's not really any UI which exposes the configurations that can be tweaked. See: Flood control.

You don't need the UI module to change these configurations, but it's useful for showing you the defaults and which Drupal variables govern the flood control settings (perhaps looking at the screenshot of its admin form is sufficient). A Drupal 8 version of this module is being worked on (details on the project page).
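The idea behind this kind of rate limiting can be sketched in a few lines. Drupal's flood control records events in the database; this in-memory FloodSketch class is a hypothetical illustration of the sliding-window logic only, and the threshold and window values are made up for the example:

```php
<?php
// Hypothetical in-memory sliding-window limiter. It does not attempt to
// mirror how Drupal stores or cleans up flood events.
final class FloodSketch {
  /** @var array<string, int[]> Event timestamps keyed by identifier. */
  private $events = [];

  // Record one event (e.g. a failed login) for an identifier such as an IP.
  public function register(string $identifier, int $now): void {
    $this->events[$identifier][] = $now;
  }

  // Allowed while fewer than $threshold events fall inside the window.
  public function isAllowed(string $identifier, int $threshold, int $window, int $now): bool {
    $recent = array_filter(
      $this->events[$identifier] ?? [],
      function (int $timestamp) use ($now, $window): bool {
        return $timestamp > $now - $window;
      }
    );
    return count($recent) < $threshold;
  }
}

$flood = new FloodSketch();
$ip = '203.0.113.7'; // address from a documentation-only range
foreach ([0, 10, 20] as $offset) {
  $flood->register($ip, 1000 + $offset);
}
var_dump($flood->isAllowed($ip, 3, 3600, 1030)); // bool(false): 3 events inside the window
var_dump($flood->isAllowed($ip, 3, 3600, 5000)); // bool(true): the events have aged out
```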

There are modules which allow you to take rate limiting further, for example: Login Security.

There are also modules which integrate with firewall-level blocking of sources of potentially malicious requests, for example: Fail2ban Firewall Integration

Other considerations

Both Yahoo! and Google have recently done some work around doing away with passwords completely. There's a module for that: Passwordless.

Some sites have decided - for dubious reasons - to disallow pasting from the clipboard into password fields. This obstructs the use of password managers, and arguably goes against the NIST recommendations of being user friendly.

While sites are being built, developers should avoid using bad passwords on the assumption that somebody will replace them with proper passwords before go-live. When handing sites over or setting users up for the first time, make use of Drupal's one-time login link functionality (for example via the drush uli command).

Conclusion

Drupal's user password system is fairly capable out of the box but - as usual - there are contrib modules which enhance the functionality and allow additional options and configurations. The Drupal 8 version of the Password Policy module uses the plugin system, meaning other modules can extend the base functionality. The Password Strength module works this way in Drupal 8, for example.

NIST's guidelines emphasise user-friendliness when it comes to password policies. Some of the more prescriptive approaches can be argued to do more harm than good when it comes to encouraging users to select good passwords.

In general, password policies should try to avoid users selecting very poor passwords, but, just as important, sites should not get in the way of users trying to employ high entropy "secure" passwords.

Oct 08 2014

This is a simple trick which (unless my googlefu simply failed me) I didn't find described anywhere when I had a quick look:

$ drush ev '$file = file_load(21749); var_dump(file_delete($file, TRUE));'
bool(true)

This means all the appropriate hooks are called in file_delete so the Drupal API gods should smile on you, and you should get to see the TRUE/FALSE result reflecting success or otherwise. Note that we're passing $force=TRUE "indicating that the file should be deleted even if the file is reported as in use by the file_usage table." So be careful.

To delete multiple files you could use file_load_multiple but there's not a corresponding file_delete_multiple function, so you'd have to loop over the array of file objects.
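That loop might look something like the sketch below. The delete_files_forced name is hypothetical, and the delete callback is injectable purely so that the example can run outside Drupal with a stand-in; within a bootstrapped D7 site it falls back to file_delete():

```php
<?php
// Hypothetical helper. $files is an array of file objects keyed by fid (the
// shape file_load_multiple() returns). $delete defaults to Drupal 7's
// file_delete(), so the default only works inside a bootstrapped site.
function delete_files_forced(array $files, ?callable $delete = null): array {
  $delete = $delete ?: 'file_delete';
  $results = [];
  foreach ($files as $fid => $file) {
    // TRUE is $force, as in the one-liner above: delete even if in use.
    $results[$fid] = $delete($file, TRUE);
  }
  return $results;
}

// Stand-in for file_delete() so the sketch runs outside Drupal.
$fake_delete = function ($file, $force) {
  return $force && !empty($file->uri);
};
$files = [
  21749 => (object) ['fid' => 21749, 'uri' => 'public://a.jpg'],
  21750 => (object) ['fid' => 21750, 'uri' => ''],
];
var_dump(delete_files_forced($files, $fake_delete));
```

Within Drupal the call would be along the lines of delete_files_forced(file_load_multiple(array(21749, 21750))), where the second fid is made up for the example. The same warning about $force applies to every file in the array.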

That's all there is to this one.

Nov 15 2013

Cache invalidation is known as one of the very few hard things in computer science.

It seems to be a common misconception that Drupal's cache_get checks whether a given cache entry has expired, and won't return a stale result. In fact, in Drupal this is not always the case.

The docs for both D6 and D7 actually say that if a specific timestamp is given as the $expire parameter in a cache_set, this "Indicates that the item should be kept at least until the given time, after which it behaves like CACHE_TEMPORARY." [D6/D7]

So this does not say that cache entries will expire (i.e. cache_get will not return them) after this timestamp has passed; rather it says that "the item should be removed at the next general cache wipe."

What this actually means is that it's the responsibility of the code which does a cache_get to check whether any object that it gets back is still valid in terms of the time it should expire.

So, if you want to use Drupal's cache system in D6 or D7 to store a value for a short amount of time, but not wait for the cache entry to be cleared until "the next general cache wipe", you must check the expire timestamp on any cache object that you receive back from a cache_get.

Here's a little PHP script which illustrates this; we still get a cache object back even though it has expired:

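<?php
// NOTE: the opening lines of this script were lost from the post; the lines
// down to the cache_get() check are reconstructed from the output shown
// below and may differ from the original.
define('TEST_CACHE_LIFETIME', 10);

print "\n###\nrunning cache test at " . REQUEST_TIME . "\n";

$reset_cache = FALSE;
if ($cache = cache_get('test_cache_expiry')) {
  print 'this came from cache: ' . print_r($cache, TRUE);
  // cache_get() can hand back an entry whose expire time has already passed.
  if ($cache->expire < REQUEST_TIME) {
    $reset_cache = TRUE;
    print "cached data has expired; resetting\n";
  }
}
else {
  $reset_cache = TRUE;
}
 
if ($reset_cache) {
  print 'setting this to cache: ' . ($data = md5(rand())) . "\n";
  cache_set('test_cache_expiry', $data, 'cache', REQUEST_TIME + TEST_CACHE_LIFETIME);
}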

...and here's what happens if we run it a few times in quick succession:

$ for i in {1..8}; do drush scr cache_test.php; sleep 3; done
 
###
running cache test at 1384557409
setting this to cache: 5d9f014b374764e35220ead02102b1e7
 
###
running cache test at 1384557412
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557416
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557419
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
 
###
running cache test at 1384557422
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => 5d9f014b374764e35220ead02102b1e7
    [created] => 1384557409
    [expire] => 1384557419
    [serialized] => 0
)
cached data has expired; resetting
setting this to cache: a57b9e9734824207e0aa6d4d6a4b6973
 
###
running cache test at 1384557426
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
 
###
running cache test at 1384557429
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
 
###
running cache test at 1384557433
this came from cache: stdClass Object
(
    [cid] => test_cache_expiry
    [data] => a57b9e9734824207e0aa6d4d6a4b6973
    [created] => 1384557422
    [expire] => 1384557432
    [serialized] => 0
)
cached data has expired; resetting
setting this to cache: abbe82035a1bcaea187259f316f04309

Note that not all cache backends work the same - memcache doesn't seem to return cache entries after their expire timestamp has passed, for example.

We should assume, however, that cache_get may well hand back a cache object which has already expired, so we should always check the expire property before assuming that the cache entry is valid.
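As a rough sketch of that check (not taken from the original test script), the helper below treats an entry as valid only if it is an object and its expire timestamp has not yet passed. The function name cache_entry_is_fresh() is hypothetical, and the handling of non-positive expire values assumes Drupal 7's CACHE_PERMANENT (0) and CACHE_TEMPORARY (-1) constants:

```php
<?php
/**
 * Decides whether an object returned by cache_get() is still usable.
 *
 * Hypothetical helper for illustration; it is not part of Drupal core.
 */
function cache_entry_is_fresh($cache, $now) {
  // cache_get() returns FALSE on a cache miss.
  if (!is_object($cache)) {
    return FALSE;
  }
  // Assumption: 0 (CACHE_PERMANENT) and -1 (CACHE_TEMPORARY) are not
  // timestamps, so they are never treated as "expired" here.
  if ($cache->expire <= 0) {
    return TRUE;
  }
  // Same comparison as the test script: expired once expire < now.
  return $cache->expire >= $now;
}
```

In a Drupal 7 script this could be called along the lines of cache_entry_is_fresh(cache_get('test_cache_expiry'), REQUEST_TIME), falling back to rebuilding the data whenever it returns FALSE.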

See https://drupal.org/node/534092 for some discussion as to whether this is a bug or a feature.

Feb 07 2012

Quite some time ago I wrote a post about how patching makes you feel good in which I talked about the motivations for, and benefits of submitting patches on drupal.org (d.o). I concluded by suggesting that project maintainers should be generous in recognising the efforts of those who submit patches.

Well, now that d.o has its magnificent git infrastructure, project maintainers have even better tools for giving credit to contributors who help fix or improve the code. There is still the well-established convention for commit messages which encourages that "others [who] have contributed to the change you are committing" are credited by name. e.g.

Issue #123456 by dww, Dries: Added project release tracking.

Similar messages are often added to the project's changelog too.

The new tool that perhaps not everyone knows about yet is the ability to assign the authorship of the commit to another user e.g.

git commit --author="[username] <[username]@[uid].no-reply.drupal.org>"

This is appropriate when committing a patch that is entirely somebody else's work. Perhaps some maintainers will be generous and attribute authorship even if they've had to make a few tweaks to the patch, but somebody else did the majority of the work to identify and fix a bug, for example.

When a user is credited as the author in this way, the commit will show up on their drupal.org profile page, which I think many people will feel is a great reward for the time they spent putting a patch together.

There are however some limitations and drawbacks to the system. The committer of the patch is not rewarded by seeing their commit count incremented, which some may find a disincentive for generosity in attributing authorship.

In other cases the maintainer might simply share the credit in the commit message. Where a user was helpful by giving a detailed bug report in the issue queue, but the maintainer had to actually fix the problem themselves, for example, they're probably justified in leaving themselves as the author of the commit. Of course they can still mention the helpful user in the commit message and changelog.

There will surely also be less monochrome cases where authorship of the code being committed should be split between multiple users. As far as I'm aware, the git infrastructure on d.o doesn't cater for this situation, and messy workarounds such as breaking the commit up to split authorship have been suggested.

There are undoubtedly some limitations, and project maintainers will occasionally find themselves with tricky decisions to make. However, for the reasons I detailed in my patching makes you feel good post, I really encourage maintainers to be generous with the credit when it comes to patches which have been submitted in issue queues, and the option to set an author for a commit in git is a great way of doing so.

Mar 14 2011

I was using the brilliant context module in a project recently. The fact that it uses ctools means it has a few characteristics reminiscent of views (and panels). One of these is the import / export functionality, and the distinction between the different types of storage for the contexts you've set up - i.e.

  • normal
  • default
  • overridden

Seeing this, I was certain there must be a way of defining contexts in code in a module, similar to the way you can define default views with hook_views_api() and hook_views_default_views(). However, I really struggled to find any documentation about the correct hooks and syntax to achieve this.

Of course, one way of finding out was to use the features module to package up a context and have a look at the code it produced.

It turns out this works in a very similar way to the views module (unsurprisingly given their shared heritage). I thought I'd document it here in case other people are struggling to find some clear instructions as to how to include your own default context objects in your module.

Just like with views, you need to implement an api hook, and then the actual context_default_contexts hook which returns the exported context object(s):

/**
 * Implementation of hook_ctools_plugin_api().
 */
function mymodule_ctools_plugin_api($module, $api) {
  if ($module == "context" && $api == "context") {
    return array("version" => 3); 
  }
}

and then in mymodule.context.inc something along the lines of this:

/**
 * Implementation of hook_context_default_contexts().
 */
function mymodule_context_default_contexts() {
  $export = array();
  $context = new stdClass;
  $context->disabled = FALSE; /* Edit this to true to make a default context disabled initially */
  $context->api_version = 3;
  $context->name = 'testy';
  $context->description = 'testing context';
  $context->tag = '';
  $context->conditions = array(
    'node' => array(
      'values' => array(
        'page' => 'page',
      ), 
      'options' => array(
        'node_form' => '1',
      ),
    ),
  );
  $context->reactions = array(
    'menu' => 'admin/help',
  );  
  $context->condition_mode = 0;
 
  // Translatables
  // Included for use with string extractors like potx.
  t('testing context');
 
  $export[$context->name] = $context;
  return $export;
}

Perhaps the reason I couldn't find easy documentation for this is that it's really the ctools api doing the work - I think I'll submit an issue for the context module though to suggest at least a hint is added to the README to point developers in the right direction.
[edit]I posted a documentation issue on drupal.org[/edit]

One of the posts I did find which helped was Stella Power's interesting write up on how to use ctools to create exportables in your own module.

Feb 08 2011

In general I'm a happy vim user, but now and again I am asked why I'm using such an antiquated environment. Editor preference is of course a topic over which many long and pointless arguments have been waged - often from intractable dug-in positions of dogma. I think it's good to poke your head above the trench occasionally and see what else is available.

I suppose the other end of the spectrum to something as lightweight (albeit extensible) as vim is a full IDE such as Eclipse. There are undoubtedly many great things about the Eclipse environment, but I just can't get over the annoyance of my rhythm being interrupted as I type; it feels like it's just trying to do too much for me.

It was in this spirit that I recently gave Geany a try. It's a lightweight cross-platform editor based on GTK, with some basic IDE features.

As a Drupal developer one of the first pleasant surprises was that I didn't need to tell it to treat .module and .install files as PHP scripts - it's smart enough to see the opening tag and give me syntax highlighting etc... straight away.

It does a few simple things which make life easier - code folding if you want, a list of functions in the current file, and the facility to jump to the definition of a given function or constant, or search for the usage of whatever your cursor is over within the current document, or all open documents. It also does simple code completion, auto indentation, and has a tag library of PHP's built in functions. For PHP files, the compile button checks the syntax of the current file using php -l which is particularly useful when working on install or update hooks in Drupal.

Working with Drupal we're often not using built-in PHP functions, but rather those from the Drupal API. It's possible to get some help from Geany here too. Whilst I've seen this done with Eclipse and other IDEs before, this is one of the areas where they can become very cumbersome - for example as they try to re-index a few mb of source files for the project in the background while you're typing. The approach I'm trying out with Geany is more along the lines of using ctags with vim (or emacs).

The idea is to scan through the source code of the project (Drupal core in this case) just once, and make a list of all the function names (and constants if you like) with a few helpful details such as the argument lists. Rather annoyingly Geany uses the slightly obscure tagmanager format for its tags. The docs suggest that you can use Geany itself to generate your list of tags:

So, inside a freshly drushed copy of Drupal6 core, I tried

geany -g /tmp/drupal6.php.tags $(find . -type f \( -name '*.php' -o -name '*.module' -o -name '*.inc' -o -name '*.install' -o -name '*.engine' \))

...(using the unix find command to provide geany with a list of all the files with the common file extensions Drupal uses for PHP files).

This produced a file in tagmanager format, which geany was happy to use. This gave me auto-completion for Drupal function names, but it was missing a very useful feature - the argument lists (which are provided for PHP's built-in functions).

Luckily Geany supports an alternative tag format - pipe-delimited - which is easier to manipulate than the tagmanager format. There is a contributed tags file available for Drupal which includes basic argument lists, but from the timestamp on the file I think this is probably for Drupal 5.

So how about rolling our own? Looking at the tags file which exuberant ctags produces for Drupal core, and at what we need to provide in the basic pipe-delimited tags file, ctags does most of the work for us.

ctags output

drupal_set_message  includes/bootstrap.inc  /^function drupal_set_message($message = NULL, $type = 'status', $repeat = TRUE) {$/
drupal_set_title  includes/path.inc /^function drupal_set_title($title = NULL) {$/

geany pipe-delimited format

function_name|return_type|argument, list|

I'm sure someone could come up with an awk one-liner to do this, but I resorted to a quick php cli script.

Unfortunately ctags is not clever enough to give us the return type, which might be there in doxygen-style comments (as will better info about the arguments), but this gives us:

drupal_set_message||$message = NULL, $type = 'status', $repeat = TRUE|
drupal_set_title||$title = NULL|

...which is pretty useful as a quick reminder.
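The conversion script itself isn't reproduced here, so the following is only a minimal sketch of the idea rather than the script that was actually used. It assumes a standard ctags file in which each line carries the tag name, the file name and a /^...$/ search pattern, and the function name ctags_line_to_geany_tag() is made up for illustration:

```php
<?php
/**
 * Converts one line of ctags output into Geany's pipe-delimited format.
 *
 * Hypothetical helper. Returns NULL for lines which do not describe a PHP
 * function (such as the !_TAG_ header lines ctags writes).
 */
function ctags_line_to_geany_tag($line) {
  // The ctags search pattern looks like: /^function name(args) {$/
  $pattern = '~/\^\s*function\s+&?(\w+)\s*\((.*)\)\s*\{?\s*\$/~';
  if (!preg_match($pattern, $line, $matches)) {
    return NULL;
  }
  // The return type field stays empty, as ctags does not provide it.
  return $matches[1] . '||' . trim($matches[2]) . '|';
}

// Example with one of the lines shown above (fields are tab-separated).
$line = implode("\t", array(
  'drupal_set_title',
  'includes/path.inc',
  '/^function drupal_set_title($title = NULL) {$/;"',
  'f',
));
print ctags_line_to_geany_tag($line) . "\n";
```

Running every line of the ctags file through a function like this, and skipping the NULL results, would give a pipe-delimited file of the kind shown above. Note that ctags escapes any slashes inside the search pattern, which this sketch makes no attempt to undo.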

With the addition of the Drupal tags file, I'm finding Geany a really good compromise between the light-weight speed of something like vim whilst offering some really useful IDE-type features for Drupal development. Why not help yourself to the tags file and give it a go.

[edit]
added after some comments and feedback on IRC:
You can enable the tag files (it only really makes sense to use one of them at a time) by placing them in:

 ~/.config/geany/tags

...you will also have to rename the files to remove the underscore, as otherwise geany will refuse to use them.

If you have any problems, try launching geany from the commandline with the -v ( --verbose) option.
[/edit]

Nov 28 2010

I recently found myself trying to use drush to set up a Drupal 6 install on a server where I did not have root access. I kept getting errors along the lines of this:

Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 1234 bytes)

I didn't have permission to edit the php.ini file for php_cli, so I had to find another workaround. The solution's pretty simple and comes courtesy of the drushrc.php file.

drush looks in a list of locations for a config file, and the example.drushrc.php which you'll find inside includes in your drush directory contains the following instructions:

 * Rename this file to drushrc.php and optionally copy it to one of
 * five convenient places, listed below in order of precedence:
 *
 * 1. Drupal site folder (e.g sites/{default|example.com}/drushrc.php).
 * 2. Drupal installation root.
 * 3. In any location, as specified by the --config (-c) option.
 * 4. User's .drush folder (i.e. ~/.drush/drushrc.php).
 * 5. System wide configuration folder (e.g. /etc/drush/drushrc.php).
 * 6. Drush installation folder.

I simply followed these instructions (going for ~/.drush/drushrc.php in this case), and added the following line to the end of the file:

ini_set('memory_limit',          '128M');

You can check that this is getting picked up like so:

drush php-eval 'print ini_get("memory_limit")' ;
128M

...and the real proof of the pudding is that I'm no longer getting the memory exhausted error messages.
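For reference, the 33554432 bytes in the error message corresponds to a limit of 32M, so the new setting is four times larger. The conversion can be sketched as below; memory_limit_to_bytes() is a hypothetical helper written for this illustration, not something provided by PHP or drush:

```php
<?php
/**
 * Converts a php.ini shorthand size such as "128M" into bytes.
 *
 * Hypothetical helper for illustration only.
 */
function memory_limit_to_bytes($value) {
  $value = trim($value);
  // Casting to int keeps the leading digits, e.g. "128M" becomes 128.
  $number = (int) $value;
  switch (strtoupper(substr($value, -1))) {
    case 'G':
      return $number * 1024 * 1024 * 1024;
    case 'M':
      return $number * 1024 * 1024;
    case 'K':
      return $number * 1024;
  }
  // No suffix: the value is already in bytes (or -1 for "no limit").
  return $number;
}

print memory_limit_to_bytes('32M') . "\n";
print memory_limit_to_bytes('128M') . "\n";
```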

Oct 08 2010

I was recently asked to add a simple twitter feed to a Drupal 5 site I built a few years ago. You know the sort of thing - a block in the sidebar with the 5 most recent tweets from the organisation's twitter account. Unfortunately, this turned out to be one of those requests that sounds like it's going to be really easy, but is not (especially on Drupal 5). I came up with a solution which does the job quite nicely, but it's a bit of a hack; I wouldn't do it like this on a new site - but then of course I wouldn't be using Drupal 5.

There are several twitter modules for Drupal, as you'd expect. The twitter module does have a Drupal 5 version, but it's a minimal dev release and doesn't seem to offer much of the functionality of the other releases. The other module which sounded promising was twitter pull but this has no D5 release at all. I had a think about doing a backport, but this was supposed to be a quick favour and with the imminent release of D7 (ahem!), working on Drupal 5 code hardly seems like an investment in the future.

Then I checked, and found that twitter provides RSS feeds for users' timelines, and obviously D5 has a perfectly functional RSS aggregator module built-in. The RSS feed looks something like this:

http://twitter.com/statuses/user_timeline/30355784.rss

...so I set this up as a feed (admin/content/aggregator), and this automatically provided a block. This works okay, but the problem is that each item ends up like this:

mcdruid_co_uk: Drupal Quiz module: "action to be preformed after a user has completed Quiz: Ban IP address of current user" - seems a bit extreme, shirley?

...so we end up with the entire tweet being a link, pointing at a page displaying just that one tweet. This is not exactly what I was hoping for.

So here's the slightly nasty hack; like all well-written Drupal modules, aggregator runs its output through a theme function, which can be overridden:

function theme_aggregator_block_item($item, $feed = 0) {
  global $user;
 
  if ($user->uid && module_exists('blog') && user_access('edit own blog')) {
    if ($image = theme('image', 'misc/blog.png', t('blog it'), t('blog it'))) {
      $output .= '<div class="icon">'. l($image, 'node/add/blog', array('title' => t('Comment on this news item in your personal blog.'), 'class' => 'blog-it'), "iid=$item->iid", NULL, FALSE, TRUE) .'</div>';
    }
  }
 
  // Display the external link to the item.
  $output .= '<a href="'. check_url($item->link) .'">'. check_plain($item->title) ."</a>\n";
 
  return $output;
}

We wouldn't necessarily want to mess with the output of feeds other than this twitter one, so we need a way of identifying items from this feed. I noticed that twitter's RSS feed prepends the twitter username to each post, so we can use this to identify them.

I borrowed a function from the twitter module which deals with twitter @usernames and #hashtags. After that, running the tweet through the default input filter strips out potential nasties, and also converts URLs to links (assuming you have this switched on in your default input filter). The item includes a timestamp, so I used that to append a date/time. I didn't want the blog it stuff, so my theme code (in template.php) ended up looking like this:

define('ORGNAME_TWITTER_USERNAME',		'orgname');
 
/**
 * theme aggregator items
 */
function mytheme_aggregator_block_item($item, $feed = 0) {
  // horrible hack, but catch items which are part of the twitter feed
  $twitter_username = variable_get('orgname_twitter_username', ORGNAME_TWITTER_USERNAME);
  $tweet_prefix = "$twitter_username: ";
  if (strpos($item->title, $tweet_prefix) === 0) {
    // chop the username part from the start
    $tweet = substr($item->title, strlen($tweet_prefix));
    // @usernames
    $tweet = _mytheme_twitter_link_filter($tweet);
    // #hashtags
    $tweet = _mytheme_twitter_link_filter($tweet, '#', 'http://search.twitter.com/search?q=%23');
    // add date
    $tweet .= ' (' . format_date($item->timestamp) . ')'; 
    // filter
    $tweet = check_markup($tweet);
    $output = trim($tweet);
  }
  else {
    // default implementation
    $output = '<a href="'. check_url($item->link) .'">'. check_plain($item->title) ."</a>\n";
  }
  return $output;
}
 
/**
 * stolen from twitter module (for D6)
 *
 * This helper function converts Twitter-style @usernames and #hashtags into 
 * actual links.
 */
function _mytheme_twitter_link_filter($text, $prefix = '@', $destination = 'http://twitter.com/') {
  $matches = array(
    '/\>' . $prefix . '([a-z0-9_]{0,15})/i',
    '/^' . $prefix . '([a-z0-9_]{0,15})/i',
    '/(\s+)' . $prefix . '([a-z0-9_]{0,15})/i',
  );
  $replacements = array(
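    '><a href="' . $destination . '${1}">' . $prefix . '${1}</a>',
    '<a href="' . $destination . '${1}">' . $prefix . '${1}</a>',
    '${1}<a href="' . $destination . '${2}">' . $prefix . '${2}</a>',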
  );
  return preg_replace($matches, $replacements, $text);
}

The result is nicely filtered and formatted tweets, with @usernames, #hashtags and URLs all converted to links.

If you don't like the way the aggregator module provides the block for the feed, you can always take what aggregator outputs and manipulate it with your own module code, e.g.:

  // get the block content from aggregator module
  $block = module_invoke('aggregator', 'block', 'view', 'feed-1');
  $block_content .= $block['content'];

It'll do the job until the site's upgraded to a newer version of Drupal.

Sep 04 2010

I came across a problem when working on a Drupal user registration form which had to include an accept terms and conditions checkbox. In fact, I came across a couple of problems, but I only had to fix one myself.

At present there are some bugs in Drupal 6 core pertaining to mandatory checkboxes. Luckily, there's a module for that. I was using the handy terms of use module, which works around the mandatory checkbox problems with or without the help of the checkbox_validate module.

One problem remained though - if I submitted the registration form without ticking the mandatory terms of use checkbox, the form validation worked in that it stopped me proceeding, and I saw the You must agree with the Terms of Use to get an account error message at the top of the form, but there was no visual highlighting of the checkbox in question.

A poke around with firebug confirmed that the form API had been cajoled into doing its form_set_error thing, and the checkbox had the error class. However, styling form elements - particularly the different types of inputs - with CSS has always been a bit hit-and-miss. The 2px solid red border that the stylesheets specified should be around the checkbox was nowhere to be seen.

I checked in several other browsers (lots of screenshots included with this post) - it looked like the IEs and Opera were the only browsers which rendered the red border around the checkbox - Firefox, Chrome and Safari did not.

My workaround is to override the theme function for a checkbox, and to wrap checkboxes with the 'error' class in a span which I can then style. In my CSS I also turn off the red border for the checkbox itself, as otherwise you get double borders in the browsers which don't have this problem in the first place:

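// The function name and the lines building the input tag are assumed here:
// they follow Drupal 6 core's theme_checkbox(), with 'mytheme' as a placeholder.
function mytheme_checkbox($element) {
  _form_set_class($element, array('form-checkbox'));
  $checkbox = '<input ';
  $checkbox .= 'type="checkbox" ';
  $checkbox .= 'name="'. $element['#name'] .'" ';
  $checkbox .= 'id="'. $element['#id'] .'" ';
  $checkbox .= 'value="'. $element['#return_value'] .'" ';
  $checkbox .= $element['#value'] ? ' checked="checked" ' : ' ';
  $checkbox .= drupal_attributes($element['#attributes']) .' />';
 
  if (strpos($element['#attributes']['class'], 'error') !== false) {
    $checkbox = "<span class=\"checkbox-error\">$checkbox</span>";
  }
 
  if (!is_null($element['#title'])) {
    $checkbox = '<label class="option" for="'. $element['#id'] .'">'. $checkbox .' '. $element['#title'] .'</label>';
  }
 
  unset($element['#title']);
  return theme('form_element', $element, $checkbox);
}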

...and...

/* CSS */
.checkbox-error {
  border: 2px solid red;
}
.form-item .checkbox-error input.error {
  border: none;
}

The results are not visually perfect - the red border on my span doesn't always hug the edges of the checkbox very neatly - particularly in the browsers which don't render the red border around the checkbox, as they tend to be the ones which are rendering shadows or other fancy effects. Anyway the point is there's now something you can hang your hat on CSS-wise, and that actually shows up in all the major browsers.

Jul 27 2010

I recently undertook a migration from oscommerce to ubercart. The scope of this migration was limited to the transfer of products (and categories) - I didn't try and migrate customers and previous orders.

Here's an overview of the procedure I followed:

  • generate a CSV file of categories in oscommerce
  • import into drupal / ubercart using taxonomy_csv module
  • generate a CSV file of product data from oscommerce
  • import into drupal / ubercart using (patched) node_import module

My life was made relatively easy by the fact that although the categories in oscommerce had a hierarchical structure, it was very simple. There were only a handful of top-level categories, and the tree was only one deep. Here's what the schema for categories looks like in oscommerce:

mysql> SHOW TABLES LIKE 'categor%';
+----------------------------------+
| Tables_in_mysite_osc1 (categor%) |
+----------------------------------+
| categories                       |
| categories_description           |
+----------------------------------+
mysql> DESCRIBE categories;
+------------------+-------------+------+-----+---------+----------------+
| Field            | Type        | Null | Key | Default | Extra          |
+------------------+-------------+------+-----+---------+----------------+
| categories_id    | int(11)     | NO   | PRI | NULL    | auto_increment |
| categories_image | varchar(64) | YES  |     | NULL    |                |
| parent_id        | int(11)     | NO   | MUL | 0       |                |
| sort_order       | int(3)      | YES  |     | NULL    |                |
| date_added       | datetime    | YES  |     | NULL    |                |
| last_modified    | datetime    | YES  |     | NULL    |                |
+------------------+-------------+------+-----+---------+----------------+
mysql> DESCRIBE categories_description;
+-----------------+-------------+------+-----+---------+-------+
| Field           | Type        | Null | Key | Default | Extra |
+-----------------+-------------+------+-----+---------+-------+
| categories_id   | int(11)     | NO   | PRI | 0       |       |
| language_id     | int(11)     | NO   | PRI | 1       |       |
| categories_name | varchar(32) | NO   | MUL |         |       |
+-----------------+-------------+------+-----+---------+-------+

I had 115 categories, 5 of which were top-level, and the remaining 110 were all children of one of those 5 parents. This meant it was simple to generate the CSV for the import of the category hierarchy into drupal manually; I simply went through my top-level categories and got a list of all their children. All I wanted for the import was the category names. (n.b. I could also ignore language as this site's monolingual.) I did some simple queries to get the names of categories where the parent was one of my top-level categories (which all had a parent_id of 0), e.g.

SELECT categories_name FROM categories c
INNER JOIN categories_description cd ON c.categories_id = cd.categories_id
WHERE c.parent_id = 31;

...and prepared a CSV file, a snippet of which is below (where Empire and Europe and Colonies are top-level). I then imported my categories using taxonomy_csv, set to the mode Hierarchical tree structure or one term by line structure. node_import can also import taxonomy terms, but unless I'm mistaken it doesn't support hierarchical taxonomies.

"Empire",
,"Aden"
,"Antigua"
,"Ascension"
,"Australia"
,"B.O.I.C."
,"Bahamas"
...snip...
"Europe and Colonies",
,"Austria"
,"Baltic"
,"Benelux"
,"Eastern Europe"
,"France"
...etc...
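I put this file together by hand from the query results, but with more categories it could be generated. The following is only a sketch under that assumption: it takes an array mapping each top-level category name to its children and produces lines in the two-column layout shown above. The function name category_tree_to_csv_lines() and the sample data are purely illustrative:

```php
<?php
/**
 * Builds CSV lines with parents in column one and children in column two.
 *
 * Hypothetical helper: $tree maps each top-level category name to an array
 * of the names of its child categories.
 */
function category_tree_to_csv_lines(array $tree) {
  $lines = array();
  foreach ($tree as $parent => $children) {
    // Parent on its own line, with an empty second column.
    $lines[] = '"' . str_replace('"', '""', $parent) . '",';
    foreach ($children as $child) {
      // Child in the second column, with an empty first column.
      $lines[] = ',"' . str_replace('"', '""', $child) . '"';
    }
  }
  return $lines;
}

$tree = array(
  'Empire' => array('Aden', 'Antigua'),
  'Europe and Colonies' => array('Austria'),
);
print implode("\n", category_tree_to_csv_lines($tree)) . "\n";
```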

Next was my products. I also wanted to use a CSV file to import these into ubercart, so I had to generate a CSV file from the oscommerce database. I wrote a quick php cli script which queries the database, (optionally) grabs product images using CURL from the webserver oscommerce is running on, and outputs a nice CSV file and a folder full of product images (which need to be put in the right place on the drupal/ubercart server). Here's the script:

#!/usr/bin/php
define('IMAGE_REMOTE_PATH',      'http://shop.example.com/images/'    );
define('IMAGE_LOCAL_DIR',        '/tmp/osc_images/'                   );
 
$query = << 5656
    [products_quantity] => 1
    [products_model] => 
    [products_image] => AdenStH.jpg
    [products_price] => 10.0000
    [products_date_added] => 2009-03-16 12:34:00
    [products_last_modified] => 
    [products_date_available] => 
    [products_weight] => 0.00
    [products_status] => 1
    [products_tax_class_id] => 0
    [manufacturers_id] => 0
    [products_ordered] => 0
    [language_id] => 1
    [products_name] => N05656 - Ascension : Multifranked to St. Helena 1939
    [products_description] => St. Helena receiver dated 1941! Curiosity!!
    [products_url] => 
    [products_viewed] => 159
    [categories_id] => 409
    [categories_name] => Ascension
)
*/
 
// prepare files
$handle = fopen(CSV_OUTPUT, 'w');
 
$columns = array('sku', 'name', 'date', 'description', 'image', 'price', 'category');
fputcsv($handle, $columns);
 
 
if (GRAB_IMAGES) {
  if (!is_dir(IMAGE_LOCAL_DIR)) { 
    mkdir(IMAGE_LOCAL_DIR);
  }
}
 
while (($product = db_object($products)) && ($counter < LIMIT)) {
  //print_r($product);
  $counter ++;
 
  $sku = substr($product->products_name, 0, 6);
  $product_name = substr($product->products_name, 9);
  $image_name = clean_filename($product->products_image);
 
  $data_to_write = array(
                          $sku,
                          $product_name,
                          $product->products_date_added,
                          $product->products_description,
                          $image_name,
                          $product->products_price,
                          $product->categories_name
                        );
  $data_to_write = array_map('trim', $data_to_write);
 
  if (GRAB_IMAGES) {
    if (grab_image($product->products_image, $image_name)) {
      echo 'grabbed ' . $product->products_image . " ($image_name)\n";
    }
    else {
      echo 'failed to grab ' . $product->products_image . "\n";
    }
  }
 
  fputcsv($handle, $data_to_write);
}
 
fclose($handle);
echo "# iterated over $counter products\n### END\n";
exit;
 
/** helper functions **/
 
function db_error($message) {
  echo "db_error: $message\n" . mysql_error() . "\n";
}
 
function db_connect($db_host, $db_name, $db_user, $db_pass) {
  $db = mysql_connect($db_host, $db_user, $db_pass) or db_error('Unable to connect to database');
  mysql_select_db($db_name, $db);
  return $db;
}
 
function db_query($query, $db) {
  $result = mysql_query($query, $db) or db_error($query);
  return $result;
}
 
function db_object($result) {
  return mysql_fetch_object($result);
}
 
function grab_image($image_name, $new_name) {
  $url = IMAGE_REMOTE_PATH . $image_name;
   // use curl to grab the image from the server
  $ch = curl_init($url);
 
  curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
  curl_setopt($ch, CURLOPT_HEADER, FALSE);
 
  $data = curl_exec($ch);
  curl_close($ch);
  if (strlen($data) > 0) {
    $retval = file_put_contents(IMAGE_LOCAL_DIR . $new_name, $data);
  }
  else {
    $retval = false;
  }
  return $retval;
}
 
function clean_filename($old_name) {
  $bad_stuff =  array('.JPG', ' ', '&');
  $good_stuff = array('.jpg', '',  '-');
  $new_name = str_replace($bad_stuff, $good_stuff, $old_name);
  return $new_name;
}

You'll see my products had an SKU we'd put into the first part of the title in oscommerce - this should be easy to remove if it's not applicable. The image grabbing requires php-curl - you could just grab a whole image directory off the server running oscommerce, but I wanted to be careful to only migrate images actually being used. There are obviously many other ways of doing this.
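The script splits the title at fixed character positions, which works because every SKU here is six characters long. If that isn't true of your data, a pattern-based split may be more forgiving. This is a sketch only, assuming titles of the form "SKU - name"; split_sku_from_title() is a hypothetical function which is not part of the script above:

```php
<?php
/**
 * Splits a title such as "N05656 - Ascension : ..." into SKU and name.
 *
 * Hypothetical helper. Titles without a leading "SKU - " part are returned
 * with an empty SKU and the whole title as the name.
 */
function split_sku_from_title($title) {
  // One run of non-space characters, then " - ", then the rest.
  if (preg_match('/^(\S+)\s+-\s+(.*)$/s', $title, $matches)) {
    return array('sku' => $matches[1], 'name' => trim($matches[2]));
  }
  return array('sku' => '', 'name' => trim($title));
}

$parts = split_sku_from_title('N05656 - Ascension : Multifranked to St. Helena 1939');
print $parts['sku'] . "\n";
print $parts['name'] . "\n";
```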

I had to apply a couple of patches to node_import (rc4) to get it working for me:

...I also set the escape char to \ (when using fputcsv in my script). After all that, I was able to follow the node_import wizard, mapping fields in the CSV to fields in my ubercart product content type, after which the importer successfully digested my CSV, and all my products appeared with their images in my new ubercart site (I obviously had to put the images my script had grabbed into the right place on the ubercart webserver as well).
