Apr 01 2009

Developers are all familiar with the default behavior of the Drupal menu system's "local tasks" (aka tabs). These appear throughout most Drupal sites, primarily in the administration area, but also on other pages like the user profile.

Generally, developers are pretty good about creating logical local tasks, meaning only those menu items which logically live under another menu item (view, edit, revisions, workflow, and so on all live under the node/% menu item).

But sometimes these tabs either don't really make sense as tabs, or you simply want the flexibility of working with them as "normal menu items", i.e. the kind that appear under admin/build/menu.

I recently wanted to move some of the tabs on the user profile page (user/UID) into the main menu so that I could include them as blocks.

For some reason, developers think the user profile page is a great place to put tabs for user related pages such as friendslist, tracker, bookmarks, notifications and so on. But these types of items are less a part of the user's account information than they are resources for specific users. Personally, I would not think to look at my account information on a site to find stuff like favorites or buddies. I'd expect those items to be presented somewhere much more obvious like a navigation block.

Initially, this may seem like a trivial task. My first thought was to simply use hook_menu_alter() and change the 'type' value of the menu item from MENU_LOCAL_TASK to MENU_NORMAL_ITEM. However, for reasons I don't understand well enough to explain in detail, this does not work.

In order to achieve the desired result, you must change the path of the menu item and incorporate the '%user_uid_optional' argument, replacing the default '%user' argument. (The likely reason: a normal menu item must be able to render a link outside the context of the user page, so its wildcard needs a to_arg() callback to supply a concrete value, which '%user_uid_optional' provides by falling back to the logged-in user's uid.)
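For reference, here is roughly what that helper looks like in Drupal 6 core's user.module; it falls back to the current user's uid when the path argument is missing:

<?php
// Roughly the Drupal 6 core helper (user.module): when a menu link is
// built without an explicit uid, fall back to the logged-in user.
function user_uid_optional_to_arg($arg) {
  global $user;
  return empty($arg) || $arg == '%' ? $user->uid : $arg;
}
?>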

All very confusing, I know. Let's look at an example.

The notifications module (which provides notification of changes to subscribed content) uses the user profile page rather heavily. I don't want its links there; I want them in the sidebar where users can always see them.

<?php
/**
 * Implementation of hook_menu_alter().
 */
function MODULENAME_menu_alter(&$callbacks) {
  // NOTIFICATIONS MODULE
  $callbacks['notifications/%user_uid_optional'] = $callbacks['user/%user/notifications'];
  $callbacks['notifications/%user_uid_optional']['type'] = MENU_NORMAL_ITEM;
  unset($callbacks['user/%user/notifications']);
  <SNIP>
}
?>

So I have moved the notifications menu item into my own menu, changed its type, used %user_uid_optional instead of %user, and unset the original menu item.

This works fine except for the fact that you'll lose all of the other menu items under user/%user/notifications! You need to account for all menu items in the hierarchy to properly reproduce the tabs in the main menu system, so we add the following:

<?php
  $callbacks['notifications/%user_uid_optional/thread'] = $callbacks['user/%user/notifications/thread'];
  unset($callbacks['user/%user/notifications/thread']);

  $callbacks['notifications/%user_uid_optional/nodetype'] = $callbacks['user/%user/notifications/nodetype'];
  unset($callbacks['user/%user/notifications/nodetype']);

  $callbacks['notifications/%user_uid_optional/author'] = $callbacks['user/%user/notifications/author'];
  unset($callbacks['user/%user/notifications/author']);
?>

And of course, we don't want this code executing at all if our module is not enabled, so you'd want to wrap the whole thing in:

<?php
if (module_exists('notifications')) {
  <SNIP>
}
?>

Keep in mind that not all modules implement menu items using hook_menu(). It's becoming more and more common for developers to rely on the Views module to generate menu items, and this is a wise choice. Menus generated using Views (à la the Bookmark module) can be modified to get the desired result without any custom code.

Nov 25 2006
Nostradamus

I began the Devbee website back in March as a way to help others by way of documenting what I have learned about Drupal and also to drum up a little bit of business for myself. The content of this site is extremely targeted, and I don't ever expect to see more than a few hundred visits a day. This definitely does not reflect the expectations, or at least hopes, of most website owners. It's typically all about bringing in as many visitors as possible to generate money through advertising or purchases. Sites interested in bringing in large numbers of visitors typically do this by spending a lot of time focusing on "search engine optimization" (SEO). Absolutely nothing can drive traffic to a site like a top placement in the search results on one of the major search engines.

Back in the day (way back during the last millennium), all one needed to do was have a simple HTML page containing relevant words or phrases and he was fairly likely to make a decent showing in results pages. In fact, this is exactly how I shifted from studying literature to building websites. I built my first homepage (don't laugh!) for fun. It was found by an employer, and I got a cool job at a major search engine. Today, it is not so simple.

Fortunately for us, as Drupal users, we have a secret weapon, Drupal itself. Drupal SEO does not require any witchcraft or elaborate HTML trickery. It's simple, and in this article, I'm going to explain how I get consistent premium search placement with very little effort.

Stumbling upon Drupal SEO

Today I discovered that an article I wrote recently is the top result for the query "opcode cache" on Google. I almost feel guilty about it. There are countless pages out there with much more information on the topic than my article, yet I'm at the top. I guess I'll just have to deal with it.

This is not unusual. I find myself on "the first page" of many searches for terms relevant to my site. And when I'm not seeing a premium placement (top-ten), it's either because the search term is very broad (e.g. "Drupal") or there are simply much more relevant pages pushing my placement down. Just like the old days.

And more than half of my very modest traffic comes through these search results.

What's the Secret?

Now comes the mysterious part. I make no claims of expertise in the area of SEO. It's mostly voodoo as far as I'm concerned. The search engines are necessarily very secretive about their methods, trying to stay ahead of search engine spammers. And what works today may be detrimental tomorrow. What I'm going to describe below is entirely based on my own, very subjective, experience with various techniques and modules. These are the things that I believe are resulting in my accidental SEO success.

Drupal SEO

Drupal itself is well-known for its search-engine friendliness. Its markup is clean and standards-compliant. It creates all the tags the engines are looking for. And unlike so many other CMSs, Drupal creates search engine friendly URLs. Using Drupal is the first step in this process, but presumably you're already doing this, so let's move on.

The Right Path

Here's an example of the URL to a Joomla forum topic: http://forum.joomla.org/index.php/topic,65.0.html

And here's an example of a URL to a Drupal forum topic: http://drupal.org/drupal-5.0-beta1

Do you notice a difference? Can you tell me anything about the Joomla article without going to the page? In fact you can, sort of: you might conclude that the page covers a topic, a fact of dubious value. The URL really provides no useful information to you. Nor does it provide anything useful to a search engine. This is key. Unless you're searching for "index topic 65.0 html", this URL isn't going to help you find the information on this page.

Looking at the Drupal URL is another story. Based on that URL, one can assume that it has something to do with "drupal 5.0 beta1", and so can a search engine. If that's what you're looking for, this page will come up #1.

Most SEO "experts" agree that the search-engine-friendly URLs are critical to a page's search ranking.

Drupal allows you complete control of the path of any page. Creating short, clean, and informative paths will improve your rankings. And the Pathauto module automates the process of generating relevant paths. But be extremely careful when experimenting with Pathauto, particularly on sites with existing content. Using Pathauto without first understanding how to use it properly can result in all of the URLs on your site changing, thereby breaking existing links to your content. If you are going to introduce Pathauto on an existing site, play it safe and enable the "Create a new alias in addition to the old alias" option in Pathauto's settings. But keep in mind that having multiple URLs pointing to the same page on your site may result in a search engine penalty for "duplicate content".

Sitemaps

Sitemaps are an easy way for webmasters to inform search engines about pages on their sites that are available for crawling. In its simplest form, a Sitemap is an XML file that lists URLs for a site along with additional metadata about each URL (when it was last updated, how often it usually changes, and how important it is relative to other URLs in the site) so that search engines can more intelligently crawl the site.
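For illustration, a minimal sitemap file looks something like this (the URL and values here are made up):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://example.com/drupal-seo</loc>
    <lastmod>2006-11-25</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>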

I've seen no solid evidence that implementing a sitemap will directly improve search rankings. However, even if search engines do not use your sitemap to adjust the ranking of your pages (which I doubt), it does help them more efficiently index your site, thereby increasing the likelihood of your pages being included in search results. This one's a no-brainer.

Sitemaps would be virtually impossible to maintain by hand. And this is where the excellent XML Sitemap (formerly Google Sitemap) module comes in. Installing this module is simple, and it comes with reasonable default settings that don't require changing unless you want to fine-tune your sitemap. After you've installed and enabled this module, you'll need to tell search engines about your sitemap. At this point, I'm only familiar with Google Sitemaps, though other major companies are beginning to adopt this concept as a new open standard.

Leaving Comments

Another common method used by search engines to determine the importance of your pages is the number of other sites that link to them. A simple way to continually promote your site while helping improve your search rankings is to make regular comments on other sites like Drupal.org. Take the time to create an account on sites similar to yours and complete your public profile. Then leave useful comments where appropriate. Do not post comments simply to include a link back to your site. This is in very poor taste and may get you blocked. Instead, post comments where you have something to contribute to the topic being discussed. If you have nothing useful to add, don't post a comment. I'm a regular participant over at Drupal.org, and I'm confident this helps the "relevance" of my own site.

Page Title

By default, Drupal will use the title of your node as the page HTML title (the bit that appears in the <title></title> tags of the HTML and shows up in the title bar of your browser). This is very reasonable behavior. However, if you want to give your page that extra SEO boost, you may want to allow for two different page titles: one that appears at the top of the page in <h1> tags and another that appears in the head of the HTML document in the <title> tag. The <h1> and <title> tags are both pieces that search engines will consider when reviewing your page. If they are identical, you're missing out on an opportunity to further promote the page!

So how do you manage to control the <title> tag contents if Drupal automatically sets it based on the node title? The Page Title module does this. Install and enable this module, and you will see an additional field on the node edit form called "page title". Use this field to configure the phrase that you think will most likely attract users to the page. Use something eye-catching and alluring, something the user will feel he has to read. If you're writing about an article you found on another site, don't title the page "cool link!"; instead, try something more enticing: "Fascinating study of the Indonesian spotted tadpole". Follow that up with a relevant <h1> title: "National Geographic looks at one of nature's most mis-understood wonders".
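In (simplified, hypothetical) markup, the result is two distinct titles working for the same page:

<head>
  <!-- Set via the Page Title module; this is what search results show -->
  <title>Fascinating study of the Indonesian spotted tadpole</title>
</head>
<body>
  <!-- The node title, rendered as the on-page heading -->
  <h1>National Geographic looks at one of nature's most mis-understood wonders</h1>
</body>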

The Prophecy

Search result placement was not a top concern of mine when I built this site. But it has become a bit of an obsession now. I have no need to drive thousands of visitors seeking information on opcode caching to my site, but hitting that number one position for a query is a bit of a rush! Thanks Drupal!

Lastly, I asked myself a question as I wrote this article: Is there anything at all to what I'm saying? Well, I think there is, and I'm willing to make a bold prediction based on this belief. Within three days of posting this article, I believe it will appear in the top-ten search results for "Drupal SEO" on Google. If I'm right, that should serve as some pretty solid evidence that there's something to all this. There are currently 1,090,000 pages competing for placement in this results page. The odds of making it into the top ten by sheer luck (ten slots among 1,090,000 pages) are 1 in 109,000.

And if I'm wrong, well, I can always come back and edit out this prediction to save face %^)

The Revelation

Update: Mon Nov 27 23:19:42 2006

A search for "Drupal SEO" now shows this article as the second result out of 1,080,000 pages. I come in just below an article on Drupal.org.

So as you now see, there is not a lot of work involved in getting premium search placement if you are using Drupal. Of course, the broader your topic, the more difficult it will be to hit the top ten. While you can almost certainly hit number one for surfers searching for a certain rare antiquity, you're less likely to see much success attracting surfers hunting for the term "sex".

Nov 16 2006

Until the mid-90s, spam was a non-issue. It was exciting to get email. The web was also virtually spam-free. Netizens respected one another and everything was very pleasant. Those days are long gone. Fortunately, there are some pretty amazing tools out there for fighting email spam. I use a combination of SpamAssassin on the server side and Thunderbird (with its wonderful built-in junkmail filters) on the desktop. I am sent thousands of spam messages a day that I never see thanks to these tools.

But approximately five years ago, a new type of spam emerged which exploited not email but the web. Among this new wave of abuse, my personal favorite, comment spam.

I love getting comments on my blog. I also like reading comments on other blogs. However, it's not practical to simply allow anyone who wants to leave a comment to do so, as within a very short period of time blog comments will be overrun with spam generated by scripts that exploit sites with permissive comment privileges. To prevent this, most sites require that you log in to post a comment. But this may be too much to ask of someone who just wants to post a quick comment as they pass through. I often come across blog postings I would like to contribute to, but I simply don't bother because the site requires me to create an account (which I'd likely use only once) before posting a comment. Not worth it. Another common practice is the use of "captchas", which require a user to enter some bit of information to prove they are human and not a script. This works fairly well; however, it is still a hurdle that must be jumped before a user can post a comment. And as I've personally learned, captchas, particularly those that are image based, are prone to problems which may leave users unable to post a comment at all.

As email spam grew, there were various efforts to implement similar types of protection, requiring the sender to somehow verify he was not a spammer (typically by resending the email with some special text in the subject line). None of these solutions are around anymore because they were just plain annoying. SpamAssassin and other similar tools are now used on most mail servers. Savvy email users will typically have some sort of junkmail filter built into their email client or perhaps as part of an anti-virus package. And spam is much less of a nuisance as a result.

What we need for comment spam is a similar solution. One that works without getting in the way of the commenter or causing a lot of work for the blog owner. Turn it on, and it works. I've recently come across just such a solution for blogs which also happens to have a very nice Drupal module so you can quickly and easily put this solution to work on your own Drupal site.

Enter Akismet

It's called Akismet, and it works similarly to junkmail filters. After a comment (or virtually any piece of content) has been submitted, the Akismet module passes it to a server where it is analyzed. Content labeled as potential spam is then saved for review by the site admin and not posted to the blog.
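As a rough illustration of that round trip, here is a sketch (not the module's actual code) of checking a comment against Akismet's published comment-check endpoint from Drupal; example_akismet_is_spam() is a hypothetical helper:

<?php
/**
 * A minimal sketch of checking a comment against the Akismet REST API
 * using Drupal's HTTP client. Hypothetical helper; the real module is
 * far more thorough.
 */
function example_akismet_is_spam($api_key, $comment_text) {
  $data = 'blog=' . urlencode('http://example.com')
        . '&user_ip=' . urlencode($_SERVER['REMOTE_ADDR'])
        . '&user_agent=' . urlencode($_SERVER['HTTP_USER_AGENT'])
        . '&comment_content=' . urlencode($comment_text);
  $response = drupal_http_request(
    'http://' . $api_key . '.rest.akismet.com/1.1/comment-check',
    array('Content-Type' => 'application/x-www-form-urlencoded'),
    'POST',
    $data
  );
  // Akismet answers with the literal string "true" when it considers
  // the content spam.
  return isset($response->data) && $response->data == 'true';
}
?>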

Pricing

Akismet follows my absolute favorite pricing model. It's free for workaday Joes like me and costs money only if you're a large company that will be pumping lots of bits through the service. They realize that most small bloggers are not making any money on their sites, and they price their service accordingly. Very cool.

Installation

In order to use Akismet, you need to obtain a WordPress API key. I'm not entirely sure why, but it is free, and having a collection of API keys is fun. So get one if you have not already.

The Akismet Drupal module is appropriately named Akismet. It's not currently hosted on Drupal.org, but hopefully the author will eventually host it there as that is where most people find their Drupal modules. Instead, you will need to download the Akismet module from the author's own site. The installation process is standard. Unzip the contents into your site's modules directory, go to your admin/modules page and enable it. There is no need for additional Akismet code as all the spam checking is done on Akismet's servers.

Configuration

After installing Akismet, I was immediately impressed by how professional the module is. There were absolutely no problems after installation. Configuration options are powerful and very well explained. The spam queue is very nice and lets you quickly mark content as "ham" (i.e., not spam) and delete actual spam. As you build up a level of trust with the spam detection, you can configure the module to automatically delete spam after a period of time.

Spam filtering can be enabled on a per node type basis, allowing you to turn off filtering for node types submitted by trusted users (such as bloggers) and on for others (e.g., forum users). Comment filtering is configured separately.

Another sweet feature is the ability to customize responses to detected spammers. In addition to being able to delay the response time by a configurable number of seconds, you can also configure an alternate HTTP response to the client, such as 503 (service unavailable) or 403 (access denied). Nice touch.

One small problem

I've only been working with Akismet for several days now. And I'd previously been using captcha, which I imagine got me out of the spammers' sights for a while (spammers seem to spend most of their efforts on sites where their scripts can post content successfully). So far, Akismet has detected 12 spams, 2 of which were not actually spam. These were very short comments, and I imagine Akismet takes the length of the content into consideration. I assume that as the Akismet server processes more and more pieces of content, it will become more accurate in picking out spam versus legitimate content. Each time a piece of flagged content is marked as "ham", it is sent to Akismet where it can help refine their rule sets and make the service more accurate.

Perhaps Akismet could provide an additional option that allows users to increase or decrease tolerance for spam. I would prefer to err on the side of caution and let comments through.

Nov 13 2006

PHP is an interpreted language. This means that each time a PHP generated page is requested, the server must read in the various files needed and "compile" them into something the machine can understand (opcode). A typical Drupal page requires that more than a dozen of these bits of code be compiled.
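You can see this for yourself with a throwaway snippet (purely hypothetical, for debugging only) run at the end of a Drupal page request, e.g. from a simple test module:

<?php
// Hypothetical debug snippet: counts how many PHP files were parsed
// (and thus compiled, absent an opcode cache) for this request.
$included = get_included_files();
print '<!-- ' . count($included) . ' PHP files compiled for this request -->';
?>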

Opcode cache mechanisms preserve this generated code in a cache so that it need only be generated a single time to serve hundreds or even millions of subsequent requests.

Enabling opcode cache will reduce the time it takes to generate a page by up to 90%.

Vroom! PHP is known for its blazing speed. Why would you want to speed up your PHP applications even more? Well, first and foremost is the coolness factor. Next, you'll increase the capacity of your current server(s) many times over, thereby postponing the inevitable need to add new hardware as your site's popularity explodes. Lastly, high bandwidth, low latency visitors to your site who are currently seeing page load times in the 1-2 second range will be shocked to find your vamped up site serving up pages almost instantaneously. After enabling opcode cache on my own server, I saw page loads drop from about 1.5 seconds to as low as 300ms. Now that's good fun the whole family can enjoy.

Opcode Cache Solutions

There are a number of opcode caching solutions. For a rundown on some of them, read this article. After a bit of research and a lot of asking around, I concluded that eAccelerator was the best choice for me. It's compatible with PHP5, is arguably the most popular of its kind, and is successfully used on sites getting far more traffic than you or I are ever likely to see.

Implementing eAccelerator

This is the fun and exciting part. Implementing opcode cache is far easier than you might imagine. The only thing you'll need is admin (root) access to your server. If you're in a shared hosting environment, ask your service provider about implementing this feature if it is not in place already. These instructions apply to *nix environments only.

Poor Man's Benchmarking

If you would like to have some before and after numbers to show off to your friends, now is the time to get the 'before' numbers. Ideally, you will have access to a second host on the same local network as your server so that the running of the test does not affect the results. For those of us without such access, we'll just have to run the test on the actual webserver, so don't submit these results in your next whitepaper:

Apache comes with a handy benchmarking tool called "ab". This is what I use for quick and dirty testing. From the command line, simply type in the following (1000 requests total, 10 at a time):

ab -n 1000 -c 10 http://[YOURSITE.COM]/

Here is a portion of the results I got on my own test:

Concurrency Level:      10
Time taken for tests:   78.976666 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      13269256 bytes
HTML transferred:       12911899 bytes
Requests per second:    12.66 [#/sec] (mean)
Time per request:       789.767 [ms] (mean)
Time per request:       78.977 [ms] (mean, across all concurrent requests)
Transfer rate:          164.07 [Kbytes/sec] received
Connection Times (ms)
min  mean[+/-sd] median   max
Connect:        0    7  51.3      0     617
Processing:    77  725 1704.4    300   21390
Waiting:        0  673 1697.5    266   21383
Total:         77  732 1706.2    307   21390
Percentage of the requests served within a certain time (ms)
50%    307
66%    468
75%    625
80%    639
90%    805
95%   3808
98%   6876
99%   8529
100%  21390 (longest request)

The single most useful number is 'Requests per second', which in my case was 12.66.

Download, Build and Install

First, download the source code.

Get it to your server and do the following (I'm assuming you have gcc on your system; if not, get it):

tar jxvf eaccelerator-0.9.5.tar.bz2
cd eaccelerator-0.9.5
phpize
./configure
make
make install

Configure Apache and Restart

If you have an /etc/php.d directory, create the file /etc/php.d/eaccelerator.ini for your new settings. Alternatively, you can put them in your php.ini file. Your configuration should look something like this:

zend_extension="/usr/lib/php/modules/eaccelerator.so"
eaccelerator.shm_size="32"
eaccelerator.cache_dir="/var/cache/eaccelerator"
eaccelerator.enable="1"
eaccelerator.optimizer="1"
eaccelerator.check_mtime="1"
eaccelerator.debug="0"
eaccelerator.filter=""
eaccelerator.shm_max="0"
eaccelerator.shm_ttl="0"
eaccelerator.shm_prune_period="0"
eaccelerator.shm_only="0"
eaccelerator.compress="1"
eaccelerator.compress_level="9"
eaccelerator.log_file = "/var/log/httpd/eaccelerator_log"
; eaccelerator.allowed_admin_path = "/var/www/html/control.php"

Adjust values according to your particular distribution. For more details on configuring eaccelerator, see the settings documentation.
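After restarting Apache, it's worth confirming that the extension actually loaded. A quick sanity check (hypothetical script; you could also just inspect phpinfo() output):

<?php
// Hypothetical check script: confirm the eAccelerator extension loaded.
if (extension_loaded('eaccelerator')) {
  print "eAccelerator is loaded.\n";
}
else {
  print "eAccelerator is NOT loaded; check the zend_extension path in your .ini file.\n";
}
?>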

See eAccelerator in Action

The value eaccelerator.allowed_admin_path, if enabled, should point to a web accessible directory containing a copy of 'control.php' (which comes with the eAccelerator source code). Edit this script, changing the username and password. You can then access this control panel and see exactly what eAccelerator is caching.

See the results

After enabling eAccelerator on devbee.com, I ran my benchmark again, and here are the results:

Concurrency Level:      10
Time taken for tests:   10.472143 seconds
Complete requests:      1000
Failed requests:        0
Write errors:           0
Total transferred:      13129000 bytes
HTML transferred:       12773000 bytes
Requests per second:    95.49 [#/sec] (mean)
Time per request:       104.721 [ms] (mean)
Time per request:       10.472 [ms] (mean, across all concurrent requests)
Transfer rate:          1224.30 [Kbytes/sec] received
Connection Times (ms)
min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       4
Processing:    20  103  52.1     96     345
Waiting:       17   92  50.1     83     342
Total:         20  103  52.1     96     345
Percentage of the requests served within a certain time (ms)
50%     96
66%    122
75%    137
80%    147
90%    176
95%    201
98%    225
99%    248
100%    345 (longest request)

We are now serving up 95.49 requests per second, roughly 7.5 times the original capacity (a 654% increase). Had I been able to run the tests from another machine on the same network, I believe the numbers would be even more dramatic.
