Roadmap for the pivots_block module recommendation on d.o.

A brief history to begin with ...

What's pivots_block?

The idea is to generate "related modules" recommendations based on co-citations. Suppose TinyMCE and FCKeditor are mentioned together in many forum discussions; we then consider the two modules related to some extent. Here is a detailed explanation.
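As a rough illustration of the co-citation idea, here is a minimal Python sketch. It assumes each discussion has already been reduced to the set of modules it mentions, and it scores each pair with the phi coefficient (the Pearson correlation of two yes/no variables). The function name `cocitation_scores` and the sample data are made up for illustration, and the exact formula used by pivots_block may differ.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def cocitation_scores(threads):
    """Score module pairs by how strongly their mentions co-occur.

    threads: iterable of sets of module names, one set per discussion.
    Returns {(module_a, module_b): phi} for every pair seen together
    at least once, with module_a < module_b alphabetically.
    """
    threads = [set(t) for t in threads]
    n = len(threads)
    mentions = Counter()  # number of discussions mentioning each module
    pairs = Counter()     # number of discussions mentioning both modules
    for mods in threads:
        mentions.update(mods)
        pairs.update(combinations(sorted(mods), 2))

    scores = {}
    for (a, b), both in pairs.items():
        pa, pb, pab = mentions[a] / n, mentions[b] / n, both / n
        denom = sqrt(pa * (1 - pa) * pb * (1 - pb))
        if denom > 0:  # skip modules mentioned in every discussion
            scores[(a, b)] = (pab - pa * pb) / denom
    return scores


# Hypothetical discussions, for illustration only.
threads = [
    {"TinyMCE", "FCKeditor"},
    {"TinyMCE", "FCKeditor"},
    {"Views"},
    {"CCK", "Views"},
]
scores = cocitation_scores(threads)
```

In this toy data TinyMCE and FCKeditor always appear together, so their score is the maximum possible, while CCK and Views get a lower positive score.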

Where are we now?

With the help of the d.o. infrastructure/webmaster team, we deployed pivots_block on d.o. Google Analytics reports showed that pivots_block drew three times the click-throughs of the simple "New forum posts" block. We also found that the classical correlation coefficient algorithm received more click-throughs than the other three extended algorithms. Overall, we think pivots_block works well for the d.o. community.

The roadmap to the future ...

The next major improvement

One key factor for pivots_block is correctly detecting module citations in forum discussions. Currently we use two detectors to match module citations in discussions: 1) popular aliases such as CCK, and 2) module titles together with the keyword 'module'. This approach has probably missed quite a few module citations.

To fix that, we recruited a graduate student at the University of Michigan and manually read through all 12,742 messages posted to the d.o. forum in November 2008. From this we hope to collect a list of module aliases used by the community, and then use that list to detect module mentions more accurately. The work is almost done, and we hope to apply it to d.o. soon and test whether it improves recommendation quality.
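The two detectors can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper `build_detector`, the alias table, and the rule that the title must be immediately followed by the word "module" are all hypothetical, not the actual pivots_block code.

```python
import re


def build_detector(aliases, titles):
    """Return a function that finds module citations in a message.

    aliases: {alias: module title}, e.g. {"CCK": "Content Construction Kit"}
    titles:  list of module titles, matched only when followed by "module"
             (an assumption about how the keyword is combined with the title).
    """
    patterns = []
    for alias, module in aliases.items():
        regex = re.compile(r"\b%s\b" % re.escape(alias), re.IGNORECASE)
        patterns.append((regex, module))
    for title in titles:
        regex = re.compile(r"\b%s\s+module\b" % re.escape(title), re.IGNORECASE)
        patterns.append((regex, title))

    def detect(text):
        # Collect every module whose pattern appears somewhere in the text.
        return {module for regex, module in patterns if regex.search(text)}

    return detect


# Hypothetical alias table and titles, for illustration only.
detect = build_detector(
    {"CCK": "Content Construction Kit"},
    ["Views", "Content Construction Kit"],
)
found = detect("I use CCK with the views module")
missed = detect("I installed Views yesterday")
```

The second message shows the kind of miss described above: "Views" is cited without the keyword 'module' and without a known alias, so nothing is detected.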

Other alternatives

One alternative is to generate module recommendations based on , which is currently running on d.o. Its limitation, however, is that it tends to recommend complementary modules rather than substitute modules, because people rarely use substitute modules on the same site. Google Analytics showed that this algorithm had a slightly lower click-through rate than the original co-citation algorithm, but the difference was not statistically significant.
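One common way to check whether two click-through rates differ significantly is a two-proportion z-test. The sketch below is an assumption about how such a comparison could be done, not the test actually used for the numbers above, and the click and view counts are invented.

```python
from math import erf, sqrt


def two_proportion_z_test(clicks_a, views_a, clicks_b, views_b):
    """Two-sided z-test for the difference between two click-through rates.

    Returns (z, p_value). Uses the pooled rate for the standard error.
    """
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    z = (rate_a - rate_b) / std_err
    # Standard normal CDF expressed through the error function.
    cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return z, 2 * (1 - cdf)


# Invented counts: 120 clicks vs. 100 clicks out of 1000 views each.
z, p_value = two_proportion_z_test(120, 1000, 100, 1000)
significant = p_value < 0.05
```

With these invented counts the rates differ by two percentage points, yet the p-value stays above 0.05, which is the situation the phrase "not statistically significant" describes.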

Another alternative is to use ApacheSolr MoreLikeThis. This is promising because d.o. search already runs on ApacheSolr. However, to my (possibly limited) knowledge, the relevancy matching algorithm of MoreLikeThis is text-based. That is, modules are considered related because their project descriptions are similar. This might or might not work well for d.o. modules, but it's definitely a direction to explore.

The last alternative is to generate related modules based on module ratings, such as those on http://drupalmodules.com/. This is a promising idea too. One concern, raised in some research literature, is that ratings might be subject to deliberate manipulation. Besides, this approach is only possible after a module voting system is implemented in the d.o. redesign.
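To make "text-based" concrete, here is a small Python sketch that ranks modules by the cosine similarity of the word counts in their descriptions. It is a simplified stand-in, not how ApacheSolr MoreLikeThis is implemented, and the function names and descriptions are hypothetical.

```python
import re
from collections import Counter
from math import sqrt


def term_vector(text):
    """Lower-case the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def most_similar(name, descriptions):
    """Rank the other modules by description similarity to `name`."""
    target = term_vector(descriptions[name])
    scored = [
        (cosine(target, term_vector(text)), other)
        for other, text in descriptions.items()
        if other != name
    ]
    return [other for _, other in sorted(scored, reverse=True)]


# Hypothetical project descriptions, for illustration only.
descriptions = {
    "TinyMCE": "WYSIWYG rich text editor for content",
    "FCKeditor": "rich text WYSIWYG editor",
    "Views": "query builder for lists of content",
}
ranking = most_similar("TinyMCE", descriptions)
```

Here FCKeditor ranks first for TinyMCE purely because the two descriptions share words, which also shows the risk: two related modules described in different vocabulary would score poorly.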

Action plan

First, I'd like to apply the next (probably last) major improvement of pivots_block to d.o., as described earlier, and measure its click-through rate. That should be the best we can get from the co-citation pivots algorithm.

Second, I'd like to work with the ApacheSolr team and see if we can use ApacheSolr MoreLikeThis to make "related modules" recommendations on d.o.

If ApacheSolr MoreLikeThis receives more click-throughs, which would indicate that it's more helpful to the community, then it's better to keep MoreLikeThis; otherwise we keep pivots_block. If we decide to keep pivots_block, my plan is to make it a more general-purpose module and build it on top of ApacheSolr (details will be announced later).

I'll try to finish this research in April and report back to the community. Drupal ROCKS, and I hope we'll make d.o. module recommendations work better!!
