May 02 2017

Custom Video Export/Import Process With Views and Feeds

In the Media Research Center's set of three main Drupal sites, MRCTV serves as our video platform where all videos are created and stored as nodes, and then using Feeds, specific videos are imported into the other two sites (Newsbusters and CNS News) as video nodes. Then, on NB and CNS, we use the Video Embed Field module with a custom VEF provider for MRCTV to play the embedded videos.

Only specific videos need to be imported into the destination sites, so a way to map channels between the source and destination sites is needed. All three sites have a Channels vocabulary, so a mapping is created between the appropriate channels. This mapping has two parts:

  1. A feed of Channels terms on NB and CNS.
  2. A custom admin form that links source channels on MRCTV with target channels on the destination site.

On the receiving site side, in addition to the standard feed content, the following custom elements are needed for the Feeds import:

  1. The nid of the video node on MRCTV. This is used to create the URL that is put into the VEF field.
  2. The mapped Channels vocabulary terms for the destination sites (NB and CNS).

Since these are outside of the standard feed components, they need to be added to the feed items with custom code.

I documented my custom Feeds importer on drupal.stackexchange, so you can see the code there.

MRCTV is finally in the process of being updated to D8 from D6 (insert derision here), so both the mapping form and the feed needed to be re-created. The first part of the structure is the channel mapping form. The following file VideoExportForm.php is placed in /modules/custom/video_export/src/Form:

/**
 * @file
 * Contains \Drupal\video_export\Form\VideoExportForm.
 */

namespace Drupal\video_export\Form;

use Drupal\Core\Config\ConfigFactoryInterface;
use Drupal\Core\Form\ConfigFormBase;
use Drupal\Core\Form\FormStateInterface;
use GuzzleHttp\Exception\RequestException;

class VideoExportForm extends ConfigFormBase {
  
  /**
   * {@inheritdoc}.
   */
  public function getFormId() {
    return 'video_export_settings';
  }
    
  /**
   * {@inheritdoc}
   */
  public function buildForm(array $form, FormStateInterface $form_state) {
    $form = array();
    $channels = array();
    
    // Get list of channels.
    $terms = \Drupal::entityTypeManager()->getStorage('taxonomy_term')->loadTree('channels');
    foreach ($terms as $term) {
      $channels[$term->tid] = $term->name;
    }
    
    // Get config data from video_export.settings.yml.
    $config = \Drupal::config('video_export.settings');
    $mapping_config = \Drupal::config('video_export.mappings');
    $sites = $config->get('sites');
    
    foreach($sites as $site => $site_data) {
      // Get channels list.
      try {
        $response = \Drupal::httpClient()->get($site_data['channel_url'], array('headers' => array('Accept' => 'text/plain')));
        $data = $response->getBody();
        if (empty($data)) {
          return FALSE;
        }
      }
      catch (RequestException $e) {
        return FALSE;
      }
  
      $channel_data = new \SimpleXMLElement($data);
      foreach ($channel_data->channel as $channel) {
        $channel_name = $channel->name->__toString();
        $channel_tid = $channel->tid->__toString();
        $target_channels[$channel_tid] = $channel_name;
      }
      // Sort array alphabetically by element.
      asort($target_channels, SORT_STRING);
  
      $target_channel_options = array();
      $target_channel_options[0] = "No Channel";
      foreach ($target_channels as $target_tid => $target_name) {
        $target_channel_options[$target_tid] = $target_name;
      }
  
      // Get mappings from mappings config.
      $mappings = $mapping_config->get('sites');
      foreach ($mappings[$site]['mappings'] as $mrctv_channel => $target_channel) {
        $mapping_defaults[$mrctv_channel] = $target_channel;
      }
  
      $form[$site] = array(
        '#type' => 'details',
        '#title' => $this->t('@site Channel Mappings', array('@site' => $site)),
        '#description' => $this->t('Map MRCTV channels to @site channels', array('@site' => $site)),
        '#collapsible' => TRUE,
        '#collapsed' => TRUE,
        '#tree' => TRUE,
      );
  
      // Loop through all of the categories and create a fieldset for each one.
      foreach ($channels as $id => $title) {
        $form[$site]['channels'][$id] = array(
          '#type' => 'select',
          '#title' => $title,
          '#options' => $target_channel_options,
          '#tree' => TRUE,
        );
        if (in_array($id, array_keys($mapping_defaults))) {
          $form[$site]['channels'][$id]['#default_value'] = intval($mapping_defaults[$id]);
        }
      }
    }
    
    // Get mapping configs.
    $xml = array();
    $mapping_config = \Drupal::config('video_export.mappings');
    $sites = $mapping_config->get('sites');
    $channel_mappings = $sites[$site]['mappings'];
    // Get video nodes that belong to one of the selected channels.
    $query = \Drupal::entityQuery('node')
      ->condition('status', 1)
      ->condition('type', 'video')
      ->condition('changed', REQUEST_TIME - 59200, '>=')
      ->condition('field_channels.entity.tid', array_keys($channel_mappings), 'IN');
    $nids = $query->execute();
    // Load the entities using the nid values. The array keys are the associated vids.
    $video_nodes = \Drupal::entityTypeManager()->getStorage('node')->loadMultiple($nids);

    foreach ($video_nodes as $nid => $node) {
      $host = \Drupal::request()->getSchemeAndHttpHost();
      $url_alias = \Drupal::service('path.alias_manager')->getAliasByPath('/node/' . $nid);
      // Get channels values.
      $channel_tids = array_column($node->field_channels->getValue(), 'target_id');
      $create_date = \Drupal::service('date.formatter')->format($node->getCreatedTime(), 'custom', 'j M Y h:i:s O');
      $item = array(
        'title' => $node->getTitle(),
        'link' => $host . $url_alias,
        'description' => $node->get('body')->value,
        'mrctv-nid' => $nid,
        'guid' => $nid . ' at ' . $host,
        'pubDate' => $create_date
      );
      // Check for short title and add it if it's there.
      if ($node->get('field_short_title')->value) {
        $item['short-title'] = $node->get('field_short_title')->value;
      }
      foreach ($channel_tids as $ctid) {
        $item[$site . '-channel-map'][] = $ctid;
      }
      $xml[] = $item;
    }

    return parent::buildForm($form, $form_state);
  }
  
  /**
   * {@inheritdoc}.
   */
  public function validateForm(array &$form, FormStateInterface $form_state) {
  
  }
  
  protected function getEditableConfigNames() {
    return ['video_export.mappings'];
  }
  
  /**
   * {@inheritdoc}
   */
  public function submitForm(array &$form, FormStateInterface $form_state) {
    $values = $form_state->getValues();
    $config = $this->config('video_export.mappings');
    $sites = array();

    foreach($values as $site => $mappings) {
      if (is_array($mappings)) {
        foreach ($mappings['channels'] as $mrctv_channel => $target_channel) {
          if ($target_channel != 0) {
            $sites[$site]['mappings'][$mrctv_channel] = $target_channel;
          }

        }
        $config->set('sites', $sites);
      }
    }
    $config->save();
  
    parent::submitForm($form, $form_state);
  }
}

The settings for the channels feeds on NB and CNS are stored in /modules/custom/video_export/config/install/video_export.settings.yml:

sites:
  newsbusters:
    channel_url: 'http://www.newsbusters.org/path/to/channels'
  cnsnews:
    channel_url: 'http://www.cnsnews.com/path/to/channels'
list_time: 24
        

Since this is an admin settings form, I extend the ConfigFormBase class. This adds some additional functionality over the standard FormBase class, similar to the way the system_settings_form() function does in D7 and older (see the change record for details).

As mentioned above, the form does the following things:

  1. Reads the channels feed from the destination sites
  2. Creates a fieldset for each site with a select list for each MRCTV channel where the user can select the destination channel.
  3. Saves the mappings in config.

The next thing that is needed is the feed of video nodes that are available to be imported. After trying unsuccessfully to create a custom REST API endpoint, I ended up going with a Feed display in Views. Out of the box I can create my feed, but I still need to add my custom elements. In D6, I used hook_nodeapi($op = 'rss item') to add my custom elements. In other feeds on D7 sites I've been able to use the Views RSS module with its provided hooks to add custom RSS elements, but it is currently unusable on D8 due to one major issue.
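For reference, the D6 version of this was just a few lines; a hedged sketch of that old approach (the module name is illustrative):

/**
 * Implements hook_nodeapi() (Drupal 6).
 *
 * For $op == 'rss item', the returned items become extra child elements of
 * the <item> tag in the generated RSS feed.
 */
function mymodule_nodeapi(&$node, $op, $a3 = NULL, $a4 = NULL) {
  if ($op == 'rss item' && $node->type == 'video') {
    return array(
      array('key' => 'mrctv-nid', 'value' => $node->nid),
    );
  }
}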

Finally, since everything in D8 is based on OOP, I knew there had to be a way to override a Views class at some level, so after some searching, I decided to override the row plugin. I poked around in the Views code and found the RssFields class that is used for the field-level rows of a Feed display, so I overrode that.

namespace Drupal\video_export\Plugin\views\row;

use Drupal\views\Plugin\views\row\RssFields;
use Drupal\Core\Form\FormStateInterface;
use Drupal\Core\Url;

/**
 * Renders an RSS item based on fields.
 *
 * @ViewsRow(
 *   id = "mrctv_rss_fields",
 *   title = @Translation("MRCTV Fields"),
 *   help = @Translation("Display fields as RSS items."),
 *   theme = "views_view_row_rss",
 *   display_types = {"feed"}
 * )
 */
class MRCTVRssFields extends RssFields {
  
  /**
   * Override of RssFields::render() with additional fields.
   *
   * @param object $row
   *
   * @return array
   */
  public function render($row) {
    $build = parent::render($row);
    $item = $build['#row'];
    
    // Add MRCTV nid
    $item->elements[] = array(
      'key' => 'mrctv-nid',
      'value' => $row->nid,
    );
  
    // Add channels and their target nids. We can get them from $row->_entity.
    $site = $this->view->args[0];
    // Get MRCTV nids from view.
    $channel_tids = array_column($row->_entity->field_channels->getValue(), 'target_id');
    // Now, get destination tids from config.
    $mapping_config = \Drupal::config('video_export.mappings');
    $all_mappings = $mapping_config->get('sites');
  
    foreach($channel_tids as $mrctv_channel) {
      if(in_array($mrctv_channel, array_keys($all_mappings[$site]['mappings']))) {
        $item->elements[] = array(
          'key' => $site . '-channel-map',
          'value' => $all_mappings[$site]['mappings'][$mrctv_channel],
        );
      }
    }
    
    // Re-populate the $build array with the updated row.
    $build['#row'] = $item;
    
    return $build;
  }
}

As you can see, the override is fairly simple; all I needed to do was override the render() method. This method returns a render array, so all I do is get the built array from the parent class, add my custom elements to the #row element in the array, and return it.

One thing that I couldn't do simply in the views UI was select the nodes that should be in the feed based on the associated Channels vocabulary terms. These are dynamic, based on the mappings selected in the admin form, so I can't pre-select them in the view settings. This is where hook_views_query_alter() comes to the rescue.

/**
 * Implements hook_views_query_alter().
 */
function video_export_views_query_alter(Drupal\views\ViewExecutable $view, Drupal\views\Plugin\views\query\Sql $query) {
  if ($view->id() == 'video_export' && $view->getDisplay()->display['id'] == 'feed_1') {
    // First, we need to get the site parameter from the view.
    $site = $view->args[0];
    
    // Next, we need to get the saved config for the channel mapping.
    $mapping_config = \Drupal::config('video_export.mappings');
    $all_mappings = $mapping_config->get('sites');
    $tids = array_keys($all_mappings[$site]['mappings']);
   
    // Modify query to get nodes that have the selected tids, which are the array keys.
    $query->addWhere(NULL, 'node__field_channels.field_channels_target_id', $tids, 'IN');
  }
}

All I do here is get the saved mappings from config and add them to the views query as a WHERE condition to limit the feed items to the appropriate nodes.

One issue I ran into with the results was duplicate records. Since field_channels (the entity reference field for the Channels vocabulary) is multi-valued, the query returns multiple records for each node when multiple Channels terms are selected. There are display settings to show multiple items in one row, but they don't take effect here. I didn't dig far enough into the code to know for sure, but my guess is that the grouping happens at a higher layer in the views rendering process, after the feed rows have already been built.

To get around this, I implemented hook_views_pre_render(). At this point in the process, the results have been built, so I just loop through them and remove duplicates.

/**
 * Implements hook_views_pre_render().
 */
function video_export_views_pre_render(Drupal\views\ViewExecutable $view) {
  $unique_nids = $new_results = array();
  
  // Loop through results and filter out duplicate results.
  foreach ($view->result as $index => $result) {
    if (!in_array($result->nid, $unique_nids)) {
      $unique_nids[] = $result->nid;
      $new_results[] = $result;
    }
  }
  // Replace $view->result with new array. Apparently views requires sequentially keyed
  // array of results instead of skipping keys (e.g. 0, 2, 4, etc), so we can't just
  // unset the duplicates.
  $view->result = $new_results;
}

As noted in the code comment, Views seems to require a sequentially keyed array of results, so you can't just unset the duplicate keys and leave the gaps; instead, I add each unique item to a new array. In retrospect, I could have just used PHP functions like array_splice() and array_filter() (see the sketch below), but this method works just as well.
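For example, a filter-based version of the same hook might look roughly like this (a sketch that should behave identically to the loop above; it is not the code running on the site):

/**
 * Implements hook_views_pre_render().
 *
 * Alternative implementation: keep the first result for each nid, then re-key
 * the array with array_values() so Views still gets sequential keys.
 */
function video_export_views_pre_render(Drupal\views\ViewExecutable $view) {
  $seen_nids = array();
  $view->result = array_values(array_filter($view->result, function ($result) use (&$seen_nids) {
    if (in_array($result->nid, $seen_nids)) {
      // Duplicate row for a node we already kept.
      return FALSE;
    }
    $seen_nids[] = $result->nid;
    return TRUE;
  }));
}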

It should also be noted that the views hooks need to go in a *.views_execution.inc file, so this one is in /modules/custom/video_export/video_export.views_execution.inc.

All I do at this point is use the Job Scheduler module with Feeds in the destination sites to schedule the import at the desired interval, and the process runs by itself. 

Sep 14 2016

Handling clients with more than one site involves lots of decisions. And yet, it can sometimes seem like ultimately all that doesn't amount to a hill of beans to the end-user, the site visitor. They won't care whether you use the Domain module, multi-site, separate sites with a common codebase, and so on. Because most people don't notice what's in their URL bar. They want ease of login, and ease of navigation. That translates into things such as the single sign-on that drupal.org uses, common menus and headers, and also site search: they don't care that it's actually sites search, plural, they just want to find stuff.

For the University of North Carolina, who have a network of sites running on a range of different platforms, a unified search system was a key way of giving visitors the experience of a cohesive whole. The hub site, an existing Drupal 7 installation, needed to provide search results from across the whole family of sites.

This presented a few challenges. Naturally, we turned to Apache Solr. Hitherto, I’ve always considered Solr to be some sort of black magic, from the way in which it requires its own separate server (http not good enough for you?) to the mysteries of its configuration (both Drupal modules that integrate with it require you to dump a bunch of configuration files into your Solr installation). But Solr excels at what it sets out to do, and the Drupal modules around it are now mature enough that things just work out of the box. Even better, Search API module allows you to plug in a different search back-end, so you can develop locally using Drupal’s own database as your search provider, with the intention of plugging it all into Solr when you deploy to servers.

One possible setup would have been to have the various sites each send their data into Solr directly. However, with the Pantheon platform this didn’t look to be possible: in order to achieve close integration between Drupal and Solr, Pantheon locks down your Solr instance.  That left talking to Solr via Drupal.

SearchAPI lets you define different datasources for your search data, and comes with one for each entity type on your site. In a datasource handler class, you can define how the datasource gets a list of IDs of things to index, and how it gets the content. So writing a custom datasource was one possibility.

Enter the next problem: the external sites we needed to index only exposed their content to us in one format: RSS. In theory, you could have a Search API datasource which pulls in data from an RSS feed. But then you need to write a SearchAPI datasource class which knows how to parse RSS and extract the fields from it.  That sounded like we’d be reinventing Feeds, so we turned to that to see what we could do with it. Feeds normally saves data into Drupal entities, but maybe (we thought) there was a way to have the data be passed into SearchAPI for indexing, by writing a custom Feeds plugin?  However, we found we had a funny problem of the sort that you don’t consider the existence of until you stumble on it: Feeds works on cron runs, pulling in data from a remote source and saving it into Drupal somehow. But SearchAPI also works on cron runs, pulling data in, usually entities. How do you get two processes to communicate when they both want to be the active participant?

With time pressing, we took the simple option: we defined a custom entity type for Feeds to put its data into, and SearchAPI to read its data from. (We could have just used a node type, but then there would have been an ongoing burden of needing to ensure that type was excluded from any kind of interaction with nodes.)
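In Drupal 7 terms, that bucket is just a bare entity type. A minimal sketch of the kind of definition involved (the entity type name, base table, and use of the Entity API controller are assumptions for illustration, not the project's actual code):

/**
 * Implements hook_entity_info().
 */
function mymodule_entity_info() {
  return array(
    // A bare "bucket" entity type for Feeds to write into and Search API to read from.
    'search_bucket_item' => array(
      'label' => t('Search bucket item'),
      'base table' => 'search_bucket_item',
      'fieldable' => TRUE,
      'module' => 'mymodule',
      // Assumes the Entity API module's generic CRUD controller.
      'controller class' => 'EntityAPIController',
      'entity keys' => array(
        'id' => 'id',
        'label' => 'title',
      ),
      'bundles' => array(
        'search_bucket_item' => array(
          'label' => t('Search bucket item'),
        ),
      ),
    ),
  );
}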

Essentially, this custom entity type acted like a bucket: Feeds dumps data in, SearchAPI picks data out. As solutions go, not the most massively elegant, at first glance. But if you think about it, if we had gone down the route of SearchAPI fetching from RSS directly, then re-indexing would have been a really lengthy process, and could have had consequences for the performance of the sites whose content was being slurped up. A sensible approach would then have been to implement some sort of caching on our server, either of the RSS feeds as files, or the processed RSS data. And suddenly our custom entity bucket system doesn’t look so inelegant after all: it’s basically a cache that both Feeds and SearchAPI can talk to easily.

There were a few pitfalls. With Search API, our search index needed to work on two entity types (nodes and the custom bucket entities), and while Search API on Drupal 7 allows this, its multiple entity type datasource handler had a few issues we needed to iron out or learn to live with.

The good news though is that the Drupal 8 version of Search API has the concept of multi-entity type search indexes at its core, rather than as a side feature: every index can handle multiple entity types, and there's no such thing as a datasource for a single entity type.

With Feeds, we found that not all the configuration is exportable to Features for easy deployment. Everything about parsing the RSS feed into entities can be exported except the actual URL, which is a separate piece of setup and not exportable. So we had to add a hook_update_N() to take care of setting that up.
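That update hook ends up being only a few lines. A hedged sketch of the idea (the module name, importer id, and URL are placeholders, and it assumes a standalone Feeds importer):

/**
 * Set the non-exportable source URL on the standalone Feeds importer.
 */
function mymodule_update_7001() {
  // feeds_source() returns the FeedsSource object for a standalone importer.
  $source = feeds_source('external_content_importer');
  $source->addConfig(array(
    'FeedsHTTPFetcher' => array('source' => 'http://example.com/feed.rss'),
  ));
  $source->save();
}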

The end result though was a site search that seamlessly returns results from multiple sites, allowing users to work with a network of disparate sites built on different technologies as if they were all the same thing. Which is what they were probably thinking they were all along anyway.

Author: Joachim Noreiko

Apr 27 2014

Feeds (http://drupal.org/project/feeds) is a very popular module. From the project page, we get a nice description of the module:

Import or aggregate data as nodes, users, taxonomy terms or simple database records.

The basic idea is that you throw a CSV file at it and it creates Drupal content. As simple as that. The input format can be more than just CSV; check the project page for more details.

We can use the Feeds API (http://drupalcode.org/project/feeds.git/blob/HEAD:/feeds.api.php) if we want more functionality than the standard behavior provides.
I am going to describe 3 different uses of the Feeds API:

  1. Perform an operation after a feed source has been parsed, before it will be processed
  2. Perform pre-save operations
  3. Add additional target options to the mapping form

1. Perform an operation after a feed source has been parsed, before it will be processed

Use this hook:

<?php
hook_feeds_after_parse()
?>

A common use case is to alter the data from the CSV. For example, let's say that the terms we want to import are described differently in the CSV than in Drupal: in the CSV we may have words like "n. america", "s. america", "w. europe", "e. europe" while the terms in Drupal are "north america", "south america", "western europe", "eastern europe". We need to map the CSV values to their Drupal equivalents before import:

<?php
/**
 * Implements hook_feeds_after_parse().
 */
function mymodule_feeds_after_parse(FeedsSource $source, FeedsParserResult $result) {
  $map = array(
    'n. america' => 'north america',
    's. america' => 'south america',
    'e. europe' => 'eastern europe',
    'w. europe' => 'western europe',
  );
  foreach ($result->items as $key => $item) {
    if (isset($map[$item['region']])) {
      $result->items[$key]['region'] = $map[$item['region']];
    }
  }
}
?>

2. Perform pre-save operations

This allows us to act on the entity that is going to be created. It is similar to hook_entity_presave().
Here we import only users whose surname is 'Smith':

<?php
/**
 * Implements hook_feeds_presave().
 */
function mymodule_feeds_presave(FeedsSource $source, $entity, $item) {
  // Check that this fires only for the intended importer.
  if ($source->importer->id == 'my_user_importer') {
    // Check that we like this name.
    if ($item['surname'] != 'Smith') {
      $entity->feeds_item->skip = TRUE;
      drupal_set_message(t('Only Smiths allowed. Skipping...'), 'warning');
    }
  }
}
?>

3. Add additional target options to the mapping form

This is the most advanced case described here. The hook

<?php
hook_feeds_processor_targets_alter()
?>

allows us to do more complex stuff. For example, let's assume that our site uses different newsletter lists, and we want to register the user to the proper list. The newsletter lists are not a field in the user form, so we don't get an option in the Feeds UI to control this. This hook allows us to add a field to the Feeds UI mapping form and define a callback function for it.

<?php
/**
 * Implements hook_feeds_processor_targets_alter().
 */
function mymodule_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  $targets['newsletter_list'] = array(
    'name' => t('Newsletter list'),
    'description' => t('This field sets the user to the proper newsletter list.'),
    'callback' => 'mymodule_newsletter_list',
  );
}
?>

The code above will show a new option, "Newsletter list," on the field mapping page of the Feeds UI module for our importer. Now we need to define the callback for this option.

<?php
/**
 * Callback for hook_feeds_processor_targets_alter().
 * Subscribes the user to the proper newsletter list.
 *
 * @param $source
 * @param $entity
 * @param $target
 * @param $value
 * @param $mapping
 */
function mymodule_newsletter_list($source, $entity, $target, $value, $mapping) {
  // $value contains the subscription list from the CSV.
  subscribe_user($entity->uid, $value);
}
?>

Did you like this post? Drop me a line in the comments below

Sep 26 2013

I'm going to go through a basic tutorial here today on importing iCal calendar feeds into your Drupal 7 site as nodes. While there are already several modules that allow you to show your Google (or other) calendar data on your site, importing your calendar with those modules usually has a few restrictions, such as limiting the specific data you can show, and not having very good methods for customizing your style. Using Feeds to import the data as nodes allows you to bring an events calendar to your site that is 100% customizable by you, the same way any other nodes on your site are customized.

Note: Drupal 7 uses mostly different modules from Drupal 6; however, the modules work very similarly. So while I will not be going over D6 options in this tutorial, a lot of what you see here can be used in Drupal 6.

Here are the specific modules related to this tutorial you will need to install to make this work:

Several of these modules have required modules to make them work - make sure you also have those installed.

You will also need to make sure the iCalCreator library is installed in your /sites/all/libraries directory.

Note: As of this writing (Oct, 2013), the most recent version of Feeds (2.0-alpha8) has a minor issue with some servers. The issue is described here, and if you run into the same issue, you may be able to use the solution in post #6 to bypass the issue until the module is updated.

Enable your modules, make sure there are no errors & bugs, and let's begin.

1 - Create an Event content type

The first thing we want to do is make sure we have a content type to import our iCal into. To make it into a proper event feed, there is only 1 necessary additional field other than title - and that is date. We'll use a few more, so that you can see some other cool options with the importer.

Create a new content type called "Event". I usually rename the title under the content type settings to "Event Name," just to be a bit more descriptive in case it's ever someone other than myself creating content, but that's entirely up to you. I also usually immediately delete the default body field, but you can definitely use that for the Description field I will detail below. Your choice.

We'll create three new fields.

Event Date and Time

Name it whatever you want; again, I typically try to be pretty descriptive. Make sure to use a proper machine name - good practices usually include the node type name (for instance, I gave this field a machine name of "field_event_date_time"). Select Date as the field type, and for the widget, use the pop-up. For this tutorial, just use all the default settings.

Event Location

Again, name it how you will. Set this to be a text field, and use the default settings.

Event Description

If you used the default body field, skip this. Otherwise, create a field called "Event Description," and set it to the text area field type. Default settings.

Here is what mine looks like:
Event Node Type

Now, let's grab your calendar address and create a feed.

2 - Create a Feed Importer

We will now create the feed importer. Before we begin, let's go ahead and grab your calendar address.

Get your iCal Address

Go to your public Google calendar (or any other calendar that uses iCal addresses), head into that calendar's settings, and get the iCal address. (Note - make sure you have some dates filled out, or you won't be importing anything!) This is what your calendar settings page should look like:

Google Calendar Settings

In the image above, you see I've highlighted the iCal link you're looking for with a bright red box. Click that link and then copy the full address.

Now, let's create the feed itself.

Main Feed Settings

Go to /admin/structure/feeds/create

Give the feed a title. I titled mine: iCal Feed Importer. Very clever, eh?

Once created, it takes you to a feed settings page. This page can be a bit confusing if you don't know what you're doing, so we'll go through each detail relating to our iCal setup.

Feed - Basic Settings

Under Basic Settings, you will see that you have quite a few options. You have the opportunity to rename your importer, and you also have a few others. The first, "Attach to Content Type," can be a bit misleading if you're not aware of all the other options. This selection will attach a feed import form to every node created with the selected content type. So for instance, if you were to select "Event" from that list, every time you created a new "Event" node, it would ask you for a feed address. This is not the preferred method for what we're doing, since it requires you to create a node just to import other nodes. We may go through this option in a later tutorial, but for now, leave it on "Use Standalone Form."

The next few options should be pretty self-explanatory.

The "Periodic Import" just names the amount of time that the importer waits before checking the iCal address for new content. On most smaller sites, you could probably set this to "12 hours" or "1 day" with no problems.

The "Import on Submission" checkbox is just what it says. If you have it checked (which it is, by default), then as soon as you create a new import, it will immediately import all the data, instead of waiting for the amount of time in the "Periodic Import" above. If unchecked, the feed will not import the data until the next scheduled time.

Finally, the "Process in background" checkbox has a pretty detailed description of what it does. We'll leave it blank for this.

Feed - Fetcher

Leave all these settings to default for this tutorial.

Feed - Parser

Select "Change" next to Parser, and select "iCal parser". Save.

Now you will see that below Parser, you have settings for "iCal Parser." By default, there are no extra settings to configure here.

Feed - Processor

Make sure the main Processor settings are set to "Node processor." Save.

Next to "Node Processor" you should see a settings link. Click it.

For "Bundle", select your event node type. The next few settings can be a bit confusing, so we'll look at them one at a time.

Under "Update Existing Nodes", you see three options. The first option, Do not update existing nodes, is selected by default. This means that whenever your importer imports data, it will simply create all new nodes for everything it imports. This is not the preferred method. The second option, Replace existing nodes, is slightly better. This option will delete existing nodes with the same title, and create new nodes of that title. This is an acceptable method, but we will use the third option, Update existing nodes. This option will take any new, updated, or changed data from the import, and update the nodes that already exist. Note - the helper text mentions "Unique target." This is very important, and we will get to it in the mapping section.

The "Skip hash check" checkbox is empty by default. We will leave it this way. If checked, then every time the importer ran, it would update (node_save) the content that exists, even if no changes were made.

"Text Format" should be pretty self explanatory. I leave mine to Plain Text, as I like full control of my styling through css.

"Author" will simply automatically assign the user to the nodes created. I typically assign to my user 1 account.

"Authorize" does exactly what it says. It checks that the person creating the import has permissions to do so. For security's sake, I always leave this checked.

"Expire Nodes" is also pretty self explanatory. Note - The date it uses to decide whether to expire the node is based on the node published date, not the event date field. Keep that in mind if you use this option.

Now let's look at the most important part, mapping.

3 - Mapping the Data

Under the Node Processor settings, there is a link for "Mapping." Click it.

This is the settings page where we determine how exactly our importer reads the data, and maps to our node fields. There is actually a certain order that they need to go in for them to work correctly.

The "Source" column is used to select from where the mapping occurs. It relates specifically to the actual calendar you are importing from. The "Target" colum is used to select to where the data is mapped, in your nodes.

In the Source column, select "Date start." In your Target column, select the field name you used for your event date field - make sure you use the "Start" option provided (you will see in the screenshot below). Mine says: "Event Date and Time: Start." Save. You will need to save after each step of the process.

Now in the Source column, select "Date end." Then in your Target column, select the same field name from before, but choose the "End" option. Mine says: "Event Date and Time: End." Save again.

In Source column, select "Summary." This is typically what the calendar uses for the title. In the Target column, map it to "Title." Save.

Once you save it, you will see that this particular section has a new settings option - "Not used as unique." Click the settings icon and check the "Unique" box, then Update. This will make sure that each node created is using a unique title - it will check against existing titles, and if a node with the same title already exists, it will update that node instead of creating a new one.

In the Source column, select "Location text," and in the Target column, map this to whatever you named your location field. Mine says: "Event Location." Save.

Finally, in the Source column, select "Description," and in the Target column, map this to whatever you named your description field (or your body field, if you used the default body field for your node). Mine says: "Event Description." Save.

Here is what my mapping page looks like:

This wraps all the mapping we will be doing, but as you can see, there are plenty of other options available, such as using date repeat options, mapping a url for the calendar item, etc. You can play around with these, but most of them are typically not used in events calendars.

Now let's import the data and see what happens.

4 - Importing

Using the "Standalone Importer" option we set up in the feed means that there is one specific place from which we will do all our importing: /import (www.example.com/import).

If you go to that page, you will see a list of all the importers you have available as standalone importers. If this is your first time doing this, you should only see one, but you can have as many separate importers as you want. Select the importer.

You will go to a small form that should say "No imported feeds," and has a field for you to input a URL. Remember the iCal address we had from way up above? Paste that in here, then click "Import." You will see an "initializing" box, and then when the page reloads, you should see a notice of a number of items imported. Let's look at them. Go to /admin/content, and select one of them.

If you followed the steps correctly, you should see a node that has all the data from the calendar. Here's what mine looks like:

Now that you have multiple nodes, you can do with them anything you would typically do with nodes - create views (or Calendar Views!), blocks, panels, whatever you want. You now have full control over your events, and if you have your cron set up properly, you should only ever have to update your calendar, and this importer will automatically bring in any new events whenever cron is run.

A few notes:

  • As of the time of this writing (Oct, 2013), there was a potential bug with date_ical where the date fields were not being filled in. This has been solved in the dev version of the module, and the new version should have the fix implemented
  • When mapping the fields, the Date start time must always come before the Date end time, due to the way the parser functions. This may seem obvious, but if you ever have dates not showing up, check first that Date start is above Date end.
  • I mentioned at the beginning of the tutorial that much of this process works for Drupal 6, but uses different specific modules. Here are the modules you will need if using D6: Feed API, iCal Feed Parser, Date
Jun 17 2013

Now that Twitter 1.1 and Feeds are buddies, time to move to other data sources. Next up: Facebook. Using trusty Feeds and friends, I was able to ingest my own Facebook home feed. Here's how to replicate this:

For the impatient, attached is a feature that should get you set up quickly. You'll need the following modules:

  • Feeds latest HEAD from 7.x-2.x branch.
  • Feeds JSONPath Parser version 7.x-1.0-beta2 - make sure to install the needed JSONPath library as per the instructions on the module page.
  • Feeds OAuth latest HEAD from 7.x-1.x branch.
  • php-proauth library, which you install in sites/all/libraries as follows:

git clone https://github.com/infojunkie/php-proauth.git

The idea behind the setup is to create a Feeds pipeline that:

  • Fetches the given resource URL (a Facebook Graph API URL) using the Feeds OAuth 2.0 Fetcher. This fetcher checks for an OAuth 2.0 access token for the current user and performs authorization if one is not found. It alerts the user to a missing access token during feed creation.
  • Parses the result using Feeds JSONPath Parser, since Facebook Graph API uses JSON.
  • Maps the result to nodes using the standard Node Processor.

Create a new Facebook application. You need to add two specific settings to it, then wire it up to the importer:

  • Basic > Website with Facebook Login > Site URL: enter the callback URL that is reported in the Feed importer's Fetcher > HTTPS OAuth 2.0 Fetcher Settings > Site identifier description.
  • Permissions > Extended Permissions: add the read_stream permission.
  • You will need to copy the App ID and App Secret strings of the Facebook app to the Fetcher > HTTPS OAuth 2.0 Fetcher Settings > Consumer key and Consumer secret settings, respectively.
  • Set the fetcher's Method to GET.
  • Then create a new node of type Facebook feed with the Graph API URL (e.g. your home feed). Make a note of this node's nid.
  • Finally, edit the facebook view included in this feature, such that the filter Feeds item: Owner feed nid refers to the nid noted above.

That's it. This should cure your feed indigestions!

Attachment: facebook_feed-7.x-0.2.tar (40 KB)
Jun 13 2013

Update: This post now contains a feature that you can import in D7 to see the Twitter feed in action.

The new Twitter 1.1 API kicked in recently, which meant a new cycle of maintenance for anyone consuming their data programmatically. My own Feeds + Views demo site streams #drupal, using Feeds and complementary modules. I had to make a few changes to the importer to adapt to the new API:

  • Authorization using OAuth
  • Parsing JSON instead of XML

The new Twitter 1.1 API requires OAuth authentication/authorization for every request. Fortunately, I had already written Feeds OAuth to handle feeds that require OAuth, so I just had to plug this in. Well, not "just," because it took choosing among the several authorization options that Twitter provides, and fixing a couple of bugs in the module itself.

Twitter provides several options for authorization, depending on the needs of the consumer (not listed on this page is the Application-only authentication). I ended up choosing the method that required the least work on the Feeds OAuth module, namely obtaining OAuth tokens from dev.twitter.com as a pre-processing step. To do this, I manually added an entry to the feeds_oauth_access_tokens table, with the tokens that were handed to me by Twitter on my application page. This way, Feeds OAuth would not have to ask me to login to Twitter in order to make the API call. Obviously, this is a temporary hack and I will work on enhancing the module support for different authentication options.

Twitter 1.1 API only returns JSON results. To parse JSON instead of RSS/Atom, I used Feeds JSONPath Parser. It does the job as advertised, but the only challenge here was to retrieve the tweet URL for each result. The Twitter search API itself does not return tweet URLs, for some unfathomable reason. My setup needs the tweet URL to pass it to oEmbed, which renders the tweet on the view. Tweet URLs are of the form https://twitter.com/<user_screen_name>/status/<tweet_id>.

To get the URL, I had to resort to coding. Here's how I did it:

  • First convince Feeds JSONPath Parser to retrieve the user's screen name. To do this, I had to map it to some field - I chose the node body, although a better solution would be to use a NULL target field, just to fool the parser into returning the value.

  • In a custom module, I created a new programmatic source field called "Tweet URL" that synthesizes the URL:

/**
 * Implements hook_feeds_parser_sources_alter().
 */ 
function demo_feeds_parser_sources_alter(&$sources, $content_type) {
  $sources['tweet_url'] = array(
    'name' => t('Tweet URL'),
    'description' => t('The URL of a tweet.'),
    'callback' => 'demo_feeds_tweet_url',
  );
}

/**
 * Populates the "tweet_url" field for each result.
 */
function demo_feeds_tweet_url(FeedsSource $source, FeedsParserResult $result, $key) {
  $item = $result->currentItem();
  // jsonpath_parser:2 corresponds to user screen name in my importer.
  // jsonpath_parser:0 corresponds to tweet ID in my importer.
  return 'https://twitter.com/' . $item['jsonpath_parser:2'] . '/status/' . $item['jsonpath_parser:0'];
}
  • Finally, I mapped this new source field to my target field, the URL of a Link that renders using oEmbed.

This little exercise took a good couple of hours - and that's just for a demo. API changes are always painful, but at least the Feeds OAuth module got some love and fixes in the process.

To set up the feature attached to this post:

  • Enable the module twitter_feed_custom.
  • Copy the ping.php_.txt file to your Drupal root folder and rename it to ping.php. Also edit the file to point the DRUPAL_ROOT definition to your actual Drupal root folder.
  • Copy the Consumer key and Consumer secret strings of the Twitter app to the Fetcher > HTTPS OAuth Fetcher Settings > Consumer key and Consumer secret settings, respectively.
  • Create a new node of type Twitter feed with your query URL (e.g. #drupal). Make a note of this node's nid.
  • Edit the (badly-named) hashdrupal view included in this feature, such that the filter Feeds item: Owner feed nid refers to the nid noted above.
May 18 2013


We're super-excited to announce that we've been invited to present a half-day workshop during DrupalCamp Austin. The Camp takes place the weekend of June 21-23, 2013, and we'll be presenting "Getting Stuff into Drupal - Basics of Content Migration" from 1:30pm until 5:30pm on Saturday the 22nd. The workshop will cost $75 and we'll be covering the basics of three of the most common ways of importing content into Drupal: the Feeds, Migrate, and Drupal-to-Drupal data migration (based on Migrate) modules. Interested? Check out all the details and then register today.

Over the past few years, we've performed various types of migrations into Drupal from all sorts of sources: static web sites, spreadsheets, other content management systems, and older versions of Drupal sites. Using this experience, we've developed an example-based workshop that demonstrates some of our go-to tools for bringing content into Drupal.

The workshop will be short on lecturing, and long on real-world examples. We'll import spreadsheet data using Feeds, a Drupal 6 site into Drupal 7 using Drupal-to-Drupal migration, and a custom migration using the Migrate module.

We're always looking for new and exciting workshops to offer - please take a few minutes and take this short survey to help us determine potential topics for future workshops.


Apr 29 2013


FarmersMarkets.com

At Florida DrupalCamp 2013, I presented a session that demonstrated how to utilize the Feeds, Feeds Tamper, Address field, Geofield, and other modules to create a fully-functional website for searching for Farmers Markets anywhere in the United States. While the session's intent was to inspire people as to what Drupal can do in a very short amount of time, this blog post will focus on the details of the process.

A few years ago, I built a similar presentation using world-wide earthquake data, importing into a Drupal 6 site using Table Wizard and displaying the data using the Mapstraction module. I must have given that presentation about half-a-dozen times over the course of a year or so at various meetups and camps, so I thought now was a good time to bring it up-to-date with modern (relatively-speaking) Drupal tools.

Before we get started, let me point out that the title is a lie. It's actually going to be more than 7,000 records, but I like the way the "5,000" and "45" play off each other. The first time I did this demonstration in front of an audience, it actually took me only 25 minutes, 26 seconds - the rest of the presentation time was taken up with some initial slides and furious betting on how long it would actually take me (the winner got a copy of Mapping with Drupal).

The Farmers Markets Source Data

The first step in a project like this is to find some good clean source data. I'm a big fan of the seemingly infinite supply of publicly available data found on Data.gov, the United States' public repository of federal government data. After poking around for a bit of time (I'm embarrassed to say exactly how much!), I stumbled upon the Farmers Markets Geographic Data - a Microsoft Excel-formatted dataset containing data on over 7,000 Farmers Markets all over the United States. The dataset contains names, descriptions, addresses, websites, and other details - most importantly, it contains the latitude and longitude for each location. While not mandatory, having the latitude and longitude data sure does make the process easier.

Inspecting the data in a spreadsheet, things looked pretty clean. Since I knew I needed to save the file in comma-separated values (.csv) format, I did some very minor cleanup on it by doing the following:

  • Removed the top 3 descriptive header rows. For the import, all we actually need is a column/field name header row and the data. Any ancillary header rows need to be removed.
  • Removed the bottom 2 descriptive footer rows. For this dataset, there was a row at the bottom of the dataset that contained information about when the dataset was last updated. This wasn't needed for the import, so I manually deleted it.

Additionally, I took note of the following things:

  • The FMID field (I'm assuming this is an acronym for "farmers market identifier") appears to be a unique integer for each record in the dataset. This will come in handy during import.
  • There was no "country" field. This isn't unexpected since this was data for United States' farmers markets, but I did take note of it because the Address Field module will be looking for country data. I could have simply added a new "country" field (with all values set to "United States") to the dataset prior to exporting it as a .csv file, but I prefer to keep the dataset as "pure" as possible, so I decided to leave it alone for now and deal with the country stuff as part of the import process (see below).
  • The data in the "State" column included full state names, not the standard 2-letter abbreviations. I knew that this would need some tampering (via the Feeds Tamper) module to convert it so that it would import cleanly.
  • The "Schedule" field for some records is longer than 255 characters. This means that we'll have to use a "Long text" field type in the content type to handle the data in this field.
  • The data also included 20 "category"-type fields indicating the types of goods available at each farmer's market (eggs, cheese, soap, trees, etc...) Ideally, each of these 20 fields should be mapped to a single Drupal vocabulary. This would require a custom Feeds Tamper plugin.
  • The "lastUpdated" field mostly contains well-formatted dates, but because there are some records where the data is not well-formatted ("2009" instead of "mm/dd/yyyy") and it is just informational, its probably best just to use a text field in the content type for this data.

Once I was satisfied that the data was clean and I had a good understanding of it, I saved it as a .csv file and moved onto getting Drupal ready to import it.

Setting Up the Basic Site

As with most of the sites DrupalEasy builds, we started out with our own custom Drush make file that automatically downloads a bunch of standardish modules we use on every site, as well as our own custom installation profile that does some initial site configuration (turning off the Overlay, enabling the Administration Menu module, etc.). This enables us to get a basic site up-and-running in just a few minutes.

Next, we need to download and enable the modules that we're going to need:

  • Geofield - be sure to use a version of the 7.x-2.x branch dated later than 2013-Apr-07

Depending on whether or not you start with our custom make file, there may be other modules that are dependencies of the ones listed above that will also need to be downloaded and enabled.

If you use Drush, the following command will enable all the necessary modules:

drush en addressfield feeds_ui feeds_tamper_ui geofield 
geofield_map job_scheduler feeds geophp rules_admin openlayers_ui

Creating the Farmers Market Content Type

Once the site is up-and-running, the first step is to set up something for all of the data to be imported into. In this case, hopefully it is obvious that we need to create a new content type with fields that roughly match the fields in our source file. By creating a node for each Farmers Market, once imported we can leverage all of the tools in the Drupal universe to interact with them as we build out the site.

Create a new content type (admin/structure/types/add) with the following properties (throughout this post, any properties/attributes/settings not specifically mentioned can be left at their default values):

  • Name: Farmers Market
  • Disable "Promoted to front page"
  • Disable "Main menu" from "Available menus"

Moving on to the fields (admin/structure/types/manage/farmers-market/fields):

  • Delete the "Body" field
  • Add "Address" field: type=Postal Address, Available countries=United States, enable "Hide the country when only one is available"
  • Add "Lat/Long" field: type=Geofield, Widget=Latitude/Longitude
  • Add "URL" field: type=Link, Link Title=No Title
  • Add "Location details" field: type=Text
  • Add "Schedule" field: type=Long text
  • Add "Last updated" field: type=Text

Farmers Markets content type

One thing to note is that once the import is complete, we're going to go ahead and enable the Geocoder module so that any Drupal-side address updates to any Farmers Market nodes will automatically be updated with the proper latitude/longitude coordinates. We don't want to enable this functionality prior to import, otherwise the module will attempt to geocode each address from the source file during import. This is completely unnecessary since the source file already includes latitude/longitude data. Plus, Google Geocoder limits non-paid users to 2,500 requests per day - unless you pay for more.

Creating the Importer

At this point, we have the source data (the .csv file) and the destination (the "Farmers Market" content type). The next step is to create the mechanism that will actually transfer the data from the source to the destination. We'll use the Feeds module to do this. The Feeds module is designed to take data from a variety of sources (most commonly RSS feeds and .csv files) and map it to Drupal entities (usually nodes, but not always).

Add a new importer (admin/structure/feeds) named "Farmers Markets Importer". The "Edit" page for importers has 4 major sections. Let's look at each one in detail.

Basic settings

This section consists of the general configuration of the importer. For this project, use the following settings:

  • Attach to content type = Use standalone form. Note that this isn't referring to where the content is going to go; it is referring to the importer itself. In our case, since we only have a single data source, a standalone form is fine. If we were planning on importing data from multiple sources, a custom importer content type might be necessary.
  • Periodic import = Off. This setting is primarily used for automatically checking a feed for new data. For a one-time .csv import, it is not necessary.
  • Import on submission = Enabled. This triggers the import to start whenever a new .csv file is uploaded on the main import (/import) page.

Fetcher

This section sets the mechanism that actually interacts with the source data. We need to change the Fetcher from "HTTP Fetcher" (commonly used for RSS feeds) to "File upload". Looking at the settings for the "File upload" fetcher, all the default values are fine, so no changes are necessary.

Parser

This section sets the process that will be used to parse the source data into a format that the "Processor" (next step) can understand. In our case, we need to change the Parser from "Common Syndication parser" to "CSV parser". Again, the default settings for the "CSV parser" are fine as-is. It is interesting to note that the Feeds module is easily extensible. Custom Fetchers, Parsers, and Processors can be written to handle virtually any type of incoming data.

Processor

This final section is where the parsed source data is mapped to the proper place in Drupal. In our case, the default "Node processor" is what we want (since we're mapping the data into our new "Farmers Market" content type). The settings for the Node processor are as follows:

  • Update existing nodes = Update existing nodes (slower than replacing them). This is a big win for us, and is only possible because of the FMID field in the source data. This means that when an updated source dataset is available (assuming the field structure hasn't changed), we can simply re-run our importer on the new file and update only the records that have changed. In other words, we won't have to delete (losing any user comments and Drupal-side updates) and re-import Farmers Market nodes. This will allow us to keep the site up-to-date with a minimum of work.
  • Content type = Farmers Market. This is where we tell the importer that we're going to be generating nodes of the Farmers Market content type with the imported records. This is the first link between the source and destination.
  • Author = admin (or any user on your site). It's fine to leave it as "anonymous", but I'd rather have my nodes "owned" by an actual user.

The final (and most tedious) step is to set up the mapping of fields between the source and destination. In other words, data from each source field needs to know which destination field it will go into. It is important to note here that the source field names must be entered exactly as they appear in the source data file. The mappings for this importer are (Source = Target):

  • MarketName = Title
  • FMID = GUID - set this field to "Unique" then be sure to click to "Save" the mapping.
  • Street = Address: Thoroughfare
  • City = Address: Locality
  • State = Address: Administrative area
  • Zip = Address: Postal code
  • x = Lat/Long Longitude
  • y = Lat/Long Latitude
  • Website = URL: URL
  • Location = Location details
  • Schedule = Schedule
  • updateTime = Last updated

Be sure to double-check that the "FMID" field is set to unique!

Farmers Markets content type

Massaging the Data

As I indicated in the "Farmers Markets Source Data" section above, there are a couple of things we need to do in order to get the dataset to import cleanly: set the default country ("United States") and translate the full state name to the 2-letter abbreviation ("New York" to "NY", for example).

Setting the Default Country

Setting the default country field for every record on import is actually a fairly simple operation to set up - assuming you're aware that the Feeds module exposes a "Before saving an item imported via [importer name]" event for each Feeds importer. This allows us to step into the middle of the import process and set a data value as we wish.

From the main Rules configuration page (admin/config/workflow/rules), add a new rule named "Add default country for imported markets" that reacts on the "Before saving an item imported via Farmers Markets Importer" event.

Next, add an "Entity has field" condition with the Data selector=node and Field=field_address. This ensures that the country field exists (it is part of field_address) and (more importantly) is available for us to set its value in the next step.

Finally, add an "Set a data value" action with a Data selector=node:field-address:country and a Value=United States. Click to save everything it's done!

Add default country for imported markets rule
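If you'd rather handle this in code than in the Rules UI, the same default can be set from hook_feeds_presave(), which Feeds also provides. A hedged sketch (the module name is hypothetical, the importer id assumes the machine name of the importer created above, and Address Field stores the two-letter country code):

/**
 * Implements hook_feeds_presave().
 */
function mymodule_feeds_presave(FeedsSource $source, $entity, $item) {
  // Only act on the Farmers Markets importer.
  if ($source->importer->id == 'farmers_markets_importer' && isset($entity->field_address[LANGUAGE_NONE][0])) {
    // Address Field stores the ISO code, so 'US' rather than 'United States'.
    $entity->field_address[LANGUAGE_NONE][0]['country'] = 'US';
  }
}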

Translating the State Names

The second data issue that we need to address during data import is the "State" field. We need a mechanism that automatically translates full state names into their 2-letter abbreviations. I turned to the Feeds Tamper module for this, as it is relatively straightforward for a developer to create a custom plugin that can be assigned to any field via the Feeds Tamper interface. The source data is then run through the plugin code to make any necessary changes. No such plugin existed for this application, so one had to be written - I have contributed it back to the community, but the module author has not acted on it as of April, 2013.

If you're not familiar with applying patches, feel free to download the state_to_abbrev_inc.zip file, uncompress it, and place it in your feeds_tamper/plugins directory.
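For reference, a Feeds Tamper plugin of this kind is just a small .inc file. The following is a rough sketch of the structure using the Feeds Tamper 7.x plugin conventions; the plugin and function names are illustrative and the lookup table is abbreviated, so treat it as a shape rather than a drop-in replacement for the contributed file above.

// feeds_tamper/plugins/state_to_abbrev.inc (sketch)
$plugin = array(
  'form' => 'state_to_abbrev_form',
  'callback' => 'state_to_abbrev_callback',
  'name' => 'Full U.S. state name to abbrev',
  'multi' => 'loop',
  'category' => 'Text',
);

/**
 * Settings form - this plugin needs no settings.
 */
function state_to_abbrev_form($importer, $element_key, $settings) {
  return array();
}

/**
 * Replaces a full state name with its two-letter abbreviation.
 */
function state_to_abbrev_callback($result, $item_key, $element_key, &$field, $settings, $source) {
  // Abbreviated lookup table; a real plugin would cover all states.
  $states = array('new york' => 'NY', 'california' => 'CA', 'texas' => 'TX');
  $key = drupal_strtolower(trim($field));
  if (isset($states[$key])) {
    $field = $states[$key];
  }
}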

Once the plugin is installed, it needs to be assigned to the "State" field. This is done by clicking the "Tamper" link for our importer from the main "Feed importers" page (admin/structure/feeds). Then, add the "Full U.S. state name to abbrev" plugin to the "State -> Address: Administrative area" field.

Tamper with the State field

Import!

At this point, everything is ready to proceed with the import. Navigate to the main "Import" page (/import) via the Navigation menu and click the "Farmers Markets Importer". Select the file to upload and click to "Import".

I like to test things with a small version of the main source file - one with only a handful of records. This is helpful in making sure everything is being imported correctly without having to wait for all 7,000+ records to be processed. I check things by inspecting a few of the newly created Farmers Market nodes, ensuring fields are populated as expected. If I'm satisfied, then I go ahead and run the import with the full data set.

On my particular local machine, importing all 7,000+ records took about 6 minutes.

Setting Up the Proximity Search

One of the features that really makes location-based content useful is proximity searches: being able to allow the user to "show me all the things near a particular location". For this example, we're going to use the built-in proximity functionality of the 7.x-2.x version of the Geofield module. We'll create a view that exposes a proximity filter that incorporates geocoding by allowing the user to enter any valid location data into the "origin" textfield. That is, the user can query for farmers markets within 10 miles of "Sacramento, CA", "06103", or "1600 Pennsylvania Ave, Washington, DC" - any text that the active Geocoder (usually Google Geocoder) can parse.

Exposed Views proximity filter

Once the import is complete, enable the Geocoder module. Then, create a new view named "Proximity search". On the initial views wizard page, "Show Content of type Farmers Market" (sorting doesn't matter yet). Create a page with a "Display format" of "Geofield Map". Set the "Items to display" to 100 (just to make sure we never overwhelm the map with points), and disable the pager. On the main views interface:

  • Click to add the "Content: Lat/Long" field - exclude from display, Formatter=Latitude/Longitude
  • Click to add the "Content: Lat/Long - proximity" field - exclude from display, Source of Origin Point=Exposed Geofield Proximity Filter, Unit of Measure=Miles
  • Click to add the "Content: Lat/Long - proximity" filter - Expose this filter to visitors, enable "Remember the last selection", Operator=Is less than or equal to, Source of Origin Point=Geocoded Location. Be sure to add a default value for the exposed filter ("10 miles from New York, NY") or you may see a nasty little bug (http://drupal.org/node/1871510)
  • Remove the "Content: Post date" sort criteria
  • Edit the settings of the Geofield Map format: Data source=Lat/Long, Popup Text=title

Once complete, save the view, then navigate to the /proximity-search (or whatever URL you set for the page display of the view) and give it a whirl!

Proximity search

Pimping the Display (and Functionality) of Farmers Market Nodes

At this point, if you click on a Farmers Market pin, then click through to a particular Farmers Market node, the display of the node is less-than-impressive.

Initial Farmers Market node display

With just a little bit of effort, this can be greatly improved. We'll rearrange the order of fields, tweak the display a little bit, add a map, and incorporate Geocoder functionality for address updates.

OpenLayers Map

To keep things interesting, we're going to use the OpenLayers module for the map display on the individual Farmers Market nodes. First, we'll need to edit the OpenLayers map that we're going to utilize. Go to the main OpenLayers "Maps" page (admin/structure/openlayers/maps), and click to edit the geofield_formatter_map (the description of the map should explain why we're using this one - it is designed to handle the display of Geofield output). There are lots of available settings for each map, but we'll only make a few small configuration changes:

  • Basics section: Width=auto, Height=250px
  • Layers and Styles section: only the "OSM Mapnik" layer should be enabled and set to default, set the styles for "Placeholder for Geofield Formatter" to "Marker Black Small"
  • Behaviors section: Point Zoom Level=14

Once the map is configured, we can utilize it on the Lat/Long field of our Farmers Market content type. Go to the "Manage Display" page (admin/structure/types/manage/farmers_market/display) and change the format of the "Lat/Long" field to OpenLayers. Click to save and test.

Reordering the Display Fields

While we're on the "Manage Display" page of the Farmers Market content type, rearrange the fields as follows:

  1. Lat/Long: Label=Hidden
  2. Location details: Label=Hidden
  3. Address: Label=Hidden
  4. URL: Label=Hidden
  5. Schedule: Label=Inline
  6. Last updated: Label=Inline

With these changes, things improve quite a bit.

Pimped Farmers Market node display

Enabling the Geocoder for Address Updates

Finally, now that all the data is imported, we can go back and modify the Lat/Long field to automatically be updated by the Geocoder module whenever the node is updated (in case the address changes). From the "Manage Fields" page for our content type (admin/structure/types/manage/farmers_market/fields), click the "Latitude/Longitude" widget for the "Lat/Long" field, change the widget to be "Geocode from another field", then continue to click to edit the field configuration and ensure the "Geocode from field" option is set to "Address". Click to save.

Are We Done?

At this point, we have a fully functional site where users can search for farmers markets near them, then click to view the details on ones that interest them. Since the farmers markets are nodes, we can leverage all the great modules available from Drupal.org to further extend and enhance the site.

With just a few additional modules, a contributed (responsive) theme with just a few extra lines of CSS, and some publicly available imagery, it's quite simple to produce a usable site - just like FarmersMarketsNow.com!

Extra Credit - Utilizing the Source Dataset's Category Fields

Still reading? Congrats - you're in it for the long haul. Wondering how we can leverage the category data from the source file? Here are the steps:

  1. Before creating the Farmers Market content type, create a new "Categories" vocabulary.
  2. Add a "Categories" term reference field to the Farmers Market content type and set the "Number of values" to "Unlimited".
  3. Install the taxonomy_inc.zip Feeds Tamper plugin in the feeds_tamper/plugins directory. This is a custom plugin that is specific to this particular source file. It takes data from all the category-type fields in the source file and imports them into the new "Categories" vocabulary.
  4. Add a new mapping field to the importer. The way the Feeds Tamper plugin was created, only one needs to be added. Use "Bakedgoods = Categories".
  5. Utilize the custom "Taxonomy Y/N" Feeds Tamper plugin on the "Bakedgoods -> Categories" field.

Rerun the import and see the magic! Note that the extra processing for the categories really slows down the import quite a bit. I'm sure that there are other ways of importing the category-type fields to a single vocabulary, let me know in the comments if you know of an easier method.

Mar 15 2013
Mar 15

Rather remarkably we’ve managed to avoid the top xxx module list for Drupal 7… however to recap the presentation yesterday at ACCSVa.org here it goes….

A Drupal Roadmap with Rich Gregory – Look into some critical dev tools like Drush and other things to get you going.

1. Display Suite (putting Views at the top is almost redundant….) – thanks to Display Suite and its buddy Field Group, Drupal 7's core content creation kit is a flexible dashboard delivery tool. With a few clicks you can now turn a lengthy and unintuitive form into a dashboard – I'm seeing hope for a WordPress-like content adding area.

Forms before Display Suite and Field Group

and after DS + FG

2. Views – it should go without saying, and now that it’s going to be a part of Drupal 8 core I’m going to leave it at that… you need views to do anything worth doing.  We’ve got a half dozen or more tutorials on views here, so dig in.

3. Context – this is your logic layout tool – pick conditions and reactions. There are numerous modules to extend context as well – in the presentation I mentioned http://drupal.org/project/context_reaction_theme however this only has a D6 option. You'll probably need to use http://drupal.org/project/context_addassets to do essentially the same thing. Also note that Mobile Tools allows you to use contexts to do dramatic theming changes based on the mobile device.

First up choose the conditions for your layout

Then choose your reactions

4.Rules: Rules allows your site to become a dynamic workflow management intranet style workhorse. The amount of flexibility here, much like Views, extends beyond the scope of a simple “short stack” review, however in essence you’re taking events that happen within the site, or custom crontab events, setting conditions and triggering actions. Coupled with modules like Views Rules the possibilities are amazing.

5. Entity reference - extending the field system (CCK is part of Drupal 7 core) as the up-and-coming successor to References. Allow content to reference other content, and as mentioned this allows View Relationships to create a SQL JOIN on your content – get more information about your Content Author, and many more options… this post here is particularly fun with references referencing references…

6. Honorable mention: Feeds – this is the bulk content migration tool of choice for folks like myself. It's intuitive and lets you harvest content from various sources and ingest it into your content types, user tables, etc… we have a few tutorials on Feeds that may help you with some specifics – it's a powerful tool, and coupled with tools like Feeds Tamper there are a lot of options.

7. Honorable mention: Flag.  Give your users buttons and they’ll push them.  Flags allow your users to have simple on/off buttons – categorize content, flag spam, etc…  they of course work with views, rules, and the rest of the gang :)

So there’s my short stack for Drupal 7 – I’m sure entities and entity forms probably belong on there, however for most basic sites I think this is a good start… heck probably need to talk wysiwyg editors too…. so many modules!  Thanks again to ACCSVA.org for the conference, Rich Gregory for the great tunes and the lift, and  Blue Cat Networks – the hat is bangin.

Dec 05 2012
Dec 05

Feeds SQL does its job so well it's rather mindblowing. As the name suggests, it allows Feeds to use a SQL query as its data source. The mapping and all the rest of it works as one might expect, and that's what is blowing my mind right now. After installing the module:

SQL Fetcher

Firstly, set your fetcher to the SQL fetcher… I only have one database in my settings file for now, but that's sure to change.

Secondly, choose the SQL parser… I guess this means that you might be able to upload a file into the fetcher and then parse it… for this example we're going SQL Fetch and SQL Parse though.

Choose the SQL Parser

And now the magic begins… in the settings you can test your SQL fetch by putting in a select statement. For this example I made a dummy table with some random stuff in it.
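For context, the fetcher pulls from a database connection defined in settings.php, so pointing it at a legacy database is just a matter of declaring a second connection there. The connection key, credentials and table below are made up for illustration; adjust them to whatever your setup actually uses.

// settings.php - declare a second database connection for the fetcher to use.
$databases['legacy']['default'] = array(
  'driver'   => 'mysql',
  'database' => 'old_site',
  'username' => 'dbuser',
  'password' => 'dbpass',
  'host'     => 'localhost',
);

// A test query of the sort you would paste into the fetcher settings:
// SELECT id, title, body FROM dummy_table;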

SQL Parsing

and now comes the wow!

It's data!

And more wow!

Your columns are ready to be mapped!

So pretty – and just like that all those whacked databases you were trying to decide how to merge into Drupal are available with Feeds…. yum yum yummy. Mad props to vkareh for the dev on this piece… absolutely awesome.

Nov 05 2012
Nov 05

How to Tidy URLs and Relative Links When Moving From Dev to Go-Live (for Drupal 6 and 7)

Few things are as annoying as building something that works perfectly when you create it, but fails when you take it out of the lab. That's how site owners can often feel when content editors create piles and piles of Drupal nodes full of relative URLs in images and links. They look fine on the site, but if the content is syndicated via RSS or Atom, sent out in an email, or otherwise repurposed in another location, the links break. Even worse, hand-made links and images entered while the site is under development can easily point to the outdated "beta" URL. Who can save the day? Pathologic module, that's who.

Pathologic module's configuration options

Pathologic is an input filter -- to install it, you drop the module into your Drupal site and add it to one of your text formats -- Full HTML and Filtered HTML, for example. Whenever content is posted in a format configured to use Pathologic, it will scan the content for URLs and tidy them up. Relative URLs like /node/1 get turned into absolute ones like http://example.com/node/1, URLs pointing to alternative versions of your site like dev.example.com are replaced with your public URL, and so on.

Pathologic can also standardize the protocol of links inside your site's content. If users edit content over a secure connection, for example, it's easy to mix links using the http:// and https:// protocols -- something that can lead to annoying warnings on some users' machines. For developers with exacting URL-correcting needs, it also supports custom URL modification hooks. Using those hooks, your site's custom fixes (replacing MP3 links with a URL on a different server, for example) can piggyback on Pathologic's configuration and logic.

Pathologic is an efficient workhorse of a module that solves an annoying problem efficiently. If you've run into problems with relative links and staging-server URLs breaking links and images on your RSS feeds, you owe it to yourself to check it out!

Aug 16 2012
Aug 16

Migrate

Posted on: Thursday, August 16th 2012 by Brandon Tate

Time and time again, I’ve had to import content into a Drupal site. In the past I would rely on Feeds to do the job for me. However, over time, I found Feeds to come up short when it came to complex imports. So, I went looking elsewhere for a better solution and came across Migrate and haven’t looked back since. I’ve developed four import solutions for different sites and have only come across one problem with Migrate which I’ll get to later. First, let's get into the great features of Migrate.

I'll start by creating a Film migration example. This will be extending an abstract class provided by Migrate named Migration - in most cases, you'll do nearly all your work in extending this class. The Migration class handles the details of performing the migration for you - iterating over the source, creating destination objects, and keeping track of the relationships between them.
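One housekeeping step the snippets below assume: Migrate has to be told that this class exists. Here is a minimal registration sketch, assuming Migrate 2.x and a hypothetical module named film_migrate (older 2.x releases auto-register classes and only need the 'api' key):

/**
 * Implements hook_migrate_api().
 *
 * film_migrate is a hypothetical module name; on Migrate 2.4+ the
 * 'migrations' key registers the class, earlier releases only need 'api'.
 */
function film_migrate_migrate_api() {
  return array(
    'api' => 2,
    'migrations' => array(
      'FilmExample' => array('class_name' => 'FilmExampleMigration'),
    ),
  );
}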

Sources

First, let's start with defining the source data and where it's coming from. In all cases, I've had to import CSV or tab-delimited files, so my import source has been MigrateSourceCSV. This tells Migrate that your import source is a CSV or tab file. When importing tab files, though, you need to specify the delimiter as "\t" in the options array. As well, all files I've imported include a header row; within the options array you can specify this so that the header row is not imported. Next, we want to specify an array describing the file's columns, where keys are integers (or may be omitted) and values are an array of field name then description. However, if your file contains a header row this may be left empty (I've provided this array below). Migrate can also import from SQL, Oracle, MSSQL, JSON and XML. More Info on sources.

class FilmExampleMigration extends Migration {
  public function __construct() {
    parent::__construct();
    $options = array();
    $options['header_rows'] = 1;
    $options['delimiter'] = "\t";
    $columns[0] = array('Film_ID', 'Film ID');
    $columns[1] = array('Film_Body', 'Film Body');
    $columns[2] = array('Origin', 'Film Origin');
    $columns[3] = array('Producer_ID', 'Film Producer');
    $columns[4] = array('Actor_ID', 'Film Actors');
    $columns[5] = array('Image', 'Film Image');
    $this->source = new MigrateSourceCSV(variable_get('file_private_path', conf_path() . '/files-private') . '/import_files/example.tab', $columns, $options);

Destinations

Next, we will need to define the destination. Migrate can import into Nodes, Taxonomy Terms, Users, Files, Roles, Comments and Tables. As well, there is another module called Migrate Extras which extends the stock ability of Migrate. I personally have implemented Nodes, Terms and Tables so I will discuss those here. Migrate to a node is done by including MigrateDestinationNode with the node type as the parameter. Terms are defined by including MigrateDestinationTerm with the vocabulary name as the parameter. Lastly, Tables is defined by MigrateDestinationTable where the table name is the parameter. Migrating into a table also requires that the table is already created within your database. More Info on destinations.

//term
$this->destination = new MigrateDestinationTerm('category');

//node
$this->destination = new MigrateDestinationNode('film');

//table
$this->destination = new MigrateDestinationTable('film_category_reference_table');

Migrate Map

With Migrate, the MigrateMap class tracks the relationship of source records to their destination records as well as messages generated during the migration. Your source can have a single map key or multi key. Below is an example of both.

//single key
$this->map = new MigrateSQLMap(
  $this->machineName,
  array(
    'Film_ID' => array(
      'type' => 'varchar',
      'length' => 255,
      'not null' => TRUE,
    ),
  ),
  MigrateDestinationNode::getKeySchema()
);

//multi key - importing to a table 
$this->map = new MigrateSQLMap(
    $this->machineName,
      array(
        'Film_ID' => array(
           'type' => 'varchar',
           'length' => 255,
           'not null' => TRUE,
         ),
        'Origin_ID' => array(
          'type' => 'varchar',
          'length' => 255,
          'not null' => TRUE,
        )            
    ),
    MigrateDestinationTable::getKeySchema('viff_guest_program_xref')
);

Dependencies

Some imports depend on other imports, so Migrate handles this by including the ability to define hard or soft dependencies. A hard dependency prevents a child import from running until the parent import has run successfully - that is, the parent import must complete without errors before the child import can run. A soft dependency ensures that the parent import has run before the child, but does allow errors to have occurred in the parent.

//hard dependencies
$this->dependencies = array('Producers', 'Actors');
//soft dependencies
$this->softDependencies = array('Producers', 'Actors');

Field Mappings

Next up, we need to associate source fields to destination fields using what Migrate calls field mappings. You can read up on Migrate’s mappings here. In a simple case, usually text or an integer is provided and it is easily mapped into Drupal using the following line:

$this->addFieldMapping('field_body', 'Film_Body');
$this->addFieldMapping('field_id', 'Film_ID');

Field mappings also allow you to provide default values, such as:

//default image
$this->addFieldMapping('field_image', 'Film_Image')->defaultValue('public://default_images/default_film.png');
//default body text to full html
$this->addFieldMapping('field_body', 'Film_Body')->defaultValue('full_html');

In some cases you need to reference one node from another. This can easily be done using the sourceMigration function. In this example, the ID provided in Producer_ID is associated with the Drupal entity that will be created, using the entity ID that was created in the MigrateProducer migration.

$this->addFieldMapping('field_producer', 'Producer_ID')->sourceMigration('MigrateProducer');

Another useful ability is to explode multiple values into a single field. Imagine a Film has multiple actors and the data is defined as “value1::value2::value3”. We would handle this use case using the following:

$this->addFieldMapping('field_actors', 'Actors')->separator('::')->sourceMigration('MigrateActor');

Taxonomy values are commonly used in migrations, but in most cases the client's CSV file does not contain the term ID needed; instead the term name is provided. In this case, we need to tell the field mapping that the name is being provided instead of the ID:

$this->addFieldMapping('field_genre', 'Genre')->arguments(array('source_type' => 'term'));

Migrate Hooks

In most cases Migrate's field mappings can handle all situations of getting the data into the system. However, sometimes you need access to the row being imported or the entity being created. Migrate gives you three functions for this: prepareRow($row), prepare(&$node, $row) and complete(&$node, $row). prepareRow() allows you to manipulate the row before it is imported into the system; you can modify row attributes or even append more row columns as needed. prepare() allows you to modify the node before it gets saved. complete() is essentially the same as prepare() but is fired at the end of the row import process.

//prepare example
function prepare(&$node, $row) {
  // Concatenate date + time.
  $start_time_str  = $row->Event_Date .' '. $row->Event_Time;
  $start_timestamp = strtotime($start_time_str);
  $start_time      = date('Y-m-d\TG:i:s', $start_timestamp);

  $end_time_str  = $row->Event_Date_End .' '. $row->Event_End_Time;
  $end_timestamp = strtotime($end_time_str);
  $end_time      = date('Y-m-d\TG:i:s', $end_timestamp);

  $node->field_event_starttime[LANGUAGE_NONE][0]['value'] = $start_time;
  $node->field_event_endtime[LANGUAGE_NONE][0]['value']   = $end_time;
}

//prepareRow example
public function prepareRow($row) {
  // Prepend the image location onto filenames.
  $prefix = 'public://import/images/';
  $row->Image = $prefix . $row->Image;
}

Migrate Drush Commands

Last but not least, I wanted to touch on Drush. Luckily for us command line lovers, Migrate has implemented a bunch of useful Drush commands that allow us to import, rollback and do just about anything the Migrate UI can do. Here is the list:

  • migrate-audit (ma) – View information on problems in a migration.
  • migrate-deregister – Remove all tracking of a migration.
  • migrate-fields-destination (mfd) – List the fields available for mapping in a destination.
  • migrate-fields-source (mfs) – List the fields available for mapping from a source.
  • migrate-import (mi) – Perform one or more migration processes.
  • migrate-mappings – View information on all field mappings in a migration.
  • migrate-reset-status – Reset an active migration's status to idle.
  • migrate-rollback (mr) – Roll back the destination objects from a given migration.
  • migrate-status (ms) – List all migrations with current status.
  • migrate-stop (mst) – Stop an active migration operation.
  • migrate-wipe (mw) – Delete all nodes from specified content types.
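For what it's worth, a typical cycle with these commands looks something like the following, assuming the migration from the example above is registered under the machine name FilmExample:

drush migrate-status
drush migrate-import FilmExample
drush migrate-rollback FilmExample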

I mentioned earlier that there is only one problem I've seen with Migrate, and it relates to CSV and tab files. Migrate expects the files to be cumulative - that is, each file should contain the complete data set, which is what allows Migrate to roll back and re-import at a whim. However, if the import file contains only the new data you wish to import, and not the old data that has already been imported, you lose the rollback ability because the mappings are lost. As well, the Migrate UI becomes pretty confusing, since none of the total, imported and unimported row counts make sense - the IDs don't relate to data stored in Migrate's mapping tables. This is the only issue I've come across with Migrate, but I still prefer it over other options.

Overall, I'm very impressed with Migrate and its wide range of sources, destinations, field mappings and hooks that allow you to ensure the data will make it into your Drupal site no matter what the situation. Feel free to comment and offer suggestions on your experiences with Migrate. For more information on Migrate, check the documentation. Also, I've attached the Film example migration file here.

Aug 08 2012
Aug 08

Feeds is a lovely utility that has blossomed greatly in Drupal 7. As with Drupal 6, both Feeds and Feeds Tamper are Features compatible, and now even more versatile with add-ons like Feeds SQL. There are several caveats to getting Feeds happy when working with CSV formats. I'm coming from a Mac environment, often mapping Excel spreadsheets over into Drupal. These are my ongoing notes; I'm doing a lot in Feeds and will keep this post as a list of my remarkably consistent failures. (NB: if you're in D6, Node Import just seems way more forgiving on all these points, just not nearly as "Drupal Hep")

Remember to get your CSV UTF-8 compatible – I like Text Wrangler for this – especially when coming from Micro$oft Excel, your content may have some odd-ball encoding. So if saving as Windows, Unix, or MS-DOS CSV fails on you, just open up your CSV file in there and resave.

CSV fix in textwrangler

saving csv from excel for feeds

Caveats:  Another common quandary with Feeds is using it with fields outside of core CCK or Entities. While it’ll map to CCK Select Other it has a tough time, especially with multiple values.  Even with the values put in the drop-down selection area it always mapped to “Other” and then stuck in the value.

Unlike in Drupal 6 this field will throw errors if indexed, doesn’t seem to work with the Facet API, or be good for much of nothin’. Perhaps Select Or will work better, however it seems to have view limitations too.  So while none of this reflects on Feeds it’s worth noting because you may end up needing some UI tidbits… it may map to them, but it’s

Mar 19 2012
Mar 19

Note to all: Feeds doesn't seem to connect with user profiles (http://drupal.org/node/1060230) – stay tuned for the fix in part 2 (hint: use User Import also).

Organic groups is a great way to organize groups and cascade permissions – in this example we tested against 45,000 users in 3 different roles going in to 500 groups.

While Node Import did the heavy lifting for the group node creation, Feeds came out ahead for user imports. Node Import just has a few more settings for group node creation than Feeds does.

This project came about as a conversion from an unmaintainable Drupal 6 site that we converted into off-the-shelf Drupal.

Modules used:

Group node id's and role assignments in a spreadsheet

Feeds Tamper to assign multiple groups from a single field

feeds tamper for role id

Something to remember with Feeds – it can get a bit wonky over file line endings and encoding – more so than Node Import from what I've seen. These are the settings that have worked for me coming from Mac Excel -> open in Text Wrangler and resave with Windows CR/LF and UTF-8 encoding.

Feeds Settings

The settings are pretty straightforward.

A few things to remember before you upload:

  • if you plan on making node aliases do that before you upload – views bulk operations is slower than molasses updating url aliases.  Same goes for auto node titles if you’re using them.
  • In general every node reference or user reference you have is going to slow something up
text wrangler settings for feed import

Feed mappings - remember to have a unique id in there

Organic groups access permissions settings

Modules tested: although a lot of modules were tested, not all really made the cut. Some because they didn't really seem useful in our case. Others (Organic Subgroups) because I really couldn't get a response on IRC or the forums about how to ingest them en masse – if anyone has done this, please feel free to chime in.

  • User Import – it only allows you to map your users to a single group.  Thankfully feeds tamper allowed us to map multiple groups per user and group post
  • User Import for Organic Groups – these two seemed promising, however UI4OG only allows the users to be mapped to one group… no good for our use case
  • OG Subgroups – ok – wanted to see this in action, never really got it working at least not through the import interface- might make a good use case for using the migration modules
  • Node Access by User Reference - I really like a lot of NodeOne's ideas, and Johan Falk in particular comes up with some very clever ways of re-arranging Drupal modules – so I thought I might give this intuitive technique a go – the caveats mentioned in his post indeed became insurmountable – with the need to have 16 user references in each group post, my virtual machine came grinding to a halt.
Dec 17 2011
Dec 17

Consuming JSON feed using Feeds and JSONPath Parser

Posted on: Saturday, December 17th 2011 by Rexx Llabore

One of the best aspects of Drupal is its interoperability. Drupal can be integrated with almost all systems (CMS/ECM/DM) out there. There are many ways to import content from other systems into Drupal; how you implement it will depend on several factors (budget, time constraints, 3rd party system constraints). If you need to consume a JSON feed and convert it into nodes, try using the Feeds module in conjunction with the feeds_jsonpath_parser module. Use Feeds to consume data from a given URL and feeds_jsonpath_parser to traverse the JSON data and map elements to CCK fields. JSONPath lets you traverse a JSON string the way XPath is used to traverse XML content.

Here is a sample use case that we will solve using Feeds/JSONPath Parser:

We need to extract the list of employee information from a given system. We have created an employee content type that will encapsulate employee information that we will consume from a 3rd party system.

The employee content type will contain the following fields:
Last name (field_employee_lname)
First name (field_employee_fname)
Employee ID (field_employee_emp_id)
Position (field_employee_position)

Here is the sample JSON string that we will consume:

{
  "companies": {
    "company_a": {
      "employees": [
        {"position": "Developer", "first_name": "Bob", "last_name": "Williams", "employee_id": 1},
        {"position": "Developer", "first_name": "Dave", "last_name": "Ali", "employee_id": 2},
        {"position": "Developer", "first_name": "Jon", "last_name": "Davis", "employee_id": 3}
      ]
    }
  }
}

Here is the list of modules used:
feeds (and its dependencies)
feeds_import
feeds_ui
feeds_jsonpath_parser

Here are the steps in order to consume that JSON feed:

1) Navigate to admin/build/feeds/create
2) Enter importer name and description.
3) Click create. NOTE: You will be redirected to the importer configuration page.
4) Modify the following basic settings and click save:
- Attach to content type setting: Feed Item
- Minimum refresh period: 1 day
5) Make sure the Fetcher is set to HTTP Fetcher.
6) Change Parser to JSONPath parser.
7) Change Processor to Node processor.
8) Modify the following node processor settings and click save.
- Content Type: employee
- Author: admin
- Authorize: checked
- Expire: never
- Update existing nodes: replace existing nodes
9) Map the following JSONPath expressions with the appropriate employee content type fields.
10) Modify the JSONPath parser settings and click save:
- Context: $.companies.company_a.employees[*]
- title(node title): last_name
- field_employee_lname: last_name
- field_employee_fname: first_name
- field_employee_emp_id: employee_id
- guid: employee_id

NOTE: Modifying JSONPath settings tells Feeds how to traverse the JSON string and retrieve specific data that Drupal needs to build an employee content.
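To make the traversal concrete: the context expression selects the employees array from the sample JSON above, and each field expression (last_name, first_name, employee_id) is then evaluated against one employee object at a time. Roughly:

$.companies.company_a.employees[*]  =>
  {"position": "Developer", "first_name": "Bob", "last_name": "Williams", "employee_id": 1}
  {"position": "Developer", "first_name": "Dave", "last_name": "Ali", "employee_id": 2}
  {"position": "Developer", "first_name": "Jon", "last_name": "Davis", "employee_id": 3}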

11) Navigate to admin/build/feeds
12) Create a feed item node that will be responsible for fetching the JSON content by clicking Feed Item (under “Attached to”).
13) Enter node title and feed URL.

NOTE: When cron runs, feeds will look for the feed item node that you created and will consume the JSON content from the specified URL. You can also manually consume the JSON content by navigating to the node page of the feed item, clicking the import tab and then clicking import.

Apr 25 2011
Apr 25

When a client asks for a way to pull content onto a site through RSS, the obvious choice is to use Drupal's Feeds module. I've never been really in love with this module but it does the job well. We recently had an interesting case that required extending the normal functionality of feeds to interact with custom content types.

The scenario was as follows:

  • The site lists many organizations and the events of those organizations.
  • Site users connect with and follow events and news from the various organizations.
  • Each organization may run several websites each having its own set of RSS feeds.
  • The organizations desired the ability to have their content added to their pages dynamically and without effort.
  • The user would visit the organization's page and see a list of news and articles from one or more of that organization's feeds.

The idea is fairly straight forward. But feeds does not support this kind of association by default.

If you're not familiar with feeds here's a brief rundown of how it works:

  1. Feeds allows you to designate a content type to be the source of a feed, or it will create a feed content type for you.
  2. You then create new nodes of this content type, adding the URL of the feed to be imported to each node.
  3. At designated times, the feed importer will be used to fetch information from the designated RSS URL. The data will be added to Drupal as nodes. You can choose what kind of node you would like the imported data to be.

This was a Drupal 7 site and our idea was to use references to reference each feed importer to the organization in question. For example the feed importers for developer.apple.com would reference the organization node for Apple as would the feed importer for news.apple.com while the feed importers for developer.microsoft.com and news.microsoft.com would reference the Microsoft organization node.

Make sense so far?

We then created a new content type for partner's news called Partner News. To this we added normal title, body, date, and ID information and also another reference field for organization. What we really wanted was for the partner news nodes to automatically inherit the reference field from the importer that created them. So playing off the example above, we wanted each news item that was added to Drupal from the feed at news.apple.com to inherit the reference to the Apple organization node so that we could later create a view on the Apple organization node that displayed all the imported feeds associated with Apple.

The trick is that this functionality didn't exist. Lucky for us, the Feeds module provides some useful hooks to extend its base functionality.

The custom module we created is fewer than 50 lines if you ignore all the comments.

The following screen shots show a typical Feeds importer setup.

In this case I am attaching my feed importer to content type called importer. This means that when I create a new importer node, I will see that feeds has added a new field to the content type giving me a place to add the URL of the feed to be imported.

Under settings I designate that imported nodes should be Partner News nodes. That means that each item in an RSS feed that is imported will become its own Partner News node. I also set Feeds to update nodes rather than replace them or create new ones if it finds duplicate data.

Finally we designate the mapping. The mapping defines to feeds what elements from an imported RSS element should be added to what part of the new node. Some of these are pretty obvious. We map title to title, date to date, description to body, and GUID to GUID. This last one (GUID) provides a unique identifier for updating feed data and is required if you want the nodes to update rather than duplicate.

But what we want doesn't exist. We want to see a source element that says something like “Feed Importer's Organization Reference” and a target that says something like “Organization Reference”, so that we can map from one to the other.

To do this start a custom module in the standard fashion (http://drupal.org/node/1074360). I'll call my module feedmapper. In feedmapper.module add the following function:

<?php
/**
* Implements hook_feeds_parser_sources_alter().
*/
function feedmapper_feeds_parser_sources_alter(&$sources, $content_type) {
  $sources['field_importer_reference'] = array(
    'name' => t('Organization\'s NID'),
    'description' => t('The node ID of the partner.'),
    'callback' => 'feedmapper_get_organization_nid',
  );
}

This adds a new source to the dropdown on the feed importer configuration.

The callback describes a function that will actually handle the data processing. You should prefix it with the name of your module, but it can be named anything that makes sense. I haven't written this function yet, but we'll get to that shortly.

Next I'll add a function that specifies a new target.

/**
* Implements hook_feeds_processor_targets_alter().
*/
function feedmapper_feeds_processor_targets_alter(&$targets, $entity_type, $bundle_name) {
  $targets['field_importer_reference'] = array(
    'name' => 'Organization Reference',
    'description' => 'the node reference for the partner',
    'real_target' => 'field_importer_reference', // Specify real target field on node. This is on the content type.
    'callback' => 'feedmapper_feeds_set_target',
  );
}

Note that the source and the target both reference the same field field_importer_reference. This is due to the fact that I am reusing the field across content types. If you had different field names for each content type, you would need to make the target and source point to the specific field names you created.

Now you can assign this mapping. Of course it doesn't do anything yet because we haven't written the appropriate callbacks.

The set method is actually pretty easy because Feeds handles that for us, provided we pass the correct data in the first place. We need to focus on retrieving the correct node ID from the feed importer. To do this we access a property of the feed object. This property is feed_nid, which as you can guess returns the value of the feed's node ID. Now that we have the nid, retrieving another field's data is fairly trivial; we just need to make sure we're using the correct type of node, so we run a check on the node type and then get the field in question:

/**
* Find the node id of the feed source and use it to find the associated organization.
*/
function feedmapper_get_organization_nid(FeedsSource $source) {
  $nid = $source->feed_nid;
  $feed = node_load($nid);
  if ($feed->type == 'importer') { //this needs to be the name of the importer content type.
    $partner_nid = $feed->field_importer_reference;
  }
  else {
    $partner_nid = NULL;
  }
  return $partner_nid;
}

/**
* Implements hook_feeds_set_target().
*/
function feedmapper_feeds_set_target($source, &$entity, $target, $value) {
  $entity->$target = $value;
}

And that's it. Now the creating of a feed importer is tied to an organization and every time news is imported via feeds the incoming news item is automatically linked to its parent organization.

I've attached a working version of the module outlined here along with a Feature that should get you started.

Good luck.

Dec 15 2010
Dec 15

on 15 December, 2010

I gave a talk on "Feeding Drupal in Real-Time" at the Guardian on Tuesday, for the Drupal Drop-In event. It was a great evening, I met lots of interesting people and enjoyed some fantastic presentations. Thanks to everyone who came and made the event a success, especially Robert Castelo and Mark Baker for organising, and Microsoft and the Guardian for sponsoring.

My slides from the talk are below, but I also did two live demonstrations which are kinda hard to reproduce here! However, here's what happened for anyone who missed it:

  1. We used Feeds module to import Flickr photos with location data and display them on a Google map.

  2. We imported Gowalla check-ins and used pubsubhubbub to show my location update in real-time on a Drupal Gmap, as I checked in to the Guardian HQ. This was very exciting and almost definitely doomed to fail on the day, but by some fluke of fortune, it didn't!

I'll write blog posts with more detailed instructions for replicating these demonstrations soon, you can be informed of these by subscribing to our pubsubhubbub-enabled RSS feed. :)

Mar 25 2008
Mar 25

A first part of many about feeds.

The Transmission network, a group for sharing videos for social change, has been working on developing a free metadata standard for video interchange. Before this, what existed was tied to Yahoo! MRSS, which one wouldn't call proactive in working with the community on the growth and development of their standard.

So the Transmission folks have got some backing and come up with an XML standard for video metadata that in the docs (and in political free background) is linked to Atom. There's a working sort-of beta version 0.9 on its way to becoming a real 1.0 release.

I've implemented it for the video module for Drupal. An example atom feed with plenty of metadata. This requires the patch I committed to the Atom module to allow enclosures as well as additional namespaces.

The fact that no one has previously pushed a metadata standard for Atom, or moved the Yahoo! RSS namespace beyond its stale position, is interesting. People who actually make and screen videos are crying out for this metadata. They will quickly list off what they want to be made available.

Yet, despite the user demand for metadata, I challenge you to find an implementation of MRSS that includes all of the metadata that it could make available. I really want to find them too, so let me know! (I now exclude Drupal sites with a video module after 16 March because my patch to make proper feeds with the metadata that is available was committed).

It's a shame too, because the demand is already there from the video community's end users. The MRSS standard doesn't seem to be too bad either; it fits a lot of the present videocasting requirements - there is much I feel the Transmission standard could learn from it. However, unlike the Transmission standard, the Yahoo! one doesn't meet all the requirements of the present-day video creator or user. Yet even with the rise in demand, making the Yahoo! MRSS outline an XML standard, or developing it with the community, seems not to be happening at the moment.

All that is happening with feeds is that CMS developers are starting to produce feeds that actually include enclosures. This is essential as the interest in videocasting becomes greater - Miro must have a lot to be praised for here. But the information and formats the video community really want and need are not there. If you actually look at the feeds produced, quite often they don't even say how big the file is!

Part of the problem has to be that video related data is pretty complicated and there is quite a bit of it. How long has it taken me to get used to all the different codecs, framerates, and so forth - I'm still getting used to it?!

But that's no excuse. Why aren't Content Management Systems storing this information with videos if it is made available? Why aren't they discovering it if it's not available? Even your really non-demanding user wants to know if 'the quality is any good' and if 'they can play this video'. It's not rocket science to do either... but I suspect that leads me to a future post...

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web
