Drupal 8 Migrate - importing clean data via .csv file

Parent Feed: 

One of the pillars of the consulting side of the work we do here at DrupalEasy is data migration into Drupal sites. Over the past few years, we've been focused on migrating data into Drupal 8 using the most excellent core migrate modules along with contrib modules like Migrate Tools, Migrate Plus, and Migrate Source CSV

If there's one thing I've learned over the years is that data to be migrated is never, ever, ever, ever, never as clean as it is claimed to be. There's always something that has to be massaged during the migration process in order to get things working.

One of the most common things I see when migrating data from a .csv is strings with trailing spaces. If you take a cursory look at the data in a spreadsheet, you might see something like "Bread", but if you look at the same data in a text .csv file, you'll see that the string is actually "Bread " (trailing space). If you're migrating this field to a vocabulary using the entity_lookup process plugin, that trailing space will cause the term to not be found, and therefore not migrated.

The solution? Well, you could clean up the data, but there's actually a much easier solution that I use by default on almost all string data being migrated - I use the "callback" plugin to in-turn call the PHP trim() function on incoming strings in the "process" section of the migration configuration. Here's an example:

field_topics:
  -
    plugin: callback
    callable: trim
    source: Topic
  -
    plugin: entity_lookup
    entity_type: taxonomy_term
    bundle: topics
    bundle_key: vid
    value_key: name
    ignore_case: true

Using this method allows for the incoming data to be a little dirty without affecting the migration.

Original Post: 

About Drupal Sun

Drupal Sun is an Evolving Web project. It allows you to:

  • Do full-text search on all the articles in Drupal Planet (thanks to Apache Solr)
  • Facet based on tags, author, or feed
  • Flip through articles quickly (with j/k or arrow keys) to find what you're interested in
  • View the entire article text inline, or in the context of the site where it was created

See the blog post at Evolving Web

Evolving Web