Move into Drupal using the Migrate Module

btmash
July 19, 2012

The Migrate module provides a flexible framework for migrating content into Drupal from other sources (e.g., when converting a web site from another CMS to Drupal). Out of the box, support for creating core Drupal objects such as nodes, users, files, terms, and comments is included, and it can easily be extended to migrate other kinds of content. The power comes from an object-oriented API that is tricky to get started with. We'll walk through the various classes in the module and how they work together to manage migrations.

Transcript

  1. Agenda
    • Introduction to Migrate
    • Theory
    • Implementation
      ◦ Hooks
      ◦ Classes
        ▪ Migration
        ▪ Handlers
      ◦ Commands
    • Q & A (Conclusion)
  2. Thanks
    • Mike Ryan (mikeryan)
      ◦ http://drupal.org/user/4420
    • Moshe Weitzman (moshe weitzman)
      ◦ http://drupal.org/user/23
    • Frank Carey
      ◦ http://drupal.org/user/112063
    • Andrew Morton (drewish)
      ◦ http://drupal.org/user/34869
  3. Introduction to Migrate - Options
    • What are your options to bring content over into Drupal?
      ◦ By Hand
        ▪ Very time consuming.
        ▪ Not feasible if you have a 'lot' of content.
          • If you really don't like who you work with.
      ◦ Custom Scripts
        ▪ Might be ok as a 'one-off' solution.
        ▪ As flexible as you want.
        ▪ Write out drush plugins/web ui.
        ▪ Tracking?
        ▪ Integration into your source?
  4. Introduction to Migrate - Options: Feeds
    • Absolutely great option.
    • Easy to set up.
    • Maps fields from source -> destination.
    • Can import RSS / Atom / various feeds.
      ◦ Plugins for CSV, Database, even LDAP.
    • Well documented.
    • Performance issues.
    • Content update handling.
    • Doesn't work well with references from other content (types).
  5. Introduction to Migrate
    • Powerful object oriented framework for moving content into Drupal.
    • Many import sources already defined.
      ◦ XML, JSON, CSV, DB, etc.
    • Support to migrate into various types of content.
      ◦ users, nodes, comments, taxonomy, all core entities.
      ◦ can define your own import handler.
      ◦ can import into a particular table!
    • Fast.
    • Minimal UI, mainly steered around drush.
    • Drush integration.
    • Steep Learning Curve.
      ◦ You will write code.
  6. Introduction to Migrate
    • Drupal 6 requires the autoload and dbtng modules, so the code is very similar in 6 and 7.
    • Migrate Extras provides support for many contrib modules.
      ◦ Provides a base class for importing to entities from EntityAPI.
      ◦ More field modules are implementing field handlers.
    • The most comprehensive and up-to-date documentation is the beer.inc and wine.inc examples.
      ◦ Part of the Migrate module.
  7. Theory - Goal
    [Diagram: Source (sid, title, user, field1, field2, ..., fieldN) -> Destination (content_id (auto), title, uid, field1, field2, ..., fieldN)]
  8. Theory - Source
    • Interface to your current set of data (csv, json, xml, db, etc).
    • Provides a list of fields.
    • Responsible for iterating over the rows of data.
  9. Theory - Destination
    • Responsible for saving a specific type of content to Drupal (user, node, row in a particular table).
    • Each Source record correlates to one Destination record.
  10. Theory - Field Mappings
    • Links a source field to a destination field.
    • Basic functions such as splitting into an array based on separators, etc.
    • Can pass additional arguments (as the field handler implements them).
  11. Theory - Mapping (goal)
    [Diagram: Source (sid, title, user, field1, field2, field3, field4, ..., fieldN) -> Mapping (alter and reference) -> Destination (content_id (auto), title, uid, field1, field2, ..., fieldN)]
  12. Theory - Map
    • Connects the source and destination IDs, allowing for translation between them.
    • Tracks the key schema format.
    • Allows a migration to re-run and update existing records.
    • Allows imported records to be deleted.
    • Allows you to reference the ID from another migration so that it gets converted for your own migration.
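The role of the map can be sketched in plain PHP. This is only an illustration under stated assumptions, not Migrate's own implementation: the class `SketchMap` and its method names are hypothetical, and an in-memory array stands in for the database table that the real map keeps.

```php
<?php
// Hypothetical in-memory stand-in for a migration map table.
// Migrate's real map (MigrateSQLMap) stores these ID pairs in the database.
class SketchMap {
  private $rows = array();

  // Record that a source ID was imported as a given destination ID.
  public function save($source_id, $dest_id) {
    $this->rows[$source_id] = $dest_id;
  }

  // Translate a source ID into its destination ID, or NULL if not imported.
  public function lookupDestinationId($source_id) {
    return isset($this->rows[$source_id]) ? $this->rows[$source_id] : NULL;
  }

  // Forget a record, as a rollback would after deleting the imported item.
  public function delete($source_id) {
    unset($this->rows[$source_id]);
  }
}

$map = new SketchMap();
$map->save(17, 204);                        // source row 17 became node 204
$existing = $map->lookupDestinationId(17);  // known ID: a re-run can update
$missing = $map->lookupDestinationId(99);   // unknown ID: a fresh insert
```

Because the lookup is keyed on the source ID, the same table serves re-runs, rollbacks, and references from other migrations.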
  13. Theory - Migration Map
    [Diagram: Source (sid, title, user, field1, field2, field3, field4, ..., fieldN) -> Mapping (alter and reference) -> Destination (content_id (auto), title, uid, field1, field2, ..., fieldN), with a Map Table linking Source and Destination]
  14. Theory - Migration
    • Sets up all the necessary pieces: Source, Destination, Map, Field Mappings.
    • May provide logic for skipping over rows during migration.
    • May alter the Source Data during the migration.
    • May alter the Destination Entities during the Migration.
  15. Theory - Field Handler
    • Converts your source data into a format that Drupal understands.
    • Converts

        $row->bar = array('foo', 'bar')

      into

        $entity_field_bar = array(
          'und' => array(
            0 => array('value' => 'foo'),
            1 => array('value' => 'bar'),
          ),
        );
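The conversion a field handler performs can be sketched without Drupal. The helper name `sketch_text_field_values()` is hypothetical; real field handlers are classes registered with Migrate, and this function only mirrors the shape of the output for a simple text field.

```php
<?php
// Hypothetical helper: wrap plain values in Drupal 7's field array layout.
// 'und' is the "undefined" language code (LANGUAGE_NONE in Drupal 7).
function sketch_text_field_values(array $values, $language = 'und') {
  $items = array();
  // array_values() renumbers the input so deltas always start at 0.
  foreach (array_values($values) as $delta => $value) {
    $items[$delta] = array('value' => $value);
  }
  return array($language => $items);
}

$row = new stdClass();
$row->bar = array('foo', 'bar');
$entity_field_bar = sketch_text_field_values($row->bar);
```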
  16. Theory - Destination Handler
    • Extends existing destinations and adds additional functionality.
      ◦ MigrateCommentNodeHandler provides the option to allow comments on a given node.
    • Contrib projects might want to create these.
      ◦ Flag?
  17. Theory - Field Handler
    [Diagram: Source (sid, title, user, field1, field2, field3, field4, ..., fieldN) -> Mapping (alter and reference) -> Destination (content_id (auto), title (text), uid, field1 (text), field2 (image), ..., fieldN (tags)), with a Map Table linking Source and Destination]
  18. Implementation
    • Let Migrate know about your module (hook).
    • Build a migration class.
      ◦ Provide a description.
      ◦ Give it information about where the content is coming from (Source).
      ◦ Give it information about where the content is going to get saved (Destination).
      ◦ Map the fields from the source into the destination (Map).
      ◦ (optional) Massage the data / add any fields you were not able to get in the initial mapping.
      ◦ (optional) Add / massage any data that does not have field handlers before the content gets saved.
    • Register the class file in the .info file.
  19. Implementation - Hooks
    • Just one :)
      ◦ Provide the API version number (currently at 2)

        function my_migrate_module_migrate_api() {
          return array(
            'api' => 2,
          );
        }

    • Might change to 3/4/5...N in the future ;)
  20. Implementation - Class
    • Consists of at least 1 function and 3 optional functions.

        class NodeBundleMigration extends Migration {
          public function __construct() { ... } // Required.
          public function prepareRow($row) { ... }
          public function prepare($entity, $row) { ... }
          public function complete($entity, $row) { ... }
        }
  21. Implementation - __construct()
    • Set up the source, destination, map, and field mappings in the constructor.

        class NodeBundleMigration extends Migration {
          public function __construct() {
            parent::__construct();
            $this->source = <my_source>;
            $this->destination = <my_destination>;
            $this->map = <my_map>;
            $this->addFieldMapping($my_dest_fld, $my_src_fld);
          }
        }
  22. Implementation - __construct() Source Fields
    • Lets the Migration class know a little about the fields that are coming in (like compound fields).
    • Can be set to a simple array if nothing complex is needed.

        $source_fields = array(
          'mtid' => 'The source row ID',
          'compound_field_1' => 'Field not from initial query but will be necessary later on.'
        );
  23. Implementation - __construct() Source (Current Database)

        // Required
        $query = db_select('my_table', 'mt');
        $query->fields('mt', array('mtid', 'style', 'details', 'updated',
          'style_parent', 'style_image'));
        $query->join('mt_extras', 'mte', 'mt.mtid = mte.mtid');
        $query->orderBy('mt.updated', 'ASC');
        // Implement a count_query if it is different or set to NULL.
        $this->source = new MigrateSourceSQL($query, $source_fields, $count_query);
  24. Implementation - __construct() Source (External Database)

        $connection = Database::getConnection('for_migration');
        $query = $connection->select('my_table', 'mt');
        $query->fields('mt', array('mtid', 'style', 'details', 'updated',
          'style_parent', 'style_image'));
        $query->orderBy('mt.updated', 'ASC');
        // Implement a count_query if it is different or set to NULL.
        $this->source = new MigrateSourceSQL($query, $source_fields, $count_query,
          array('map_joinable' => FALSE));

    • 'map_joinable' => FALSE lets Migrate know there is no easy way to map the IDs.
      ◦ More computing work necessary.
  25. Implementation - __construct() Source (CSV File)

        // The definition of the columns. Keys are integers;
        // values are an array of field name then description.
        $columns = array(
          0 => array('cvs_uid', 'Id'),
          1 => array('email', 'Email'),
          2 => array('name', 'Name'),
          3 => array('date', 'Date'),
        );
        $this->source = new MigrateSourceCSV("path/to/file.csv", $columns,
          array('header_rows' => TRUE), $this->fields());
  26. Implementation - __construct() Source (Other Sources)
    • Migrate comes with base source classes for migrating from JSON, XML, and file directories.
    • Expect to make some changes depending on the migration format.
  27. Source Base Classes
    • If you have source IDs referenced separately from your values:
      ◦ Use MigrateSourceList as a source.
      ◦ Implement MigrateList for fetching counts and IDs, and MigrateItem for fetching values.
    • If everything is in a single file with IDs mixed in:
      ◦ Use MigrateSourceMultiItems as a source.
      ◦ Implement MigrateItems for extracting IDs and values.
    • Look at http://drupal.org/node/1152152, http://drupal.org/node/1152154, and http://drupal.org/node/1152156 for a clearer example.
  28. Implementation - __construct() Migration Map

        $this->map = new MigrateSQLMap($this->machineName,
          // Describe your primary ID schema
          array(
            'mtid' => array(
              'type' => 'integer',
              'unsigned' => TRUE,
              'not null' => TRUE,
              'alias' => 'mt'
            ),
          ),
          MigrateDestinationNode::getKeySchema()
        );
  29. Implementation - __construct() Highwater
    • You may have noticed the orderBy() on the SQL queries.
    • A Migrate feature for figuring out whether a piece of content should be updated rather than inserted just once.
    • Need to let Migrate know which column contains the highwater data.

        $this->highwaterField = array(
          'name' => 'updated',
          'alias' => 'mt',
        );
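The idea behind a highwater mark can be sketched in plain PHP. This is a simplified model under stated assumptions, not Migrate's internal logic: the function `sketch_rows_since()` is hypothetical, and rows are plain arrays with an `updated` column like the one named in the slide.

```php
<?php
// Hypothetical sketch of highwater filtering: only rows changed after the
// stored mark are processed, and the mark advances to the newest value seen.
function sketch_rows_since(array $rows, $highwater) {
  $pending = array();
  $new_highwater = $highwater;
  foreach ($rows as $row) {
    if ($row['updated'] > $highwater) {
      $pending[] = $row;
      $new_highwater = max($new_highwater, $row['updated']);
    }
  }
  return array($pending, $new_highwater);
}

$rows = array(
  array('mtid' => 1, 'updated' => 100),
  array('mtid' => 2, 'updated' => 250),
  array('mtid' => 3, 'updated' => 400),
);
// With a stored mark of 200, only rows 2 and 3 need to be (re)imported.
list($pending, $new_highwater) = sketch_rows_since($rows, 200);
```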
  30. Implementation - __construct() Destination

        // Terms
        $this->destination = new MigrateDestinationTerm('site_vocabulary');
        // Nodes
        $this->destination = new MigrateDestinationNode('articles');
        // Users
        $this->destination = new MigrateDestinationUser();
        // Contrib - Commerce Products
        $this->destination = new MigrateDestinationEntity('commerce_product');
  31. Implementation - __construct() Field Mapping

        // Can be simple.
        $this->addFieldMapping('dest_name', 'source_name');
        // Can be a set value.
        $this->addFieldMapping('uid')->defaultValue(1);
        // Can have no value (or whatever the system default is)
        $this->addFieldMapping('path')->issueGroup('DNM');
        // Can be multiple values with a separator.
        $this->addFieldMapping('field_tags', 'source_tags')->separator(',');
        // Can have arguments
        $this->addFieldMapping('field_body', 'description')->arguments($arguments);
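What a separator mapping does to a single source value can be sketched outside of Drupal. The helper `sketch_split_values()` is hypothetical, and the whitespace trimming is an addition made for this sketch rather than something the slide attributes to Migrate.

```php
<?php
// Hypothetical stand-alone equivalent of ->separator(','): split one source
// string into multiple values. Trimming is an extra step in this sketch.
function sketch_split_values($value, $separator) {
  return array_map('trim', explode($separator, $value));
}

$tags = sketch_split_values('drupal, migrate, drush', ',');
```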
  32. Implementation - __construct() Field Mapping (cont'd)
    • Core migrate field arguments have moved to token-based field mapping.
      ◦ $this->addFieldMapping('field_body:teaser', 'teaser_source_field');
      ◦ Implemented for field handlers via the fields() function in 2.4.
      ◦ Just provide a scalar or an array!
    • 'More' Mapping.
      ◦ Simpler to understand.
    • Still some magic.
      ◦ File mappings are still strange.
      ◦ http://drupal.org/node/1540106
      ◦ Create a destination dir for files as tokens are iffy. Or migrate via file IDs (easier).
  33. Implementation - __construct() Field Mapping Arguments
    • More for the contributed space, since Migrate 2.4 core fields have moved away from this approach.
    • Used to pass multiple source fields into a single destination field (more like 'meta' information).
    • As an example, a body field (with summary):

        $this->addFieldMapping('body', 'source_body')
          ->arguments(array(
            'summary' => array('source_field' => 'teaser'),
            'format' => 1,
          ));
  34. Implementation - __construct() Field Mapping Arguments (cont'd)
    • Can be implemented with a static argument function if you reuse arguments with other fields.
    • Old Image Mapping Format 1:

        $this->addFieldMapping('image', 'source_image')
          ->arguments(array(
            'source_path' => $path,
            'alt' => array('source_field' => 'image_alt'),
          ));

    • Old Image Mapping Format 2:

        $arguments = MigrateFileFieldHandler::arguments(
          $path, 'file_copy', FILE_EXISTS_RENAME, NULL,
          array('source_field' => 'image_alt'));
        $this->addFieldMapping('image', 'source_image')->arguments($arguments);
  35. Implementation - __construct() Field Mapping Source Migrations
    • When you have a value from the old migration and need to look up the new ID from the migration map.
      ◦ Content Author
      ◦ References

        $this->addFieldMapping('uid', 'author_id')
          ->sourceMigration('MTUserMigration');

    • Remember to add a dependency :)
      ◦ $this->dependencies = array('MTUserMigration');
  36. Implementation - Additional Processing
    • Three key functions to insert/modify the imported data.
      ◦ prepareRow($row)
      ◦ prepare($entity, $row)
      ◦ complete($entity, $row)
    • Each one is useful in different circumstances.
  37. Implementation - prepareRow($row)
    • Passes in the row from the current source as an object so you can make modifications.
    • Can indicate that a row should be skipped during import by returning FALSE.
    • Add or change field values:

        $row->field3 = $row->field4 . ' ' . $row->field5;
        $row->created = strtotime($row->access);
        $row->images = array('image1', 'image2');
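The behaviour described for prepareRow() can be tried as a stand-alone function. In a real migration this logic would live in a method on the Migration subclass; the function name `sketch_prepare_row()` and the `status` field used to decide which rows to skip are hypothetical.

```php
<?php
// Stand-alone sketch of a prepareRow()-style callback operating on a row
// object. The 'status' field and the skip rule are assumptions.
function sketch_prepare_row($row) {
  // Returning FALSE signals that this row should be skipped.
  if ($row->status === 'deleted') {
    return FALSE;
  }
  // Combine two source fields into one.
  $row->field3 = $row->field4 . ' ' . $row->field5;
  // Convert a date string into a Unix timestamp.
  $row->created = strtotime($row->access);
  return TRUE;
}

$row = (object) array(
  'status' => 'active',
  'field4' => 'Hello',
  'field5' => 'world',
  'access' => '2012-07-19 00:00:00 UTC',
);
$keep = sketch_prepare_row($row);
```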
  38. Implementation - prepare($entity, $row)
    • Work directly with the entity object that has been populated with field mappings.
      ◦ It is passed the entity prior to being saved, along with the source row.
    • Last chance before the entity gets saved.
    • You have to save fields in entity field format.
    • Use prepare() to populate fields that do not have a field handler (link, relation, location as examples at time of writing).

        $entity->field_link['und'][0]['value'] = 'http://drupal.org/project/migrate';
  39. Implementation - complete($entity, $row)
    • Entity is now saved - chance to update any *other* records that reference the current entity.
    • Don't use it to save the same record again...
  40. Implementation - Dealing with Circular Dependencies
    • Implement stubs (http://drupal.org/node/1013506).
    • Specify a sourceMigration ('NodeBundleMigration') on the ID's field mapping.
    • Add createStub($migration, $source_key) to NodeBundleMigration, which creates an empty record and returns the record ID.
    • Next time NodeBundleMigration runs, it will update the stub and fill it with proper content.
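The stub mechanism can be modelled with two arrays. This is a simplified, Drupal-free sketch: the function names are hypothetical, `$nodes` stands in for the destination storage, and `$map` stands in for the migration map that createStub() relies on in Migrate.

```php
<?php
// Hypothetical in-memory model of stub handling.
$nodes = array();   // destination "table": nid => title
$map = array();     // source ID => nid
$next_nid = 1;

// Resolve a reference to a source ID, creating an empty stub if needed.
function sketch_resolve_reference($source_id, array &$nodes, array &$map, &$next_nid) {
  if (!isset($map[$source_id])) {
    $nid = $next_nid++;
    $nodes[$nid] = 'Stub';     // placeholder content
    $map[$source_id] = $nid;   // remember the stub for the later import
  }
  return $map[$source_id];
}

// Import a real row: fill in the stub if one exists, otherwise insert.
function sketch_import($source_id, $title, array &$nodes, array &$map, &$next_nid) {
  $nid = sketch_resolve_reference($source_id, $nodes, $map, $next_nid);
  $nodes[$nid] = $title;
  return $nid;
}

// Row 1 references row 2, which has not been imported yet, so a stub is made.
$parent_nid = sketch_resolve_reference(2, $nodes, $map, $next_nid);
$nid1 = sketch_import(1, 'First', $nodes, $map, $next_nid);
// Importing row 2 later reuses the stub's ID instead of creating a duplicate.
$nid2 = sketch_import(2, 'Second', $nodes, $map, $next_nid);
```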
  41. Implementation - Dealing with Dynamic Migrations
    • Some projects (like wordpress migrate / commerce migrate) will migrate most but not all content.
    • Extend by creating a destination migration.
      ◦ Same as a regular migration, but in __construct() you have to provide the type of record and the value of the record.
        ▪ $this->systemOfRecord = Migration::DESTINATION
        ▪ $this->addFieldMapping('nid', 'nid')->sourceMigration('NodeBundleMigration');
  42. Implementation - Import Flow
    • Source iterates until it finds an appropriate record.
    • Calls prepareRow($row), letting you modify or reject the data in $row.
    • Migration applies the Mappings and Field Handlers to convert $row into $entity.
    • Migrate calls prepare($entity, $row) to modify the entity before it gets saved.
    • Entity is saved.
    • Migrate records the IDs into the map and calls complete() so you can see and work with the final Entity ID.
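The import flow above can be walked through in a Drupal-free sketch. Everything here is a stand-in chosen for illustration: plain arrays replace the source, the storage, and the map, the `skip` field is a hypothetical rejection rule, and each numbered comment marks where the corresponding Migrate step would run.

```php
<?php
// Hypothetical walk through the import flow using plain arrays.
$source = array(
  (object) array('sid' => 10, 'title' => ' First post ', 'skip' => FALSE),
  (object) array('sid' => 11, 'title' => 'Draft', 'skip' => TRUE),
  (object) array('sid' => 12, 'title' => 'Second post', 'skip' => FALSE),
);
$mappings = array('title' => 'title');   // destination field => source field
$saved = array();                        // stands in for Drupal's storage
$map = array();                          // source ID => destination ID
$log = array();

foreach ($source as $row) {
  // 1. prepareRow(): modify or reject the source row.
  if ($row->skip) {
    continue;
  }
  $row->title = trim($row->title);

  // 2. Apply the field mappings to build the entity.
  $entity = new stdClass();
  foreach ($mappings as $dest_field => $source_field) {
    $entity->$dest_field = $row->$source_field;
  }

  // 3. prepare(): last changes before saving.
  $entity->uid = 1;

  // 4. Save; the destination assigns the ID.
  $entity->nid = count($saved) + 1;
  $saved[$entity->nid] = $entity;

  // 5. Record the ID pair in the map, then complete().
  $map[$row->sid] = $entity->nid;
  $log[] = "Imported source {$row->sid} as node {$entity->nid}";
}
```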
  43. Implementation - Suggestions
    • Separate your file migrations.
      ◦ Migrate 2.4 now has a class to migrate your files in separately.
      ◦ Can retain the structure of the source file directory.
      ◦ Or not - it's just more flexible.
      ◦ Or make multiple file migrations based off your separate content migrations, and have your content migrations depend on the file migration.
  44. Migrate in Contrib
    • Creating new types of objects?
      ◦ Write a destination handler.
      ◦ Hopefully, you can implement your object using the entityapi and extend the MigrateDestinationEntityAPI class.
    • Creating new types of fields?
      ◦ Write a field handler.
  45. References
    Projects
    • http://drupal.org/project/migrate
    • http://drupal.org/project/migrate_extras
    Drupal -> Drupal Migration Sandboxes
    • http://drupal.org/sandbox/mikeryan/1234554
    • http://drupal.org/sandbox/btmash/1092900
    • http://drupal.org/sandbox/btmash/1492598
    Documentation
    • http://drupal.org/node/415260
    • http://denver2012.drupal.org/program/sessions/getting-it-drupal-migrate
    • http://btmash.com/tags/migrate