Using Hash Value (track_changes) To Detect Source Data Changes in Migrate for Drupal 7

When you import data with Migrate, it is nice to be able to update already imported data when the source has changed. To track updated data, Migrate uses highwater marks. A highwater mark is a column in the source data that holds the timestamp of the last change to a row. The problem with highwater marks is that you don't always have this kind of "time tracking" on the source side. For example, if you migrate users from D6 to D7, the {users} table has no field that stores a "changed" timestamp, unlike the {node} table.

So what to do? Luckily, Migrate 2.6 beta 1 came out ten days ago, and it brings hash-value-based tracking. You no longer need timestamps or highwater marks: Migrate automagically creates a hash value for every row of your source data and compares it against the hash it stored earlier. If the hashes aren't equal, it treats the row as updated and updates it. Pretty neat!
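To make the idea concrete, here is a minimal sketch of hash-based change detection in plain PHP. It is not Migrate's actual code: the function names example_row_hash() and example_row_status() are hypothetical, and hashing with md5(serialize(...)) is an assumption chosen for illustration.

```php
<?php
// Hypothetical helper: hash one source row. md5(serialize(...)) is an
// illustrative choice, not necessarily what Migrate does internally.
function example_row_hash(array $row) {
  return md5(serialize($row));
}

// Hypothetical helper: classify a row against the hashes stored during
// the previous run (keyed by source ID).
function example_row_status(array $stored_hashes, $source_id, array $row) {
  if (!isset($stored_hashes[$source_id])) {
    return 'new';
  }
  return $stored_hashes[$source_id] === example_row_hash($row) ? 'unchanged' : 'changed';
}

// First run: remember the hash of an imported row.
$stored = array(1 => example_row_hash(array('uid' => 1, 'mail' => 'a@example.com')));

// Later run: the same row, an edited row, and a row never seen before.
$same = example_row_status($stored, 1, array('uid' => 1, 'mail' => 'a@example.com'));
$edited = example_row_status($stored, 1, array('uid' => 1, 'mail' => 'b@example.com'));
$fresh = example_row_status($stored, 2, array('uid' => 2, 'mail' => 'c@example.com'));
```

The point of the sketch is that no "changed" column is needed: any difference in a row's values produces a different hash, which is enough to flag the row for update.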

To enable this option, all you need to do is add 'track_changes' => 1 to the $options array that you pass to the source in your migration class constructor. It may look like this:

    class ExampleNodeMigration extends Migration {
      public function __construct($arguments) {
        parent::__construct($arguments);
        // Enable hash-based change detection for every source row.
        $options = array('track_changes' => 1);
        // query() and fields() are this class's own helper methods.
        $this->source = new MigrateSourceSQL($this->query(), $this->fields(), NULL, $options);
      }
    }

After this, during the first import run Migrate will update all the content you have already imported (as if you had forced the update option) and create hash values for every row. During subsequent runs, it will import new content and update only the content that has changed at the source.

I hope this tutorial is helpful, since there is no documentation yet for this really useful feature.