DEVTRENCH.COM

WordPress to MODx Migration Part 3: Templates, Categories, and Postmeta

Alright, I'm back for part three of this action-packed WordPress to MODx Migration mania. In part one I showed you how to use xPDO to connect to the WordPress database and import post content into MODx. In part two I demonstrated how xPDO handles table relationships by importing WordPress comments into MODx. In this part I'm going to show you how to create and assign templates, and how to migrate categories and postmeta data.

Before we start, I'm glad to say that I've created an account on GitHub and have personally switched from SVN to Git. This came about because of MODx's switch from SVN to Git, and I'm glad that they did, because I've found that I like Git a lot more than SVN. That said, the code for this script is now hosted on GitHub at http://github.com/devtrench/WordPress-to-MODx. Since this code is still for developers only, there's no official release in the downloads section (or on MODx), and I'll most likely keep it that way until it's packaged for the masses. However, placing it on GitHub does mark the beta release of this code, so please test it out if you're doing a WP to MODx conversion. Any feedback would be greatly appreciated! If you just want the script, you can get it from GitHub. What follows is the continued explanation of how it works, for those who are interested.

Templates

Templates are the backbone of the Template Variable system in MODx. So in order to migrate Categories and Postmetas from WordPress into MODx as Template Variables, we first need to create the Templates in MODx that they will be assigned to. In order to do this we must pick id numbers for the templates and assign those in the configuration section of the script.

/**
 * template ids to use during import
 * if you would like to have different templates for posts and pages you can
 * specify that below, if not they will default to the default template id
 * if the templates don't exist they will be created.  The default template id
 * must be set.
 */
$default_template_id = 1;
$post_template_id = 2;
$page_template_id = 3;

The template id can be the id of an existing template, or if the template doesn't exist, a new template with that id will be created. I place the template ids from the configuration into an array and check for their existence in the code below:

// set up our default templates in an array
if (!empty($default_template_id))
  $templates['default'] = $default_template_id;
if (!empty($post_template_id))
  $templates['post'] = $post_template_id;
if (!empty($page_template_id))
  $templates['page'] = $page_template_id;

// now check that the templates exist
foreach($templates as $template_name => $template_id)
{
  $template = $template_name . '_template';
  if(!empty($template_id) && !$$template = $modx->getObject('modTemplate',$template_id))
  {
    $$template = $modx->newObject('modTemplate');
    $data = array(
            'id' => $template_id, // use the id for this loop pass, not always the post template id
            'templatename' => 'Auto Generated ' . ucwords($template_name) . ' Template',
            'content' => '',
    );
    $$template->fromArray($data,'',true);
    $$template->save();
  }
}

You can see that I check to see if the Template already exists by using xPDO to get the Template object by its id. If it doesn't exist I create a new modTemplate object, and save it with the id given in the configuration. You can set primary keys for xPDO objects by passing 'true' as the third parameter in the fromArray method.
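
The configuration comment says the post and page ids fall back to the default template id when they're left blank. Here's a minimal sketch of one way that fallback could be resolved up front, in plain PHP with no MODx calls. The resolve_template_ids() helper is hypothetical and isn't part of the script on GitHub.

```php
<?php
// Hypothetical helper (an assumption, not from the original script):
// build the name => id map, falling back to the default id when the
// post or page id is left empty.
function resolve_template_ids(int $default_id, ?int $post_id = null, ?int $page_id = null): array
{
    if ($default_id <= 0) {
        throw new InvalidArgumentException('The default template id must be set.');
    }
    return [
        'default' => $default_id,
        'post'    => $post_id ?: $default_id,
        'page'    => $page_id ?: $default_id,
    ];
}

// Page id left blank, so pages share the default template.
$templates = resolve_template_ids(1, 2, null);

// Unique ids, so each template only needs to be checked or created once.
$unique_ids = array_values(array_unique($templates));
```

With the map resolved like this, the existence check only has to loop over $unique_ids, and the post loop can always look up $templates['post'] or $templates['page'] without testing for empty values.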

Migrating Related Categories

Once the Templates are created we can start to create Template Variables to hold data related to posts. One important piece of post data is which Category it belongs to. In pre-2.3 versions of WordPress, this information was important enough to get its own 'categories' table. In WordPress 2.3 and above, Categories are mapped through WordPress's Taxonomy system, which also introduced Tags. However, the Categories vs. Tags debate has been around in WordPress for a while because there is really no difference between them (especially from a database standpoint). WordPress posts can have multiple categories as well as multiple tags, so it makes sense to me to just combine these two 'features' and give you the option to call them whatever you want when you migrate to MODx.

By default, I call them 'Tags' in MODx since I think that fits better with the available tutorials for MODx blogging resources, and I believe that is what they truly are. I once read in a WordPress forum that Categories are supposed to be like folders (so one category per post), and Tags are, well, tags (i.e. go crazy and label your content however you want). With MODx, we can use containers as Categories and the new Tags template variable for tag-based categorization. This script doesn't set up any category folders since I don't know what to do about cross-categorization, so that step will have to be done after the migration if it's needed. For the rest of this post you can consider Categories and Tags to mean the same thing; I use them interchangeably.

To migrate Categories and Tags from WordPress you first have to set the $categories_tv variable to the name you want to use for them ("Tags" is the default).

/**
 * the name of the template variable that you want to use for your categories.
 * If it doesn't exist, it will be created. Leave blank if you don't want to
 * import categories
 */
$categories_tv = 'Tags';

If the $categories_tv variable is set to the name of your choice then a template variable will be created with that name if it doesn't already exist:

// now add the category TV
if(!empty($categories_tv) && is_null($category_tv = $modx->getObject('modTemplateVar',array('name'=>$categories_tv))))
{
  $category_tv = $modx->newObject('modTemplateVar');
  $data = array(
          'name' => $categories_tv,
          'caption' => $categories_tv,
          'type' => 'text',
  );
  $category_tv->fromArray($data);
  $category_tv->save();
  $category_tv_id = $category_tv->get('id');

  foreach($templates as $template_name => $template_id)
  {
    if($template_id)
    {
      $tv_link = $modx->newObject('modTemplateVarTemplate');
      $tv_link->set('templateid',$template_id);
      $tv_link->set('tmplvarid',$category_tv_id);
      $tv_link->set('rank',1);
      $tv_link->save();
    }
  }
}

First I find the categories template variable by its name. If it doesn't exist, it's created and assigned to each of the Templates that were created previously. I can see now that I need to move the template assignment outside of the conditional that checks for the Template Variable, since the Template Variable may already exist without being assigned to these Templates (and $category_tv_id is only set inside that conditional). We'll call that bug #1; it is already fixed on GitHub.
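
To make bug #1 a bit more concrete, here's a rough sketch of the corrected control flow: find or create the Template Variable first, then always run the template assignment. It uses plain arrays as stand-ins for the MODx tables so it can run on its own. The function names and sample ids are made up for illustration, and this isn't necessarily how the fix on GitHub is written.

```php
<?php
// In-memory stand-ins for the MODx tables (assumptions for illustration only).
$tvs   = ['Tags' => 7];                            // TV name => TV id; 'Tags' already exists
$links = [['templateid' => 1, 'tmplvarid' => 7]];  // existing TV-to-template links

// Find the TV by name, creating it only when it is missing.
function find_or_create_tv(array &$tvs, string $name): int
{
    if (!isset($tvs[$name])) {
        $tvs[$name] = $tvs ? max($tvs) + 1 : 1;
    }
    return $tvs[$name];
}

// Link the TV to every template, skipping links that already exist.
function link_tv_to_templates(array &$links, int $tv_id, array $templates): void
{
    foreach (array_unique($templates) as $template_id) {
        $link = ['templateid' => $template_id, 'tmplvarid' => $tv_id];
        if (!in_array($link, $links, true)) {
            $links[] = $link;
        }
    }
}

$templates = ['default' => 1, 'post' => 2, 'page' => 3];

// The id is resolved whether or not the TV existed, and linking always runs.
$category_tv_id = find_or_create_tv($tvs, 'Tags');
link_tv_to_templates($links, $category_tv_id, $templates);
```

Because the link step skips pairs that are already present, running it a second time leaves the links untouched, which matters if the import script gets run more than once.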

Migrating categories while keeping them related to their respective posts is done inside of the post migration loop:

  /**
   * categories migration ------------------------------------------------------
   *
   * a template variable stores the categories.  This tag-based structure
   * combines wordpress categories and tags into one field
   */
  if(!empty($categories_tv) && is_object($resource))
  {
    $term_taxonomies = array();
    $terms = array();

    $term_relationships = $post->getMany('TermRelationships');
    if (!is_array($term_relationships)) continue;

    foreach($term_relationships as $tr)
    {
      // getOne() returns null when the related row is missing, so skip those
      $tt = $tr->getOne('TermTaxonomys');
      if (is_object($tt)) $term_taxonomies[] = $tt;
    }

    foreach($term_taxonomies as $tt)
    {
      $term = $tt->getOne('Terms');
      if (is_object($term)) $terms[] = $term->get('name');
    }

    // if categories exist add it to the templatevar for this resource
    if (count($terms) > 0)
    {
      $terms = join(',',$terms);
      $tv = $modx->newObject('modTemplateVarResource');
      $tv->set('tmplvarid',$category_tv_id);
      $tv->set('contentid',$res_id);
      $tv->set('value',$terms);
      $tv->save();
    }
  }

xPDO comes to the rescue again as I use it to first get the category relationships (TermRelationships) for the current post. Each relationship has a corresponding Taxonomy which in turn has a corresponding Term. If it sounds confusing, it is. xPDO handles it well though, and we end up creating one new Template Variable Resource per post, holding the comma-separated term names, with a relationship to the category Template Variable and the migrated post. You'll note that there's no mention of 'categories' or 'tags' in the xPDO queries. This is because categories and tags exist side by side in the WordPress tables and we can get all of them simply by finding the ones related to the current post. The only difference between them on the WordPress side is how they are labeled in the Taxonomy table ('category' or 'post_tag' in the taxonomy column). This label makes it possible to migrate only categories or only tags, but that's left as an exercise for the user for now.
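
If you do want to migrate only categories or only tags, the filtering could look something like the sketch below. It's plain PHP working on sample rows rather than xPDO objects. Each row stands in for what you'd read off one TermTaxonomy and its Term, and the terms_to_tv_value() helper is a made-up name, not something in the script.

```php
<?php
// Each row stands in for one TermRelationship -> TermTaxonomy -> Term chain
// of a single post (sample data, made up for illustration).
$post_terms = [
    ['taxonomy' => 'category', 'name' => 'MODx'],
    ['taxonomy' => 'post_tag', 'name' => 'xPDO'],
    ['taxonomy' => 'post_tag', 'name' => 'migration'],
    ['taxonomy' => 'category', 'name' => 'MODx'], // duplicate, dropped below
];

// Build the comma-separated TV value, optionally limited to some taxonomies.
// An empty $taxonomies list keeps everything, like the script does now.
function terms_to_tv_value(array $rows, array $taxonomies = []): string
{
    $names = [];
    foreach ($rows as $row) {
        if ($taxonomies && !in_array($row['taxonomy'], $taxonomies, true)) {
            continue;
        }
        $names[] = $row['name'];
    }
    return implode(',', array_unique($names));
}

$all_terms = terms_to_tv_value($post_terms);               // categories and tags together
$tags_only = terms_to_tv_value($post_terms, ['post_tag']); // tags only
```

In the real loop the taxonomy label would presumably come from the TermTaxonomy object's taxonomy field, checked before its Term is fetched.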

Postmetas

Postmetas are the dumping ground for any extra data related to posts. They're like MODx template variables with a few differences: 1) many postmetas hold information that is hidden from the user (MODx Template Variables are displayed by default), 2) Postmetas that are displayed to the user either show up as Custom Fields (which, AFAIK, are only textarea inputs), or are shown in a variety of ways by plugins. Regardless of these differences, the rows in the WordPress Postmeta table are small bits of information, each tied to a post. This makes them perfect for Template Variables.

Postmeta creation is done by finding all of the unique keys in the Postmeta table and creating template variables based on those names:

// set up all wp postmeta keys as template variables
$criteria = $wp->newQuery('Postmeta');
$criteria->groupby('meta_key');
$criteria->sortby('meta_key','ASC');
$postmetas = $wp->getCollection('Postmeta',$criteria);

foreach ($postmetas as $meta)
{
  // create each meta tv
  $meta_tv = $modx->newObject('modTemplateVar');
  $data = array(
          'name' => $meta->get('meta_key'),
          'caption' => $meta->get('meta_key'),
          'type' => 'textarea',
  );
  $meta_tv->fromArray($data);
  $meta_tv->save();
  $meta_tv_id = $meta_tv->get('id');
  $meta_tv_ids[$meta->get('meta_key')] = $meta_tv->get('id');

  // link the postmeta tvs to the templates
  foreach($templates as $template_name => $template_id)
  {
    if($template_id)
    {
      $tv_link = $modx->newObject('modTemplateVarTemplate');
      $tv_link->set('templateid',$template_id);
      $tv_link->set('tmplvarid',$meta_tv_id);
      $tv_link->set('rank',1);
      $tv_link->save();
    }
  }
}

The first block of code in this snippet gets a collection of postmetas using a custom criteria. The custom criteria is needed because we need the distinct meta_keys in the table, and this is done with groupby(). Next, I iterate over each distinct postmeta key and create a new Template Variable. When the new Template Variable is created, it's added to a special array that uses the meta_key for the array key and the new id for the value. This array is used later to retrieve the id of the Template Variable. Then, like for Tags, I assign each new Template Variable to the Templates we created.
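
Here's a small stand-alone sketch of what that lookup array ends up looking like. It's plain PHP over made-up sample rows, with array_unique() and sort() playing the role of groupby() and sortby(), and sequential numbers standing in for the ids that MODx would hand back after save().

```php
<?php
// Sample rows standing in for the WordPress postmeta table (made-up data).
$postmeta_rows = [
    ['post_id' => 1, 'meta_key' => '_edit_lock', 'meta_value' => '1260000000'],
    ['post_id' => 1, 'meta_key' => 'mood',       'meta_value' => 'happy'],
    ['post_id' => 2, 'meta_key' => 'mood',       'meta_value' => 'tired'],
    ['post_id' => 2, 'meta_key' => '_edit_lock', 'meta_value' => '1260000500'],
];

// Distinct keys in ascending order, similar in spirit to
// groupby('meta_key') plus sortby('meta_key','ASC').
$meta_keys = array_unique(array_column($postmeta_rows, 'meta_key'));
sort($meta_keys, SORT_STRING);

// Map meta_key => TV id. The sequential ids are placeholders for the
// ids a real save() would assign.
$meta_tv_ids = [];
$next_id = 10;
foreach ($meta_keys as $key) {
    $meta_tv_ids[$key] = $next_id++;
}
```

Four postmeta rows collapse into two Template Variables here, one per distinct key, and the map lets the post loop go from a meta_key straight to its Template Variable id.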

Like Tags, Postmetas are migrated in the post migration loop:

  /**
   * post meta migration -------------------------------------------------------
   */
  $postmetas = '';
  $criteria = $wp->newQuery('Postmeta');
  $criteria->where(array(
          'post_id' => $post->get('ID'),
  ));
  $postmetas = $wp->getCollection('Postmeta',$criteria);
  if (is_array($postmetas))
  {
    foreach($postmetas as $meta)
    {
      $tv = $modx->newObject('modTemplateVarResource');
      $tv->set('tmplvarid',$meta_tv_ids[$meta->get('meta_key')]);
      $tv->set('contentid',$res_id);
      $tv->set('value',$meta->get('meta_value'));
      $tv->save();
    }
  }

First I check whether the current post has any Postmetas using an xPDO query. Since we're not retrieving the collection by primary key, we need a custom criteria whose where clause finds postmetas by post ID. If there are postmetas, I create a new Template Variable Resource for each one and set up the relationships to both the Template Variable it belongs to and the resource. As you can see, this is done simply by setting the appropriate id values in the tmplvarid and contentid fields. The value is stored as well, of course.
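
One thing to watch for: WordPress allows a post to have several postmeta rows with the same key, and a key could in theory be missing from the $meta_tv_ids map. Below is a hedged sketch of how the rows for one post could be prepared before saving, in plain PHP with made-up data. The build_tv_rows() helper and the '||' glue between repeated values are my own assumptions, not something the script does today.

```php
<?php
// Made-up sample data; $meta_tv_ids mirrors the meta_key => TV id map.
$meta_tv_ids = ['mood' => 11, 'music' => 12];
$post_metas  = [
    ['meta_key' => 'mood',    'meta_value' => 'happy'],
    ['meta_key' => 'music',   'meta_value' => 'jazz'],
    ['meta_key' => 'music',   'meta_value' => 'blues'],   // repeated key
    ['meta_key' => 'unknown', 'meta_value' => 'ignored'], // no TV for this key
];

// Hypothetical helper: one row per TV for the given resource, with
// repeated keys merged into a single value.
function build_tv_rows(array $metas, array $tv_ids, int $res_id, string $glue = '||'): array
{
    $values = [];
    foreach ($metas as $meta) {
        $key = $meta['meta_key'];
        if (!isset($tv_ids[$key])) {
            continue; // skip keys that never got a Template Variable
        }
        $values[$key][] = $meta['meta_value'];
    }

    $rows = [];
    foreach ($values as $key => $list) {
        $rows[] = [
            'tmplvarid' => $tv_ids[$key],
            'contentid' => $res_id,
            'value'     => implode($glue, $list),
        ];
    }
    return $rows;
}

$rows = build_tv_rows($post_metas, $meta_tv_ids, 42);
```

Each entry in $rows carries the same three fields that get set on the Template Variable Resource in the loop above, so saving them would just be a matter of looping over the result.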

Where Do We Go From Here?

For all intents and purposes this script should be considered workable but not finished. There are other things we can migrate of course: authors, options, plugins even, but the script is now to the point where I have all of the data I need to migrate this blog into MODx Revolution. Since that is my main goal, I'm going to spend my time working on that, and come back to this script once the new DEVTRENCH site is up and running. The next time I post, this site will be running on MODx Revolution, have a new look, and a new, more dedicated focus on serving the MODx community. I'm pretty much putting my money where my mouth is as a developer and choosing to support MODx over WordPress, especially since I've chosen it for my blogging platform. I've made my final decision.

As for the future of this script, I'm well aware that in its current state it's pretty much a developer-only migration tool. So I plan to make a version that uses Custom Manager Pages to set the configuration options and run the script in a more user-friendly fashion. This has the big plus that it can be packaged and therefore installed via the new Package Manager in Revolution. Don't get me wrong though, I don't ever foresee a complicated blog such as this one being migrated without some kind of intervention by a developer. It's simply a fact that there are too many custom plugins to account for on the WordPress side to be able to do a straight migration and have everything working in MODx as it did in WordPress. Case in point, this blog uses plugins for syntax highlighting, a downloader, SEO, related posts, etc., none of which exist on the MODx side. The syntax highlighter, downloader and others make use of WordPress's shortcode feature, so not only does the feature not work on the MODx side, but the content is littered with shortcodes that MODx has no idea what to do with. It's very much the same as a migration from MODx Evolution to Revolution but worse, and we haven't even gotten into how to make a working blog with our migrated data. I'm hoping that once I've finished with my migration I'll have more insight into how to make the process smoother.
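
As a first step toward dealing with those leftover shortcodes, something like the sketch below could at least report which ones appear in the migrated content. This is only a rough idea in plain PHP: the find_shortcode_tags() name and the sample content are made up, and the regular expression is a simplification that will miss edge cases rather than a copy of WordPress's own shortcode parser.

```php
<?php
// Hypothetical helper: list the distinct shortcode tag names in some content.
// Only opening tags that start with a letter are matched, so closing tags
// like [/sourcecode] and array indexes like $a[0] are ignored.
function find_shortcode_tags(string $content): array
{
    preg_match_all('/\[([A-Za-z][\w-]*)(?:\s[^\]]*)?\]/', $content, $matches);
    return array_values(array_unique($matches[1]));
}

// Made-up sample post content.
$content = 'Intro [sourcecode language="php"]echo $a[0];[/sourcecode] '
         . 'then [download id="3"] and again [download id="4"].';

$tags = find_shortcode_tags($content);
```

Running this over every migrated post would give a list of shortcodes to replace with MODx snippet calls or clean up by hand.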

Until then, wish me luck on moving this beast, and as they say, I'll see you on the flip side.