Migrating from WordPress to CakePHP

Posted by Felix Geisendörfer, on Sep 24, 2007 - in PHP & CakePHP » Other

Hey folks,

I'm currently working on migrating my blog from WordPress to a lightweight CakePHP replacement that I hope to enhance and customize to my liking easily in the future. It's not like WordPress has treated me badly; if it weren't around I'd probably never have started this blog and never gotten to where I am right now. While I'm giving out kudos, I'd also like to mention that you, the readers of this blog, have been an amazingly good audience by sharing your knowledge and opinions on PHP programming with me, while also being very encouraging about the whole thing. I really feel bad when I don't get to blog, so time to do it again : ).

In this post I'm simply going to throw out some snippets to show how I'm currently approaching the whole process, both to share my insight into what works and what doesn't and to hopefully get some people to share their own insight into migrating legacy apps to CakePHP.

So one of the first things I did was to start on a new layout and come up with the visual screen elements that I want to integrate. The next thing I did (and am still doing) is to start a little migration script using the new Cake 1.2 console to get my data moved into the new db schema. I haven't finalized the new schema yet and kind of make it up as I go: run a migration, then implement the functionality to show the data currently available. So far I've been working on migrating the wp_posts and wp_comments tables. The first thing that hit me was that WordPress is not good about conventions. For example, the primary key on my wp_posts table is 'ID' while it is 'comment_ID' on the comments table. I soon realized that one of the biggest tasks would be to map old WordPress table fields to ones that have different names in my new CakePHP blog. But before I go into too much detail, I'll paste the migration script I currently have and then explain it a little further:

php
<?php
uses('model'.DS.'model');

// Console shell that drives the whole migration.
class WpMigrationShell extends Shell {
  var $uses = array(
    'WpPost'
    , 'WpComment'
  );

  var $times = array();

  function initialize() {
    $this->times['start'] = microtime(true);
    parent::initialize();
  }

  function main() {
    $this->out('Migrating WordPress posts ...');
    $this->WpPost->migrate();
    $this->out('Migrating WordPress comments ...');
    $this->WpComment->migrate();

    $this->out('Populating WordPress posts counter cache ...');
    $this->WpPost->populateCounterCache();

    $this->times['end'] = microtime(true);
    $duration = round($this->times['end'] - $this->times['start'], 2);
    $this->hr();
    $this->out('Migration took '.$duration.' seconds.');
  }
}

// Base class for all 'Wp' models: maps old fields to new ones and saves them.
class WpMigrationModel extends Model{
  var $useDbConfig = 'wordpress';
  var $newModel = null;
  // old WordPress field => new field; the first key has to be the primary key
  var $map = array();

  function migrate() {
    // 'WpPost' => 'Post', 'WpComment' => 'Comment'
    if (empty($this->newModel)) {
      $this->newModel = substr($this->name, 2);
    }

    loadModel($this->newModel);
    $Model = new $this->newModel();
    // Start from an empty table so the migration can be re-run at any time
    $Model->execute('TRUNCATE '.$Model->table);

    $methods = get_class_methods($this);
    $keys = array_keys($this->map);
    $idKey = $keys[0];

    $oldEntries = $this->findAll(null, $keys);
    foreach ($oldEntries as $oldEntry) {
      if (!$this->filter($oldEntry[$this->name])) {
        continue;
      }
      $id = $oldEntry[$this->name][$idKey];
      $Model->create();
      foreach ($this->map as $oldField => $newField) {
        $value = $oldEntry[$this->name][$oldField];
        // Apply migrate<NewField>() to the old value if such a method exists
        $migrateFct = 'migrate'.Inflector::camelize($newField);
        if (in_array($migrateFct, $methods)) {
          $value = $this->{$migrateFct}($value);
        }
        $Model->set($newField, $value);
      }
      if (!$Model->save()) {
        die('Could not save '.$this->newModel.' #'.$id);
      }
    }
  }

  // Return false to skip an old entry; subclasses override this
  function filter($item) {
    return true;
  }

  // utf8_decode() converts a UTF-8 string to ISO-8859-1
  function migrateText($text) {
    return utf8_decode($text);
  }
}

class WpPost extends WpMigrationModel{
  var $useTable = 'posts';
  var $map = array(
    'ID' => 'id'
    , 'post_title' => 'title'
    , 'post_content' => 'text'
    , 'post_status' => 'published'
  );

  function populateCounterCache() {
    $Post = new Post();
    $Post->recursive = -1;
    $Post->Comment->recursive = -1;

    // Fields are the second argument of findAll(); the first one is the conditions
    $posts = $Post->findAll(null, array('id'));
    foreach ($posts as $post) {
      $Post->set($post);
      $comment_count = $Post->Comment->findCount(array('post_id' => $Post->id));
      if ($comment_count) {
        $Post->set(compact('comment_count'));
        $Post->save();
      }
    }
  }

  // Only keep real posts; skip every other post_status
  function filter($item) {
    if (!in_array($item['post_status'], array('publish', 'draft', 'private'))) {
      return false;
    }
    return true;
  }

  // varchar 'post_status' => boolean 'published'
  function migratePublished($value) {
    if ($value == 'publish') {
      return true;
    }

    return false;
  }
}

class WpComment extends WpMigrationModel{
  var $useTable = 'comments';
  var $map = array(
    'comment_ID' => 'id'
    , 'comment_post_ID' => 'post_id'
    , 'comment_author' => 'author_name'
    , 'comment_author_email' => 'author_email'
    , 'comment_author_url' => 'author_url'
    , 'comment_content' => 'text'
  );

  // Skip comments whose post was not migrated (or no longer exists)
  function filter($item) {
    static $Post;
    if (empty($Post)) {
      $Post = new Post();
    }

    $Post->set('id', $item['comment_post_ID']);
    return $Post->exists();
  }

  function migrateAuthorName($name){
    return utf8_decode($name);
  }
}
?>

Alright, there is quite some stuff going on here, so let me begin with the basics. I've decided to give all WordPress tables their own model, which I prefix with 'Wp' so they don't overlap with the models I'm using in the new CakePHP application. Those 'Wp' models all extend a common base model which I call the WpMigrationModel. This is because I found that a lot of functionality is shared across them, especially things like mapping old 'Wp' fields to new fields for my CakePHP models as well as filtering out items I'm not interested in anymore. All of this is happening in WpMigrationModel::migrate().

Another neat thing I built in is that while the algorithm loops through the fields of my new cake models, it also looks for a function named 'migrate' plus the camelized name of the new field (for example migrateText for the 'text' field). If it's found, then the function is applied as a "filter" to the old value in WordPress. As you can see, I'm using this to convert my latin1 encoded (wtf) fields to the new utf8 encoding. I also migrate a wp varchar field called 'post_status' to a more sane / simplified tinyint field called 'published'. Once the posts / comments are migrated, I finally loop through all posts again in order to populate a counter cache field called 'comment_count'. Oh and before I forget, during the migration of the post comments, I check if those post_id's are still around. This is necessary b/c no FK restrictions were used by WordPress, which led to some lonely data islands cluttering the db.
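If you want to play with the map-and-hook idea outside of Cake, here is a minimal sketch of it using plain arrays instead of models. Everything in it is made up for illustration and is not part of the script above: the FieldMigrator class, the sample rows, and the hand-rolled camelizing that stands in for Inflector::camelize().

```php
<?php
// Framework-free sketch of the field map + per-field hook idea.
// FieldMigrator and the sample rows below are hypothetical.

class FieldMigrator {
    // old field name => new field name
    public $map = array(
        'ID' => 'id',
        'post_title' => 'title',
        'post_status' => 'published',
    );

    // Return false to skip a row entirely.
    public function filter($row) {
        return in_array($row['post_status'], array('publish', 'draft', 'private'));
    }

    // Hook for the new 'published' field: status string => boolean.
    public function migratePublished($value) {
        return $value == 'publish';
    }

    public function migrate($oldRows) {
        $newRows = array();
        foreach ($oldRows as $oldRow) {
            if (!$this->filter($oldRow)) {
                continue;
            }
            $newRow = array();
            foreach ($this->map as $oldField => $newField) {
                $value = $oldRow[$oldField];
                // 'published' => 'migratePublished', 'author_name' => 'migrateAuthorName'
                $camelized = str_replace(' ', '', ucwords(str_replace('_', ' ', $newField)));
                $hook = 'migrate' . $camelized;
                if (method_exists($this, $hook)) {
                    $value = $this->$hook($value);
                }
                $newRow[$newField] = $value;
            }
            $newRows[] = $newRow;
        }
        return $newRows;
    }
}

$oldRows = array(
    array('ID' => 1, 'post_title' => 'Hello', 'post_status' => 'publish'),
    array('ID' => 2, 'post_title' => 'Old revision', 'post_status' => 'inherit'),
    array('ID' => 3, 'post_title' => 'Work in progress', 'post_status' => 'draft'),
);

$migrator = new FieldMigrator();
$newRows = $migrator->migrate($oldRows);
print_r($newRows);
```

The row with the 'inherit' status gets dropped by filter(), and the two remaining rows come out with the new field names and a boolean 'published' value. Adding a conversion for another field is just a matter of defining one more migrate-prefixed method, which is what makes the approach nice to extend while the schema is still in flux.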

Anyway, this of course is just the beginning, but it's already most of the data I'm really interested in migrating - I don't mind losing a couple of meta / whatever fields. In a future post I'll show how to replicate some legacy WordPress logic in my new cake blog. For now I've got to stop as my plane to San Francisco from the Atlanta airport is boarding ; ).

-- Felix Geisendörfer aka the_undefined

PS: Typos may be corrected when I get wifi again ; ).