Cake 1.2's Set class eats nested arrays for breakfast!

Posted by Felix Geisendörfer, on Feb 24, 2007 - in PHP & CakePHP » Core & Hacking

Hey folks,

I was just taking a little trip through the CakePHP core code, trying to wrap my head around Acl, Model behaviors and all sorts of stuff. While doing so I saw that the core code is using the Set class, which was added a while ago, more and more. So far this has been a little dark spot for me in the core, and from my previous quick looks at the class I've never been quite able to figure out what its exact purpose was. Until now all I knew was "well, it's probably some fancy array manipulation code that is somewhat obfuscated and undocumented". Oh boy, I wish I had spent more time on this earlier. It's probably one of the coolest new features in 1.2 and nobody realizes it ; ).

So before starting to drool over it too much ahead of time, let's take a look at a simple example. You have an array of $users as it could have been returned from a findAll call to your User model:

php
$users = array
(
    0 => array
    (
        'User' => array
        (
            'id' => 1
            , 'name' => 'Felix'
        )
    )
    , 1 => array
    (
        'User' => array
        (
            'id' => 2
            , 'name' => 'Bob'
        )
    )
    , 2 => array
    (
        'User' => array
        (
            'id' => 3
            , 'name' => 'Jim'
        )
    )
);

What you really want, however, is just a simple array containing all user 'name's: array('Felix', 'Bob', 'Jim'). Hmm. Up until today I'd probably have written some code like this to do it:

php
$userNames = array();
foreach ($users as $user)
{
    $userNames[] = $user['User']['name'];
}

Simple enough, right? Not any more! Using the new Set class we can achieve the exact same outcome like this:

php
$userNames = Set::extract($users, '{n}.User.name');
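
If you are wondering how a path like '{n}.User.name' can be read, here is a minimal sketch of a path resolver in plain PHP. To be clear, extractPath is a name I made up for illustration and this is not the Set::extract source; it only assumes that '{n}' stands for "every numeric key at this level" and that the dots separate array keys.

```php
<?php
// Hypothetical helper for illustration only; this is NOT the Set::extract source.
function extractPath($data, $path)
{
    // Accept 'a.b.c' or an already split list of keys
    $keys = is_array($path) ? $path : explode('.', $path);

    // No keys left: we have arrived at the value we were looking for
    if (empty($keys))
    {
        return $data;
    }

    $key = array_shift($keys);

    // The path wants to go deeper, but there is nothing to descend into
    if (!is_array($data))
    {
        return null;
    }

    if ($key === '{n}')
    {
        // '{n}': follow the rest of the path for every numerically indexed element
        $results = array();
        foreach ($data as $index => $value)
        {
            if (is_int($index))
            {
                $results[] = extractPath($value, $keys);
            }
        }
        return $results;
    }

    // Plain key: descend one level if it exists
    return array_key_exists($key, $data) ? extractPath($data[$key], $keys) : null;
}

$users = array(
    array('User' => array('id' => 1, 'name' => 'Felix')),
    array('User' => array('id' => 2, 'name' => 'Bob')),
    array('User' => array('id' => 3, 'name' => 'Jim')),
);

$userNames = extractPath($users, '{n}.User.name');
// $userNames is now array('Felix', 'Bob', 'Jim')
```

The recursion consumes one path segment per call, so every additional '{n}' in the path simply adds one more level of nesting to the result.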

Doesn't blow you away yet? Well, let's look at another example. Let's say our User model has a hasMany association to an Item model. Then we would get an array like this:

php
$users = array
(
    0 => array
    (
        'User' => array
        (
            'id' => 1
            , 'name' => 'Felix'
            , 'Item' => array
            (
                0 => array
                (
                    'id' => 1
                    , 'name' => 'Mouse'
                )
                , 1 => array
                (
                    'id' => 2
                    , 'name' => 'KeyBoard'
                )
            )
        )
    )
    , 1 => array
    (
        'User' => array
        (
            'id' => 2
            , 'name' => 'Bob'
            , 'Item' => array
            (
                0 => array
                (
                    'id' => 3
                    , 'name' => 'CD'
                )
            )
        )
    )
    , 2 => array
    (
        'User' => array
        (
            'id' => 3
            , 'name' => 'Jim'
            , 'Item' => array
            (
                0 => array
                (
                    'id' => 4
                    , 'name' => 'USB Stick'
                )
                , 1 => array
                (
                    'id' => 5
                    , 'name' => 'MP3 Player'
                )
                , 2 => array
                (
                    'id' => 6
                    , 'name' => 'Cellphone'
                )
            )
        )
    )
);

Now here is how I would have traditionally turned this into a 'User.name' => 'User.items' array:

php
$userItems = array();
foreach ($users as $user)
{
    foreach ($user['User']['Item'] as $item)
    {
        $userItems[$user['User']['name']][] = $item['name'];
    }
}

But using the new Set class this is still pretty much a simple one-liner (split up in multiple lines so you don't have to scroll):

php
$userItems = array_combine
(
    Set::extract($users, '{n}.User.name')
    , Set::extract($users, '{n}.User.Item.{n}.name')
);

Both methods will output:

php
(
    [Felix] => Array
        (
            [0] => Mouse
            [1] => KeyBoard
        )

    [Bob] => Array
        (
            [0] => CD
        )

    [Jim] => Array
        (
            [0] => USB Stick
            [1] => MP3 Player
            [2] => Cellphone
        )
)
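
To see what array_combine contributes here, the sketch below feeds it literal arrays. These are stand-ins I typed out by hand for what the two Set::extract calls are described as returning, so nothing in it depends on Cake. It also shows one case where the two approaches would differ: with duplicate names, array_combine keeps only the last entry, while the nested foreach would append both users' items under the same key.

```php
<?php
// Hand-written stand-ins for the results of the two Set::extract() calls
$names = array('Felix', 'Bob', 'Jim');
$items = array(
    array('Mouse', 'KeyBoard'),
    array('CD'),
    array('USB Stick', 'MP3 Player', 'Cellphone'),
);

// Pairs the first name with the first item list, the second with the second, ...
$userItems = array_combine($names, $items);

// Duplicate keys: the later value silently overwrites the earlier one
$dupes = array_combine(
    array('Bob', 'Bob'),
    array(array('CD'), array('DVD'))
);
// $dupes is array('Bob' => array('DVD'))
```

array_combine also needs both arrays to have the same number of elements, so this only works as long as both extracts return exactly one entry per user.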

"But doesn't it cost more performance to loop through the array twice in the Set example?" I hear some of you cry. Yes it does. And? Have you built your application yet? Does it implement all the features you are dreaming of? And most importantly: do your web stats indicate you are going to have 1 million hits / day soon? If so, go back into your code and replace the Set example with the less succinct foreach alternative. If not, listen to Chris Hartjes, whose motto for 2007 is Just Build It, Damnit!.

Anyway, here comes my last fun thing to do with Set::extract: parsing an RSS feed for all post titles. For my example I'll use the new Xml class in Cake 1.2. Right now Set::extract only supports arrays, but hopefully it will either natively support Xml objects at some point, or the Xml class will get its own extract function. For now I've written a little function that turns an Xml instance into a nested array:

php
function xmltoArray($node)
{
    $array = array();

    foreach ($node->children as $child)
    {
        // Leaf nodes contribute their value, everything else is converted recursively
        if (empty($child->children))
        {
            $value = $child->value;
        }
        else
        {
            $value = xmltoArray($child);
        }

        $key = $child->name;

        if (!isset($array[$key]))
        {
            // First child with this name: store it directly
            $array[$key] = $value;
        }
        else
        {
            // Repeated name: turn the existing entry into a list (once) ...
            if (!is_array($array[$key]) || !isset($array[$key][0]))
            {
                $array[$key] = array($array[$key]);
            }

            // ... and append the new value to it
            $array[$key][] = $value;
        }
    }

    return $array;
}
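
To make the shape of the result easier to picture, here is a small sketch that runs the function on hand-built nodes. makeNode is a hypothetical stand-in of mine, not part of Cake: it only models the three properties the function reads (name, value, children). The function itself is repeated behind a function_exists guard so the snippet runs on its own.

```php
<?php
// Hypothetical stand-in for an Xml node; NOT the Cake 1.2 Xml class
function makeNode($name, $value = null, $children = array())
{
    $node = new stdClass();
    $node->name = $name;
    $node->value = $value;
    $node->children = $children;
    return $node;
}

// Same logic as xmltoArray() above, repeated so this snippet is self-contained
if (!function_exists('xmltoArray'))
{
    function xmltoArray($node)
    {
        $array = array();
        foreach ($node->children as $child)
        {
            $value = empty($child->children) ? $child->value : xmltoArray($child);
            $key = $child->name;
            if (!isset($array[$key]))
            {
                $array[$key] = $value;
                continue;
            }
            if (!is_array($array[$key]) || !isset($array[$key][0]))
            {
                $array[$key] = array($array[$key]);
            }
            $array[$key][] = $value;
        }
        return $array;
    }
}

$itemA = makeNode('item', null, array(makeNode('title', 'Post A')));
$itemB = makeNode('item', null, array(makeNode('title', 'Post B')));

// Two <item> children: 'item' becomes a numerically indexed list
$two = xmltoArray(makeNode('channel', null, array($itemA, $itemB)));
// array('item' => array(array('title' => 'Post A'), array('title' => 'Post B')))

// A single <item> child: 'item' stays a plain associative array
$one = xmltoArray(makeNode('channel', null, array($itemA)));
// array('item' => array('title' => 'Post A'))
```

Note the second case: a name that occurs only once is not wrapped in a list, so code that expects numeric keys under it should be prepared for both shapes.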

So now let's assume we want to extract all post titles from my feed: http://feeds.feedburner.com/thinkingphp. We could leverage the Set class to make our code as succinct as:

php
uses('Xml');

$feed = xmltoArray(new XML('http://feeds.feedburner.com/thinkingphp'));
$postTitles = Set::extract($feed, 'rss.channel.item.{n}.title');

Which will give you a $postTitles array like this:

php
(
    [0] => How-to: Use Html 4.01 in CakePHP 1.2
    [1] => Looking up foreign key values using Model::displayField
    [2] => Bug-fix update for SVN/FTP Deployment Task
    [3] => Access your config files rapidly (Win32 only)
    [4] => Making error handling for Model::save more beautiful in CakePHP
    [5] => Full content RSS feed
    [6] => Visual Sorting - Some Javascript fun I had last night
)

Now that's beauty right there and a good way to end this post ; ). Take a look at the Set class's source to find out about some other cool methods it has, but to me Set::extract is by far the coolest.

-- Felix Geisendörfer aka the_undefined