debuggable

 

Parsing XML using SimpleXML

Posted on 3/5/07 by Tim Koschützki

Parsing XML Data with PHP's SimpleXML

Introduction

Extensible Markup Language (XML) has become the number one format for disparate systems to communicate. Its most common applications are probably the Really Simple Syndication (RSS) Feeds embraced by the blogging community - including http://php-coding-practices.com. :)

One of the most significant changes in PHP5 is the way it handles XML data. A seamless set of XML parsing tools has been integrated directly into the language itself. The old days when we poor programmers had to use external tools and libraries are finally over! The purpose of this article is to take a closer look at one of the cool new XML libraries - SimpleXML.

Short XML-Roundup

If you have ever worked with XHTML (Extensible Hypertext Markup Language), then you are already familiar with an application of XML, since XHTML is a reformulation of HTML4 as XML. I assume you are familiar with XML already. If not, head over to the W3 Schools Site and learn about it.

Important things in an XML Document

The most important things in an XML document are the following:

  • Entity: An entity is a named unit of storage. Entities can work as "variables" in an XML document. They can also be used to embed angle brackets or other characters that normally cannot be part of an XML document. Entities can be defined directly in the document or pulled in from an external source.
  • Element: A data object that can contain other elements or raw textual data. Elements can also feature one or more attributes.
  • Document Type Declaration: A set of instructions that describes the accepted structure of the XML file. It can be embedded or externally defined.

XML documents should be valid. That means they are well-formed (every tag is closed and all tags are nested correctly) and they conform to a Document Type Declaration (DTD). A DTD is not a requirement for a well-formed document and in fact, you will see many documents without one. You should still provide one, though. This is not a PHP coding best practice, but an XML one. Think about it. ;)

An Example of valid XML Documents

<?xml version="1.0"?>

The above document is only well-formed, but it is not valid. This is because it contains no DTD. Let's fix that:

<?xml version="1.0"?>
<!DOCTYPE message SYSTEM "message.dtd">

Now that is a valid XML document! It is well-formed, all tags are nested correctly and it contains a DTD.
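To see the difference between well-formed and valid in practice, here is a minimal sketch that uses PHP's DOM extension. The checkXml() helper and the two message documents are made up for illustration and are not part of the article's files; the DTD is embedded in the document so the example runs without a separate message.dtd.

```php
<?php
// Sketch: tell "well-formed" apart from "valid" with the DOM extension.
// The <message> documents below are made-up samples.

/**
 * Returns array(wellFormed, valid) for an XML string.
 */
function checkXml($xml) {
    libxml_use_internal_errors(true);          // collect errors instead of emitting warnings
    $dom = new DOMDocument();
    $wellFormed = $dom->loadXML($xml);         // false if the markup cannot be parsed
    $valid = $wellFormed && $dom->validate();  // checks the document against its DTD
    libxml_clear_errors();
    return array($wellFormed, $valid);
}

$withoutDtd = '<?xml version="1.0"?><message>Hello</message>';

$withDtd = <<<XML
<?xml version="1.0"?>
<!DOCTYPE message [
  <!ELEMENT message (#PCDATA)>
]>
<message>Hello</message>
XML;

foreach (array('without DTD' => $withoutDtd, 'with DTD' => $withDtd) as $label => $xml) {
    list($wellFormed, $valid) = checkXml($xml);
    echo $label, ": well-formed=", var_export($wellFormed, true),
         ", valid=", var_export($valid, true), "\n";
}
```

With an external DTD such as message.dtd, validate() would need that file to be reachable from the script.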

Introduction to SimpleXML

The difficult days of PHP4, when external libraries had to be used to parse and change XML files, are over. With PHP5 came a number of integrated XML libraries - one of which is SimpleXML.
True to its name, it provides an easy way to work with XML documents. SimpleXML, however, is geared toward parsing and reading XML files and is rather inferior when it comes to altering documents. Yes, you can alter XML documents with SimpleXML, but the DOM library, among others, is far superior in this field. The good news is that you can pass parsed XML objects back and forth between the new built-in libraries, which makes the overall task pretty easy.

Creating an XML Document

In order to learn how to parse XML files with PHP SimpleXML, we will need a document first. For that, we simply use the current sitemap.xml file for http://php-coding-practices.com. You can view or download it from http://php-coding-practices.com/sitemap.xml.

Here is an excerpt:

<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.1.1" -->
<!-- sitemap-generator-url="http://www.arnebrachhold.de" sitemap-generator-version="2.7.1"  -->
<!-- Debug: Total comment count: 8 -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <lastmod>2007-05-02T21:51:04+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>

  </url>
<!-- Debug: Start Postings -->
<!-- Debug: Priority report of postID 55: Comments: 0 of 8 = 0 points -->
  <url>
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <lastmod>2007-05-02T22:51:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.1</priority>

  </url>
<!-- Debug: Priority report of postID 54: Comments: 2 of 8 = 0.3 points -->
  <url>
    <loc>http://php-coding-practices.com/refactoring/refactoring-a-first-example/</loc>
    <lastmod>2007-05-02T16:16:22+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.3</priority>

  </url>

The document should be pretty straightforward if you are familiar with XML. It provides a number of URLs that each have a location, a last modification date, a change frequency and a priority. It is used with the Google Webmaster Tools to make it easier for Google to index all pages on http://php-coding-practices.com.

Loading an XML File

Let's get a head start with SimpleXML on our sitemap.xml. Create a new simplexml.php file within the same directory where you placed the sitemap.xml file. Make sure both files are in your htdocs directory somewhere so you can access the php file on your local php-enabled system. Put the following source code into the simplexml.php file:

$source = 'sitemap.xml';

// load as string
$xmlstr = file_get_contents($source);
$sitemap1 = simplexml_load_string($xmlstr);

// load as file
$sitemap2 = simplexml_load_file($source);

The code is pretty straightforward. First off, we read the file into a string with file_get_contents() and parse that string with SimpleXML's simplexml_load_string() function. Secondly, we parse the XML directly from the file using simplexml_load_file(), which saves a step and makes more sense when the data lives in a file anyway.
The file could also be a path to a remote XML file, depending on your allow_url_fopen php.ini setting. Note that both $sitemap1 and $sitemap2 are instances of the SimpleXMLElement class.
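Both loading functions return false when the document cannot be parsed, so it is worth checking the result. The following is a small sketch of one way to do that with libxml's internal error handling; the broken XML string is a made-up sample.

```php
<?php
// Collect libxml errors ourselves instead of letting PHP emit warnings.
libxml_use_internal_errors(true);

// Made-up sample: the <url> element is never closed.
$broken = '<urlset><url><loc>http://php-coding-practices.com/</loc></urlset>';

$sitemap = simplexml_load_string($broken);

$messages = array();
if ($sitemap === false) {
    foreach (libxml_get_errors() as $error) {
        // Each entry is a LibXMLError object with message, line and column.
        $messages[] = trim($error->message) . " (line {$error->line})";
    }
    libxml_clear_errors();
}

echo implode("\n", $messages), "\n";
```

The same check works for simplexml_load_file().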

SimpleXML also has an OOP-centric approach, where you can create those SimpleXMLElement objects on the fly:

$source = 'sitemap.xml';

// load as string
$xmlstr = file_get_contents($source);
$sitemap = new SimpleXMLElement($xmlstr);

// load as file
$sitemap = new SimpleXMLElement($source,0,true);

Not much need of explanation here, except that, as you see, the constructor of the SimpleXMLElement class can receive optional parameters in addition to the data. The second parameter can hold additional options on how the document should be parsed, whereas the third one informs the class that the first parameter is a path to a file instead of a string of XML.
We passed no parsing options at this point, because we do not need them for our journey. If you are eager to learn what you can do with them, check out the predefined LIBXML_* constants, which you can combine with the bitwise OR operator (|) and pass as the second parameter.
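As a small illustration of the parsing options, the sketch below combines two of the predefined libxml constants. The note document is a made-up sample, and which options you actually need depends on your data.

```php
<?php
// Made-up sample with a CDATA section.
$xml = '<note><body><![CDATA[5 < 6 & 7 > 2]]></body></note>';

// Options are a bitmask, so several constants are combined with |.
// LIBXML_NOCDATA: merge CDATA sections into regular text nodes.
// LIBXML_NONET:   disable network access while loading the document.
$options = LIBXML_NOCDATA | LIBXML_NONET;

$note = new SimpleXMLElement($xml, $options);

echo (string) $note->body, "\n";
```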

Accessing Children

SimpleXML is so cool and easy, because when you parse a document as we have done now, all children are stored as nodes of the SimpleXMLElement object - allowing us to access them easily. Let's look at this now:

// load as file
$sitemap = new SimpleXMLElement($source,0,true);

foreach($sitemap as $url) {
  echo "{$url->loc} - {$url->lastmod} - {$url->changefreq} - {$url->priority}\r\n";
}

The result is a great list of all URLs and their sub-nodes. The drawback here is that we need to know the names of all the nodes. If the XML document changes, we would need to change our client code, too. Let's take care of that:

foreach($sitemap->children() as $child) {
  echo $child->getName().":\n";

  foreach($child->children() as $subchild) {
    echo "--->".$subchild->getName().": ".$subchild."\n";
  }
}

Coolness! What we have done here is simply use the children() method of the SimpleXMLElement class, which lets us iterate over all children of a node. Your output should be something like this:

url:
--->loc: http://php-coding-practices.com/
--->lastmod: 2007-05-02T21:51:04+00:00
--->changefreq: daily
--->priority: 1
url:
--->loc: http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/
--->lastmod: 2007-05-02T22:51:04+00:00
--->changefreq: weekly
--->priority: 0.1
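One thing to watch out for: $url->loc and friends are SimpleXMLElement objects themselves, not plain strings or numbers. echo converts them for you, but when you compare or store values it is safer to cast explicitly. A minimal sketch, with a shortened sample document and an arbitrary threshold of 0.5:

```php
<?php
// Shortened sample document without the namespace, for illustration only.
$xml = <<<XML
<urlset>
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <priority>1</priority>
  </url>
  <url>
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <priority>0.1</priority>
  </url>
</urlset>
XML;

$sitemap = new SimpleXMLElement($xml);

$important = array();
foreach ($sitemap->url as $url) {
    // $url->priority is an object, so cast it before comparing numerically.
    if ((float) $url->priority >= 0.5) {
        // Store a plain string rather than a SimpleXMLElement.
        $important[] = (string) $url->loc;
    }
}

print_r($important);
```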

Now what if you simply want to dump all xml data recursively with all children? You would not want to create 20 foreach-loops right? SimpleXML itself does not provide an easy recursive function that does that. However, we can easily do it on our own:

displayChildrenRecursive($sitemap);

function displayChildrenRecursive($xmlObj,$depth=0) {
  foreach($xmlObj->children() as $child) {
    // print the current child, indented by its depth in the tree
    echo str_repeat('-',$depth).">".$child->getName().": ".trim((string) $child)."\n";
    displayChildrenRecursive($child,$depth+1);
  }
}

The recursive function is given a SimpleXMLElement object and a recursion depth. It dumps the object's children one by one and calls itself to process all sub-children of the current child.

Accessing Attributes

If our xml document contained attributes - for example if the urls had an id or number - we could access them as well. XML Example:

<?xml version="1.0" encoding="UTF-8"?>
<!-- generator="wordpress/2.1.1" -->
<!-- sitemap-generator-url="http://www.arnebrachhold.de" sitemap-generator-version="2.7.1"  -->
<!-- Debug: Total comment count: 8 -->
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url num="1">
    <loc>http://php-coding-practices.com/</loc>
    <lastmod>2007-05-02T21:51:04+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>

  </url>
<!-- Debug: Start Postings -->
<!-- Debug: Priority report of postID 55: Comments: 0 of 8 = 0 points -->
  <url num="2">
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <lastmod>2007-05-02T22:51:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.1</priority>

  </url>
<!-- Debug: Priority report of postID 54: Comments: 2 of 8 = 0.3 points -->
  <url num="3">
    <loc>http://php-coding-practices.com/refactoring/refactoring-a-first-example/</loc>
    <lastmod>2007-05-02T16:16:22+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.3</priority>

  </url>

Here is how we would parse them with the first method:

// load as file
$sitemap = new SimpleXMLElement($source,0,true);

foreach($sitemap as $url) {
  echo "Number: {$url['num']}: {$url->loc} - {$url->lastmod} - {$url->changefreq} - {$url->priority}\r\n";
}

Look at that array-like approach for attributes. Isn't that cool? Here is the implementation using the attributes() method of the SimpleXMLElement object:

foreach($sitemap->children() as $child) {
  echo $child->getName().":\n";

  foreach($child->attributes() as $attr) {
    echo "->".$attr->getName().": ".$attr."\n";
  }

  foreach($child->children() as $subchild) {
    echo "--->".$subchild->getName().": ".$subchild."\n";
  }
}

Simple, isn't it?
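If you would rather collect the attributes into a plain array, something like the following should do. The featured attribute and the one-line sample document are assumptions for the sake of the example.

```php
<?php
// Made-up one-line sample document.
$xml = '<urlset><url num="1" featured="true"><loc>http://php-coding-practices.com/</loc></url></urlset>';
$sitemap = new SimpleXMLElement($xml);

$attributes = array();
foreach ($sitemap->url[0]->attributes() as $name => $value) {
    // The key is the attribute name; cast the value to get a plain string.
    $attributes[$name] = (string) $value;
}
print_r($attributes);

// isset() tells you whether an attribute exists at all.
var_dump(isset($sitemap->url[0]['missing']));
```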

XPath Queries

The XML Path Language (XPath) is a W3C standardized language that is used to access and search XML documents. It is used extensively in Extensible Stylesheet Language Transformation (XSLT) and forms the basis for XML Query (XQuery) and XML Pointer (XPointer). It is a query language to access specific nodes deep in the XML tree in a comfortable way.

SimpleXMLElement comes with an xpath() method that does all the heavy lifting for us. Keep in mind that xpath() evaluates relative paths from the node on which it is called.
If you run a relative query on the root SimpleXMLElement, it searches from the document root - if you run it on a child, it searches only within that child and so on. It returns an array of SimpleXMLElement objects - even if only a single element matches.

$xml = <<<XML
<?xml version="1.0" encoding="UTF-8"?>
<urlset>
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <lastmod>2007-05-02T21:51:04+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>

  </url>
  <url>
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <lastmod>2007-05-02T22:51:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.1</priority>

  </url>
</urlset>
XML;

$sitemap = new SimpleXMLElement($xml);
$results = $sitemap->xpath('url/loc');
print_r($results);
foreach($results as $location) {
  echo $location."\n";
}

Important Note: XPath does not seem to like the sitemap.xml file that we use. The comments are harmless, but the urlset node declares a default namespace:

<urlset xmlns="http://www.google.com/schemas/sitemap/0.84">
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <lastmod>2007-05-02T21:51:04+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>

  </url>
...
</urlset>

If we do not register the namespace with xpath, it will not work. For now, let's remove the namespace (xmlns="http://www.google.com/schemas/sitemap/0.84"). A way we can make XPath work alongside namespaces will be discussed later.
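XPath really starts to shine with predicates. As a sketch, assuming a namespace-free document like the one above (shortened here), the following query selects only the locations of URLs that change weekly:

```php
<?php
// Shortened, namespace-free sample document.
$xml = <<<XML
<urlset>
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>http://php-coding-practices.com/refactoring/refactoring-a-first-example/</loc>
    <changefreq>weekly</changefreq>
  </url>
</urlset>
XML;

$sitemap = new SimpleXMLElement($xml);

// The predicate in square brackets filters the url elements
// before the path continues to their loc child.
$weekly = $sitemap->xpath('url[changefreq="weekly"]/loc');

foreach ($weekly as $location) {
    echo $location, "\n";
}
```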

Modifying XML Documents with SimpleXML

Adding elements and attributes

Prior to PHP 5.1.3, SimpleXML had no means to change the structure of an XML document, meaning it could not add or remove elements or attributes. Yes, it could change their values, but the only way to add or remove elements or attributes was to export the SimpleXMLElement object to the DOM library. However, with PHP 5.1.3 the methods addChild() and addAttribute() were introduced to the SimpleXMLElement object.

Let's look at the addChild() method first:

$url = $sitemap->addChild('url');
$url->addChild('loc','http://php-design-patterns.com');
$url->addChild('lastmod','2007-05-02T21:51:04+00:00');
$url->addChild('changefreq','daily');
$url->addChild('priority','0.5');

header('Content-type: text/xml');
echo $sitemap->asXML();

The addChild() method returns a SimpleXMLElement itself, to which you can add children again. It accepts three parameters - the node's name, an optional value and an optional namespace. We will come to namespaces in a minute.

Via the asXML() method of the SimpleXMLElement you can also output the entire document again, which comes in handy with the header() function to tell the browser that your script's output has to be treated as XML content. The asXML() method also accepts a file path parameter to which it can save the document. In this case it returns a boolean value indicating whether the save operation was successful or not.
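Here is a short sketch of saving to a file and reading the document back. The temporary file name is generated just for this example; in a real script you would pass your own path.

```php
<?php
// Build a tiny document from scratch.
$sitemap = new SimpleXMLElement('<urlset></urlset>');
$url = $sitemap->addChild('url');
$url->addChild('loc', 'http://php-design-patterns.com');

// Hypothetical target: a throwaway file in the system temp directory.
$path = tempnam(sys_get_temp_dir(), 'sitemap');

$reloaded = false;
if ($sitemap->asXML($path)) {
    // asXML() returned true, so the file was written; read it back.
    $reloaded = simplexml_load_file($path);
    echo $reloaded->url[0]->loc, "\n";
}

unlink($path); // clean up the temporary file
```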

The addAttribute() method is quite similar:

$url = $sitemap->addChild('url');
$url->addAttribute('featured','true');
$url->addChild('loc','http://php-design-patterns.com');
$url->addChild('lastmod','2007-05-02T21:51:04+00:00');
$url->addChild('changefreq','daily');
$url->addChild('priority','0.5');

header('Content-type: text/xml');
echo $sitemap->asXML();


We have now added an attribute "featured" with the value "true" to our url node, as we can see in the script's output:

....
<url featured="true">
  <loc>http://php-design-patterns.com</loc>
  <lastmod>2007-05-02T21:51:04+00:00</lastmod>
  <changefreq>daily</changefreq>
  <priority>0.5</priority></url>

The addAttribute() method can also receive an optional namespace.

Removing elements and attributes

While SimpleXML provides methods for adding children and attributes, it does not provide a dedicated method to remove them. However, you can remove an element with:

unset($sitemap->url[0]);

This removes the first url element together with its attributes. To remove a single attribute, unset it through the array-like syntax instead, for example unset($sitemap->url[0]['num']). Setting the attribute value to null would not actually remove it; the attribute would only become empty. For more involved changes you can export your SimpleXMLElement objects to the DOM library (explained in a later article).
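As a rough preview of the DOM route, dom_import_simplexml() gives you a DOMElement for a SimpleXMLElement node. Both objects work on the same underlying document, so changes made through DOM show up in SimpleXML as well. The sample document is made up:

```php
<?php
// Made-up sample with two url elements.
$xml = '<urlset>'
     . '<url num="1"><loc>http://php-coding-practices.com/</loc></url>'
     . '<url num="2"><loc>http://php-design-patterns.com</loc></url>'
     . '</urlset>';
$sitemap = new SimpleXMLElement($xml);

// Remove an attribute through the DOM API.
$first = dom_import_simplexml($sitemap->url[0]);
$first->removeAttribute('num');

// Remove a whole element through the DOM API.
$second = dom_import_simplexml($sitemap->url[1]);
$second->parentNode->removeChild($second);

// The SimpleXML object reflects both changes.
echo $sitemap->asXML();
```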

Working with Namespaces

The use of namespaces allows you to associate certain element and attribute names with namespaces identified by URIs. This has the benefit of avoiding naming conflicts when two elements of the same name exist, but contain different data.

Our sitemap contains a namespace already - check for the string xmlns="http://www.google.com/schemas/sitemap/0.84" in the urlset node. Let's add a few more:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
  xmlns:meta="http://example.com/meta/"
  xmlns:foo="http://example.com/foo/">

  <url>
    <loc>http://php-coding-practices.com/</loc>
    <lastmod>2007-05-02T21:51:04+00:00</lastmod>
    <changefreq>daily</changefreq>
    <priority>1</priority>

  </url>
  <url>
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <lastmod>2007-05-02T22:51:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <priority>0.1</priority>

  </url>
...

Since PHP 5.1.3, SimpleXML has had the ability to return all namespaces declared in a document (getDocNamespaces()), return all namespaces used in a document (getNamespaces()) and register a namespace prefix used in making an XPath query (registerXPathNamespace()). Here is an example for getDocNamespaces() :

$namespaces = $sitemap->getDocNamespaces();
foreach($namespaces as $key => $value) {
  echo "{$key} => {$value}\n";
}

This will output

=> http://www.google.com/schemas/sitemap/0.84
meta => http://example.com/meta/
foo => http://example.com/foo/

Fair enough, our initial namespace didn't have a name, so that first line looks a bit weird.

A call to getNamespaces() will not list the meta or foo namespaces, since we do not use them yet. If we used them within our document, by typing something like

<url>
    <loc>http://php-coding-practices.com/beautifying-your-code/php-code-beautifier-tool/</loc>
    <lastmod>2007-05-02T22:51:04+00:00</lastmod>
    <changefreq>weekly</changefreq>
    <meta:priority>0.1</meta:priority>
</url>

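getNamespaces(true) would return an array that includes the meta namespace. The true argument makes it look at the child nodes as well, not just the root node.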
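A small sketch of how that could look, including how to reach the prefixed element afterwards via children(). The shortened document and the meta:priority element are sample data:

```php
<?php
// Shortened sample: one url with an element in the meta namespace.
$xml = <<<XML
<urlset xmlns="http://www.google.com/schemas/sitemap/0.84"
  xmlns:meta="http://example.com/meta/">
  <url>
    <loc>http://php-coding-practices.com/</loc>
    <meta:priority>0.1</meta:priority>
  </url>
</urlset>
XML;

$sitemap = new SimpleXMLElement($xml);

// true = also look at namespaces used by child nodes.
$used = $sitemap->getNamespaces(true);
print_r($used);

// Elements with a prefix are reached by asking for the children
// that belong to that namespace URI.
$meta = $sitemap->url[0]->children('http://example.com/meta/');
echo $meta->priority, "\n";
```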

The tricky thing is to use namespaces and XPath with registerXPathNamespace(). The function creates a prefix/ns context for the next XPath query. In particular, this is helpful if the provider of the given XML document alters the namespace prefixes. registerXPathNamespace() will create a prefix for the associated namespace, allowing one to access nodes in that namespace without the need to change code to allow for the new prefixes dictated by the provider.

Example:

$sitemap->registerXPathNamespace('c', 'http://www.google.com/schemas/sitemap/0.84');
$result = $sitemap->xpath('//c:loc');
print_r($result);
foreach($result as $value) {
  echo $value."\n";
}

Voila, our XPath query works now and lists all url locations. :]

Conclusion

We have come to the end of our little SimpleXML journey. As you see, SimpleXML is a very lightweight and easy-to-use XML parser that provides simple yet effective solutions to the most common XML needs.

If you need to change an XML document extensively, then SimpleXML is not the way to go. We will have a look at a more suitable library for this, namely the DOM library, in a later article.

Thanks for reading! Have a good one. :)

 

PHP Code-Beautifier Tool

Posted on 1/5/07 by Tim Koschützki

Today I discovered a good tool for beautifying existing PHP Code. It works via a web interface. You can either upload a script or directly input it. The code is beautified according to the PHP PEAR Standard Requirements. It does not change or debug your code in any way. What it does is the following:

* Tries to set missing curly braces around single conditional statements. This may not work in all situations.
* Indents with four spaces.
* Uses "one true brace" style in function definitions.
* Sets one space between control keyword and condition.
* Removes space between function calls, parenthesis and beginning of argument list.

Nice tool!

You can download it here.

 

Refactoring - A first example

Posted on 30/4/07 by Tim Koschützki

Today we will look at a first php example of refactoring, to start creating a sense for it.

Think about the following scenario: You want to dynamically add workers to a department and then display that department's workers as an unordered list. Pretty easy eh? However, we will discover that there are some pitfalls and ways to improve from our initial code. We'll start off with the following:

class Department {
  protected $name;
  protected $workers = array();
 
  public function addWorker($name,$isChief,$salary) {
    $this->workers[] = array($name,$isChief,$salary);
  }
 
  public function listWorkers() {
    echo '<ul>';
    foreach($this->workers as $worker) {
      echo "<li>{$worker[0]} - {$worker[1]} - {$worker[2]}";
    }
    echo '</li></ul>';
  }
}

$dept = new Department;
$dept->addWorker('Felix Broda',1,2000);
$dept->addWorker('Stefan Muster',0,1400);
$dept->addWorker('Klaus Schmidt',0,1350);
$dept->listWorkers();

There are a couple of really bad Code Smells in it, unfortunately.

Our first refactoring

The first thing that comes to mind is the very bad way of storing the workers in the Department class. I mean, an array is by all means fine. However, look at the echo statement in listWorkers(): it depends on the numeric positions in the array, so we always have to go back to the addWorker function to remember the order in which we stored the data. How can we fix this? Introduce a class for a Worker:

class Worker {
  protected $name;
  protected $isChief = false;
  protected $salary = 0;
}

Now we can change the addWorker method of the Department class:

public function addWorker($worker) {
    $this->workers[] = $worker;
  }

Okay, so we can change the client code now in the listWorkers() method:

public function listWorkers() {
    echo '<ul>';
    foreach($this->workers as $worker) {
      echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}";
    }
    echo '</li></ul>';
  }

Okay looking good now. :)

Our second refactoring

When executing the code we see that the attributes of the Worker class must be public in order to work. A quick refactoring, let's fix it up:

class Worker {
  public $name;
  public $isChief = false;
  public $salary = 0;
}

Making our list standards-compliant

There are actually two smaller problems now with our list generation: We are missing the closing list-tags:

echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}</li>";

Okay this took us 5 seconds, but it was worth it. Now we notice that when we are outputting a department-list before adding workers to it, we get an unordered list that has starting ul-tags, but no li-tags. This is not standards-compliant, so we fix it:

public function listWorkers() {
    if(count($this->workers) > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}</li>";
      }
      echo '</ul>';
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }

Switching hats again - adding a feature

Switching hats now, because our customer told us, that he wants the number of department workers displayed at the end of each list. We are wearing our feature-hat now. This feature is a quick thing, so we implement it right away:

public function listWorkers() {
    if(count($this->workers) > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}</li>";
      }
      echo '</ul>';
      echo '<p>There are '.count($this->workers).' workers in this department.</p>';
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }

More refactoring

Looking at our code now we use the count() function twice to receive the same bit of information. Now we could call count() once and store its result in a variable. However, that would tie our interface to the implementation. Let's add a new method instead that counts the number of workers for us:

public function numWorkers() {
    return count($this->workers);
  }

Our altered listWorkers() method looks as follows:

public function listWorkers() {
    $numWorkers = $this->numWorkers();
 
    if($numWorkers > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}</li>";
      }
      echo '</ul>';
      echo "<p>There are {$numWorkers} workers in this department.</p>";
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }

Cool stuff. :) Let's see what we currently have:

class Department {
  protected $name;
  protected $workers = array();
 
  public function addWorker($worker) {
    $this->workers[] = $worker;
  }
 
  public function listWorkers() {
    $numWorkers = $this->numWorkers();
 
    if($numWorkers > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        echo "<li>{$worker->name} - {$worker->isChief} - {$worker->salary}</li>";
      }
      echo '</ul>';
      echo "<p>There are {$numWorkers} workers in this department.</p>";
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }
 
  public function numWorkers() {
    return count($this->workers);
  }
}

class Worker {
  public $name;
  public $isChief = false;
  public $salary = 0;
}

Adding some client code into the mix

$felix = new Worker;
$felix->name = 'Felix Broda';
$felix->isChief = true;
$felix->salary = 2000;

$stefan = new Worker;
$stefan->name = 'Stefan Muster';
$stefan->salary = 1400;

$klaus = new Worker;
$klaus->name = 'Klaus Schmidt';
$klaus->salary = 1350;


$dept = new Department;
$dept->addWorker($felix);
$dept->addWorker($stefan);
$dept->addWorker($klaus);
$dept->listWorkers();

Now it's all working well, but darn that client code is so much for what it does. Let's refactor a bit more and add a handy constructor to the Worker class:

class Worker {
  public $name;
  public $isChief = false;
  public $salary = 0;
 
  public function __construct($name,$salary,$isChief=false) {
    $this->name = $name;
    $this->salary = $salary;
    $this->isChief = $isChief;   
  }
}

Note that we made the isChief variable an optional parameter with the default value of false. This is handy, because most workers will not be chiefs, leaving us without the need to explicitly tell every new worker object that it is not a chief.

Now our new client code:

$felix = new Worker('Felix Broda',2000,true);
$stefan = new Worker('Stefan Muster',1400);
$klaus = new Worker('Klaus Schmidt',1350);

$dept = new Department;
$dept->addWorker($felix);
$dept->addWorker($stefan);
$dept->addWorker($klaus);
$dept->listWorkers();

Ah, far fewer lines - much better. :)

Shortening the code more

Let's add the ability to add an array of workers all at once. Our goal is to make the following line valid:

$dept->addWorker(array($felix,$stefan,$klaus));

Our altered addWorker() method:

public function addWorker($worker) {
    if(is_array($worker)) {
      foreach($worker as $w)
        $this->workers[] = $w;
    } else {
      $this->workers[] = $worker;
    }
  }

Now was this a feature or a refactoring?

Adding tax rates

Our client wants us to display the actual salaries of all workers of a department. We add the feature in our listWorkers() method:

public function listWorkers() {
    $numWorkers = $this->numWorkers();
 
    if($numWorkers > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        $salary = $worker->salary * 1.07;
        echo "<li>{$worker->name} - {$worker->isChief} - {$salary}</li>";
      }
      echo '</ul>';
      echo "<p>There are {$numWorkers} workers in this department.</p>";
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }

However, we just introduced another Code Smell. We should not hard-code the tax calculation into our client code. If the calculation changes, we would have to adjust all client code that depends on it. Let's make the overall flow clearer by introducing a method in the Worker class:

public function calcTotalSalary() {
    return round($this->salary * 1.07,2);
  }

Now we can rewrite listWorkers() as follows:

public function listWorkers() {
    $numWorkers = $this->numWorkers();
 
    if($numWorkers > 0) {
      echo '<ul>';
      foreach($this->workers as $worker) {
        echo "<li>{$worker->name} - {$worker->isChief} - {$worker->calcTotalSalary()}</li>";
      }
      echo '</ul>';
      echo "<p>There are {$numWorkers} workers in this department.</p>";
    } else {
      echo '<p>Sorry, there are no workers to display.</p>';
    }
  }

Introducing a constant for the tax rate

Now tax rates change and our code should account for them. Let's introduce a constant for the tax rate to save us a lot of trouble changing client code later when the tax rate changes:

define('TAX_RATE',17); // tax rate in percent
class Worker {
  public $name;
  public $isChief = false;
  public $salary = 0;
 
  public function __construct($name,$salary,$isChief=false) {
    $this->name = $name;
    $this->salary = $salary;
    $this->isChief = $isChief;   
  }
 
  public function calcTotalSalary() {
    return round($this->salary + TAX_RATE/100 * $this->salary,2);
  }
}
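If you prefer to keep the rate next to the code that uses it, a class constant is an alternative to define(). This is just a sketch of that variation, not part of the refactoring steps above; it rounds the result like the earlier calcTotalSalary() version did:

```php
<?php
class Worker {
    // Tax rate in percent, same value as the define() above.
    const TAX_RATE = 17;

    public $name;
    public $isChief = false;
    public $salary = 0;

    public function __construct($name, $salary, $isChief = false) {
        $this->name = $name;
        $this->salary = $salary;
        $this->isChief = $isChief;
    }

    public function calcTotalSalary() {
        // self::TAX_RATE refers to the constant of this class.
        return round($this->salary * (1 + self::TAX_RATE / 100), 2);
    }
}

$felix = new Worker('Felix Broda', 2000, true);
echo $felix->calcTotalSalary(), "\n";
```

Client code can still read the rate from the outside as Worker::TAX_RATE.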

Conclusion

As you see, when you refactor you make little changes to code that is working already, but that can be improved to quite some extent. Hopefully you noticed how we switched hats often in order to arrive at some very cool and working code. It often takes only a fundamental sense of architecture to get to clean code that works. Please keep in mind that you should not add a feature and refactor at the same time, as that will put you on the road to hell.

Happy refactoring. :)

 

The various kinds of Design Patterns

Posted on 29/4/07 by Tim Koschützki

A way to classify design patterns

Design Patterns differ in their level of abstraction and their intent to improve the design. Since there are quite many design patterns that have been discovered already, there must be a way to classify them. Again, there are many approaches to organize design patterns and I personally prefer the approach of the number one book about the subject - the Gang of Four Book, which structures design patterns into the following categories:

  • Behavioral Patterns
  • Creational Patterns
  • Structural Patterns

More categories

The GoF book focuses on design patterns from a desktop application point of view though. Since we want to delve more into the topic of design patterns in php webdevelopment, there are two more categories that we need to take into account:

  • Domain-Logic Patterns
  • Presentation-related Patterns

Let's look at them in greater detail.

Behavioral Patterns

Behavioral Patterns are concerned with algorithms and the assignment of responsibilities between objects. They describe the patterns of communication between these objects. They shift your focus away from flow of control to concentrate on the way your objects are interconnected.

Typical patterns for this category are:

  • Chain of Responsibility
  • The Observer Pattern
  • The Command Pattern
  • The Iterator Pattern
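
To make this a bit more tangible, here is a minimal sketch of the Observer Pattern. All class names (Newsletter, Subscriber) are invented for this example:

```php
<?php
// Anything that wants to be notified implements this interface.
interface Observer {
    public function update($event);
}

// The subject: it knows its observers only through the interface.
class Newsletter {
    private $observers = array();

    public function attach(Observer $observer) {
        $this->observers[] = $observer;
    }

    public function publish($title) {
        // Notify every attached observer about the new article.
        foreach ($this->observers as $observer) {
            $observer->update($title);
        }
    }
}

// A concrete observer that simply remembers what it was told.
class Subscriber implements Observer {
    public $received = array();

    public function update($event) {
        $this->received[] = $event;
    }
}

$newsletter = new Newsletter();
$tim = new Subscriber();
$newsletter->attach($tim);
$newsletter->publish('Parsing XML using SimpleXML');

print_r($tim->received);
```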

Creational Patterns

Creational Design Patterns help you with creating objects. They abstract the instantiation process of objects and thereby help make a system independent of how its objects are created and composed.
There are two recurring themes in these patterns. Firstly, they hide which concrete classes the system uses to create objects. Secondly, they hide how instances of these classes are created and put together.

Typical creational design patterns are:

  • The Factory Method
  • The Singleton Pattern
  • The Prototype Pattern
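
As a first taste, here is a minimal sketch of the Factory Method. The Importer and Parser classes are hypothetical; the point is that import() never names a concrete parser class, the subclasses decide that:

```php
<?php
interface Parser {
    public function parse($data);
}

class JsonParser implements Parser {
    public function parse($data) {
        return json_decode($data, true);
    }
}

class XmlParser implements Parser {
    public function parse($data) {
        return simplexml_load_string($data);
    }
}

abstract class Importer {
    // The factory method: subclasses decide which parser gets created.
    abstract protected function createParser();

    public function import($data) {
        // Works with whatever parser the subclass hands back.
        return $this->createParser()->parse($data);
    }
}

class JsonImporter extends Importer {
    protected function createParser() {
        return new JsonParser();
    }
}

class XmlImporter extends Importer {
    protected function createParser() {
        return new XmlParser();
    }
}

$importer = new XmlImporter();
$sitemap = $importer->import('<urlset><url><loc>http://php-coding-practices.com/</loc></url></urlset>');
echo $sitemap->url->loc, "\n";
```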

Structural Patterns

Structural Design Patterns deal with how objects can be combined into larger structures and composite objects. The design patterns in this category use either object composition or class inheritance to form these larger structures.
The motivation behind this composing of objects could either be grouping related objects together, forming compositions as they are present in real life, adapting a class' interface or extending a class' interface and capabilities at runtime.

Typical Patterns in this category are:

  • Adapter Pattern
  • Composite Pattern
  • Decorator Pattern
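
A minimal sketch of the Decorator Pattern, again with made-up class names. Each decorator wraps another Text object and adds behavior at runtime without touching the wrapped class:

```php
<?php
interface Text {
    public function render();
}

// The plain component that gets decorated.
class PlainText implements Text {
    private $text;

    public function __construct($text) {
        $this->text = $text;
    }

    public function render() {
        return $this->text;
    }
}

// A decorator: same interface, wraps any other Text object.
class BoldText implements Text {
    private $inner;

    public function __construct(Text $inner) {
        $this->inner = $inner;
    }

    public function render() {
        return '<b>' . $this->inner->render() . '</b>';
    }
}

class ItalicText implements Text {
    private $inner;

    public function __construct(Text $inner) {
        $this->inner = $inner;
    }

    public function render() {
        return '<i>' . $this->inner->render() . '</i>';
    }
}

// Decorators can be stacked in any order by composing objects.
$text = new BoldText(new ItalicText(new PlainText('Hello')));
echo $text->render(), "\n";
```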

Domain-Logic Patterns

In desktop applications domain-logic patterns have a really large scope. They can help with structuring the workflow and assign responsibilities to the proper objects. Since we are mainly concerned with web applications, the patterns in this category are closely related to databases. They help you represent a database-row or even a table as an object and perform all changes you would make to the database via the object.

Typical Patterns in this category are:

  • Active Record Pattern
  • Data Mapper Pattern
  • Table Data Gateway Pattern

Presentation-related Patterns

Presentation-related patterns are a must-use for web applications. We want to show something on our websites, after all! Only in intranets or backend scripts will we have an architecture that needs no real presentation. Think of database backup scripts. However, even those may print "Okay, backup successful" - a form of presentation. Structure your application into layers and stick to the presentation-patterns presented in this category.

Typical Patterns are:

  • Template-View
  • Transform-View

Conclusion

That's it for our brief overview of the various kinds of design patterns. I have presented one way to structure them and there are many others. In the coming few weeks we will study design patterns in detail.

Have a good one all!

 

How Design Patterns solve Problems

Posted on 28/4/07 by Tim Koschützki

Now that is a really important question. Generally you can go by this definition:

A design pattern names, abstracts and identifies the key aspects of a software design by identifying the participating classes, objects and instances, their roles, collaborations, interrelationships and the distribution of responsibilities.

However, the choice of the programming language is important. For example, PHP5 treats instances of objects as references, which might change the implementation of a certain design pattern drastically. Or, what if the language you are using supports only object composition, but no class inheritance?

There are many language-specific problems that come up when you wonder how design patterns solve design problems. That's one reason why I have created this website - to only care about PHP.

In the following, the definition above is explained further.

Finding Appropriate Objects

Object-Oriented design methodologies favor many different approaches. You could single out the nouns of a problem statement and define the corresponding objects. The verbs in the statement would become the objects' operations. Or you could concentrate on the responsibilities in your system. Or you could model the real world and translate all subjects and their actions into your design as objects. There is always disagreement on which approach works best and sometimes one approach is better suited to a problem than the others.

The last approach, modeling the real world, often works. However, when your object-oriented design needs classes that have no counterpart among the real-world entities, you might be in trouble if there is no guide.

Such a guide is a design pattern. When modeling objects becomes difficult, a design pattern helps you to identify the needed objects - even if they are low level objects like Sessions, Cookies or even simple Collections. If you strictly model the real world, your design might not be flexible enough. For example, an object that represents an algorithm, a state or a process is usually not found during the analysis phase. It's found later when one wants to make the design more flexible and reusable.

 