Posted by: phillipnb | May 4, 2011

PHP and XML – Part 2


In this edition of our series about PHP and XML, we will try to understand the SimpleXML library. SimpleXML is an extension that focuses on parsing XML, and it is enabled by default in PHP 5 and later. It converts an XML document into an object that can be processed with property access and array-style iteration. All objects created by SimpleXML are instances of the SimpleXMLElement class, so you get one either by instantiating the class yourself or by calling one of the loader functions shown in the examples. SimpleXML provides an easy way to get an element’s attributes or text, provided you know the document’s layout. Code that navigates an XML document with this library is short and uncomplicated as long as you are reading XML files or extracting data from XML strings, but the library is awkward when it comes to deleting nodes.

Enough of this lecture about SimpleXML; let us dive into examples that demonstrate what we discussed in the previous paragraph.

Example 1:

$myStr = "<?xml version='1.0' encoding='utf-8'?><car><model>1997</model><price>2500</price></car>";
$xml = simplexml_load_string($myStr);
echo "\n Root : ".$xml->getName();
foreach($xml as $child) {
    echo "\n Node : ".$child->getName()." Value : ".$child;
}

In the example shown above, we use the function simplexml_load_string() to load our XML. It interprets a string of XML and converts it into an object of class SimpleXMLElement, or returns false if the string cannot be parsed. This object represents the root element of the document. From the root, you can visit each child node with a foreach loop, as if the object were an array, picking one child at a time.
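Since simplexml_load_string() returns false for XML that is not well-formed, it is worth checking the result before using it. Here is a minimal sketch, assuming a hypothetical helper named loadXmlOrNull() that is not part of the examples in this post:

```php
<?php
// Hypothetical helper (an assumption for this sketch): returns a
// SimpleXMLElement, or null if the string cannot be parsed.
function loadXmlOrNull(string $source): ?SimpleXMLElement
{
    // Collect libxml errors instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $xml = simplexml_load_string($source);
    if ($xml === false) {
        foreach (libxml_get_errors() as $error) {
            echo "\n Parse error : " . trim($error->message);
        }
        libxml_clear_errors();
    }
    // Restore whatever error-handling mode was active before.
    libxml_use_internal_errors($previous);
    return $xml === false ? null : $xml;
}

$good = loadXmlOrNull("<car><model>1997</model><price>2500</price></car>");
$bad  = loadXmlOrNull("<car><model>1997</car>"); // mismatched tags
```

The strict comparison with false matters here, because a valid but empty SimpleXMLElement can also evaluate to false in a loose check.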

Example 2:
car1.xml

<?xml version='1.0' encoding='utf-8'?>
<cars>
    <car id='T88'>
        <make>Toyota</make>
        <year>1988</year>
    </car>
    <car id='H90'>
        <make>Honda</make>
        <year>1990</year>
    </car>
    <car id='C93'>
        <make>Chrysler</make>
        <year month='Jul'>1993</year>
    </car>    
</cars>

In Example 2, we load the XML file using the function simplexml_load_file(), which also returns an object of class SimpleXMLElement. Once we have a handle to the root, we can browse through the nodes as well as their attributes, as shown below.

$xml = simplexml_load_file('car1.xml');
echo "\n Root : ".$xml->getName();
foreach($xml as $child) {
    echo "\n Node : ".$child->getName()." Make : ".$child->make. " Year : ".$child->year." Id : ".$child['id'];
}

In the above example, we are able to get the elements because we know their names, ‘make’ and ‘year’. Note how the attribute called ‘id’ is read from each element with array-style syntax.
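The same array-style syntax works on nested elements, such as the ‘month’ attribute of the third car’s year. This is a small sketch that uses an inline copy of that entry, so it runs without car1.xml:

```php
<?php
// Inline copy of the third <car> entry from car1.xml.
$myStr = "<cars><car id='C93'><make>Chrysler</make><year month='Jul'>1993</year></car></cars>";
$xml = simplexml_load_string($myStr);
$car = $xml->car[0];

// Attributes are read with array-style access; cast to string to get plain text.
$id    = (string) $car['id'];
$month = (string) $car->year['month'];

// isset() tells whether an attribute exists at all ('day' does not).
$hasDay = isset($car->year['day']);

echo "\n Id : ".$id." Month : ".$month;
```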

Example 3:
In this example, we pass our XML to the constructor of the SimpleXMLElement class. We then add an element called ‘car’ with two sub-elements, ‘make’ and ‘year’, and finally convert the object back to XML with asXML().

$myStr = "<?xml version='1.0' encoding='utf-8'?><cars><car><make>Ford</make><year>2009</year></car></cars>";
$xml = new SimpleXMLElement($myStr);
$car = $xml->addChild('car');
$car->addChild('make','Kia');
$car->addChild('year','2005');
echo $xml->asXML();

Example 4:
Consider this XML file (ex4.xml):

<?xml version='1.0' encoding='utf-8'?>
<email>
<from>Tim</from>
<to>Molly</to>
<message>Hello Howdy!</message>
</email>

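Let us add an attribute to the root element of this XML file. The third constructor argument, true, tells SimpleXMLElement that the first argument is a path or URL rather than a string of XML. Our code will look like this: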

$xml = new SimpleXMLElement('ex4.xml',0,true); // options is an integer; null is deprecated in PHP 8.1+
$xml->addAttribute('type','official');
echo $xml->asXML();

Example 5:
Consider this XML file (ex5.xml):

<?xml version='1.0' encoding='utf-8'?>
<food xmlns:p="http://example.org/ns" xmlns:t="http://example.org/test">
    <fruit>
        <sweet>
            <one>Apple</one>
            <two>Pineapple</two>
        </sweet>
        <sour>
            <one>Grape</one>
            <two>Strawberry</two>
        </sour>
    </fruit>
    <vegetable>
        <tuber>
            <one>Tapioca</one>
            <two>Carrot</two>
        </tuber>
        <legume>
            <one>Peas</one>
            <two>Lentil</two>
        </legume>
    </vegetable>
    <meat>
        <one>Beef</one>
        <two>Mutton</two>
        <three>Chicken</three>
    </meat>
    <meat>
        <one>Duck</one>
        <two>Veal</two>
        <three>Rabbit</three>
    </meat>    
</food>

In Example 5, we introduce the SimpleXMLIterator class. SimpleXMLIterator provides methods for recursive iteration over the nodes of a SimpleXMLElement object: current(), getChildren(), hasChildren(), key(), next(), rewind() and valid(). The loop below walks the direct children of the root element.

$xmlIterator = new SimpleXMLIterator('ex5.xml',0,true);
$xmlIterator->rewind();

while ($xmlIterator->valid()) {
	$temp = $xmlIterator->current();
	print_r($temp);
	$xmlIterator->next();
}
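The loop above visits only the direct children of the root. To descend further, hasChildren() and getChildren() can be combined in a recursive function. This is a sketch, assuming a hypothetical function named walk() and a shortened inline copy of ex5.xml:

```php
<?php
// Hypothetical helper (an assumption for this sketch): collects one
// "path = text" line for every leaf element under the iterator.
function walk(SimpleXMLIterator $it, string $path, array &$out): void
{
    for ($it->rewind(); $it->valid(); $it->next()) {
        $here = $path . '/' . $it->key();   // key() is the current element's name
        if ($it->hasChildren()) {
            // Descend into the current element's own children.
            walk($it->getChildren(), $here, $out);
        } else {
            $out[] = $here . ' = ' . trim((string) $it->current());
        }
    }
}

$myStr = "<food><fruit><sweet><one>Apple</one><two>Pineapple</two></sweet></fruit>"
       . "<meat><one>Beef</one><two>Mutton</two></meat></food>";
$xmlIterator = new SimpleXMLIterator($myStr);
$lines = [];
walk($xmlIterator, $xmlIterator->getName(), $lines);
echo implode("\n", $lines) . "\n";
```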

If the iterator class does not suit you, you can also reach elements directly by name, using object properties and array-style indexes. For example, you can access the text ‘Beef’ and ‘Apple’ as shown below:

$xml = new SimpleXMLElement('ex5.xml',0,true);
echo $xml->meat[0]->one[0];
echo "\n".$xml->fruit->sweet->one."\n";

Example 6:

$xml = new SimpleXMLElement('ex5.xml',0,true);
echo "\n".$xml->fruit->sweet->one->attributes()."\n";
echo "\n".$xml->fruit->sweet->one['value']."\n";

Example 6 uses the same input file, ex5.xml. This time our focus is on accessing and displaying the attributes of an element, and the code above shows two ways to do it: through attributes() and through array-style access by attribute name. Note that ex5.xml as listed has no attributes on <one>, so both lines print an empty value unless you add an attribute such as ‘value’ to that element. Though we have been able to traverse an XML document successfully, we have not been able to delete a node easily. It is here that we take the help of another API called DOM.
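To see attribute access produce output, here is a sketch with two attributes added to <one>. The ‘value’ and ‘color’ attributes are assumptions made for illustration and are not part of ex5.xml:

```php
<?php
// The 'value' and 'color' attributes are assumptions added for this sketch;
// the ex5.xml listing itself does not contain them.
$myStr = "<food><fruit><sweet><one value='1' color='red'>Apple</one></sweet></fruit></food>";
$xml = new SimpleXMLElement($myStr);
$one = $xml->fruit->sweet->one;

// attributes() can be iterated as name => value pairs.
$found = [];
foreach ($one->attributes() as $name => $value) {
    $found[$name] = (string) $value;
    echo "\n Attribute : ".$name." Value : ".$value;
}

// Array-style access reads a single attribute by name.
echo "\n".$one['value']."\n";
```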

Document Object Model

The Document Object Model extension, popularly known as ‘DOM’, allows us to operate on an XML document through the DOM API in PHP. There are two ways to import an XML document into a DOM tree:

  • from a file using DOMDocument::load('myfile.xml')
  • from a string using DOMDocument::loadXML($myXmlString)

You can also import HTML using loadHTMLFile() and loadHTML(). To save an XML document, use DOMDocument::save() to write to a file or DOMDocument::saveXML() to get a string. To save as HTML, use DOMDocument::saveHTML() for a string or DOMDocument::saveHTMLFile() for a file. Let us demonstrate some of these functions. In the example below we load an XML file and browse through its contents, displaying node by node:

$xmlDoc = new DOMDocument();
$xmlDoc->load("ex5.xml");
$x = $xmlDoc->documentElement;
foreach ($x->childNodes as $item) {
	print $item->nodeName . " = " . $item->nodeValue . "<br />";
}
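Because ex5.xml is indented, childNodes also contains whitespace-only text nodes, which the loop above prints as #text. This sketch skips them by checking nodeType, using a shortened inline document loaded with loadXML():

```php
<?php
// Shortened stand-in for ex5.xml, indented so that whitespace nodes appear.
$myStr = "<food>\n    <fruit>Apple</fruit>\n    <meat>Beef</meat>\n</food>";
$xmlDoc = new DOMDocument();
$xmlDoc->loadXML($myStr);

$names = [];
foreach ($xmlDoc->documentElement->childNodes as $item) {
    // Skip text, comment and other non-element nodes.
    if ($item->nodeType !== XML_ELEMENT_NODE) {
        continue;
    }
    $names[] = $item->nodeName;
    print $item->nodeName . " = " . $item->nodeValue . "\n";
}
```

Another option is to set $xmlDoc->preserveWhiteSpace = false before loading, which asks the parser to drop blank text nodes.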

We said earlier that one important use of DOM is deleting an element from an XML file. If you are working with SimpleXML and want to pass your XML to DOM, you can use the function dom_import_simplexml(), which creates a DOMElement object from a SimpleXMLElement object. The opposite is also possible: simplexml_import_dom() gets a SimpleXMLElement object from a DOM node. Let us demonstrate deletion of an element using DOM. The XML that we will be using (saved as ex6.xml) is:

<?xml version="1.0" encoding="utf-8"?>
<book id="booklisting">
 <title>The Book</title>
 <chapter>1</chapter>
 <chapter>2</chapter>
 <chapter>3</chapter>
</book>

The php code to delete the second element named ‘chapter’ will be:

$dom = new DOMDocument();      // Creates a new DOMDocument object
$dom->load('ex6.xml');         // Load XML from a file
$book = $dom->documentElement; // returns root element
$node = $book->getElementsByTagName('chapter')->item(1); // Second <chapter>; item() is zero-based
$book->removeChild($node);     // Removes child from list of children
echo $dom->saveXML();          // Dumps the internal XML tree back into a string

Here, we first instantiate the DOMDocument class and use that object to load the XML file. Then we use the property called documentElement to get access to the root of the document. Using this root handle, we search the tree by tag name for elements called ‘chapter’. The call item(1) picks the second of them (the index is zero-based) and returns it to a variable, which is then removed from the DOM tree. Finally, the XML is saved as a string and displayed to the user.
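If the document was loaded with SimpleXML in the first place, the same deletion can be done by handing just the unwanted node to DOM. This is a sketch using an inline copy of the book XML; going through parentNode is one common way to do it, not the only one:

```php
<?php
$myStr = "<book id='booklisting'><title>The Book</title>"
       . "<chapter>1</chapter><chapter>2</chapter><chapter>3</chapter></book>";
$xml = simplexml_load_string($myStr);

// Hand the second <chapter> over to DOM; both objects refer to the same underlying node.
$node = dom_import_simplexml($xml->chapter[1]);
$node->parentNode->removeChild($node);

// The SimpleXML object reflects the change.
$remaining = [];
foreach ($xml->chapter as $chapter) {
    $remaining[] = (string) $chapter;
}
echo $xml->asXML();
```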

So much for traversing the nodes of an XML document. In the next edition of this series, we will talk about XPath. Until then, happy PHPing!
