Using PHP to parse specific XML tag content?


Mechphisto

I'm trying to write a PHP script that will grab the info from an XML
file, and parse and sort the data into their own variables.
Can someone point me in the right direction for information?
Here's what I have so far. The following script will grab ALL data in
the file and simply make each one bold and separated by a break.
I want to be able to dump the contents of message1 into say $data1 and
message2 into $data2, etc.
Does that make sense?
Thanks for any assistance!
-Liam

<?php
$file = "xml_test.xml";
$xml_data = ""; // collected output; the handlers below append to this global
// Character data handler: append the text found between tags.
function contents($parser, $data){
    global $xml_data;
    $xml_data .= $data;
    return $xml_data;
}
// Start-tag handler: open a bold element for every tag.
function startTag($parser, $data){
    global $xml_data;
    $xml_data .= "<b>";
    return $xml_data;
}
// End-tag handler: close the bold element and add a line break.
function endTag($parser, $data){
    global $xml_data;
    $xml_data .= "</b><br />";
    return $xml_data;
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startTag", "endTag");
xml_set_character_data_handler($xml_parser, "contents");
$fp = fopen($file, "r");
$data = fread($fp, 80000); // reads at most 80000 bytes in one go
if(!(xml_parse($xml_parser, $data, feof($fp)))){
    die("Error on line " . xml_get_current_line_number($xml_parser));
}
xml_parser_free($xml_parser);
fclose($fp);
echo $xml_data;
?>

xml_test.xml contains:

<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
 
Martin Honnen

Mechphisto said:
I'm trying to write a PHP script that will grab the info from an XML
file, and parse and sort the data into their own variables.
Can someone point me a direction for information?
Here's what I have so far. The following script will grab ALL data in
the file and simply make each one bold and separated by a break.
I want to be able to dump the contents of message1 into say $data1 and
message2 into $data2, etc.
Does that make sense?

Have you considered using SimpleXML or DOM instead? Those APIs allow
you to work with XML at a higher level than the xml_parser API.
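
To give an idea of what the DOM route could look like, here is a minimal sketch. It assumes the DOM extension is available and inlines the xml_test.xml contents from the first post so it runs on its own; the helper name firstElementText is made up for illustration.

```php
<?php
// Sample document from the original post, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
XML;

$doc = new DOMDocument();
if (!$doc->loadXML($xml)) {
    die("Could not parse the XML");
}

// Hypothetical helper: text of the first element with the given tag name, or null.
function firstElementText(DOMDocument $doc, $tagName) {
    $nodes = $doc->getElementsByTagName($tagName);
    return $nodes->length > 0 ? $nodes->item(0)->textContent : null;
}

$data1 = firstElementText($doc, "message1");
$data2 = firstElementText($doc, "message2");

echo $data1, "\n", $data2, "\n";
```

To read from the file instead of a string, $doc->load("xml_test.xml") should be the equivalent call.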
 
Álvaro G. Vicario

Mechphisto wrote:
I'm trying to write a PHP script that will grab the info from an XML
file, and parse and sort the data into their own variables.
Can someone point me a direction for information? [...]
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startTag", "endTag");
xml_set_character_data_handler($xml_parser, "contents");

I wrote an XML parser a few years ago and it's not something I'd like to
repeat. While parsing you must keep track of where you are (inside a
tag? which tag?) and you have to deal with questions like "What if
contents() doesn't receive the whole contents because fread() stopped in
the middle?". There are many tools nowadays that spare you the boring
internals.

You can use any of the built-in libraries:

http://es.php.net/manual/en/refs.xml.php

Some people love SimpleXML. I've found XMLReader quite usable. There's
even a user note with the code you want to write:

http://es.php.net/manual/en/class.xmlreader.php#87288

Or you can use third-party code. I especially enjoy phpQuery:

http://code.google.com/p/phpquery/
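
For the XMLReader option mentioned above, a rough sketch might look like the following. It is not the user note from the manual, just an illustration under some assumptions: the XML from the first post is inlined, and every element whose name starts with "message" is collected into an array called $messages (a name chosen here).

```php
<?php
// Sample document from the original post, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
XML;

$reader = new XMLReader();
$reader->XML($xml);

$messages = array(); // element name => text content
while ($reader->read()) {
    // Pick up every element whose name starts with "message".
    if ($reader->nodeType === XMLReader::ELEMENT
        && strpos($reader->name, "message") === 0) {
        $messages[$reader->name] = $reader->readString();
    }
}
$reader->close();

print_r($messages);
```

XMLReader pulls nodes one at a time, so you don't have to write handlers or track parser state yourself, and it copes with large files without loading them whole.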
 
Mechphisto

[..snip all..]
Based on the fact the above works, it's just not doing exactly what I
want, I tried the following adjustments to the PHP.
Logically, I don't see any reason why it's not working, but it's
simply not displaying anything on the echo's.
Any ideas?

<?php
$file = "xml_test.xml";
function startTag($parser, $data){
    global $msg1;
    global $msg2;
    global $current;
    $current = $data;
    if ($current == "message1") {
        $msg1 .= "<b>";
        return $msg1;
    } elseif ($current == "message2") {
        $msg2 .= "<em>";
        return $msg2;
    }
}
function endTag($parser, $data){
    global $msg1;
    global $msg2;
    if ($data == "message1") {
        $msg1 .= "</b>";
        return $msg1;
    } elseif ($data == "message2") {
        $msg2 .= "</em>";
        return $msg2;
    }
}
function contents($parser, $data){
    global $msg1;
    global $msg2;
    global $current;
    if ($current == "message1") {
        $msg1 .= $data;
        return $msg1;
    } elseif ($current == "message2") {
        $msg2 .= $data;
        return $msg2;
    }
}
$xml_parser = xml_parser_create();
xml_set_element_handler($xml_parser, "startTag", "endTag");
xml_set_character_data_handler($xml_parser, "contents");
$fp = fopen($file, "r");
$data = fread($fp, 80000);
if(!(xml_parse($xml_parser, $data, feof($fp)))){
    die("Error on line " . xml_get_current_line_number($xml_parser));
}
xml_parser_free($xml_parser);
fclose($fp);

echo $msg1;
echo $msg2;
?>
 
Mechphisto

Thanks, I appreciate the reply and suggestions.
That XMLReader does look like what I could use.

Unfortunately, I'm on a shared server host and don't have control over
what libraries can be installed into PHP. So, I'm kind of stuck with
whatever solution is default built-in PHP5.

Thanks, though. :)
 
Mechphisto

Oh holy sheesh!
I saw somewhere something about the parser converting tags to
uppercase, so I changed all the tags to uppercase in the PHP and in
the XML file, and bammo! It all works fine now.
Crazy, annoying... grrr.
 
Álvaro G. Vicario

Mechphisto wrote:
[..snip all..]
Based on the fact the above works, it's just not doing exactly what I
want, I tried the following adjustments to the PHP.
Logically, I don't see any reason why it's not working, but it's
simply not displaying anything on the echo's.
Any ideas?

<?php
$file = "xml_test.xml";
function startTag($parser, $data){
global $msg1;
global $msg2;
global $current;
$current = $data;
if ($current == "message1") {

Try:

if ($current == "MESSAGE1") {
 
Mechphisto

Mechphisto wrote:


[..snip all..]
Based on the fact the above works, it's just not doing exactly what I
want, I tried the following adjustments to the PHP.
Logically, I don't see any reason why it's not working, but it's
simply not displaying anything on the echo's.
Any ideas?
<?php
$file = "xml_test.xml";
function startTag($parser, $data){
   global $msg1;
   global $msg2;
   global $current;
   $current = $data;
   if ($current == "message1") {

Try:

        if ($current == "MESSAGE1") {

Yep, that was the deal. It all has to be uppercase. :)
Thanks for your time and assistance!
 
Shahid

You can subscribe to www.phpclasses.org. I wanted to do a similar
kind of task, but with HTML 4. Visit that site and get a SAX
(Simple API for XML) parser; I got a set of SAX parser classes named
saxy_parser from there. You will find lots of other things of interest
on that site as well.
 
Ron Fox

Mechphisto said:
Oh holy sheesh!
I saw somewhere something about the parser converting tags to
uppercase, so I changed all the tags to uppercase in the PHP and in
the XML file, and bammo! It all works fine now.
Crazy, annoying... grrr.

Uh... that sounds so broken. XML is case-sensitive. Is it possible
the parser has a flag to run in HTML mode (where tags are not case
sensitive), and that flag needs to be turned off to truly parse XML?

Ah yes:

see the xml_parser_set_option man page:
http://us.php.net/manual/en/function.xml-parser-set-option.php

Specifically the XML_OPTION_CASE_FOLDING

and remember that http://us.php.net is your friend.
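
As a sketch of how that option could be applied to the script from earlier in the thread: with XML_OPTION_CASE_FOLDING switched off, the handlers receive "message1" rather than "MESSAGE1", so the XML file can stay lowercase. The $messages array and the closures are choices made for this example, not part of the original script, and the XML is inlined so the snippet runs on its own.

```php
<?php
// Sample document from the original post, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
XML;

$messages = array(); // element name => collected text
$current  = null;    // name of the message element we are inside, if any

$parser = xml_parser_create();
// Turn off case folding so element names arrive exactly as written in the XML.
xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, 0);

xml_set_element_handler(
    $parser,
    function ($parser, $name, $attrs) use (&$messages, &$current) {
        if (strpos($name, "message") === 0) {
            $current = $name;
            $messages[$name] = "";
        }
    },
    function ($parser, $name) use (&$current) {
        $current = null; // stop collecting once an element closes
    }
);
xml_set_character_data_handler(
    $parser,
    function ($parser, $data) use (&$messages, &$current) {
        // Character data can arrive in several pieces, so append rather than assign.
        if ($current !== null) {
            $messages[$current] .= $data;
        }
    }
);

if (!xml_parse($parser, $xml, true)) {
    die("Error on line " . xml_get_current_line_number($parser));
}

print_r($messages);
```

Resetting $current in the end-tag handler also keeps the whitespace between elements from being appended to the previous message.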
 
Mechphisto

Have you considered using SimpleXML or DOM instead? Those APIs allow
you to work with XML at a higher level than the xml_parser API.

Thanks, but unfortunately, I'm on a shared server host and don't have
control over what libraries can be installed into PHP. So, I'm kind of
stuck with whatever solution is built into PHP5 by default.
 
Captain Paralytic

Thanks, but unfortunately, I'm on a shared server host and don't have
control over what libraries can be installed into PHP. So, I'm kind of
stuck with whatever solution is default built-in PHP5.

From the manual:
Installation

The SimpleXML extension is enabled by default.
 
Mechphisto

Correction:  php.net is your bible!  :)  (....at least that is how I
referred to it when I recently taught my daughter how to program in PHP).

Uh oh, you've given me a dangerous idea.... how old's your daughter?
Mine's 9 and is already playing with making Power Point presentations
and Excel spreadsheets.
(Although, as I think of it, if I'm going to teach any language to the
uninitiated, I'm thinking Python....)
 
Mechphisto

From the manual:
Installation

The SimpleXML extension is enabled by default.

Actually, on my present Web site, phpinfo() doesn't have any entry for
SimpleXML.
However, on the new server we're moving to, the phpinfo() DOES have
SimpleXML listed! So, I'll switch to that probably when we move
servers.
Thanks. :)
 
Mechphisto

From the manual:
Installation

The SimpleXML extension is enabled by default.

Tried the SimpleXML method on the new server:
Holy moly! It cut 58, yes 58, lines of code down to 6!!
SimpleXML rocks!
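
For anyone curious what such a short version might look like, here is a sketch. It is not the exact six lines from the post above, which were never shown; it just assumes the same xml_test.xml contents, inlined so it runs on its own.

```php
<?php
// Sample document from the original post, inlined so the sketch is self-contained.
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
XML;

$text = simplexml_load_string($xml);
if ($text === false) {
    die("Could not parse the XML");
}

// Child elements are exposed as properties; cast to string to get the text.
$data1 = (string) $text->message1;
$data2 = (string) $text->message2;

echo $data1, "\n", $data2, "\n";
```

With the file on disk, simplexml_load_file("xml_test.xml") would replace the inline string.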
 
Jerry Stuckle

_Z_ said:
Hi,

I have quite different approach. Since PHP is only the engine to
produce HTML from XML data, then all processing of XML I do with XSLT.
PHP is only a kind of manager (gets the name of XML file, sends
content-type, selects proper xsl file). Here is the code:

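Actually, PHP doesn't "produce HTML from XML data". You can parse XML
data and generate output from it in PHP, but you can do that in many
other languages too.

<?php
header("Content-Type: text/html"); // assumed value; the original call had no argument
$xsl = "MyForm.xsl";
// Quote the array key and escape the user-supplied file name before it reaches the shell.
$command = "xsltproc " . escapeshellarg($xsl) . " " . escapeshellarg($_GET['XML_File']);
$last_line = system($command);
?>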

Of course you have to have xsltproc installed, but I assume you have
it. This way enables you to test your XML --> HTML transformation in
testing environment. For example:
bash# xsltproc MyForm.xsl TestData.xml

Even more basic than that: you have to have permission to execute
commands, which many shared hosts do not allow. And you must, of
course, be running on a Linux system.

That is the end of the story if you want only to display the data from
XML file.

Only if you have those tools installed, and can stand the default format
produced. That often isn't true. SimpleXML or the DOM functions give
you much more control.

If not, and you want HTML to be a FORM to post the data back to the
server then you need to parse it in the PHP, but this part is easy
isn't it?

Cheers

- Marcin

Yes, it can be quite easy.

--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
(e-mail address removed)
==================
 
_Z_

Hi,

I have quite a different approach. Since PHP is only the engine that
produces HTML from the XML data, I do all the XML processing with XSLT.
PHP is only a kind of manager (it gets the name of the XML file, sends the
content type, and selects the proper XSL file). Here is the code:

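<?php
header("Content-Type: text/html"); // assumed value; the original call had no argument
$xsl = "MyForm.xsl";
// Quote the array key and escape the user-supplied file name before it reaches the shell.
$command = "xsltproc " . escapeshellarg($xsl) . " " . escapeshellarg($_GET['XML_File']);
$last_line = system($command);
?>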

Of course you have to have xsltproc installed, but I assume you have
it. This approach lets you test your XML --> HTML transformation in
a testing environment. For example:
bash# xsltproc MyForm.xsl TestData.xml

That is the end of the story if you want only to display the data from
XML file.

If not, and you want HTML to be a FORM to post the data back to the
server then you need to parse it in the PHP, but this part is easy
isn't it?

Cheers

- Marcin
 
Pavel Lepin

Jerry Stuckle said:
_Z_ said:
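<?php
header("Content-Type: text/html"); // assumed value; the original call had no argument
$xsl = "MyForm.xsl";
// Quote the array key and escape the user-supplied file name before it reaches the shell.
$command = "xsltproc " . escapeshellarg($xsl) . " " . escapeshellarg($_GET['XML_File']);
$last_line = system($command);
?>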

Of course you have to have xsltproc installed, but I
assume you have it. This way enables you to test your XML
--> HTML transformation in testing environment. For
example: bash# xsltproc MyForm.xsl TestData.xml

Even more basic than that: you have to have permission to
execute commands, which many shared hosts do not allow.

The XSL extension (using the very same libxslt that xsltproc
is a command-line front-end for) solves that problem
neatly, of course, but that leads us back to some people
being locked into the set of extensions their hosting
provider deemed 'appropriate' for their needs. My advice
would be changing the provider.

And you must, of course, be running on a Linux system.

That's not true. At the very least, libxslt and xsltproc run
just fine under Windows using Cygwin, and, unless I'm much
mistaken, the open source *BSD family. I suspect there
wouldn't be any problems running it on any reasonably sane
proprietary UN*X as well. libxml2 and its evil twin libxslt
seem to be among the most widely spread libs in the world.

Only if you have those tools installed, and can stand the
default format produced.

Sorry, either that doesn't make sense or I'm missing your
point completely. There's no 'default format' if we're
talking about XSLT. It's a programming language designed
specifically for document transformation, so the 'default
format' is whatever you choose it to be.

That often isn't true. SimpleXML or the DOM
functions give you much more control.

True, SimpleXML and DOM give a lot of control and are often
the most reasonable choice for XML processing. But there's
absolutely no point in not having XSLT in your toolbox. If
you need transformations, it's often the cheapest, most
efficient and most maintainable approach.
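
To illustrate the XSL extension route without shelling out to xsltproc, here is a minimal sketch. It assumes the DOM and XSL extensions are loaded; the helper transformXml and the stylesheet are made up for this example, and the XML is the sample from the first post.

```php
<?php
// Hypothetical helper: apply an XSLT stylesheet (as a string) to an XML string.
function transformXml($xml, $xsl) {
    $xmlDoc = new DOMDocument();
    $xmlDoc->loadXML($xml);

    $xslDoc = new DOMDocument();
    $xslDoc->loadXML($xsl);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($xslDoc);
    return $proc->transformToXml($xmlDoc);
}

$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<text>
<message1>Heres XML text.</message1>
<message2>Heres some more XML text.</message2>
</text>
XML;

// Made-up stylesheet: bold message1, emphasise message2.
$xsl = <<<XSL
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="html"/>
  <xsl:template match="/text">
    <p><b><xsl:value-of select="message1"/></b></p>
    <p><em><xsl:value-of select="message2"/></em></p>
  </xsl:template>
</xsl:stylesheet>
XSL;

if (class_exists("XSLTProcessor")) {
    echo transformXml($xml, $xsl);
} else {
    echo "The XSL extension is not loaded on this PHP build.\n";
}
```

Since the stylesheet decides the markup, the output format is whatever the templates produce, and the same .xsl file can still be tested from the command line with xsltproc.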
 
