Xml, Accents and php domxml_open_file


Ghislain Benrais

Hi everybody,
I have xml documents with external entities for my accents that I
want to output properly with php function domxml_open_file. I can't get
my accents on a Linux/Apache server (I get "Ã©" instead of "é"). My
browser is IE6. Do you know why? A strange thing is that the very same
script on the same document works fine on a windows-apache server.

My XML document:
<?xml version="1.0" ?>
<!DOCTYPE survey [
<!ENTITY eacute "é"> ]>
<survey>
<dict l="fr">
<q id="1" mnemo="cible" type="1" nbmod="7">
<lib>here is an accent &eacute; </lib>
</q>
</dict>
</survey>
My PHP script:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;
charset=iso-8859-1">
<meta name="GENERATOR" content="Quanta Plus">
</head>
<body>
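<?php
// $fileEtude is expected to hold the path to the XML document.
if (!$dom = domxml_open_file($fileEtude)) {
    // "Unable to load the survey into the DOM"
    echo "Impossible de charger l'&eacute;tude dans le DOM\n";
    exit;
}
$root = $dom->document_element();
$domDict = $root->get_elements_by_tagname("dict");
// Take the first <q> element, then its first <lib> child.
$ArrQ = $root->get_elements_by_tagname("q");
$CurQ = $ArrQ[0];
$ArrLib = $CurQ->get_elements_by_tagname("lib");
$NodeLib = $ArrLib[0];
echo $NodeLib->get_content();
?>
</body>
</html>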

Thanks in advance,
Ghislain
 

Martin Honnen

Ghislain said:
Hi everybody,
I have xml documents with external entities for my accents that I
want to output properly with php function domxml_open_file. I can't get
my accents on a Linux/Apache server (I get "Ã©" instead of "é"). My
browser is IE6. Do you know why? A strange thing is that the very same
script on the same document works fine on a windows-apache server.

I think you are not encountering an XML problem but rather a PHP
shortcoming. The encodings meant to be used with XML, for instance
UTF-8, are much more powerful than PHP's string handling,
which only allows for 8-bit strings. So while PHP has some XML
capabilities, you easily run into problems with any characters outside ASCII.
However, PHP has functions to convert UTF-8 to ISO-8859-1 and back,
so French accents shouldn't cause a problem, as they are in ISO-8859-1.
Thus, if you encode your XML as UTF-8 as in

<?xml version="1.0" encoding="UTF-8"?>
<sentence xml:lang="fr">Je suis fatigué.</sentence>

then you should use PHP's utf8_decode function to convert to ISO-8859-1:

<html>
<head>
<title>Testing PHP's W3C DOM support</title>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
</head>
<body>
<p>
<?php
if (!$xmlDocument = domxml_open_file("test20031011.xml")) {
echo "Error parsing XML document.<br>";
exit;
}
else {
$rootElement = $xmlDocument->document_element();
$firstChild = $rootElement->first_child();
echo "firstChild.node_type(): " . $firstChild->node_type() . "<br>";
echo "firstChild.node_value(): " .
utf8_decode($firstChild->node_value()) . "<br>";
}
?>
</p>
</body>
</html>

That way the accents should display properly in the browser.
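
To see why the conversion matters, it can help to look at the raw bytes. The following is only a minimal sketch, assuming PHP 7 or later with the iconv extension available; the helper name to_latin1 is made up for illustration. It uses iconv rather than utf8_decode because utf8_decode is deprecated as of PHP 8.2, but for this case both do the same job.

```php
<?php
// Hypothetical helper (not part of the original thread):
// convert a UTF-8 string to ISO-8859-1 using the iconv extension.
function to_latin1(string $utf8): string
{
    $converted = iconv('UTF-8', 'ISO-8859-1', $utf8);
    if ($converted === false) {
        // iconv fails if a character has no ISO-8859-1 equivalent.
        throw new RuntimeException('String cannot be represented in ISO-8859-1');
    }
    return $converted;
}

// "é" is two bytes in UTF-8 but a single byte in ISO-8859-1.
echo bin2hex("\xC3\xA9"), "\n";            // c3a9
echo bin2hex(to_latin1("\xC3\xA9")), "\n"; // e9
```

If the two UTF-8 bytes C3 A9 are sent unconverted to a browser that has been told the page is ISO-8859-1, it shows them as two characters, "Ã©", which matches the symptom described in the question.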
 

Ghislain Benrais

Thank you Martin, it works perfectly thanks to your
explanations. By the way, I don't use utf8_decode but charset="UTF-8" in
the header, so that it works fine on both Windows and Linux servers.
<?xml version="1.0" encoding="UTF-8"?>
<conclusion xml:lang="fr">Je ne suis plus fatigué et à la
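
For readers on current PHP versions, the same "serve the page as UTF-8" idea can be sketched as follows. This is only an illustration under some assumptions: domxml_open_file belongs to the PHP 4 DOM XML extension and is not bundled with PHP 5 and later, so the sketch uses the DOMDocument class from the DOM extension instead; the helper name first_element_text and the inline sample document are made up for the example.

```php
<?php
// Hypothetical helper (not from the original thread): return the text
// content of the first element with the given tag name.
function first_element_text(string $xml, string $tag): string
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new RuntimeException('Error parsing XML document.');
    }
    $nodes = $doc->getElementsByTagName($tag);
    // The DOM extension hands strings back as UTF-8.
    return $nodes->length > 0 ? $nodes->item(0)->textContent : '';
}

// Sample document; "\xC3\xA9" is "é" encoded as UTF-8.
$xml = '<?xml version="1.0" encoding="UTF-8"?>'
     . '<sentence xml:lang="fr">Je suis fatigu' . "\xC3\xA9" . '.</sentence>';

// Declare UTF-8 to the browser instead of converting to ISO-8859-1.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}
echo htmlspecialchars(first_element_text($xml, 'sentence'), ENT_QUOTES, 'UTF-8');
```

Because the text is passed through unchanged and the page is declared as UTF-8, no utf8_decode step is needed, and the output does not depend on the server's default charset.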
 
