little problem with XML::DOM::Parser

mathias wündisch

dear group,

i have a little problem with the automatic conversion of numeric
character references into real characters by XML::DOM::Parser (or
XML::Parser). for example, i have the string '&#xA0;' in an xml source
file and i want it to appear unchanged in the target xml file after
parsing with XML::DOM::Parser.


begin source file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<doc>
<name>Mathias&#xA0;Wuendisch</name>
</doc>
end source file:

begin perl script:
#!c:\perl\bin\perl.exe -w
use strict;
use XML::DOM;

process_file( shift @ARGV );

sub process_file {
    my $infile = shift;

    # the class name is case-sensitive: XML::DOM::Parser, not XML::DOM::parser
    my $dom_parser = XML::DOM::Parser->new(
        NoExpand         => 1,
        ProtocolEncoding => 'ISO-8859-1',
        ParseParamEnt    => 0,
        ExpandParamEnt   => 0,
    );

    # same options passed again at parse time, as in my original attempt
    my $doc = $dom_parser->parsefile(
        $infile,
        NoExpand       => 1,
        ParseParamEnt  => 0,
        ExpandParamEnt => 0,
    );

    print $doc->toString;
    $doc->dispose;
}
exit;
end perl script:

after running: perl xml-dom-test.pl test.xml > test1.xml
i get this (the reference has been replaced by the real character):

begin target file:
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<doc>
<name>Mathias Wuendisch</name>
</doc>
end target file:

i've read the sourceforge faq and i've found a solution for "named
entities" like this:

---
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE doc [
<!ENTITY nbsp " " >
]>
<doc>
<name>Mathias&nbsp;Wuendisch</name>
</doc>
---

ok, then the "named entity" &nbsp; is also in the target file... but
what about numeric character references like &#xA0;? why did the
NoExpand flag or the ExpandParamEnt flag not work for me? any suggestions?
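
a possible workaround, sketched under assumptions: an xml parser resolves a
character reference such as &#xA0; before the application ever sees the text,
and NoExpand only concerns general entity references like &nbsp;, so the
reference can only be re-created on output. the helper name escape_non_ascii
below is hypothetical (it is not part of XML::DOM), and the sketch assumes the
serialized document is a decoded character string, not raw UTF-8 bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::DOM: rewrite every character
# outside the ASCII range as a hexadecimal character reference.
# Assumes $xml is a character string (already decoded), not UTF-8 bytes.
sub escape_non_ascii {
    my ($xml) = @_;
    $xml =~ s/([^\x00-\x7F])/sprintf('&#x%X;', ord($1))/ge;
    return $xml;
}

# Stand-in for the string returned by $doc->toString; "\x{A0}" is the
# no-break space the parser produced from the reference in the source file.
my $serialized = "<name>Mathias\x{A0}Wuendisch</name>";
print escape_non_ascii($serialized), "\n";
```

in the script above this would mean printing escape_non_ascii($doc->toString)
instead of $doc->toString. note that it escapes every non-ASCII character in
the output, not only those that were written as references in the source file.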

kind regards,
mathias wündisch