REXML and entities

  • Thread starter Pawel Szymczykowski

Pawel Szymczykowski

Hi,

I've been trying to use REXML to process an XML file with entities,
but I can't seem to get it to leave my entities alone even with the
:raw context set. The simplified example looks something like this:

---
# song.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Song [
<!ELEMENT Song (lyric*)>
<!ELEMENT lyric (#PCDATA)>
<!ENTITY convoy "we got a great big convoy">
<!ENTITY rubberduck "ain't she a beautiful sight">
]>
<Song>
<lyric>&convoy;</lyric>
<lyric>&rubberduck;</lyric>
</Song>
---
# script.rb:
require 'rexml/document'

doc = REXML::Document.new File.open('song.xml', 'r'), { :raw => :all }

doc.elements.each('Song/lyric') do |lyric|
  puts lyric.raw  # This prints 'true'
  puts lyric.text # This always has its entities decoded!
end
---
# output:
true
we got a great big convoy
true
ain't she a beautiful sight
---
# desired output:
true
&convoy;
true
&rubberduck;
---

When I take out the { :raw => :all } part, the lyric.raw line returns
nil, but the rest of the output doesn't change. Am I misunderstanding
how this is supposed to work, or is it broken? How can I get the text
back with its entity references left unexpanded? This is driving me crazy.

Thanks!

-Pawel
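
A possible direction, offered as a sketch rather than a confirmed answer: as far as I can tell, Element#text hands back the expanded value of the first text node, while the Text node itself (via Element#get_text) should still hold the string as written in the source. The snippet below inlines the same document as song.xml so it runs on its own; the SONG_XML constant and the lyric_strings helper are names made up for illustration, not part of REXML.

```ruby
require 'rexml/document'

# Same content as song.xml, inlined so the sketch is self-contained.
SONG_XML = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <!DOCTYPE Song [
  <!ELEMENT Song (lyric*)>
  <!ELEMENT lyric (#PCDATA)>
  <!ENTITY convoy "we got a great big convoy">
  <!ENTITY rubberduck "ain't she a beautiful sight">
  ]>
  <Song>
  <lyric>&convoy;</lyric>
  <lyric>&rubberduck;</lyric>
  </Song>
XML

# Hypothetical helper: for each lyric, collect the Text node's own string
# (to_s) next to its expanded value (value, which is what Element#text uses).
def lyric_strings(doc)
  doc.elements.to_a('Song/lyric').map do |lyric|
    node = lyric.get_text
    { source: node.to_s, expanded: node.value }
  end
end

doc = REXML::Document.new(SONG_XML)

lyric_strings(doc).each do |pair|
  puts pair[:source]   # the entity reference, assuming to_s keeps the source text
  puts pair[:expanded] # the replacement text from the DTD
end
```

If that assumption holds, swapping lyric.text for lyric.get_text.to_s in script.rb would give the desired output without needing the :raw context at all.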
 
