rexml: generating tree from source

Patrick Gundlach

Hello out there,

I'm trying to parse some XML text with REXML:

--------------------------------------------------
#!/usr/bin/env ruby

require 'rexml/document'
include REXML


str=<<EOS
<a><b>text with <illegal characters> </b></a>
EOS

# d=Document.new(str) # barks because < and >
s=Source.new(str)
# I thought that this would give me some output
puts Element.new(s) #-> </>
--------------------------------------------------

What I'd like to have is a representation of

element: a
element: b
text: "text with &lt;illegal characters&gt; "


Any possibility?

Thanks,

Patrick
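
For reference, here is a minimal sketch of one way to get the representation asked for above, assuming the tree can be built through the REXML API instead of being parsed from a string. When text is added with `add_text`, REXML escapes `<` and `>` on output. The `describe` helper is a hypothetical name used only for illustration; it is not part of REXML.

```ruby
require 'rexml/document'

# Build the tree through the API instead of parsing a string,
# so the text never has to be well-formed XML on the way in.
a = REXML::Element.new('a')
b = a.add_element('b')
b.add_text('text with <illegal characters> ')

# Hypothetical helper: collect one line per node, depth first.
def describe(node, out = [])
  case node
  when REXML::Element
    out << "element: #{node.name}"
    node.each { |child| describe(child, out) }
  when REXML::Text
    # Text#to_s returns the escaped (serialised) form of the text.
    out << "text: #{node.to_s.inspect}"
  end
  out
end

puts a.to_s        # serialised XML with the brackets escaped
puts describe(a)   # one line per element / text node
```

Note that `b.text` still returns the unescaped string; only the serialised form carries `&lt;` and `&gt;`.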
 
daz

This may be applicable:

<Quoting>
http://www.germane-software.com/software/rexml/docs/tutorial.html

[Creating XML documents]
[...]

"Please be aware that all text nodes in REXML are UTF-8 encoded,
and all of your code must reflect this. You may input and output
other encodings (UTF-8, UTF-16, ISO-8859-1, and UNILE are all
supported, input and output), but within your program, you must
pass REXML UTF-8 strings."

"I can't emphasize this enough, because people do have problems
with this. REXML can't possibly always guess correctly how your
text is encoded, so it always assumes the text is UTF-8."
</>


daz
 
Patrick Gundlach

Hello daz,
daz said:
This may be applicable:

[rexml, utf-8]

I don't think so, since my input _is_ UTF-8 compliant (and ASCII,
ISO Latin-1, etc.).

The only problematic chars in my code are '<' and '>'.

I use the code exactly as shown.

Patrick
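
If '<' and '>' really are the only problem, one possible workaround is to escape the stray brackets before handing the string to `Document.new`. The sketch below is a heuristic, not a general fix: it assumes the complete set of real tags is known in advance (here just `a` and `b`) and that none of them carry attributes. `KNOWN_TAGS`, `KNOWN_TAG_RE`, and `escape_stray_brackets` are hypothetical names introduced for illustration.

```ruby
require 'rexml/document'

# Assumption: these are the only genuine element names in the input.
KNOWN_TAGS = %w[a b].freeze
KNOWN_TAG_RE = /\A<\/?(?:#{KNOWN_TAGS.join('|')})>\z/

# Hypothetical helper: escape every angle bracket that is not part
# of an opening or closing tag from KNOWN_TAGS.
def escape_stray_brackets(xml)
  xml.gsub(/<[^<>]*>|[<>]/) do |chunk|
    if chunk.match?(KNOWN_TAG_RE)
      chunk
    else
      chunk.gsub('<', '&lt;').gsub('>', '&gt;')
    end
  end
end

str = "<a><b>text with <illegal characters> </b></a>\n"
doc = REXML::Document.new(escape_stray_brackets(str))

b = doc.root.elements['b']
puts b.text   # the unescaped text value
puts b.to_s   # the serialised element, with entities
```

This breaks down as soon as the "illegal" text happens to look like one of the known tags, so it is only reasonable when the input format is tightly controlled.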

 
Patrick Gundlach

[...]
What I'd like to have is a representation of

element: a
element: b
text: "text with &lt;illegal characters&gt; "


I withdraw my request, because handling invalid XML files is not
what one should ask for....

Sorry for the noise,

Patrick
 
