[BUG] REXML 2.7.1 External Entity Parsing

Paul Duncan

Hi Everyone,

There appears to be a bug in REXML 2.7.1 external entity parsing. The
following code throws an error in Ruby 1.8.0/REXML 2.7.1, but not in
Ruby 1.6.8/REXML 2.3.5:

----
#!/usr/bin/env ruby

require 'rexml/document'

XP = '//channel/title'

# dump versions
puts 'Ruby %s, REXML %s' % [RUBY_VERSION, REXML::Version]

# check both examples
%w{working.rss broken.rss}.each do |path|
  File.open(path) do |file|
    doc = REXML::Document.new file.readlines.join('')

    puts 'File: ' << path

    # check to make sure everything is kosher
    puts 'doc.root.class = ' << doc.root.class.to_s
    puts 'doc.root.elements.class = ' << doc.root.elements.class.to_s

    # get the title of the feed
    puts (e = doc.root.elements[XP]) ? e.class.to_s : "Couldn't find #{XP}."
  end
end
----

2.3.5 Output
============
Ruby 1.6.8, REXML 2.3.5
File: working.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
<title>Paul Duncan</title>
File: broken.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
<title>O'Reilly Network Articles</title>

2.7.1 Output
============
Ruby 1.8.0, REXML 2.7.1
File: working.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
REXML::Element
File: broken.rss
doc.root.class = REXML::Element
doc.root.elements.class = REXML::Elements
/usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:83:in `internal_parse': undefined method `node_type' for #<REXML::Entity:0x4027d9d0> (NoMethodError)
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:81:in `delete_if'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:81:in `internal_parse'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:60:in `match'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:315:in `d_o_s'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `each_index'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `d_o_s'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:317:in `d_o_s'
        from /usr/local/lib/site_ruby/1.8/rexml/xpath_parser.rb:313:in `each_index'
         ... 8 levels...
        from ./rexml_test.rb:12:in `open'
        from ./rexml_test.rb:12
        from ./rexml_test.rb:11:in `each'
        from ./rexml_test.rb:11

The files in question and additional information are available at
http://www.raggle.org/files/rexml-external_entity_bug/ . We're
stripping external entity declarations before parsing feeds in Raggle as
an interim solution.
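
For anyone who needs a similar stopgap, here is a minimal sketch of that kind of stripping step. It is not Raggle's actual code: the helper name strip_external_entities, the regular expression, and the sample feed are all made up for illustration. It assumes the declarations look like <!ENTITY name SYSTEM "..."> or <!ENTITY name PUBLIC "..." "...">, and it will not cope with a literal '>' inside the quoted identifiers.

```ruby
# Hypothetical helper, for illustration only (not taken from Raggle).
# Removes external entity declarations (SYSTEM or PUBLIC) from an XML
# string and leaves internal entity declarations untouched.
def strip_external_entities(xml)
  # Assumption: no '>' appears inside the quoted system/public identifiers.
  external_entity = /<!ENTITY\s+(?:%\s+)?[^\s>]+\s+(?:SYSTEM|PUBLIC)\s[^>]*>\s*/m
  xml.gsub(external_entity, '')
end

# Made-up sample feed with one external and one internal entity.
sample = <<XML
<?xml version="1.0"?>
<!DOCTYPE rss [
  <!ENTITY ext SYSTEM "http://example.com/ext.ent">
  <!ENTITY copy "(c)">
]>
<rss><channel><title>Example</title></channel></rss>
XML

cleaned = strip_external_entities(sample)
puts cleaned
```

The cleaned string can then be handed to REXML::Document.new in place of the raw file contents, as in the test script above.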


PS. I attempted to use the REXML bug report page on the Germane
Software site, but it gave me the following error:

The system encountered a fatal error
failed to chroot(/home/jitterbug/rexml)
The last error code was: Operation not permitted
uid/gid=81/81

--
Paul Duncan <[email protected]> OpenPGP Key ID: 0x82C29562
http://www.pablotron.org/ http://www.paulduncan.org/
