Escaping single quotes in XPath query with REXML

Discussion in 'Ruby' started by Francis Hwang, Oct 21, 2004.

  1. Has anybody tried to use XPath in REXML with a value containing a
    single quote, only to find that quote escaping in XPath is apparently
    not accounted for? If this were in the context of XSLT I'd be able to
    assign some annoying temp variable like $apos, but it's not, so I can't.

    irb(main):001:0> require 'rexml/document'
    => true
    irb(main):002:0> include REXML
    => Object
    irb(main):003:0> xml = "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
    => "<rss version='2.0'><channel><item><title>John's Doe</title></item></channel></rss>"
    irb(main):004:0> xmldoc = Document.new xml
    => <UNDEFINED> ... </>
    irb(main):005:0> XPath.first( xmldoc, "/rss/channel/item/title" ).to_s
    => "<title>John's Doe</title>"
    irb(main):006:0> XPath.first( xmldoc,
    "/rss/channel/item/title[text()='John's Doe']" ).to_s
    NoMethodError: undefined method `node_type' for "John":String
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in
    `Predicate'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in
    `Predicate'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in
    `times'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
    from (irb):6
    irb(main):007:0> XPath.first( xmldoc,
    "/rss/channel/item/title[text()='John\'s Doe']" ).to_s
    NoMethodError: undefined method `node_type' for "John":String
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:124:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in `each'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:123:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:402:in
    `Predicate'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:346:in
    `Predicate'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:204:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in
    `times'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:199:in
    `internal_parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:49:in `match'
    from /usr/local/lib/ruby/1.8/rexml/xpath_parser.rb:34:in `parse'
    from /usr/local/lib/ruby/1.8/rexml/xpath.rb:28:in `first'
    from (irb):7
    Francis Hwang, Oct 21, 2004
    #1

  2. > irb(main):006:0> XPath.first( xmldoc,
    > "/rss/channel/item/title[text()='John's Doe']" ).to_s


    I'm no expert in XPath, but that looks like a broken XPath query because of
    the three single quotes.

    > irb(main):007:0> XPath.first( xmldoc,
    > "/rss/channel/item/title[text()='John\'s Doe']" ).to_s


    That's identical, as you'll see if you try this:

    irb(main):001:0> a="text()='John\'s Doe'"
    => "text()='John's Doe'"

    You've not inserted a backslash into the string; you just escaped the
    quote, and Ruby removed the escaping. You need two backslashes to insert
    a single backslash into the string:

    irb(main):002:0> a="text()='John\\'s Doe'"
    => "text()='John\\'s Doe'"

    (Despite how it looks, there is only a single backslash in there; it's shown
    as two because it's inside a double-quoted string, to make it valid Ruby)

    irb(main):003:0> a.each_byte { |c| print c.chr," " }
    t e x t ( ) = ' J o h n \ ' s D o e ' => "text()='John\\'s Doe'"
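
The same point can be checked without reading irb's escaped output. This is only a small sketch; the variable names are made up for illustration:

```ruby
# Inside a double-quoted Ruby string, \' is just a single quote.
escaped_quote  = "text()='John\'s Doe'"
plain_quote    = "text()='John's Doe'"
# \\ is the escape for one literal backslash.
with_backslash = "text()='John\\'s Doe'"

puts escaped_quote == plain_quote                  # true: the \ was dropped
puts with_backslash.length - plain_quote.length    # 1: exactly one extra character
puts with_backslash.include?("\\")                 # true: that character is a backslash
```

So the query in irb line 007 reached REXML byte-for-byte identical to the one in line 006.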

    However, I've just had a quick scan through the XPath-1.0 spec, and I don't
    think that's how you do it. You can include single quotes inside a
    double-quoted string, and vice versa. But probably what you want for the
    general case is XML character entities: &#39; or &apos;

    Try passing your string through this before constructing your XPath query:

    require 'rexml/text'
    a = "John's Doe"
    b = REXML::Text::normalize(a)
    #=> "John&apos;s Doe"

    HTH,

    Brian.
    Brian Candler, Oct 21, 2004
    #2

  3. On Thu, Oct 21, 2004 at 09:28:51AM +0100, Brian Candler wrote:
    > Try passing your string through this before constructing your XPath query:
    >
    > require 'rexml/text'
    > a = "John's Doe"
    > b = REXML::Text::normalize(a)
    > #=> "John&apos;s Doe"


    Hmm, that doesn't work.

    irb(main):007:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&apos;s Doe']" ).to_s
    => ""
    irb(main):008:0> XPath.first( xmldoc, "/rss/channel/item/title[text()='John&#39;s Doe']" ).to_s
    => ""
    irb(main):009:0> XPath.first( xmldoc, "/rss/channel/item/title[text()=\"John's Doe\"]" ).to_s
    => "<title>John's Doe</title>"

    You might want to raise that with the REXML author. In the meantime, if you
    know the string contains only single quotes (and no double quotes), then
    you can surround it with double quotes in the XPath query, as per the
    third line above.
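
For strings that might contain either kind of quote, the usual XPath 1.0 trick is to pick whichever delimiter the string does not contain, and to fall back to concat() when it contains both. A rough sketch follows; xpath_literal is a hypothetical helper name, not part of REXML, and the concat() branch assumes the XPath engine implements that XPath 1.0 core function:

```ruby
# Hypothetical helper (not part of REXML): turn a Ruby string into an
# XPath 1.0 string literal. XPath 1.0 has no escape character inside a
# literal, so the delimiter must be one the string does not contain.
def xpath_literal(str)
  # No single quote inside: single-quote delimiters are safe.
  return "'#{str}'" unless str.include?("'")
  # Single quotes but no double quote: use double-quote delimiters.
  return "\"#{str}\"" unless str.include?('"')

  # Both kinds present: split on single quotes, quote each piece with
  # single quotes, and rejoin with a double-quoted "'" between pieces.
  parts = str.split("'", -1).map { |part| "'#{part}'" }
  "concat(#{parts.join(%q{, "'", })})"
end

puts xpath_literal("John's Doe")
puts xpath_literal("plain")
puts xpath_literal("a'b\"c")
```

The result would be interpolated into the query in place of a hand-written literal, e.g. "/rss/channel/item/title[text()=#{xpath_literal("John's Doe")}]", which produces the same query as irb line 009 above.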

    Regards,

    Brian.
    Brian Candler, Oct 21, 2004
    #3