A few confusing Hpricot outputs. Anyone had similar experience?

Discussion in 'Ruby' started by Wang Jian, Apr 6, 2009.

  1. Wang Jian

    Wang Jian Guest

    ## I wanted to work on something like the following example string

    require 'hpricot'
    string = '<html><a></a><a href="/123/456" title="2009-04-06">posted on April 2009</a></html>'
    h = Hpricot(string)
    t = "2009-04-06"

    ## Here it goes: confusion No.1

    h.at('a[@title*="2009-04-06"]')
    ##=> returns the 2nd anchor element, as expected.
    h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
    ##=> *1st anchor element. Why is that??*
    h.at("a[@title*=#{t}]")
    ##=> 2nd anchor. works fine
    h.at('a[@title*="#{t}"]')
    ##=> *nil. Because of the single quote?*

    ## And here comes another confusion:

    year = "2009"
    h.at("a[@title*=#{t}][text()*='2009']")
    ##=> 2nd anchor, as expected.
    h.at("a[@title*=#{t}][text()*=#{year}]")
    ##=> *nil. Why is that? Hpricot can't handle #{} more than once?*

    ## Hope you can fill me in on this one. Thanks!!

    ##Jay
    Wang Jian, Apr 6, 2009
    #1

  2. On Mon, Apr 6, 2009 at 4:11 AM, Wang Jian <> wrote:
    > ## I wanted to work on something like the following example string
    >
    > require 'hpricot'
    > string = '<html><a></a><a href="/123/456" title="2009-04-06">posted on April 2009</a></html>'
    > h = Hpricot(string)
    > t = "2009-04-06"
    >
    > ## Here it goes: confusion No.1
    >
    > h.at('a[@title*="2009-04-06"]')
    > ##=> returns the 2nd anchor element, as expected.
    > h.at('a[@title*=Time.now.strftime("%Y-%m-%d")]')
    > ##=> *1st anchor element. Why is that??*


    I'm not sure why it is returning {emptyelem <a>}, but I can tell you
    why it's not returning the element you expect: you didn't use string
    interpolation, so the call to Time.now.strftime(...) is never
    evaluated and inserted into the string; Hpricot receives it as
    literal text. This selects the expected element (provided today's
    date matches the title):

    h.at("a[@title*=#{Time.now.strftime('%Y-%m-%d')}]")

    > h.at("a[@title*=#{t}]")
    > ##=> 2nd anchor. works fine
    > h.at('a[@title*="#{t}"]')
    > ##=> *nil. Because of the single quote?*


    Exactly, that's just Ruby's single- versus double-quoted string
    behavior. With the same setup as you used:

    irb(main):037:0> "#{t}"
    => "2009-04-06"
    irb(main):038:0> '#{t}'
    => "\#{t}"

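    The same behavior can be checked without Hpricot at all. The sketch
    below is only an illustration; the two method names are made up for
    this example. It prints the selector strings that would be handed to
    h.at() in each case.

```ruby
# Double-quoted strings evaluate #{...}; single-quoted strings keep it as text.
# Both method names are hypothetical, chosen only for this illustration.
def double_quoted_selector(t)
  "a[@title*=#{t}]"
end

def single_quoted_selector(t)
  # t is deliberately unused: single quotes never interpolate.
  'a[@title*="#{t}"]'
end

t = "2009-04-06"
puts double_quoted_selector(t)   # a[@title*=2009-04-06]
puts single_quoted_selector(t)   # a[@title*="#{t}"]

# The same applies to method calls: this text is passed through verbatim
# and Time.now is never called.
puts 'a[@title*=Time.now.strftime("%Y-%m-%d")]'
```

    In the single-quoted cases the selector contains the characters
    #{t} or Time.now.strftime(...) themselves, not their values.
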
    >
    > ## And here comes another confusion:
    >
    > year = "2009"
    > h.at("a[@title*=#{t}][text()*='2009']")
    > ##=> 2nd anchor, as expected.
    > h.at("a[@title*=#{t}][text()*=#{year}]")
    > ##=> *nil. Why is that? Hpricot can't handle #{} more than once?*


    Do you mean for these to pass different strings to h.at()? Look at the
    strings you are using.

    irb(main):048:0> puts [ "a[@title*=#{t}][text()*='2009']",
    irb(main):049:1* "a[@title*=#{t}][text()*=#{year}]" ]
    a[@title*=2009-04-06][text()*='2009']
    a[@title*=2009-04-06][text()*=2009]

    The only difference between the two is the missing quotes around
    2009 in the second selector. So, you are just getting unreliable
    results when you aren't using quotes around the values you are
    searching for. This version works, where the second one above did
    not:

    h.at("a[@title*='#{t}'][text()*='#{year}']")

    Note that I've put quotes on both values, though at least in this
    example the title appears to work without them.
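
    One way to make the quoting hard to forget is to build each
    condition through a small helper. This is a rough sketch only:
    contains is a hypothetical helper written for this post, not
    something Hpricot provides, and it assumes the value itself
    contains no single quote.

```ruby
# Hypothetical helper (not part of Hpricot): builds a "contains" condition
# with the value always wrapped in single quotes.
# Assumption: value does not itself contain a single quote.
def contains(lhs, value)
  "[#{lhs}*='#{value}']"
end

t    = "2009-04-06"
year = "2009"

selector = "a" + contains("@title", t) + contains("text()", year)
puts selector   # a[@title*='2009-04-06'][text()*='2009']
```

    The resulting string is the same one passed to h.at() in the
    working version above.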
    Christopher Dicely, Apr 6, 2009
    #2

  3. Wang Jian

    Wang Jian Guest

    Great notes. Thanks a lot!

    So the take-home message is: always use " as the outermost quotes, and
    write (literally) '#{expression}' inside them to ensure consistency.

    It's kind of counter-intuitive at first glance, as normally #{} won't
    work when placed between single quotes. But it works in this case. :)
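
    The reason it works is that only the outermost delimiter of a Ruby
    string literal decides whether #{} is evaluated; quote characters
    inside the literal are ordinary text. A small sketch (the method
    names are invented for this example):

```ruby
# Only the outermost quotes of the literal control interpolation.
# Method names are hypothetical, for illustration only.
def outer_double(year)
  "[text()*='#{year}']"   # inner single quotes are plain characters here
end

def outer_single(year)
  '[text()*="#{year}"]'   # outer single quotes: #{year} stays literal
end

puts outer_double("2009")   # [text()*='2009']
puts outer_single("2009")   # [text()*="#{year}"]
```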

    Wang Jian, Apr 7, 2009
    #3
