how to extract something in between a pattern

Discussion in 'Ruby' started by Cheyne Li, Sep 22, 2008.

  1. Cheyne Li

    Cheyne Li Guest

    Hi experts,

    I'm not very familiar with Ruby's library. I wonder if there is a method
    that can extract something from within a pattern? For example,

    I have a string: a = "aabbcc<p>ccddee</p>"

    I want to get anything between <p> and </p>, which is ccddee


    Thanks in advance.
    --
    Posted via http://www.ruby-forum.com/.
     
    Cheyne Li, Sep 22, 2008
    #1

  2. Cheyne Li

    Li Chen Guest

    C:\Users\Alex>irb
    irb(main):001:0> require 'hpricot'
    => true
    irb(main):002:0> a="aabbcc<p>ccddee</p>"
    => "aabbcc<p>ccddee</p>"
    irb(main):003:0>
    irb(main):004:0* doc=Hpricot(a)
    => #<Hpricot::Doc "aabbcc" {elem <p> "ccddee" </p>}>
    irb(main):005:0> p doc.at('p').inner_text
    "ccddee"
    => nil
    irb(main):006:0>



    Li
     
    Li Chen, Sep 22, 2008
    #2

  3. Ya, in this case (HTML/XML), Hpricot is your best bet.

    Otherwise, standard regex stuff would apply, imo.

     
    Tod Beardsley, Sep 22, 2008
    #3
  4. Cheyne Li

    Guest

    If you want to use a regexp, a quick and dirty way would be:

    (a.split %r{</?p>})[1]
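
To see what the split approach returns in a few edge cases, here is a minimal sketch. The sample strings other than the original one are made up for illustration, and the behaviour shown relies only on the standard `String#split`.

```ruby
# Split on either the opening or the closing tag. The text between
# the first <p> and </p> ends up at index 1.
a = 'aabbcc<p>ccddee</p>'
parts = a.split(%r{</?p>})   # ["aabbcc", "ccddee"]
inner = parts[1]             # "ccddee"

# Caveat: with no <p> in the string there is no index 1, so you get nil.
no_tag = 'aabbcc'.split(%r{</?p>})[1]   # nil

# Caveat: if the string starts with <p>, index 0 is an empty string,
# and the content is still at index 1.
leading = '<p>x</p>'.split(%r{</?p>})   # ["", "x"]

# With several paragraphs, the odd indexes hold the paragraph contents.
many = 'a<p>b</p>c<p>d</p>'.split(%r{</?p>})   # ["a", "b", "c", "d"]
```

This works for a one-off string, but it treats `<p>` and `</p>` as interchangeable separators, so it cannot tell a missing closing tag from a well-formed paragraph.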
     
    , Sep 25, 2008
    #4
  5. Cheyne Li

    Ittay Dror Guest

    > If you want to use regexp, a quick and dirty way would be :
    >
    > (a.split %r{</?p>})[1]

    or:
    irb(main):001:0> a = 'aabbcc<p>ccddee</p>'
    => "aabbcc<p>ccddee</p>"
    irb(main):002:0> a[%r{<p>(.*)</p>}, 1]
    => "ccddee"
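
The `a[regexp, 1]` form returns the first capture group, or `nil` when nothing matches. As a rough sketch, it can be wrapped in a small helper for arbitrary delimiters. The name `between` and its parameters are hypothetical, not part of Ruby's library.

```ruby
# Hypothetical helper: return the text between the first occurrence of
# open_tag and the next close_tag, or nil if the pair is not found.
def between(str, open_tag, close_tag)
  # Regexp.escape keeps characters such as "." or "*" in the delimiters
  # from being read as regexp syntax. The /m flag lets "." match newlines.
  pattern = /#{Regexp.escape(open_tag)}(.*?)#{Regexp.escape(close_tag)}/m
  str[pattern, 1]
end

found   = between('aabbcc<p>ccddee</p>', '<p>', '</p>')   # "ccddee"
missing = between('no tags here', '<p>', '</p>')          # nil
other   = between('key=[value]', '[', ']')                # "value"
```

Checking for `nil` before using the result avoids a `NoMethodError` when the delimiters are absent.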





    --
    Ittay Dror <>
    Tikal <http://www.tikalk.com>
    Tikal Project <http://tikal.sourceforge.net>


     
    Ittay Dror, Sep 25, 2008
    #5
  6. Cheyne Li

    Patrick He Guest

    Or:

    irb(main):004:0> a = 'aabbcc<p>ccddee</p>ccc<p>eee</p>'
    => "aabbcc<p>ccddee</p>ccc<p>eee</p>"
    irb(main):005:0> a.scan(%r{<p>([^<]*)</p>})
    => [["ccddee"], ["eee"]]

    I prefer to specify explicitly which character(s) not to match.
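
The difference shows up as soon as the string contains more than one paragraph. A short sketch comparing the greedy `.*` with the negated class `[^<]*`; the `nested` string is an invented example to show the trade-off.

```ruby
a = 'aabbcc<p>ccddee</p>ccc<p>eee</p>'

# Greedy .* runs to the LAST </p>, swallowing everything in between.
greedy = a[%r{<p>(.*)</p>}, 1]        # "ccddee</p>ccc<p>eee"

# [^<]* stops at the first "<", so each paragraph is matched on its own.
negated = a[%r{<p>([^<]*)</p>}, 1]    # "ccddee"
all = a.scan(%r{<p>([^<]*)</p>}).flatten   # ["ccddee", "eee"]

# Trade-off: [^<]* cannot cross a nested tag, while lazy .*? can.
nested = '<p>see <b>this</b></p>'
strict = nested[%r{<p>([^<]*)</p>}, 1]   # nil
lazy   = nested[%r{<p>(.*?)</p>}, 1]     # "see <b>this</b>"
```

For real HTML with nesting or attributes, a parser such as Hpricot, as suggested earlier in the thread, is the safer route.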


     
    Patrick He, Sep 28, 2008
    #6
