using regular expressions...

Discussion in 'Ruby' started by soldier.coder, Nov 11, 2008.

  1. I have the following code:

    require 'open-uri'
    def scrape_table(html)
      %r{</thead.*?>(.*?)</table>}m =~ html
      $1
    end

    def scrape_case(a_line)
      %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m =~ a_line
      $1
    end

    if $0 == __FILE__
      url = 'http://localhost:8080/tests/raw.html'
      page = open(url)                 # open the URL like a file
      text = page.read                 # read it into one string
      my_table = scrape_table(text)    # grab or "scrape" the table
      my_link = scrape_case(my_table)  # grab an HTML link that includes a 6-2 digit number (ex: 080910-15)
      puts(my_table)                   # prints out my_table, which contains the table information
      puts("\n")
      puts(my_link)
    end

    The code grabs the one table contained in my URL, then looks for an
    HTML link that includes a number that is 6 digits, followed by a dash,
    followed by 2 digits. I'm fairly certain the regex in scrape_case()
    would match more than one HTML link if more than one is in the table. Is
    there any way I can grab all those links into an array?
    soldier.coder, Nov 11, 2008
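
    For reference, a minimal sketch of why scrape_case hands back only one link: `=~` stops at the first match, so `$1` holds just that match. The `sample` string is made-up input for illustration, not taken from the original page.

```ruby
# Hypothetical sample input with two matching links (an assumption, not the real page).
sample = "<a href='123456-78'>123456-78</a> and <a href='111111-99'>111111-99</a>"

# Same pattern as in scrape_case above.
pattern = %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m

# =~ returns the index of the first match and sets $1 for that match only;
# the second link in the string is never visited.
first_index = (pattern =~ sample)
first_link = $1

puts first_index
puts first_link
```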

  2. Peter Szinek (Guest)

    On 2008.11.11., at 14:22, soldier.coder wrote:

    > I have the following code:
    >
    > require 'open-uri'
    > def scrape_table(html)
    > %r{</thead.*?>(.*?)</table>}m =~html
    > $1
    > end
    >
    > def scrape_case(a_line)
    > %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m =~ a_line
    > $1
    > end


    > Is there any way I can grab all those links into an array?


    Sure - String#scan is your friend:

    def scrape_case(a_line)
      a_line.scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
    end

    ex:

    >> "<a href='123456-78'>123456-78</a> here is another: <a href='111111-99'>111111-99</a>".scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
    => ["<a href='123456-78'>123456-78</a>", "<a href='111111-99'>111111-99</a>"]


    HTH,
    Peter
    Peter Szinek, Nov 11, 2008
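
    A possible follow-up, offered as a sketch rather than part of the reply above: if only the case numbers are wanted instead of the whole link, a capture group can be added to the pattern. When the pattern contains groups, String#scan returns one array per match, so `flatten` is used here to get a flat list. The `html` string and the `case_numbers` name are illustrative assumptions.

```ruby
# Hypothetical sample input (an assumption, not the real page).
html = "<a href='123456-78'>123456-78</a> here is another: <a href='111111-99'>111111-99</a>"

# The parentheses capture only the link text. With a group in the pattern,
# scan yields an array of one-element arrays, which flatten turns into
# a plain array of strings.
case_numbers = html.scan(/<a\s.*?\d{6}-\d{2}'>(\d{6}-\d{2})<\/a>/).flatten

puts case_numbers.inspect
```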