using regular expressions...

soldier.coder · Nov 11, 2008

I have the following code:

require 'open-uri'

# Return everything between the end of <thead> and the closing </table> tag.
def scrape_table(html)
  %r{</thead.*?>(.*?)</table>}m =~ html
  $1
end

# Return the first link whose text is a 6-2 digit number (ex: 080910-15).
def scrape_case(a_line)
  %r{(<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>)}m =~ a_line
  $1
end

if $0 == __FILE__
  url = 'http://localhost:8080/tests/raw.html'
  page = URI.open(url)            # open the url like a file (plain open(url) on Ruby older than 2.5)
  text = page.read                # read it into one string
  my_table = scrape_table(text)   # grab or "scrape" the table
  my_link = scrape_case(my_table) # grab an html link that includes a 6-2 digit number (ex: 080910-15)
  puts(my_table)                  # prints out my_table -- which contains the table information
  puts("\n")
  puts(my_link)
end

The code grabs the one table contained in my URL, then looks for an
HTML link that includes a number that is 6 digits, followed by a dash,
followed by 2 digits. I'm fairly certain the regex in scrape_case()
can match more than one HTML link, if more than one is in the table. Is
there any way I can grab all those links into an array?

Peter Szinek · Nov 11, 2008

Is there any way I can grab all those links into an array?

Sure - String#scan is your friend:

def scrape_case(a_line)
  a_line.scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
end

ex:
"<a href='123456-78'>123456-78</a> <a href='111111-99'>111111-99</a>".scan(/<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/)
=> ["<a href='123456-78'>123456-78</a>", "<a href='111111-99'>111111-99</a>"]

HTH,
Peter
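
A side note on the suggestion above, as a minimal sketch rather than a definitive version: =~ only ever reports the first match, while String#scan collects all of them. One thing to watch for is that scan changes shape when the pattern contains capture groups, which matters because the original scrape_case wraps its whole pattern in parentheses. The sample HTML string and the variable names below are made up for illustration and are not from the original page.

```ruby
# Sample input invented for illustration; not taken from the original page.
html = "<td><a href='123456-78'>123456-78</a></td>" \
       "<td><a href='111111-99'>111111-99</a></td>"

link_re = /<a\s.*?\d{6}-\d{2}'>\d{6}-\d{2}<\/a>/

# =~ stops at the first match, so $~ (and $1) only ever describe one link.
link_re =~ html
first_link = $~[0]

# scan without capture groups returns every full match as a flat array.
all_links = html.scan(link_re)

# scan with a capture group returns one sub-array per match, holding the groups.
grouped = html.scan(/<a\s.*?'>(\d{6}-\d{2})<\/a>/)
case_numbers = grouped.flatten

p first_link
p all_links
p case_numbers
```

So if only the case numbers are wanted, a single capture group plus flatten gives a plain array of strings; if the full links are wanted, dropping the outer parentheses (as in the answer above) avoids the nested arrays.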
