html parser with regex, how to solve?

Luiz Vitor Martinez Cardoso · Jan 5, 2008

[Note: parts of this message were removed to make it a legal post.]

Hi,

I'm trying to develop a simple application in Ruby (once it works I
will move it to Rails). I need to fetch the source of a URL and find
this string in it:

<h3 class="zmp">$299.99</h3>

But I need to match not only this price, but any possible price. A
friend suggested this pattern:

<h3 class="zmp">*$\d+\.\d{2}.*</h3>

I think this works, but I need one more thing. Here is my code:

#!/usr/bin/ruby

require 'hpricot'
require 'open-uri'

@content = Hpricot(open("http://www.newegg.com/Product/Product.aspx?Item=N82E16855101066"))

Now, how can I search for <h3 class="zmp">*$\d+\.\d{2}.*</h3>?

@content.search("<h3 class="zmp">*$\d+\.\d{2}.*</h3>") is broken ;(

How can I solve this?

Thanks for your attention,
Luiz Vitor Martinez Cardoso.

s.ross · Jan 5, 2008

Don't use the regex. Let hpricot do what it's good at:

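$ irb
require 'rubygems'
require 'hpricot'
html = '<h3 class="zmp">149.00</h3>'
doc = Hpricot.parse(html)
ele = doc.search('h3.zmp')
puts ele.text
=> 149.00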

In your code, your @content will be searchable the same way. Hpricot
will give you a collection of all h3's with class 'zmp'.

http://code.whytheluckystiff.net/doc/hpricot/

Hope this helps.

Luiz Vitor Martinez Cardoso · Jan 5, 2008

Thanks very much! This really works.

Now I have a new (very simple) problem: the output is $1999,00. How can
I remove the $? I need to convert this to a floating-point number.

Regards,
Luiz Vitor Martinez Cardoso.

Joe · Jan 5, 2008

Try this:

ele.text.sub('$', '')

Joe
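
A minimal sketch of the whole conversion, for anyone who also needs the float: it strips the dollar sign and then parses the number. The helper name price_to_float is hypothetical (it is not from this thread), and the code assumes a comma in the text is a decimal separator, as in the $1999,00 output mentioned above, not a thousands separator.

```ruby
# Hypothetical helper (not from the thread): turn a price string such as
# "$299.99" or "$1999,00" into a Float.
def price_to_float(text)
  # String#delete removes every "$" character. String#sub with a string
  # pattern, as suggested above, removes only the first occurrence.
  cleaned = text.strip.delete('$')
  # Assumption: a comma is a decimal separator, not a thousands separator.
  cleaned = cleaned.tr(',', '.')
  # Kernel#Float raises ArgumentError on malformed input, unlike
  # String#to_f, which silently returns 0.0.
  Float(cleaned)
end

puts price_to_float('$299.99')
puts price_to_float('$1999,00')
```

With Hpricot, the argument would be the text of the matched element, for example price_to_float(ele.text).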

Luiz Vitor Martinez Cardoso · Jan 5, 2008

Thanks!

I got it working.

Regards,
Luiz Vitor Martinez Cardoso.