How to parse a web page with Nokogiri?


Pen Ttt

How can I parse a web page with Nokogiri?
Here is my HTML:
<table>
<tr>
<td><strong>date</strong></td>
<td>2009-12-31</td>
<td>2009-09-30</td>
</tr>

<tr>
<td><strong>asset</strong></td>
<td>580</td>
<td> 680</td>
</tr>

<tr>
<td><strong>cash</strong></td>
<td>1,693</td>
<td>1,777</td>
</tr>

<tr>
<td><strong>finance</strong></td>
<td>201</td>
<td>2497</td>
</tr>


<tr>
<td><strong>note</strong></td>
<td>500</td>
<td>700</td>
</tr>
</table>
Can I get an array, such as a, with Nokogiri?
a[0]=date,asset,cash,finance,note
a[1]=2009-12-31,580,1,693,201,500
a[2]=2009-09-30,680,1,777,2497,700
 

AMILIN Aurélien

On 10/04/2010 16:07, Pen Ttt wrote:
<table>
<tr>
<td><strong>date</strong></td>
<td>2009-12-31</td>
<td>2009-09-30</td>
</tr>

<tr>
<td><strong>asset</strong></td>
<td>580</td>
<td> 680</td>
</tr>

<tr>
<td><strong>cash</strong></td>
<td>1,693</td>
<td>1,777</td>
</tr>

<tr>
<td><strong>finance</strong></td>
<td>201</td>
<td>2497</td>
</tr>


<tr>
<td><strong>note</strong></td>
<td>500</td>
<td>700</td>
</tr>
</table>
Yes, it's possible. You could try this:

table = "<table>
<tr>
<td><strong>date</strong></td>
<td>2009-12-31</td>
<td>2009-09-30</td>
</tr>
<tr>
<td><strong>asset</strong></td>
<td>580</td>
<td> 680</td>
</tr>
<tr>
<td><strong>cash</strong></td>
<td>1,693</td>
<td>1,777</td>
</tr>
<tr>
<td><strong>finance</strong></td>
<td>201</td>
<td>2497</td>
</tr>
<tr>
<td><strong>note</strong></td>
<td>500</td>
<td>700</td>
</tr>
</table>"

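require 'nokogiri'

parser = Nokogiri::HTML(table)
result = []
# Cells come back in document order and every row has 3 cells,
# so i % 3 is the column index of each cell.
parser.css("table tr td").each_with_index do |td, i|
  result[i % 3] ||= ""
  result[i % 3] << "#{td.content};"
end

puts result

Output:
date;asset;cash;finance;note;
2009-12-31;580;1,693;201;500;
2009-09-30; 680;1,777;2497;700;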
 

Pen Ttt

Thanks, AMILIN Aurélien, I get it.
I am a beginner, so I have some questions:
1\
What is the meaning of css("table tr td")?

2\
What is the meaning of
result[i%3] ||= "" # does this put nil into the result array?
What are i%3 and || ?
result[i%3] << "#{td.content};" # td.content is the td value, I know
What is #{ } ?
 

Aurélien AMILIN

1\
css("table tr td") returns a collection of all the td elements of the <table>
I passed to the Nokogiri::HTML() method (a NodeSet, which behaves like an
array). That's why I can call the each method on the result.

2\
You've got a table with 3 columns (1 for the title and 2 for the values), so I
use the modulo operator (%, the remainder of a division: 0%3=0; 1%3=1; 2%3=2;
3%3=0; 4%3=1 ...) to place all the data at the right index.
result[i%3] ||= "" means: if result[i%3] is nil, then put an empty string into
it.
#{} inserts a value into a string. Here I use it to put td.content into a
string that ends with a ;
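To see the three pieces (%, ||= and #{}) at work without Nokogiri, here is a small sketch in plain Ruby. The cells array is a made-up stand-in for the text of the first six td elements, in document order.

```ruby
# Hypothetical stand-in for the td texts of the first two rows, in document order.
cells = ["date", "2009-12-31", "2009-09-30", "asset", "580", " 680"]

result = []
cells.each_with_index do |cell, i|
  column = i % 3               # 0, 1, 2, 0, 1, 2: the column this cell belongs to
  result[column] ||= ""        # assigns "" only when result[column] is still nil
  result[column] << "#{cell};" # interpolation builds e.g. "date;" and << appends it
end

p result
# ["date;asset;", "2009-12-31;580;", "2009-09-30; 680;"]
```

The first time a column index is seen, result[column] is nil, so ||= creates the empty string; on later passes it already holds text and is left alone.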
 
