How do I revise this program to parse a different web page?

Pen Ttt

The following program runs successfully, and I get the table:
require 'hpricot'
require 'open-uri'
doc = Hpricot(open('http://webbbs.gamer.com.tw/board.php?brd=Chat&p=1'))
tab = (doc/'table[@class="ssize"]')
(tab/'tr').each_with_index do |line, index|
  next if (index + 1) % 2 == 0
  (line/'td').each do |data|
    print data.inner_text.gsub("\n", '') + ' '
  end
  puts ''
end

When I switch to another web page (which also contains a table),
I revise the program like this:
require 'hpricot'
require 'open-uri'
doc = Hpricot(open('http://quotes.money.163.com/corp/1034/code=601398.html'))
tab = (doc/'table[@class="ssize"]')
(tab/'tr').each_with_index do |line, index|
  next if (index + 1) % 2 == 0
  (line/'td').each do |data|
    print data.inner_text.gsub("\n", '') + ' '
  end
  puts ''
end

Now I get nothing. How should I revise it to get the table?
If you open the page in your browser you can see the table. What I want
to do is extract the data from that table for analysis.
http://quotes.money.163.com/corp/1034/code=601398.html
 
Jesús Gabriel y Galán

You have to look at the structure of the page and come up with an
XPath that gets you to the table you want, then use it instead
of 'table[@class="ssize"]', because there's no table with that class
in that page.
I think there might be some Firefox add-ons that can produce an
XPath for an element, but I can't remember which. Firebug gives you one
XPath, but it's the "canonical" one; maybe you could simplify it if,
for example, you know it's the only table with a class.
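
To illustrate the point about matching the selector to the page, here is a minimal sketch. It makes several assumptions: it uses REXML from Ruby's standard distribution instead of Hpricot so that it runs without extra gems, it parses an inline, well-formed sample rather than the real page, and the class name "quotes" and the cell values are made up for illustration. The actual class or id on the 163.com page has to be read from its source.

```ruby
require 'rexml/document'

# Hypothetical, simplified stand-in for a page; the real markup will differ.
sample = <<~HTML
  <html><body>
    <table class="nav"><tr><td>Home</td></tr></table>
    <table class="quotes">
      <tr><td>name-1</td><td>value-1</td></tr>
      <tr><td>name-2</td><td>value-2</td></tr>
    </table>
  </body></html>
HTML

doc = REXML::Document.new(sample)

# Same idea as the original script, but the class in the XPath
# is one that actually exists in this document.
rows = REXML::XPath.match(doc, '//table[@class="quotes"]/tr').map do |tr|
  REXML::XPath.match(tr, 'td').map { |td| td.text.to_s.strip }
end

rows.each { |cells| puts cells.join(' ') }

# A selector for a class that is absent simply matches nothing,
# which is why the revised script printed no output.
missing = REXML::XPath.match(doc, '//table[@class="ssize"]')
puts "tables with class ssize: #{missing.size}"
```

REXML only accepts well-formed markup, so for a real-world page a tolerant HTML parser such as Hpricot is still the practical choice; the sketch is only meant to show how the attribute in the selector decides whether anything is matched.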

Also be careful: in this page the internal structure of the table may
be different. In fact, if you check with Firebug you will see
that the first column is a table and the second column is a div with
more internal tables, one for each column. So you have to work a little
bit more to extract the data from each cell.
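
The nested layout described above could be handled roughly as follows. This is only a sketch under assumptions: the markup is an invented, well-formed miniature of "an outer cell containing a table, and another containing a div with a table", the cell values are placeholders, and REXML is used instead of Hpricot so the example runs with a stock Ruby install.

```ruby
require 'rexml/document'

# Hypothetical miniature of the nested layout; not the real page.
sample = <<~HTML
  <table class="outer">
    <tr>
      <td><table><tr><td>label-1</td></tr><tr><td>label-2</td></tr></table></td>
      <td><div><table><tr><td>value-1</td></tr><tr><td>value-2</td></tr></table></div></td>
    </tr>
  </table>
HTML

doc = REXML::Document.new(sample)

# Only the direct cells of the outer row; each one wraps an inner table.
outer_cells = REXML::XPath.match(doc, '/table/tr/td')

# For each outer cell, collect the text of every nested cell beneath it.
columns = outer_cells.map do |cell|
  REXML::XPath.match(cell, './/td').map { |td| td.text.to_s.strip }
end

# Each outer cell holds one column, so transpose to get rows.
table = columns.transpose
table.each { |row| puts row.join(' ') }
```

The transpose step assumes every inner table has the same number of rows; if the real page does not guarantee that, the columns would need to be padded or walked individually.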

Jesus.
 
