BeautifulSoup

luca72

Hello,
I am trying to use BeautifulSoup. I have this:

import urllib
from BeautifulSoup import BeautifulSoup

sito = urllib.urlopen('http://www.prova.com/')
esamino = BeautifulSoup(sito)
luca = esamino.findAll('tr', align='center')

print luca[0]

I need to get the following information:
1) Only|G|BoT|05
2) #1
3) 44.4MB
4) Pc-prova.rar
With print luca[0].a.string I get #1.
With print luca[0].td.string I get 44.4MB.
Can you explain how to get the other two values?
Thanks,
Luca
 
Peter Pearson

luca72 said:
print luca[0]
[The following long string has been wrapped.]
href="#">#1</a></th><td width="10%">44.4MB</td>

Like you, I struggle with BeautifulSoup; but perhaps this will help
while waiting for somebody smarter to join the thread:
>>> soup = BeautifulSoup(
... """<tr align="center"><th width="5%">"""
... """<a onclick="t('Only|G|BoT|05','#1');" href="#">#1</a>"""
... # [the rest of the row's markup is truncated in the source]
... )
>>> tr = soup.findAll('tr')
>>> tr[0].findAll(text=True)
[u'#1', u'44.4MB', u' Pc-prova.rar ']
>>> c = tr[0].findChild(attrs={"onclick": True})
>>> print c["onclick"]
t('Only|G|BoT|05','#1');
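
For readers who want to try the same two lookups without BeautifulSoup installed, here is a minimal sketch using Python 3's standard-library `html.parser` instead. It is a swapped-in technique, not the BeautifulSoup API. The sample row is partly an assumption: the final `<td><font>` cell is invented to match the text output shown above, and `RowParser` is a hypothetical helper name.

```python
from html.parser import HTMLParser

# Sample row. The first two cells follow the markup quoted in the thread;
# the last cell (<td><font>...) is an ASSUMPTION made for illustration.
sample = (
    '<tr align="center"><th width="5%">'
    '<a onclick="t(\'Only|G|BoT|05\',\'#1\');" href="#">#1</a></th>'
    '<td width="10%">44.4MB</td>'
    '<td><font> Pc-prova.rar </font></td></tr>'
)


class RowParser(HTMLParser):
    """Collects non-blank text nodes and every onclick attribute value."""

    def __init__(self):
        super().__init__()
        self.texts = []     # roughly what findAll(text=True) returned
        self.onclicks = []  # roughly what findChild(attrs={"onclick": True}) exposes

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        for name, value in attrs:
            if name == "onclick":
                self.onclicks.append(value)

    def handle_data(self, data):
        # Skip whitespace-only text between tags
        if data.strip():
            self.texts.append(data)


parser = RowParser()
parser.feed(sample)
parser.close()

print(parser.texts)
print(parser.onclicks)
```

The text nodes cover items 2 to 4 of the question, while item 1 still has to be cut out of the `onclick` string.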
 
Kay Schluehr

In the same way you got `luca`. Since `luca` is a list of rows, index it first:

1, 2) luca[0].find("a")["onclick"].split("'") and pick the values out of the
resulting list
3) luca[0].find("td").string
4) luca[0].find("font").string
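
To make the "search through the result list" step concrete, here is a small sketch of what splitting on the single quote yields. It assumes the `onclick` value is exactly the string printed earlier in the thread; the variable names `bot_name` and `pack_id` are hypothetical labels, not anything from the page.

```python
# The onclick value as printed earlier in the thread (assumed unchanged).
onclick = "t('Only|G|BoT|05','#1');"

# Splitting on the single quote leaves the quoted arguments at odd indexes:
# ['t(', 'Only|G|BoT|05', ',', '#1', ');']
parts = onclick.split("'")

bot_name = parts[1]  # first quoted argument
pack_id = parts[3]   # second quoted argument

print(bot_name)
print(pack_id)
```

This relies on the arguments never containing a single quote themselves; if they might, a stricter parse would be needed.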
 
luca72

Hello,
Another stupid question. Instead of using
sito = urllib.urlopen('http://www.prova.com/')
esamino = BeautifulSoup(sito)

I do
sito = urllib.urlopen('http://onlygame.helloweb.eu/')
file_sito = open('sito.html', 'wb')
for line in sito:
    file_sito.write(line)
file_sito.close()

How can I pass the file sito.html to BeautifulSoup?

Regards

Luca
 
Kay Schluehr
