view page source or save after load

zephron2000

Hey,

I need to either:

1. View the page source of a webpage after it loads

or

2. Save the webpage to my computer after it loads (same as File > Save
Page As)

urllib is not sufficient (using urlopen or something else in urllib
isn't going to do the trick)

Any ideas?

Thanks,
Lara
 
James Stroud

zephron2000 said:
Hey,

I need to either:

1. View the page source of a webpage after it loads

or

2. Save the webpage to my computer after it loads (same as File > Save
Page As)

urllib is not sufficient (using urlopen or something else in urllib
isn't going to do the trick)

Any ideas?

Thanks,
Lara

I happen to be tweaking a module that does this as your question came
in. The relevant lines are:

import urllib

# Python 2: encode the query parameters, then fetch and save the response
fetchparams = urllib.urlencode(fetchparams)
wwwf = urllib.urlopen("?".join([baseurl, fetchparams]))
afile = open(filename, "w")
afile.write(wwwf.read())
afile.close()
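
For readers on Python 3, where urlencode lives in urllib.parse, the URL-building step can be sketched roughly as follows. The function name build_url, the example address, and the parameter names are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlencode


def build_url(baseurl, params):
    """Join a base URL and a mapping of query parameters with '?'."""
    query = urlencode(params)  # spaces become '+', pairs are joined with '&'
    return "?".join([baseurl, query])


# Hypothetical address and parameters, for illustration only.
url = build_url("http://some.address/search", {"q": "page source", "page": 2})
```

The resulting string can then be passed to urlopen in the same way as the joined URL in the lines above.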

James

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
alex23

zephron2000 said:
I need to either:
1. View the page source of a webpage after it loads
or
2. Save the webpage to my computer after it loads (same as File > Save
Page As)
urllib is not sufficient (using urlopen or something else in urllib
isn't going to do the trick)

You don't really say _why_ urllib.urlopen "isn't going to do the
trick". The following does what you've described:

import urllib
page = urllib.urlopen('http://some.address')
open('saved_page.txt','w').write(page).close()

If you need to drive a browser directly and you're running under
Windows, try the Internet Explorer Controller library, IEC:

import IEC
ie = IEC.IEController()
ie.Navigate('http://some.address')
page = ie.GetDocumentHTML()
open('saved_page.txt','w').write(page.encode('iso-8859-1')).close()

(You can grab IEC from http://www.mayukhbose.com/python/IEC/index.php)

Hope this helps.

-alex23
 
Gabriel Genellina

alex23 said:
page = urllib.urlopen('http://some.address')

Add .read() at the end.

alex23 said:
open('saved_page.txt','w').write(page).close()

write() does not return the file object, so this won't work; you have
to bind the file to a temporary variable to be able to close it.
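
Putting both corrections together, a minimal sketch might look like the following. It assumes Python 3 (urllib.request rather than the Python 2 urllib used in this thread), and the function name save_page is hypothetical.

```python
import urllib.request


def save_page(url, filename):
    """Fetch url and write the raw response bytes to filename.

    Returns the number of bytes fetched.
    """
    # read() turns the response object into the actual page content.
    with urllib.request.urlopen(url) as response:
        data = response.read()
    # Binding the file to a name lets the with block close it for us.
    with open(filename, "wb") as out:
        out.write(data)
    return len(data)
```

Calling save_page('http://some.address', 'saved_page.txt') would then store the page exactly as the server sent it, without the chained .close() call.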



Gabriel Genellina
Softlab SRL





 
alex23

Gabriel Genellina wrote:
<fixes for my stupidity>

Thanks for the corrections, Gabriel. I really need to learn to
cut&paste working code :)

Cheers.

-alex23
 
James Stroud

Gabriel said:
add .read() at the end


write() does not return the file object, so this won't work; you have to
bind the file to a temporary variable to be able to close it.

Strictly speaking, "have to" is not perfectly correct. The ".close()"
part can simply be eliminated as the file should close via garbage
collection once leaving the local namespace.
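
As a rough illustration of both points, assuming Python 3 and a throwaway temporary directory: write() hands back a character count rather than the file object, and a with block closes the file deterministically instead of waiting for it to be reclaimed. The helper name write_and_check is hypothetical.

```python
import os
import tempfile


def write_and_check(path, text):
    """Write text to path; report write()'s return value and the closed state."""
    with open(path, "w") as out:
        returned = out.write(text)       # a character count in Python 3
        same_object = returned is out    # write() never returns the file itself
    # Leaving the with block has closed the file, no explicit close() needed.
    return returned, same_object, out.closed


path = os.path.join(tempfile.mkdtemp(), "saved_page.txt")
returned, same_object, closed = write_and_check(path, "<html></html>")
```

Dropping .close() and relying on garbage collection behaves the same way in CPython, where the unreferenced file object is reclaimed right away, but other interpreters may delay the close.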

James
 
