web crawling.

S Borg

Hello,

I have been writing very simple Python programs that parse HTML and
such, mainly just to get a better feel for the language. Here is my
question: If I parsed an HTML page into all of the image files listed
on that page, how could I request all of those images and download
them into some specified folder? I am sure this is quite easy, but I
am stuck.

Thank you very much.
Burgeoning Pythonista
 
Alex Martelli

S Borg said:
Hello,

I have been writing very simple Python programs that parse HTML and
such, mainly just to get a better feel for the language. Here is my
question: If I parsed an HTML page into all of the image files listed
on that page, how could I request all of those images and download
them into some specified folder? I am sure this is quite easy, but I
am stuck.

There's a good crawler in the Demo directory of the Python source
distribution, so download and unpack said sources and look there.


Alex
 
gene tani

S said:
Hello,

I have been writing very simple Python programs that parse HTML and
such, mainly just to get a better feel for the language. Here is my
question: If I parsed an HTML page into all of the image files listed
on that page, how could I request all of those images and download
them into some specified folder? I am sure this is quite easy, but I
am stuck.

Thank you very much.
Burgeoning Pythonista

http://sig.levillage.org/?p=588
 
Fuzzyman

Use BeautifulSoup to get all the image tags out of the html.

You'll need to join the urls of the images to the url of the page
(urlparse.urljoin off the top of my head). If you look at BeautifulSoup
you will see how to get the 'src' reference of each image tag.
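
A rough sketch of those steps, in case it helps. To keep it free of
third-party packages it uses the standard library's html.parser in
place of BeautifulSoup, and it is written for Python 3, where urljoin
lives in urllib.parse rather than urlparse. The names ImgSrcParser,
image_urls and download_images are made up for this example, and the
fallback file name "unnamed" is just an assumption. It does no error
handling and will overwrite files that share a name.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit
from urllib.request import urlopen


class ImgSrcParser(HTMLParser):
    """Collect the src attribute of every <img> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def image_urls(page_url, html):
    """Return absolute URLs for all images referenced in the html text."""
    parser = ImgSrcParser()
    parser.feed(html)
    # Relative src values are resolved against the URL of the page itself.
    return [urljoin(page_url, src) for src in parser.sources]


def download_images(page_url, folder):
    """Fetch page_url and save every image it references into folder."""
    with urlopen(page_url) as response:
        html = response.read().decode("utf-8", errors="replace")
    os.makedirs(folder, exist_ok=True)
    saved = []
    for url in image_urls(page_url, html):
        # Use the last path component as the file name (assumed fallback).
        name = os.path.basename(urlsplit(url).path) or "unnamed"
        path = os.path.join(folder, name)
        with urlopen(url) as img, open(path, "wb") as out:
            out.write(img.read())
        saved.append(path)
    return saved
```

Calling download_images("http://example.com/page.html", "images")
would then try to fetch each image into an "images" folder; the URL
and folder name here are placeholders only.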

All the best,

Fuzzyman
http://www.voidspace.org.uk/python/index.shtml
 
John M. Gabriele

Alex said:
There's a good crawler in the Demo directory of the Python source
distribution, so download and unpack said sources and look there.


Alex

Hm. Looks like that's:

Python-2.4.2/Tools/webchecker

See 'pydoc ./webchecker.py' for more info.

---J
 
