Any suggestions for a URL cataloging project?

  • Thread starter Matthew K Jensen

Matthew K Jensen

I've just come up with an idea to keep a small-scale record of which
web pages link to which other web pages. I don't want to download every
page on the internet (I'll leave Google to do that). I just want to know
if anyone has suggestions on how to extract just the links from a web
page using Python. This is for cataloging purposes. Is there some
library or script out there that I haven't heard of?
 
Alex Martelli

Check out Tools/webchecker/ -- the Tools directory is part of Python's
source distribution and should also come with most prepackaged Python
distributions, I believe.
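
To give a rough idea of what pulling just the links out of a page involves,
here is a minimal sketch using only the standard library (html.parser and
urllib.parse). It is not webchecker's code; the names LinkCollector,
extract_links and catalog, and the example.com URLs, are made up for
illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links such as "/about" into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content; a real run would fetch this over HTTP.
sample = '<p><a href="/about">About</a> and <a href="http://example.org/">elsewhere</a></p>'

# The "catalog" here is just a dict: page URL -> list of URLs it links to.
catalog = {"http://example.com/": extract_links(sample, "http://example.com/")}
print(catalog)
```

For a live page, the HTML string could come from
urllib.request.urlopen(url).read(), decoded to text, and the resulting dict
could be written to whatever storage the catalog ends up using.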


Alex
 
Paul McGuire

One of the examples that comes with pyparsing is urlextractor.py. Point it
at a web page and it lists the URLs along with their link text.

Download pyparsing at http://pyparsing.sourceforge.net.
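
For anyone who wants to see the kind of output described (each URL with its
link text) before installing anything, here is a small standard-library
stand-in. To be clear, this is not urlextractor.py and it does not use
pyparsing; LinkTextParser, list_links and the sample HTML are hypothetical
names and data used only for this sketch.

```python
from html.parser import HTMLParser


class LinkTextParser(HTMLParser):
    """Pair each <a href=...> with the text found inside the tag."""

    def __init__(self):
        super().__init__()
        self.pairs = []     # finished (href, text) pairs
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so the text prints on one line.
            text = " ".join("".join(self._text).split())
            self.pairs.append((self._href, text))
            self._href = None


def list_links(html):
    """Return a list of (href, link text) tuples from an HTML string."""
    parser = LinkTextParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Hypothetical sample markup, including text nested in a <b> tag.
page = (
    '<ul><li><a href="http://example.com/">Example  <b>home</b></a></li>'
    '<li><a href="faq.html">FAQ</a></li></ul>'
)
for href, text in list_links(page):
    print(href, "->", text)
```

Anchors without an href (for example, named anchors) are skipped, and the
hrefs are reported exactly as written in the page, so relative links stay
relative.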

-- Paul
 
