Re: search an entire website given the homepage URL

Discussion in 'Python' started by Fredrik Lundh, Apr 25, 2006.

  1. "Bell, Kevin" wrote:

    > I know I can use urllib2 to get at a website given urllib2.urlopen(url)
    > but I'm unsure how to then go through all pages that are linked to it,
    > but still in the domain. If I want to search through the entire python
    > website give the homepage, how would I go about it?


    use a search engine (try the search box in the upper right corner).

    using a spider to download the entire site just so you can "search through
    it" is bloody impolite.

    if you have a valid reason to download portions of the site, use wget's mirror
    function, or some similar tool, and be nice. there's a tool called "websucker"
    in the Tools directory of the standard Python distribution that can also be used
    to mirror portions of a site:

    http://svn.python.org/view/python/trunk/Tools/webchecker/

    </F>
    Fredrik Lundh, Apr 25, 2006
    #1
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Lord0
    Replies:
    1
    Views:
    548
    Thomas Weidenfeller
    Apr 19, 2006
  2. Bell, Kevin
    Replies:
    0
    Views:
    261
    Bell, Kevin
    Apr 25, 2006
  3. HYRY
    Replies:
    1
    Views:
    259
    James Stroud
    Oct 31, 2006
  4. daniel
    Replies:
    6
    Views:
    5,701
    David Woakes
    Mar 23, 2010
  5. Diarmaid
    Replies:
    3
    Views:
    166
    Evertjan.
    May 12, 2004
Loading...

Share This Page