Web Scraping/Site Scraping

Discussion in 'Python' started by David Jones, Jul 11, 2004.

  1. David Jones

    David Jones Guest

    Hi, I'm interested in learning about web scraping/site scraping using
    Python. Does anybody know of some online resources or have any modules that
    are available to help out? O'Reilly published an interesting book,
    "Spidering Hacks", which covered some great scraping hacks, but it is all
    written in Perl. I don't know Perl and don't want to. I'm new to
    programming and have been advised to start with Python. So far so good ...
    but I need some help with web programming. Thanks for any help you may
    provide. Dave.
     
    David Jones, Jul 11, 2004
    #1

  2. Anakim Border, Jul 11, 2004
    #2

  3. John J. Lee

    John J. Lee Guest

    http://wwwsearch.sourceforge.net/
    http://wwwsearch.sourceforge.net/bits/GeneralFAQ.html

    http://lists.sourceforge.net/lists/listinfo/wwwsearch-general (rather quiet ATM)


    I ported one of the examples from "Spidering Hacks" to my Python port
    of mechanize. It's in the tarball here:

    http://wwwsearch.sourceforge.net/mechanize/


    John
     
    John J. Lee, Jul 12, 2004
    #3
  4. Paul Bissex

    Paul Bissex Guest

    Dave, there's a chapter of "Dive Into Python" that deals specifically
    with processing HTML:

    http://diveintopython.org/html_processing/index.html

    If you're new to Python and programming, IMO you should start by going
    through one or more of the available introductory tutorials:

    http://python.org/doc/Intros.html

    Good luck!

    pb

    --
    paul bissex, e-scribe.com -- database-driven web development
    413.585.8095
    69.55.225.29
    01061-0847
    72°39'71"W 42°19'42"N
     
    Paul Bissex, Jul 12, 2004
    #4
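
As a rough illustration of the kind of HTML processing discussed above, here is a minimal sketch that pulls link targets out of a page using only the standard library's html.parser module (Python 3). It is not code from the "Dive Into Python" chapter, which may use a different module; the names LinkCollector, extract_links and SAMPLE_HTML are made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # tag and attribute names arrive lowercased;
        # attrs is a list of (name, value) pairs
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href targets found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample markup, used instead of fetching a real page
SAMPLE_HTML = """
<html><body>
  <p>See <a href="http://python.org/">Python</a> and
  <a href="/docs/index.html">the docs</a>.</p>
  <a name="anchor-without-href">no link here</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_HTML):
        print(link)
```

To scrape a live page, the text passed to extract_links would come from something like urllib.request.urlopen(url).read().decode(), with the encoding and error handling adjusted for the site in question.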
  5. On Sun, Jul 11, 2004 at 01:42:47PM +0000, David Jones wrote:
    > Hi, I'm interested in learning about web scraping/site scraping using
    > Python. Does anybody know of some online resources or have any modules that
    > are available to help out? O'Reilly published an interesting book,
    > "Spidering Hacks", which covered some great scraping hacks, but it is all
    > written in Perl. I don't know Perl and don't want to. I'm new to
    > programming and have been advised to start with Python. So far so good ...
    > but I need some help with web programming. Thanks for any help you may
    > provide. Dave.


    For the HTML parsing part of the task, I've heard that Beautiful Soup works
    well:
    http://www.crummy.com/software/BeautifulSoup/

    -Andrew.
     
    Andrew Bennetts, Jul 13, 2004
    #5
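
For readers who want to see what the Beautiful Soup suggestion looks like in practice, here is a small sketch. It assumes the current bs4 package (installed with "pip install beautifulsoup4"), which differs from the single-module release available when this thread was written. The function name and the sample markup are invented for illustration.

```python
from bs4 import BeautifulSoup


def extract_titles_and_links(html_text):
    """Return (link text, href) pairs for every <a> tag that has an href."""
    # "html.parser" selects Python's built-in parser, so no extra
    # parser library is needed
    soup = BeautifulSoup(html_text, "html.parser")
    results = []
    for anchor in soup.find_all("a", href=True):
        results.append((anchor.get_text(strip=True), anchor["href"]))
    return results


if __name__ == "__main__":
    # Hypothetical sample markup, used instead of fetching a real page
    sample = (
        '<p><a href="a.html"> First </a>'
        "<a>no href</a>"
        '<a href="b.html"><b>Second</b></a></p>'
    )
    for text, href in extract_titles_and_links(sample):
        print(text, href)
```

The main appeal over a hand-written parser is that searching the parsed tree (find_all here) replaces the event-handler bookkeeping, and the library is designed to cope with the sloppy markup found on many real sites.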
