download robot

Discussion in 'Python' started by larryzhang, Apr 13, 2009.

  1. larryzhang

    larryzhang Guest

    Hi,
    As a Python newbie, I am trying to write a program that can act as
    a download robot.

    The website provides information about companies. Manually, I can search
    by company name and then click the “download” button to get the data
    in Excel or Word format, then save the file in a local directory.
    The program should do this automatically.

    I have run into several problems while writing the code:
    1. The website requires a user ID and password. Is there a way to
    pass my ID and password to the server from my Python code?
    2. Can Python click the “download” button automatically and choose the
    file format, as I can do manually?
    3. The URL of each download page is not unique (pages that point
    to different data files may share the same URL), which prevents me from
    using the URL directly as the address of a particular file.
    Is there any solution for this? Does this mean I have to work directly
    with the database stored on the server rather than with the webpage
    displayed?

    Thank you very much for any comments and suggestions.

    Larry
     
    larryzhang, Apr 13, 2009
    #1

  2. On Mon, Apr 13, 2009 at 11:13 AM, larryzhang wrote:
    > Hi,
    > Being a newbie for Python, I am trying to write a code that can act as
    > a downloading robot.
    >


    This might be useful: http://wwwsearch.sourceforge.net/mechanize/.
    I've only skimmed the page and haven't actually used it. If you
    prefer, you can also use urllib2 from the standard library and do
    all the work yourself. Notes for that route are below.

    > The website provides information for companies. Manually, I can search
    > by company name and then click the “download” button to get the data
    > in excel or word format, before saving the file in a local directory.
    > The program is to do this automatically.
    >
    > I have met several problems when writing the codes:
    > 1. The website needs user ID and password, is there a way that I can
    > pass my ID and password to the server in my python code?


    See the examples in the urllib2 documentation for how to send a
    username and password for Basic authentication. If the authentication
    is done through an HTML form instead, you'll need to send the form
    data with your request. The website might then use cookies to track
    your session, so your code will need to be prepared to handle that.
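Both cases can be sketched roughly as follows. This is only a minimal sketch, not something tested against your site. It uses `urllib.request`, which is where the `urllib2` functionality lives in Python 3. The form field names `username` and `password` and the example URLs are assumptions; the real names have to be read from the login page's HTML.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_basic_auth_opener(top_url, user, password):
    """Return an opener that answers HTTP Basic auth challenges under top_url."""
    mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm means "use these credentials for any realm".
    mgr.add_password(None, top_url, user, password)
    return urllib.request.build_opener(urllib.request.HTTPBasicAuthHandler(mgr))


def build_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies via jar."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_login_request(login_url, user, password):
    """Build a POST request for a form login.

    The field names "username" and "password" are hypothetical; check the
    <input name="..."> attributes of the real login form.
    """
    data = urllib.parse.urlencode({"username": user, "password": password})
    # Supplying data makes urllib send the request as a POST.
    return urllib.request.Request(login_url, data=data.encode("ascii"))
```

For a form login you would call `opener.open(build_login_request(...))` once with the cookie-aware opener, then reuse that same opener for every later request so the session cookie is sent along.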

    > 2. Can Python hit the “download” button automatically and choose the
    > type of file format as I can do manually?


    The download button probably just triggers an ordinary GET or POST
    request, with the chosen file format sent as one of the form fields.
    You'll need to be familiar with HTML forms to reproduce that request
    yourself.
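As a rough illustration of reproducing such a request by hand, here is a minimal sketch using `urllib.request` (the Python 3 home of `urllib2`). The field names `company` and `format` and the URL are made up for the example; the real ones come from the site's form.

```python
import shutil
import urllib.parse
import urllib.request


def build_download_request(url, fields, method="POST"):
    """Build the request a form submission would send.

    For GET the fields go into the query string (this assumes url has no
    query string yet); for POST they go into the request body.
    """
    encoded = urllib.parse.urlencode(fields)
    if method.upper() == "GET":
        return urllib.request.Request(url + "?" + encoded)
    return urllib.request.Request(url, data=encoded.encode("ascii"))


def save_response(response, path):
    """Copy the body of an open response to a local file in binary mode."""
    with open(path, "wb") as out:
        shutil.copyfileobj(response, out)


# Hypothetical form fields: the company to look up and the file format.
request = build_download_request(
    "http://example.com/download",
    {"company": "Acme Ltd", "format": "xls"},
)
```

To actually fetch the file you would open the request with a logged-in opener and pass the response to `save_response(response, "acme.xls")`.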

    > 3. The url of each downloading webpage is not unique (webpages point
    > to different data files may share the same url), which prevent me from
    > working directly with the url as the address to find a certain file.
    > Is there any solution for this? Does this mean I have to work directly
    > with the database stored in the server rather than with the webpage
    > displayed?


    This simply means that the identifiers for the file to download are
    being passed in using means other than the URL, most likely as POST
    data. Look at the HTML for the page to see how.
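One way to look at the HTML programmatically is to list each form's target, method, and named input fields, hidden ones included. The sketch below uses the standard library's `html.parser`; the sample form and its field names (`company_id`, `format`) are invented for illustration, and a real page may also use `<select>` or JavaScript, which this does not handle.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect action, method, and named <input> fields of each form."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action", ""),
                "method": (attrs.get("method") or "get").lower(),
                "fields": {},
            }
            self.forms.append(self._current)
        elif tag == "input" and self._current is not None and attrs.get("name"):
            # Inputs without a value attribute are recorded as empty strings.
            self._current["fields"][attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None


def find_forms(html_text):
    """Return a list of dicts describing the forms found in html_text."""
    parser = FormFieldParser()
    parser.feed(html_text)
    return parser.forms


# Invented sample: the file is selected by hidden fields, not by the URL.
sample = (
    '<form action="/download" method="POST">'
    '<input type="hidden" name="company_id" value="42">'
    '<input type="hidden" name="format" value="xls">'
    '<input type="submit" value="Download">'
    "</form>"
)
forms = find_forms(sample)
```

The `fields` dictionary of each form is what you would then encode and send as the POST data.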

    >
    > Thank you very much for any comments and suggestions.
    >


    You'll find tools that let you observe the communication between your
    browser and the web server useful. If you use Mozilla Firefox, the
    HttpFox extension might help.

    > Larry
     
    Kushal Kumaran, Apr 13, 2009
    #2