view page source or save after load

Discussion in 'Python' started by zephron2000, Sep 21, 2006.

  1. zephron2000

    zephron2000 Guest

    Hey,

    I need to either:

    1. View the page source of a webpage after it loads

    or

    2. Save the webpage to my computer after it loads (same as File > Save
    Page As)

    urllib is not sufficient (using urlopen or something else in urllib
    isn't going to do the trick)

    Any ideas?

    Thanks,
    Lara
     
    zephron2000, Sep 21, 2006
    #1

  2. James Stroud

    James Stroud Guest

    zephron2000 wrote:
    > Hey,
    >
    > I need to either:
    >
    > 1. View the page source of a webpage after it loads
    >
    > or
    >
    > 2. Save the webpage to my computer after it loads (same as File > Save
    > Page As)
    >
    > urllib is not sufficient (using urlopen or something else in urllib
    > isn't going to do the trick)
    >
    > Any ideas?
    >
    > Thanks,
    > Lara


    I happen to be tweaking a module that does this as your question came
    in. The relevant lines are:

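    import urllib  # Python 2 API; Python 3 moved these to urllib.parse / urllib.request

    # fetchparams, baseurl and filename are defined elsewhere in the module
    fetchparams = urllib.urlencode(fetchparams)
    wwwf = urllib.urlopen("?".join([baseurl, fetchparams]))
    afile = open(filename, "w")
    afile.write(wwwf.read())  # write the response body to disk
    afile.close()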
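
    For readers on current Python 3 (an assumption; the lines above use the Python 2 `urllib` of 2006), roughly the same steps might look like the sketch below. `build_url` and `save_url` are hypothetical helper names, not part of the module mentioned in this post.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(baseurl, params):
    """Join a base URL and a dict of query parameters with '?'."""
    return "?".join([baseurl, urlencode(params)])


def save_url(url, filename):
    """Fetch url and write the raw response bytes to filename."""
    with urlopen(url) as response:       # response is closed on exit
        data = response.read()           # bytes, as delivered by the server
    with open(filename, "wb") as afile:  # binary mode: no re-encoding of the page
        afile.write(data)
    return len(data)
```

    A call such as `save_url(build_url("http://some.address", {"q": "python"}), "saved_page.html")` would then store whatever the server returns for that placeholder address. Like the original lines, this saves only the document the server sends, not anything a browser builds afterwards.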

    James

    --
    James Stroud
    UCLA-DOE Institute for Genomics and Proteomics
    Box 951570
    Los Angeles, CA 90095

    http://www.jamesstroud.com/
     
    James Stroud, Sep 21, 2006
    #2

  3. alex23

    alex23 Guest

    zephron2000 wrote:
    > I need to either:
    > 1. View the page source of a webpage after it loads
    > or
    > 2. Save the webpage to my computer after it loads (same as File > Save
    > Page As)
    > urllib is not sufficient (using urlopen or something else in urllib
    > isn't going to do the trick)


    You don't really say _why_ urllib.urlopen "isn't going to do the
    trick". The following does what you've described:

    import urllib
    page = urllib.urlopen('http://some.address')
    open('saved_page.txt','w').write(page).close()

    If you need to drive a browser directly and you're running under
    Windows, try the Internet Explorer Controller library, IEC:

    import IEC
    ie = IEC.IEController()
    ie.Navigate('http://some.address')
    page = ie.GetDocumentHTML()
    out = open('saved_page.txt', 'w')
    out.write(page.encode('iso-8859-1'))
    out.close()

    (You can grab IEC from http://www.mayukhbose.com/python/IEC/index.php)

    Hope this helps.

    -alex23
     
    alex23, Sep 21, 2006
    #3
  4. Gabriel Genellina

    At Thursday 21/9/2006 02:26, alex23 wrote:

    >page = urllib.urlopen('http://some.address')


    Add .read() at the end: urlopen() returns a file-like object, not the
    page contents.

    >open('saved_page.txt','w').write(page).close()


    write() does not return the file object, so this won't work; you have
    to bind the file to a temporary variable to be able to close it.
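
    The second point can be checked with a small sketch. It is written for current Python 3, which is an assumption: the thread uses Python 2, where write() returns None, while in Python 3 it returns the number of characters written. Either way the result has no close() method. The helper names are hypothetical, and `page` stands in for the text obtained from .read().

```python
def chained_write(filename, page):
    """The chained pattern: fails because write() does not return the file."""
    try:
        open(filename, "w").write(page).close()
    except AttributeError as exc:
        return "failed: %s" % exc
    return "ok"


def bound_write(filename, page):
    """The fix: bind the file to a name so it can be closed explicitly."""
    out = open(filename, "w")
    try:
        out.write(page)
    finally:
        out.close()  # runs even if write() raises
    return out.closed
```

    Here `chained_write` reports the AttributeError instead of crashing, and `bound_write` returns True once the file has been closed.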



    Gabriel Genellina
    Softlab SRL
     
    Gabriel Genellina, Sep 21, 2006
    #4
  5. alex23

    alex23 Guest

    Gabriel Genellina wrote:
    <fixes for my stupidity>

    Thanks for the corrections, Gabriel. I really need to learn to
    cut&paste working code :)

    Cheers.

    -alex23
     
    alex23, Sep 21, 2006
    #5
  6. James Stroud

    James Stroud Guest

    Gabriel Genellina wrote:
    > At Thursday 21/9/2006 02:26, alex23 wrote:
    >
    >> page = urllib.urlopen('http://some.address')

    >
    > add .read() at the end
    >
    >> open('saved_page.txt','w').write(page).close()

    >
    > write() does not return the file object, so this won't work; you have to
    > bind the file to a temporary variable to be able to close it.


    Strictly speaking, "have to" is not quite correct. The ".close()"
    part can simply be dropped: the file is closed by garbage collection
    once nothing refers to it any more (in CPython, right after that
    statement finishes). Other implementations may close it later.
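
    If the close should not depend on the garbage collector at all, a `with` block closes the file at a known point and needs no separate close() call. A minimal sketch, assuming current Python 3 and using a hypothetical helper name:

```python
def save_text(filename, text):
    """Write text to filename; the with block closes the file on exit."""
    with open(filename, "w") as out:
        out.write(text)
    # Past the block the file is already closed; this also holds if write() raises.
    return out.closed
```

    Calling `save_text('saved_page.txt', page)` returns True, confirming the file was closed when the block ended.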

    James
     
    James Stroud, Sep 21, 2006
    #6
