python 3.3 urllib.request

Discussion in 'Python' started by Steeve C, Dec 7, 2012.

  1. Steeve C

    Steeve C Guest

    hello,

    I have a Python 3 script using urllib.request which shows some strange
    behaviour; here is the script:

    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
    #!/usr/bin/env python3
    # -*- coding: utf-8 -*-

    import urllib.request
    import urllib.error
    import sys, time


    url = 'http://google.com'

    def make_some_stuff(page, url):
        sys.stderr.write(time.strftime("%d/%m/%Y %H:%M:%S -> page from \"") +
                         url + "\"\n")
        sys.stderr.write(str(page) + "\"\n")
        return True

    def get_page(url):
        while 1:
            try:
                page = urllib.request.urlopen(url)
                yield page

            except urllib.error.URLError as e:
                sys.stderr.write(time.strftime("%d/%m/%Y %H:%M:%S -> impossible to access to \"")
                                 + url + "\"\n")
                time.sleep(5)
                continue

    def main():
        print('in main')
        for page in get_page(url):
            make_some_stuff(page, url)
            time.sleep(5)

    if __name__ == '__main__':
        main()
    +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

    If the computer is connected to the internet (with an ethernet connection, for
    example) and I run this script, it works like a charm:
    - urllib.request.urlopen returns the page
    - make_some_stuff writes to stderr
    - when the ethernet cable is unplugged, the except block handles the error for
    as long as the cable stays unplugged, and when the cable is plugged back in,
    urllib.request.urlopen returns the page and make_some_stuff writes to stderr
    again

    This is the normal behaviour (for me, imho).

    But if the computer is not connected to the internet (ethernet cable unplugged)
    when I run this script, the except block handles the error (normal), but when
    I plug the cable back in, the script keeps looping and urllib.request.urlopen
    never returns the page (it always goes to the except block).

    What can I do to handle that?

    Thanks

    Steeve
    Steeve C, Dec 7, 2012
    #1

  2. Hans Mulder

    Hans Mulder Guest

    On 7/12/12 13:52:52, Steeve C wrote:
    > I have a Python 3 script using urllib.request which shows some strange
    > behaviour; here is the script:
    >
    > [script snipped]
    >
    > But if the computer is not connected to the internet (ethernet cable
    > unplugged) when I run this script, the except block handles the error
    > (normal), but when I plug the cable back in, the script keeps looping
    > and urllib.request.urlopen never returns the page (it always goes to
    > the except block).
    >
    > What can I do to handle that?


    On my laptop, your script works as you'd hope: if I plug in the
    network cable, then the next urllib request sometimes fails, but
    the request after that succeeds.
    This is using Python 3.3 on MacOS X 10.5.
    What version are you running?

    What happens if you start the script with the network cable
    plugged in, then unplug it when the first request has succeeded,
    and then plug it in again when the next request has failed?

    -- HansM
    Hans Mulder, Dec 7, 2012
    #2

  3. Terry Reedy

    Terry Reedy Guest

    On 12/7/2012 12:27 PM, Hans Mulder wrote:
    > On 7/12/12 13:52:52, Steeve C wrote:
    >> [problem description and script snipped]
    >>
    >> What can I do to handle that?


    Don't do that '-).

    > On my laptop, your script works as you'd hope: if I plug in the
    > network cable, then the next urllib request sometimes fails, but
    > the request after that succeeds.
    > This is using Python 3.3 on MacOS X 10.5.
    > What version are you running?
    >
    > What happens if you start the script with the network cable
    > plugged in, then unplug it when the first request has succeeded,
    > and then plug it in again when the next request has failed?


    I believe he said that that worked. But unplugging cables is not a good
    idea ;-)

    I remember when it was recommended that all cables be plugged in and the
    connected devices turned on before the computer was turned on, and when
    devices might not be recognized unless they were plugged in and on when the
    computer was booted or rebooted. In other words, ports were scanned once
    as part of the boot process and adding a device required a reboot. It
    certainly was not that long ago that I had to reboot after the Internet
    service went down and the cable modem had to reset.

    Ethernet and USB ports and modern OSes are more forgiving. But it does not
    surprise me if, on some systems, something has to be present at process
    startup to even be visible to the process.

    I believe this is all beyond Python's control. So the only thing to do
    might be to change hardware and/or OS or have the program restart itself
    if it gets repeated errors.
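
    A rough sketch of that "restart itself" idea (not from the thread): the
    generator counts consecutive failures and, past a hypothetical threshold of
    ten, re-executes the interpreter with the same arguments via os.execv.

    import os, sys, time
    import urllib.request, urllib.error

    MAX_FAILURES = 10   # arbitrary threshold, chosen only for illustration

    def get_page(url):
        failures = 0
        while True:
            try:
                page = urllib.request.urlopen(url)
                failures = 0          # any success resets the counter
                yield page
            except urllib.error.URLError:
                failures += 1
                if failures >= MAX_FAILURES:
                    # Replace the current process with a fresh interpreter,
                    # re-running the same script with the same arguments.
                    os.execv(sys.executable, [sys.executable] + sys.argv)
                time.sleep(5)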

    --
    Terry Jan Reedy
    Terry Reedy, Dec 8, 2012
    #3
  4. Hans Mulder

    Hans Mulder Guest

    On 8/12/12 07:20:55, Terry Reedy wrote:
    > On 12/7/2012 12:27 PM, Hans Mulder wrote:
    >> On 7/12/12 13:52:52, Steeve C wrote:
    >>> [problem description and script snipped]
    >>>
    >>> What can I do to handle that?


    > Don't do that '-).


    >> On my laptop, your script works as you'd hope: if I plug in the
    >> network cable, then the next urllib request sometimes fails, but
    >> the request after that succeeds.
    >> This is using Python 3.3 on MacOS X 10.5.
    >> What version are you running?
    >>
    >> What happens if you start the script with the network cable
    >> plugged in, then unplug it when the first request has succeeded,
    >> and then plug it in again when the next request has failed?


    > I believe he said that that worked.


    You're right: he said that.

    > But unplugging cables is not a good idea ;-)
    >
    > I remember when it was recommended that all cables be plugged in and the
    > connected devices turned on before the computer was turned on, and when
    > devices might not be recognized unless they were plugged in and on when the
    > computer was booted or rebooted. In other words, ports were scanned once
    > as part of the boot process and adding a device required a reboot.


    I also remember the time when that was true. But these days, many
    devices are designed to be plugged in with the computer running,
    and the OS continuously scans for new devices.

    > It certainly was not that long ago when I had to reboot after the
    > Internet Service went down and the cable modem had to reset.


    That's a configuration problem: when the cable modem is reset, your
    computer needs to rerun its "network up" script to renew its DHCP lease.
    If it isn't configured to do that automatically, and you don't know
    how to run it manually, then rebooting may be your only option.

    This is a common problem on desktop computers (where losing the
    connection to the cable modem is rare). Laptops are typically
    configured to deal with connections appearing and disappearing
    on both the wired and the wireless interfaces.

    > Ethernet and USB ports and modern OSes are more forgiving. But it does
    > not surprise me if, on some systems, something has to be present at
    > process startup to even be visible to the process.


    His system may be caching the outcome of the IP address lookup.

    If that's the case, I'd expect different error messages, depending
    on whether the first lookup succeeded or not. But since his script
    carefully avoids printing the exception message, it's hard to tell.
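
    For what it's worth, a minimal variation of the original except block that
    would surface that message is sketched below (the exact formatting is an
    assumption; e.reason normally carries the underlying socket error):

    import sys, time
    import urllib.request, urllib.error

    def get_page(url):
        while True:
            try:
                yield urllib.request.urlopen(url)
            except urllib.error.URLError as e:
                # e.reason usually wraps the underlying OSError / socket error,
                # which shows whether the name lookup or the connection failed.
                sys.stderr.write(time.strftime("%d/%m/%Y %H:%M:%S -> ")
                                 + "failed to reach \"" + url + "\": "
                                 + repr(e.reason) + "\n")
                time.sleep(5)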

    > I believe this is all beyond Python's control. So the only thing to do
    > might be to change hardware and/or OS or have the program restart itself
    > if it gets repeated errors.


    I think that it would depend on the error message. If the error is
    "Network is unreachable" or "No route to host", then sleeping and
    trying again might work. If the error is "nodename nor servname
    provided, or not known", then the script would have to restart itself.
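
    One way to act on that distinction (the socket.gaierror check is an
    assumption about how the failed name lookup surfaces, not something tested
    here) would be a small predicate used inside the except block:

    import socket
    import urllib.error

    def should_restart(exc):
        """Return True when the failure looks like a dead name lookup
        (e.g. "nodename nor servname provided, or not known"), which may
        only clear up after the process restarts; return False for
        transient errors such as "Network is unreachable" or
        "No route to host", which are worth a plain retry."""
        return isinstance(exc.reason, socket.gaierror)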


    Hope this helps,

    -- HansM
    Hans Mulder, Dec 8, 2012
    #4
