python 3.3 urllib.request

Steeve C

hello,

I have a python3 script using urllib.request that behaves strangely.
Here is the script:

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import urllib.error
import urllib.request
import sys, time


url = 'http://google.com'

def make_some_stuff(page, url):
    # Log a timestamp, the URL, and the response object to stderr.
    sys.stderr.write(time.strftime("%d/%m/%Y %H:%M:%S -> page from \"") +
                     url + "\"\n")
    sys.stderr.write(str(page) + "\"\n")
    return True

def get_page(url):
    # Generator: yield a response each time urlopen succeeds,
    # and wait 5 seconds before retrying when it fails.
    while 1:
        try:
            page = urllib.request.urlopen(url)
            yield page

        except urllib.error.URLError as e:
            sys.stderr.write(time.strftime("%d/%m/%Y %H:%M:%S -> impossible "
                                           "to access to \"") + url + "\"\n")
            time.sleep(5)
            continue

def main():
    print('in main')
    for page in get_page(url):
        make_some_stuff(page, url)
        time.sleep(5)

if __name__ == '__main__':
    main()
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

If the computer is connected to the internet (over Ethernet, for
example) when I run this script, it works like a charm:
- urllib.request.urlopen returns the page
- make_some_stuff writes to stderr
- when the Ethernet cable is unplugged, the except block handles the error
for as long as the cable stays unplugged, and when the cable is plugged
back in, urllib.request.urlopen returns the page again and make_some_stuff
writes to stderr

This is the normal behavior (for me, imho).

But if the computer is not connected to the internet (Ethernet cable
unplugged) when I run this script, the except block handles the error
(normal), but when I plug the cable in, the script keeps looping and
urllib.request.urlopen never returns the page (so it always goes to the
except block).

What can I do to handle that?

Thanks

Steeve
 
Hans Mulder

On my laptop, your script works as you'd hope: if I plug in the
network cable, then the next urllib request sometimes fails, but
the request after that succeeds.
This is using Python 3.3 on MacOS X 10.5.
What version are you running?

What happens if you start the script with the network cable
plugged in, then unplug it when the first request has succeeded,
and then plug it in again when the next request has failed?

-- HansM
 
Terry Reedy

Don't do that '-).
> On my laptop, your script works as you'd hope: if I plug in the
> network cable, then the next urllib request sometimes fails, but
> the request after that succeeds.
> This is using Python 3.3 on MacOS X 10.5.
> What version are you running?
>
> What happens if you start the script with the network cable
> plugged in, then unplug it when the first request has succeeded,
> and then plug it in again when the next request has failed?

I believe he said that that worked. But unplugging cables is not a good
idea ;-)

I remember when it was recommended that all cables be plugged in and the
connected devices turned on when the computer was turned on, and when
devices might not be recognized unless plugged in and on when the
computer was booted or rebooted. In other words, ports were scanned once
as part of the boot process and adding a device required a reboot. It
certainly was not that long ago when I had to reboot after the Internet
Service went down and the cable modem had to reset.

Ethernet and usb ports and modern OSes are more forgiving. But it does
not surprise me if on some systems something has to be present at
process startup to even be visible to the process.

I believe this is all beyond Python's control. So the only thing to do
might be to change hardware and/or OS or have the program restart itself
if it gets repeated errors.
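
A minimal sketch of the "restart itself" idea, for illustration only. The threshold `MAX_FAILURES`, the helper names `restart_process` and `poll`, and the `fetch`, `restart` and `rounds` parameters are all assumptions of this sketch and do not come from the original script. It counts consecutive `URLError`s and, once the threshold is reached, replaces the process with a fresh copy of itself via `os.execv`.

```python
import os
import sys
import time
import urllib.error
import urllib.request

MAX_FAILURES = 5  # hypothetical threshold, not taken from the thread


def restart_process():
    """Replace the running process with a fresh copy of this script."""
    sys.stderr.flush()
    os.execv(sys.executable, [sys.executable] + sys.argv)


def poll(url, fetch=urllib.request.urlopen, restart=restart_process,
         delay=5, max_failures=MAX_FAILURES, rounds=None):
    """Fetch url repeatedly; call restart() after too many failures in a row.

    rounds limits the number of attempts (None means loop forever).
    """
    failures = 0
    attempt = 0
    while rounds is None or attempt < rounds:
        attempt += 1
        try:
            page = fetch(url)
        except urllib.error.URLError as e:
            failures += 1
            sys.stderr.write("attempt %d failed: %s\n" % (attempt, e.reason))
            if failures >= max_failures:
                restart()
                # Only reached if restart() returns (e.g. a stub in a test).
                failures = 0
        else:
            failures = 0  # any success resets the streak
            sys.stderr.write("got %s\n" % page)
        time.sleep(delay)
```

Whether a fresh process actually sees the network again depends on the OS, so this is something to try rather than a guaranteed fix.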
 
Hans Mulder

> Don't do that '-).
> I believe he said that that worked.

You're right: he said that.

> But unplugging cables is not a good idea ;-)
>
> I remember when it was recommended that all cables be plugged in and the
> connected devices turned on when the computer was turned on, and when
> devices might not be recognized unless plugged in and on when the
> computer was booted or rebooted. In other words, ports were scanned once
> as part of the boot process and adding a device required a reboot.

I also remember the time when that was true. But these days, many
devices are designed to be plugged in with the computer running,
and the OS continuously scans for new devices.

> It certainly was not that long ago when I had to reboot after the
> Internet Service went down and the cable modem had to reset.

That's a configuration problem: when the cable modem is reset, your
computer needs to rerun its "network up" script to renew its DHCP lease.
If it isn't configured to do that automatically, and you don't know
how to run it manually, then rebooting may be your only option.

This is a common problem on desktop computers (where losing the
connection to the cable modem is rare). Laptops are typically
configured to deal with connections appearing and disappearing
on both the wired and the wireless interface.

> Ethernet and usb ports and modern OSes are more forgiving. But it does
> not surprise me if on some systems something has to be present at
> process startup to even be visible to the process.

His system may be caching the outcome of the IP address lookup.

If that's the case, I'd expect different error messages, depending
on whether the first lookup succeeded or not. But since his script
carefully avoids printing the exception message, it's hard to tell.
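
To see which error is actually occurring, the except block could log the exception's `reason`. The sketch below is one way to do that; the helper names `describe_error` and `log_failure` are hypothetical and not part of the original script. It relies only on `URLError.reason`, which is normally an `OSError` subclass such as `socket.gaierror` but can also be a plain string.

```python
import sys
import time
import urllib.error


def describe_error(exc):
    """Return a one-line description of a URLError, including its reason."""
    reason = exc.reason
    # reason is usually an OSError (e.g. socket.gaierror), but may be a str.
    return "%s: %s" % (type(reason).__name__, reason)


def log_failure(url, exc, stream=sys.stderr):
    """Write a timestamped failure line that includes the underlying error."""
    stamp = time.strftime("%d/%m/%Y %H:%M:%S")
    stream.write('%s -> impossible to access "%s" (%s)\n'
                 % (stamp, url, describe_error(exc)))
```

In the script's `except urllib.error.URLError as e:` block, calling `log_failure(url, e)` in place of the current `sys.stderr.write(...)` would show whether the two scenarios fail with the same message.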

> I believe this is all beyond Python's control. So the only thing to do
> might be to change hardware and/or OS or have the program restart itself
> if it gets repeated errors.

I think that it would depend on the error message. If the error is
"Network is unreachable" or "No route to host", then sleeping and
trying again might work. If the error is "nodename nor servname
provided, or not known", then the script would have to restart itself.
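
That distinction could be sketched roughly as follows. The names `RETRY`, `RESTART`, `UNKNOWN` and `choose_action` are made up for this example, and the mapping is an assumption: it treats `errno.ENETUNREACH` and `errno.EHOSTUNREACH` as the "Network is unreachable" and "No route to host" cases, and any `socket.gaierror` as the name-lookup case.

```python
import errno
import socket
import urllib.error

RETRY = "retry"      # sleep and try again
RESTART = "restart"  # have the script restart itself
UNKNOWN = "unknown"  # not covered by this sketch; the caller decides

# errno values for "Network is unreachable" and "No route to host"
TRANSIENT_ERRNOS = {errno.ENETUNREACH, errno.EHOSTUNREACH}


def choose_action(exc):
    """Map a URLError to RETRY, RESTART or UNKNOWN based on its reason."""
    reason = exc.reason
    # Name-lookup failures arrive as socket.gaierror. Check this first,
    # because gaierror is itself a subclass of OSError.
    if isinstance(reason, socket.gaierror):
        return RESTART
    if isinstance(reason, OSError) and reason.errno in TRANSIENT_ERRNOS:
        return RETRY
    return UNKNOWN
```

The except block would then call `choose_action(e)` and either sleep and continue, or restart the script, depending on the result.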


Hope this helps,

-- HansM
 
