UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position

  • Thread starter Νίκος
  • Start date

MRAB

On 4/7/2013 3:07 PM, MRAB wrote:
Also, try printing out ascii(os.environ['REMOTE_ADDR']).

'108.162.229.97' is the result of:

print( ascii(os.environ['REMOTE_ADDR']) )

Seems perfectly valid, and it also has a PTR record, so that leaves us
clueless about the internal server error.
For me, socket.gethostbyaddr('108.162.229.97') raises socket.herror,
which is also a subclass of OSError from Python 3.3 onwards.
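
A quick way to confirm the class relationship mentioned here, as a minimal sketch assuming Python 3.3 or later: both resolver exceptions derive from OSError, so a single `except OSError` clause covers either one.

```python
import socket

# Both resolver exceptions derive from OSError on Python 3.3+,
# so one "except OSError" clause handles herror and gaierror alike.
for exc in (socket.herror, socket.gaierror):
    print(exc.__name__, issubclass(exc, OSError))
```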
 
Νίκος Γκρ33κ

On 4/7/2013 4:34 PM, MRAB wrote:
On 4/7/2013 3:07 PM, MRAB wrote:
On 04/07/2013 12:36, Νίκος wrote:
On 4/7/2013 2:06 PM, MRAB wrote:
On 04/07/2013 11:38, Νίκος wrote:
On 4/7/2013 12:50 PM, Ulrich Eckhardt wrote:
On 04.07.2013 10:37, Νίκος wrote:
I just started to have this error without changing nothing

Well, undo the nothing that you didn't change. ;)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
[Thu Jul 04 11:35:14 2013] [error] [client 108.162.229.97] Premature end of script headers: metrites.py

Why can't it decode the starting byte? What starting byte is that?

It's the 0xb6 but it's expecting the starting byte of a UTF-8
sequence.
Please do some research on UTF-8, that should clear it up. You could
also search for common causes of that error.

So you are also suggesting that what gethostbyaddr() returns is not
UTF-8 encoded either?

What character is 0xb6 anyways?

Well, it's from a bytestring, so you'll have to specify what encoding
you're using! (It clearly isn't UTF-8.)

If it's ISO-8859-7 (what you've previously referred to as "greek-iso"), then:

>>> import unicodedata
>>> unicodedata.name(b"\xb6".decode("ISO-8859-7"))
'GREEK CAPITAL LETTER ALPHA WITH TONOS'

You'll need to find out where that bytestring is coming from.

Right.
But nowhere in my script (metrites.py) do I use an 'Ά', so I really have no
clue where this is coming from.

And you are right: if the byte had come from a UTF-8 encoding scheme,
it would have been decoded automatically.

The only thing I can say for sure is that this problem appears only
when I cloudflare my domain "superhost.gr".

If I un-cloudflare it, it ceases to display errors.

Can you tell me how to write the following properly:

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'

so that even if the function fails, "UnResolved" is returned?
Somehow I need to capture the error.

Or does it not have to, because the "or" operand will be returned?

If gethostbyaddr fails, it raises socket.gaierror, (which, from Python
3.3 onwards, is a subclass of OSError), so try catching that, setting
'host' to 'UnResolved' if it's raised.

Also, try printing out ascii(os.environ['REMOTE_ADDR']).

I have followed your suggestion by trying this:

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except socket.gaierror:
    host = "UnResolved"

and then re-cloudflared the "superhost.gr" domain

http://superhost.gr/ gives internal server error.
Try catching OSError instead. (As I said, from Python 3.3,
socket.gaierror is a subclass of it.)

At least CloudFlare doesn't give me issues:

if i try this:

try:
    host = os.environ['REMOTE_ADDR'][0]
except socket.gaierror:
    host = "UnResolved"

then i get no errors and a valid ip back

but the above fails.

I don't know how to catch the exception with OSError.

I know only these two:

except socket.gaierror:
except socket.herror

both fail.
 
Νίκος Γκρ33κ

-------- Original Message --------
Subject: Re: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
Date: Thu, 04 Jul 2013 14:34:42 +0100
From: MRAB (e-mail address removed)
Reply-To: (e-mail address removed)
To: (e-mail address removed)
Newsgroups: comp.lang.python
On 4/7/2013 3:07 PM, MRAB wrote:
Also, try printing out ascii(os.environ['REMOTE_ADDR']).

'108.162.229.97' is the result of:

print( ascii(os.environ['REMOTE_ADDR']) )

Seems perfectly valid, and it also has a PTR record, so that leaves us
clueless about the internal server error.
For me, socket.gethostbyaddr('108.162.229.97') raises socket.herror,
which is also a subclass of OSError from Python 3.3 onwards.

Tell me how i should write the try/except please.
 
MRAB

On 4/7/2013 4:34 PM, MRAB wrote:
On 4/7/2013 3:07 PM, MRAB wrote:
On 04/07/2013 12:36, Νίκος wrote:
On 4/7/2013 2:06 PM, MRAB wrote:
On 04/07/2013 11:38, Νίκος wrote:
On 4/7/2013 12:50 PM, Ulrich Eckhardt wrote:
On 04.07.2013 10:37, Νίκος wrote:
I just started to have this error without changing nothing

Well, undo the nothing that you didn't change. ;)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb6 in position 0: invalid start byte
[Thu Jul 04 11:35:14 2013] [error] [client 108.162.229.97] Premature end of script headers: metrites.py

Why can't it decode the starting byte? What starting byte is that?

It's the 0xb6 but it's expecting the starting byte of a UTF-8
sequence.
Please do some research on UTF-8, that should clear it up. You could
also search for common causes of that error.

So you are also suggesting that what gethostbyaddr() returns is not
UTF-8 encoded either?

What character is 0xb6 anyways?

Well, it's from a bytestring, so you'll have to specify what encoding
you're using! (It clearly isn't UTF-8.)

If it's ISO-8859-7 (what you've previously referred to as "greek-iso"), then:

>>> import unicodedata
>>> unicodedata.name(b"\xb6".decode("ISO-8859-7"))
'GREEK CAPITAL LETTER ALPHA WITH TONOS'

You'll need to find out where that bytestring is coming from.

Right.
But nowhere in my script (metrites.py) do I use an 'Ά', so I really have no
clue where this is coming from.

And you are right: if the byte had come from a UTF-8 encoding scheme,
it would have been decoded automatically.

The only thing I can say for sure is that this problem appears only
when I cloudflare my domain "superhost.gr".

If I un-cloudflare it, it ceases to display errors.

Can you tell me how to write the following properly:

host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0] or 'UnResolved'

so that even if the function fails, "UnResolved" is returned?
Somehow I need to capture the error.

Or does it not have to, because the "or" operand will be returned?

If gethostbyaddr fails, it raises socket.gaierror, (which, from Python
3.3 onwards, is a subclass of OSError), so try catching that, setting
'host' to 'UnResolved' if it's raised.

Also, try printing out ascii(os.environ['REMOTE_ADDR']).


I have followed your suggestion by trying this:

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except socket.gaierror:
    host = "UnResolved"

and then re-cloudflared the "superhost.gr" domain

http://superhost.gr/ gives internal server error.
Try catching OSError instead. (As I said, from Python 3.3,
socket.gaierror is a subclass of it.)

At least CloudFlare doesn't give me issues:

if i try this:

try:
    host = os.environ['REMOTE_ADDR'][0]
except socket.gaierror:
    host = "UnResolved"
It's pointless trying to catch a socket exception here because you're
not using a socket, you're just getting a string from an environment
variable.
then i get no errors and a valid ip back

but the above fails.

I don't know how to catch the exception with OSError.

I know only these two:

except socket.gaierror:
except socket.herror

both fail.
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!
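
A minimal sketch of what "just write except OSError" amounts to, assuming Python 3.3 or later. The `resolve_host` helper and its `lookup` parameter are hypothetical names introduced here so the fallback can be exercised without touching the network; they are not from the original script.

```python
import socket

def resolve_host(addr, lookup=socket.gethostbyaddr):
    """Return the primary host name for addr, or "UnResolved" on failure."""
    try:
        return lookup(addr)[0]
    except OSError:
        # Covers socket.herror and socket.gaierror (both subclass OSError).
        return "UnResolved"

# In the CGI script this would be called with the client address, e.g.:
#     host = resolve_host(os.environ['REMOTE_ADDR'])
```

Note that UnicodeDecodeError is a subclass of ValueError, not of OSError, so this clause on its own would not catch a decoding failure like the one in the thread title.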
 
Νίκος Γκρ33κ

On 4/7/2013 6:10 PM, MRAB wrote:
What do you mean "I don't know how to catch the exception with
OSError"? You've tried "except socket.gaierror" and "except
socket.herror", well just write "except OSError" instead!


try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except OSError:
    host = "UnResolved"

This also produces an internal server error.

Are you sure it is just except OSError?

It seems very general...
 
Dennis Lee Bieber

What character is 0xb6 anyways?
It depends on the encoding... In EBCDIC it's unassigned. It's a
paragraph mark in ISO-Latin-1 (ISO-8859-1). Apparently also a paragraph
mark in ISO-Latin-9 (ISO-8859-15).

If it is valid in UTF-8, I can't find a reference. It's not a prefix
for a multi-byte character, which implies that the previous byte should
have been something in prefix or another extended byte entry...
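
The points about 0xb6 can be checked directly. This is a small sketch using only standard codecs: the byte maps to different characters in different single-byte encodings, and in UTF-8 its bit pattern marks it as a continuation byte, which is why it cannot start a character.

```python
import unicodedata

raw = b"\xb6"

# The same byte maps to different characters in different single-byte encodings.
for encoding in ("iso-8859-1", "iso-8859-15", "iso-8859-7"):
    char = raw.decode(encoding)
    print(encoding, repr(char), unicodedata.name(char))

# In UTF-8, bytes of the form 0b10xxxxxx are continuation bytes:
# they may only follow a lead byte and can never start a character.
is_continuation = (raw[0] & 0b11000000) == 0b10000000
print("continuation byte:", is_continuation)

try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    print(exc)
```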
 
Ferrous Cranus

On 4/7/2013 11:08 PM, Dennis Lee Bieber wrote:
It depends on the encoding... In EBCDIC it's unassigned. It's a
paragraph mark in ISO-Latin-1 (ISO-8859-1). Apparently also a paragraph
mark in ISO-Latin-9 (ISO-8859-15).

If it is valid in UTF-8, I can't find a reference. It's not a prefix
for a multi-byte character, which implies that the previous byte should
have been something in prefix or another extended byte entry...

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"

Is there a way to write the above so I can print the error returned when
it fails?
 
Lele Gaifax

Ferrous Cranus said:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"

Is there a way to write the above so I can print the error returned when
it fails?

Try something like

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except Exception as e:
    host = "Reverse DNS Failed"
    print(e)

?

ciao, lele.
 
Michael Torrie

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"

Is there a way to write the above so I can print the error returned when
it fails?

Do you know what IP address causes the failure? If so write a little
python program that does the socket.gethostbyaddr and run it on the
command line! Debugging through the CGI interface sucks.
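
Such a command-line test might look like the sketch below. The file name `lookup_test.py`, the `describe_lookup` helper and its `lookup` parameter are hypothetical, chosen only for illustration. The broad `except Exception` is deliberate here because the script exists purely to show whatever goes wrong.

```python
import socket
import sys

def describe_lookup(addr, lookup=socket.gethostbyaddr):
    """Run one reverse lookup and report either the result or the exception."""
    try:
        return "OK: %r" % (lookup(addr),)
    except Exception as exc:  # diagnostic script only: show whatever went wrong
        return "FAILED: %s: %s" % (type(exc).__name__, exc)

if __name__ == "__main__":
    # Usage: python3 lookup_test.py 108.162.229.97
    for addr in sys.argv[1:]:
        print(addr, "->", describe_lookup(addr))
```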

Have you been writing python tests that you can run on the command line
during your development these last weeks?
 
Nobody

So you are also suggesting that what gethostbyaddr() returns is not
UTF-8 encoded either?

The gethostbyaddr() OS function returns a byte string with no specified
encoding. Python 3 will doubtless try to decode that to a character string
using some (probably unspecified) encoding.

Names obtained from DNS should consist entirely of ASCII characters
(gethostbyname shouldn't attempt to decode internationalised names
which use IDN, it should return the raw data).

Names obtained by other means (e.g. /etc/hosts or Active Directory) could
contain anything, but if you use non-ASCII hostnames you're asking for
trouble.
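
To illustrate the "raw data" form of an internationalised name, here is a short sketch using Python's built-in `idna` codec (which implements the older IDNA 2003 rules). The domain is a made-up example; the point is only that what travels through DNS is plain ASCII.

```python
# A non-ASCII label is carried in DNS in its ASCII-compatible "xn--" form.
name = "b\u00fccher.example"     # hypothetical internationalised name
wire = name.encode("idna")       # Python's built-in IDNA codec
print(wire)                      # ASCII-only bytes
print(wire.decode("idna"))       # back to the Unicode form
```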
 
Ferrous Cranus

On 5/7/2013 3:06 AM, Nobody wrote:
The gethostbyaddr() OS function returns a byte string with no specified
encoding. Python 3 will doubtless try to decode that to a character string
using some (probably unspecified) encoding.

I see, but if the function returns a byte string that is not in UTF-8 format, then
how is my script supposed to decode this byte string?

And why does this error happen only when I cloudflare my domain, while when
I un-cloudflare it, reverse DNS lookups are resolved without problem?
So the question is: how come it only fails when I cloudflare the domain?

Also please comment on this:

host = gethostbyaddr(....) or "UnResolved"

This will return the first operand that determines whether the expression is
true or false.

If the function returns a false value, then the second operand is returned.
But if the function raises an exception, will the expression return the second
operand, or will the program fail?

I was under the impression that I could avoid error handling inside
try/excepts by utilizing "or".
 
Νίκος Gr33k

On 5/7/2013 3:06 AM, Nobody wrote:
The gethostbyaddr() OS function returns a byte string with no specified
encoding. Python 3 will doubtless try to decode that to a character string
using some (probably unspecified) encoding.

Names obtained from DNS should consist entirely of ASCII characters
(gethostbyname shouldn't attempt to decode internationalised names
which use IDN, it should return the raw data).

Names obtained by other means (e.g. /etc/hosts or Active Directory) could
contain anything, but if you use non-ASCII hostnames you're asking for
trouble.

Please help, because I just happened to notice that after adding this code:

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except Exception as e:
    host = "Reverse DNS Failed"

every request, instead of being resolved, results in:

Reverse DNS Failed, as you can see here:
http://superhost.gr/?show=log&page=index.html

How can the above code not be able to do reverse DNS any more, so that it falls
back to the failed string?
 
Lele Gaifax

Ferrous Cranus said:
host = gethostbyaddr(....) or "UnResolved"

This will return the first operand that determines whether the expression
is true or false.

If the function returns a false value, then the second operand is returned.
But if the function raises an exception, will the expression return the
second operand, or will the program fail?

I was under the impression that I could avoid error handling inside
try/excepts by utilizing "or".

No, you had the wrong impression. Why don't you simply invoke the Python
interpreter and try things out with that??
Traceback (most recent call last):
Traceback (most recent call last):
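
The interpreter session did not survive here beyond its traceback headers. The following is a sketch of the kind of experiment meant, assuming a division by zero as the failing operand (as the follow-up reply mentions); the function names are made up for illustration. It shows that `or` only supplies a fallback for a falsy result and does nothing about an exception.

```python
def failing():
    return 1 / 0          # raises ZeroDivisionError before "or" is ever consulted

def falsy():
    return ""             # returns normally, but with a falsy value

# "or" supplies the fallback when the left operand is falsy.
print(falsy() or "UnResolved")

# When the left operand raises, the exception propagates past the "or".
try:
    print(failing() or "UnResolved")
except ZeroDivisionError as exc:
    print("exception propagated:", exc)
```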

ciao, lele.
 
Lele Gaifax

Νίκος Gr33k said:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except Exception as e:
    host = "Reverse DNS Failed"

How can the above code not be able to do reverse DNS any more, so that it
falls back to the failed string?

The only way to know is actually printing out the exception, either to
stderr, or better using the logging facility, as I suggested.

FYI, your code above is (almost) exactly equivalent to the simpler

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"

ciao, lele.
 
Νίκος Gr33k

On 5/7/2013 9:55 AM, Lele Gaifax wrote:
No, you had the wrong impression. Why don't you simply invoke the Python
interpreter and try things out with that??

Traceback (most recent call last):

Traceback (most recent call last):

Thank you Lele, I wanted to, but I had no idea how to test it.
Your division by zero is a very smart thing to test!

So it proves that a condition cannot be evaluated as truthy or falsy
if one of the operands raises an exception.
Thank you.
 
Νίκος Gr33k

On 5/7/2013 10:06 AM, Lele Gaifax wrote:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"


Yes, I used to have it like that, until I was looking for ways to make it
hold the error:

except Exception as e:
    print( e )
    host = e

but print( e ), the way I had it, doesn't print out the error;
instead it gives an internal server error in the browser.

I must somehow take a look at the error to understand why every visitor
I have gets UnResolved, but how, since print fails?
 
Benjamin Kaplan

Νίκος Gr33k said:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except Exception as e:
    host = "Reverse DNS Failed"

How can the above code not be able to do reverse DNS any more, so that it
falls back to the failed string?

The only way to know is actually printing out the exception, either to
stderr, or better using the logging facility, as I suggested.

FYI, your code above is (almost) exactly equivalent to the simpler

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"

ciao, lele.

They aren't equivalent. "except Exception" won't catch KeyboardInterrupt or
SystemExit or a few others that you really don't want to catch in a generic
error handler. You should almost never have a bare except.
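
The difference can be seen in a few lines. This is only an illustrative sketch, and `guarded` is a hypothetical helper: `KeyboardInterrupt` and `SystemExit` derive from `BaseException` rather than `Exception`, so `except Exception` lets them through while a bare `except:` would swallow them.

```python
import sys

for exc in (KeyboardInterrupt, SystemExit):
    # Both derive from BaseException directly, so "except Exception" skips them.
    print(exc.__name__, issubclass(exc, Exception))

def guarded(action):
    """Run action(); handle ordinary errors but let exit requests through."""
    try:
        action()
    except Exception as e:
        return "handled " + type(e).__name__
    return "ok"

print(guarded(lambda: int("not a number")))   # a ValueError is handled

try:
    guarded(sys.exit)                          # SystemExit passes through
except SystemExit:
    print("SystemExit was not swallowed")
```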
 
Dave Angel

On 07/05/2013 02:51 AM, Νίκος Gr33k wrote:

Please help, because I just happened to notice that after adding this code:

try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except Exception as e:
    host = "Reverse DNS Failed"

Don't ever catch a bare Exception class. Make it more specific to the
particular problem you're trying to suppress.

In particular, your previous problem with the utf-8 decoding will also
be caught by the Exception class, so it'll get all lumped together as
"Reverse DNS Failed".
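
One way to keep the two failure modes apart, as a sketch rather than a drop-in fix: give each exception class its own clause so a decoding problem is not reported as a DNS failure. The `classify_lookup` name, its `lookup` parameter and the second message string are assumptions made for illustration.

```python
import socket

def classify_lookup(addr, lookup=socket.gethostbyaddr):
    """Return the host name, or a message naming which kind of failure occurred."""
    try:
        return lookup(addr)[0]
    except OSError:
        # Resolver failures: socket.herror, socket.gaierror (Python 3.3+).
        return "Reverse DNS Failed"
    except UnicodeDecodeError:
        # The returned name could not be decoded; a different problem entirely.
        return "Host name not decodable"
```
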
every request, instead of being resolved, results in:

Reverse DNS Failed, as you can see here:
http://superhost.gr/?show=log&page=index.html

How can the above code not be able to do reverse DNS any more, so that it falls
back to the failed string?

Since you've not made any progress with all the other suggestions, how
about if you decompose this line into several, and see just which one is
failing? Maybe that'll tell you what's going on. In general,
suppressing an exception without knowing why it's firing is a huge mistake.

The line started as:
host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

refactor that to:
remadd = os.environ('REMOVE_ADDR')
tuple3 = socket.gethostbyaddr(remadd)
host = tuple3[0]

and see which one throws the exception. Then once you have that,
examine the exact parameters that might be triggering the problem. In
particular, figure out the exact types and values for remadd and tuple3.

print(type(remadd) + " : " + repr(remadd))

Of course, print itself won't work too well in a CGI environment. But
you must have solved that problem by now, either using log files or
running the program excerpt in a regular console.
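
A runnable sketch of this decomposition follows. It assumes the intended lookup is `os.environ['REMOTE_ADDR']` (a subscript, not a call) and wraps `type()` in `str()` so the concatenation works. The `trace_lookup` name and its parameters are hypothetical. It writes to stderr on the assumption that the web server records a CGI script's stderr in its error log.

```python
import os
import socket
import sys

def trace_lookup(environ=os.environ, lookup=socket.gethostbyaddr, out=sys.stderr):
    """Run each step separately, reporting type and value after each one."""
    remadd = environ["REMOTE_ADDR"]
    print(str(type(remadd)) + " : " + repr(remadd), file=out)
    tuple3 = lookup(remadd)
    print(str(type(tuple3)) + " : " + repr(tuple3), file=out)
    return tuple3[0]
```

Because nothing is caught here, the traceback of whichever step fails names the exact line.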
 
Νίκος Gr33k

On 5/7/2013 10:50 AM, Dave Angel wrote:
The line started as:
host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]

refactor that to:
remadd = os.environ('REMOVE_ADDR')
tuple3 = socket.gethostbyaddr(remadd)
host = tuple3[0]

and see which one throws the exception. Then once you have that,
examine the exact parameters that might be triggering the problem. In
particular, figure out the exact types and values for remadd and tuple3.

print(type(remadd) + " : " + repr(remadd))

I'm not sure how I'm supposed to write this. I just tried this:

try:
    remadd = os.environ('REMOVE_ADDR')
    tuple3 = socket.gethostbyaddr(remadd)
    host = tuple3[0]
except:
    host = type(remadd) + " : " + repr(remadd)

but I'm getting an internal server error.

I didn't print it as you said, but it's the same thing: the host var gets
printed later on.

Now, why would this give an internal server error?
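
One likely explanation, sketched under the assumption that the script runs exactly as posted: `os.environ` is a mapping, so calling it raises TypeError. The bare `except` catches that, but `remadd` was never assigned, so the handler itself fails with a NameError (and adding a type object to a string would raise TypeError in any case). If the script dies with an uncaught exception before writing its headers, the server reports an internal server error.

```python
import os

# 1. os.environ is a mapping: it is subscripted, not called.
try:
    os.environ("REMOTE_ADDR")
except TypeError as exc:
    print("calling os.environ:", exc)

# 2. A name assigned inside a failed "try" body never gets bound,
#    so using it in the handler raises NameError.
def handler_uses_unbound_name():
    try:
        value = os.environ("REMOTE_ADDR")
    except Exception:
        return repr(value)  # UnboundLocalError, a subclass of NameError

try:
    handler_uses_unbound_name()
except NameError as exc:
    print("inside the handler:", exc)

# 3. A type object cannot be concatenated with a str; wrap it in str() first.
try:
    type("x") + " : " + repr("x")
except TypeError as exc:
    print("type + str:", exc)
print(str(type("x")) + " : " + repr("x"))
```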
 
Dave Angel

On 5/7/2013 10:06 AM, Lele Gaifax wrote:
try:
    host = socket.gethostbyaddr( os.environ['REMOTE_ADDR'] )[0]
except:
    host = "Reverse DNS Failed"


Yes, I used to have it like that, until I was looking for ways to make it
hold the error:

except Exception as e:
    print( e )
    host = e

but print( e ), the way I had it, doesn't print out the error;
instead it gives an internal server error in the browser.

I must somehow take a look at the error to understand why every visitor
I have gets UnResolved, but how, since print fails?


How have you been doing it all along? Just open a console onto that
server, start the appropriate version of Python interactively, and try
the things we've been talking about. If it fails the same way as within
the cgi environment, you get full visibility.

Or if the problems cannot be recreated outside the cgi environment, use
the log files you've been logging other problems into. Or simply open a
text file for writing, and add a file= keyword parameter to the print
function call.

print(repr(e), file=myfile)
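
Put together, the text-file approach might look like the sketch below. The `lookup_with_log` name, its parameters and the choice to also write the full traceback are assumptions for illustration. The log path is supplied by the caller and must be writable by the account the web server runs the script under.

```python
import socket
import traceback

def lookup_with_log(addr, log_path, lookup=socket.gethostbyaddr):
    """Return the host name, or None after appending the failure to log_path."""
    try:
        return lookup(addr)[0]
    except Exception as e:
        # Append, so earlier failures are kept for comparison.
        with open(log_path, "a", encoding="utf-8") as myfile:
            print(repr(addr), repr(e), file=myfile)
            traceback.print_exc(file=myfile)
        return None
```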
 
