inconsistency in converting from/to hex

Laszlo Nagy

We can convert from a hex str to bytes with the bytes.fromhex class method:

But we cannot convert from hex binary:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: must be str, not bytes

We don't have a bytes_instance.tohex() instance method.
We do have binascii.hexlify, but it does not return a str.
It returns a bytes instance instead:
b'ff'

Its reverse function binascii.unhexlify can be used on str and bytes too:
b'\xff'
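
The behaviours described above can be reproduced in one short script. This is only a minimal sketch, assuming Python 3.3 or later; the variable names are illustrative and not from the original post. Whether bytes.fromhex rejects a bytes argument depends on the interpreter version, so the script records the result instead of assuming it.

```python
import binascii

# str -> bytes works with the class method.
from_str = bytes.fromhex("ff")

# Passing bytes raised TypeError on the Python versions discussed in this
# thread. Newer releases may behave differently, so record what happens.
try:
    bytes.fromhex(b"ff")
    fromhex_accepts_bytes = True
except TypeError:
    fromhex_accepts_bytes = False

# binascii.hexlify takes bytes and returns bytes, not str.
hexed = binascii.hexlify(b"\xff")

# binascii.unhexlify accepts both bytes and ASCII-only str.
unhexed_from_bytes = binascii.unhexlify(b"ff")
unhexed_from_str = binascii.unhexlify("ff")

print(from_str, fromhex_accepts_bytes, hexed,
      unhexed_from_bytes, unhexed_from_str)
```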

Questions:

* if we have bytes.fromhex() then why don't we have bytes_instance.tohex() ?
* if the purpose of binascii.unhexlify and bytes.fromhex is the same,
then why allow binary arguments for the former, and not for the latter?
* in this case, should there be "one obvious way to do it" or not?
 
Ned Batchelder

The standard library is not always as consistent as we might like. I don't think there is a better answer than that.

This will work if you want to use fromhex with bytes:

b = bytes.fromhex(b"ff".decode("ascii"))
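
If this comes up often, the decode step can be wrapped in a small function. The sketch below is only an illustration: fromhex_any is a hypothetical name, not something in the standard library, and it assumes the input holds nothing but ASCII hex digits.

```python
def fromhex_any(hex_digits):
    """Hypothetical helper: accept hex digits as either str or bytes."""
    if isinstance(hex_digits, (bytes, bytearray)):
        # Hex digits are plain ASCII, so decoding them loses nothing.
        hex_digits = hex_digits.decode("ascii")
    return bytes.fromhex(hex_digits)


print(fromhex_any("ff"))
print(fromhex_any(b"ff"))
```

Decoding first keeps bytes.fromhex as the single conversion routine, so both input types go through the same validation.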


--Ned.
 
Steven D'Aprano

Questions:

* if we have bytes.fromhex() then why don't we have
bytes_instance.tohex() ?

The Python core developers are quite conservative about adding new
methods, particularly when there is already a solution to the given
problem. bytes.fromhex is very useful, because when working with binary
data it is common to give data as strings of hex values, and so it is
good to have a built-in method for it:

image = bytes.fromhex('ffd8ffe000104a464946000101 ...')

On the other hand, converting bytes to hexadecimal values is less common.
There are already at least two ways to do it in Python 2:

py> import binascii
py> binascii.hexlify('Python')
'507974686f6e'

py> import codecs
py> codecs.encode('Python', 'hex')
'507974686f6e'

[Aside: in Python 3, the codecs were (mistakenly) removed, but they'll
be added back in 3.4 or 3.5.]

So I can only imagine that had somebody proposed a bytes.tohex() method,
they would have been told "there's already a way to do that, this isn't
important enough to justify being built-in".
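
For comparison, here is a rough Python 3 sketch of the same bytes-to-hex conversion, since the examples above are Python 2. It assumes a hex str is wanted at the end, which is why the hexlify result is decoded. Later releases (Python 3.5 and up) also provide a bytes.hex() method; the sketch checks for it rather than relying on it.

```python
import binascii

data = b"Python"

# hexlify returns bytes in Python 3, so decode to get a str.
hex_text = binascii.hexlify(data).decode("ascii")

# Round trip back through the built-in class method.
round_trip = bytes.fromhex(hex_text)

# Assumption: bytes.hex() only exists on Python 3.5+, so test for it first.
if hasattr(data, "hex"):
    assert data.hex() == hex_text

print(hex_text, round_trip)
```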

* if the purpose of binascii.unhexlify and bytes.fromhex is the same,
then why allow binary arguments for the former, and not for the latter?

I would argue that the purpose is *not* the same. binascii is for working
with binary files, hence it accepts bytes and produces bytes.
bytes.fromhex is for producing bytes from strings.

It's an exceedingly narrow distinction, and I can understand anyone who
is not convinced by my argument. I'm only half-convinced myself.

* in this case, should there be "one obvious way to do it" or not?

Define "it". Do you mean "convert bytes to bytes", "bytes to str", "str
to bytes", or "str to str"?
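
To make the four readings concrete, here is one possible Python 3 sketch of each, all in the "to hex" or "from hex" direction. The choice of function for each case is an assumption for illustration, and so is UTF-8 as the text encoding in the last case; the thread does not name one.

```python
import binascii

raw = b"Python"

# bytes -> bytes: binascii.hexlify keeps everything in bytes.
bytes_to_bytes = binascii.hexlify(raw)

# bytes -> str: decode the hexlify result.
bytes_to_str = binascii.hexlify(raw).decode("ascii")

# str -> bytes: bytes.fromhex turns hex text into raw bytes.
str_to_bytes = bytes.fromhex("507974686f6e")

# str -> str: encode the text first (UTF-8 assumed), then hexlify and decode.
str_to_str = binascii.hexlify("Python".encode("utf-8")).decode("ascii")

print(bytes_to_bytes, bytes_to_str, str_to_bytes, str_to_str)
```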

Besides, one *obvious* way is not the same as *only one* way.

I agree that it's a bit of a mess. But only a little bit, and it will be
less messy by 3.5 when the codecs solution is re-introduced. Then the
codecs.encode and decode functions will be the one obvious way.
 
Serhiy Storchaka

17.11.13 08:31, Steven D'Aprano wrote:
There are already at least two ways to do it in Python 2:

py> import binascii
py> binascii.hexlify('Python')
'507974686f6e'

py> import codecs
py> codecs.encode('Python', 'hex')
'507974686f6e'
Third:
b'507974686F6E'

Fourth:
b'507974686F6E'

Fifth:
b'507974686F6E'
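
The commands that produced the three outputs above did not survive in this copy of the thread. As a hedged illustration only, here are two standard-library routes that give the same upper-case result in Python 3. They are assumptions, not necessarily what was originally posted.

```python
import base64

data = b"Python"

# Base16 encoding from the base64 module produces upper-case hex digits.
upper_b16 = base64.b16encode(data)

# Iterating over bytes yields ints in Python 3, so each one can be
# formatted as two upper-case hex digits and the pieces joined.
upper_fmt = "".join("%02X" % byte for byte in data).encode("ascii")

print(upper_b16, upper_fmt)
```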

[Aside: in Python 3, the codecs were (mistakenly) removed, but they'll
be added back in 3.4 or 3.5.]

Only renamed.
b'507974686f6e'
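
The command behind this output is missing here as well. Given the "renamed" remark, a plausible candidate is the codec's longer name, hex_codec; treat that as an assumption. A minimal sketch for Python 3.2 or later:

```python
import codecs

# The bytes-to-bytes hex codec is reachable under the name "hex_codec".
encoded = codecs.encode(b"Python", "hex_codec")

# The same codec name works in the other direction.
decoded = codecs.decode(encoded, "hex_codec")

print(encoded, decoded)
```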
 
