Lie Ryan
Isn't this creating a regular byte?
Shouldn't creation of bytearray be:
>>> b = bytearray(b'abc')
>>> b[0]
97
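A minimal sketch of the distinction being asked about, assuming Python 2.6 or later (where `bytearray` exists). The variable names are illustrative only:

```python
# A b'...' literal is not a bytearray; bytearray() builds a mutable array from it.
literal = b'abc'
ba = bytearray(literal)

first = ba[0]        # indexing a bytearray yields a small integer
ba[0] = ord('z')     # bytearray elements can be reassigned in place

print(first)         # 97
print(ba)            # bytearray(b'zbc')
```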
John said:
With "bytearray", the element type is considered to be "unsigned byte", or so says PEP 3137: "The element data type is always 'B' (i.e. unsigned byte)."
Let's try:
Python 2.6.1 (r261:67517, Dec 4 2008, 16:51:00) [MSC v.1500 32 bit (Intel)] on win32
>>> xx = b'x'
>>> repr(xx)
"'x'"
>>> repr(xx[0])
"'x'"
>>> repr(xx[0][0])
"'x'"
But that's not what "repr" indicates. The bytearray element is apparently being promoted to "bytes" as soon as it comes out of the array.

Erik said:
There's no distinct byte type. A single character of a bytes type is also a bytes.
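For comparison, a short sketch of how the same expressions behave under Python 3, where indexing a bytes object yields an integer and only a slice yields bytes. This assumes a Python 3 interpreter; under 2.6 the results are the strings shown in the session above.

```python
xx = b'x'

element = xx[0]      # Python 3: an int, not a length-1 bytes object
piece = xx[0:1]      # slicing yields a bytes object

print(repr(element))  # 120
print(repr(piece))    # b'x'

# Because element is an int, indexing it again is an error in Python 3.
try:
    element[0]
except TypeError as exc:
    print("TypeError:", exc)
```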
Steve Holden said:
Beware, also, that in 2.6 the "bytes" type is essentially an ugly hack to enable easier forward compatibility with the 3.X series ...
Benjamin said:
It's not an ugly hack. It just isn't all that you might hope it'd live up to be.

I take it back
John Nagle said:
Benjamin said:
It's not an ugly hack. It just isn't all that you might hope it'd live up to be.

The semantics aren't what the naive user would expect. One would expect an element of a bytearray to be a small integer. But instead, it has string-like behavior. "+" means concatenate, not add. The bit operators don't work at all.
Python 2.6.1 ...
>>> a = b'A'
>>> b = b'B'
>>> a+b
'AB'
>>> a[0]+b[0]
'AB'
>>> a & b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for &: 'str' and 'str'
Given that the intent of bytearray is that it's a data type for handling raw binary data of unknown format, one might expect it to behave like "array.array('B')", which is an array of unsigned bytes that are treated as integers. But that's not how "bytearray" works. "bytearray" is more like the old meaning of "str", before Unicode support, circa Python 2.1.
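A small sketch of the comparison with array.array('B'), assuming Python 2.6 or later. It also indexes an actual bytearray object (rather than a b'...' literal) so the two can be checked side by side; the variable names are illustrative.

```python
from array import array

raw = array('B', [0x41, 0x42])   # unsigned bytes 'A' and 'B'
ba = bytearray(b'AB')            # a real bytearray, not a bytes literal

# Elements of both containers are integers, so arithmetic and bit
# operators apply to the element values themselves.
arr_sum = raw[0] + raw[1]        # 65 + 66
arr_and = raw[0] & raw[1]        # 0x41 & 0x42
ba_sum = ba[0] + ba[1]
ba_and = ba[0] & ba[1]

print(arr_sum, arr_and)          # 131 64
print(ba_sum, ba_and)            # 131 64
```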
>>> type(b'x')
<type 'str'>
>>> b'x'[0]
'x'

>>> type(b'x')
<class 'bytes'>
>>> b'x'[0]
120
Because b'x' is NOT a bytearray. It is a bytes object. When you actually use a bytearray, it behaves like you expect.
>>> type(b'x')
>>> ba = bytearray(b'abc')
>>> ba[0] + ba[1]
195
John said:
Benjamin Kaplan wrote:
Because b'x' is NOT a bytearray. It is a bytes object. When you actually use a bytearray, it behaves like you expect.
>>> type(b'x')
>>> type(bytearray(b'x'))
>>> ba = bytearray(b'abc')
>>> ba[0] + ba[1]
195

That's indeed how Python 2.6 works. But that's not how
PEP 3137 says it's supposed to work.
Guido:
"I propose the following type names at the Python level:
* bytes is an immutable array of bytes (PyString)
* bytearray is a mutable array of bytes (PyBytes)"
...
"Indexing bytes and bytearray returns small ints (like the bytes type in
3.0a1, and like lists or array.array('B'))."
(Not true in Python 2.6 - indexing a "bytes" object returns a "bytes" object with length 1.)
"b1 + b2: concatenation. With mixed bytes/bytearray operands, the return type is that of the first argument (this seems arbitrary until you consider how += works)."
(Not true in Python 2.6 - concatenation returns a bytearray in both cases.)
Is this a bug, a feature, a documentation error, or bad design?
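The PEP 3137 rule for mixed concatenation quoted above can be checked with a short sketch. This assumes a Python 3 interpreter; as noted in the post, 2.6 returns a bytearray in both cases.

```python
left_bytes = b'ab' + bytearray(b'cd')     # bytes operand first
left_array = bytearray(b'ab') + b'cd'     # bytearray operand first

# Under Python 3 the result takes the type of the first operand.
print(type(left_bytes).__name__)   # bytes
print(type(left_array).__name__)   # bytearray
```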
Steve said:
It's a feature. In fact all that was done to accommodate easier migration to 3.x is easily shown in one statement:
True
So that's why bytes works the way it does in 2.6 ... hence my contested
description of it as an "ugly hack". I am happy to withdraw "ugly", but
I think "hack" could still be held to apply.
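The statement itself did not survive in this copy of the thread. As an assumption, a check of this kind would be an identity test between the two type names; the sketch below runs one and prints the interpreter's major version alongside the result, since the answer depends on which Python runs it.

```python
import sys

# Identity test between the two built-in type names (an assumed
# reconstruction, not the original statement from the post).
# Under Python 2.6/2.7, bytes is simply another name for str;
# under Python 3 they are distinct types.
same_type = bytes is str

print(sys.version_info[0], same_type)
```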
Agreed. But is this a 2.6 thing, making 2.6 incompatible with 3.0, or
what? How will 3.x do it? The PEP 3137 way, or the Python 2.6 way?
The way it works in 2.6 makes it necessary to do "ord" conversions
where they shouldn't be required.
John Nagle
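One way to keep the "ord" conversion in a single place is a small helper that accepts whatever indexing returns. This is only a sketch: `byte_at` is a hypothetical name, not part of any library, and it assumes the input is a bytes, str-as-bytes (2.6), or bytearray object.

```python
def byte_at(data, index):
    """Return the integer value of data[index].

    byte_at is a hypothetical helper written for illustration.
    """
    value = data[index]
    # Python 3 bytes and any bytearray already yield an int here;
    # Python 2.6 bytes (an alias of str) yields a length-1 string.
    if isinstance(value, int):
        return value
    return ord(value)


print(byte_at(b'abc', 0))              # 97
print(byte_at(bytearray(b'abc'), 1))   # 98
```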
Steve said:
John said:
Agreed. But is this a 2.6 thing, making 2.6 incompatible with 3.0, or what? How will 3.x do it? The PEP 3137 way, or the Python 2.6 way?
The way it works in 2.6 makes it necessary to do "ord" conversions where they shouldn't be required.

Yes, the hack was to achieve a modicum of compatibility with 3.0 without having to turn the world upside down.
I haven't used 3.0 enough to say whether bytearray has been correctly implemented. But I believe the intention is that 3.0 should fully implement PEP 3137.
regards
Steve
Steve said:
Yes, the hack was to achieve a modicum of compatibility with 3.0 without having to turn the world upside down.
I haven't used 3.0 enough to say whether bytearray has been correctly implemented. But I believe the intention is that 3.0 should fully implement PEP 3137.

If "bytes", a new keyword, works differently in 2.6 and 3.0, that was really
dumb. There's no old code using "bytes". So converting code to 2.6 means
it has to be converted AGAIN for 3.0. That's a good reason to ignore 2.6 as
defective.
John Nagle
Christian Heimes said (replying to John Nagle):
Please don't call something dumb that you don't fully understand. It offends the people who have spent lots of time developing Python -- personal, unpaid and voluntary time!
I can assure you, the bytes alias and b'' alias have their right to exist.
Hendrik van Rooyen said (replying to "Christian Heimes" <lis....s.de>):
This is not a helpful response - on the surface JN has a point - if you have to go through two conversions, then 2.6 does not achieve what it appears to set out to do. So the issue is simple:
- do you have to convert twice?
- If yes - why? - as he says - there exists no prior code, so there seems to be no reason not to make it identical to 3.0
Crying out "Please do not criticise me, I am doing it for free!" does not justify delivering substandard work - that is the nature of the open source process - if you lift your head and say or do something, there are bound to be some objections - some thoughtful and valid, and others merely carping. Being sensitive about it serves no purpose.
Martin said:
Hendrik said:
This is not a helpful response - on the surface JN has a point - if you have to go through two conversions, then 2.6 does not achieve what it appears to set out to do. So the issue is simple:

Still, John *clearly* doesn't understand what he observes, so asking him not to draw conclusions until he does understand is not defending against criticism.

Hendrik said:
- do you have to convert twice?

Depends on how you write your code. If you use the bytearray type (which John didn't, despite his apparent belief that he did), then no additional conversion is needed.
Likewise, if you only use byte (not bytearray) literals, without accessing individual bytes (e.g. if you only ever read and write them, or pass them to the struct module), 2to3 will do the right thing.

Hendrik said:
- If yes - why? - as he says - there exists no prior code, so there seems to be no reason not to make it identical to 3.0

Sure there is. Making the bytes type and the str type identical in 2.x gives the easiest way of porting. Adding bytes as a separate type would have complicated a lot of things.
Regards,
Martin
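The case Martin describes, where byte strings are only handed to the struct module and never indexed, might look like the sketch below. The format string and values are arbitrary examples chosen for illustration; the code assumes Python 2.6 or later.

```python
import struct

# Pack a little-endian unsigned short and an unsigned byte, then unpack them.
# The byte string is passed around whole and never indexed, so the
# str-versus-int element question does not come up.
packed = struct.pack('<HB', 513, 3)
fields = struct.unpack('<HB', packed)

print(packed == b'\x01\x02\x03')   # True
print(fields)                      # (513, 3)
```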
According to PEP 3137, there should be no distinction between
the two for read purposes. In 2.6, there is. That's a bug.
No, it's broken. PEP 3137 says one thing, and the 2.6 implementation
does something else. So code written for 2.6 won't be ready for 3.0.
This defeats the supposed point of 2.6.
2009/2/24 John Nagle said:
Some of the people involved are on Google's payroll.