hex(-5) => FutureWarning: ugh, can't we have a better hex than '-'[:n<0]+hex(abs(n)) ??

Bengt Richter

>>> hex(-5)
__main__:1: FutureWarning: hex()/oct() of negative int will return a signed string in Python 2.4 and up
'0xfffffffb'
>>> hex(-5L)
'-0x5L'

That is sooo ugly. I suppose it is for a backwards compatible repr, but couldn't we
at least have hex(n, newformat=False) so that we can do

hex(-5, True) => 1xb # 1x signals an arbitrary number of prefixed f's
hex( 5, True) => 0x5

and have int() and long() recognize these?

Also would need a variant of %x and %X for formatting with the % operator.

Regards,
Bengt Richter
 
Michael Peuser

Bengt Richter said:
__main__:1: FutureWarning: hex()/oct() of negative int will return a
signed string in Python 2.4 and up '0xfffffffb'

[...]

There is a thread from this morning ("bitwise not ...") - this should be an
excellent contribution!
I have no mercy with someone writing hex(-5)

Kindly
Michael P
 
Freddie

Bengt Richter said:
__main__:1: FutureWarning: hex()/oct() of negative int will return a
signed string in Python 2.4 and up '0xfffffffb'

[...]

There is a thread from this morning ("bitwise not ...") - this should be an
excellent contribution!
I have no mercy with someone writing hex(-5)

Kindly
Michael P

What about crazy people like myself? If you generate a crc32 value with zlib,
you occasionally get a negative number returned. If you try to convert that
to hex (to test against a stored CRC32 value), it spits out a FutureWarning
at me. So you end up with silly things like this in your code:


# Disable FutureWarning, since it whinges about us making bad hex values :(
import warnings
try:
    warnings.filterwarnings(action='ignore', category=FutureWarning)
except NameError:
    del warnings
 
Juha Autero

Freddie said:
What about crazy people like myself? If you generate a crc32 value with zlib,
you occasionally get a negative number returned. If you try to convert that
to hex (to test against a stored CRC32 value), it spits out a FutureWarning
at me.

Read the thread about bitwise not. Tell Python how many bits you
want. In case of CRC32 that is of course 32 bits:
hex(-5&2**32-1)
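
A minimal sketch of the masking advice above, written for a current Python 3 interpreter (an assumption; the thread itself is about 2.3). MASK32 and crc32_hex are illustrative names, not part of zlib. In Python 3, zlib.crc32 already returns an unsigned value, so the mask is harmless there and only matters for older interpreters that could return a negative int.

```python
import zlib

MASK32 = 2**32 - 1  # 32 one-bits, i.e. 0xffffffff


def crc32_hex(data):
    """Return the CRC32 of a bytes object as 8 lowercase hex digits."""
    # Masking forces the value into 0..2**32-1, so no sign can appear.
    return "%08x" % (zlib.crc32(data) & MASK32)


print(hex(-5 & MASK32))         # 0xfffffffb
print(crc32_hex(b"123456789"))  # cbf43926
```

Comparing against a stored CRC then becomes a plain string comparison of two 8-digit values.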

Two questions: What is the best way to generate bitmask of n bits all
ones? And would somebody explain why hexadecimal (and octal) literals
behave differently from decimal literals? (see:
http://www.python.org/doc/current/ref/integers.html ) Why are hexadecimal
literals from 0x80000000 to 0xffffffff interpreted as negative
numbers instead of being converted to long integers?
 
Jeff Epler

Two questions: What is the best way to generate bitmask of n bits all
ones?

def ones(n):
    r = (1L << n) - 1
    try:
        r = int(r)
    except OverflowError:
        pass
    return r

does this do what you want? It gives these results:

# 2.3b1 (old, but should have 2.3's long vs int quirks)
... print "%2s %22s %22s" % (i, `ones(i)`, hex(ones(i)))
...
 0                      0                    0x0
 1                      1                    0x1
 2                      3                    0x3
 3                      7                    0x7
31             2147483647             0x7fffffff
32            4294967295L            0xFFFFFFFFL
33            8589934591L           0x1FFFFFFFFL
63   9223372036854775807L    0x7FFFFFFFFFFFFFFFL
64  18446744073709551615L    0xFFFFFFFFFFFFFFFFL
65  36893488147419103231L   0x1FFFFFFFFFFFFFFFFL

# 2.2.2
... print "%2s %22s %22s" % (i, `ones(i)`, hex(ones(i)))
...
 0                      0                    0x0
 1                      1                    0x1
 2                      3                    0x3
 3                      7                    0x7
31             2147483647             0x7fffffff
32            4294967295L            0xFFFFFFFFL
33            8589934591L           0x1FFFFFFFFL
63   9223372036854775807L    0x7FFFFFFFFFFFFFFFL
64  18446744073709551615L    0xFFFFFFFFFFFFFFFFL
65  36893488147419103231L   0x1FFFFFFFFFFFFFFFFL
 
Bengt Richter

Michael Peuser said:
Why do you object to 2**n-1? This is just fine I think.


Most of all this has practical reasons because of the use most programmers
have for stating hexadecimal literals.

Of course some hex literals are not interpreted as negative numbers but as
memory contents, because it has become indistinguishable what the origin had
been.

One will not expect
print int(0xffffffff )
do something different from
x=0xffffffff
print int(x)
Unfortunately, the path to unification of integers to hardware-width independence
has backwards compatibility problems. I guess they are worse than for true division,
but has anyone really attempted to get a measure of them?

The options for hex representation seem to be (in terms of regexes)

1) signed standard: [+-]?0x[0-9a-fA-F]+
2) unprefixed standard: [0-9a-fA-F]+
which are produced by hex() and '%x' and '%X'
and interpreted by int(x, 16)
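
The two patterns above can be tried directly. This is a small sketch for a current Python 3 interpreter (an assumption); classify is a hypothetical helper name used only for illustration.

```python
import re

SIGNED_PREFIXED = re.compile(r"[+-]?0x[0-9a-fA-F]+")  # option 1
UNPREFIXED = re.compile(r"[0-9a-fA-F]+")              # option 2


def classify(text):
    """Say which of the two listed hex formats the whole string matches."""
    if SIGNED_PREFIXED.fullmatch(text):
        return "signed standard"
    if UNPREFIXED.fullmatch(text):
        return "unprefixed standard"
    return "neither"


for text in ("-0x5", "0xfffffffb", "fffffffb"):
    # int(text, 16) accepts both formats, including the sign and 0x prefix.
    print(text, classify(text), int(text, 16))
```

A string such as 1xb matches neither pattern, which is why the proposal further down would need support in int() as well.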

There is a need for a round trip hex representation/interpretation for signed integers
of arbitrary size, but IMO a prefixed '-' does violence to the usual expectation
for hex representation (i.e., a sensible view of the bits involved in a conventional
"two's complement" representation of the number to whatever width required).

I hope it can be avoided as a default, or at a minimum that an alternative will be provided.

For hex literals, the [01]x[0-9a-fA-F]+ variation seems clean (I forgot again who came up with that as
the best emerging alternative in an old thread, but credit is due). Tim liked it too, I believe ;-)

Since hex() will change anyway, how much more breakage will hex(-1) => 1xf create vs => -0x1 ?
Not to harp, but obviously the -0x1 gives no sense of the conventional underlying bit pattern
of ...fffff. (I am talking about an abstract bit pattern that extends infinitely to the left,
not a concrete implementation. Of course it is often useful to know how the abstraction gets
implemented on a particular platform, but that is not the only use for hex. It is also handy
as a human-readable representation of an abstract bit sequence).
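
To make the 1x idea concrete, here is a rough sketch of how such a notation could be written and read back. to_1x and from_1x are hypothetical names; nothing like them exists in Python, and the rule for how many digits to emit (the fewest that still pin down the value, with at least one) is my assumption, chosen so that -5 gives 1xb and -1 gives 1xf as in the examples in this thread.

```python
def to_1x(n):
    """Format n as 0x... if non-negative, or 1x... if negative.

    The 1x prefix stands for an unbounded run of f digits to the left.
    """
    if n >= 0:
        return "0x%x" % n
    k = 1
    while n + 16**k < 0:  # find the fewest digits that can hold n
        k += 1
    return "1x%0*x" % (k, n + 16**k)


def from_1x(text):
    """Inverse of to_1x for strings it produced."""
    prefix, digits = text[:2], text[2:]
    value = int(digits, 16)
    if prefix == "1x":
        value -= 16 ** len(digits)  # account for the implied leading f's
    elif prefix != "0x":
        raise ValueError("expected a 0x or 1x prefix: %r" % text)
    return value


print(to_1x(-5), to_1x(5), to_1x(-1))  # 1xb 0x5 1xf
print(from_1x("1xb"))                  # -5
```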

The other question is what to do with '%x'. The current 2.3 version doesn't seem to pay much
attention to field widths other than as a minimum, so that may offer an opportunity to control
the output. It does not generate an '0x' prefix ( easy for the positive case to specify as
0x%x) yet negatives presumably will prefix a '-'. (Will 0x-abcd be legal??)
What are some other possibilities?

Looking forward to using widths, what would one hope to get from '%2.2x'%-1 ?
I would hope for 'ff', personally ;-) And ' 1' for '%2.2x'%1 and '01' for '%02.2x'%1.

Coming at it from this end, what happens if we drop the max width? What should we get
for '%02x'%1 ? That's easy: '01' as now. But '%02x'%-1 => 'ffffffff' currently, and that has
to change. Apparently to '-1' if things go as they're going? (Again, hexness is lost).
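
For comparison, this is what a present-day Python 3 interpreter prints for the cases raised here. I am stating this from Python 3 behaviour, not from anything decided in this thread. The last line shows the explicit-mask workaround for getting the two's complement digits.

```python
print("%02x" % 1)            # 01
print("%02x" % -1)           # -1
print("0x%x" % -0xabcd)      # 0x-abcd
print(hex(-5))               # -0x5
print("%02x" % (-1 & 0xff))  # ff
```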

A possibility for unrestricted output width would be to print enough hex characters for an
unambiguous interpretation of sign. I.e., so that there are enough bits to include the sign
and an optional number of copies to pad to a full 4-bit hex character as necessary. This would
mean '%02x'%-1 => ff since that gives the first hex character the right sign.

That has the wrong interpretation (if you want to recover the signed value) for
int(('%02x'%-1),16) so that would need a fix for easy use. Although I usually dislike passing
flag info to functions by negating nonzero parameters, it would have mnemonic value in this case.
E.g., int(('%02x'%-1), -16) or the equivalent int('ff', -16) could mean use the leading bit as sign.
This convention would translate nicely to octal and binary string representations as well.

Of course '%02x'%255 could not return 'ff' as an unconstrained-width representation. It
would have to be '0ff' to provide the proper sign. Note that constraining this to a max width
of 2 would give 'ff'. This works for octal and binary too.
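
A sketch of the leading-bit-as-sign convention described above, for hex only. signed_hex and from_signed_hex are hypothetical names; from_signed_hex plays the role of the proposed int(s, -16), which does not exist in Python. Only a minimum width is handled here, not the maximum-width truncation.

```python
def signed_hex(n, min_width=1):
    """Two's complement hex digits of n, using as few digits as keep the sign."""
    k = max(1, min_width)
    # k digits can represent -8*16**(k-1) .. 8*16**(k-1) - 1
    while not (-(16**k // 2) <= n < 16**k // 2):
        k += 1
    return "%0*x" % (k, n & (16**k - 1))


def from_signed_hex(text):
    """Read hex digits, treating the top bit of the first digit as the sign."""
    k = len(text)
    value = int(text, 16)
    if value >= 16**k // 2:
        value -= 16**k
    return value


print(signed_hex(-1, 2), signed_hex(1, 2), signed_hex(255))  # ff 01 0ff
print(from_signed_hex("ff"), from_signed_hex("0ff"))         # -1 255
```

The same scheme should carry over to octal and binary by replacing 16 with 8 or 2.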

IMO this way of avoiding '-' in hex, octal, and binary string formats would make the strings
represent the data more clearly. These formats are mainly to communicate bit patterns IMO,
not just alternative ways to spell integer values.

If we have 1xf as a literal for spelling -0x1, I guess we don't need a literal format
for the leading-bit-as-sign convention. But the latter would be a way of reading and writing
signed arbitrary-width hex without losing the hexness of n the way you would with

'%s%x'%('-'[:n<0],abs(n)) #yech, gak ;-/

Regards,
Bengt Richter
 
Juha Autero

Michael Peuser said:
Why do you object to 2**n-1? This is just fine I think.

Maybe I should have said "What other ways are there to generate a
bitmask of n bits all ones?" 2**n-1 seemed like a hack since it relies
on certain properties of binary numbers, but on the other hand all
bitmask processing relies on certain properties of binary numbers.
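
For what it is worth, here are three equivalent spellings of an n-bit all-ones mask, sketched for a current Python 3 interpreter (an assumption), where there is no int/long split to work around. The function names are made up for the comparison.

```python
def ones_pow(n):
    return 2**n - 1        # arithmetic form


def ones_shift(n):
    return (1 << n) - 1    # shift form, same value


def ones_not(n):
    return ~(~0 << n)      # clear the low n bits of -1, then invert


for n in (0, 1, 31, 32, 64):
    assert ones_pow(n) == ones_shift(n) == ones_not(n)
    print(n, hex(ones_shift(n)))
```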
 
