How to get an integer from a sequence of bytes

Mok-Kong Shen

From an int one can use to_bytes to get its individual bytes,
but how can one reconstruct the int from the sequence of bytes?

Thanks in advance.

M. K. Shen
 
Steven D'Aprano

From an int one can use to_bytes to get its individual bytes, but how
can one reconstruct the int from the sequence of bytes?

Here's one way:

py> n = 11999102937234
py> m = 0
py> for b in n.to_bytes(6, 'big'):
...     m = 256*m + b
...
py> m == n
True
 
Carlos Nepomuceno

----------------------------------------
From: (e-mail address removed)
Subject: Re: How to get an integer from a sequence of bytes
Date: Mon, 27 May 2013 15:00:39 +0000
To: (e-mail address removed)



Here's one way:

py> n = 11999102937234
py> m = 0
py> for b in n.to_bytes(6, 'big'):
...     m = 256*m + b
...
py> m == n
True

Python 2 doesn't have to_bytes()! :(

# Python 2, LSB 1st
def to_lil_bytes(x):
    r = []
    while x != 0:
        r.append(int(x & 0b11111111))
        x >>= 8
    return r

# Python 2, LSB 1st
def from_lil_bytes(l):
    x = 0
    for i in range(len(l)-1, -1, -1):
        x <<= 8
        x |= l[i]
    return x

# Python 2, MSB 1st
def to_big_bytes(x):
    r = []
    while x != 0:
        r.insert(0, int(x & 0b11111111))
        x >>= 8
    return r

# Python 2, MSB 1st
def from_big_bytes(l):
    x = 0
    for i in range(len(l)):
        x <<= 8
        x |= l[i]
    return x

Can it be faster?
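
One option that may be worth timing (a sketch, not a benchmarked answer) is to let binascii.hexlify and int(..., 16) do the looping in C. It takes the same list-of-ints input as the functions above and should run on both Python 2 and Python 3. The names from_big_bytes_hex and from_lil_bytes_hex are made up for this illustration.

```python
import binascii

def from_big_bytes_hex(l):
    """MSB first: list of ints in range(256) -> int."""
    data = bytes(bytearray(l))
    if not data:
        # hexlify of empty input is empty, which int() would reject
        return 0
    return int(binascii.hexlify(data), 16)

def from_lil_bytes_hex(l):
    """LSB first: reverse the list, then reuse the MSB-first version."""
    return from_big_bytes_hex(l[::-1])
```

Whether this actually beats the shift-and-or loop depends on the input size and the interpreter, so it would need measuring with timeit.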
 
Grant Edwards

From an int one can use to_bytes to get its individual bytes,
but how can one reconstruct the int from the sequence of bytes?

One way is using the struct module.
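
A minimal sketch of the struct approach, assuming the value fits one of struct's fixed sizes (1, 2, 4 or 8 bytes). The '>' and '<' prefixes select big-endian and little-endian byte order; the variable names are only for illustration.

```python
import struct

n = 11999102937234

# 'Q' is an unsigned 8-byte integer, '>' means big-endian
packed = struct.pack('>Q', n)
(m,) = struct.unpack('>Q', packed)

# 'I' is an unsigned 4-byte integer, '<' means little-endian
(one,) = struct.unpack('<I', b'\x01\x00\x00\x00')
```

struct.unpack always returns a tuple, hence the (m,) unpacking. For byte strings of arbitrary length, struct is not a direct fit.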
 
Ian Kelly

On 27.05.2013 17:30, Ned Batchelder wrote:
The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes
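
For reference, a short sketch of the round trip with int.from_bytes (Python 3.2 and later). The variable names are illustrative only.

```python
n = 11999102937234

raw = n.to_bytes(6, 'big')            # int -> 6 bytes, most significant first
m = int.from_bytes(raw, 'big')        # bytes -> int, same byte order

# The byte order must match the one used to produce the bytes.
little = int.from_bytes(raw[::-1], 'little')

# signed=True interprets the bytes as two's complement.
minus_one = int.from_bytes(b'\xff', 'big', signed=True)
```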


I am sorry to have overlooked that. But one thing I yet wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

The bytes object can be viewed as a sequence of ints. So if b is a
bytes object of non-zero length, then b[0] is an int in range(0, 256).
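
A small Python 3 sketch of that point; the variable names are only for illustration. Indexing gives an int, while slicing gives another bytes object.

```python
b = b'\x03\x37'

first = b[0]        # indexing yields an int
values = list(b)    # iterating yields ints as well
one_byte = b[0:1]   # slicing yields a bytes object of length 1
```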
 
jmfauth

On 27.05.2013 17:30, Ned Batchelder wrote:
I am sorry to have overlooked that. But one thing I yet wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

The bytes object can be viewed as a sequence of ints.  So if b is a
bytes object of non-zero length, then b[0] is an int in range(0, 256).

----

Well, Python now "speaks" only "integer", the rest is
commodity and there is a good coherency.
<class 'int'>

jmf
 
Ned Batchelder

On 27.05.2013 17:30, Ned Batchelder wrote:
The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes

I am sorry to have overlooked that. But one thing I yet wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

Presumably you want this to work:

>>> int(b'\x03')
3

But you also want this to work:

>>> int(b'7')
7

These two interpretations are incompatible. If b'\x03' becomes 3, then
shouldn't b'\x37' become 55? But b'\x37' is b'7', and you want that to
be 7.
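
The two readings can be written out separately in Python 3, which is roughly how the ambiguity is avoided in practice. This is only an illustrative sketch; the variable names are not from the thread.

```python
raw = b'7'                      # the same bytes object as b'\x37'

# Reading 1: the numeric value of the byte itself.
as_value = raw[0]

# Reading 2: the byte as an ASCII digit character.
as_digit = int(raw.decode('ascii'))

# Reading 1 applied to b'\x03'.
as_number = int.from_bytes(b'\x03', 'big')
```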

--Ned.
 
Chris Angelico

b'7' is the byte with the character 7 in a certain code, so that's
ok. In other PLs one assigns an int to a byte, with that int in either
decimal notation or hexadecimal notation, or else one assigns a
character to it, in which case it gets the value of the character
in a certain code. What I don't yet understand is why Python is
apparently different from other PLs in that point in not allowing direct
coercion of a byte to an int.

It does. Just subscript it:

>>> b'7'[0]
55

ChrisA
 
Dennis Lee Bieber

b'7' is the byte with the character 7 in a certain code, so that's
ok. In other PLs one assigns an int to a byte, with that int in either

In other languages "byte" is an 8-bit signed/unsigned numeric.

But what you have is a Python 3.x "bytes" structure -- similar to a
character string in Python 2.x...

decimal notation or hexadecimal notation, or else one assigns a
character to it, in which case it gets the value of the character
in a certain code. What I don't yet understand is why Python is
apparently different from other PLs in that point in not allowing direct
coercion of a byte to an int.

As you've been shown, the first step is that you may have to
subscript it; even with just one byte, the structure is still a
"string/array". NOTE: that example doesn't work in 2.7, since
subscripting what is a "string" still returns a substring (of one
character).
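
If the same code has to run on both 2.7 and 3.x, one possible workaround is to go through bytearray, whose elements are ints on both versions. A sketch, with byte_at as a made-up helper name:

```python
def byte_at(data, index=0):
    """Return the byte at the given position as an int in range(256)."""
    # bytearray elements are ints on Python 2.7 and on Python 3
    return bytearray(data)[index]
```

On 2.7, ord(data[index]) is the other common spelling, but that fails on 3.x, where data[index] is already an int.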

Python doesn't have a "numeric" byte type -- the b"..." is an
"array" of 8-bit values in Python 3.x, and is just a character string in
2.x

A language like C didn't have a "string" type... "char" was a
pseudonym for "numeric byte" (and some even support "unsigned char" vs
"signed char").

Maybe you'd like to program in Ada... Where "7" is a "string of
length 1" and '7' is a character -- and you have to do type conversions
to assign the latter to the former.

Heck:

with Text_IO; use Text_IO;

procedure Bytes is

begin
   if "7" = '7' then
      Put_Line ("string 7 is equal to character 7");
   else
      Put_Line ("string 7 is NOT equal to character 7");
   end if;
end Bytes;

WON'T compile... string can not be compared to character!

with Text_IO; use Text_IO;

procedure Bytes is

   A_String : String (1 .. 1);
   A_Char : Character := '7';


begin
   A_String := A_Char;

end Bytes;

The above fails to compile, whereas the following is valid Ada:

with Text_IO; use Text_IO;

procedure Bytes is

   A_String : String (1 .. 1);
   A_Char : Character := '7';


begin
   A_String(1) := A_Char;

end Bytes;

Don't even ask about /numeric/ bytes and strings (or characters). Or
let's...

with Text_IO; use Text_IO;

procedure Bytes is

   type Byte is mod 256;

   A_String : String (1 .. 1);
   Char : Byte := 7;


begin
   A_String (1) := Char;

end Bytes;

Fails... But...

with Text_IO; use Text_IO;

procedure Bytes is

   type Byte is mod 256;

   A_String : String (1 .. 1);
   Char : Byte := 7;

begin
   A_String (1) := Character'Val (Char);

end Bytes;

That takes a byte data type (unsigned 8-bit value)... Asks for the
CHARACTER data type having the value equivalent to the "position" of the
byte... And then stuff that into the only element of a STRING data type.
 
Grant Edwards

In other languages "byte" is an 8-bit signed/unsigned numeric.

That's a common assumption, but historically, a "byte" was merely the
smallest addressable unit of memory. The size of a "byte" on widely
used CPUs ranged from 4 bits to 60 bits.

Quoting from http://en.wikipedia.org/wiki/Byte

"The size of the byte has historically been hardware
dependent and no definitive standards existed that mandated the
size."

That's why IEEE standards always use the word "octet" when referring to a
value containing 8 bits.

Only recently has it become common to assume that a "byte" contains 8
bits.
 
Dave Angel

That's a common assumption, but historically, a "byte" was merely the
smallest addressable unit of memory. The size of a "byte" on widely
used CPUs ranged from 4 bits to 60 bits.

<Hehe> I recall rewriting the unpacking algorithm to get the 10
characters from each byte, on such a machine.
 
Grant Edwards

<Hehe> I recall rewriting the unpacking algorithm to get the 10
characters from each byte, on such a machine.

Yep. IIRC there were CDC machines (Cyber 6600?) with a 60-bit wide
"byte" and a 6-bit wide upper-case-only character set. ISTM that the
Pascal compiler limited you to 6 significant characters in variable
names so that it could use a simple single register compare while
doing symbol lookups...

I think some IBM machines had 60-bit "bytes" as well.
 
Carlos Nepomuceno

________________________________
Date: Mon, 3 Jun 2013 15:41:41 -0700
Subject: Re: How to get an integer from a sequence of bytes
From: (e-mail address removed)
To: (e-mail address removed) [...]
Today though, it would be difficult to sell a conventional (Von
Neumann) computer that didn't have 8 bit bytes. Quantum computers
would still sell if they were odd this way - they're going to be really
different anyway.

Nowadays it would be a hard task to find a Von Neumann architecture machine.

Most current CPUs are variants of the Harvard architecture: they separate instructions from data at the cache level.
 
Grant Edwards

When I was a Freshman in college, I used a CDC Cyber a lot; it had 6 bit
bytes and 60 bit words. This was in 1985.

But you couldn't address individual 6-bit "hextets" in memory could
you? My recollection is that incrementing a memory address got you
the next 60-bit chunk -- that means that by the older terminology a
"byte" was 60 bits. A "character" was 6 bits, and a single register
or memory location could hold 10 characters.

Today though, it would be difficult to sell a conventional (Von Neumann)
computer that didn't have 8 bit bytes.

There are tons (as in millions of units per month) of CPUs still being
sold in the DSP market with 16, 20, 24, and 32 bit "bytes". (When
writing C on a TMS320Cxx CPU, sizeof (char) == sizeof (int) == sizeof
(long) == sizeof (float) == sizeof (double) == 1. They all contain 32
bits.)
 
Grant Edwards

________________________________
Date: Mon, 3 Jun 2013 15:41:41 -0700
Subject: Re: How to get an integer from a sequence of bytes
From: (e-mail address removed)
To: (e-mail address removed) [...]
Today though, it would be difficult to sell a conventional (Von
Neumann) computer that didn't have 8 bit bytes. Quantum computers
would still sell if they were odd this way - they're going to be really
different anyway.

Nowadays it would be a hard task to find a Von Neumann architecture
machine.

Most current CPUs are variants of the Harvard architecture: they
separate instructions from data at the cache level.

VN designs are still very common in smaller CPUs (embedded stuff).

Even modern desktop CPUs are "logically" still Von Neumann designs
from the programmer's point of view (there's only a single address
space for both data and instructions). The fact that there are two
separate caches is almost entirely hidden from the user. If you start
to do stuff like write self-modifying code, then you _may_ start having to
worry about cache coherency.
 
Carlos Nepomuceno

From: (e-mail address removed)
Subject: Re: How to get an integer from a sequence of bytes
Date: Tue, 4 Jun 2013 13:42:46 +0000
To: (e-mail address removed) [...]
VN designs are still very common in smaller CPUs (embedded stuff).

DSPs perhaps... not CPUs. Even ARMs are Harvard variants.

Even modern desktop CPUs are "logically" still Von Neumann designs
from the programmer's point of view (there's only a single address
space for both data and instructions). The fact that there are two
separate caches is almost entirely hidden from the user. If you start
to do stuff like write self-modifying code, then you _may_ start having to
worry about cache coherency.

Code/data separation isn't the only aspect. VN architecture is totally serial, even for RAM.

It's been a while since we've got into the multi-core, multipath world. Even in embedded devices.
 
