How to get an integer from a sequence of bytes

Mok-Kong Shen · May 27, 2013

From an int one can use to_bytes to get its individual bytes,
but how can one reconstruct the int from the sequence of bytes?

Thanks in advance.

M. K. Shen

Steven D'Aprano · May 27, 2013

From an int one can use to_bytes to get its individual bytes, but how
can one reconstruct the int from the sequence of bytes?

Here's one way:

py> n = 11999102937234
py> m = 0
py> for b in n.to_bytes(6, 'big'):
...     m = 256*m + b
...
py> m == n
True

Ned Batchelder · May 27, 2013

From an int one can use to_bytes to get its individual bytes,
but how can one reconstruct the int from the sequence of bytes?

The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes
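
As a minimal sketch of the round trip (the value is the one used earlier in the thread; the six-byte length and big-endian order are simply the choices made there, and the variable names are illustrative):

```python
n = 11999102937234

# Split the int into 6 bytes, most significant byte first.
data = n.to_bytes(6, 'big')

# Rebuild the int; the byte order must match the one used above.
m = int.from_bytes(data, 'big')

# Reading the same bytes least-significant-first is equivalent to
# reversing them and reading big-endian.
swapped = int.from_bytes(data, 'little')
```

Both methods need Python 3.2 or later.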

--Ned.

Steven D'Aprano · May 27, 2013

The next thing in the docs after int.to_bytes is int.from_bytes:

And I can't believe I missed that too :-(

Carlos Nepomuceno · May 27, 2013

Here's one way:

py> n = 11999102937234
py> m = 0
py> for b in n.to_bytes(6, 'big'):
...     m = 256*m + b
...
py> m == n
True

Python 2 doesn't have to_bytes()!

# Python 2, LSB 1st
def to_lil_bytes(x):
    r = []
    while x != 0:
        r.append(int(x & 0b11111111))
        x >>= 8
    return r

# Python 2, LSB 1st
def from_lil_bytes(l):
    x = 0
    for i in range(len(l)-1, -1, -1):
        x <<= 8
        x |= l[i]
    return x

# Python 2, MSB 1st
def to_big_bytes(x):
    r = []
    while x != 0:
        r.insert(0, int(x & 0b11111111))
        x >>= 8
    return r

# Python 2, MSB 1st
def from_big_bytes(l):
    x = 0
    for i in range(len(l)):
        x <<= 8
        x |= l[i]
    return x

Can it be faster?
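
One possibility, offered as an untimed sketch rather than a measured answer: going through a hex string lets the built-in int() do the conversion, and it runs on both Python 2 and Python 3. The function names are illustrative, and unlike the functions above these take a byte string rather than a list of ints.

```python
import binascii

def from_big_bytes_hex(data):
    # data: a byte string, most significant byte first
    # (str on Python 2, bytes on Python 3).
    if not data:
        return 0  # int('', 16) would raise ValueError
    return int(binascii.hexlify(data), 16)

def from_lil_bytes_hex(data):
    # Least significant byte first: reverse, then decode as big-endian.
    return from_big_bytes_hex(data[::-1])
```

Whether this actually beats the shift-and-or loop would have to be checked with timeit on the interpreter in question.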

Dave Angel · May 27, 2013

And I can't believe I missed that too :-(

And that approach probably works for negative ints too.
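
With int.to_bytes and int.from_bytes, negative values round-trip only if signed=True is passed on both sides. A small sketch with an arbitrary value:

```python
n = -1234

# signed=True selects two's-complement encoding; without it, to_bytes
# raises OverflowError for a negative int.
data = n.to_bytes(2, 'big', signed=True)

# The same flag is needed when decoding, otherwise the bytes are read
# as an unsigned value.
m = int.from_bytes(data, 'big', signed=True)
unsigned = int.from_bytes(data, 'big')
```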

Grant Edwards · May 28, 2013

From an int one can use to_bytes to get its individual bytes,
but how can one reconstruct the int from the sequence of bytes?

One way is using the struct module.
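
A sketch of what that could look like, assuming an unsigned 64-bit big-endian layout (the format string and the variable names are choices made for this example):

```python
import struct

n = 11999102937234

# '>Q' means big-endian unsigned 64-bit. struct only handles fixed
# sizes, so the data must be exactly 8 bytes for this format.
packed = struct.pack('>Q', n)
(m,) = struct.unpack('>Q', packed)

# A shorter byte string has to be padded to a supported size first.
six = n.to_bytes(6, 'big')
(m6,) = struct.unpack('>Q', six.rjust(8, b'\x00'))
```

struct is the better fit when the bytes come from a fixed binary layout; for arbitrary-length integers int.from_bytes is simpler.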

Mok-Kong Shen · May 30, 2013

On 27.05.2013 at 17:30, Ned Batchelder wrote:

The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes

I am sorry to have overlooked that. But one thing I still wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

M. K. Shen

Ian Kelly · May 30, 2013

On 27.05.2013 at 17:30, Ned Batchelder wrote:

The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes

I am sorry to have overlooked that. But one thing I still wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

The bytes object can be viewed as a sequence of ints. So if b is a
bytes object of non-zero length, then b[0] is an int in range(0, 256).
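
A brief sketch of that view of a bytes object (the sample value is arbitrary):

```python
b = b'\x03\x37\xff'

first = b[0]        # indexing gives an int
values = list(b)    # iterating gives ints as well
piece = b[0:1]      # slicing gives a bytes object of length 1

# Going the other way: bytes() accepts an iterable of ints in range(256).
rebuilt = bytes(values)
```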

jmfauth · May 30, 2013

On 27.05.2013 at 17:30, Ned Batchelder wrote:

I am sorry to have overlooked that. But one thing I still wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

The bytes object can be viewed as a sequence of ints. So if b is a
bytes object of non-zero length, then b[0] is an int in range(0, 256).

----

Well, Python now "speaks" only "integer", the rest is
commodity and there is a good coherency.
<class 'int'>

jmf

Ned Batchelder · May 30, 2013

On 27.05.2013 at 17:30, Ned Batchelder wrote:

The next thing in the docs after int.to_bytes is int.from_bytes:
http://docs.python.org/3.3/library/stdtypes.html#int.from_bytes

I am sorry to have overlooked that. But one thing I still wonder is why
there is no direct possibility of converting a byte to an int in [0,255],
i.e. with a construct int(b), where b is a byte.

Presumably you want this to work:

>>> int(b'\x03')
3

But you also want this to work:

>>> int(b'7')
7

These two interpretations are incompatible. If b'\x03' becomes 3, then
shouldn't b'\x37' become 55? But b'\x37' is b'7', and you want that to
be 7.
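
For what it's worth, Python 3 keeps the two readings apart by giving each its own spelling. A short sketch (variable names are illustrative):

```python
# Reading the bytes as ASCII digits: int() parses them like a string.
as_text = int(b'7')

# Reading the byte as a raw value: index it, or use int.from_bytes.
as_value = b'7'[0]
also_value = int.from_bytes(b'\x03', 'big')

# int() does not fall back to the raw value when the bytes are not digits.
try:
    int(b'\x03')
    parsed = True
except ValueError:
    parsed = False
```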

--Ned.

Chris Angelico · Jun 2, 2013

b'7' is the byte with the character 7 in a certain code, so that's
ok. In other PLs one assigns an int to a byte, with that int in either
decimal notation or hexadecimal notation, or else one assigns a
character to it, in which case it gets the value of the character
in a certain code. What I don't yet understand is why Python is
apparently different from other PLs in that point in not allowing direct
coercion of a byte to an int.

It does. Just subscript it:

>>> b'7'[0]
55

ChrisA

Dennis Lee Bieber · Jun 2, 2013

b'7' is the byte with the character 7 in a certain code, so that's
ok. In other PLs one assigns an int to a byte, with that int in either

In other languages "byte" is an 8-bit signed/unsigned numeric.

But what you have is a Python 3.x "bytes" structure -- similar to a
character string in Python 2.x...

decimal notation or hexadecimal notation, or else one assigns a
character to it, in which case it gets the value of the character
in a certain code. What I don't yet understand is why Python is
apparently different from other PLs in that point in not allowing direct
coercion of a byte to an int.

As you've been shown, the first step is that you may have to
subscript it; even with just one byte, the structure is still a
"string/array". NOTE: that example doesn't work in 2.7, since
subscripting what is a "string" still returns a substring (of one
character).

Python doesn't have a "numeric" byte type -- the b"..." is an
"array" of 8-bit values in Python 3.x, and is just a character string in
2.x
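
A sketch of two spellings that should give the same int on both 2.7 and 3.x, assuming the data is a one-byte string:

```python
data = b'7'

# ord() on a length-1 slice: the slice is a one-character str on 2.x
# and a one-byte bytes object on 3.x, and ord() accepts both.
via_ord = ord(data[0:1])

# bytearray is a mutable sequence of ints on both versions.
via_bytearray = bytearray(data)[0]
```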

A language like C didn't have a "string" type... "char" was a
pseudonym for "numeric byte" (and some even support "unsigned char" vs
"signed char").

Maybe you'd like to program in Ada... Where "7" is a "string of
length 1" and '7' is a character -- and you have to do type conversions
to assign the latter to the former.

Heck:

with Text_IO; use Text_IO;

procedure Bytes is

begin
   if "7" = '7' then
      Put_Line ("string 7 is equal to character 7");
   else
      Put_Line ("string 7 is NOT equal to character 7");
   end if;
end Bytes;

WON'T compile... string can not be compared to character!

with Text_IO; use Text_IO;

procedure Bytes is

   A_String : String (1 .. 1);
   A_Char   : Character := '7';

begin
   A_String := A_Char;

end Bytes;

The above fails to compile, whereas the following is valid Ada

with Text_IO; use Text_IO;

procedure Bytes is

   A_String : String (1 .. 1);
   A_Char   : Character := '7';

begin
   A_String (1) := A_Char;

end Bytes;

Don't even ask about /numeric/ bytes and strings (or characters). Or
let's...

with Text_IO; use Text_IO;

procedure Bytes is

   type Byte is mod 256;

   A_String : String (1 .. 1);
   Char     : Byte := 7;

begin
   A_String (1) := Char;

end Bytes;

Fails... But...

with Text_IO; use Text_IO;

procedure Bytes is

   type Byte is mod 256;

   A_String : String (1 .. 1);
   Char     : Byte := 7;

begin
   A_String (1) := Character'Val (Char);

end Bytes;

That takes a byte data type (unsigned 8-bit value), asks for the
CHARACTER data type having the value equivalent to the "position" of the
byte, and then stuffs that into the only element of a STRING data type.

Grant Edwards · Jun 3, 2013

In other languages "byte" is an 8-bit signed/unsigned numeric.

That's a common assumption, but historically, a "byte" was merely the
smallest addressable unit of memory. The size of a "byte" on widely
used CPUs ranged from 4 bits to 60 bits.

Quoting from http://en.wikipedia.org/wiki/Byte

"The size of the byte has historically been hardware
dependent and no definitive standards existed that mandated the
size."

That's why IEEE standards always use the word "octet" when referring to a
value containing 8 bits.

Only recently has it become common to assume that a "byte" contains 8
bits.

Dave Angel · Jun 3, 2013

That's a common assumption, but historically, a "byte" was merely the
smallest addressable unit of memory. The size of a "byte" on widely
used CPUs ranged from 4 bits to 60 bits.

<Hehe> I recall rewriting the unpacking algorithm to get the 10
characters from each byte, on such a machine.

Grant Edwards · Jun 3, 2013

<Hehe> I recall rewriting the unpacking algorithm to get the 10
characters from each byte, on such a machine.

Yep. IIRC there were CDC machines (Cyber 6600?) with a 60-bit wide
"byte" and a 6-bit wide upper-case-only character set. ISTM that the
Pascal compiler limited you to 6 significant characters in variable
names so that it could use a simple single register compare while
doing symbol lookups...

I think some IBM machines had 60-bit "bytes" as well.

Carlos Nepomuceno · Jun 3, 2013

Today though, it would be difficult to sell a conventional (Von
Neumann) computer that didn't have 8 bit bytes. Quantum computers
would still sell if they were odd this way - they're going to be really
different anyway.

Nowadays it would be a hard task to find a Von Neumann architecture machine.

Most current CPUs are variants of the Harvard architecture: they separate instructions from data at the cache level.

Grant Edwards · Jun 4, 2013

When I was a Freshman in college, I used a CDC Cyber a lot; it had 6 bit
bytes and 60 bit words. This was in 1985.

But you couldn't address individual 6-bit "hextets" in memory could
you? My recollection is that incrementing a memory address got you
the next 60-bit chunk -- that means that by the older terminology a
"byte" was 60 bits. A "character" was 6 bits, and a single register
or memory location could hold 10 characters.

Today though, it would be difficult to sell a conventional (Von Neumann)
computer that didn't have 8 bit bytes.

There are tons (as in millions of units per month) of CPUs still being
sold in the DSP market with 16, 20, 24, and 32 bit "bytes". (When
writing C on a TMS320Cxx CPU, sizeof (char) == sizeof (int) == sizeof
(long) == sizeof (float) == sizeof (double) == 1. They all contain 32
bits.)

Grant Edwards · Jun 4, 2013

Today though, it would be difficult to sell a conventional (Von
Neumann) computer that didn't have 8 bit bytes. Quantum computers
would still sell if they were odd this way - they're going to be really
different anyway.

Nowadays it would be a hard task to find a Von Neumann architecture
machine.

Most current CPUs are variants of the Harvard architecture: they
separate instructions from data at the cache level.

VN designs are still very common in smaller CPUs (embedded stuff).

Even modern desktop CPUs are "logically" still Von Neumann designs
from the programmer's point of view (there's only a single address
space for both data and instructions). The fact that there are two
separate caches is almost entirely hidden from the user. If you start
to do stuff like write self-modifying code, then you _may_ start having to
worry about cache coherency.

Carlos Nepomuceno · Jun 4, 2013

VN designs are still very common in smaller CPUs (embedded stuff).

DSPs perhaps... not CPUs. Even ARMs are Harvard variants.

Even modern desktop CPUs are "logically" still Von Neumann designs
from the programmer's point of view (there's only a single address
space for both data and instructions). The fact that there are two
separate caches is almost entirely hidden from the user. If you start
to do stuff like write self-modifying code, then you _may_ start having to
worry about cache coherency.

Code/data separation isn't the only aspect. VN architecture is totally serial, even for RAM.

It's been a while since we've got into the multi-core, multipath world. Even in embedded devices.
