# Bit shifts and endianness

Discussion in 'C Programming' started by gamehack, Jan 5, 2006.

1. ### gamehackGuest

Hi all,

I was thinking today, suppose we have the number
n = 0xAB 0xFF
which is equivalent to 44031 in decimal. In big endian it will be
stored as
10101011 11111111
but in little endian it will be
11111111 10101011
If we then apply a bit shift n << 2; that would give us completely
different numbers. So is actually a left bit shift done to the left on
a big-endian and left bit shift on little endian is actually a right
shift in the memory?

Thanks a lot

PS. What other problems arise from endianness? Are there any good
resources on that?

gamehack, Jan 5, 2006

2. ### Keith ThompsonGuest

"gamehack" <> writes:
> I was thinking today, suppose we have the number
> n = 0xAB 0xFF

That's two numbers. Do you mean 0xABFF?

> which is equivalent to 44031 in decimal.

Apparently so.

> In big endian it will be
> stored as
> 10101011 11111111
> but in little endian it will be
> 11111111 10101011
> If we then apply a bit shift n << 2; that would give us completely
> different numbers.

No. Shift operators are defined on the *values* of their operands,
not on their representations.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson, Jan 5, 2006

3. ### gamehackGuest

Keith Thompson wrote:
> "gamehack" <> writes:
> > I was thinking today, suppose we have the number
> > n = 0xAB 0xFF

>
> That's two numbers. Do you mean 0xABFF?

Sorry, I meant 0xABFF and just separated them as distinct bytes.
>
> > which is equivalent to 44031 in decimal.

>
> Apparently so.
>
> > In big endian it will be
> > stored as
> > 10101011 11111111
> > but in little endian it will be
> > 11111111 10101011
> > If we then apply a bit shift n << 2; that would give us completely
> > different numbers.

>
> No. Shift operators are defined on the *values* of their operands,
> not on their representations.
>

Thanks
> --
> Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
> San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
> We must do something. This is something. Therefore, we must do this.

gamehack, Jan 5, 2006
4. ### Mark McIntyreGuest

On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
<> wrote:

>"gamehack" <> writes:
>> I was thinking today, suppose we have the number

>> 10101011 11111111
>> but in little endian it will be
>> 11111111 10101011
>> If we then apply a bit shift n << 2; that would give us completely
>> different numbers.

>
>No. Shift operators are defined on the *values* of their operands,
>not on their representations.

Clarification: I assume you mean that shift operators operate on a
binary value, irrespective of how that is stored on disk or in memory.
I'm saying this because few humans would think of the value of 0xabff
as the value in binary.

(and please, don't start another one of those tedious "well patently
what I said is 'right', so you're a fool" discussions again, I really
don't want to have to express my opinion of people who can't accept
that there may be more than one way to skin a cat).
Mark McIntyre
--

----== Posted via Newsfeeds.Com - Unlimited-Unrestricted-Secure Usenet News==----
http://www.newsfeeds.com The #1 Newsgroup Service in the World! 120,000+ Newsgroups
----= East and West-Coast Server Farms - Total Privacy via Encryption =----

Mark McIntyre, Jan 5, 2006
5. ### Richard HeathfieldGuest

Mark McIntyre said:

> On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
> <> wrote:
>
>>No. Shift operators are defined on the *values* of their operands,
>>not on their representations.

>
> Clarification: I assume you mean that shift operators operate on a
> binary value,

No, they operate on a value. Values don't have number bases. That's merely a
representation issue.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Richard Heathfield, Jan 5, 2006
6. ### Guest

Mark McIntyre wrote:
> On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
> <> wrote:
>
> >"gamehack" <> writes:
> >> I was thinking today, suppose we have the number

>
> >> 10101011 11111111
> >> but in little endian it will be
> >> 11111111 10101011
> >> If we then apply a bit shift n << 2; that would give us completely
> >> different numbers.

> >
> >No. Shift operators are defined on the *values* of their operands,
> >not on their representations.

>
> Clarification: I assume you mean that shift operators operate on a
> binary value, irrespective of how that is stored on disk or in memory.
> I'm saying this because few humans would think of the value of 0xabff
> as the value in binary.
>

Yeah, I had this misconception before. 0xabff >> 8 always gives me
0x00ab regardless of weather the machine is big endian or little
endian. Endianness just defines weather 'ab' comes before or after 'ff'
in memory. It doesn't affect binary operations. Endianness is only an
issue when you are transferring data between machines via a file or a
network, not when you are doing the computing.

, Jan 5, 2006
7. ### Richard HeathfieldGuest

said:

> 0xabff >> 8 always gives me 0x00ab regardless of weather

Even so, I don't recommend that you try this in the rain.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Richard Heathfield, Jan 5, 2006
8. ### Keith ThompsonGuest

Mark McIntyre <> writes:
> On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
> <> wrote:
>>"gamehack" <> writes:
>>> I was thinking today, suppose we have the number

>
>>> 10101011 11111111
>>> but in little endian it will be
>>> 11111111 10101011
>>> If we then apply a bit shift n << 2; that would give us completely
>>> different numbers.

>>
>>No. Shift operators are defined on the *values* of their operands,
>>not on their representations.

>
> Clarification: I assume you mean that shift operators operate on a
> binary value, irrespective of how that is stored on disk or in memory.
> I'm saying this because few humans would think of the value of 0xabff
> as the value in binary.

That's one way of looking at it.

The standard describes the result of a shift operator both as "E1
(left|right)-shifted E2 bit positions" and in terms of multiplication
or division by powers of 2. I think the latter is intended to define
what the standard means by the terms "(left|right)-shifted". The
standard could as easily have defined the operators purely in terms of
multiplication or division with no reference to shifting, with the
resulting wording describing exactly the same semantics.

One advantage of describing it this way is that it avoids the OP's
question; "<<" and ">>" don't depend on endianness any more than "+"
and "-" do, and there's no potential ambiguity about the meanings of
"left" and "right".

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson, Jan 5, 2006
9. ### Jordan AbelGuest

On 2006-01-05, Richard Heathfield <> wrote:
> Mark McIntyre said:
>
>> On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
>> <> wrote:
>>
>>>No. Shift operators are defined on the *values* of their operands,
>>>not on their representations.

>>
>> Clarification: I assume you mean that shift operators operate on a
>> binary value,

>
> No, they operate on a value. Values don't have number bases. That's merely a
> representation issue.

The standard mentions "bits" in too many places for me to believe this.

Jordan Abel, Jan 6, 2006
10. ### Chris TorekGuest

In article <>
<> wrote:
>... Endianness is only an issue when you are transferring
>data between machines via a file or a network ...

More precisely (and more correctly ), "endianness" becomes
an issue whenever someone or something takes a value apart,
so that there is a "before" and an "after", or a "left" and a
"right", or some other way of sequencing parts of the value.

For instance, imagine you are moving from one place of living to
another (e.g., moving apartments). Your car has room for 1/3 of
your bed, but not the whole thing. You take a saw to the bed and
cut it into thirds. You then transport each third to your new
location and reassemble it.

You need to reassemble it in the same order you took it apart, or
it will likely be quite uncomfortable to sleep in.

Endianness in computers arises when you allow the machine to take
a value apart and ship it somewhere, and then allow some other
machine to put it together again. If the two machines use the same
order, all is well, but if they use different orders, your bed is
no longer very useful.

There are a number of ways to handle this, but the simplest is:
do the disassembly and reassembly yourself. Instead of letting
the machine do it in "machine order", do it yourself and control
the order.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
Reading email is like searching for food in the garbage, thanks to spammers.

Chris Torek, Jan 6, 2006
11. ### Eric SosmanGuest

Richard Heathfield wrote:
> said:
>
>
>>0xabff >> 8 always gives me 0x00ab regardless of weather

>
>
> Even so, I don't recommend that you try this in the rain.

.... lest the negative sign give you a positive charge ...

--
Eric Sosman
lid

Eric Sosman, Jan 6, 2006
12. ### Eric SosmanGuest

Jordan Abel wrote:

> On 2006-01-05, Richard Heathfield <> wrote:
>
>>Mark McIntyre said:
>>
>>
>>>On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
>>><> wrote:
>>>
>>>
>>>>No. Shift operators are defined on the *values* of their operands,
>>>>not on their representations.
>>>
>>>Clarification: I assume you mean that shift operators operate on a
>>>binary value,

>>
>>No, they operate on a value. Values don't have number bases. That's merely a
>>representation issue.

>
>
> The standard mentions "bits" in too many places for me to believe this.

Yeah. And there's also the requirement for a "pure binary"
representation. Nonetheless, I stick with the Heathfield Maxim:
operators operate on values, not on representations. When the
Standard describes ^ or >> in terms of bits, it does so merely
as a convenience, as an appeal to the readers' understanding of
binary arithmetic. Nowhere (that I can see) is there a requirement
that the arithmetic actually be carried out bit-by-bit or even
bits-in-parallel; it's just easier for the Standard to describe
`x & 17' by referring to binary digits than by avoiding such a
reference. (Quick: Describe `x & 17' without reference to bits.
It can surely be done, but I expect the description would be
verbose and opaque even by the standards of Standardese.)

Let's put it this way: Imagine a machine with two "banks"
of memory, whose stoned-out-of-their-gourds designers have
decided should have different endiannesses. If `x' is in the
BigEndian bank and `y' in the LittleEndian, what is the effect
of `x = 1; y = x << 1;'? I claim that the effect must be the
same as `x = 1; y = 2;', or else the implementation is broken.
The fact that the representations differ just doesn't matter.

--
Eric Sosman
lid

Eric Sosman, Jan 6, 2006
13. ### Keith ThompsonGuest

Jordan Abel <> writes:
> On 2006-01-05, Richard Heathfield <> wrote:
>> Mark McIntyre said:
>>
>>> On Thu, 05 Jan 2006 20:26:11 GMT, in comp.lang.c , Keith Thompson
>>> <> wrote:
>>>
>>>>No. Shift operators are defined on the *values* of their operands,
>>>>not on their representations.
>>>
>>> Clarification: I assume you mean that shift operators operate on a
>>> binary value,

>>
>> No, they operate on a value. Values don't have number bases. That's
>> merely a representation issue.

>
> The standard mentions "bits" in too many places for me to believe this.

The standard imposes some fairly stringent requirements on how
integer, both signed and unsigned, are represented. These
requirements are tight enough to guarantee (I think) that the
"multiple or divide by powers of 2" and "shift by N bits" models of
the "<<" and ">>" operators turn out to be equivalent (with the
proviso that any padding bits don't participate).

Once could imagine a hypothetical C standard that doesn't place any
requirements on the representations of integers, but that defines "<<"
and ">>" in much the same way as the actual standard does. An
implementation conforming to this hypothetical standard could use,
say, a trinary hardware representation for integers, but would still
have to implement "<<" and ">>" so they yield the same results as on a
binary machine. (In fact, such an implementation would probably be
legal under the actual C standard as long as it does a sufficiently
good job of hiding the trinary hardware behind a binary-looking
interface.)

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson, Jan 6, 2006
14. ### Keith ThompsonGuest

Keith Thompson <> writes:
[...]
> The standard imposes some fairly stringent requirements on how
> integer, both signed and unsigned, are represented. These
> requirements are tight enough to guarantee (I think) that the
> "multiple or divide by powers of 2" and "shift by N bits" models of
> the "<<" and ">>" operators turn out to be equivalent (with the
> proviso that any padding bits don't participate).

Since not everyone has a copy of the standard, here's what C99 says in
its description of the bitwise shift operators. (I've replaced the
multiplication symbol 'x' with "*", and the superscript exponent with
a "**" operator.)

The integer promotions are performed on each of the operands. The
type of the result is that of the promoted left operand. If the
value of the right operand is negative or is greater than or equal
to the width of the promoted left operand, the behavior is
undefined.

The result of E1 << E2 is E1 left-shifted E2 bit positions;
vacated bits are filled with zeros. If E1 has an unsigned type,
the value of the result is E1 * 2**E2, reduced modulo one more
than the maximum value representable in the result type. If E1 has
a signed type and nonnegative value, and E1 * 2**E2 is
representable in the result type, then that is the resulting
value; otherwise, the behavior is undefined.

The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1
has an unsigned type or if E1 has a signed type and a nonnegative
value, the value of the result is the integral part of the
quotient of E1 / 2**E2. If E1 has a signed type and a negative
value, the resulting value is implementation-defined.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson, Jan 6, 2006
15. ### Keith ThompsonGuest

Eric Sosman <> writes:
[...]
> Yeah. And there's also the requirement for a "pure binary"
> representation. Nonetheless, I stick with the Heathfield Maxim:
> operators operate on values, not on representations. When the
> Standard describes ^ or >> in terms of bits, it does so merely
> as a convenience, as an appeal to the readers' understanding of
> binary arithmetic. Nowhere (that I can see) is there a requirement
> that the arithmetic actually be carried out bit-by-bit or even
> bits-in-parallel; it's just easier for the Standard to describe
> `x & 17' by referring to binary digits than by avoiding such a
> reference. (Quick: Describe `x & 17' without reference to bits.
> It can surely be done, but I expect the description would be
> verbose and opaque even by the standards of Standardese.)

I wouldn't try to describe x & 17 without reference to bits. Rather,
I'd define "bits" in terms of the arithmetic properties of the number,
and *then* define x & 17 in terms of the "bits" I've defined.
Something along the lines of:

An integer value in the range 0..2**(n-1) can be uniquely defined
as a sum of powers of two: ... + b3 * 2**3 + b2 * 2**2 + b1 * 2**1
+ b0 * 2**0. Each bn value is referred to as the nth _bit_ of the
integer value.

With this definition, it becomes irrelevant whether these "bits"
correspond to some on-off state in hardware somewhere; the description
applies equally well to numbers represented in decimal or trinary.

--
Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.

Keith Thompson, Jan 6, 2006
16. ### Eric SosmanGuest

Keith Thompson wrote:
> Eric Sosman <> writes:
>> [...] (Quick: Describe `x & 17' without reference to bits.
>>It can surely be done, but I expect the description would be
>>verbose and opaque even by the standards of Standardese.)

>
>
> I wouldn't try to describe x & 17 without reference to bits. Rather,
> I'd define "bits" in terms of the arithmetic properties of the number,
> and *then* define x & 17 in terms of the "bits" I've defined.
> Something along the lines of:
>
> An integer value in the range 0..2**(n-1) can be uniquely defined
> as a sum of powers of two: ... + b3 * 2**3 + b2 * 2**2 + b1 * 2**1
> + b0 * 2**0. Each bn value is referred to as the nth _bit_ of the
> integer value.
>
> With this definition, it becomes irrelevant whether these "bits"
> correspond to some on-off state in hardware somewhere; the description
> applies equally well to numbers represented in decimal or trinary.

Right, and that's the point: C's operators[*] are defined
in terms of the values of the operands and the value of the
result. They are not defined in terms of the representations
of those values, but in terms of the values themselves, however
represented.

[*] "Ordinary" operators, anyhow. Special operators like
`.' don't fit well with this way of understanding -- but not
even these oddballs work with the representations of their
operands.

As a concrete example of a situation where values are
represented without any individual "bit-artifacts" being
present, consider the data stream between two modems. If I
understand correctly (I'm not a modem maven), each change in
the signal encodes an entire group of bits: one phase shift
represents 000, another phase shift represents 010, and so
on. "Value bits" are transmitted, but no isolated "signal
bits" are discernible.

--
Eric Sosman
lid

Eric Sosman, Jan 6, 2006
17. ### Jordan AbelGuest

On 2006-01-06, Eric Sosman <> wrote:
> Keith Thompson wrote:
>> Eric Sosman <> writes:
>>> [...] (Quick: Describe `x & 17' without reference to bits.
>>>It can surely be done, but I expect the description would be
>>>verbose and opaque even by the standards of Standardese.)

>>
>>
>> I wouldn't try to describe x & 17 without reference to bits. Rather,
>> I'd define "bits" in terms of the arithmetic properties of the number,
>> and *then* define x & 17 in terms of the "bits" I've defined.
>> Something along the lines of:
>>
>> An integer value in the range 0..2**(n-1) can be uniquely defined
>> as a sum of powers of two: ... + b3 * 2**3 + b2 * 2**2 + b1 * 2**1
>> + b0 * 2**0. Each bn value is referred to as the nth _bit_ of the
>> integer value.
>>
>> With this definition, it becomes irrelevant whether these "bits"
>> correspond to some on-off state in hardware somewhere; the description
>> applies equally well to numbers represented in decimal or trinary.

>
> Right, and that's the point: C's operators[*] are defined
> in terms of the values of the operands and the value of the
> result. They are not defined in terms of the representations of
> those values, but in terms of the values themselves, however
> represented.
>
> [*] "Ordinary" operators, anyhow. Special operators like
> `.' don't fit well with this way of understanding -- but not even
> these oddballs work with the representations of their operands.
>
> As a concrete example of a situation where values are
> represented without any individual "bit-artifacts" being present,
> consider the data stream between two modems. If I understand
> correctly (I'm not a modem maven), each change in the signal
> encodes an entire group of bits: one phase shift represents 000,
> another phase shift represents 010, and so on. "Value bits" are
> transmitted, but no isolated "signal bits" are discernible.

That doesn't mean it's not a "binary value" - which, if you were
paying attention, is the claim you are attempting to refute.

Jordan Abel, Jan 6, 2006
18. ### Eric SosmanGuest

Jordan Abel wrote:
>
> That doesn't mean it's not a "binary value" - which, if you were
> paying attention, is the claim you are attempting to refute.

Thank you, Jordan, for drawing my attention to my
inattention. Tracing back the thread, though, I find
this remark from Mark McIntyre:

> No, they operate on a value. Values don't have number bases.
> That's merely a representation issue.

.... to which you responded:

> The standard mentions "bits" in too many places for me to
> believe this.

McIntyre says the shift operators work on values, not
representations, and you say you don't believe him. That's
what I'm trying to refute: I add my voice to the McIntyre
(and Heathfield) chorus to say that you are mistaken.

But perhaps the "this" in your response referred to
something else altogether?

--
Eric Sosman
lid

Eric Sosman, Jan 6, 2006
19. ### Walter RobersonGuest

In article <>,
Eric Sosman <> wrote:
> As a concrete example of a situation where values are
>represented without any individual "bit-artifacts" being
>present, consider the data stream between two modems. If I
>understand correctly (I'm not a modem maven), each change in
>the signal encodes an entire group of bits: one phase shift
>represents 000, another phase shift represents 010, and so
>on. "Value bits" are transmitted, but no isolated "signal
>bits" are discernible.

There are a number of different encoding mechanisms used for
modems designed for use with telephone lines (the term 'modem'
applies to a number of other situations as well.) The first standards,
for 110 and 300 baud, were straight single-frequency carriers with
no phase encoding. 1200 baud was, if I recall correctly, dual frequency
but not phase encoded. All of the CCITT standard encodings from 2400 baud
upwards rely upon phase encoding. The proprietary Telebit protocols
worked rather differently (128 or 256 frequency channels but
individual signals were held for long periods.)

It is not exactly accurate to say that -each- change in the signal
encodes a group of bits; rather, the signal is sampled at several
intervals along the cycle, and the details of the "phase breaking"
that is imposed on the base sine wave over that interval encodes
a cluster of bits in a completely non-linear non-binary manner
[i.e., there is no single aspect of the phase changes that encodes
any one particular bit.]

Especially for the higher speeds, extra "logical" bits are encoded into
the signal, and the decoded group of bits is run through a
"constellation" corrector to find the most probable original bit group.
In theory the constellation corrector could map to a bit grouping that
had no obvious resemblance to the bits read off of the signal -- with
the non-linear encoding of bits and taking into account the need for
some slosh while the signal slews, the constellation correction could
decide that one should have chosen a different non-linear decoding.

Another example of non-binary encoding is gigabit ethernet, which
starts with the signaling mechanisms used for 100 megabit ethernet,
raised the carrier frequency from 100 megahertz to 125 megahertz,
but then does phase encoding and constellation correction that extracts
10 databits from each 16 signal values decoded out of the wave.
--
"law -- it's a commodity"
-- Andrew Ryan (The Globe and Mail, 2005/11/26)

Walter Roberson, Jan 6, 2006
20. ### Richard HeathfieldGuest

Eric Sosman said:

> Jordan Abel wrote:
>>
>> That doesn't mean it's not a "binary value" - which, if you were
>> paying attention, is the claim you are attempting to refute.

>
> Thank you, Jordan, for drawing my attention to my
> inattention. Tracing back the thread, though, I find
> this remark from Mark McIntyre:
>
> > No, they operate on a value. Values don't have number bases.
> > That's merely a representation issue.

No, those are my words, not Mark's.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)

Richard Heathfield, Jan 6, 2006