Cross-platform way to pack (int + flags) to unsigned int

Alex J · Jun 10, 2013

Hi all,

Given two values: int X and int F and assuming that
(X << 2) >> 2 == X and F is a two-bit value write cross-platform code to "pack" X and F to unsigned int.

I've solved this as follows.
Packing:
unsigned int P = (unsigned int) ((X << 2) | F);

Unpacking:
int F = ((int) P) & 3;
int A = ((int) P) >> 2;

I know that both packing and unpacking code is not valid from ISO C point of view but still the question is - should I care about it?

The only compiler I care about is GCC ver.>4 and its targets, e.g. windows, linux, mac os x.

How this problem can be solved in a truly cross-platform way? Especially assuming the packed data will be read on the multiple platforms and sizeof(int) == sizeof(unsigned int) == 32 on all of these platforms.

P.S.:
#include <stdio.h>
#include <stdlib.h>

void check_int(int a, int f) {
    int a1;
    int f1;
    unsigned int p = (unsigned int) (a << 2) | f;

    a1 = ((int) p) >> 2;
    f1 = ((int) p) & 3;

    if ((a1 != a) || (f1 != f)) {
        fprintf(stderr, "a1(%d) != a(%d) || f1(%d) != f(%d)\n", a1, a, f1, f);
        exit(-1);
    } else {
        fprintf(stdout, "OK: %d, %d\n", a, f);
    }
}

int main() {
    check_int(1, 0);
    check_int(144555666, 3);
    check_int(-1, 2);
    check_int(-222333444, 1);
    return 0;
}

Paul N · Jun 10, 2013

Hi all,

Given two values: int X and int F and assuming that
(X << 2) >> 2 == X and F is a two-bit value write cross-platform code to "pack" X and F to unsigned int.

I've solved this as follows.
Packing:
unsigned int P = (unsigned int) ((X << 2) | F);

Unpacking:
int F = ((int) P) & 3;
int A = ((int) P) >> 2;

I know that both packing and unpacking code is not valid from ISO C point of view but still the question is - should I care about it?

I'm not an expert, but I thought that messing around with unsigned
values is perfectly safe. So I would be inclined to write:

Packing:
unsigned int P = (( (unsigned int) X << 2) | (unsigned int) F );

Unpacking:
int F = (int) (P & 3);
int A = (int) (P >> 2);

But there are experts in this group who can give a more authoritative
reply.
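
For illustration, here is a minimal sketch of the all-unsigned approach suggested above, wrapped in helper functions whose names (pack_unsigned, unpack_x_unsigned, unpack_f) are made up for this example. Every operation in it is well defined, but note what it shows: the unsigned right shift fills the top bits with zeros, so a negative X does not come back as the same value. The sketch assumes unsigned int has at least 32 bits and that F is already in the range 0..3.

```c
#include <assert.h>

/* Shift on an unsigned value, where the behavior is fully defined.
   Assumption: f is already a two-bit value (0..3). */
static unsigned int pack_unsigned(int x, int f)
{
    assert(f >= 0 && f <= 3);
    return ((unsigned int)x << 2) | (unsigned int)f;
}

/* The unsigned right shift zero-fills the top two bits,
   so the sign of a negative x is not restored here. */
static int unpack_x_unsigned(unsigned int p)
{
    return (int)(p >> 2);
}

static int unpack_f(unsigned int p)
{
    return (int)(p & 3u);
}
```

Non-negative values of X that fit in 30 bits survive the round trip; negative values need an extra step such as an offset or explicit sign extension.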

Barry Schwarz · Jun 10, 2013

Hi all,

Given two values: int X and int F and assuming that
(X << 2) >> 2 == X and F is a two-bit value write cross-platform code to "pack" X and F to unsigned int.

I've solved this as follows.
Packing:
unsigned int P = (unsigned int) ((X << 2) | F);

Unpacking:
int F = ((int) P) & 3;
int A = ((int) P) >> 2;

I know that both packing and unpacking code is not valid from ISO C point of view but still the question is - should I care about it?

In what way do you think it is invalid?

The only compiler I care about is GCC ver.>4 and its targets, e.g. windows, linux, mac os x.

How this problem can be solved in a truly cross-platform way? Especially assuming the packed data will be read on the multiple platforms and sizeof(int) == sizeof(unsigned int) == 32 on all of these platforms.

How is the data being transferred between platforms, as binary or
text? If text, you should not have a problem.

If binary, will all the system always have the same endianness? If
one is big-endian, then f will be stored in the last two bits of the
fourth byte. When read on a little endian system, it will look for f
in the last two bits of the first byte.

Lew Pitcher · Jun 10, 2013

In what way do you think it is invalid?

How is the data being transferred between platforms, as binary or
text? If text, you should not have a problem.

Caveat: Text shouldn't be a problem IF both writer and reader agree on the
characterset of the data, the native line termination format of the data,
the block data format of the data (i.e., for IBM mainframe, is the data
Fixed Blocked, Variable Blocked, Variable Blocked Spanned, Undefined or
something else), etc.

If binary, will all the system always have the same endianness? If
one is big-endian, then f will be stored in the last two bits of the
fourth byte. When read on a little endian system, it will look for f
in the last two bits of the first byte.

In other words, a common data exchange format must be decided upon and
agreed to by all writers and readers /before/ the programs are developed,
coded, and tested.

Eric Sosman · Jun 10, 2013

Hi all,

Given two values: int X and int F and assuming that
(X << 2) >> 2 == X and F is a two-bit value write cross-platform code to "pack" X and F to unsigned int.

Let's pause a moment to study the X condition more closely.
If it's meant as a portable ("cross-platform") statement, it
implies 0 <= X && X <= INT_MAX/4 (otherwise, simply evaluating
X<<2 would yield undefined behavior). On the 32-bit systems you
mention below, this means 0 <= X && X <= 0x1FFFFFFF. Keep the
range restriction in mind for what follows.

I've solved this as follows.
Packing:
unsigned int P = (unsigned int) ((X << 2) | F);

Okay; the cast is unnecessary but harmless.

Unpacking:
int F = ((int) P) & 3;
int A = ((int) P) >> 2;

Okay; again, the casts are unnecessary but harmless. (But
see below for a lame excuse that semi-justifies the second one.)

I know that both packing and unpacking code is not valid from ISO C point of view but still the question is - should I care about it?

As long as the range restrictions hold, there's nothing invalid
about what you've shown. What invalidity are you worried about?

The only compiler I care about is GCC ver.>4 and its targets, e.g. windows, linux, mac os x.

How this problem can be solved in a truly cross-platform way? Especially assuming the packed data will be read on the multiple platforms and sizeof(int) == sizeof(unsigned int) == 32 on all of these platforms.

Depends what you mean by "truly cross-platform." The assumption
of a 32-bit int excludes a few platforms right away, and there may be
a few more where gcc isn't available. Still, a large majority of
"mainstream" platforms meet your requirements.

There's another problem lurking, though: Endianness. If the 32-bit
int is composed of four 8-bit bytes, there are 4! = 24 ways to arrange
those bytes;[*] two arrangements ("Big-Endian" and "Little-Endian") are
popular today, and at least one more ("Middle-Endian") has been seen
in the past. If X and F are one million and one, respectively, you'll
pack them as the value 4000001, forming the four bytes 00 3D 09 01 (in
hex). A Big-Endian machine would transmit or store these with the 00
first and the 01 last, but a Little-Endian machine would do things the
other way around. So if one of them writes the value (in its native
order) and the other reads it (in *its* native order), 4000001 will
be misinterpreted as 17382656, from which you'll extract X = 4345664
and F = 0. The fidelity leaves a little to be desired!
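
The arithmetic in the paragraph above can be checked with a short sketch. The helper names swap_bytes32 and check_example are invented for this illustration, and uint32_t from <stdint.h> is used only to pin the width to 32 bits; reversing the four bytes models what happens when one machine writes in its native order and a machine of the opposite endianness reads the same bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the order of the four bytes of a 32-bit value. */
static uint32_t swap_bytes32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Reproduce the numbers from the example: X = 1000000, F = 1. */
static void check_example(void)
{
    uint32_t packed  = ((uint32_t)1000000u << 2) | 1u;
    uint32_t misread = swap_bytes32(packed);

    assert(packed == 4000001u);         /* bytes 00 3D 09 01 */
    assert(misread == 17382656u);       /* bytes 01 09 3D 00 */
    assert((misread >> 2) == 4345664u); /* the X that would be extracted */
    assert((misread & 3u) == 0u);       /* the F that would be extracted */
}
```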

That's not to say these problems can't be dealt with, just that
you'd better give them some thought. See the FAQ.

[*] Actually, there are 32! ~= 2.6E35 ways to arrange the bits.
Consider yourself fortunate that nobody's quite that perverse. Yet.

P.S.:
#include <stdio.h>
#include <stdlib.h>

void check_int(int a, int f) {
int a1;
int f1;
unsigned int p = (unsigned int) (a << 2) | f;

a1 = ((int) p) >> 2;
f1 = ((int) p) & 3;

if ((a1 != a) || (f1 != f)) {
fprintf(stderr, "a1(%d) != a(%d) || f1(%d) != f(%d)\n", a1, a, f1, f);
exit(-1);

Don't Do That. Use EXIT_FAILURE. (Does anybody *know* where
this exit(-1) meme got started? Does anybody know of *any* system
on which a -1 exit status survives unchanged all the way to the point
where an invoker could examine it? I think it *might* have worked
on VMS -- but it would have meant "success" if it did.)

} else {
fprintf(stdout, "OK: %d, %d\n", a, f);
}
}

int main() {
check_int(1, 0);
check_int(144555666, 3);
check_int(-1, 2);
check_int(-222333444, 1);

The final two tests are on shaky ground, as they violate the
range restrictions mentioned earlier. (On the other hand, they
also -- sort of -- justify some of the casts you've written: If
X<<2 with X negative doesn't explode *and* (int)p with p outside
the range of int doesn't explode *and* (int)p>>2 with (int)p
negative doesn't explode, then the cast *might* save the day.
But I wouldn't call that "truly cross-platform.")

return 0;
}

Another approach might be to use a struct with bit-fields:

struct packed {
int X : 30;
unsigned int F : 2;
};

This doesn't solve the representation issues -- if anything, it
makes them trickier -- but it relaxes the range restriction to
permit negative X'es.
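
A minimal sketch of how the bit-field variant might be used, assuming an int of at least 32 bits. Two things here are assumptions of the sketch rather than part of the suggestion above: the helper make_packed is a made-up name, and X is declared as signed int instead of plain int, because whether a plain int bit-field is signed is implementation-defined.

```c
#include <assert.h>

struct packed {
    signed int   X : 30;   /* explicitly signed: range -2^29 .. 2^29 - 1 */
    unsigned int F : 2;    /* two flag bits: 0 .. 3 */
};

/* Build a struct packed, checking that both values fit their fields. */
static struct packed make_packed(int x, unsigned int f)
{
    struct packed p;

    assert(x >= -(1 << 29) && x < (1 << 29));
    assert(f <= 3u);
    p.X = x;
    p.F = f;
    return p;
}
```

Reading p.X and p.F gives the values back on the same machine; the layout of the bits inside the struct is still up to the compiler, so this does not define a storage or wire format.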

Alex J · Jun 11, 2013

On 6/10/2013 3:27 PM, Alex J wrote:

snip...
Let's pause a moment to study the X condition more closely.
If it's meant as a portable ("cross-platform") statement, it
implies 0 <= X && X <= INT_MAX/4 (otherwise, simply evaluating
X<<2 would yield undefined behavior). On the 32-bit systems you
mention below, this means 0 <= X && X <= 0x1FFFFFFF. Keep the
range restriction in mind for what follows.

Yes, you're absolutely right. But I need signed types too.

Okay; the cast is unnecessary but harmless.

Okay; again, the casts are unnecessary but harmless. (But
see below for a lame excuse that semi-justifies the second one.)

As long as the range restrictions hold, there's nothing invalid
about what you've shown. What invalidity are you worried about?

ISO C99 (6.5.7/4) - undefined behavior for left shift for negative value.

Depends what you mean by "truly cross-platform." The assumption
of a 32-bit int excludes a few platforms right away, and there may be
a few more where gcc isn't available. Still, a large majority of
"mainstream" platforms meet your requirements.

There's another problem lurking, though: Endianness.

Yes, you're right. I am aware of it and I planned to "document" a little-endian representation of the transmitted binary data (as on x86).

That's not to say these problems can't be dealt with, just that
you'd better give them some thought. See the FAQ.

[*] Actually, there are 32! ~= 2.6E35 ways to arrange the bits.
Consider yourself fortunate that nobody's quite that perverse. Yet.

P.S.:
#include <stdio.h>
#include <stdlib.h>

void check_int(int a, int f) {
int a1;
int f1;
unsigned int p = (unsigned int) (a << 2) | f;

a1 = ((int) p) >> 2;
f1 = ((int) p) & 3;

if ((a1 != a) || (f1 != f)) {
fprintf(stderr, "a1(%d) != a(%d) || f1(%d) != f(%d)\n", a1, a, f1, f);
exit(-1);

Don't Do That. Use EXIT_FAILURE. (Does anybody *know* where
this exit(-1) meme got started? Does anybody know of *any* system
on which a -1 exit status survives unchanged all the way to the point
where an invoker could examine it? I think it *might* have worked
on VMS -- but it would have meant "success" if it did.)

Thanks for pointing that out.

The final two tests are on shaky ground, as they violate the
range restrictions mentioned earlier. (On the other hand, they
also -- sort of -- justify some of the casts you've written: If
X<<2 with X negative doesn't explode *and* (int)p with p outside
the range of int doesn't explode *and* (int)p>>2 with (int)p
negative doesn't explode, then the cast *might* save the day.
But I wouldn't call that "truly cross-platform.")

Yes, you're right. But I need signed integers.
May be there is a reliable way to transform to/from a packed binary number representation - i.e. flags + number (e.g. network byte order)?
After quick googling I did not find any and now I believe I shouldn't do it..

I need a quick loading and saving the big packs of binary data on the same platform (writing a big array of the unsigned ints) so I believe I should provide a special converter for big endian platforms. Convertation will be the rare though theoretically possible case so I do not care about its speed and memory consumption.

Is it sufficient to have two converters: one for little->big endian format converter and big->little endian format converter?

AFAIK all the known 32-bit platform (well, better to say platforms with 32-bit ints) with same endianess share the *same* binary representation of ints and all the bitwise operations on integer numbers has the same effect? Oh, I forgot to mention that at the moment I care of GCC only but support for the other modern compilers - msvc, icc would be nice.

If it is true at least I can rely on the packing/unpacking operations I specified above for both big and little endian platforms and write converters that aware about endianess. Of course endianess information will be encoded in the header of the transmitted binary representation.

Another approach might be to use a struct with bit-fields:

struct packed {
int X : 30;
unsigned int F : 2;
};

This doesn't solve the representation issues -- if anything, it
makes them trickier -- but it relaxes the range restriction to
permit negative X'es.

Thank you and all who answered.

Alex J · Jun 11, 2013

[snip]
Another approach might be to use a struct with bit-fields:

struct packed {
int X : 30;
unsigned int F : 2;
};

This doesn't solve the representation issues -- if anything, it
makes them trickier -- but it relaxes the range restriction to
permit negative X'es.

I heard that bit fields are non-portable and there is no guarantee that compiler will not apply some alignment to the struct that's why I didn't use it.

I am probably wrong but even with pragma pack(1) struct is not guaranteed to be 32-bit size or simply said sizeof(struct packed) will not always be 4. Yet I'm not sure on that.

Please correct me if I'm wrong.

James Kuyper · Jun 11, 2013

[snip]
Another approach might be to use a struct with bit-fields:

struct packed {
int X : 30;
unsigned int F : 2;
};

This doesn't solve the representation issues -- if anything, it
makes them trickier -- but it relaxes the range restriction to
permit negative X'es.

I heard that bit fields are non-portable and there is no guarantee that compiler will not apply some alignment to the struct that's why I didn't use it.

That's what he meant when he said "it makes them trickier".

I am probably wrong but even with pragma pack(1) struct is not guaranteed to be 32-bit size or simply said sizeof(struct packed) will not always be 4. Yet I'm not sure on that.

#pragma pack itself is not standard, so the standard guarantees nothing
about how it works on those implementations which support it - and the
ones that do support it do so with several different incompatible
syntaxes for specifying the way the structures are packed.

To avoid undefined behavior during packing, you'll have to transform
valid values for X into non-negative numbers, convert to unsigned, and then
perform the left shift. For unpacking, you need to perform the
inverse operations in the opposite order. There are several different
ways to make the numbers non-negative. One of the simplest is:

#define INT30_MIN (-(1<<29)) /* negate after shifting; -1<<29 would itself left-shift a negative value */

// Packing
p = (unsigned)(x-INT30_MIN) << 2 | f;

// Unpacking
f = p & 3;
x = (int)(p>>2)+INT30_MIN;

The code would have to be a bit more complicated if you want it to work
on systems where int and unsigned int are not both 32 bit types. You'll
still have to deal with byte ordering when reading or writing the
packed values.

Eric Sosman · Jun 11, 2013

[...]
As long as the range restrictions hold, there's nothing invalid
about what you've shown. What invalidity are you worried about?

ISO C99 (6.5.7/4) - undefined behavior for left shift for negative value.

You began with

Given two values: int X and int F and assuming that
(X << 2) >> 2 == X and F is a two-bit value

... which means either that X is non-negative and not too large,
or that you're *not* worried about 6.5.7p4! If 6.5.7p4 is in
fact a concern, you'll need to revise your assumption about X.

[...]
May be there is a reliable way to transform to/from a packed binary number representation - i.e. flags + number (e.g. network byte order)?
After quick googling I did not find any and now I believe I shouldn't do it.

One fully-portable approach would be to add a suitable offset
to X before encoding, ensuring that what's shifted is non-negative:

unsigned int encoded = (X + OFFSET) << 2 | F;

Then you subtract the same offset when extracting:

int decodedX = (encoded >> 2) - OFFSET;
int decodedF = encoded & 3;

I need a quick loading and saving the big packs of binary data on the same platform (writing a big array of the unsigned ints) so I believe I should provide a special converter for big endian platforms. Convertation will be the rare though theoretically possible case so I do not care about its speed and memory consumption.

You've confused me. When you say "on the same platform," it seems
that you want code that will work everywhere, but that packing and
extracting all happen on the same system; in this case, endianness is
not an issue. But when you talk about a "converter for big endian
platforms," it seems that data exchange between variegated platforms
is in fact needed ...

Either way, it's easy to read and write the data in a consistent
"wire format" regardless of the host platform's endianness. Here's
how you could write a four-byte value in Little-Endian order:

// Error-checking omitted for brevity
unsigned int value = ...;
putc(value & 0xFF, stream);
putc((value >> 8) & 0xFF, stream);
putc((value >> 16) & 0xFF, stream);
putc((value >> 24) & 0xFF, stream);

If you're certain of 32-bitness you could omit the final &0xFF, but
any speedup would surely be negligible compared to the I/O. Then
you can read the bytes back the same way:

unsigned int b0 = getc(stream);
unsigned int b1 = getc(stream);
unsigned int b2 = getc(stream);
unsigned int b3 = getc(stream);
unsigned int value = (b3 << 24) + (b2 << 16) + (b1 << 8) + b0;

... or even

int X = (b3 << 22) + (b2 << 14) + (b1 << 6) + (b0 >> 2)
- OFFSET;
int F = b0 & 3;
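
The same little-endian wire format can also be sketched against a byte buffer instead of a stream, which is convenient when a whole array is written in one call. The names store_le32, load_le32 and check_wire_format are made up for this sketch, and uint32_t is used on the assumption that an exact 32-bit type is available.

```c
#include <assert.h>
#include <stdint.h>

/* Write value into out[0..3], least significant byte first. */
static void store_le32(unsigned char out[4], uint32_t value)
{
    out[0] = (unsigned char)(value & 0xFFu);
    out[1] = (unsigned char)((value >> 8) & 0xFFu);
    out[2] = (unsigned char)((value >> 16) & 0xFFu);
    out[3] = (unsigned char)((value >> 24) & 0xFFu);
}

/* Rebuild the value from in[0..3], independent of host byte order. */
static uint32_t load_le32(const unsigned char in[4])
{
    return (uint32_t)in[0] |
           ((uint32_t)in[1] << 8) |
           ((uint32_t)in[2] << 16) |
           ((uint32_t)in[3] << 24);
}

/* Round-trip one value and check the byte layout. */
static void check_wire_format(void)
{
    unsigned char buf[4];

    store_le32(buf, 0x003D0901u);
    assert(buf[0] == 0x01 && buf[1] == 0x09);
    assert(buf[2] == 0x3D && buf[3] == 0x00);
    assert(load_le32(buf) == 0x003D0901u);
}
```

Because the bytes are produced by shifting and masking rather than by copying the object representation, the result is the same on big-endian and little-endian hosts.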

Is it sufficient to have two converters: one for little->big endian format converter and big->little endian format converter?

As illustrated above, I think it suffices to have zero converters.

AFAIK all the known 32-bit platform (well, better to say platforms with 32-bit ints) with same endianess share the *same* binary representation of ints and all the bitwise operations on integer numbers has the same effect? Oh, I forgot to mention that at the moment I care of GCC only but support for the other modern compilers - msvc, icc would be nice.

This sounds like a digression; I'm not sure what you're driving at.
Negative numbers, maybe? You need to avoid them anyhow, because even if
all the platforms use two's complement (it's been years since I saw one
that didn't) you still need to worry about getting the sign right when
extracting. Right-shifting a negative int is formally undefined; in
practice, some platforms duplicate the sign bit while others introduce
zeros (giving a non-negative result).

If it is true at least I can rely on the packing/unpacking operations I specified above for both big and little endian platforms and write converters that aware about endianess. Of course endianess information will be encoded in the header of the transmitted binary representation.

Or just read and write the same "wire format" everywhere.

Eric Sosman · Jun 11, 2013

[snip]
Another approach might be to use a struct with bit-fields:

struct packed {
int X : 30;
unsigned int F : 2;
};

This doesn't solve the representation issues -- if anything, it
makes them trickier -- but it relaxes the range restriction to
permit negative X'es.

I heard that bit fields are non-portable and there is no guarantee that compiler will not apply some alignment to the struct that's why I didn't use it.

Like much of C, bit-fields are portable within limits. Every C
compiler supports bit-fields, with widths up to at least the width
of an int -- Since you're assuming 32-bit ints, the :30 bit-field is
fine. The compiler has a lot of freedom in how it chooses to store
the bits (which is why I said the representation issues get trickier),
but if you're only worried about intra-machine storage that's not a
problem.

I am probably wrong but even with pragma pack(1) struct is not guaranteed to be 32-bit size or simply said sizeof(struct packed) will not always be 4. Yet I'm not sure on that.

Correct: As I said, the compiler has a lot of freedom. As for
#pragma pack(1) -- Well, once you've uttered a non-Standard #pragma,
*nothing* is guaranteed by the C language.

Tim Rentsch · Jun 11, 2013

Alex J said:
On 6/10/2013 3:27 PM, Alex J wrote:

snip...
Let's pause a moment to study the X condition more closely.
If it's meant as a portable ("cross-platform") statement, it
implies 0 <= X && X <= INT_MAX/4 (otherwise, simply evaluating
X<<2 would yield undefined behavior). On the 32-bit systems you
mention below, this means 0 <= X && X <= 0x1FFFFFFF. Keep the
range restriction in mind for what follows.

Yes, you're absolutely right. But I need signed types too. [snip]

The code examples suggested in other responses have bugs
in them. Here is an easy and portable way to do what
you want to do (disclaimer: typed in, not compiled):

#include <limits.h>

#if UINT_MAX <= INT_MAX
# error sorry, this platform is screwy.
#endif

#define OFFSET (INT_MAX/4 + 1)

unsigned
pack( int x, int f ){
    return (unsigned)(x+OFFSET) << 2 | f & 0x3u;
}

void
unpack( unsigned u, int *x, int *f ){
    *x = (int)(u>>2) - OFFSET, *f = u & 0x3;
}

This doesn't address the issue of how to transmit the
unsigned value reliably, but it looks like you know
what you're going to do about that.
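
As a rough check of the offset approach, the sketch below restates the pack/unpack pair with explicit parentheses and adds two helpers that are not part of the post: x_in_range, which states the range of X the encoding can hold, and round_trips, which packs and unpacks a pair and compares. Both helper names are invented here. The test values are the four from the original question.

```c
#include <assert.h>
#include <limits.h>

#if UINT_MAX <= INT_MAX
# error unsigned int must have more value bits than int
#endif

#define OFFSET (INT_MAX / 4 + 1)

/* X must satisfy -OFFSET <= x < OFFSET so that x + OFFSET is non-negative
   and still fits after the two-bit shift. */
static int x_in_range(int x)
{
    return x >= -OFFSET && x < OFFSET;
}

static unsigned pack(int x, int f)
{
    assert(x_in_range(x));
    return ((unsigned)(x + OFFSET) << 2) | ((unsigned)f & 0x3u);
}

static void unpack(unsigned u, int *x, int *f)
{
    *x = (int)(u >> 2) - OFFSET;
    *f = (int)(u & 0x3u);
}

/* Returns 1 if (x, f) survives a pack/unpack round trip. */
static int round_trips(int x, int f)
{
    int x1, f1;

    unpack(pack(x, f), &x1, &f1);
    return x1 == x && f1 == f;
}
```

With a 32-bit int, OFFSET is 2^29, so X may range from -536870912 to 536870911, which covers both negative test values.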

Alex J · Jun 11, 2013

On 6/11/2013 4:35 AM, Alex J wrote:
[snip]
You've confused me. When you say "on the same platform," it seems
that you want code that will work everywhere, but that packing and
extracting all happen on the same system; in this case, endianness is
not an issue. But when you talk about a "converter for big endian
platforms," it seems that data exchange between variegated platforms
is in fact needed ...

I'm sorry for not being clear. Priority one for me is a code that behaves in the expected way on all the target platforms but I didn't mean the *same* binary representation of ints on all the target platforms (so that the serialized data may be freely transmitted to the other platform via some remote protocol and then processed as is).

[snip]

Alex J · Jun 11, 2013

[snip]
#include <limits.h>

#if UINT_MAX <= INT_MAX
# error sorry, this platform is screwy.
#endif

#define OFFSET (INT_MAX/4 + 1)

unsigned
pack( int x, int f ){
return (unsigned)(x+OFFSET) << 2 | f & 0x3u;
}

void
unpack( unsigned u, int *x, int *f ){
*x = (int)(u>>2) - OFFSET, *f = u & 0x3;
}

Thank you for the sample.

This doesn't address the issue of how to transmit the
unsigned value reliably, but it looks like you know
what you're going to do about that.

Yep, that's right. Thank you and thanks others who pointed to the offset trick.

glen herrmannsfeldt · Jun 11, 2013

Tim Rentsch said:
(snip)

Yes, you're absolutely right. But I need signed types too. [snip]

The code examples suggested in other responses have bugs
in them. Here is an easy and portable way to do what
you want to do (disclaimer: typed in, not compiled):
(snip)
#define OFFSET (INT_MAX/4 + 1)

unsigned
pack( int x, int f ){
return (unsigned)(x+OFFSET) << 2 | f & 0x3u;

I suppose that is true, but for values in range, and the
appropriate arithmetic shift operations, shouldn't it also
work with signed int, arithmetic shift of those signed int
values, and the appropriate inverse?

The OP assured us that (x<<2)>>2==x, which should be true for
in range x and arithmetic shift on sign magnitude, ones
complement, and twos complement machines, not that he is likely
to run into the first two.

-- glen

James Kuyper · Jun 11, 2013

Tim Rentsch said:

(snip)

Yes, you're absolutely right. But I need signed types too. [snip]

The code examples suggested in other responses have bugs
in them. Here is an easy and portable way to do what
you want to do (disclaimer: typed in, not compiled):
(snip)
#define OFFSET (INT_MAX/4 + 1)

unsigned
pack( int x, int f ){
return (unsigned)(x+OFFSET) << 2 | f & 0x3u;

I suppose that is true, but for values in range, and the
appropriate arithmetic shift operations, shouldn't it also
work with signed int, arithmetic shift of those signed int
values, and the appropriate inverse?

No, the OP told us that some values of x for which x<0 are in range, and
the behavior of x<<2 is undefined for such values.

The OP assured us that (x<<2)>>2==x, which should be true for

He said that he was assuming this, not that he had verified it. Since
the behavior is undefined, no matter how many tests he might have
performed to verify that relationship, a fully conforming implementation
is free to generate code that produces very different results on the
very next test.

More to the point, because the behavior is undefined, a fully conforming
implementation is free to generate code for

temp = x<<2, temp>>2 == x

which fails for x<0, even though the "equivalent" code with no explicit
temporary variable apparently "succeeded". It could, for instance,
perform optimizations that invalidated the supposed equivalence of the
two pieces of code; so long as it's invalidated only for those values
where the behavior is undefined.

in range x and arithmetic shift on sign magnitude, ones
complement, and twos complement machines, not that he is likely
to run into the first two.

Because the behavior of x<<2 for x<0 is undefined, I've never bothered
finding out what actual behavior occurs when such code is executed.
You're making some assumptions about that behavior, and your assumptions
might be right for the implementations you're familiar with, and maybe
even for the ones that Alex J needs his code to work on. However, I
doubt that the Committee would have made the behavior undefined if that
were universally true.

Eric Sosman · Jun 11, 2013

[...]
Because the behavior of x<<2 for x<0 is undefined, I've never bothered
finding out what actual behavior occurs when such code is executed.
You're making some assumptions about that behavior, and your assumptions
might be right for the implementations you're familiar with, and maybe
even for the ones that Alex J needs his code to work on. However, I
doubt that the Committee would have made the behavior undefined if that
were universally true.

Right. On every platform I've used (well, except perhaps
one from Long Ago and before C was invented), the left-shift
would simply have "lost" the extra copies of the sign bit. A
bigger problem arises on the inverse: When right-shifting to
extract the negative value, some right-shifts would fill with
copies of the sign bit (preserving negativity) while others
would fill with zeroes (exhibiting a positive attitude). Both
operations are formally undefined by C; I think it's the latter
that poses the greater practical problem.

Moral: Don't Do That.

Philip Lantz · Jun 12, 2013

Eric said:
Right. On every platform I've used (well, except perhaps
one from Long Ago and before C was invented), the left-shift
would simply have "lost" the extra copies of the sign bit. A
bigger problem arises on the inverse: When right-shifting to
extract the negative value, some right-shifts would fill with
copies of the sign bit (preserving negativity) while others
would fill with zeroes (exhibiting a positive attitude). Both
operations are formally undefined by C.

Actually, the result of a right shift of a negative value is
implementation defined; a left shift of a negative value is undefined
behavior. (A distinction without a difference, I know.)
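
One way to sidestep both shifts on signed values is to do all the shifting on unsigned int and restore the sign arithmetically. The sketch below is not taken from any post in this thread; it uses the common XOR-and-subtract sign-extension idiom, which amounts to the offset trick applied on the unpacking side. The names FIELD_BITS, sign_extend_30, pack_twos and unpack_x are hypothetical, and unsigned int is assumed to have at least 32 bits.

```c
#include <assert.h>

#define FIELD_BITS 30u

/* Interpret the low 30 bits of v as a two's complement number.
   No signed shift is involved, so nothing here is undefined or
   implementation-defined. */
static int sign_extend_30(unsigned int v)
{
    const unsigned int mask = (1u << FIELD_BITS) - 1u;
    const unsigned int sign = 1u << (FIELD_BITS - 1u);

    v &= mask;
    return (int)(v ^ sign) - (int)sign;
}

/* Conversion of a negative int to unsigned is defined (modulo 2^N),
   and so is the shift on the unsigned result. */
static unsigned int pack_twos(int x, int f)
{
    assert(f >= 0 && f <= 3);
    return ((unsigned int)x << 2) | (unsigned int)f;
}

static int unpack_x(unsigned int p)
{
    return sign_extend_30(p >> 2);
}
```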

Eric Sosman · Jun 12, 2013

Actually, the result of a right shift of a negative value is
implementation defined; a left shift of a negative value is undefined
behavior. (A distinction without a difference, I know.)

Oops! Thanks for the correction.

Phil Carmody · Jun 12, 2013

Barry Schwarz said:
In what way do you think it is invalid?

Well, if validity is measured in terms of portability, then the right
shift of (signed) int, being non-portable, makes it invalid. It's a
valid criterion - not working on some architectures is a pretty good
reason to call the code invalid.

Phil

Tim Rentsch · Jun 15, 2013

Philip Lantz said:
Actually, the result of a right shift of a negative value is
implementation defined; a left shift of a negative value is undefined
behavior. (A distinction without a difference, I know.)

Actually there's a big difference. It may be rare that the
difference has a significant effect, but it can. Doing a left
shift of a negative value can easily produce a trap representation
(obviously only in implementations that have trap representations
for signed integers). This may not occur often, but certainly it
is not unheard of. So the 'undefined behavior' consequences are
not just imaginary. By contrast, a right shift of a negative
value must produce some valid value -- it can't just blow up the
way a left shift of negative values can.
