bitshifting

Alef Veld

Hi everyone,
I'm trying to get my head around the concept of bit shifting, and there
are some things that seem to elude me. Many thanks for your answers.

1. When shifting bits, is there a certain bit size you are working on
(a byte or more?), or can you define this yourself, as with a bit field in
a struct? I mean, I don't have to define a bit field when I shift a
number like 4; I can just do it. How many bits am I working with here?
Is that 32 bits for an int and a byte for a char?
2. How far can you shift or divide? To the maximum power of 2
within the boundaries of an unsigned int? (I'm only doing shifts on
unsigned ints for now.)
3. Is the MSB always the left byte and the LSB always the right byte
(probably differs with endianness)? Why are these two important anyway?
4. When I shift 4 times, am I performing a decimal_number * 4 or am I
doing a decimal_number * decimal_number * decimal_number *
decimal_number?
5. What does this code do ? :
if(sscanf(t_addr, "%d.%d.%d.%d",
          &octet1, &octet2, &octet3, &octet4) < 1)
    exit_error(-1);
return((octet1 << 24) | (octet2 << 16) | (octet3 << 8) | octet4);

I'm focussing on the last line here. The shifts suggest to me the
layout of a 32-bit IP address, but if one entered 192 for the
first octet, would that number not also get shifted by 24, giving 4608?
That does not make a lot of sense to me, and I don't quite understand
the reasoning behind it.

I'm used to working with IP addresses, so for me I count from 128 and
start at 1. When working with more than 1 byte, do you just go on to 256,
512, 1024, etc.? I tried a little program and it gave me this list; after
the last number it went to 0 again. Is that the maximum number of
multiplications that can be done?

2048
4096
8192
16384
32768
65536
131072
262144
524288
1048576
2097152
4194304
8388608
16777216
33554432
67108864
134217728
268435456
536870912
1073741824
2147483648
0

Many thanks everyone
 
Ian Collins

Alef said:
Hi everyone,
I'm trying to get my head around the concept of bit shifting, and there
are some things that seem to elude me. Many thanks for your answers.
If you want to understand bit shifting for unsigned integers, write the
binary number on paper and do the shifting by hand.
1. When shifting bits, is there a certain bit size you are working on
(a byte or more?), or can you define this yourself, as with a bit field in a
struct? I mean, I don't have to define a bit field when I shift a number
like 4; I can just do it. How many bits am I working with here? Is that
32 bits for an int and a byte for a char?

That depends on the type you are shifting.
2. How far can you shift or divide? To the maximum power of 2 within
the boundaries of an unsigned int? (I'm only doing shifts on unsigned
ints for now.)

For an unsigned int, you can shift up to the number of bits in the type.
3. Is the MSB always the left byte and the LSB always the right byte
(probably differs with endianness)? Why are these two important anyway?

For most uses this doesn't matter; what matters is the value rather than
the representation.
4. When I shift 4 times, am I performing a decimal_number * 4 or am I
doing a decimal_number * decimal_number * decimal_number * decimal_number?

When you shift 4 times, you are multiplying or dividing by 2^4, which is 16.
5. What does this code do ? :
return((octet1 << 24) | (octet2 << 16) | (octet3 << 8) | octet4);

I'm focussing on the last line here. The shifts suggest to me the
layout of a 32-bit IP address, but if one entered 192 for the
first octet, would that number not also get shifted by 24, giving 4608?

Look at it this way,

192 = 0x000000C0

Shift left 24 bits

0xc0000000
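
To make the hex arithmetic above concrete, here is a minimal sketch in C. The helper name `shift_into_top_byte` is made up for illustration, and the sketch assumes the C99 type `uint32_t` from `<stdint.h>` is available.

```c
#include <stdint.h>

/* Hypothetical helper: move an octet (0..255) into bits 24..31 of a
   32-bit value. Converting to uint32_t first keeps the operand
   unsigned, so no bit is ever shifted into a sign bit. */
uint32_t shift_into_top_byte(unsigned octet)
{
    return (uint32_t)(octet & 0xFFu) << 24;
}
```

`shift_into_top_byte(192)` gives 0xC0000000, which is 3221225472, that is 192 * 2^24. The 4608 in the question is 192 * 24, which is a different operation.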
 
thomas.mertes

If you want to understand bit shifting for unsigned integers, write the
binary number on paper and do the shifting by hand.


That depends on the type you are shifting.


For an unsigned int, you can shift up to the number of bits in the type.

According to C89: The result (of a shift) is undefined if
the right operand is negative, or greater than or equal to
the number of bits in the left expression's type.

That means if 'number' is an unsigned int which is 32 bits
wide, the result of
number << 32
and
number >> 32
is undefined.
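
One way to stay clear of the undefined cases is to check the shift count first. The sketch below is only an illustration: `checked_shl` is a made-up name, and it assumes unsigned int has no padding bits, so that `sizeof(unsigned) * CHAR_BIT` equals its width.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: shift left only when the count is in range.
   Returns false and leaves *out untouched when the shift would be
   undefined (negative count, or count >= number of bits).
   Assumption: unsigned int has no padding bits. */
bool checked_shl(unsigned value, int count, unsigned *out)
{
    const int width = (int)(sizeof(unsigned) * CHAR_BIT);

    if (count < 0 || count >= width)
        return false;
    *out = value << count;
    return true;
}
```

With a 32-bit unsigned int, counts 0 to 31 succeed and a count of 32 is rejected.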

Greetings Thomas Mertes

Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
 
Ben Bacarisse

Alef Veld said:
1. When shifting bits, is there a certain bit size you are working on
(a byte or more?), or can you define this yourself, as with a bit field in
a struct? I mean, I don't have to define a bit field when I shift a
number like 4; I can just do it. How many bits am I working with
here? Is that 32 bits for an int and a byte for a char?

There are two cases. If the expression being shifted is of a type
"smaller" than plain int, the value is promoted and it is the
promoted value that is shifted. When the expression being shifted is
"bigger" than plain int the type is left alone.

I put "smaller" and "bigger" in quotes because the standard uses quite
a lengthy set of rules based on a notion called the conversion rank of
the type. The upshot is that the value is preserved including, of
course, the sign but not necessarily the signedness. It is easy to
think that:

unsigned char x = 0xff;
x << 24

will be shifting an unsigned int but it won't. Because the type int
can represent all the possible values of unsigned char it is a plain
(signed) int that gets shifted. This matters, because the result of a
signed shift can be undefined (and may well be in this case).

If x had been declared unsigned int, the shift would have been an
unsigned one because unsigned int has the same conversion rank as int.
Shifts of "bigger" types are done without any conversion so 1ul << 7
is a shift of an unsigned long value.
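
A small sketch of the usual workaround, converting before the shift. The function name `top_byte` is made up for illustration; the only fact relied on is that unsigned long has at least 32 bits.

```c
/* Hypothetical helper. Written as `x << 24`, x would first be
   promoted to (signed) int and the shift would be a signed one.
   Converting to unsigned long first makes it an unsigned shift,
   which is well defined because unsigned long has at least 32 bits. */
unsigned long top_byte(unsigned char x)
{
    return (unsigned long)x << 24;
}
```

`top_byte(0xff)` is 0xFF000000 on any conforming implementation, whereas the result of the uncast `x << 24` depends on the width of int.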
2. How far can you shift or divide? To the maximum power of 2
within the boundaries of an unsigned int? (I'm only doing shifts on
unsigned ints for now.)

With an unsigned type you can shift by up to (but not equal to) the
number of value bits in the type. It is not worth remembering the rules
for signed types, since bit operations on signed types are generally a
bad idea. (The rules are that right shifts of negative values are
implementation-defined, and left shifts are undefined if the resulting
value can't be represented. In the example above, if 255 * pow(2, 24)
is > INT_MAX (probable on a 32-bit machine) the result will be
undefined.)
3. Is the MSB always the left byte and the LSB always the right byte
(probably differs with endianness)? Why are these two important anyway?

Shifts are defined in terms of what they do to the value (multiplying
and dividing by powers of 2). The endianness has no effect at all.
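
A short sketch of the difference between value and representation. Both function names are made up for illustration, and `uint32_t` from C99 `<stdint.h>` is assumed to exist.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the most significant octet, computed from the
   value alone. The answer is the same on every machine. */
unsigned high_octet_by_value(uint32_t v)
{
    return (unsigned)(v >> 24);
}

/* Hypothetical helper: the first byte of the object as it sits in
   memory. This is the only place where endianness shows up. */
unsigned first_octet_in_memory(uint32_t v)
{
    unsigned char bytes[sizeof v];

    memcpy(bytes, &v, sizeof v);
    return bytes[0];
}
```

For 0xC0A80001, `high_octet_by_value` always returns 0xC0. `first_octet_in_memory` returns 0xC0 on a big-endian machine and 0x01 on a little-endian one.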
4. When I shift 4 times, am I performing a decimal_number * 4 or am I
doing a decimal_number * decimal_number * decimal_number *
decimal_number?

A left shift by 4 bits has the effect of multiplying the value being
shifted by 16. It does not, as you might be suggesting, multiply it by
4, nor does it raise it to the 4th power.

(The term "decimal" refers to a way of representing a number. It says
nothing about the value. decimal_number * 4 would be better expressed
as number * 4 -- the "decimal" part is just confusing.)
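
As a rough illustration of the multiply and divide reading of shifts, restricted to unsigned values. The names `times_pow2` and `div_pow2` are made up, and n is assumed to be smaller than the number of bits in unsigned int.

```c
/* Hypothetical helpers. For unsigned operands and n less than the
   width of the type:
     value << n  is  value * 2^n, with bits above the top discarded
     value >> n  is  value / 2^n, with the remainder discarded      */
unsigned times_pow2(unsigned value, unsigned n)
{
    return value << n;
}

unsigned div_pow2(unsigned value, unsigned n)
{
    return value >> n;
}
```

So `times_pow2(5, 4)` is 80 (5 * 16) and `div_pow2(83, 4)` is 5. The discarded top bits also explain a list of doublings that ends in 0: assuming a 32-bit unsigned int, doubling 2147483648 (2^31) would need a 33rd bit, so what is left is 0.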
5. What does this code do ? :
if(sscanf(t_addr, "%d.%d.%d.%d",
          &octet1, &octet2, &octet3, &octet4) < 1)
    exit_error(-1);
return((octet1 << 24) | (octet2 << 16) | (octet3 << 8) | octet4);

Looks like trouble. You can't tell with certainty what the shifts do
because we don't know the types of the octet1-4 variables. They are
set by a sscanf format of %d, so they should be int. In that case the
shifts are signed and you may be in all sorts of trouble. If they are
not int objects, then the sscanf call is wrong.

IP address calculations should be done with unsigned long variables or
with unsigned long casts where needed.
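
A sketch of what that could look like. It is not the original program: `parse_ipv4` is a made-up name, and the error handling (return 0 instead of calling exit_error) is an assumption. It also tightens the sscanf check, since the posted `< 1` test accepts input where only some of the four fields were converted.

```c
#include <stdio.h>

/* Hypothetical replacement for the posted fragment: parse "a.b.c.d"
   into a 32-bit value held in an unsigned long.
   Returns 1 on success, 0 on malformed input. */
int parse_ipv4(const char *text, unsigned long *out)
{
    unsigned o1, o2, o3, o4;
    char trailing;

    /* %u matches the unsigned objects. Exactly four conversions must
       succeed; a fifth means something follows the last octet. */
    if (sscanf(text, "%u.%u.%u.%u%c", &o1, &o2, &o3, &o4, &trailing) != 4)
        return 0;
    if (o1 > 255 || o2 > 255 || o3 > 255 || o4 > 255)
        return 0;

    /* Convert before shifting so every shift is an unsigned long shift. */
    *out = ((unsigned long)o1 << 24) | ((unsigned long)o2 << 16)
         | ((unsigned long)o3 << 8)  |  (unsigned long)o4;
    return 1;
}
```

For "192.168.0.1" this stores 0xC0A80001. Note that the trailing-character check also rejects a string that ends in a newline, so input read with fgets would need trimming first.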
I'm focussing on the last line here. The shifts suggest to me
the layout of a 32-bit IP address, but if one entered 192 for
the first octet, would that number not also get shifted by 24, giving
4608? That does not make a lot of sense to me, and I don't quite
understand the reasoning behind it.

I don't follow. If x << 24 is properly defined it has the effect of
multiplying by pow(2, 24) (2 to the power 24). Typically, that will
put the value of x in the MSB of a 4-byte value.
I'm used to working with IP addresses, so for me I count from 128 and
start at 1. When working with more than 1 byte, do you just go on to 256,
512, 1024, etc.? I tried a little program and it gave me this list; after
the last number it went to 0 again. Is that the maximum number of
multiplications that can be done?

If you post the code, I am sure it can be explained but I can't follow
what you are asking.
 
Ian Collins

According to C89: The result (of a shift) is undefined if
the right operand is negative, or greater than or equal to
the number of bits in the left expression's type.
Good catch. You spotted the deliberate mistake :)
 
Ian Collins

Ben said:
Looks like trouble. You can't tell with certainty what the shifts do
because we don't know the types of the octet1-4 variables. They are
set by a sscanf format of %d, so they should be int. In that case the
shifts are signed and you may be in all sorts of trouble. If they are
not int objects, then the sscanf call is wrong.

IP address calculations should be done with unsigned long variables or
with unsigned long casts where needed.
No, they should be done on a 32-bit unsigned type (uint32_t if you have it).

unsigned long is 64 bits on many 64-bit systems.

This (use of unsigned long for IP addresses) is one of the most common
porting issues when moving from a 32-bit to a 64-bit platform.
If you post the code, I am sure it can be explained but I can't follow
what you are asking.
The OP appears to be confusing shifting by 24 bits with multiplying by 24.
 
Thad Smith

Ian said:
Good catch. You spotted the deliberate mistake :)

I don't see the mistake. "Up to" means less than with an implied starting
point, in this case of 0. Thus shifting up to 32 bits means shifting 0 to
31 bits. "Up to and including" includes the limit.
 
Ian Collins

Thad said:
I don't see the mistake. "Up to" means less than with an implied
starting point, in this case of 0. Thus shifting up to 32 bits means
shifting 0 to 31 bits. "Up to and including" includes the limit.
In mathematical terms maybe, but not in idiomatic English. If a car
claims to seat up to five adults, it seats five, not four.
 
Ben Bacarisse

Ian Collins said:
No, they should be done on a 32-bit unsigned type (uint32_t if you
have it).

I can't see why. I take your point that I could have suggested the C99
stdint types, but then surely uint_least32_t is more portable?
unsigned long is 64 bits on many 64-bit systems.

This (use of unsigned long for IP addresses) is one of the most common
porting issues when moving from a 32-bit to a 64-bit platform.

Again, I can't really see why. If the calculations are properly
written, the extra bits won't matter -- the author has to take into
account that there may be some more bits in the calculation so there
being 32 extra bits should not matter. Am I missing something?
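
A sketch of what "taking the extra bits into account" can mean in practice. The function names are made up for illustration; the point is only the final mask.

```c
/* Hypothetical helpers for 32-bit address arithmetic carried out in
   unsigned long. When unsigned long is wider than 32 bits, ~netmask
   also sets every bit above bit 31, so the result is masked back
   down to 32 bits. */
unsigned long host_part(unsigned long addr, unsigned long netmask)
{
    return addr & ~netmask & 0xFFFFFFFFul;
}

unsigned long broadcast_addr(unsigned long addr, unsigned long netmask)
{
    return (addr | ~netmask) & 0xFFFFFFFFul;
}
```

With the mask in place, `broadcast_addr(0xC0A80001, 0xFFFFFF00)` is 0xC0A800FF whether unsigned long has 32 or 64 bits. Without it, a 64-bit unsigned long would carry 32 set bits above the address.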

There is a general point that badly written bit operations cause
problems when moving between systems with different int widths, but
using an exact unsigned width (like uint32_t) only hides the
non-portability until you move the code to a system that does not have
that exact width.

Obviously I am talking only about calculations. For stuffing IP
addresses into externally imposed structures there is no fully
portable solution (though using byte copying and avoiding int types
altogether often gets close enough).
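
A minimal sketch of the byte-copying approach, assuming the external format wants the four octets most significant first (network byte order). The function names are made up for illustration.

```c
/* Hypothetical helper: write a 32-bit address value into four octets,
   most significant first, without depending on the host's int width
   or endianness. */
void ipv4_to_octets(unsigned long addr, unsigned char out[4])
{
    out[0] = (unsigned char)((addr >> 24) & 0xFFul);
    out[1] = (unsigned char)((addr >> 16) & 0xFFul);
    out[2] = (unsigned char)((addr >> 8) & 0xFFul);
    out[3] = (unsigned char)(addr & 0xFFul);
}

/* Hypothetical helper: the reverse direction. */
unsigned long octets_to_ipv4(const unsigned char in[4])
{
    return ((unsigned long)in[0] << 24) | ((unsigned long)in[1] << 16)
         | ((unsigned long)in[2] << 8)  |  (unsigned long)in[3];
}
```

The four-octet array can then be copied into whatever structure is imposed from outside, and no integer type of an exact width is needed.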
 
Ian Collins

Ben said:
I can't see why. I take your point that I could have suggested the C99
stdint types, but then surely uint_least32_t is more portable?
Well IP (V4) addresses are 32 bit unsigned quantities, so it's highly
unlikely a system with an IP stack lacks 32 bit unsigned integers.
Again, I can't really see why. If the calculations are properly
written, the extra bits won't matter -- the author has to take into
account that there may be some more bits in the calculation so there
being 32 extra bits should not matter. Am I missing something?
More of a case of passing unsigned long to POSIX byte order conversion
functions/macros. These were updated to use the C99 fixed width types.
 
Ben Bacarisse

Ian Collins said:
Well IP (V4) addresses are 32 bit unsigned quantities, so it's highly
unlikely a system with an IP stack lacks 32 bit unsigned integers.

Unless I am misremembering, Honeywell 6000s running GECOS had
TCP/IP at the end of their lives.

More of a case of passing unsigned long to POSIX byte order conversion
functions/macros. These were updated to use the C99 fixed width
types.

POSIX compliance would be beyond such a system, but not TCP/IP itself.
 
Keith Thompson

Ian Collins said:
Well IP (V4) addresses are 32 bit unsigned quantities, so it's highly
unlikely a system with an IP stack lacks 32 bit unsigned integers.
[...]

Unlikely perhaps, but not impossible. The Cray T90 had (has?) a full
Unix operating system, but its C implementation had 8-bit char and
64-bit short, int, and long; it had no 16-bit or 32-bit integer type.

As I recall, a struct in one of the system headers used a 32-bit bit
field for an IP address.
 
Barry Schwarz

In mathematical terms maybe, but not in idiomatic English. If a car
claims to seat up to five adults, it seats five, not four.

Actually only three, except for very small values of five or adults.


Remove del for email
 
