Question regarding the FAQ

mohangupta13

With reference to question 16.7 in the FAQ:

struct mystruct {
    char c;
    long int i32;
    int i16;
} s;

unsigned char *p = buf;

s.c = *p++;

1. s.i32 = (long)*p++ << 24;
2. s.i32 |= (long)*p++ << 16;
3. s.i32 |= (unsigned)(*p++ << 8);
s.i32 |= *p++;

s.i16 = *p++ << 8;
s.i16 |= *p++;

For unpacking a buffer "buf" as shown here, how are all these casts
justified? I mean, in line 2, if *p were equal to something like 128
(assuming 8-bit char), then casting it to long would sign-extend it
(as I understand it), like this:

expression            resulting bit representation
(unsigned char)128    10000000
(long)128             (11111111)^3 10000000   (it is sign extended)
(long)128 << 16       11111111 10000000 (00)^8

Here (a)^b means the string 'a' repeated 'b' times.

So how is the bitwise OR in line 2 justified, given that the value,
which was 128, has changed to something else because of the cast to
long and the subsequent left shift?

The same goes for line 3: if *p is 8 bits wide, what does left-shifting
it by 8 result in?

I don't think I have really grasped how shifting and sign extension
work with casts.

Can anyone kindly explain it with an easy-to-understand example?

Thanks a lot
Mohan Gupta
 
James Kuyper

mohangupta13 said:
> With reference to question 16.7 in the FAQ:
>
> struct mystruct {
>     char c;
>     long int i32;
>     int i16;
> } s;
>
> unsigned char *p = buf;
>
> s.c = *p++;
>
> 1. s.i32 = (long)*p++ << 24;
> 2. s.i32 |= (long)*p++ << 16;
> 3. s.i32 |= (unsigned)(*p++ << 8);
> s.i32 |= *p++;
>
> s.i16 = *p++ << 8;
> s.i16 |= *p++;
>
> For unpacking a buffer "buf" as shown here, how are all these casts
> justified? I mean, in line 2, if *p were equal to something like 128
> (assuming 8-bit char), then casting it to long would sign-extend it
> (as I understand it), like this:
>
> expression            resulting bit representation
> (unsigned char)128    10000000
> (long)128             (11111111)^3 10000000   (it is sign extended)

First of all, since (unsigned char)128 isn't signed, it doesn't have a
sign to be extended. Secondly, C's conversion rules are value
preserving, when they can be, so (long)(unsigned char)128 == 128L.
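
A minimal sketch of that value-preserving conversion, assuming CHAR_BIT is 8. The helper name byte_to_bits_16_23 is hypothetical and only serves the illustration:

```c
#include <assert.h>
#include <limits.h>

/* Convert one buffer byte to long and shift it into bits 16..23.
   Hypothetical helper, named here only for illustration. */
long byte_to_bits_16_23(unsigned char byte)
{
    long widened = (long)byte;  /* value-preserving: 128 stays 128, never negative */

    assert(widened >= 0 && widened <= UCHAR_MAX);
    /* With 8-bit bytes the result is at most 0xFF0000, which fits in any long. */
    return widened << 16;
}
```

Calling byte_to_bits_16_23(128) should give 0x800000, with no sign extension involved.
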

> (long)128 << 16       11111111 10000000 (00)^8
>
> Here (a)^b means the string 'a' repeated 'b' times.
>
> So how is the bitwise OR in line 2 justified, given that the value,
> which was 128, has changed to something else because of the cast to
> long and the subsequent left shift?

After the 16-bit shift, the binary representation would be
10000000 00000000 00000000

> The same goes for line 3: if *p is 8 bits wide, what does left-shifting
> it by 8 result in?

The same exact bits, followed by 8 bits set to 0. Another way to put
this is that *p<<8 means exactly the same as *p * 256.

> I don't think I have really grasped how shifting and sign extension
> work with casts.

Drop the concept of sign extension: it's not part of the C standard.
When C was standardized, the committee chose value-preserving rather
than sign-preserving conversion rules.

You should never use shifts if sign bits are a potential issue, which is
why that code in the FAQ is inappropriate.

Let N be 2 raised to the n'th power. If E1 has a signed type and a
nonnegative value, then if E1 * N is representable in the promoted type of
E1, then E1 << n == E1*N; otherwise the behavior is undefined (6.5.7p4).
That's going to cause trouble with line 1 if

*p > LONG_MAX/(long)pow(2,24)

and that's what's wrong with this code.

Similarly, if E1 has a signed type and a negative value, E1 >> n has an
implementation-defined value (6.5.7p5). That's much better than undefined
behavior, but not very useful.

For those reasons, you should generally avoid using the shift operators
when the left operand would be signed, especially when the operand is
negative.
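
The rule from 6.5.7p4 can be turned into a check made before shifting. This is only a sketch; the function name long_shift_is_defined is hypothetical, and the width test assumes long has no padding bits:

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if value << n is well defined for a long under 6.5.7p4,
   that is, value is nonnegative and value * 2^n fits in a long. */
int long_shift_is_defined(long value, unsigned n)
{
    /* Shifting by the full width or more is undefined regardless of value. */
    assert(n < sizeof(long) * CHAR_BIT);

    if (value < 0)
        return 0;
    /* LONG_MAX >> n is floor(LONG_MAX / 2^n), the largest value that still fits. */
    return value <= (LONG_MAX >> n);
}
```

On an implementation with a 32-bit long, long_shift_is_defined(0x80, 24) returns 0, which is the case that makes line 1 of the FAQ code undefined.
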

The example should have used unsigned long, not long. So what do you do
if the value you're trying to unpack is signed, and possibly negative?
There's no single correct answer to that. You need to know which bit is
the sign bit, and how it's interpreted. C allows for three
possibilities: sign/magnitude, 1's complement, and 2's complement. The
right approach to use depends upon which of those choices was made when
storing the data you're unpacking, and which of those choices was made
by the implementation you used to unpack the data. If the representation
of the packed data and the data in the struct are the same, then the
best approach is simply:

memcpy(&s.i32, p, sizeof s.i32);
p += sizeof s.i32;
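
When the representations differ, one possible approach is to assemble the bytes in an unsigned long and convert to a signed value afterwards. The sketch below assumes the packed data is 32 bits wide, most significant byte first, in 2's complement, and that the unpacking implementation also uses 2's complement; none of that is stated in the FAQ code. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Read four bytes, most significant first, into an unsigned long.
   Each byte is masked to 8 bits in case CHAR_BIT is larger than 8. */
unsigned long unpack_u32_be(const unsigned char *p)
{
    assert(p != NULL);
    return ((unsigned long)(p[0] & 0xFFu) << 24)
         | ((unsigned long)(p[1] & 0xFFu) << 16)
         | ((unsigned long)(p[2] & 0xFFu) << 8)
         |  (unsigned long)(p[3] & 0xFFu);
}

/* Interpret a 32-bit pattern as 2's complement without converting an
   out-of-range value to a signed type. */
long u32_to_signed(unsigned long u)
{
    u &= 0xFFFFFFFFuL;
    if (u <= 0x7FFFFFFFuL)
        return (long)u;
    /* u is in [0x80000000, 0xFFFFFFFF]; the intended value is u - 2^32. */
    return -(long)(0xFFFFFFFFuL - u) - 1;
}
```

With this, s.i32 = u32_to_signed(unpack_u32_be(p)); p += 4; would replace the four shift statements without shifting a signed operand.
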

In the answer to question 16.7, there's a link to question 12.42, which
explains some of these issues in more detail. In question 12.42, there's
a link to question 3.19, which is also very relevant.
 
Nobody

> With reference to question 16.7 in the FAQ:
>
> struct mystruct {
>     char c;
>     long int i32;
>     int i16;
> } s;
>
> unsigned char *p = buf;
>
> s.c = *p++;
>
> 1. s.i32 = (long)*p++ << 24;
> 2. s.i32 |= (long)*p++ << 16;
> 3. s.i32 |= (unsigned)(*p++ << 8);
> s.i32 |= *p++;
>
> s.i16 = *p++ << 8;
> s.i16 |= *p++;
>
> For unpacking a buffer "buf" as shown here, how are all these casts
> justified? I mean, in line 2, if *p were equal to something like 128
> (assuming 8-bit char), then casting it to long would sign-extend it
> (as I understand it), like this:

*p is an UNSIGNED char; there is no sign to extend.

> So how is the bitwise OR in line 2 justified, given that the value,
> which was 128, has changed to something else because of the cast to
> long and the subsequent left shift?

This doesn't parse. If *p++ was 0x80 (128), (long)*p++ will be 0x00000080
(assuming a 32-bit long) and (long)*p++ << 16 will be 0x00800000.

> The same goes for line 3: if *p is 8 bits wide, what does left-shifting
> it by 8 result in?

Left-shifting an 8-bit value by 8 bits within an 8-bit type would result
in zero, but that's not what's happening. Instead *p++ is first promoted
to int (which is at least 16 bits), that int is shifted left by 8 bits,
and the cast then converts the result to unsigned.

> I don't think I have really grasped how shifting and sign extension
> work with casts.
>
> Can anyone kindly explain it with an easy-to-understand example?

Sign extension doesn't occur here, as *p is unsigned.

The casts enlarge the value so that it is wide enough to hold the shifted
result. Note that casts bind more tightly than binary operators, so e.g.
(long)*p<<24 is equivalent to ((long)*p)<<24, not to (long)(*p<<24). The
latter would result in the significant bits of *p being lost.

If p points to the four bytes 0x01,0x23,0x45,0x67, the code:

s.i32 = (long)*p++ << 24;
s.i32 |= (long)*p++ << 16;
s.i32 |= (unsigned)(*p++ << 8);
s.i32 |= *p++;

evaluates as:

s.i32 = 0x01000000;
s.i32 |= 0x00230000;
s.i32 |= 0x4500;
s.i32 |= 0x67;

leaving s.i32 containing 0x01234567.
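
The same walk-through can be reproduced with a small helper that records each term before ORing them together. This is only a sketch: the name unpack_terms is hypothetical, and it widens to unsigned long rather than long so that every shift is well defined for any byte value.

```c
#include <assert.h>
#include <stddef.h>

/* Computes the four terms that the statements above OR together and
   returns the combined value. Unsigned arithmetic is used throughout. */
unsigned long unpack_terms(const unsigned char *p, unsigned long terms[4])
{
    assert(p != NULL && terms != NULL);

    terms[0] = (unsigned long)*p++ << 24;  /* the cast applies to *p, then the shift */
    terms[1] = (unsigned long)*p++ << 16;
    terms[2] = (unsigned long)*p++ << 8;
    terms[3] = *p++;

    return terms[0] | terms[1] | terms[2] | terms[3];
}
```

For the bytes 0x01,0x23,0x45,0x67 the terms come out as 0x01000000, 0x00230000, 0x4500 and 0x67, and the return value is 0x01234567.
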
 
Barry Schwarz

> With reference to question 16.7 in the FAQ:
>
> struct mystruct {
>     char c;
>     long int i32;
>     int i16;
> } s;
>
> unsigned char *p = buf;
>
> s.c = *p++;
>
> 1. s.i32 = (long)*p++ << 24;
> 2. s.i32 |= (long)*p++ << 16;
> 3. s.i32 |= (unsigned)(*p++ << 8);
> s.i32 |= *p++;
>
> s.i16 = *p++ << 8;
> s.i16 |= *p++;
>
> For unpacking a buffer "buf" as shown here, how are all these casts
> justified? I mean, in line 2, if *p were equal to something like 128
> (assuming 8-bit char), then casting it to long would sign-extend it
> (as I understand it), like this:

Statement 1 is problematic with or without the cast. If the value
being shifted is greater than 0x7f and long is only 32 bits, the
statement invokes undefined behavior (6.5.7-4).

In statement 3, the parenthesized expression is evaluated as a signed
int and then converted to unsigned. If int is 16 bits, the expression
has the same problem. If you remove the parentheses, the problem goes
away.

If int is 16 bits, then the first assignment to s.i16 has the same
problem.

[snip several lines of bad assumptions]

> The same goes for line 3: if *p is 8 bits wide, what does left-shifting
> it by 8 result in?

*p is promoted to int before it is shifted. Look up "integer
promotions" in 6.3.1.1; 6.5.7p3 applies them to each operand of a shift.
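
A short sketch of both points, the promotion and the fix for statement 3. The function names are hypothetical, _Generic needs C11 or later, and the first function assumes the usual case where int can represent every unsigned char value:

```c
#include <assert.h>

/* Reports whether the expression byte << 8 has type int, which is what
   the integer promotions produce on typical implementations. */
int shifted_type_is_int(void)
{
    unsigned char byte = 0x80;

    /* sizeof and _Generic inspect the type without evaluating the shift. */
    assert(sizeof(byte << 8) == sizeof(int));
    return _Generic(byte << 8, int: 1, default: 0);
}

/* Statement 3 without the extra parentheses: the cast is applied first,
   so the shift is done in unsigned int and cannot overflow a 16-bit int. */
unsigned shift_byte_left_8(unsigned char byte)
{
    return (unsigned)byte << 8;
}
```

Here shift_byte_left_8(0x80) gives 0x8000 even where int is 16 bits, whereas (unsigned)(0x80 << 8) would first overflow the signed int.
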
 
