What's wrong with this code?

X

xbyte

It's a problem from a book, and I can't figure it out.

4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
you've gotta extract an indicated (by bytenum) word from it, AND the
following code is said to be written by a failed programmer:

int xbyte(unsigned word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}

IMHO, this is a perfect bulk of code, what's your opinion?
 
B

Ben Bacarisse

xbyte said:
It's a problem from a book, and I can't figure it out.

4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
you've gotta extract an indicated (by bytenum) word from it, AND the
following code is said to be written by a failed programmer:

int xbyte(unsigned word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}

IMHO, this is a perfect bulk of code, what's your opinion?

Perfect is a strong word! I'd say it captures the gist of the
problem but it is not portable and will exhibit undefined behaviour
is some cases.

Unsigned types are a natural choice when doing bit operations, so I'd
prefer an unsigned return type (unless the integer return is used to
signal an error using a negative value).

The bytenum can not be negative, so why not make that unsigned? In
any case, undefined behaviour awaits if bytenum is out of range.

Personally, I prefer '* 8' to '<< 3' when multiplying by 8.

C99 provides a good type to document the 32-bit assumption: uint32_t
but slightly greater portability can be got by using uint_least32_t.
Thus:

uint_least8_t xbyte(uint_least32_t word, unsigned bytenum)
{
assert(bytenum < 4);
return (word >> (bytenum * 8)) & 0xff;
}

or maybe:

int xbyte(uint_least32_t word, unsigned bytenum)
{
return bytenum < 4 ? (word >> (bytenum * 8)) & 0xff : -1;
}

In C90, I'd probably write:

unsigned char xbyte(unsigned long word, unsigned bytenum)
{
assert(bytenum < 4);
return (word >> (bytenum * 8)) & 0xff;
}

or

int xbyte(unsigned long word, unsigned bytenum)
{
return bytenum < 4 ? (word >> (bytenum * 8)) & 0xff : -1;
}

but I would never say any of these is perfect (for one thing that
would imply that code can be unambiguously ordered by "quality").
 
J

Joachim Schmitz

xbyte said:
It's a problem from a book, and I can't figure it out.

4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
you've gotta extract an indicated (by bytenum) word from it, AND the
following code is said to be written by a failed programmer:

int xbyte(unsigned word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}

IMHO, this is a perfect bulk of code, what's your opinion?
The function should return a byte (unsigned char), but does return an int.
You'd need to check whether bytenum is in the range of 0-3
There's no quarantee that unsigned (int) is 32 bit long

Bye, Jojo
 
M

Malcolm McLean

xbyte said:
It's a problem from a book, and I can't figure it out.

4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
you've gotta extract an indicated (by bytenum) word from it, AND the
following code is said to be written by a failed programmer:

int xbyte(unsigned word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}

IMHO, this is a perfect bulk of code, what's your opinion?
The programmer might have failed the course, but the code is good enough to
be used even in a production environment. Because it is extremely unlikely
that it can fail usefully.

However bytenum should be checked to be within the range 0-3, word should
arguably be an unsigned long to guarantee at least 32 bits. I say "arguably"
because in fact an extremely low level function like this needs to be
efficient, and a conversion from int to long might be exactly what you don't
need if it is not a nop.
The identifier "word" is a bit unfortunate, since it looks like it might be
a user-defined basic type.

int is the conventional return type for functions returning chars, by the
way. You could argue this is a bad convention, but it allows -1 on error.
 
F

Flash Gordon

Malcolm McLean wrote, On 29/09/07 21:03:
The programmer might have failed the course, but the code is good enough
to be used even in a production environment.

Depends on the environment. I can certainly see it being rejected at
review for using a left shift where what is really meant is multiplication.
Because it is extremely
unlikely that it can fail usefully.

As written it is unlikely to fail in any useful manner, certainly.
However bytenum should be checked to be within the range 0-3, word
should arguably be an unsigned long to guarantee at least 32 bits. I say
"arguably" because in fact an extremely low level function like this
needs to be efficient, and a conversion from int to long might be
exactly what you don't need if it is not a nop.

Now explain how a conversion which does nothing can affect efficiency.
Especially as it would probably be called with an unsigned long as a
parameter in a project where they had made the parameter an unsigned long.
The identifier "word" is a bit unfortunate, since it looks like it might
be a user-defined basic type.

It does not look like that to anyone who reads it.
int is the conventional return type for functions returning chars, by
the way. You could argue this is a bad convention, but it allows -1 on
error.

If there is no error checking then returning an unsigned long documents
what will be returned, if a check is added then returning an int and
using -ve values for error codes makes sense.
 
R

Richard

Flash Gordon said:
Malcolm McLean wrote, On 29/09/07 21:03:

Depends on the environment. I can certainly see it being rejected at
review for using a left shift where what is really meant is
multiplication.

Not if we are talking about shifting as opposed to multiplication. Using
the shift makes it obvious that we are shifting bits around. "*8" on the
other hand might be any arbitrary multiplication that just happens to be
a power of 2.
 
W

Willem

Richard wrote:
)> Malcolm McLean wrote, On 29/09/07 21:03:
)>>
)>> )>>> It's a problem from a book, and I can't figure it out.
)>>>
)>>> 4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
)>>> you've gotta extract an indicated (by bytenum) word from it, AND the
)>>> following code is said to be written by a failed programmer:
)>>>
)>>> int xbyte(unsigned word, int bytenum)
)>>> {
)>>> return (word >> (bytenum << 3)) & 0xFF;
)>>> }
)>>>
)>>> IMHO, this is a perfect bulk of code, what's your opinion?
)>>>
)>> The programmer might have failed the course, but the code is good
)>> enough to be used even in a production environment.
)>
)> Depends on the environment. I can certainly see it being rejected at
)> review for using a left shift where what is really meant is
)> multiplication.
)
) Not if we are talking about shifting as opposed to multiplication. Using
) the shift makes it obvious that we are shifting bits around. "*8" on the
) other hand might be any arbitrary multiplication that just happens to be
) a power of 2.

I direct your attention to what the code actually tries to do.
8 *is* an arbitrary multiplication that happens to be a power of two.
It could just as well have been 9.

To make the point more clear, a slightly more portable way to write
it would be:

return (word >> (bytenum * CHAR_BIT)) & UCHAR_MAX;

Although I'm sure there are several issues with that as well.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
?

=?iso-2022-kr?q?Harald_van_D=0E=29=26=0Fk?=

Not if we are talking about shifting as opposed to multiplication. Using
the shift makes it obvious that we are shifting bits around. "*8" on the
other hand might be any arbitrary multiplication that just happens to be
a power of 2.

If you were writing the code for a 9-bit char 36-bit int machine, you
might end up with something like

int xbyte(unsigned word, int bytenum) {
return (word >> (bytenum * 9)) & 0x1FF;
}

"*8" is just any arbitrary multiplication that happens to be a power of
2, in this case.
 
F

Flash Gordon

Willem wrote, On 30/09/07 00:45:
Richard wrote:
)> Malcolm McLean wrote, On 29/09/07 21:03:
)>>
)>> news:[email protected]...

)>>> return (word >> (bytenum << 3)) & 0xFF;

)>> The programmer might have failed the course, but the code is good
)>> enough to be used even in a production environment.
)>
)> Depends on the environment. I can certainly see it being rejected at
)> review for using a left shift where what is really meant is
)> multiplication.
)
) Not if we are talking about shifting as opposed to multiplication. Using
) the shift makes it obvious that we are shifting bits around. "*8" on the
) other hand might be any arbitrary multiplication that just happens to be
) a power of 2.

I direct your attention to what the code actually tries to do.
8 *is* an arbitrary multiplication that happens to be a power of two.
It could just as well have been 9.

Yes, that is indeed a correct expansion of my point, as was Harold's
response to Richard.
To make the point more clear, a slightly more portable way to write
it would be:

return (word >> (bytenum * CHAR_BIT)) & UCHAR_MAX;

Although I'm sure there are several issues with that as well.

It depends on the real requirement. If you are dealing with some format
where you have four 8 bit numbers packed in to an integer type then
using 8 and 0xFF is correct, if the requirement is 4 C bytes then your
code is correct.
 
A

Army1987

It's a problem from a book, and I can't figure it out.

4 words (4x8 bits) are packed into a 32-bit unsigned integer, and
you've gotta extract an indicated (by bytenum) word from it, AND the
following code is said to be written by a failed programmer:

int xbyte(unsigned word, int bytenum)
{
return (word >> (bytenum << 3)) & 0xFF;
}

IMHO, this is a perfect bulk of code, what's your opinion?
None. But if you replace `<< 3` with `* CHAR_BIT` and 0xFF with
UCHAR_MAX (these are in <limits.h>), you don't have to know how
big the word and the bytes are.
 
P

Peter J. Holzer

None. But if you replace `<< 3` with `* CHAR_BIT` and 0xFF with
UCHAR_MAX (these are in <limits.h>), you don't have to know how
big the word and the bytes are.

Except that the requirement explicitely says "4x8 bits", not 4 x
CHAR_BIT bits. That may be an error in the requirement, but more likely
it is mandated by some external source, like a file format or a
communication protocol (in the real world - I have seen that this
problem is from "a book"). Then * 8 and & 0xFF will work correctly even
on machines with larger bytes, but * CHAR_BIT and UCHAR_MAX will not.

hp
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top