3-byte ints

Ed Morton

I have 2 counters - one is required to be a 2-byte variable while the
other is required to be 3 bytes (not my choice, but I'm stuck with it!).
I've declared them as:

unsigned short small;
unsigned long large: 24;

First question - is that the best way to declare the "large" one to
ensure it's 3 bytes? Another suggestion I got was "unsigned char
large[3];" but that would be a little tougher to do arithmetic
operations on.

Now I need a general-purpose macro to increment them and I need to be
able to produce a report if the counter would overflow when incremented.
To avoid needing a separate macro for each type of counter, I store the
original value in a temporary unsigned long (no counter will be larger
than that), do the increment on the real counter (so that overflow
occurs when expected for that type of counter), and then test whether it
rolled over by checking if the result is less than the original, as in
the "incCntr()" macro below:

#include <stdio.h>

typedef struct {
unsigned short small;
unsigned long large: 24;
} CNTRS;

#define incCntr(cntr,incr) \
do { \
unsigned long _tmp = (unsigned long)(cntr); \
(cntr) = (cntr) + (incr); \
if ((cntr) < _tmp) { \
printf("Overflow at %lu + %d\n",_tmp,(incr));\
(cntr) = _tmp; \
} \
printf("%lu -> %lu\n",_tmp,(unsigned long)(cntr));\
} while(0)

int main(void)
{
CNTRS cntrs;
cntrs.small = 65529;
cntrs.large = 16777210;
incCntr(cntrs.small,5);
incCntr(cntrs.small,5);
incCntr(cntrs.large,5);
incCntr(cntrs.large,5);
return 1;
}

Second question - anyone see any problems with doing the overflow test
this way or can suggest a better alternative?

When compiling I get this warning:

gcc -Wall -otst tst.c
tst.c: In function `main':
tst.c:26: warning: long unsigned int format, unsigned int arg (arg 3)
tst.c:27: warning: long unsigned int format, unsigned int arg (arg 3)

It's complaining about the "cntr" argument in the line:

printf("%lu -> %lu\n",_tmp,(unsigned long)(cntr));

Third question - why is the compiler apparently ignoring my cast and
complaining that "(unsigned long)(cntr)" is an unsigned int?

Fourth question - would there be any reason not to declare my "small"
counter as an unsigned long bit-field too, i.e.:

unsigned long small: 16;

for consistency?

FWIW, the result of running the program is what I expected:

tst
65529 -> 65534
Overflow at 65534 + 5
65534 -> 65534
16777210 -> 16777215
Overflow at 16777215 + 5
16777215 -> 16777215

Regards,

Ed.
 
Mark A. Odell

I have 2 counters - one is required to be a 2-byte variable while the
other is required to be 3 bytes (not my choice, but I'm stuck with it!).
I've declared them as:

unsigned short small;
unsigned long large: 24;

How does the bit field help you? Why not:

unsigned short small; // 16-bits on this platform
unsigned long large; // 32-bits on this platform

#define INCR_SMALL(s, i) do \
{ \
if (s + i <= UNSIGNED_SHORT_MAX) s += i; \
else printf("Overflow off " #s "+" #i "\n"); \
} while (0)

#define INCR_LARGE(l, i) do \
{ \
if (l + i <= 0x00FFFFFFUL) l += i; \
else printf("Overflow off " #l "+" #i "\n"); \
} while (0)

Note: above is untested.
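
A minimal sketch of the same test-before-add idea, written as a function rather than a macro. Note that UNSIGNED_SHORT_MAX is not a standard name; the limit that <limits.h> provides for unsigned short is USHRT_MAX. The names incr_checked and COUNTER24_MAX are hypothetical, and the sketch assumes the counters are held in unsigned long variables. Comparing the increment against the remaining headroom avoids computing a sum that could itself wrap.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

#define COUNTER24_MAX 0x00FFFFFFUL  /* hypothetical name: largest 24-bit value */

/* Adds incr to *cntr if the result stays within max.
   Returns 1 on success; returns 0 and leaves *cntr unchanged on overflow. */
int incr_checked(unsigned long *cntr, unsigned long incr, unsigned long max)
{
    assert(*cntr <= max);
    if (incr > max - *cntr) {
        printf("Overflow at %lu + %lu\n", *cntr, incr);
        return 0;
    }
    *cntr += incr;
    return 1;
}
```

A 2-byte counter would be incremented with incr_checked(&small, 5, USHRT_MAX) and a 3-byte counter with incr_checked(&large, 5, COUNTER24_MAX), so one helper serves both sizes.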
 
Kevin Bracey

Ed Morton said:
I have 2 counters - one is required to be a 2-byte variable while the
other is required to be 3 bytes (not my choice, but I'm stuck with it!).
I've declared them as:

unsigned short small;
unsigned long large: 24;

First question - is that the best way to declare the "large" one to
ensure it's 3 bytes?

Pretty much, assuming it's in a structure. The only things I'd say are:

1) C90 doesn't allow anything other than "int" and "unsigned int" for
bitfield types. C99 does allow implementations to offer other types
like "unsigned long"; presumably your implementation does - it's
a common extension.

2) The size of a bitfield can't exceed that of its type - if you did
change it to "unsigned int", you'd then have a requirement that
int was at least 24 bits (but you'd get a diagnostic if it wasn't).
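
Point 2 can be made explicit at compile time. The following is a sketch only, assuming the field is switched to "unsigned int"; the struct name counters24 and the helper set_large are hypothetical. The standard only guarantees 16 bits for unsigned int, so the preprocessor check turns a too-narrow type into a clear error message.

```c
#include <assert.h>
#include <limits.h>

/* Stop the build with a readable message if unsigned int is too narrow. */
#if UINT_MAX < 0xFFFFFF
#error "unsigned int has fewer than 24 bits on this implementation"
#endif

struct counters24 {            /* hypothetical struct name */
    unsigned int large : 24;   /* unsigned int is a portable bit-field type */
};

/* Stores v in the 24-bit field; v must already fit in 24 bits. */
void set_large(struct counters24 *c, unsigned long v)
{
    assert(v <= 0xFFFFFFUL);
    c->large = (unsigned int)v;
}
```

Because the field is unsigned, incrementing it past 0xFFFFFF wraps to 0, which is what the increment-then-compare overflow test relies on.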
[ snip code ]
Second question - anyone see any problems with doing the overflow test
this way or can suggest a better alternative?

Looks fine to me. If you're using C99 you could use uintmax_t rather than
unsigned long, just in case you end up with larger bitfields in future.

I wouldn't return 1 from main though - that'll probably be interpreted as
an error condition by the calling environment.
It's complaining about the "cntr" argument in the line:

printf("%lu -> %lu\n",_tmp,(unsigned long)(cntr));

Third question - why is the compiler apparently ignoring my cast and
complaining that "(unsigned long)(cntr)" is an unsigned int?

Because it's buggy? Your code looks fine to me.
Fourth question - would there be any reason not to declare my "small"
counter as an unsigned long bit-field too, i.e.:

unsigned long small: 16;

for consistency?

Not really, modulo the comments about types above. It might be more portable
as it would guarantee exactly 16 bits, which 'short' wouldn't. But in
practice, I've seen compilers generate significantly different code for a
16-bit bitfield versus a short; it's not unlikely that 'short' may be more
optimised, either in code generation terms or just alignment.
 
Ed Morton

Mark said:
How does the bit field help you? Why not:

unsigned short small; // 16-bits on this platform
unsigned long large; // 32-bits on this platform

<snip>

I have to pass this structure to some other code that's expecting
several fields each to be exactly 3 bytes.
#define INCR_SMALL(s, i) do \
{ \
if (s + i <= UNSIGNED_SHORT_MAX) s += i; \
else printf("Overflow off " #s "+" #i "\n"); \
} while (0)

#define INCR_LARGE(l, i) do \
{ \
if (l + i <= 0x00FFFFFFUL) l += i; \
else printf("Overflow off " #l "+" #i "\n"); \
} while (0)

Note: above is untested.

To isolate the callers of the macro from the types of the counters, it'd
mean creating a separate macro for each counter (I actually have several
3-byte counters and several 2-byte counters), e.g.:

#define INCR_C1(cntrs,incr) INCR_SMALL(cntrs.small1,incr)
#define INCR_C2(cntrs,incr) INCR_SMALL(cntrs.small2,incr)
#define INCR_C3(cntrs,incr) INCR_LARGE(cntrs.large1,incr)
.....

I do prefer that to incrementing the counter first and then having to
reset it later.

Once I added in the cast to unsigned long for the "s + i", it worked.

Thanks.

Ed.
 
Simon Biber

Ed Morton said:
I have to pass this structure to some other code that's expecting
several fields each to be exactly 3 bytes.

Most C implementations do not support exact 3 byte integer types.

If it needs to be laid out exactly so in memory, you can create
an array of unsigned characters.

void pack(unsigned char *three, unsigned long value)
{
assert(value < (1UL << 24));
three[0] = value & 0xFF;
three[1] = value >> 8 & 0xFF;
three[2] = value >> 16 & 0xFF;
}

unsigned long unpack(unsigned char *three)
{
return (unsigned long)three[0]
| (unsigned long)three[1] << 8
| (unsigned long)three[2] << 16;
}

These functions assume you will be packing 8 bits into each byte,
and using a little-endian packing layout.
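
As a rough sketch of how arithmetic on the packed form might look, the helpers below restate the same little-endian packing and add a checked increment. The names get24, put24, inc24 and MAX24 are hypothetical, and the code assumes each array element carries exactly 8 value bits.

```c
#include <assert.h>

#define MAX24 0xFFFFFFUL  /* hypothetical name: largest value three octets can hold */

/* Little-endian unpack of three octets; only the low 8 bits of each are used. */
unsigned long get24(const unsigned char *three)
{
    return (unsigned long)(three[0] & 0xFF)
         | (unsigned long)(three[1] & 0xFF) << 8
         | (unsigned long)(three[2] & 0xFF) << 16;
}

/* Little-endian pack of a value that fits in 24 bits. */
void put24(unsigned char *three, unsigned long value)
{
    assert(value <= MAX24);
    three[0] = (unsigned char)(value & 0xFF);
    three[1] = (unsigned char)(value >> 8 & 0xFF);
    three[2] = (unsigned char)(value >> 16 & 0xFF);
}

/* Adds incr to the packed counter. Returns 1 on success, or 0 (leaving the
   counter unchanged) if the sum would not fit in 24 bits. */
int inc24(unsigned char *three, unsigned long incr)
{
    unsigned long cur = get24(three);
    if (incr > MAX24 - cur)
        return 0;
    put24(three, cur + incr);
    return 1;
}
```

Keeping the counter in the byte array means the structure handed to the other code never needs a separate conversion step, at the cost of an unpack and a pack on every increment.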
 
Ed Morton

Most C implementations do not support exact 3 byte integer types.

So, if I use:

unsigned long large: 24;

then the code may not work on my original platform and, even if it does, it
isn't portable, right? What kind of problems could I expect to see? Is there any
way to test whether or not I actually have a problem?
If it needs to be laid out exactly so in memory, you can create
an array of unsigned characters.
These functions assume you will be packing 8 bits into each byte,
and using a little-endian packing layout.

Sounds like I'll be needing those.

Thanks,

Ed.
 
Peter Nilsson

Simon Biber said:
Most C implementations do not support exact 3 byte integer types.

Do any? :)
If it needs to be laid out exactly so in memory, you can create
an array of unsigned characters.

void pack(unsigned char *three, unsigned long value)
{
assert(value < (1UL << 24));
three[0] = value & 0xFF;
three[1] = value >> 8 & 0xFF;
three[2] = value >> 16 & 0xFF;
}

unsigned long unpack(unsigned char *three)
{
return (unsigned long)three[0]
| (unsigned long)three[1] << 8
| (unsigned long)three[2] << 16;
}

These functions assume you will be packing 8 bits into each byte,

Why? I know we live in an octet world, but you can do this portably
with CHAR_BIT and UCHAR_MAX.
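
One possible reading of that suggestion, as a sketch: use every bit of each unsigned char, so the number of array elements depends on CHAR_BIT. The names UNITS24, pack24_native and unpack24_native are hypothetical. The resulting layout matches the 8-bit version only when CHAR_BIT is 8, so it may not suit an interface that expects three octets.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Number of unsigned chars needed to hold 24 value bits on this platform. */
#define UNITS24 ((size_t)((24 + CHAR_BIT - 1) / CHAR_BIT))

/* Stores value, least significant unit first, using all bits of each char. */
void pack24_native(unsigned char *out, unsigned long value)
{
    size_t i;
    assert(value < (1UL << 24));
    for (i = 0; i < UNITS24; i++) {
        out[i] = (unsigned char)(value & UCHAR_MAX);
        /* Shift by CHAR_BIT in two halves so the shift count stays below
           the width of unsigned long even if char is as wide as long. */
        value >>= CHAR_BIT / 2;
        value >>= CHAR_BIT - CHAR_BIT / 2;
    }
}

/* Rebuilds the value from the units written by pack24_native(). */
unsigned long unpack24_native(const unsigned char *in)
{
    unsigned long value = 0;
    size_t i = UNITS24;
    while (i-- > 0) {
        value <<= CHAR_BIT / 2;
        value <<= CHAR_BIT - CHAR_BIT / 2;
        value |= in[i];
    }
    return value;
}
```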
 
Simon Biber

Peter Nilsson said:
Do any? :)

I don't know of any, but I've learnt not to make generalisations on
comp.lang.c as someone inevitably provides an example to the contrary.
 
Simon Biber

Ed Morton said:
So, if I use:

unsigned long large: 24;

It's not portable to use 'unsigned long' as the base type for a bitfield; the
only portable types are 'int' and 'unsigned int'.
then the code may not work on my original platform and, even if it does, it
isn't portable, right? What kind of problems could I expect to see? Is there
any way to test whether or not I actually have a problem?

You need to know the exact binary format expected, then conform to it.

A 24-bit bitfield will probably still take up four bytes, and you have no
control over exactly where and in what order the 24 bits are stored.

You can check it out on your computer by:
#include <stdio.h>

int main(void)
{
struct foo {unsigned long large : 24; } bar;
size_t i;

bar.large = 0xDEADBE;
printf("%lu\n", (long unsigned) sizeof bar);
for(i = 0; i < sizeof bar; i++)
{
printf("%02X ", ((unsigned char *)&bar)[i]);
}
putchar('\n');
return 0;
}

I get:
4
BE AD DE 61

Which indicates the bitfield is stored in the first three of four bytes, in
little-endian order, and that the fourth (padding) byte is uninitialised.
Your results may vary.
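
The same inspection can be wrapped in a small helper that works for any object. This is a sketch under the assumption of 8-bit bytes (so two hex digits per byte are enough); the name hex_bytes is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Writes the bytes of an object as two-digit hex values separated by single
   spaces into out, which must have room for 3 * n characters (n >= 1),
   including the terminating null character. */
void hex_bytes(const void *obj, size_t n, char *out)
{
    const unsigned char *p = obj;
    size_t i;
    assert(n > 0);
    for (i = 0; i < n; i++) {
        if (i > 0)
            *out++ = ' ';
        out += sprintf(out, "%02X", (unsigned)p[i]);
    }
}
```

Calling hex_bytes(&bar, sizeof bar, buf) on the struct above would show how the implementation lays out the bit-field; the contents of any padding bytes are indeterminate.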
 
Jack Klein

I have 2 counters - one is required to be a 2-byte variable while the

That would be 32 bits on one compiler I use, and 64 bits on another.
other is required to be 3 bytes (not my choice, but I'm stuck with it!).

This would be 48 bits on one compiler I use, and 96 bits on another.

Since a byte in C must have at least 8 bits but may have more, and
does on some architectures, if you mean "16 bits" and "24 bits", say
that. 2 bytes means "at least 16 bits, but possibly more" and 3 bytes
means "at least 24 bits, but possibly more".

If you mean 16 bits and 24 bits, say so. There are architectures I
work with where both would fit into a byte.
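
Code that depends on "2 bytes" meaning 16 bits can say so explicitly. A minimal sketch, assuming the program is only meant to build where bytes are octets and unsigned short is exactly 16 bits wide; the helper name bits_in_bytes is hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Refuse to compile where the size assumptions do not hold. */
#if CHAR_BIT != 8
#error "this code assumes 8-bit bytes"
#endif
#if USHRT_MAX != 0xFFFF
#error "this code assumes a 16-bit unsigned short"
#endif

/* Number of bits occupied by n bytes on this implementation. */
unsigned long bits_in_bytes(unsigned long n)
{
    assert(n <= ULONG_MAX / CHAR_BIT);  /* guard the multiplication */
    return n * CHAR_BIT;
}
```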

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Jack Klein

It's not portable to use 'unsigned long' as the base type for a bitfield; the
only portable types are 'int' and 'unsigned int'.


You need to know the exact binary format expected, then conform to it.

A 24-bit bitfield will probably still take up four bytes, and you have no
control over exactly where and in what order the 24 bits are stored.

On a Motorola 56000 it will fit perfectly in 1 byte, which happens to
have exactly 24 bits.

 
Jack Klein

Simon Biber said:
Most C implementations do not support exact 3 byte integer types.

Do any? :)
If it needs to be laid out exactly so in memory, you can create
an array of unsigned characters.

void pack(unsigned char *three, unsigned long value)
{
assert(value < (1UL << 24));
three[0] = value & 0xFF;
three[1] = value >> 8 & 0xFF;
three[2] = value >> 16 & 0xFF;
}

unsigned long unpack(unsigned char *three)
{
return (unsigned long)three[0]
| (unsigned long)three[1] << 8
| (unsigned long)three[2] << 16;
}

These functions assume you will be packing 8 bits into each byte,

Why? I know we live in an octet world, but you can do this portably
with CHAR_BIT and UCHAR_MAX.
and using a little-endian packing layout.

Actually even when an implementation has CHAR_BIT > 8, it is quite
easy and useful to write code using just 8 bits in an unsigned char,
and far more portable. Also, living in an octet-oriented world of
communications standards, it is often necessary to handle individual 8
bit quantities as individual items.

 
R Pradeep Chandran

:> I have 2 counters - one is required to be a 2-byte variable while the
:
:That would be 32 bits on one compiler I use, and 64 bits on another.
:
:> other is required to be 3 bytes (not my choice, but I'm stuck with it!).
:
:This would be 48 bits on one compiler I use, and 96 bits on another.

<snip>

Which are those compilers and their corresponding target platforms?
Could you please post some details of them? It is not that I don't
believe you. But a lot of my colleagues and friends don't believe in
CHAR_BIT != 8 and I would really like to point out these cases to them.

Have a nice day,
Pradeep
 
Ed Morton

Based on the feedback, I'll declare my variables without bitfields and
test them for overflow as Mark suggested, and then pack them into an
array of unsigned chars as Simon suggested when I need to send them to
the code I interface with.

Unsigned short and unsigned long are guaranteed to be 16 bits and 32
bits respectively on this and any future platform this code will run on
as it's adding to a large existing base that depends on those sizes.

Thanks to all who replied.

Ed.
 
Mark Gordon

I don't know of any, but I've learnt not to make generalisations on
comp.lang.c as someone inevitably provides an example to the contrary.

I know of (but have not used) a 24 bit processor which has a C compiler
available for it. Specifically the Motorola DSP56000. I would assume
that int is 24 bits and it is even possible that char could be 24 bits!
 
Jack Klein

:> I have 2 counters - one is required to be a 2-byte variable while the
:
:That would be 32 bits on one compiler I use, and 64 bits on another.
:
:> other is required to be 3 bytes (not my choice, but I'm stuck with it!).
:
:This would be 48 bits on one compiler I use, and 96 bits on another.

<snip>

Which are those compilers and their corresponding target platforms?
Could you please post some details of them? It is not that I don't
believe you. But a lot of my colleagues and friends don't believe in
CHAR_BIT != 8 and I would really like to point out these cases to them.

Have a nice day,
Pradeep

I happen to have my laptop home with me, which has one of the
compilers installed, here is a copy and paste of a part of the
limits.h file...

========
/********************************************************************/
/* limits.h v3.09 */
/* Copyright (c) 1996-2003 Texas Instruments Incorporated */
/********************************************************************/

#ifndef _LIMITS
#define _LIMITS

#define CHAR_BIT 16 /* NUMBER OF BITS IN TYPE CHAR */
#define SCHAR_MAX 32767 /* MAX VALUE FOR SIGNED CHAR */
#define SCHAR_MIN (-SCHAR_MAX-1) /* MIN VALUE FOR SIGNED CHAR */
#define UCHAR_MAX 65535u /* MAX VALUE FOR UNSIGNED CHAR */
========

This is from Texas Instruments Code Composer Studio for the
TMS320C2810 and TMS320C2812 Digital Signal Processors.

I don't have a copy of the other compiler handy here at home to copy
and paste, so you will have to take my word for it. It is for an
Analog Devices SHARC 32-bit DSP. All the integer types are 32 bits,
and CHAR_BIT is 32.

Mind you, you won't find this sort of architecture anywhere else but
on DSPs anymore, but a lot of DSP programming is being done in C and
even C++ these days.

These are pretty much all free-standing environments, it is not really
possible to provide all the features of a hosted environment on a
platform where char and int have the same representation. It is
impossible to provide a getchar() function which complies with the
standard, namely that it returns all possible values of char and also
EOF, which is an int different from any possible char value.

There are also the early members of the Motorola 56000 DSP family,
which had a 24-bit word size. char, short, and int were all 24 bits,
and long was 48 bits. Many of the new 56000 family members are either
16 bit or 32 bit, but I believe some of the 24 bit versions are still
produced today.

 
Kevin Easton

Jack Klein said:
Mind you, you won't find this sort of architecture anywhere else but
on DSPs anymore, but a lot of DSP programming is being done in C and
even C++ these days.

These are pretty much all free-standing environments, it is not really
possible to provide all the features of a hosted environment on a
platform where char and int have the same representation. It is
impossible to provide a getchar() function which complies with the
standard, namely that it returns all possible values of char and also
EOF, which is an int different from any possible char value.

(It's actually an unsigned char converted to int, not plain char).

However, are you sure it has to be able to return all possible unsigned
chars? Isn't it possible for unsigned char to have 65536 possible
values, but there be only, say, 140 distinct _characters_ which the
string, input and output functions deal with? Does every possible
unsigned char value have to represent a character?

- Kevin.
 
Micah Cowan

Kevin Easton said:
(It's actually an unsigned char converted to int, not plain char).

I assume you mean the "all possible values" bit, not EOF.
However, are you sure it has to be able to return all possible unsigned
chars? Isn't it possible for unsigned char to have 65536 possible
values, but there be only, say, 140 distinct _characters_ which the
string, input and output functions deal with? Does every possible
unsigned char value have to represent a character?

Doesn't matter. Consider the case when you are reading binary files.
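
For reading arbitrary binary data, one commonly used workaround is to confirm an EOF return with feof() and ferror() before treating it as the end of input. The sketch below illustrates that idiom; the function name count_chars is hypothetical, and on ordinary hosted platforms where int is wider than char the extra check is redundant but harmless.

```c
#include <assert.h>
#include <stdio.h>

/* Counts the characters remaining in fp. When getc() returns EOF, feof() and
   ferror() are consulted, so a data value that happens to compare equal to
   EOF (possible where sizeof(int) == 1) is not mistaken for end of input. */
unsigned long count_chars(FILE *fp)
{
    unsigned long n = 0;
    int c;
    assert(fp != NULL);
    for (;;) {
        c = getc(fp);
        if (c == EOF && (feof(fp) || ferror(fp)))
            break;
        n++;
    }
    return n;
}
```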

-Micah
 
Keith Thompson

Simon Biber said:
It's not portable to use 'unsigned long' as the base type for a bitfield; the
only portable types are 'int' and 'unsigned int'.

And 'signed int'. For a bit field, it's implementation-defined
whether plain 'int' is signed or unsigned.
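
A small sketch of the difference, using a hypothetical struct named flags. Spelling out "unsigned int" or "signed int" removes the ambiguity; the last function merely reports what the implementation chose for plain "int".

```c
#include <assert.h>

struct flags {              /* hypothetical struct for illustration */
    unsigned int u : 3;     /* always unsigned: holds 0..7 */
    signed int   s : 3;     /* always signed */
    int          p : 3;     /* signedness is implementation-defined */
};

/* Stores v in the unsigned field and returns what the field then holds;
   unsigned bit-fields reduce the value modulo 2^3. */
unsigned int store_u3(unsigned int v)
{
    struct flags f;
    f.u = v;
    return f.u;
}

/* Checks that hold for the explicitly signed and unsigned fields. */
void check_explicit_fields(void)
{
    struct flags f;
    f.s = -3;
    assert(f.s == -3);
    assert(store_u3(7) == 7);
    assert(store_u3(8) == 0);
}

/* Returns 1 if plain 'int' bit-fields are signed on this implementation. */
int plain_int_bitfield_is_signed(void)
{
    struct flags f;
    f.p = -1;        /* becomes 7 if the field is unsigned */
    return f.p < 0;
}
```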
 
