Colon (:) syntax in defining fields in a struct


Raj Kotaru

Hello all,

I recently came across the following segment of code that defines a C
struct:

typedef struct
{
    unsigned char unused_bits:4;
    unsigned char wchair_state:2;
} xyz;

What do the numbers 4 and 2 refer to?

If I define a second struct as below:

typedef struct
{
    unsigned char unused_bits;
    unsigned char wchair_state;
} abc;


and then declare

void main(void)
{
    xyz _xyz;
    abc _abc;
}

In terms of memory allocation, is there any difference between that
allocated for _xyz and _abc?

Any feedback would be much appreciated.

Thanks
Raj
 

Shark Venue

":" means bit-allocation.
On a typical system, _xyz gets 1 byte and _abc gets 2 bytes.
Here is a memory layout chart:
_________________________________________________
|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |
-------------------------------------------------
|<-    _xyz.u_bits    ->|<- _xyz.w->|

_________________
|1|2|3|4|5|6|7|8| <---- _abc.u_bits;
+-+-+-+-+-+-+-+-+
|1|2|3|4|5|6|7|8| <---- _abc.w
 

SM Ryan

(e-mail address removed) (Raj Kotaru) wrote:
# Hello all,
#
# I recently came across the following segment of code that defines a C
# struct:
#
# typedef struct
# {
# unsigned char unused_bits:4;
# unsigned char wchair_state:2;
# } xyz;
#
# What do the numbers 4 and 2 refer to?

unused_bits is four bits wide and wchair_state is two bits. The fields may
be packed as tightly as possible, in one char sized unit possibly.

On a typical CPU and C implementation, without the field widths, the
struct would be two characters wide, and only one character wide with
the above.

#
# If I define a second struct as below:
#
# typedef struct
# {
# unsigned char unused_bits;
# unsigned char wchair_state;
# } abc;
#
#
# and then declare
#
# void main(void)
# {
# xyz _xyz;
# abc _abc;
# }
#
# In terms of memory allocation, is there any difference between that
# allocated for _xyz and _abc?

Add
printf("%d %d\n",sizeof(xyz),sizeof(abc));
I would expect to see it print
1 2
 

Keith Thompson

I recently came across the following segment of code that defines a C
struct:

typedef struct
{
unsigned char unused_bits:4;
unsigned char wchair_state:2;
} xyz;

Look up "bit fields" in any C textbook.
 

Kenneth Brody

Shark said:
":" means bit-allocation.
System will allocate 1 byte to struct _xyz and 2 byte to struct _abc.
Here is mem structure chart:
_________________________________________________
|  1  |  2  |  3  |  4  |  5  |  6  |  7  |  8  |

I don't believe the standard specifies the order in which bits are
allocated. If that's the case, you may have:
 

Peter Shaggy Haywood

Groovy hepcat SM Ryan was jivin' on Sat, 28 Aug 2004 16:33:30 -0000 in
comp.lang.c.
Re: Colon (:) syntax in defining fields in a struct's a cool scene!
Dig it!
(e-mail address removed) (Raj Kotaru) wrote:
# Hello all,
#
# I recently came across the following segment of code that defines a C
# struct:
#
# typedef struct
# {
# unsigned char unused_bits:4;
# unsigned char wchair_state:2;
# } xyz;
#
# What do the numbers 4 and 2 refer to?

unused_bits is four bits wide and wchair_state is two bits. The fields may
be packed as tightly as possible, in one char sized unit possibly.

On a typical CPU and C implementation, without the field widths, the
struct would be two characters wide, and only one character wide with
the above.

It should be pointed out, though, that unsigned char is non-portable
for the type of a bit field. Only qualified or unqualified versions
of int, signed int, unsigned int or _Bool (in C99) are portable.
# If I define a second struct as below:
#
# typedef struct
# {
# unsigned char unused_bits;
# unsigned char wchair_state;
# } abc;
#
# and then declare
#
# void main(void)

Pay attention (Raj Kotaru)! I'm only going to say this a billion
times or so. The main() function is supposed to return an int, not
void. Portable return values for main() are 0, EXIT_SUCCESS and
EXIT_FAILURE, the latter two being macros defined in stdlib.h.

int main(void)
# {
# xyz _xyz;
# abc _abc;

return 0;
# }
#
# In terms of memory allocation, is there any difference between that
# allocated for _xyz and _abc?

Add
printf("%d %d\n",sizeof(xyz),sizeof(abc));
I would expect to see it print
1 2

I would expect the unexpected. When you invoke the wrath of the
undefined behaviour gods, you never really know what to expect. Well,
sometimes you can make a reasonable guess, but it can always go wrong
and do something you least expect.
Instead, try this:

printf("sizeof xyz = %lu, sizeof abc = %lu\n",
       (unsigned long)sizeof(xyz),
       (unsigned long)sizeof(abc));

--

Dig the even newer still, yet more improved, sig!

http://alphalink.com.au/~phaywood/
"Ain't I'm a dog?" - Ronny Self, Ain't I'm a Dog, written by G. Sherry & W. Walker.
I know it's not "technically correct" English; but since when was rock & roll "technically correct"?
 

Karthik Kumar

Raj said:
Hello all,

I recently came across the following segment of code that defines a C
struct:

typedef struct
{
unsigned char unused_bits:4;
unsigned char wchair_state:2;
} xyz;

What do the numbers 4 and 2 refer to?

Search Google for "bit fields in C".
If I define a second struct as below:

typedef struct
{
unsigned char unused_bits;
unsigned char wchair_state;
} abc;


and then declare

void main(void)
{
xyz _xyz;
abc _abc;
}

In terms of memory allocation, is there any difference between that
allocated for _xyz and _abc?


Bit fields are generally used to pack data to conserve space.
So the two structs would typically have different sizes.
 

Fao, Sean

Peter said:
I would expect the unexpected. When you invoke the wrath of the
undefined behaviour gods, you never really know what to expect. Well,
sometimes you can make a reasonable guess, but it can always go wrong
and do something you least expect.
Instead, try this:

printf("sizeof xyz = %lu, sizeof abc = %lu\n",
       (unsigned long)sizeof(xyz),
       (unsigned long)sizeof(abc));

Ok, you've got me! Why is the above modification necessary and why
could the original invoke undefined behavior?

P.S.
I waited about two hours for my reply to show up on my news server and
still haven't seen it. I apologize if the original finally shows up.
 

Dan Pop

Fao said:
Ok, you've got me! Why is the above modification necessary and why
could the original invoke undefined behavior?

What is the type expected by %d in a printf format?
What is the type of the value yielded by the sizeof operator?
What happens when the two types don't match?

Dan
 

Fao, Sean

Ok, you've got me! Why is the above modification necessary and why
could the original invoke undefined behavior?

Actually, let me see if I can figure this out for myself.

The sizeof operator results in something of type size_t, which I assume
on some implementations *could* be an unsigned long. If that's correct,
why not unsigned long long?

But this doesn't answer my question as to where the UB could come into
play. Obviously, the original code with a %d could not display any
number over INT_MAX; however, the result would still be defined (at
least I _think_ rolling over is defined)

Am I missing something?
 

SM Ryan

# Ok, you've got me! Why is the above modification necessary and why
# could the original invoke undefined behavior?

Because there's a clique running about that gets so angry that I answer
questions they deemed unanswerable, that they have to attack me in any way
possible.
 

CBFalconer

Fao said:
Actually, let me see if I can figure this out for myself.

The sizeof operator results in something of type size_t, which I assume
on some implementations *could* be an unsigned long. If that's correct,
why not unsigned long long?

But this doesn't answer my question as to where the UB could come into
play. Obviously, the original code with a %d could not display any
number over INT_MAX; however, the result would still be defined (at
least I _think_ rolling over is defined)

Am I missing something?

Yes. How does the calling code know to transform a size_t into an
int? This is a variadic function being called. Let's say an int
occupies two bytes, and a size_t occupies 4 bytes, and we are using
a stack machine. The printf operates on two 2-byte portions of
the first (or last) parameter. Maybe those are trap representations.
UB everywhere you look.
 

Keith Thompson

Fao said:
Actually, let me see if I can figure this out for myself.

The sizeof operator results in something of type size_t, which I assume
on some implementations *could* be an unsigned long. If that's correct,
why not unsigned long long?

But this doesn't answer my question as to where the UB could come into
play. Obviously, the original code with a %d could not display any
number over INT_MAX; however, the result would still be defined (at
least I _think_ rolling over is defined)

Am I missing something?

Yes, you're missing the fact that printf() has no way of knowing the
actual types of its arguments other than the format string.

Let's take a simple example:

#include <stdio.h>
int main(void)
{
    long int n = 42;
    printf("n = %d\n", n);
    return 0;
}

By using "%d" in the format string, you're promising that the
corresponding argument is going to be of type int. The compiler can't
check that you've kept your promise; the format string could be a
variable, so the compiler has no way of knowing what's in it.
(Actually, some compilers can and do perform such checks if the format
is a literal, but the standard doesn't require it. "gcc -Wall" prints
a warning for mismatched printf formats.) The long int value is not
converted to int; you've told the compiler to assume that it's
*already* of type int. In other words, you've lied to the compiler,
and it will get its revenge.

What actually happens is going to depend on a number of things, such
as the parameter-passing convention. One possibility is that the
caller will push a long int (the value of n) onto the stack, and
printf() will pop an int from the stack, because you told it to expect
an int. In many cases, this will happen to work (because int and long
int are the same size, or because an int argument is passed the same
way as a long int argument, or because you just got lucky). In other
cases, it could corrupt the stack pointer, causing subsequent code to
lose track of which local variables are stored where (this is unlikely
for historical reasons, but the standard allows it). The bottom line
is that it's undefined behavior, meaning that the standard places no
constraints on what could happen, from behaving as you expect to
making demons fly out your nose. (Behaving as you expect is actually
the worst outcome, since it prevents you from finding the error until
you port the code to another platform, and it fails subtly or
spectacularly at the most inconvenient possible moment.)

By contrast, consider this example:

#include <stdio.h>

static void print_int(int x)
{
    printf("%d", x);
}

int main(void)
{
    long int n = 42;
    printf("n = ");
    print_int(n);
    printf("\n");
    return 0;
}

This doesn't invoke undefined behavior; it's guaranteed to print

n = 42

(assuming there are no problems writing to stdout). The function
print_int() expects an argument of type int, but you're passing it an
argument of type long int, so the argument is implicitly converted
from long int to int before being passed. Since the value 42 is
guaranteed to fit into an int, overflow is not an issue.

So why is the implicit conversion performed on the call to print_int()
but not on the call to printf()? Because for print_int(), there's a
prototype that tells the compiler that it expects an argument of type
int. The compiler knows it's going to need to generate an implicit
conversion, and the standard requires it to do so. For printf(),
there's also a prototype (assuming you've remembered the
"#include <stdio.h>"; if not, any call to printf() invokes undefined
behavior) -- but the prototype looks like this:

int printf(const char * restrict format, ...);

(The "restrict" keyword was added in C99; don't worry about the
"const" or "restrict" keywords for now.) The point is that the
compiler, given this prototype, has no way of knowing that the second
argument in printf("%d", x) is supposed to be an int.

Getting back to the original example:

printf("%d %d\n",sizeof(xyz),sizeof(abc));

Given the format string, printf() assumes (because you promised it)
that the second and third arguments are going to be of type int. The
compiler doesn't know about this promise, so it passes arguments of
type size_t. Undefined behavior.

One solution is to use *explicit* conversions, so you know that you're
passing arguments of the right type:

printf("%d %d\n", (int)sizeof(xyz), (int)sizeof(abc));

If you happen to know that sizeof(xyz) and sizeof(abc) will both fit
into an int, this is fine. If not, use a bigger type (probably an
unsigned one, since size_t is unsigned):

printf("%lu %lu\n",
(unsigned long)sizeof(xyz),
(unsigned long)sizeof(abc));

Or, if you happen to have a C99-compliant implementation, you can use
the new 'z' length modifier, which specifically applies to size_t
arguments:

printf("%zu %zu\n", sizeof(xyz), sizeof(abc));

but that gives you undefined behavior in a C90 implementation.
 

Keith Thompson

SM Ryan said:
# Ok, you've got me! Why is the above modification necessary and why
# could the original invoke undefined behavior?

Because there's a clique running about that gets so angry that I answer
questions they deemed unanswerable, that they have to attack me in any way
possible.

Nope.

The modification in question was changing
printf("%d %d\n",sizeof(xyz),sizeof(abc));
to
printf("sizeof xyz = %lu, sizeof abc = %lu\n",
       (unsigned long)sizeof(xyz),
       (unsigned long)sizeof(abc));

The former will happen to work on many systems, but it invokes
undefined behavior, for reasons I've explained at length elsewhere in
this thread. The latter corrects the problem.

If someone corrects a technical error I've made, I consider it a
favor, not an attack. Similarly, if someone pointed out to me that my
quoting style and signature delimiter were causing real problems for
other readers and posters, I would try to correct the problem; you
have stubbornly refused to do so, for no reason that you've ever made
clear.
 

Randy Howard

wyrmwif@tango-sierra-oscar- said:
... there's a clique running about that gets so angry that I answer
questions they deemed unanswerable, that they have to attack me in any way
possible.

I haven't seen anyone attack you, although I admit that people are often
miffed to find out they got something "not quite right" and a bunch of
folks jumped on them here. That's one of the less fortunate side effects
of this being an incredibly good place to get information on C. The
point is that the "pedants" pretty much guarantee as a collective that
no mistake will go unpunished. It's sort of the "Borg" of virtual
proofreaders. Don't make the mistake of thinking that having such
horsepower applied to your posts is a bad thing. Consider it rather
to be a learning (trial by fire) experience unequalled elsewhere and
move on to better programming.

I note that my newsreader didn't strip your signature automatically. The
reason is that you do not have the customary ' ' character after the '--'
in your sig. It's quite simple to fix. Here's a free example:
 

CBFalconer

Randy said:
.... snip ...

I note that my newsreader didn't strip your signature automatically. The
reason is that you do not have the customary ' ' character after the '--'
in your sig. It's quite simple to fix. Here's a free example:

He doesn't care. He also insists on his deviant quote marker. So
the only useful correction is PLONKing.
 

Keith Thompson

CBFalconer said:
He doesn't care. He also insists on his deviant quote marker. So
the only useful correction is PLONKing.

I seldom bother replying to him, but it is sometimes useful to correct
misinformation for the benefit of others. (That's not to imply that
he's always wrong; he isn't.)
 
