wtf is this?

Noah Roberts · Sep 29, 2003

This is the first time I have seen code like this and I don't know what
it means. It appears to be C code but I thought I was familiar with
most constructs...must have missed one:

u_char c_perm : 3;

This declaration is inside of a struct along with several other
variables declared in a similar manner.

I am assuming u_char to be a typedef for unsigned char, but what is the
" : 3" part doing there and what does it mean? Is this new syntax from
C99 or has it always been around and I just never ran into it before?

I found this code in the Xok exokernel from MIT if it makes a diff or
you are curious.

Thanks,
NR

Noah Roberts · Sep 29, 2003

Noah said:
This is the first time I have seen code like this and I don't know what
it means. It appears to be C code but I thought I was familiar with
most constructs...must have missed one:

u_char c_perm : 3;

You hold off asking until you are sure you can't find the answer, and
then you do. I actually knew what this was; I just hadn't ever seen it
used, so I forgot all about it. I have never actually seen code that used
bitfields, and that isn't the only strange thing it does:

/* An array of seven (7--I'm cheating) bytes naming the capability if
* it is not a pointer: */
u_char c_name[1];
/* The length and address of the capability list if this is a pointer: */
u_short cl_len;
struct Capability *cl_list;

Of course this code is not meant to be exactly portable

NR

Mike Wahler · Sep 29, 2003

Noah Roberts said:
You hold off asking until you are sure you can't find the answer, and
then you do. I actually knew what this was; I just hadn't ever seen it
used, so I forgot all about it. I have never actually seen code that used
bitfields, and that isn't the only strange thing it does:

What do you find strange?

/* An array of seven (7--I'm cheating) bytes naming the capability if
* it is not a pointer: */
u_char c_name[1];

That's an array of *one* type 'u_char' object.

/* The length and address of the capability list if this is a pointer: */
u_short cl_len;
struct Capability *cl_list;

Of course this code is not meant to be exactly portable

If 'u_char' and 'u_short' are typedefs for standard
types, then yes, it's portable.

-Mike

Mike Wahler · Sep 29, 2003

Noah Roberts said:
This is the first time I have seen code like this and I don't know what
it means. It appears to be C code but I thought I was familiar with
most constructs...must have missed one:

u_char c_perm : 3;

This declaration is inside of a struct along with several other
variables declared in a similar manner.

I am assuming u_char to be a typedef for unsigned char, but what is the
" : 3" part doing there and what does it mean? Is this new syntax from
C99 or has it always been around and I just never ran into it before?

It's been around quite a while. Look up 'bitfield'.

-Mike

Noah Roberts · Sep 29, 2003

Mike said:
/* An array of seven (7--I'm cheating) bytes naming the capability if
* it is not a pointer: */
u_char c_name[1];

That's an array of *one* type 'u_char' object.

By the comments I would gather that it is being used as *7*. Glad you
noticed the 1 there though. Kind of interesting that someone would
create an array of 1 element...

If 'u_char' and 'u_short' are typedefs for standard
types, then yes, it's portable.

Ahhhh...that's right; short == 2 bytes and pointer == 4, always...I knew
I was missing something there...guess I should have looked at it more.

Cute optimization, but I know *I* wouldn't call it portable...but like I
said, in this domain it doesn't need or even want to be.

NR

Dan Pop · Sep 29, 2003

Mike Wahler said:
[Noah wrote:]
I have never actually seen code that used bitfields, and that isn't
the only strange thing it does:

What do you find strange?

The fact that the bit field has the type u_char, which is seldom used as
an alias for unsigned int.

A bit-field may have type int, unsigned int, or signed int.
Whether the high-order bit position of a ``plain'' int bit-field is
treated as a sign bit is implementation-defined. A bit-field is
interpreted as an integral type consisting of the specified number of
bits.

Dan

Mike Wahler · Sep 29, 2003

Noah Roberts said:
Mike said:

/* An array of seven (7--I'm cheating) bytes naming the capability if
* it is not a pointer: */
u_char c_name[1];

That's an array of *one* type 'u_char' object.

By the comments I would gather that it is being used as *7*. Glad you
noticed the 1 there though. Kind of interesting that someone would
create an array of 1 element...

If 'u_char' and 'u_short' are typedefs for standard
types, then yes, it's portable.

Ahhhh...that's right; short == 2 bytes and pointer == 4, always

Not true.

...I knew
I was missing something there...guess I should have looked at it more.

Cute optimization,

I see no 'optimization', only obfuscation.

but I know *I* wouldn't call it portable...

Why not?

but like I
said, in this domain it doesn't need or even want to be.

No, much software is necessarily nonportable. This is
not a 'bad thing', but being unaware of it is.

-Mike

Arthur J. O'Dwyer · Sep 29, 2003

Noah Roberts said:
Mike said:

[Noah wrote:]
/* An array of seven (7--I'm cheating) bytes naming the capability if
* it is not a pointer: */
u_char c_name[1];

That's an array of *one* type 'u_char' object.

By the comments I would gather that it is being used as *7*. Glad you
noticed the 1 there though. Kind of interesting that someone would
create an array of 1 element...

/* The length and address of the capability list if this is a pointer: */
u_short cl_len;
struct Capability *cl_list;

Of course this code is not meant to be exactly portable

If 'u_char' and 'u_short' are typedefs for standard
types, then yes, it's portable.

Ahhhh...that's right; short == 2 bytes and pointer == 4, always

Not true.

I can't tell whether Noah's response there was supposed to be
tongue-in-cheek or not. But the code is a perfectly valid
optimization [and obfuscation] for your average x86 platform
(although, as Noah points out, non-portable). The portable
equivalent would have been

union {
    char c_name[7];
    struct {
        char c_name[1];
        unsigned short cl_len;
        struct Capability *cl_list;
    } rest;
} my_data;

So where old code might have written

if (c_name[0] == '\0')
    process(&cl_len, &cl_list);
else
    okay = strcmp(c_name, "SIXCHR");

the new, portable version would read

if (my_data.c_name[0] == '\0')
    process(&my_data.rest.cl_len, &my_data.rest.cl_list);
else
    okay = strcmp(my_data.c_name, "SIXCHR");

In the case that no structure padding is used, and sizeof(short)==2
and sizeof(struct Capability *)==4, then we have a data layout that
basically replicates the x86-compiler-specific code.

I see no 'optimization', only obfuscation.

Ditto. The six-byte savings is not worth the unportability.

Why not?

'Cause it isn't.

Although a c.l.c regular might be
forgiven for his relative abundance of tree-vision over
forest-vision in this case... ;-)

HTH,
-Arthur

Noah Roberts · Sep 30, 2003

Arthur said:
I can't tell whether Noah's response there was supposed to be
tongue-in-cheek or not.

Yeah, I was being a smartass.

But the code is a perfectly valid
optimization [and obfuscation] for your average x86 platform
(although, as Noah points out, non-portable).

And since this is a kernel only ever meant to be run on an x86 this is
ok. Certainly obfuscated, but not that badly; I would prefer the union
version myself as it confused me at first. I would bet it would work on
most current systems at any rate; you would only need to worry if the
last two were too small to hold 6 bytes between them.

The portable equivalent would have been

union {
    char c_name[7];
    struct {
        char c_name[1];
        unsigned short cl_len;
        struct Capability *cl_list;
    } rest;
} my_data;

So where old code might have written

if (c_name[0] == '\0')
    process(&cl_len, &cl_list);
else
    okay = strcmp(c_name, "SIXCHR");

the new, portable version would read

if (my_data.c_name[0] == '\0')
    process(&my_data.rest.cl_len, &my_data.rest.cl_list);
else
    okay = strcmp(my_data.c_name, "SIXCHR");

In the case that no structure padding is used, and sizeof(short)==2
and sizeof(struct Capability *)==4, then we have a data layout that
basically replicates the x86-compiler-specific code.

Can we be sure that there is no padding in the x86 specific version?

Ditto. The six-byte savings is not worth the unportability.

The portability issue is not a major issue in this case. This code is
meant to expose the underlying hardware through a secure interface; you
can't really do that portably. Also, this is a much used component
inside of a kernel. These objects will be passed with any system call
an application makes or tied to an internal representation of the
process (haven't gotten that far yet). It might be worth it in this
special case.

'Cause it isn't. Although a c.l.c regular might be
forgiven for his relative abundance of tree-vision over
forest-vision in this case... ;-)

That's why I said I should have looked harder. Mike is usually pretty
on point; I think he just read this one too fast or something.

NR

Arthur J. O'Dwyer · Sep 30, 2003

Yeah, I was being a smartass.

Okay. You're gonna have to try harder next time; looks like
[almost] everyone missed it. ;-)

But the code is a perfectly valid
optimization [and obfuscation] for your average x86 platform
(although, as Noah points out, non-portable).

And since this is a kernel only ever meant to be run on an x86 this is
ok. Certainly obfuscated, but not that badly; I would prefer the union
version myself as it confused me at first.

I *strongly* recommend the union version over the current version.
Because at the moment, the code is just *begging* for some clever
maintainer to re-arrange the global variables so that the pointer
comes first, then the short, then the char[1] (in the interests of
removing some padding bytes, on compilers that do that). Once
someone does that, the code will become entirely broken and will
cause incredibly hard-to-find system bugs. In a kernel, that's
bad. :-(

I would bet it would work on
most current systems at any rate; you would only need to worry if the
last two were too small to hold 6 bytes between them.

Which is entirely possible. *Especially* on an x86, where
2-byte pointers are the easiest thing to use, in programs not
exceeding one segment's-worth of data. Again, crashes and
Bad Things. [Hmm... this optimization is looking less and less
"perfectly valid" the more I think about it...]

The portable equivalent would have been

union {
    char c_name[7];
    struct {
        char c_name[1];
        unsigned short cl_len;
        struct Capability *cl_list;
    } rest;
} my_data;

In the case that no structure padding is used, and sizeof(short)==2
and sizeof(struct Capability *)==4, then we have a data layout that
basically replicates the x86-compiler-specific code.

Can we be sure that there is no padding in the x86 specific version?

No. But the code was obviously written with the *assumption* that
there wouldn't be any. (Oh, and BTW, it's not *really* x86-specific.
It's specific to any compiler+architecture with the right size
data, the right alignment requirements, and the right strategy for
allocating variables in memory.)

The portability issue is not a major issue in this case. This code is
meant to expose the underlying hardware through a secure interface; you
can't really do that portably.

But you *can* do *this* little bit of it portably. So why not
take the five minutes and make it portable? ...And if the modification
takes more than five minutes, then you're in more trouble than I
thought!

Also, this is a much used component
inside of a kernel. These objects will be passed with any system call
an application makes or tied to an internal representation of the
process (haven't gotten that far yet). It might be worth it in this
special case.

It's all just bits and bytes at the machine level. The compiler
really won't care whether you wrap things in a struct or not, trust
me.
*Some* optimizations do help dumb compilers in a real sense --
for example, loop unrolling or use of the 'register' keyword can
really help some compilers limp along. But in this case, I don't
see any way that wrapping up the variables safely could possibly
hurt your performance. They keep the same addresses and everything.

If you're really curious about performance questions, you
should compile both versions and examine the assembly output.
If you *still* want help, consult someone or some group where
your particular compiler or architecture is on-topic. I.e.,
not comp.lang.c, anymore, because this discussion IMHO has
drifted too far off-topic.

HTH,
-Arthur

Noah Roberts · Sep 30, 2003

Arthur said:
Which is entirely possible. *Especially* on an x86, where
2-byte pointers are the easiest thing to use, in programs not
exceeding one segment's-worth of data. Again, crashes and
Bad Things. [Hmm... this optimization is looking less and less
"perfectly valid" the more I think about it...]

Hey, it is from MIT, it must be right.

[snip]
It's all just bits and bytes at the machine level. The compiler
really won't care whether you wrap things in a struct or not, trust
me.

Yeah, there really shouldn't be such a thing as struct and union once it
is compiled, except possibly longer symbols (at a kernel level would
these exist or would it only use addresses?). I assumed you had meant a
six byte saving over this:

u_char name[7];
u_short whatever;
struct Capability* list;

I totally agree, it isn't pretty and the union version is better.

NR
