Coding Style Question - Is there any sane reason for doing this?

B

BCFD36

I am working on a piece of legacy code that has many, shall we say,
strange things. This one, as J.K. Anderson (classics Prof at UC
Berkeley) used to say, "baffles me". It has the following statement in
the code:

const unsigned int WAKEUP_MASK = (1 << 1) ;

Is there any reason for doing this, other than to make the code even
more obscure?

D. Scruggs
 
A

Alf P. Steinbach

I am working on a piece of legacy code that has many, shall we say,
strange things. This one, as J.K. Anderson (classics Prof at UC
Berkeley) used to say, "baffles me". It has the following statement in
the code:

const unsigned int WAKEUP_MASK = (1<< 1) ;

Is there any reason for doing this, other than to make the code even
more obscure?

No, there is no good reason: one should in general reserve ALL UPPERCASE
identifiers for macros.

But they should be used for macros.

Also, there is no reason for the keyword `int`.

So that's also quite baffling: is the author, like, a compulsive typist
who just wants to type as much as possible, regardless of lack of meaning?

Anyway, the almost OK part of that code, `1 << 1`, explicitly tells you
that this is a single bit bitmask, and which bit is set; in the context
of the rest it's bafflingly clear & communicative. It's generally good
to be explicit in code, and let the compiler handle the translation to
machine code, rather than doing it oneself and writing e.g. 0x0002. The
latter is just a number which the reader then must decompile. However,
for the general case `1 << n` runs the risk of Undefined Behavior, since
it is a signed expression. It should be `1u << n`.


Cheers & hth.,

- Alf
 
I

Ian Collins

I am working on a piece of legacy code that has many, shall we say,
strange things. This one, as J.K. Anderson (classics Prof at UC
Berkeley) used to say, "baffles me". It has the following statement in
the code:

const unsigned int WAKEUP_MASK = (1<< 1) ;

Is there any reason for doing this, other than to make the code even
more obscure?

In addition to Alf's comments, the form is quite common especially in
the embedded or driver world. Although it is more common to see
something like

const unsigned wakeupBit = 1;
const unsigned someOtherBit = 2;

// other bit definitions

const uint32_t wakeupMask = (1u << wakeupBit);
 
B

BCFD36

In addition to Alf's comments, the form is quite common especially in
the embedded or driver world. Although it is more common to see
something like

const unsigned wakeupBit = 1;
const unsigned someOtherBit = 2;

// other bit definitions

const uint32_t wakeupMask = (1u << wakeupBit);

Interesting. I didn't know the form is common. I've never seen it
before, but I don't do much driver work and haven't seen it in the
embedded work I have done.

Personally, I find it quite obscure and prefer the 0x0002 notation
myself. But I will have to think on it some more. I can see how you
could OR together a bunch of these and it be pretty clear which bits are
set. But I still find it obscure.

Thanks, much. Just for grins, I think I'll run it by the rest of my
group and see if any of them have seen it before.
 
R

Richard

[Please do not mail me a copy of your followup]

"Alf P. Steinbach" <[email protected]> spake the secret code
No, there is no good reason: one should in general reserve ALL UPPERCASE
identifiers for macros.

But they should be used for macros.

I've seen people defining constant quantities like this in upper case
because they are used to only being able to define constants with
macros and they've equated the semantic category of "constants" to
"macros" in their mind.

I could go either way about it.

I prefer the advice given in Coding Standard 0: Don't Sweat the Small
Stuff, from "C++ Coding Standards" by Alexandrescue and Sutter.
 
A

Alf P. Steinbach

[Please do not mail me a copy of your followup]

"Alf P. Steinbach"<[email protected]> spake the secret code
No, there is no good reason: one should in general reserve ALL UPPERCASE
identifiers for macros.

But they should be used for macros.

I've seen people defining constant quantities like this in upper case
because they are used to only being able to define constants with
macros and they've equated the semantic category of "constants" to
"macros" in their mind.

I could go either way about it.

I prefer the advice given in Coding Standard 0: Don't Sweat the Small
Stuff, from "C++ Coding Standards" by Alexandrescue and Sutter.

Well, I also like that statement, and have also quoted it many times.

However, I think "small stuff" is mainly stuff that only impacts on the
feelings of proponents of this scheme and that.

Things that have associated costs such as wasted work time, especially
for others, are IMHO not that kind of small stuff, to be disregarded.
Name collisions have such costs. Reduced readability have such costs.


Cheers,

- Alf
 
B

BCFD36

[Please do not mail me a copy of your followup]

"Alf P. Steinbach"<[email protected]> spake the secret
code
No, there is no good reason: one should in general reserve ALL UPPERCASE
identifiers for macros.

But they should be used for macros.

I've seen people defining constant quantities like this in upper case
because they are used to only being able to define constants with
macros and they've equated the semantic category of "constants" to
"macros" in their mind.

I could go either way about it.

I prefer the advice given in Coding Standard 0: Don't Sweat the Small
Stuff, from "C++ Coding Standards" by Alexandrescue and Sutter.

Well, I also like that statement, and have also quoted it many times.

However, I think "small stuff" is mainly stuff that only impacts on the
feelings of proponents of this scheme and that.

Things that have associated costs such as wasted work time, especially
for others, are IMHO not that kind of small stuff, to be disregarded.
Name collisions have such costs. Reduced readability have such costs.


Cheers,

- Alf

I fully agree. If the original author had just added a comment about
what this was, it would have been no problem. Obscure with an
explanation I can handle. Obscure and just sitting there with everything
else, which also has no explanation, is just wrong.
 
B

BCFD36

The "1<<..." style is better for several reasons, most of which kick in
when you have larger values than just 0x2:

- it clearly documents this is a bit mask
- it clearly documents there is only one bit set
- it clearly documents which bit it is so one can easily avoid
collisions with other masks and see which bits are unused

The hexadecimal notation can also do all these things, but requires more
mental effort from the reader and familiarity with the hexadecimal
notation in the first place.

Cheers
Paavo

Only two small quibbles... it was not at all clear to me what this was
supposed to indicate. Now that I know the secret handshake, it makes
more sense. And doing it in HEX is no problem at all, at least for me.
Maybe it is the dinosaur in me.
 
N

Nick Keighley

[strange legacy code]

it's quite acommon idiom (the sift that is). I've always thought it a
bit twee. "I'm so cool I can do clever things with bit operators, even
though I don't know hex!".

Perhaps better would be

const unsigned BIT0 = 1;
const unsigned BIT1 = 2;
const unsigned BIT2 = 4;
const unsigned BIT3 = 8;
// etc.

then

const unsigned WAKEUP_MASK = BIT1;
The "1<<..." style is better for several reasons, most of which kick in
when you have larger values than just 0x2:

  - it clearly documents this is a bit mask

not really
  - it clearly documents there is only one bit set

sort of
  - it clearly documents which bit it is so one can easily avoid
collisions with other masks and see which bits are unused

I submit my named constantrs are clearer
The hexadecimal notation can also do all these things, but requires more
mental effort from the reader and familiarity with the hexadecimal
notation in the first place.

scarey thought. There are people writing device drivers that don't
know hex?
 
A

Alf P. Steinbach

scarey thought. There are people writing device drivers that don't
know hex?

Why not use octal? Saves you typing an "x". After all, the creators of C
were quite happy with octal for expressing low level bit patterns.


Cheers,

- Alf
 
V

Victor Bazarov

Assuming you're not having us on, you can't evenly divide 8, 16, or 32 by three.

Octal made sense in the 12-bit and 36-bit worlds from which the creators of C
hailed.

Are you saying that 040 is more difficult to read than 0x20? Just
checking...

V
 
J

Jorgen Grahn

Are you saying that 040 is more difficult to read than 0x20? Just
checking...

Perhaps both Alf and you are joking and my humor center is out of
operation, but anyway:

I learned programming 20 years ago, and never found any reason to
learn to read octal, other than in the very limited field of Unix file
protection bits. It takes practice, and very, very few people get that
practice these days.

/Jorgen
 
J

Jorgen Grahn

....
Also, there is no reason for the keyword `int`.

Oddly, I've recently met two decent/good programmers who didn't know that
the 'int' in 'unsigned int' is redundant. One asked me if just writing
'unsigned' was a GCC extension!

I suspect others write 'unsigned int' because they believe verbosity
increases readability. I remember I did that very early in my career.

/Jorgen
 
D

David Dyer-Bennet

Jorgen Grahn said:
Oddly, I've recently met two decent/good programmers who didn't know that
the 'int' in 'unsigned int' is redundant. One asked me if just writing
'unsigned' was a GCC extension!

I suspect others write 'unsigned int' because they believe verbosity
increases readability. I remember I did that very early in my career.

I write "unsigned int" because making just "unsigned" work is a clear
mistake in the standard :).
 
J

Jens Thoms Toerring

Oddly, I've recently met two decent/good programmers who didn't know that
the 'int' in 'unsigned int' is redundant. One asked me if just writing
'unsigned' was a GCC extension!
I suspect others write 'unsigned int' because they believe verbosity
increases readability. I remember I did that very early in my career.

I use the "unsigned int" form I guess due to some sense of sym-
metry, since I also have to write "unsigned long" and "unsigned
short" (and I wouldn't consider 4 more characters as "verbosity"
and, yes, to me it looks a bit more readable or at least it may
stop me from from getting distracted by wondering "Did I really
mean 'int' or did I perhaps forgot about the 'long' or 'short'?"
when I later read it again).
Regards, Jens
 
J

Jorgen Grahn

I use the "unsigned int" form I guess due to some sense of sym-
metry, since I also have to write "unsigned long" and "unsigned
short" (and I wouldn't consider 4 more characters as "verbosity"

Yes, you're right. I would have cited symmetry reasons back then.

I now think of the symmetry argument as bogus though. 'long' and
'short' are not "siblings" to 'int', just shorthand for 'long int' and
'short int'.

(And now someone will come claiming he always writes 'unsigned long
int' for maximum readability.)

/Jorgen
 
M

MikeWhy

Jorgen said:
Yes, you're right. I would have cited symmetry reasons back then.

I now think of the symmetry argument as bogus though. 'long' and
'short' are not "siblings" to 'int', just shorthand for 'long int' and
'short int'.

(And now someone will come claiming he always writes 'unsigned long
int' for maximum readability.)

For full, well rounded symmetry, you might consider writing 'signed long
int'.

I'm still stuck on simpler matters, 'uint32_t' for example, where 'unt32_t'
would seem more symmetric with 'int32_t'.

int32_t ifoo;
unt32_t ufoo;

uint32_t xfoo; // needs a tab rather than space to keep it aligned.
 
H

hanukas

[strange legacy code]

it's quite acommon idiom (the sift that is). I've always thought it a
bit twee. "I'm so cool I can do clever things with bit operators, even
though I don't know hex!".

Perhaps better would be

const unsigned BIT0 = 1;
const unsigned BIT1 = 2;
const unsigned BIT2 = 4;
const unsigned BIT3 = 8;
// etc.

then

const unsigned WAKEUP_MASK = BIT1;








The "1<<..." style is better for several reasons, most of which kick in
when you have larger values than just 0x2:
  - it clearly documents this is a bit mask

not really
  - it clearly documents there is only one bit set

sort of
  - it clearly documents which bit it is so one can easily avoid
collisions with other masks and see which bits are unused

I submit my named constantrs are clearer
The hexadecimal notation can also do all these things, but requires more
mental effort from the reader and familiarity with the hexadecimal
notation in the first place.

scarey thought. There are people writing device drivers that don't
know hex?

Sure we do know hex. But we are translating System/Catapult code and
specification into C/C++ header, that is the simplest way to do it.
Let's say we have a 5 bit field at offset 20. Sure we can juggle the
masks in our heads nibble-at-time and get the right answer 99.9% of
the time. But with this approach we do it in fraction of the time and
get it right 99.999% of the time.

We don't really care what that header looks like, what we do care
about is that RB_BLOCK_CLOCK_GATE_MASK has the right value on it. We
don't think in terms of "we are too cool to get our jobs done on
time", you know? What would doing this thing ass-backwards prove? That
we know hex? You are serious, though? Scary.
 
H

hanukas


It's a question of point of view, as usual; in my mind I see those as:

0x20 = 0010 0000
040 = 000 100 000

It's just a different # of bits grouped together, big deal. But that
doesn't change anything. The idea of (1L << N) isn't to be readable
anyway (in the context of embedded driver work) so arguing which form
is more readable is missing the point.

So what is the point? The point is painless translation of
specification into usable header in finite time. Each field has mask
and offset, these are used through everywhere else. This comes from
digital circuit logic design languages like VHDL, SystemC, Catapult.
The idea is painless interaction between those and C/C++ (C mostly).
 
H

hanukas

It's a question of point of view, as usual; in my mind I see those as:

0x20 = 0010 0000
040 = 000 100 000

It's just a different # of bits grouped together, big deal. But that
doesn't change anything. The idea of (1L << N) isn't to be readable
anyway (in the context of embedded driver work) so arguing which form
is more readable is missing the point.

So what is the point? The point is painless translation of
specification into usable header in finite time. Each field has mask
and offset, these are used through everywhere else. This comes from
digital circuit logic design languages like VHDL, SystemC, Catapult.
The idea is painless interaction between those and C/C++ (C mostly).

For the record, none of the proposed approaches are used anyway.

(1 << 4) // nope
(1U << 4) // naah
(1L << 4) // uh-huh..

Here's how it's done:

(0xf << 12) // zero-based mask shifted to correct position

Or more concrete:

SOME_REGISTER_FIELD_MASK (0xff << 16)
SOME_REGISTER_FIELD_OFFSET 16
SOME_REGISTER_FIELD_SIZE 8

Usually this is not done by hand. The HLL description of logic is
translated into C header and we always get all of these; mask, offset,
size. It's up to the driver author to use the appropriate fields where
needed.

When we type these by hand, we follow the same convention for
consistency. =)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top