Appropriate Name Question

I

Immortal Nephi

I want to know. Should I start naming first upper case and then lower
case on both variable and function. Name helps to reduce confusion
for better readability.

For example:

int Keyboard_Data = 0; // OK?
int keyboard_Data = 0; // Fine
int keyboardData = 0; // Fine

void Get_Foo(); // OK ?
void get_Foo(); // Fine
void getFoo(); // Fine

const int KEY_A = 0x41; // Fine
const int Key_B = 0x42; // OK ?

enum list { KEY_A = 65, KEY_B, KEY_C }; // Fine
enum list { Key_A = 65, Key_B, Key_C }; // OK ?

typedef unsigned long u_int32_t; // OK ?
typedef unsigned long U_INT32_t; // OK ?

Thanks...
 
A

Arne Mertz

Immortal said:
I want to know. Should I start naming first upper case and then lower
case on both variable and function. Name helps to reduce confusion
for better readability.

That's only a matter of taste. I prefer the following:

classes types etc LikeThis
variables and Methods likeThis
constants, enum members, defines LIKE_THIS;
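
A minimal sketch of what this convention could look like in practice. The names `MAX_KEYS`, `KeyCode`, and `KeyboardState` are hypothetical, chosen only to echo the keyboard example from the question:

```cpp
#include <cstddef>

// Constants and enum members: LIKE_THIS (hypothetical names)
const std::size_t MAX_KEYS = 256;
enum KeyCode { KEY_A = 0x41, KEY_B, KEY_C };

// Classes and other types: LikeThis
class KeyboardState {
public:
    KeyboardState() : pressedCount(0) {}

    // Methods: likeThis
    void pressKey(KeyCode) {
        if (pressedCount < MAX_KEYS) ++pressedCount;
    }
    std::size_t getPressedCount() const { return pressedCount; }

private:
    // Variables: likeThis
    std::size_t pressedCount;
};
```

Note that other posters in this thread argue against all caps for anything but macros, so this shows one taste among several.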

greets
A
 
V

Victor Bazarov

blargg said:
Immortal said:
I want to know. Should I start naming first upper case and then lower
case on both variable and function. Name helps to reduce confusion
for better readability.
[...]

This is the convention I use:
[..]
void delete_files(); // verb phrase: functions that do something

void disk_count(); // noun phrase: function that returns value
.. ^^^^^^^^^^^^^
Aha... Shouldn't it be

unsigned disk_count();

perhaps? <BG>

blargg said:
int file_count; // lowercase noun phrase: object
[..]

V
 
V

Victor Bazarov

blargg said:
Victor said:
blargg said:
Immortal Nephi wrote:
I want to know. Should I start naming first upper case and then lower
case on both variable and function. Name helps to reduce confusion
for better readability.
[...]

This is the convention I use:
[..]
void delete_files(); // verb phrase: functions that do something

void disk_count(); // noun phrase: function that returns value
. ^^^^^^^^^^^^^
Aha... Shouldn't it be

unsigned disk_count();

perhaps? <BG>

Heh, yeah, I went and made the same error as the original poster (and I
was even going to correct him on it, until I decided not to comment on his
examples). Except I'd return a signed int, since it's a number not a
bitmask. :)

My logic (which has to be faulty) dictates that the number of something
cannot be signed... But if the positive range of an int covers all
possible values, then there is probably no difference.

V
 
J

James Kanze

I want to know. Should I start naming first upper case and
then lower case on both variable and function. Name helps to
reduce confusion for better readability.

Certainly. There are several different widespread conventions.
About the only real rules are:

-- Macros should be all caps, and nothing else (at least
nothing longer than a single character) should be all caps.

-- Types and non-types (variables and functions) should be
easily distinguished---you need to distinguish between types
and non types in order to parse C++. One frequent variant
is that typenames begin with a capital, and variables and
functions with a small letter.

If everything is perfectly named, this rule shouldn't be
necessary; the semantics of the name should make it clear.
In practice, however...

-- All of the code I've seen applies the rules for classes and
functions to class and function templates. I'm not sure
that this is a good idea, given that in some cases you need
to know whether a symbol is a template or not in order to
parse C++ as well, but I've not seen any good proposals for
a distinguishing rule.

-- Not quite as absolute (there are exceptions), but generally,
types should be unqualified nouns, variables qualified
nouns, and functions verbs. It's also a fairly widespread
convention that "predicate" functions, returning a bool
start with "is", "are" or "has". But as I said, this is
only a very general guideline, with exceptions.
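
The rules above could be sketched roughly as follows. This is only an illustration; `TRACE_ENABLED`, `Queue`, and `drainAndCheck` are hypothetical names, not taken from any real code base:

```cpp
#include <string>
#include <vector>

// Macro: all caps, and nothing else is all caps. (Hypothetical macro.)
#define TRACE_ENABLED 0

// Type: unqualified noun, begins with a capital letter.
class Queue {
public:
    // Predicate: starts with "is" and returns bool.
    bool isEmpty() const { return items.empty(); }

    // Functions: verbs, begin with a small letter.
    void append(const std::string& item) { items.push_back(item); }
    void clear() { items.clear(); }

private:
    std::vector<std::string> items;
};

// Function: verb phrase. Local variable: qualified noun, small letter.
bool drainAndCheck() {
    Queue pendingQueue;
    pendingQueue.append("job");
    pendingQueue.clear();
    return pendingQueue.isEmpty();
}
```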
 
J

James Kanze

Immortal Nephi wrote:
That's only a matter of taste. I prefer the following:
classes types etc LikeThis
variables and Methods likeThis
constants, enum members, defines LIKE_THIS;

The all caps is generally reserved for macros, and only macros.
Don't forget that one man's constant is another man's variable.
It's not a good idea to distinguish them from other variables.
 
J

James Kanze

blargg said:
Victor said:
blargg wrote:
Immortal Nephi wrote:
I want to know. Should I start naming first upper case
and then lower case on both variable and function. Name
helps to reduce confusion for better readability.
[...]
This is the convention I use:
[..]
void delete_files(); // verb phrase: functions that do something
void disk_count(); // noun phrase: function that returns value
. ^^^^^^^^^^^^^
Aha... Shouldn't it be
unsigned disk_count();
perhaps? <BG>
Heh, yeah, I went and made the same error as the original
poster (and I was even going to correct him on it, until I
decided not to comment on his examples). Except I'd return a
signed int, since it's a number not a bitmask. :)
My logic (which has to be faulty) dictates that the number of
something cannot be signed... But if the positive range of an
int covers all possible values, then there is probably no
difference.

There is if you write something like:
if ( disk_count() > -1 )
:). Mixing signed and unsigned in C++ leads to all sorts of
surprises, and is best avoided. The natural "integral" type is
int, which is signed, so it's best to avoid unsigned except for
special cases. (The problem is that "unsigned" doesn't behave
as a proper cardinal type, especially when mixed with signed.
If it did, I'd agree with you.)
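
A small sketch of the surprise described here, assuming two hypothetical variants of `disk_count()` that differ only in return type. In the unsigned variant, -1 is converted to unsigned before the comparison (many compilers warn about this):

```cpp
#include <limits>

// Hypothetical functions; the name follows the thread's example.
unsigned disk_count_unsigned() { return 3u; }
int disk_count_signed() { return 3; }

// -1 is converted to unsigned before the comparison and becomes the
// largest unsigned value, so this test is false for every possible count.
bool unsigned_count_is_valid() { return disk_count_unsigned() > -1; }

// With a signed return type the comparison means what it says.
bool signed_count_is_valid() { return disk_count_signed() > -1; }

// The value that -1 turns into when converted to unsigned.
unsigned converted_minus_one() { return static_cast<unsigned>(-1); }
```

Here `unsigned_count_is_valid()` returns false even though three disks were counted, while `signed_count_is_valid()` returns true.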
 
J

James Kanze

»Procedure names should reflect what they do;
function names should reflect what they return.«
Rob Pike; »Notes on Programming in C«; February 21, 1989

That's actually not too bad, but only if you understand the
distinction between "procedure" and "function" at an abstract
level: a function returns a value, and does nothing else
(doesn't modify state---with the possible exception of things
like rand()). Formally, the POSIX function read is a function,
not a procedure, but I wouldn't like to see it named "count",
even if that's what it returns. (Conceptually, it's a
procedure, of course---it does something, and the value it
returns is, in many ways, incidental.)

There's also the convention that predicate functions start with
"is", "are" or "has". This is very necessary if symbol names
are based on English, since it's often ambiguous whether
something is a verb or not, e.g.: a function named "isEmpty"
clearly returns true if whatever is empty; a function named
"empty" probably empties something. (And yes, the standard
library is full of bad examples.)
 
S

Stefan Ram

(And yes, the standard library is full of bad examples.)

Regarding the standard library and case of letters:

IIRC, the names of entities in both ISO/IEC 9899:1999 (E) and
ISO/IEC 14882:2003(E) are written with lowercase letters only,
unless they designate a macro.

One exception concerns the names of some standard functions in C:
the implementation is allowed to implement them as macros,
but they are still written in lowercase letters only.

(»Any function declared in a header may be additionally
implemented as a function-like macro«,
ISO/IEC 9899:1999 (E), 7.1.4)

Another exception is the recent additions to C: »_Bool«,
»_Complex«, »_Imaginary«. I believe this is the first
time a C standard has included mixed-case identifiers.

I am not aware of such identifiers in ISO/IEC 14882:2003(E).

The classical application of C is Unix system programming,
and, IIRC, the classical Unix sources I have seen use
lowercase letters only.

The first time I saw mixed case identifiers (like »WriteLn«)
was in Pascal (where case is not significant), and then on
code for Apple's Lisa, which was Pascal-based IIRC. When C
compilers became available for the Macintosh they had to use
the mixed case system calls, too. Possibly this was the way
mixed case identifiers crept their way into C (and, later,
C++) programming.

Since, as far as I remember, I grew up with
lowercase-only identifiers in C, they still look the most
beautiful to me in the context of C.

The classical style I remember uses underscores to separate
words (»print_formatted«), but names like »printf« were used
for global identifiers because some linkers only supported up
to six significant characters in identifiers.

One can look at classical source code by Bjarne Stroustrup:

http://www.softwarepreservation.org/projects/c_plus_plus/cfront/release_e/src/cfront.pdf

It contains an identifier made of two English words without an
underscore, »morecore«. It also contains mixed-case
identifiers, like »NFn« or »Nfree«.

Donald E. Knuth has consistently avoided both word separators
and uppercase letters in identifiers in the surface of TeX,
even in long names such as »exhyphenpenalty«,
»tracinglostchars«, »tracingparagraphs«, »thickmuskip«,
»rightleftharpoons«, »scriptscriptfont«, or
»normalbaselineskip«.

The English language uses /open/ compounds, like »school bus«,
while German uses /closed/ compounds, like »Schulbus«. But
still, sometimes, even in English, words can merge to form a
new closed compound, like »schoolbus«. To quote Donald E. Knuth:

»Newly coined nonce words of English are often spelled
with a hyphen, but the hyphen disappears when the words
become widely used. For example, people used to write
"non-zero" and "soft-ware" instead of "nonzero" and
"software"; the same trend has occurred for hundreds of
other words. Thus it's high time for everybody to stop
using the archaic spelling "e-mail". Think of how many
keystrokes you will save in your lifetime if you stop now!
The form "email" has been well established in England for
several years, so I am amazed to see Americans being
overly conservative in this regard.«

http://www-cs-faculty.stanford.edu/~knuth/email.html

When trying to answer the question which naming scheme to use,
I would look up authoritative sources, such as example code
from ISO/IEC 14882:2003(E), example code in books by Bjarne
Stroustrup, the C++ standard library itself, or Boost, and see
how it is done there.
 
S

Stefan Ram

The first time I saw mixed case identifiers (like »WriteLn«)
was in Pascal (where case is not significant), and then on
code for Apple's Lisa, which was Pascal-based IIRC. When C
compilers became available for the Macintosh they had to use
the mixed case system calls, too. Possibly this was the way
mixed case identifiers crept their way into C (and, later,
C++) programming.

One advantage of lowercase-only identifiers and never using an
underscore is that this is easy to remember: One only needs to
remember the words.

In a language where case is not significant, this does not
matter that much, but in Java, where case is significant, one
has to learn that one needs to write »copyOf« with an
uppercase »O« (a method name from the standard class
»java.util.Arrays«), but »if( o instanceof C )« with a lowercase »o«.

If the style guide also allows underscores, there are more
possibilities, like »instance_of«, »Instance_of«,
»instance_Of«, »Instance_Of«, »instanceof«, »Instanceof«,
»instanceOf«, and »InstanceOf«. A consistent scheme helps to
eliminate many of those and thus to remember how to write a
name.
 
J

Juha Nieminen

Arne said:
variables and Methods likeThis

One small problem with that is that if you don't distinguish in any
way, e.g., between member variables, local variables and global variables
(or, more precisely, variables global to the current compilation unit,
i.e. the ones in an unnamed namespace, because we don't really use
program-wide globals, do we?) in the naming, code can become a bit confusing.

For example, if you have a line of code like this in the
implementation of some member function:

totalSum += counter * factor;

then it's not at all clear what that "totalSum" is. Is it a variable
local to this function? Is it a member variable of this class? Maybe
it's a variable global to this compilation unit? If the function is
dozens of lines long, it may not be immediately obvious. If "totalSum"
is not being declared as a local variable of the function, then it
becomes even less clear what it is. You would have to examine the
declaration of the class to see if "totalSum" has been declared there.
If it hasn't, then it's even more confusing.

Some people use simple prefixes to distinguish between these types of
variables. For example, you could use the prefix "m" for member
variables and "g" for global variables. Some people also use a prefix
for local variables (I think "a" is used by some people), but
personally I'm not sure that's necessary (because if everything else has
been prefixed, then a non-prefixed variable name becomes rather
obviously a local variable).

In other words, if the line was:

mTotalSum += counter * factor;

then - assuming you know the naming convention - you know immediately
that "mTotalSum" is a member variable of this class, while the other two
variables in that line are local to this function, and thus the line
became much easier to understand even without the context.
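
A minimal sketch of that prefix scheme, assuming a hypothetical `Accumulator` class and a hypothetical file-local counter `gUpdateCount`; only the line with `mTotalSum` comes from the post itself:

```cpp
namespace {
// "g" prefix: variable global to this compilation unit (hypothetical).
int gUpdateCount = 0;
}

class Accumulator {
public:
    Accumulator() : mTotalSum(0) {}

    void addScaled(int counter, int factor) {
        // The prefixes show scope at a glance: one member variable,
        // two names local to this function, one file-level global.
        mTotalSum += counter * factor;
        ++gUpdateCount;
    }

    int totalSum() const { return mTotalSum; }

private:
    int mTotalSum;  // "m" prefix: member variable
};
```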
 
B

Bo Persson

Stefan said:
Regarding the standard library and case of letters:

IIRC, the names of entities in both ISO/IEC 9899:1999 (E) and
ISO/IEC 14882:2003(E) are written with lowercase letters only,
unless they designate a macro.

With a few exceptions, of course. :)

A famous one is the class ios_base::Init - with a capital I. Possibly
a mistake.

Another one is assert(), which is lowercase but required to be a
macro.

Stefan said:
The first time I saw mixed case identifiers (like »WriteLn«)
was in Pascal (where case is not significant), and then on
code for Apple's Lisa, which was Pascal-based IIRC. When C
compilers became available for the Macintosh they had to use
the mixed case system calls, too. Possibly this was the way
mixed case identifiers crept their way into C (and, later,
C++) programming.

I think you overstate the importance of the Macintosh, but perhaps not
Pascal (or Modula-2).

I for sure used mixed-case identifiers before the Mac was introduced.
:)

Stefan said:
Since, as far as I remember, I grew up with
lowercase-only identifiers in C, they still look the most
beautiful to me in the context of C.

The classical style I remember uses underscores to separate
words (»print_formatted«), but names like »printf« were used
for global identifiers because some linkers only supported up
to six significant characters in identifiers.

One can look at classical source code by Bjarne Stroustrup:

http://www.softwarepreservation.org/projects/c_plus_plus/cfront/release_e/src/cfront.pdf

It contains an identifier made of two English words without an
underscore, »morecore«. It also contains mixed-case
identifiers, like »NFn« or »Nfree«.

The English language uses /open/ compounds, like »school bus«,
while German uses /closed/ compounds, like »Schulbus«. But
still, sometimes, even in English, words can merge to form a
new closed compound, like »schoolbus«.

I believe this to be more important. Just like Wirth, many native
speakers of German or the Scandinavian languages read and write compound
words all the time. That would explain why Bjarne used morecore as his
function name. It is not at all hard to read!

A lot of underscores on the other hand - words just don't have
underscores!

Stefan said:
When trying to answer the question which naming scheme to use,
I would look up authoritative sources, such as example code
from ISO/IEC 14882:2003(E), example code in books by Bjarne
Stroustrup, the C++ standard library itself, or Boost, and see
how it is done there.

And just don't forget to skip the occasional bad example. :)


Bo Persson
 
A

Arne Mertz

Juha said:
Arne Mertz wrote:
If the function is
dozens of lines long, it may not be immediately obvious.

If a function is dozens of lines long, it most assuredly does not do
one thing at one level of abstraction. If I find myself writing a
function of dozens of lines, I refactor it and pull out a bunch of
subfunctions at lower levels of abstraction.
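
As a rough sketch of this kind of refactoring, assume a hypothetical `summarize` function that would otherwise do all of its arithmetic inline. All names here are illustrative:

```cpp
#include <numeric>
#include <string>
#include <vector>

// Lower level of abstraction: each helper does one named task.
int sumOf(const std::vector<int>& values) {
    return std::accumulate(values.begin(), values.end(), 0);
}

double averageOf(const std::vector<int>& values) {
    if (values.empty()) return 0.0;
    return static_cast<double>(sumOf(values)) / values.size();
}

std::string describeTrend(double average, int threshold) {
    return average > threshold ? "above" : "below or at";
}

// Higher level: the body reads as a summary of the steps.
std::string summarize(const std::vector<int>& values, int threshold) {
    return describeTrend(averageOf(values), threshold);
}
```

Whether such a split helps or hurts in a given case is exactly what the following replies debate.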

A
 
J

Juha Nieminen

Arne said:
If a function is dozens of lines long, it most assuredly does not do
one thing at one level of abstraction. If I find myself writing a
function of dozens of lines, I refactor it and pull out a bunch of
subfunctions at lower levels of abstraction.

1) Sometimes trying to forcefully split a longer function into smaller
ones only makes the overall result more complicated, harder to
understand and more laborious to write.

2) Even if all your functions are 2 lines of code in length at max,
that still doesn't mean using a clear variable naming scheme wouldn't be
a good idea which would make the code easier to understand.
 
J

James Kanze

Regarding the standard library and case of letters:

The case is not the problem with standard library names. It's
things like "empty" not emptying anything, "remove" not removing
anything, and "good" not being the opposite of "bad".
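
The three names mentioned here can be checked with a short sketch. The wrapper function names are hypothetical; the library calls are standard:

```cpp
#include <algorithm>
#include <sstream>
#include <vector>

// empty() asks a question; it does not empty the container.
bool emptyLeavesContentsAlone() {
    std::vector<int> values(3, 7);
    bool wasEmpty = values.empty();
    return !wasEmpty && values.size() == 3;
}

// std::remove only shifts the kept elements forward; the size stays the
// same until erase() is called with the returned iterator.
bool removeKeepsSize() {
    std::vector<int> values;
    values.push_back(1);
    values.push_back(2);
    values.push_back(1);
    std::vector<int>::iterator newEnd =
        std::remove(values.begin(), values.end(), 1);
    bool sizeUnchanged = (values.size() == 3);
    values.erase(newEnd, values.end());
    return sizeUnchanged && values.size() == 1;
}

// After a failed extraction only failbit is set, so the stream is
// neither good() nor bad().
bool streamIsNeitherGoodNorBad() {
    std::istringstream input("not a number");
    int number = 0;
    input >> number;
    return !input.good() && !input.bad() && input.fail();
}
```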
 
J

James Kanze

One advantage of lowercase-only identifiers and never using
an underscore is that this is easy to remember: One only
needs to remember the words.

And how would, e.g., using an underscore between words or
starting each word with a capital letter change this? It's
important to have a convention, and you should be able to
dictate code over the telephone with no ambiguities, but whether
you separate words with underscores, use a capital at the start
of each new word, or just run words together doesn't affect
that, as long as you are consistent.

In a language where case is not significant, this does not
matter that much, but in Java, where case is significant,
one has to learn that one needs to write »copyOf« with an
uppercase »O« (a method name from the standard class
»java.util.Arrays«), but »if( o instanceof C )« with a
lowercase »o«.

That's because Java isn't consistent :). (Actually, you've
chosen a bad example. Because instanceof is a keyword, it is a
single, newly invented word, so the use of all lower case can be
justified. Although, even as a keyword, I'd have made it
instanceOf.)

If the style guide also allows underscores, there are more
possibilities, like »instance_of«, »Instance_of«,
»instance_Of«, »Instance_Of«, »instanceof«, »Instanceof«,
»instanceOf«, and »InstanceOf«.

The style guidelines should choose one, and only one.

A consistent scheme helps to eliminate many of those and
thus to remember how to write a name.

Exactly. Consistency is the key.
 
S

Stefan Ram

James Kanze said:
And how would, e.g., using an underscore between words or
starting each word with a capital letter change this? It's
important to have a convention, and you should be able to
dictate code over the telephone with no ambiguities, but whether
you separate words with underscores, use a capital at the start
of each new word, or just run words together doesn't affect
that, as long as you are consistent.

This would not be obvious, when the English language already
allows both possibilities:

»Usage in the US and in the UK differs and often depends
on the individual choice of the writer rather than on a
hard-and-fast rule; therefore, open, hyphenated, and
closed forms may be encountered for the same compound
noun, such as the triplets
container ship/container-ship/containership and
particle board/particle-board/particleboard.«

http://en.wikipedia.org/wiki/English_compound
 
A

Arne Mertz

Juha said:
1) Sometimes trying to forcefully split a longer function into smaller
ones only makes the overall result more complicated, harder to
understand and more laborious to write.

It is not about doing anything _forcefully_. It's about doing it
reasonably. Again, if I have a function that is dozens of lines
long, it is not doing one thing; it is doing several tasks at a
time. Giving each of those tasks a name and pulling the implementation
of the tasks out into functions labeled with those names IMHO makes
the overall result easier to understand and less complicated. Yes,
it is a bit more laborious to write, but it's worth the effort if it is
much less laborious to read afterwards.

Juha said:
2) Even if all your functions are 2 lines of code in length at max,
that still doesn't mean using a clear variable naming scheme wouldn't be
a good idea which would make the code easier to understand.

I never said I am against giving the variables clear names. On the
contrary, I say you should give the variables names that are clear
enough to make all those local/global/classlevel/hungarian-notation
prefixes and suffixes superfluous. When one reads code, those
ever-appearing prefixes and suffixes tend to be ignored after a
time. And if they are ignored, they are useless and can be left out
completely.

greets
A
 
J

Juha Nieminen

Arne said:
I never said I am against giving the variables clear names. On the
contrary, I say you should give the variables names that are clear
enough to make all those local/global/classlevel/hungarian-notation
prefixes and suffixes superfluous. When one reads code, those
ever-appearing prefixes and suffixes tend to be ignored after a time.
And if they are ignored, they are useless and can be left out completely.

I was talking about 2 possible prefixes consisting of one single
character each. It's not like they would clutter the variable names
visually or anything.

I don't find them useless when they help distinguishing between local,
member and global variables without having to see the context (which
might be large and spread across multiple files).
 
