Common misconceptions about C (C95)

Ioannis Vranos · Nov 18, 2009

I have created a text available under GNU FDL 3 (Free Documentation License)
or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions.html

Any constructive comments/error corrections are welcome.

Best regards,

--
Ioannis Vranos

C95 / C++03 Software Developer

http://www.cpp-software.net

Seebs · Nov 18, 2009

Any constructive comments/error corrections are welcome.

You state that "Note: Empty parentheses are not equivalent to void arguments
in C, and should be avoided in function declarations and definitions, other
than main()."

I agree that they should be avoided, but I currently think that in a
function definition, they are "equivalent". (Which does not necessarily
mean that they have exactly the same effects on other code.)

You are incorrect to state that there are no implicit type conversions
for variadic functions. More precise would be to state that the implicit
type conversions are the default promotions (integer types smaller than int
to int, floating point types to double) rather than promotions specific to
the function. In the presence of a prototype, the arguments for which the
variadic function has corresponding declarations in its prototype get the
normal conversions, and it is only the arguments AFTER that which get
the default promotions.
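
A rough sketch of what that means in practice, in case it helps. The helper names (sum_small_ints, first_real) are invented for this illustration and are not from the document under review. The named parameter gets the normal prototype conversion, while everything after it arrives promoted, so va_arg has to ask for int or double:

```c
#include <stdarg.h>

/* Hypothetical helper: sums `count` values that the caller may pass as
 * char, short or int. After the default argument promotions they all
 * arrive as int, so va_arg must request int. */
int sum_small_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}

/* Hypothetical helper: a float passed in the variable part arrives as
 * double, so va_arg must request double. */
double first_real(int ignored, ...)
{
    va_list ap;
    double value;

    va_start(ap, ignored);
    value = va_arg(ap, double);
    va_end(ap);
    return value;
}
```

Calling sum_small_ints(3, c, s, 5) with a char c and a short s is therefore fine, and so is first_real(0, f) with a float f.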

I don't think it's poor coding practice to rely on the fact that '9' - '0' is
exactly 9; it's in the spec, and I think people have pretty much committed
to it.
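
For reference, a minimal sketch of the idiom being defended. The standard requires the decimal digits '0' through '9' to have consecutive, increasing values, so subtracting '0' gives the numeric value of a digit. The function name digit_value is made up for this example:

```c
/* Hypothetical helper: returns 0..9 for a decimal digit character and
 * -1 for anything else. Relies only on the guarantee that the values
 * of '0'..'9' are contiguous. */
int digit_value(int ch)
{
    if (ch < '0' || ch > '9')
        return -1;
    return ch - '0';
}
```

Note that no similar guarantee exists for letters, so 'z' - 'a' is not required to be 25.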

Your example for the <ctype.h> functions is incorrect. You are passing
plain char to them, but they take unsigned char values. On a machine
with signed char, a character that happens to have a negative representation
will come out wrong. Also, getchar() returns an int, and you should probably
use an int variable to hold its result -- this solves the problem, as getchar
returns things in the range for unsigned char (or EOF, which is negative,
on error).
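
A sketch of one way to do this safely, assuming the "C" locale. The names char_kind, classify and classify_char are invented for the example. The first function takes the int that getchar() returns; the second shows the cast needed when the data is already sitting in a plain char:

```c
#include <ctype.h>
#include <stdio.h>

enum char_kind { KIND_EOF, KIND_LOWER, KIND_UPPER, KIND_DIGIT, KIND_OTHER };

/* Hypothetical helper: classify a value exactly as getchar() returns it,
 * that is, either EOF or a value in the range of unsigned char. */
enum char_kind classify(int ch)
{
    if (ch == EOF)
        return KIND_EOF;
    if (islower(ch))
        return KIND_LOWER;
    if (isupper(ch))
        return KIND_UPPER;
    if (isdigit(ch))
        return KIND_DIGIT;
    return KIND_OTHER;
}

/* Hypothetical helper for data already stored in plain char: convert to
 * unsigned char first so a negative value never reaches <ctype.h>. */
enum char_kind classify_char(char c)
{
    return classify((unsigned char)c);
}
```

With this arrangement the caller would write something like int ch = getchar(); and pass ch straight to classify().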

I'm not sold on the wchar_t material; I think multibyte characters are also
basically supported, but really, neither is completely portable -- at least,
not enough that you can reasonably assume that you can create a program
whose source will work on an arbitrary system and print Greek letters. (Maybe
I'm too pessimistic here.)

The string literal form explicitly supports multibyte characters -- although
that obviously depends on your target environment. But then, so does passing
arbitrary strings in as wchar_t, I suspect.

-s

Keith Thompson · Nov 18, 2009

Seebs said:
You state that "Note: Empty parentheses are not equivalent to void arguments
in C, and should be avoided in function declarations and definitions, other
than main()."

I agree that they should be avoided, but I currently think that in a
function definition, they are "equivalent". (Which does not necessarily
mean that they have exactly the same effects on other code.)

Ioannis, in case you missed the recent discussion, the issue is that a
function definition with empty parentheses specifies that the function
has no parameters, but not that it expects no arguments, strange as
that sounds. For example:

void func()
{
    /* ... */
}

/* ... */

func(42);

The call's behavior is undefined, but no diagnostic is required, as it
would be if the definition started with "void func(void)".

But why do you say "other than main()"? Surely the main function, if
argc and argv are not used, should be declared as
int main(void) { /* ... */ }
not as
int main() { /* ... */ }

A rule that "(void)" is *always* preferable to "()" in function
declarations and definitions is easier to remember.

The latter is likely to work (and yes, the examples in K&R2 use
empty parentheses), but it's not clear that "int main()" is valid,
whereas "int main(void)" definitely is.

[...]

So far, I'm only responding to Seebs's comments; I'll try to find time
to read the document and give it a more thorough review.

Eric Sosman · Nov 18, 2009

Ioannis said:
I have created a text available under GNU FDL 3 (Free Documentation License)
or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions.html

Any constructive comments/error corrections are welcome.

Misconception #3: There seems to be a misconception here,
or at least a confusing statement. The variable arguments to
a variadic function are subject to "the default argument
promotions," so saying that "implicit type conversions do not
take place" is at best misleading. The first printf() call is
not erroneous, but valid.

Misconception #5: It *is* possible to use printf() "without
including any header file," because printf() is a Standard
library function that can be declared free-hand, without use
of a Standard header (unlike fopen(), say). Of course, it's
a dumb idea to write such a declaration -- but maintaining
that it's impossible isn't right.
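
A sketch of what such a free-hand declaration might look like (not a recommendation). No header is included; the wrapper name greet is invented for the example. This works for printf() because its declaration needs only built-in types, whereas fopen() needs the FILE type that only <stdio.h> provides:

```c
/* No #include <stdio.h> here: printf is declared by hand, with a type
 * compatible with the one the Standard specifies. */
int printf(const char *format, ...);

/* Hypothetical wrapper: prints a greeting and returns what printf
 * returned, i.e. the number of characters written (or a negative
 * value on error). */
int greet(void)
{
    return printf("hello, world\n");
}
```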

Misconception #11: Might be positioned near #1 for greater
effect and better continuity of exposition.

Misconception #13: Might be combined with #12 for brevity's
sake. "Brevity is the soul of wit," so if you can shrink
something by half that makes you a ...

General observation: There are quite a few misconceptions
about C that you have not touched upon, like "An int has 32
bits" and "Pointers are faster than array indices" and "The
size of a struct is the sum of the sizes of its elements."
But rather than try to fill a long list, why not just refer
people to the FAQ? Not meaning to be disrespectful, but what
does this list do that the FAQ hasn't already done?
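
To make two of those concrete, here is a small sketch. The struct and the function names are invented for illustration. It measures how far sizeof a struct exceeds the sum of its members and how many bits an int occupies, neither of which the Standard pins to a single value:

```c
#include <limits.h>
#include <stddef.h>

struct sample {
    char tag;
    int  value;
};

/* Bytes in struct sample that belong to no member (padding). Often 3
 * where int is 4 bytes and 4-byte aligned, but the amount is up to the
 * implementation. */
size_t sample_padding(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int));
}

/* Bits in the object representation of int: at least 16, and not
 * necessarily 32. */
size_t int_bits(void)
{
    return sizeof(int) * CHAR_BIT;
}
```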

Ioannis Vranos · Nov 18, 2009

Seebs said:
I agree that they should be avoided, but I currently think that in a
function definition, they are "equivalent". (Which does not necessarily
mean that they have exactly the same effects on other code.)

I think that in function definitions they are not "equivalent", and that
using empty parentheses is bad style. For example, consider the code:

void f() { }

int main(void)
{
    /* Obviously the user expects different functionality than what f()
     * provides.
     */
    f(4);

    return 0;
}

The good style would catch the mistake:

void f(void) { }

int main(void)
{
    /* It catches the mistake. */
    f(4);

    return 0;
}

Another example:

/* Bad style, no diagnostic. */

void f() { }

int main(void)
{
    void (*p)(int)= f;

    p(4);

    return 0;
}

/* Good style, we get a diagnostic. */

void f(void) { }

int main(void)
{
    void (*p)(int)= f;

    p(4);

    return 0;
}

You are incorrect to state that there are no implicit type conversions
for variadic functions. More precise would be to state that the implicit
type conversions are the default promotions (integer types smaller than
int to int, floating point types to double) rather than promotions
specific to
the function. In the presence of a prototype, the arguments for which the
variadic function has corresponding declarations in its prototype get the
normal conversions, and it is only the arguments AFTER that which get
the default promotions.

You are right, I will have to rephrase that.

I don't think it's poor coding practice to rely on the fact that '9' - '0'
is exactly 9; it's in the spec, and I think people have pretty much
committed to it.

I think it is a poor practice usually.

Under specific situations it can be helpful, like goto can be helpful too.

Your example for the <ctype.h> functions is incorrect.

The code is:

#include <stdio.h>
#include <ctype.h>

int main(void)
{
    char c= getchar();

    if( islower(c) )
        printf("\nThe character is lower case.\n");

    else
    if( isupper(c) )
        printf("\nThe character is upper case.\n");

    else
    if( isdigit(c) )
        printf("\nThe character is a decimal digit.\n");

    return 0;
}

You are passing
plain char to them, but they take unsigned char values.

Actually they are taking an int, with valid value ranges either the value
EOF, or a value representable as unsigned char.

Since "char c" gets its value from getchar() (probably it would be more
elegant if c was an int, however I do not check for EOF in this code
snippet), there is no way to get undefined behaviour in the above code.

Also, as far as I know, getchar() returns only character values of the basic
character set (value ranges in [0, 127]), and EOF, and not other character
values of the extended character set.

However I will change it to int.

On a machine with signed char, a character that happens to have a negative
representation will come out wrong.

With "signed char", I suppose you meant "plain char implemented as signed
integer type".

As far as I know, C90/C95 does not allow a character to have a negative
representation.

Also, getchar() returns an int, and you should
probably use an int variable to hold its result -- this solves the
problem, as getchar returns things in the range for unsigned char (or EOF,
which is negative, on error).

I'm not sold on the wchar_t material; I think multibyte characters are
also basically supported, but really, neither is completely portable -- at
least, not enough that you can reasonably assume that you can create a
program
whose source will work on an arbitrary system and print Greek letters.
(Maybe I'm too pessimistic here.)

Yes, because a system may not have support for Greek or other non-Latin
characters installed, or may not be able to provide it at all.

--
Ioannis Vranos

C95 / C++03 Software Developer

http://www.cpp-software.net

Keith Thompson · Nov 18, 2009

Ioannis Vranos said:
Seebs wrote:

[...]

A brief comment: You have a lot more blank lines than you need
in your code samples.

[...]

The code is:

#include <stdio.h>
#include <ctype.h>

int main(void)
{
    char c= getchar();

    if( islower(c) )
        printf("\nThe character is lower case.\n");

    else
    if( isupper(c) )
        printf("\nThe character is upper case.\n");

    else
    if( isdigit(c) )
        printf("\nThe character is a decimal digit.\n");

    return 0;
}

A side comment: the usual style is to put "else if" on one line:

if( islower(c) )
    ...
else if( isupper(c) )
    ...
else if( isdigit(c) )
    ...

Actually they are taking an int, with valid value ranges either the value
EOF, or a value representable as unsigned char.

Right.

Since "char c" gets its value from getchar() (probably it would be more
elegant if c was an int, however I do not check for EOF in this code
snippet), there is no way to get undefined behaviour in the above code.

Yes, there is. getchar() can easily return a value that exceeds
CHAR_MAX (if plain char is signed).

Also, as far as I know, getchar() returns only character values of the basic
character set (value ranges in [0, 127]), and EOF, and not other character
values of the extended character set.

Incorrect. A concrete example: if plain char is signed, 8 bits, and
the system's character set (in the current locale) is ISO-8859-1, if
the user enters 'e' with an acute accent ('é'), getchar() will return
the value 233, which when stored in a plain char will be converted to
-23. islower(-23) invokes undefined behavior.

[...]

As far as I know, C90/C95 does not allow a character to have a negative
representation.

It does. C90 6.1.2.5 says:

An object declared as type char is large enough to store any
member of the basic execution character set. If a member of the
required source character set enumerated in 5.2.1 is stored
in a char object, its value is guaranteed to be positive. If
other quantities are stored in a char object, the behavior is
implementation-defined; the values are treated as either signed
or nonnegative integers.

Similarly, C99 (actually N1256) 6.2.5p4 says:

An object declared as type char is large enough to store any
member of the basic execution character set. If a member of
the basic execution character set is stored in a char object,
its value is guaranteed to be nonnegative. If any other
character is stored in a char object, the resulting value is
implementation-defined but shall be within the range of values
that can be represented in that type.

Members of the basic execution character set must be non-negative, but
members of the extended character set may be negative.
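
A minimal sketch of how a program can check which case it is in. The function name is invented for this example. <limits.h> reports the range of plain char, and CHAR_MIN is negative exactly when plain char is signed:

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if plain char is a signed type on this
 * implementation, 0 if it is unsigned. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

Either answer is conforming, which is why code that feeds plain char to the <ctype.h> functions needs the conversion to unsigned char.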

[...]

Michael Tsang · Nov 18, 2009

Ioannis said:
I have created a text available under GNU FDL 3 (Free Documentation
License) or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions.html

Any constructive comments/error corrections are welcome.

Best regards,

Why do you still write a document about an outdated C standard? The current
version is C99.

Seebs · Nov 18, 2009

I think that in function definitions they are not "equivalent", and that
using empty parentheses is bad style. For example, consider the code:

It may be a bad style practice. I'm well aware of the way in which they can
deny you expected diagnostics.

However, there is definitely a formal equivalence, in that both are described,
when used in a function definition, as specifying that the function takes no
arguments.

I think it is a poor practice usually.

Why?

It's guaranteed by the standard. It's worked everywhere always.

#include <stdio.h>
#include <ctype.h>

int main(void)
{
    char c= getchar();

    if( islower(c) )
        printf("\nThe character is lower case.\n");

    else
    if( isupper(c) )
        printf("\nThe character is upper case.\n");

    else
    if( isdigit(c) )
        printf("\nThe character is a decimal digit.\n");

    return 0;
}

Actually they are taking an int, with valid value ranges either the value
EOF, or a value representable as unsigned char.

Right. But 'char c = getchar()' is not guaranteed to yield a value which
can be passed to them.

Since "char c" gets its value from getchar() (probably it would be more
elegant if c was an int, however I do not check for EOF in this code
snippet), there is no way to get undefined behaviour in the above code.

Yes, there is.

Also, as far as I know, getchar() returns only character values of the basic
character set (value ranges in [0, 127]), and EOF, and not other character
values of the extended character set.

Wrong.

getchar() returns arbitrary characters from the execution character set,
having first converted them to the range of unsigned char.

On a machine where plain char is signed and 8 bits, it is certainly possible
for getchar() to return 200. Converted to char, this yields a negative
value which is not a valid argument for isalpha(), etc.

With "signed char", I suppose you meant "plain char implemented as signed
integer type".

Yes.

As far as I know, C90/C95 does not allow a character to have a negative
representation.

You are incorrect. Plain char must have the same representation as either
signed char or unsigned char, but it is allowed to be either. It can have
a negative representation in plain char.

-s

Michael Tsang · Nov 18, 2009

Ioannis said:
I have created a text available under GNU FDL 3 (Free Documentation
License) or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions.html

Any constructive comments/error corrections are welcome.

Best regards,

10. Use of malloc without casting is considered a poor practice (it is an
error in C++). Cast it to suitable type instead.

Keith Thompson · Nov 18, 2009

Michael Tsang said:
10. Use of malloc without casting is considered a poor practice (it is an
error in C++). Cast it to suitable type instead.

No, casting the result of malloc is considered poor practice in C.
This is covered in the comp.lang.c FAQ, <http://www.c-faq.com>.

spinoza1111 · Nov 18, 2009

I have created a text available under GNU FDL 3 (Free Documentation License)
or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions...

Any constructive comments/error corrections are welcome.

Best regards,

--
Ioannis Vranos

C95 / C++03 Software Developer

http://www.cpp-software.net

I gave you five stars because:

(1) The list seems useful to me despite the fact that it may not be
100% correct: 100% correctness is not possible in C, since C is an
outdated language and a series of technical errors

(2) You did not attack Herb Schildt but wrote like a MAN and a
PROFESSIONAL about ideas and concepts, TEACHING other people to the
best of your ability

(3) You use English well and when you misspell you misspell in an
Attic (Greek) way that I like, as in invokation instead of invocation

Please stick around and don't mind the behavior of Seebach et al. They
think this ng is their personal fiefdom and they tend to transform
technical issues into campaigns of personal destruction owing to their
insecurity.

spinoza1111 · Nov 18, 2009

Misconception #3: There seems to be a misconception here,
or at least a confusing statement. The variable arguments to
a variadic function are subject to "the default argument
promotions," so saying that "implicit type conversions do not
take place" is at best misleading. The first printf() call is
not erroneous, but valid.

Misconception #5: It *is* possible to use printf() "without
including any header file," because printf() is a Standard
library function that can be declared free-hand, without use
of a Standard header (unlike fopen(), say). Of course, it's
a dumb idea to write such a declaration -- but maintaining
that it's impossible isn't right.

Misconception #11: Might be positioned near #1 for greater
effect and better continuity of exposition.

Misconception #13: Might be combined with #12 for brevity's
sake. "Brevity is the soul of wit," so if you can shrink
something by half that makes you a ...

General observation: There are quite a few misconceptions
about C that you have not touched upon, like "An int has 32
bits" and "Pointers are faster than array indices" and "The

Why is it not the case that pointers are faster than array indices?

Are they not-faster some of the time, or are they not-faster all of
the time?

In a normal language you need to convert the index at compile or at
run time to the pointer by multiplying by the size of array
elements and adding the base address.

If there are things about C that make good programming practice bad,
and destroy knowledge, then C is the problem, isn't it?

And today, an int has 32 bits because today, 64 bits are in common use
yet common parlance recognizes the fact that 64 bits provides far more
precision than will ever be needed by most applications, and it is
unlikely that we shall need more. Therefore it makes sense in common
CS parlance to refer to the 64 bit integer as long, which entails
referring to the 32 bit integer as int. It is to be a troll from the
dark ages to want to call a 16 bit integer int, a 32 bit integer long,
and a 64 bit integer long long, or at a minimum it is to want to live
in the past.

Many creepy little C programmers creep into night skewl and find the
professor using a CS language which generalizes over language in this
way and decide that the prof doesn't know what he's talking about, and
this seems to be the case here.

size of a struct is the sum of the sizes of its elements."

Would it were true. But many platforms pad. This is why in modern
languages it is MEANINGLESS to speak of the size of a struct. A struct
is a lightweight class in C sharp and it MAKES NO SENSE to talk about
its size, nor is it necessary, because in an array of structs (a
permitted construct in C Sharp) we do not want to know the address of
any member. It's "undefined".

Seebs · Nov 18, 2009

10. Use of malloc without casting is considered a poor practice

This is false.

(it is an error in C++).

This is true, but irrelevant.

Use of malloc with a cast regularly hides errors that should have been
caught by the compiler. Use of malloc without a cast is portable, reliable,
and correct.

C++ intentionally broke this behavior because, in C++, you should be
using operator new anyway.
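
For what it's worth, a minimal sketch of the cast-free form, assuming <stdlib.h> is included so that malloc is properly declared. The helper name make_zeroed is invented for the example:

```c
#include <stdlib.h>

/* Hypothetical helper: allocates n ints and sets them all to zero.
 * Returns NULL if the allocation fails. */
int *make_zeroed(size_t n)
{
    /* No cast: void * converts implicitly to int * in C, and
     * sizeof *p stays correct if the type of p ever changes. */
    int *p = malloc(n * sizeof *p);
    size_t i;

    if (p == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        p[i] = 0;
    return p;
}
```

If the #include were forgotten, a C90 compiler would assume malloc returns int, and the uncast assignment would draw a diagnostic; with a cast it might not.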

-s

Richard Tobin · Nov 18, 2009

spinoza1111 said:
Why is it not the case that pointers are faster than array indices?

Are they not-faster some of the time, or are they not-faster all of
the time?

In a normal language you need to convert the index at compile or at
run time to the pointer by multiplying by the size of array
elements and adding the base address.

It's certainly true that in a naively-compiled program, there may be a
cost to converting an array element expression to an address. This is
not always so: if the array elements have size 1 there is no
multiplication, and it's possible that the processor may have
addressing modes where the addition has no overhead.

It's also possible that better optimisation can be done in the case
of an array reference. Consider these alternatives:

int a[10], b[10], x;
int *pointer;
pointer = compute_pointer(a, b, ...args...);
a[1] = 25;
*pointer = 42;
x = a[1];

and

int a[10], b[10], x;
int index;
index = compute_index(a, b, ...args...);
a[1] = 25;
b[index] = 42;
x = a[1];

In the first case the compiler cannot tell whether a[1] will still be
25 when it is assigned to x. It might be 42 if compute_pointer
returned a pointer into a.

In the second case, the compiler can see that it's b that is modified,
and there's no possibility of a[1] having been changed. Using the
less powerful array indexing instead of a pointer also helps the
human reader see what is happening.

So pointers may be slower or faster than, or the same speed as, array
references.
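
The two cases can be sketched as complete functions. The function names are invented for illustration, and the index version assumes 0 <= index < 10. In the first, the result depends on where the pointer points, so the compiler has to reload a[1]. In the second, the store goes to a separate local array and cannot touch a[1]:

```c
/* Hypothetical function: the store through `pointer` may or may not
 * alias a[1], so the value returned depends on the caller. */
int read_after_pointer_store(int *a, int *pointer)
{
    a[1] = 25;
    *pointer = 42;
    return a[1];
}

/* Hypothetical function: b is a distinct local array, so the store to
 * b[index] cannot change a[1]. Assumes 0 <= index < 10. */
int read_after_index_store(int *a, int index)
{
    int b[10] = {0};

    a[1] = 25;
    b[index] = 42;
    return a[1];
}
```

Passing &a[1] as the pointer makes the first function return 42; passing a pointer into some other array makes it return 25.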

-- Richard

Richard Tobin · Nov 18, 2009

Seebs said:
Use of malloc with a cast regularly hides errors that should have been
caught by the compiler.

Much less of a problem now that compilers generally warn about
undeclared functions.

-- Richard

jacob navia · Nov 18, 2009

Richard Tobin wrote:

Much less of a problem now that compilers generally warn about
undeclared functions.

-- Richard

C++ NEEDS the cast... If you ever want to compile your code in C++ mode
it is better to leave the cast there.

bartc · Nov 18, 2009

Ioannis Vranos said:
I have created a text available under GNU FDL 3 (Free Documentation License)
or later, regarding "Common misconceptions about C" (C95).

http://www.cpp-software.net/documents/free_documents/c_misconceptions.html

Any constructive comments/error corrections are welcome.

Why is it double- (and sometimes triple- and quadruple-) spaced? That makes
it hard to follow.

The license section (about 60% of the bulk of the document) is a
distraction. Why not just make it a link?

Kenny McCormack · Nov 18, 2009

10. Use of malloc without casting is considered a poor practice (it is an
error in C++). Cast it to suitable type instead.

No, casting the result of malloc is considered poor practice in C.
This is covered in the comp.lang.c FAQ, <http://www.c-faq.com>.

You are both right, of course. The key question being: "considered" by
whom?

BTW, I just heard someone call out "Bingo!".

P.S. The Yankees are considered one of the best teams in baseball.
No. The Phillies are considered one of the best teams in baseball.

See the point?

Tom St Denis · Nov 18, 2009

No, casting the result of malloc is considered poor practice in C.
This is covered in the comp.lang.c FAQ, <http://www.c-faq.com>.

You are both right, of course. The key question being: "considered" by
whom?

By people who know the language? By default the function will be
assumed to return an int and the parameter type is not known, so you
could do something like

unsigned long long a = 16;
void *p = malloc(a);

And it wouldn't know to convert 'a' to size_t first or that the return
value is a pointer. Now imagine you're on a platform where
sizeof(int) == 2 and sizeof(void *) == 4, or where long long is wider
than size_t (entirely possible). Admittedly, that's all a bit contrived
but it's possible.

In short, it's bad form to call functions without a prototype.

Tom

Kenny McCormack · Nov 18, 2009

By people who know the language? By default the function will be
assumed to return an int and the parameter type is not known, so you
could do something like

(The usual CLC arguments - seen many, many times over the years - snipped)

I understand these standard CLC arguments - I also understand the other
side of it. The point is that the "not casting the return value of
malloc()" thing is clearly a shibboleth. That is, a thing by which the
in-crowd recognizes each other, and thus a thing you need to profess to
believe in order to be accepted in the in-crowd.
