some well known stupidness in c99

Eric Sosman · Nov 22, 2012

[...]
Also, if you implement a binary search (tree of if/else) instead
of linear search (if/elseif/elseif... /else) it is a lot of work to
get right, the nesting levels look horrible, and it is pretty much
impossible to extend. (Not that you could do that with the first
match logic, though.)

Getting "first match" with a slightly elaborated binary
search isn't hard. But wasn't this whole construct proposed
for the sake of "case" values not known until run-time? How
will you perform a binary search among those cases? Sort
them every time you execute the "switch" (because they may
have changed since the previous execution)?

Rui Maciel · Nov 22, 2012

glen said:
Doesn't that just move the problem around?

The problem, in this example, is exactly the same, as is the way it is
solved: writing a string matching function whose return value is passed to a
switch statement. The only difference is that the solution I suggested
doesn't require a fancy switch() statement which is, at best, somewhere
between useless and redundant.

So, although the problem might not have been moved around, the solution was:
back to the realm of plain standard C, the kind that already exists for some
decades now. This was in itself a major problem.

Rui Maciel

Rui Maciel · Nov 22, 2012

Ian said:
Less to type!

If, in my matchStrings(), the string matching code was replaced with a call
to strcmp() or strncmp(), it would consist of a single for() loop iterating
a call to strncmp(), which would require about half a dozen lines of code
(LoC).

I don't believe it would be a good idea to bloat the C standard just to be
able to save at best half a dozen LoC on a corner case which no one actually
needs.
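A minimal sketch of the approach described above, assuming a lookup table of strings and a plain strcmp() loop. The names match_string, command_names and describe are hypothetical and do not come from the matchStrings() posted earlier in the thread:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of the first entry in table[]
 * that equals key, or -1 if no entry matches. */
static int match_string(const char *key, const char *const table[], size_t count)
{
    assert(key != NULL && table != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (strcmp(key, table[i]) == 0)
            return (int)i;
    }
    return -1;
}

/* Illustrative table; the enumerators double as case labels. */
enum command { CMD_START, CMD_STOP, CMD_STATUS, CMD_COUNT };
static const char *const command_names[CMD_COUNT] = { "start", "stop", "status" };

/* The returned index is passed to an ordinary switch statement. */
static const char *describe(const char *word)
{
    switch (match_string(word, command_names, CMD_COUNT)) {
    case CMD_START:  return "starting";
    case CMD_STOP:   return "stopping";
    case CMD_STATUS: return "reporting status";
    default:         return "unknown command";
    }
}
```

With this table, describe("status") selects the CMD_STATUS case and any unlisted word falls through to default. The table may also be filled in at run time, since only the indices need to be constants.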

Rui Maciel

Rui Maciel · Nov 22, 2012

Ben said:
Perl has a wonderful construct that looks like the proposed case
statement but can't quite be described in these terms. An example:

given ($x) {
$y = 'a' when $a;
$y = 'b' when $b;
$y = 'c' when $c;
}

The reason it's not exactly as described is that the "given" part and
the "when" parts are not connected syntactically. You can, for example,
have a set of "when" clauses in a "while" or "for" loop, and you don't
have to have any inside a "given" statement.

It's an extraordinarily powerful construct, mainly due to the magic
that a "when" clause can do.

The description of that feature in Perl isn't very reassuring.

Quoted from:
http://perldoc.perl.org/perlsyn.html#Switch-Statements

<quote>
Exactly what the EXPR argument to when does is hard to describe precisely,
but in general, it tries to guess what you want done. Sometimes it is
interpreted as $_ ~~ EXPR, and sometimes it does not. It also behaves
differently when lexically enclosed by a given block than it does when
dynamically enclosed by a foreach loop. The rules are far too difficult to
understand to be described here.
</quote>

I don't doubt it can be wonderful or useful to some people, but I really
doubt that it is a good idea to adopt a feature whose rules are described as
"far too difficult to understand" in the language's documentation.

Maybe that kind of stuff is valued in the Perl community, but gratuitously
adding features which are "far too difficult to understand", particularly
when they don't provide anything which isn't already offered by features
that exist for decades, only creates problems and needlessly complicates
things.

Rui Maciel

Rui Maciel · Nov 22, 2012

BartC said:
Not in syntax. That's one of the easiest parts of a compiler.

In C however everyone is on the one hand encouraged to use macros to
overcome shortcomings in the syntax ('there really is no need for such a
feature because here's a simple macro to do it');

I don't believe this is true. I don't remember ever seeing anyone encourage
the use of macros: quite the opposite, actually. They are seen as a last
resort kind of thing.

What is indeed encouraged is the practice of writing your own routines to
solve your own problems. This isn't a bad thing to do, because it is
essentially all that a programmer does.

and on the other hand,
discouraged to use tacky macros, implemented in a different way by each
programmer, because it makes the code unreadable to others! Because
effectively each is creating their own dialect.

That's essentially what every API or every helper routine is: a dialect.
Whenever someone develops their own library or adopts one developed by a
third-party, that person is defining their own dialect.

For example, I hate the C for-statement, since there's a lot of typing,
and things to mistype or get wrong, for a simple A to B iteration, or
N-times repeat.

So in the quite a few thousand lines of C code I've written over the last
few weeks, I've extensively used macros such as FOR(I,A,B) or TO(I,N)
(which just repeats some code N times; I don't need the index, but the
macro is simpler with it).

What's the difference between for(int I = A; i < B; i++) and FOR(I, A, B) ?

But obviously the need is there, so why not have it part of the language?

Because the suggestions that have been presented are already a part of the
language. A concept might not be expressed with a particular syntactic
sugar, but that doesn't mean the language doesn't support it, and if we have
a choice between a concise language and an inflated one to express the exact
same concepts, conciseness always wins.

This doesn't mean that all suggestions are bad, or that the C programming
language can't be improved. Yet, this doesn't mean that every single
suggestion should be considered, let alone accepted, just because someone,
once in their life, stumbled on an obscure corner case that could be expressed
differently if a very specific and equally obscure statement was supported
by a programming language.

Rui Maciel

Ben Bacarisse · Nov 22, 2012

Rui Maciel said:
The description of that feature in Perl isn't very reassuring.

Quoted from:
http://perldoc.perl.org/perlsyn.html#Switch-Statements

<quote>
Exactly what the EXPR argument to when does is hard to describe precisely,
but in general, it tries to guess what you want done. Sometimes it is
interpreted as $_ ~~ EXPR, and sometimes it does not. It also behaves
differently when lexically enclosed by a given block than it does when
dynamically enclosed by a foreach loop. The rules are far too difficult to
understand to be described here.
</quote>

I don't doubt it can be wonderful or useful to some people, but I really
doubt that it is a good idea to adopt a feature whose rules are described as
"far too difficult to understand" in the language's documentation.

You mean in C? I was not proposing it for C! Perl programmers tend to
be rather relaxed people, and they'll just avoid this construct in
production code until the exact definition is pinned down. But in Perl,
the advantage of adopting a feature with evolving rules is that language
developers will get real feedback about real use cases. When the
semantics are finally cast in stone, the feature will be all the better
for it.

I've used it a few times just to get a job done fast, and it's very
expressive. I doubt any of these scripts will actually break down the
line because the "natural" uses are the ones that are going to be
preserved. It's the corner cases that are in flux.

Maybe that kind of stuff is valued in the Perl community, but gratuitously
adding features which are "far too difficult to understand", particularly
when they don't provide anything which isn't already offered by features
that exist for decades, only creates problems and needlessly complicates
things.

You've subtly changed the meaning there! "far too difficult to
understand" is not the same as "far too difficult to understand to be
described here". There is a much fuller explanation elsewhere.

Perl did not have a switch statement, and I think it would have been
considered "underpowered" to add one that just compared expressions for
equality.

Keith Thompson · Nov 22, 2012

BartC said:
I use another version of switch when:

o I don't care whether a jumptable is created
o Integer case values are too big or disparate to form a jump table
o Non-integer case values are used
o Non-constant case values are being tested

That last one is the proposal here, but I have to say there is very rarely a
need for that (in my own code); the other reasons are more likely ones to
use this alternate (in my case) form of switch. But why not allow it anyway?
Just to have the 'expressive power' when you need it.

It already is allowed; any compiler can add this as an extension, as
long as it doesn't break conformance.

As for adding it to the standard, "why not" is the wrong question.
There are a plethora of features that would be "nice to have".
Adding all of them to the standard would make for a huge language.
There's a strong burden of proof on anyone proposing a new feature
to convince the committee that it's worth the substantial cost.

BartC · Nov 22, 2012

Rui Maciel said:
BartC wrote:

I don't believe this is true. I don't remember ever seeing anyone encourage
the use of macros: quite the opposite, actually. They are seen as a last
resort kind of thing.

What is indeed encouraged is the practice of writing your own routines to
solve your own problems. This isn't a bad thing to do, because it is
essentially all that a programmer does.

Writing routines is OK; that's just programming. And it doesn't change the
syntax.

Using macros however can effectively change the language.

That's essentially what every API or every helper routine is: a dialect.
Whenever someone develops their own library or adopts one developed by a
third-party, that person is defining their own dialect.

But it's a well-understood way of extending a language. A function call
looks like a function call. A macro invocation can involve anything. You
look at the body of a function, and you still see normal C code. You look at
a macro definition and it's quite often gobbledygook.

What's the difference between for(int I = A; i < B; i++) and FOR(I, A, B)?

My FOR(I,A,B) macro iterates between A and B inclusively, and does so
reliably, without my having to write the loop index 3 times (you've mixed up
I with i), or remember to use <= instead of < (you've used <), or just
bother to write all those parts of a loop which are the compiler's job,
not mine:

#define FOR(i,a,b) for (i=a; i<=b; ++i)
#define TO(i,x) for (i=x; i; --i)

With FOR, I just give it the 3 elements that are all that are really needed
to define the loop, and can concentrate on the loop body!
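A short sketch of how the two macros above behave. The argument parentheses and the helper functions sum_range and count_repeats are additions for illustration, not part of the original post:

```c
#include <assert.h>

/* The posted macros, with each argument parenthesized as a precaution. */
#define FOR(i, a, b) for ((i) = (a); (i) <= (b); ++(i))
#define TO(i, x)     for ((i) = (x); (i); --(i))

/* Sums the integers a..b inclusive using FOR. */
static int sum_range(int a, int b)
{
    int i, total = 0;
    FOR(i, a, b)
        total += i;
    return total;
}

/* Counts how many times the body of TO runs for a given n. */
static int count_repeats(int n)
{
    int i, runs = 0;
    TO(i, n)
        ++runs;
    return runs;
}
```

Under these definitions sum_range(1, 10) is 55, because both ends are included, and count_repeats(3) is 3. Two caveats apply to this sketch: FOR re-evaluates its upper bound on every iteration, and TO only terminates cleanly when its count is non-negative.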

Because the suggestions that have been presented are already a part of the
language. A concept might not be expressed with a particular syntactic
sugar, but that doesn't mean the language doesn't support it, and if we have
a choice between a concise language and an inflated one to express the exact
same concepts, conciseness always wins.

Syntax, especially basic constructs that practically every other language
has had for decades, will hardly inflate the language. What does inflate it
are the 1000 or so functions in the run-time library. And endless blocks of
macros like this:

#define INT_MAX +32767
#define INT_MIN -32767
#define LONG_MAX +2147483647
#define LONG_MIN -2147483647
#define LLONG_MAX +9223372036854775807
#define LLONG_MIN -9223372036854775807
#define MB_LEN_MAX 1
#define SCHAR_MAX +127
#define SCHAR_MIN -127
#define SHRT_MAX +32767
#define SHRT_MIN -32767
#define UCHAR_MAX 255
#define USHRT_MAX 65535
#define UINT_MAX 65535
#define ULONG_MAX 4294967295
#define ULLONG_MAX 18446744073709551615

when all that is really needed is a single syntax feature that can be
applied to any type in the same way as sizeof().

BartC · Nov 22, 2012

Keith Thompson said:
It already is allowed; any compiler can add this as an extension, as
long as it doesn't break conformance.

As for adding it to the standard, "why not" is the wrong question.
There are a plethora of features that would be "nice to have".
Adding all of them to the standard would make for a huge language.

I don't agree. If you have restrictions, they have to be documented and
enforced. Sometimes it's easier to just allow something, than try and
explain exactly why it can't be done!

And if the need really is there, then the workaround needed will increase
the size of that specific program anyway.

Keith Thompson · Nov 22, 2012

BartC said:
I don't agree. If you have restrictions, they have to be documented and
enforced.

The current restrictions on the switch statement are clearly documented
and enforced.

Sometimes it's easier to just allow something, than try and
explain exactly why it can't be done!

But not in this case. You're talking about a new flow control
construct, one whose implementation is *very* different from the current
switch statement. Are you volunteering to provide the implementation
for every C compiler on the market? (Good luck with the closed source
ones.)

And if the need really is there, then the workaround needed will increase
the size of that specific program anyway.

As opposed to increasing the size of the language specification and of
every C compiler.

Stanley Rice · Nov 28, 2012

It is sad to say so, but there is some real and terrible stupidness in
c99. When I write programs in c (and I am an involved c user) I just use
const int for array boundaries, but in c99 (and some previous versions
too, terribly) it will not compile (see for example
http://stackoverflow.com/questions/1712592/variably-modified-array-at-file-scope ).
It is a terribly wrong thing for people who just want to use const int for
this purpose and not use the preprocessor, as I do. It is terrible
stupidness in my opinion; it has to be changed.

In C, the *const* qualifier means *read only*. It only guarantees that the qualified variable cannot be modified through that name; it does not turn the variable into a constant expression, which is what a file-scope array bound requires.
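A minimal sketch of the distinction, assuming the goal is a file-scope array with a named bound and no #define for the bound itself. The names TABLE_MAX, table and table_length are made up for the example:

```c
#include <assert.h>
#include <stddef.h>

/* Rejected at file scope in C:
 *
 *     const int table_max = 100;
 *     int table[table_max];
 *
 * table_max is a read-only object, not a constant expression. */

/* An enumeration constant is an integer constant expression,
 * so it can be used as an array bound at file scope. */
enum { TABLE_MAX = 100 };
static int table[TABLE_MAX];

/* Number of elements in table, computed from its type. */
static size_t table_length(void)
{
    return sizeof table / sizeof table[0];
}
```

The enum form keeps the bound visible to the compiler and debugger as a named constant, which is usually what people reaching for const int wanted in the first place.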

David Thompson · Nov 28, 2012

Doesn't that just move the problem around?

It seems that String case might be added to Java in a future
version, but you could just case on the hashCode() of a String.

(Suncle) JavaSE 7, out a year now, has switch on String. (They used to
number things separately, so to be pedantically correct we had to say
things like "the JDK 6 implementation of JLS 3". For 7 they numbered
JLS and JVMS changes to match the JDK.)

If the String hash function is fixed, you could just calculate
the hashCodes.

switch(str.hashCode) {
case "abc".hashCode():

except that, as far as I know, hashCode isn't evaluated at compile
time, so it is back to the run-time evaluation problem.

One could precompute the hashCode values, check that there
aren't any collisions, and put those in.

It actually uses a combination of your ideas here: it first compares
the hashCode of the value to those of the case labels, which must be
compile-time constant expressions, and then compares the full string
values to protect against collisions and other false hash matches, and
maps to a small integer value (or -1); it then dispatches on that
integer value to the statements, in the same way as for a (since
forever IINM) switch-on-integer.

Note in Java Object.hashCode can be overridden for user-defined
classes as long as you maintain the constraint that two objects with
the same semantic value give the same hashcode -- but not necessarily
vice versa. But for the language-defined and library-defined classes
hashCode is fixed, so the compiler knows what it is for String.
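The two-step scheme described above (switch on a hash, then confirm with a full comparison) can be sketched in C. This is only an illustration, not what a Java compiler actually emits: hash31 and label_index are hypothetical names, and the hash is the documented String.hashCode polynomial (h = 31*h + c) applied to bytes in unsigned 32-bit arithmetic:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Polynomial hash h = 31*h + c over the bytes of s. */
static uint32_t hash31(const char *s)
{
    uint32_t h = 0;
    assert(s != NULL);
    while (*s != '\0')
        h = 31u * h + (unsigned char)*s++;
    return h;
}

/* Hashes of the labels, precomputed by hand so they are constants. */
#define HASH_ABC 96354u   /* hash31("abc") */
#define HASH_ABD 96355u   /* hash31("abd") */

/* Step 1: switch on the hash.  Step 2: compare the full string to reject
 * collisions.  Returns a small label index, or -1 if nothing matches. */
static int label_index(const char *s)
{
    switch (hash31(s)) {
    case HASH_ABC: return strcmp(s, "abc") == 0 ? 0 : -1;
    case HASH_ABD: return strcmp(s, "abd") == 0 ? 1 : -1;
    default:       return -1;
    }
}
```

The string "bCc" has the same hash as "abc" under this function, so label_index("bCc") reaches the HASH_ABC case and is then rejected by strcmp(). The resulting index can be fed to a second, ordinary switch on integers.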

glen herrmannsfeldt · Nov 28, 2012

(snip, I wrote)

The one I read must have been out of date.

(Suncle) JavaSE 7, out a year now, has switch on String. (They used to
number things separately, so to be pedantically correct we had to say
things like "the JDK 6 implementation of JLS 3". For 7 they numbered
JLS and JVMS changes to match the JDK.)

It actually uses a combination of your ideas here: it first compares
the hashCode of the value to those of the case labels, which must be
compile-time constant expressions, and then compares the full string
values to protect against collisions and other false hash matches, and
maps to a small integer value (or -1); it then dispatches on that
integer value to the statements, in the same way as for a (since
forever IINM) switch-on-integer.

Earlier in the thread I noted that "abcd"[2] isn't a constant
expression in C. I found that out when I tried to use a string
as an argument to a preprocessor macro to generate case labels.

#define hash(x) (2*x[0]+3*x[1]+5*x[2]+7*x[3])

(Though I had a better hash function.)

But it doesn't work for case labels.

Note in Java Object.hashCode can be overridden for user-defined
classes as long as you maintain the constraint that two objects with
the same semantic value give the same hashcode -- but not necessarily
vice versa. But for the language-defined and library-defined classes
hashCode is fixed, so the compiler knows what it is for String.

It seems that the hash function for String is actually in the Sun
(oops, Oracle) documentation for the class.

-- glen

Shao Miller · Dec 15, 2012

having a const int not generate an external global read only object
would be a big mistake in my opinion. One solution would be to use

static const int table_max = 100;

being static, the name only has scope for that translation unit, and the
compiler is allowed to omit the actual generation of the object by the
"as if" rule if it is not needed.

The other thing to note, is I believe that currently, multiple
definitions of a global in different translation units is not a
constraint violation, but undefined behavior, so the compiler would be
allowed (but not required) to make it work as expected, letting the
program compile, link, and run, and even not take up memory for the
value if not needed. Typically this would be done by making a "weak
definition" for the object, and letting the linker discard it if not
needed, or allow multiple definitions as needed.

It might be possible for a future standard to REQUIRE the compiler to
make this case work, but it might be hard to find a specification that
actually allowed the desired multiple definitions without also forcing
the compiler to accept as valid multiple definitions that are really
programming errors.

Another option might be to allow a construct like

extern const int table_max = 100;

to declare the value as a constant expression without allocating memory
for it, and require exactly one translation unit to have a definition like:

const int table_max;

to generate the memory location for any uses that might need it.

It might be interesting if, at file scope,

register const int table_max = 100;

was allowed and meant something, but it isn't and it doesn't. It might
mean something like "non-addressable, external linkage, constant, shall
be defined in only one translation unit, all other declarations must use
'extern' and omit any initializer," or some such.

However, 'int' doesn't seem like too interesting a case, since 'enum'
can accomplish the same thing. I guess the idea is so that another
source file needn't #include a header with the value? That an
already-translated (or partially-translated) blob would be all that
needs to be "carried around"? Well then when do things get resolved?
As I saw BartC already mentioned, there could be some cyclic definition
of the value.

- Shao Miller

Shao Miller · Dec 15, 2012

On 11/19/2012 06:35 AM, BartC wrote: ...

#define lengthof(x) (sizeof(x)/sizeof(x[0]))

In C it's all done with macros.

...
I'm reasonably sure he's interested in an operator that could be applied
to a pointer (not an array) to give the length of the array it points
at.

Correction: if pointer did point at an array:

int (*parray)[15];

lengthof(*parray) would give 15, the desired value. That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

lengthof("Hello world!") returns the expected value, as the string
literal here is an array, and the operator [on the LHS of '/'] is 'sizeof'.

(But James has blocked my posts, so he'll never know.)

I might've said:

char * hello = "Hello world!";
/*
* Nope. Always going to be:
* sz == sizeof (char *)
* No matter how the string literal is changed.
*/
size_t sz = lengthof(hello);

- Shao Miller

Eric Sosman · Dec 16, 2012

James said:
James said:

On 11/19/2012 06:35 AM, BartC wrote: ...
#define lengthof(x) (sizeof(x)/sizeof(x[0]))

That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

I was just reading a post by Shao Miller,
with words to the effect that lengthof("Hello world!") works just fine.

I would have used a different name for the macro,
so as not to confuse the meaning of the macro
with the meaning of how the standard defines the length of a string.

What did you mean by "doesn't handle in the desired fashion" ?

This is Question 6.21 on the comp.lang.c Frequently Asked
Questions (FAQ) page, <http://www.c-faq.com/>.

James Kuyper · Dec 16, 2012

James said:
James said:

On 11/19/2012 06:35 AM, BartC wrote: ...
#define lengthof(x) (sizeof(x)/sizeof(x[0]))

That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

I was just reading a post by Shao Miller,

I didn't see it; I've got his messages filtered out. While he
occasionally raises legitimate points (this apparently being one of
them), all too often he doesn't.

with words to the effect that lengthof("Hello world!") works just fine.

I would have used a different name for the macro,
so as not to confuse the meaning of the macro
with the meaning of how the standard defines the length of a string.

What did you mean by "doesn't handle in the desired fashion" ?

Sorry - I gave a bad example. Given

char array[] = "Hello world!";
char *p = array;

then lengthof(p) gives (sizeof(char*)/sizeof(char)),
while what is actually desired is (sizeof(array)/sizeof(char)).
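The cases discussed in this sub-thread can be put side by side in a small sketch. The wrapper functions are hypothetical and exist only to make each result easy to inspect; the macro is the one quoted above, with the argument parenthesized in the divisor:

```c
#include <assert.h>
#include <stddef.h>

#define lengthof(x) (sizeof(x) / sizeof((x)[0]))

static char array[] = "Hello world!";  /* 12 characters plus the '\0' */
static char *p = array;                /* points at the first element */
static int (*parray)[15];              /* pointer to a whole array */

/* An array operand: the element count, 13. */
static size_t len_array(void)   { return lengthof(array); }

/* A string literal is an array too: also 13. */
static size_t len_literal(void) { return lengthof("Hello world!"); }

/* Dereferenced pointer-to-array: 15.  sizeof does not evaluate
 * its operand here, so parray need not point anywhere. */
static size_t len_parray(void)  { return lengthof(*parray); }

/* Pointer to first element: sizeof(char *) / 1, unrelated to the array. */
static size_t len_pointer(void) { return lengthof(p); }
```

Only the last case is the problematic one: the pointer carries no information about how many elements follow it, so no macro built on sizeof can recover the array length from p.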

Phil Carmody · Dec 17, 2012

Shao Miller said:
On 11/19/2012 06:35 AM, BartC wrote: ...
#define lengthof(x) (sizeof(x)/sizeof(x[0]))

In C it's all done with macros. ...
I'm reasonably sure he's interested in an operator that could be applied
to a pointer (not an array) to give the length of the array it points
at.

Correction: if pointer did point at an array:

int (*parray)[15];

lengthof(*parray) would give 15, the desired value. That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

lengthof("Hello world!") returns

Only functions "return". Operators "yield", or "evaluate to", or ... .

Phil

Shao Miller · Dec 17, 2012

Shao Miller said:
Shao Miller said:

On 11/19/2012 08:21 AM, James Kuyper wrote:
On 11/19/2012 06:35 AM, BartC wrote:
...
#define lengthof(x) (sizeof(x)/sizeof(x[0]))

In C it's all done with macros.
...
I'm reasonably sure he's interested in an operator that could be applied
to a pointer (not an array) to give the length of the array it points
at.

Correction: if pointer did point at an array:

int (*parray)[15];

lengthof(*parray) would give 15, the desired value. That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

lengthof("Hello world!") returns

Only functions "return". Operators "yield", or "evaluate to", or ... .

Quite right; my mistake. I was thinking along the lines of the numerous
"macro returns" usage in the C Standard, but this is clearly no library
macro. "Yields" definitely makes more sense.

- Shao Miller

Phil Carmody · Dec 17, 2012

Shao Miller said:
Shao Miller said:

On 11/19/2012 09:06, James Kuyper wrote:
On 11/19/2012 08:21 AM, James Kuyper wrote:
On 11/19/2012 06:35 AM, BartC wrote:
...
#define lengthof(x) (sizeof(x)/sizeof(x[0]))

In C it's all done with macros.
...
I'm reasonably sure he's interested in an operator that could be applied
to a pointer (not an array) to give the length of the array it points
at.

Correction: if pointer did point at an array:

int (*parray)[15];

lengthof(*parray) would give 15, the desired value. That's not the
problematic case. It's pointers that point at the first element of an
array (such as "Hello world!") that lengthof() doesn't handle in the
desired fashion.

lengthof("Hello world!") returns

Only functions "return". Operators "yield", or "evaluate to", or ... .

Quite right; my mistake. I was thinking along the lines of the
numerous "macro returns" usage in the C Standard, but this is clearly
no library macro. "Yields" definitely makes more sense.

Many of those other 'returns' were single-handedly purged a while back,
post '99, I think.

Phil
