Has thought been given to a cleaned up C? Possibly called C+.

  • Thread starter Casey Hawthorne

lawrence.jones

In comp.lang.c.moderated Richard Delorme said:
Forward function declaration should be added to my list of things to remove.

Then how do you propose to support mutually recursive functions?
 
Keith Thompson

Rod Pemberton said:
Keith Thompson said:
Rod Pemberton said:
[...]
If I were
redesigning the switch statement from scratch, you'd be able to
specify multiple values in a single case, the "break" keyword
would not be required, and there would probably be special syntax
to specify falling through to the next case.


That works. It's not my first choice though. It just transposes the
locations where one must add additional control flow. E.g., "break;" is
removed, while, say, "fallthru;" is added. It's much like rewriting a
loop with "break's" to use "continue's" instead.

Except that "fallthrough;" (yes, I'd insist on spelling it correctly)
would be much rarer than "break;" is in current C code. Make the
normal case the default, and require a little extra work for the
exception. Of course it's too late to change this in C, unless we
leave the switch statement alone and add a new form of selection
statement (something I'm not advocating).
[...]
It also imposes an arbitrary ordering on the cases and restricts
which cases can fall through to which other cases.

What about "recase" or "reswitch"? E.g., perhaps like so:

case 0x10: recase (0x30);
case 0x20: /* stuff */
break;
case 0x30: /* stuff */
break;

The advantage is you don't need a goto label. The ordering can be as one
wishes. And, each case could then be auto-break. Hmm, that's not too bad,
IMO.

So "recase" is just like "goto", except that it jumps to a specified
case label rather than to a goto label.

Ok, that might not be an entirely bad idea. Let's explore it a bit.

I don't think the parentheses are necessary: "recase 0x30;" should
suffice, though of course the expression can be parenthesized if you
like.

Presumably "recase default;" would be permitted. Would a "recase"
statement that targets a nonexistent case label be a constraint
violation, or would it jump to the "default:" label if it exists,
or terminate the switch statement if it doesn't?

What does it do in the presence of nested switch statements?
I'd guess that it would apply only to the innermost one, but
it would have to be specified.

It still creates just as much opportunity for abuse as the goto
statement itself. If you want to write BASIC in C:

int line = 10;
switch (line) {
case 10:
puts("Infinite loop");
recase 20;
case 20:
recase 10;
}

The "recase" statement could be defined consistently, but I don't
believe the benefits would outweigh the cost.
Well, you'd accept that they are C-subset compilers... since they don't
have those keywords. But, they're still C compilers.

Ron Cain's SmallC and variants
(1980, PC, no auto, static, extern, register, double, or float)

Lutz Hamel's WCC
(1987, VMS, no auto, static, extern, register, or double, has float)

I would dispute your claim that these are "C compilers", any
more than, say, lcc-win is a C++ compiler. (I'm not picking on
lcc-win, it's just an example of a C compiler that doesn't attempt
to compile C++.) This is not, of course, to say that they're not
useful, they're just not C compilers.

We could debate the point, but then we'd be arguing about the meaning
of the phrase "C compiler" rather than about anything concrete.
Sigh.. Ok, you never post code.

That's neither relevant nor true.
You could've done so to demonstrate
whatever you're getting at.

I claimed that a certain task is not possible. What code could I
possibly post to support that claim?
But, you always want others to do so, so that
you can pick theirs apart. Fine, here's one of many possible "Hello
World!" programs in C:


#include <stdio.h>
#include <stdlib.h>

#define HW "Hello World!\n"

int main(void)
{
FILE *out;
char ch;

out=tmpfile(); /* ANSI wb+ */
fprintf(out,HW);
rewind(out);
while(1)
{
ch=getc(out);
if(feof(out))
break;
putchar(ch);
} /* because I could... */

exit(EXIT_SUCCESS);
}

So you print "Hello World!\n" to a binary file, then you read it back
and print each character *to a text stream*. The whole rigmarole
of writing the string to a binary stream and reading it back again
accomplished, at best, exactly nothing (assuming that the binary file
isn't padded with additional null characters, which is permitted).

Strictly speaking, you've written a "hello world" program that
uses binary mode -- but it doesn't use binary mode to produce the
"hello world" output.
The code works for MS-DOS with multiple compilers (ASCII CRLF as newline).
It works for 64-bit Linux with GCC (ASCII LF as newline). I can't test
MAC's (non-x86 Mac's had ASCII CR as newline), nor EBCDIC (EBCDIC NL as [snip]
PDP-11, or whatever...

Of course it works. If you print '\n' *to a text stream*, you'll
get a valid line ending, however it's represented. That's what
text mode is for.

Do you have something that actually supports your claim? Can you
write a portable "hello world" program that doesn't use text mode?

[...]
C'mon? Who cares?

Almost everyone except you, apparently.
Really! C is only *so* portable anyway (30% IMO).
Personally, I only have need of C programs working for ASCII, x86, DOS,
Linux.

(I.e., why do I or anyone else for that matter care if it's portable if it
works on an x86 PC? It's the _dominant_ computing platform. I don't have
access to IBM systems or mainframes. I no longer have access to
miniframes, or VMS. So, I can't test the code on them, nor can I fix found
issues. Most other people don't have access either. So, if the compiler
warnings don't catch it, then it's "portable"...)

So portability is unimportant to you. That's fine. But you're
advocating removing features from the C language that exist for
the purpose of enabling portable code. Fortunately for the vast
majority of us, that's just not going to happen.
 
Seebs

This is the one side of the coin. In practice, all [str|mem]cpy and
friends can be used securely.

Pretty much.
This is the other side of the usability vs security tradeoff. Since
strlcpy() imposes an additional precondition (on size) which strncpy()
does not, and has no way to verify it, it doesn't offer anything in
security terms. It may be easier to use (turning that two-line
"strncpy and NUL terminate" into one) but it is meaningless from a
security point of view.

It does, however, also have the desirable semantics of not always
filling all N bytes even if there's no need to.

In general, strncpy() should never be used unless you're working with
early UNIX inodes. strlcpy() works nicely, and I wish it were in the spec.
As a fallback, I use snprintf.

-s
 
Andrew Poelstra

Not that I am advocating the change, but Java has been dealing with such
things since the beginning. I fail to see why it would be so difficult
for a "C" compiler to do the same.

Can you compile java files separately and then distribute the
object files, linking them together all willy-nilly?

(I genuinely don't know. I've never looked into what magic
the Java bytecode is capable of.)
 
Richard Delorme

On 16/03/2010 16:47, Nick Keighley wrote:
how?!

How would mutual recursion work? Or separate compilation? What about
those weird people who put main() at the beginning of their
compilation units?

Well it already works in other languages like java, caml, etc. And it
almost already works in C:

Here is an example:
/* File square.c */
/*----8<-----8<--------*/
int square(int x)
{
return x * x;
}
/*----8<-----8<--------*/


/* File main.c */
/*----8<-----8<--------*/
int main(int argc, char **argv)
{
int x;

if (argc == 2) {
x = atoi(argv[1]);
printf("%d^2 = %d\n", x, square(x));
} else {
printf("usage: square <number>\n");
}
return 0;
}
/*----8<-----8<--------*/

$ gcc -c main.c -o main.o
main.c: In function ‘main’:
main.c:7: warning: incompatible implicit declaration of built-in
function ‘printf’
main.c:9: warning: incompatible implicit declaration of built-in
function ‘printf’
$ gcc main.o square.o -o square
$ square 4
4^2 = 16

This code contains no forward function declarations, has two compilation
units, and compiles to a working executable with an existing C
compiler. The diagnostic it prints is also interesting, as it shows
the compiler knows the right declaration of standard functions
without the need for #include and a forward declaration.
Of course, I cheat a little by using int as the return type (if you
replace every 'int' with 'double' and 'atoi' with 'atof', the above code will
still compile but won't work as expected). When it sees an undeclared
function, gcc treats it as a function returning int and deduces its
arguments from its usage (except for variadic functions, as shown by the
diagnostic message). So it works mostly by accident. But I don't think
much work is necessary to make it work in other cases (with functions
not returning int).
 
Eric Sosman

[...]
without forward declarations how is the compiler to deal with that?

Not that I am advocating the change, but Java has been dealing with such
things since the beginning. I fail to see why it would be so difficult
for a "C" compiler to do the same.

Hysterical raisins, more or less. In C as it stands, when
the compiler is part way through a translation unit it already
has all the information it requires to decide what to do with
the next token. Roughly speaking, C can be compiled by a "one-
pass" compiler; Java requires multiple "passes."
Can you compile java files separately and then distribute the
object files, linking them together all willy-nilly?

Answering what you probably meant: Yes.

Answering what you actually asked: No, because there's no
"linking" step in Java. You go straight from lumps of compiled
byte code (each lump representing a class or interface) to
classes (interfaces) loaded into the Java Virtual Machine,
loaded at the moment they're needed. Structurally speaking
there's no such thing as a Java "program" in the sense C uses.
You tell the JVM "run class Foo," and the JVM loads Foo's byte
code and starts executing it. At some point, Foo's byte code
makes reference to a class Bar, at which point the JVM rummages
around until it finds and loads Bar's byte code, and so on.
It's all "on-demand," no before-the-fact joining of pieces.
 
Richard Delorme

On 16/03/2010 16:52, Richard Bos wrote:
Yes. Now try this if all you have are a declaration and a precompiled
library.

Right now it won't work because in absence of a return type, a C
compiler uses the type int as a return type. The above function is in
fact correct and
three() { return 3; }
is the same as
int three() { return 3; }

Now suppose a new standard or a new language that changes this behaviour
so that, in the absence of a type, the compiler uses the type void. What
is the problem ? To me the unnatural behaviour is to return an int when
nothing is written, instead of returning void.
 
Willem

Jasen Betts wrote:
)
)> Forward function declaration should be added to my list of things to remove.
)
) you're kidding right?
)
) what are you intending to use instead?
)
) // bar.c
)
) void bar(void)
) {
) foo( 3);
) }
)
) // foo.c
)
) void foo( double x)
) {
) printf ("%lf",x);
) }
)
) without forward declarations how is the compiler to deal with that?

The completely obvious and rather simple way: It looks ahead.

Duh.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Richard Delorme

On 16/03/2010 17:28, (e-mail address removed) wrote:
Then how do you propose to support mutually recursive functions?

like this :
/* file recurse.c */
/*------8<--------------*/
int f(int a)
{
if (a <= 1) return 1;
else return g(a + 2);
}

int g(int b)
{
if (b >= 5) return 5;
return f(b - 1);
}

int main()
{
int i;

for (i = 0; i < 5; ++i) {
printf("f(%d) = %d\n", i, f(i));
printf("g(%d) = %d\n", i, g(i));
}

return 0;
}
/*------8<--------------*/

$ gcc recurse.c -o recurse
recurse.c: In function ‘main’:
recurse.c:18: warning: incompatible implicit declaration of built-in
function ‘printf’
$ recurse
f(0) = 1
g(0) = 1
f(1) = 1
g(1) = 1
f(2) = 5
g(2) = 1
f(3) = 5
g(3) = 5
f(4) = 5
g(4) = 5

So it is obviously not a problem to implement this as existing compilers
can deal with it.
 
Keith Thompson

Richard Delorme said:
On 16/03/2010 16:47, Nick Keighley wrote:

Well it already works in other languages like java, caml, etc. And it
almost already works in C:

The question isn't whether it works, it's *how* you suggest it
should work.
Here is an example: [snip]
This code contains no forward function declarations, has two compilation
units, and compiles to a working executable with an existing C
compiler. The diagnostic it prints is also interesting, as it
shows the compiler knows the right declaration of standard
functions without the need for #include and a forward declaration.
Of course, I cheat a little by using int as the return type (if you
replace every 'int' with 'double' and 'atoi' with 'atof', the above code will
still compile but won't work as expected). When it sees an undeclared
function, gcc treats it as a function returning int and deduces
its arguments from its usage (except for variadic functions, as shown
by the diagnostic message). So it works mostly by accident. But I don't
think much work is necessary to make it work in other cases (with
functions not returning int).

Here's another example:

=== square.c ===
float square(float x)
{
return x * x;
}

=== main.c ===
int main(void)
{
if (square(2.0) == 4.0) {
return 0;
}
else {
return 1;
}
}

I've avoided doing any I/O to sidestep the issue of references
to declarations in the standard library; I've also assumed that
"return 1;" is a sensible thing to do on an error.

In C as it's currently defined, the above won't work, or at least
will invoke undefined behavior. When the compiler compiles main.c,
it doesn't know what type square() returns. Under C90 rules,
it assumes (incorrectly, as it turns out) that it returns int.
Under C99 rules, the missing declaration makes the call a constraint
violation.

In your hypothetical C without "function forward declarations", how
would this work? How would the compiler figure out that square()
takes a float argument and returns a float result?

I'm not saying it can't be done. I'm asking how you suggest doing it.
 
Jasen Betts

On 05/03/2010 18:56, Casey Hawthorne wrote:

While reading this thread, it looks like some people want to make
additions to the language, like Jacob Navia's operator overloading &
generic containers. To make the language simpler, or cleaner, why not
remove things from it instead? For example:

- void can be removed from the language. So instead of declaring
void f(void);
we can simply write :
f();
The generic pointer type (void *), could then be replaced by (char*)
without much harm.

- auto can be removed from the language.

- register can be removed from the language. The only thing it is
useful for is preventing the address of a variable from being taken. If we do
not want to take the address of a variable, I think we can refrain
from doing so without the need for a keyword. Some compilers
can do better optimizations with it, but I think most recent
compilers don't really care about the presence of this keyword.

- restrict can be removed. It is mostly here to facilitate some
optimizations by the compiler by preventing aliases. I think this is not
the duty of the programmer to facilitate optimization, but rather the
burden of the compiler.

- -> operator can be replaced by the . operator.

We can also get further by cleaning the names of basic types. I would
prefer to spell char b.y.t.e., as it makes its purpose more obvious.
Some integer types are redundant, short/int or int/long or long/long
long depending on the machine. I think such redundancy should go.

Obviously the standard library could be improved, at least by removing
dangerous functions like gets() or stupid functions like strncpy and
making all functions thread safe. It might also be made simpler by
removing useless types like size_t.

strncpy is not stupid, it's just often misused. fgets is also often
misused; should that be abolished too?

I think I have used strncpy exactly twice where its behavior was
exactly what I wanted, without it I would have had to find some other
way, perhaps using memcpy.

null padded fixed length buffers seem to have gone out of style.


 
Keith Thompson

Richard Delorme said:
On 16/03/2010 16:52, Richard Bos wrote:

Right now it won't work because in absence of a return type, a C
compiler uses the type int as a return type. The above function is in
fact correct and
three() { return 3; }
is the same as
int three() { return 3; }

Only in C90. C99 dropped the implicit int rule; in C99, attempting to
call a function with no visible declaration is a constraint violation.
Now suppose a new standard or a new language that changes this
behaviour so that, in the absence of a type, the compiler uses the
type void. What is the problem ? To me the unnatural behaviour is to
return an int when nothing is written, instead of returning void.

The problem is that valid C90 code (which happens to be invalid C99
code because it depends on implicit int) becomes valid Cwhatever code
with a different meaning.
 
Keith Thompson

Richard Delorme said:
On 16/03/2010 17:28, (e-mail address removed) wrote:
Then how do you propose to support mutually recursive functions?

like this :
/* file recurse.c */
/*------8<--------------*/
int f(int a)
{
if (a <= 1) return 1;
else return g(a + 2);
}

int g(int b)
{
if (b >= 5) return 5;
return f(b - 1);
}

int main()
{
int i;

for (i = 0; i < 5; ++i) {
printf("f(%d) = %d\n", i, f(i));
printf("g(%d) = %d\n", i, g(i));
}

return 0;
}
/*------8<--------------*/ [snip]
So it is obviously not a problem to implement this as existing
compilers can deal with it.

Your proposal means that C can no longer be compiled in a single pass;
when the compiler sees the call to g(), it has to look ahead to find
the declaration (and definition) of g.

That's not necessarily a bad thing. But what if f() and g() are
defined in separate files?
 
Keith Thompson

Nick Keighley said:
this being completely unprecedented in the C language

i = i * 2;
f = f * 2.0;
t = *tp;

I didn't say it was unprecedented.

Richard Delorme (what, another Richard?) proposed adding a new meaning
to the "." operator (and presumably dropping the "->" operator,
but perhaps he means to leave it in place). I'm not even saying
that's a bad idea; I'm merely suggesting that it's not necessarily
a simplification.
 
Nick

Willem said:
Rod Pemberton wrote:
) What about "recase" or "reswitch"? E.g., perhaps like so:
)
) case 0x10: recase (0x30);
) case 0x20: /* stuff */
) break;
) case 0x30: /* stuff */
) break;
)
) The advantage is you don't need a goto label. The ordering can be as one
) wishes. And, each case could then be auto-break. Hmm, that's not too bad,
) IMO.

I would prefer the 'see' keyword, and I would then recommend pointing back
to earlier cases:

case 0x10: /* stuff */
break; /*
case 0x20: /* Other stuff */
break;
case 0x30: see case 0x10;

Oh come on, it's too funny not to do it :p

It's rare for programming language discussion to cause a genuine LOL
moment, but that was one.

I've actually implemented something vaguely similar in my basic-like
homebrew language.

It has "select" statements with a controlling variable, which can be
changed. "next" functions inside "select" statements and causes the
whole switch to be started again.

So, for example:
for thisloop = 10 to 20 step 5
select thisloop as x
case x = 10
print 'hello'
x = 20
next x
case x = 15
print 'boo!'
case x = 20
print 'goodbye'
end select
end for thisloop

results in:
hello
goodbye
boo!
goodbye

I'm still not sure if this is a good idea or not.
 
Seebs

It's rare for programming language discussion to cause a genuine LOL
moment, but that was one.

However, it's inappropriate for C.

It should be:

case 0x30: static case 0x10;

because "static" is the C keyword for "hey, guys, I have this great
idea..."

-s
 
Ian Collins

Even considering only systems using CR, CR/LF, or LF, C's text mode can go
wrong: reading in a file created on a computer with a different newline
sequence, or writing a text file on this computer and reading it on one
using a different sequence.

And then there are hybrid files which are mainly binary data, but also
contain embedded text that can include newline characters. Which means that
binary data that looks like CR/LF gets converted to LF (and the entire file
shrinks in size by one byte), or vice versa.

I suspect people who advocate text mode tend to use machines with a single
character newline, and simply don't see the problems it creates when newline
is multiple characters.

I doubt that; on systems with a single-character newline, text mode is
irrelevant.
 
Ian Collins

How would a person do it? You've given us a hint here, by placing square.c
next to main.c, so I'm assuming the square() in that file is the one in
question.

The compiler of main.c will need some extra information, such as the names
of some files to search for names which are undefined in main.c. And without
namespaces, it may have to search all the files suggested. In practice it
will likely build a database of exports from those files to speed things up.

In other words a step backwards; requiring a two pass compilation.
The question you haven't asked, is how mutually recursive functions, one in
each of two modules (or perhaps in a circular chain of modules), are
handled. I'm still thinking about that one...

That might take a while!
 
Eric Sosman

Even considering only systems using CR, CR/LF, or LF, C's text mode can go
wrong: reading in a file created on a computer with a different newline
sequence, or writing a text file on this computer and reading it on one
using a different sequence.

Okay, we agree: C's text streams are unsuitable for processing
files that are not text files. (This is news?)

On the other hand, C's binary streams are also unsuitable for
manipulating text in such files. I have used a real live system
(I believe the O/S in question is still in use and still supported
today) on which puts("Hello") produced

'\005' '\000' 'H' 'e' 'l' 'l' 'o' '\000'

If you moved those eight bytes verbatim to a Unix or Windows
system and tried to read them with a text stream, you'd get junk
at best. If you read them with a binary stream, you'd get the
eight bytes -- but then it would be *your* problem to know the
text-file conventions of the foreign system. (See the newline?
No? Too bad: It's there, sort of. See the two NUL's? Yes? Too
bad: They're not there, sort of.)

The solution is not for C to try to impose its own formatting
ideas on every system everywhere ("We control the horizontal, we
control the vertical"), but to find out and fix what went wrong
in the file transfer process. It's not C's fault that a malformed
file is malformed.
And then there are hybrid files which are mainly binary data, but also
contain embedded text that can include newline characters. Which means that
binary data that looks like CR/LF gets converted to LF (and the entire file
shrinks in size by one byte), or vice versa.

If you use the wrong kind of stream, you may get unwanted
translations (or miss out on necessary translations). The fix is
simple: Use the right kind of stream for the job at hand.
I suspect people who advocate text mode tend to use machines with a single
character newline, and simply don't see the problems it creates when newline
is multiple characters.

This whole thing has been explained to you several times, and
you're still grasping the wrong end of the stick. C's distinction
between text and binary streams doesn't *create* incompatibilities,
it gives you a fighting chance to *solve* incompatibilities that
arise outside C's sphere of influence.
 
