Has thought been given to a cleaned up C? Possibly called C+.

jacob navia · Mar 10, 2010

bartc wrote:

jacob said:

Ben Bacarisse wrote:

<snip>
I am not convinced of any application of operator overloading that
is not (1) Done with new numeric types
(2) Done with access to containers (overloading [ ] )

All others, like using + to "add" strings or << or >> to print stuff
are bad usages in my opinion.

What about a library for doing symbolic manipulation of expressions
-- for example polynomials. Would you not want to define + as the
operation of addition on polynomials?

In general we should have

a+b <--> b+a
a*b <--> b*a

What do you have against a-b and a/b?

Hey bart!

Maybe (just maybe) you are aware that

a-b != b-a
a/b != b/a

I know EVERYTHING I say is wrong... But maybe (just maybe)
there are some exceptions

bartc · Mar 10, 2010

jacob navia said:
bartc wrote:

jacob said:

Ben Bacarisse wrote:
<snip>
I am not convinced of any application of operator overloading that
is not (1) Done with new numeric types
(2) Done with access to containers (overloading [ ] )

All others, like using + to "add" strings or << or >> to print stuff
are bad usages in my opinion.

What about a library for doing symbolic manipulation of expressions
-- for example polynomials. Would you not want to define + as the
operation of addition on polynomials?

In general we should have

a+b <--> b+a
a*b <--> b*a

What do you have against a-b and a/b?

Hey bart!

Maybe (just maybe) you are aware that

a-b != b-a
a/b != b/a

I thought you were saying that operator overloads were only worth doing when
they are commutative (and naturally if you need "+", you might need "-")

But I think now you're saying that if "+" or "*" are overloaded, they had
better be commutative (otherwise your compiler doesn't know if they can be
reversed).

In which case I'd probably disagree (strings come to mind for "+", and
matrices for "*", for a start).

In this circumstance, perhaps your compiler can just assume they are not
reversible.

Keith Thompson · Mar 10, 2010

bartc said:
jacob navia wrote: [...]

In general we should have

a+b <--> b+a
a*b <--> b*a

What do you have against a-b and a/b?

Um, what makes you think he has anything against them? He was stating
requirements for overloaded "+" and "*" operators; he didn't even
mention "-" or "/".

Andrew Poelstra · Mar 10, 2010

I thought you were saying that operator overloads were only worth doing when
they are commutative (and naturally if you need "+", you might need "-")

But I think now you're saying that if "+" or "*" are overloaded, they had
better be commutative (otherwise your compiler doesn't know if they can be
reversed).

In which case I'd probably disagree (strings come to mind for "+", and
matrices for "*", for a start).

In this circumstance, perhaps your compiler can just assume they are not
reversible.

I'm not sure that the compiler is the biggest concern here. More
to help out the people reading the code, who might see

a * b + c * a

and think, oh, I can simplify that:

a * (b + c)

And move on, believing the code to be shorter and therefore easier
to comprehend, unaware that a, b, and c were all matrices and he
just caused an operator failure (which would mean...?) or possibly
a subtle and well-hidden bug if (a * b) ~= (b * a).
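
The matrix case can be checked with a small sketch. The `Mat2` type and its helper functions are hypothetical names made up for this illustration, and plain C has no operator overloading, so the "operators" are spelled out as function calls:

```c
#include <stdbool.h>

/* Hypothetical 2x2 integer matrix type, used only for this illustration. */
typedef struct { int m[2][2]; } Mat2;

/* Ordinary matrix product: order of the operands matters. */
static Mat2 mat2_mul(Mat2 a, Mat2 b)
{
    Mat2 r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.m[i][j] = a.m[i][0] * b.m[0][j] + a.m[i][1] * b.m[1][j];
    return r;
}

/* Element-wise sum. */
static Mat2 mat2_add(Mat2 a, Mat2 b)
{
    Mat2 r;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            r.m[i][j] = a.m[i][j] + b.m[i][j];
    return r;
}

/* Element-wise comparison. */
static bool mat2_eq(Mat2 a, Mat2 b)
{
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++)
            if (a.m[i][j] != b.m[i][j])
                return false;
    return true;
}
```

With a = {{1,2},{3,4}}, b = {{0,1},{1,0}} and c = {{1,0},{0,0}}, a*b + c*a works out to {{3,3},{4,3}} while a*(b + c) is {{3,1},{7,3}}, so the "simplification" silently changes the result. Only a*b + a*c factors that way.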

Keith Thompson · Mar 10, 2010

bartc said:
Tidying up of C Language

For someone taking a fresh look at C, and who doesn't care too much
about its history or compatibility with existing code, these are a
few things that might stand out.

I'm not necessarily providing any fixes here, just pointing out areas
that might cause raised eyebrows:

As I'm sure you know, anything that will break existing code,
particularly that will silently break existing code, has no chance of
making it into a future C standard. That's not intended to discourage
you from posting your ideas; some might be adapted to avoid breaking
existing code, some might be useful for future design of other
languages, and many are interesting in their own right.

[...]

'Header' Files

A misnomer for ordinary include files. Someone used to Import
statements for example might expect the contents of header files
to exist in a separate scope from the module being compiled.

(Although a proper treatment of header files might be difficult
without introducing module namespaces too.)

The solution for C as it is now is to understand how #include
directives actually work.

I wouldn't mind some sort of "import" feature that works on a higher
level than simple text inclusion. But it would need to be defined,
which would be a lot of work, and I've never seen a proposal.
And it would have to coexist with what we have now.

[...]

Type Declarations

These are C's famous convoluted inside-out declarations. I've
already suggested a left-to-right alternative (which might just
co-exist with the old scheme)

I'm skeptical that such a scheme could actually coexist with the
current one. I'm even more skeptical that a new scheme could be
*proven* to be compatible with the current scheme. In the best case,
there would almost inevitably be cases where a casual reader would
have difficulty figuring out whether a given declaration uses the
old scheme or the new one, and what it means.

I'm not a big fan of C's declaration syntax, but I don't think
adding a new set of rules is the answer.

Struct Namespace

Just a cause of confusion. Just let a struct name be equivalent to
a typedef name.

It worked for C++. I don't see much of a problem with the idea.
I think it would break some existing code, but only in relatively
obscure cases.

Numeric Literals

Who would have guessed that 0123 is an octal number? Get rid of that
notation, and introduce 8x123 if anyone is still interested.

Making 0123 mean 123 (decimal) would quietly change the meaning of
existing code. You could avoid that by banning leading 0s except for
"0", but that would invalidate existing code.

I agree that the existing octal notation is confusing, and a different
syntax like 8x123 would have been better.

And there needs to be *some* octal notation, at least for Unix file
permissions.
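
A minimal sketch of the current behaviour being discussed (the function names are made up for the example):

```c
/* A leading 0 makes an integer constant octal in C. */
static int leading_zero_value(void)
{
    return 0123;    /* 1*64 + 2*8 + 3 = 83, not 123 */
}

/* Octal maps neatly onto Unix permission bits: three bits per digit. */
static unsigned rw_r_r_mode(void)
{
    return 0644;    /* rw-r--r--, which is 420 in decimal */
}
```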

And there should be a notation for binary literals, perhaps 2x11011.

There's some precedent for 0b11011.

And those strange suffixes you sometimes see: LU, LLU and so on,
are they really necessary? Why can't the type of the constant be
automatic?

The type of an unsuffixed constant already is automatic; see C99
6.4.4.1. The suffixes are needed only when you need to specify a type
other than the implicit one.
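
One way to see this, sketched with C11's `_Generic` (which is newer than the C99 text cited above). The `INT_KIND` macro and the `K_*` names are made up for the example:

```c
/* Labels for the integer types an integer constant can end up with. */
enum int_kind { K_INT, K_UINT, K_LONG, K_ULONG, K_LLONG, K_ULLONG, K_OTHER };

/* Selects a label from the type of the expression; x is not evaluated. */
#define INT_KIND(x) _Generic((x),                   \
    int: K_INT,             unsigned int: K_UINT,   \
    long: K_LONG,           unsigned long: K_ULONG, \
    long long: K_LLONG,     unsigned long long: K_ULLONG, \
    default: K_OTHER)
```

Here `INT_KIND(1)` selects `K_INT` with no suffix at all, while `INT_KIND(1LU)` selects `K_ULONG` and `INT_KIND(1LLU)` selects `K_ULLONG`: the suffix only matters when the default is not the type you want.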

I suppose you could have the type of an integer constant depend on
the context in which it appears but (a) I'm not sure what that would
buy you, and (b) it would break the existing rule that the type of
a subexpression is (almost always) determined by the subexpression
itself, not by its context.

Sizeof Operator

This just gives the number of bytes in a type (or the type of an
expression). That's fine, but what about getting the number of
elements of an array? Ie. without bothering having to divide the
bytes in the entire array by the bytes in one element...

It's easy enough to write a macro:

#define ARRLEN(arr) (sizeof (arr) / sizeof (arr)[0])

On the other hand, it's easy to misuse that by applying it to a
pointer. Something that can only be applied to array expressions
could be useful.
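
A short sketch of both the intended use and the misuse (`table`, `table_len` and `len_via_pointer` are names invented for the example):

```c
#include <stddef.h>

#define ARRLEN(arr) (sizeof (arr) / sizeof (arr)[0])

static int table[10];

/* Correct use: `table` is still an array here, so this yields 10. */
static size_t table_len(void)
{
    return ARRLEN(table);
}

/* Misuse: `p` is a pointer, so this divides sizeof(int *) by sizeof(int).
   Many compilers warn about it, but it still compiles. */
static size_t len_via_pointer(int *p)
{
    return ARRLEN(p);
}
```

Calling `len_via_pointer(table)` returns a small number that depends on the platform's pointer and int sizes, not 10.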

Type Limits

This is where you start seeing names such as USHRT_MAX and
LLONG_MIN (all tacky abbreviations we are constantly told to avoid
as macros and typedefs), and where you start wondering, is there a
Better Way?

(Such as, perhaps, long.max or signed char'min, which together with
a set of standardised short type names would tidy things up
considerably.)

Ada, for example, has a number of "attributes" that can be applied
to various entities: Typename'Size, Object'Size, Typename'First,
Typename'Last, Array'Length, and so forth. Something like that
in C could replace sizeof, offsetof, the above ARRLEN macro, and
probably a number of other things. On the other hand, creeping
featurism is always dangerous.

[...]

Operators

A power operator is missing (I think because no-one can decide what
to use, since * is heavily involved with pointers).

I suppose ^^ is available.

The << and >> operators have a strange precedence (they effectively
multiply and divide, so should be the same as * and /)

Can't be fixed without quietly breaking existing code.
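
For reference, a sketch of what the existing precedence means in practice (function names are made up; compilers commonly warn about the first form):

```c
/* `+` binds tighter than `<<`, so this computes x << (n + k). */
static int shift_without_parens(int x, int n, int k)
{
    return x << n + k;
}

/* What someone expecting multiplication-like precedence would have meant. */
static int shift_then_add(int x, int n, int k)
{
    return (x << n) + k;
}
```

With x = 1, n = 2, k = 3 the first gives 32 and the second gives 7, which is why changing the precedence would quietly alter existing programs.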

Switch Statement

It should not be necessary to use break to terminate every case
statement. (And there's the problem that break cannot then be used
to escape from a loop).

csh uses "breaksw" to break out of a switch statement.

Case expressions should be able to use ranges and commas:

case 1,2,3,5..7,8:

instead of:

case 1: case 2: case 3: case 5: case 6: case 7: case 8:

No excuses!

On the other hand, that introduces the temptation to write
case 'a'..'z':
which isn't portable.

gcc uses the existing "..." token for this.
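
A sketch of the gcc form, covering the same values as the `1,2,3,5..7,8` example. This is a GNU extension rather than ISO C, so it assumes gcc or a compatible compiler; `in_set` is a made-up name:

```c
/* GNU extension, not ISO C: `low ... high` in a case label.
   Keep the spaces around `...` when the bounds are integer constants. */
static int in_set(int n)
{
    switch (n) {
    case 1: case 2: case 3:
    case 5 ... 8:
        return 1;
    default:
        return 0;
    }
}
```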

And Switch statements are very strange in that the case statements
do not form a normal block scope, so that you can have a case label
buried deep inside an embedded if statement or a loop! This is just
too weird to have in a serious language.

Take a look at ioccc.org and tell me C is a serious language.
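
The best-known use of case labels buried inside a loop is the construction usually called Duff's device. The sketch below adapts it to a plain byte copy; `duff_copy` is a name made up for the example:

```c
#include <stddef.h>

/* Copies count bytes, eight per loop iteration. The switch jumps into the
   middle of the do-while body to handle the remainder first. */
static void duff_copy(char *to, const char *from, size_t count)
{
    if (count == 0)
        return;

    size_t n = (count + 7) / 8;     /* number of passes through the loop */

    switch (count % 8) {
    case 0: do { *to++ = *from++;
    case 7:      *to++ = *from++;
    case 6:      *to++ = *from++;
    case 5:      *to++ = *from++;
    case 4:      *to++ = *from++;
    case 3:      *to++ = *from++;
    case 2:      *to++ = *from++;
    case 1:      *to++ = *from++;
            } while (--n > 0);
    }
}
```

It is legal C precisely because case labels are just labels inside the switch body, whatever other statements surround them.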

I don't think the existing switch statement can be removed,
but a new form of selection statement might be added. (Again,
this runs into the creeping featurism problem.)

For Statement

This is a funny, but useful, variation of a loop statement, but is
not a For statement as normally understood. A streamlined 'proper'
For statement would be handy (but is awkward to fit into C's zero-
based philosophy).

Can you be more specific?

[...]

Named Constants

Ie. what someone might expect when writing const int x=1000; x is a
variable, not an alias for 1000.

Given that const really means read-only, there is no proper way of
assigning a name to a literal, other than workarounds using #define
and enum, both with their own restrictions.

I think stealing what C++ did with this would be quite reasonable.
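
A sketch of the three options as they stand in C99/C11; all names are invented for the example:

```c
#include <stddef.h>

#define LIMIT_MACRO 1000                /* textual replacement: no scope, no type */
enum { LIMIT_ENUM = 1000 };             /* a real constant, but always of type int */
static const int limit_const = 1000;    /* a read-only object, not a constant expression */

static int by_macro[LIMIT_MACRO];
static int by_enum[LIMIT_ENUM];
/* static int by_const[limit_const];
   rejected in C: a file-scope array size must be an integer constant
   expression, and limit_const is not one (C++ accepts this). */

static size_t by_macro_len(void) { return sizeof by_macro / sizeof by_macro[0]; }
static size_t by_enum_len(void)  { return sizeof by_enum / sizeof by_enum[0]; }
static int    read_limit(void)   { return limit_const; }
```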

Arrays

Array handling is ... different. However I don't have suggestions
to fix that, without completely changing the language.

Name Scoping

As I understand it, function names, and variable names declared
outside of functions, are always exported unless some attribute
(static?) is used.

I don't think this is what one would expect (ie. names are normally
private unless explicitly exported). The way C works now seems just
a little too casual.

I agree with the principle, but again, this would break existing code.

Something related to this: if I were designing my own language,
declared objects would be read-only by default. If you want to be
able to change an object's value after declaring it, you need to
say so, perhaps with a "var" keyword.

Text and Binary File Modes

No comments needed...

Compiler Attributes

When you look at actual header files they always seem to be full of
cr*p like this (and often a lot worse):

_CRTIMP __p_sig_fn_t __cdecl __MINGW_NOTHROW signal(int, __p_sig_fn_t);

all full of ad-hoc non-portable extensions specially designed to
make declarations completely incomprehensible.

Whatever it is these attributes are supposed to do, why not just
standardise them?

Because many of them are implementation-specific. Though a standard
syntax for implementation-defined attributes wouldn't be a bad thing;
is that what you meant? gcc provides some precedent for this.

[...]

Well, that's about all I could think of before breakfast. I've tried to
leave out personal preferences as that would have made it several
times the size.

And I've mainly concentrated on syntax...

Hey, you forgot to define a new meaning for "static"!

Michael Foukarakis · Mar 10, 2010

OpenBSD's strlcpy(), for one.

Those aren't any more secure than regular strcpy().

Dag-Erling Smørgrav · Mar 10, 2010

Andrew Poelstra said:
I think the biggest missing feature of C is namespaces.

Agreed, but adding namespaces to the standard would be a lot of work,
because there are a lot of corner cases to consider. There is also the
issue of preprocessor macros, which can't be coerced into namespaces.

DES

James Kuyper · Mar 10, 2010

Nick said:
quite correct. This harping on about keywords is just silly. You can
write code that compiles with both a C compiler and a C++
compiler. And it isn't that hard.

Having the code compile isn't sufficient; the code should also have the
same behavior in both languages. Still, it's not too hard to take well
written C code and make the modifications needed for it to compile with
the same behavior under both C and C++. However, there's lots of tricky
little special cases that you should be aware of; see Annex C of the C++
standard for more details. Most of the issues listed there don't come up
in well written C code, but some of them may come up when porting legacy
code that takes undue advantage of C90's backward compatibility with K&R C.
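
One small example of code that compiles under both languages but behaves differently, sketched here as a C function (`char_constant_size` is a made-up name):

```c
#include <stddef.h>

/* In C a character constant has type int; in C++ it has type char.
   So this returns sizeof(int) when compiled as C and 1 when compiled as C++. */
static size_t char_constant_size(void)
{
    return sizeof 'a';
}
```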

Jasen Betts · Mar 10, 2010

I think the biggest missing feature of C is namespaces. If we had
namespaces it would be a lot easier to share (and reuse) libraries
because there'd be far less risk of name clashing.

The linker doesn't have namespaces, C++ only has namespaces because
the compiler mangles names, and each compiler mangles differently.

you can do name mangling in C by using a header that #defines short
convenient names for inconveniently named things.
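
A sketch of that approach. The library name, the function, and the header name mentioned in the comment are all hypothetical:

```c
/* Hypothetical library function with a long, collision-resistant name. */
static int acme_geometry_rect_area(int width, int height)
{
    return width * height;
}

/* What a hypothetical convenience header, say "acme_short.h", might contain.
   Users who include it get the short spelling; everyone else is unaffected. */
#define rect_area acme_geometry_rect_area
```

The cost is that the short name is a macro, so it ignores scope and will rewrite any other identifier spelled `rect_area` in a file that includes the header.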

Richard Bos · Mar 10, 2010

jacob navia said:
My opinion is that C should be developed further, exactly BECAUSE it is
a simple language. Adding a container library to C doesn't make the
language any bigger or more complicated but makes programs COMPATIBLE
because it is possible to interchange data with standard API/containers.

I'm sorry, but... did you _really_ just write something to the tune of:
"C is a simple language. Adding a large, complicated gathering of data
structures to it won't make it any bigger and more complicated"?

Because if you did, I'd like to know what the French government thinks
of the stuff you imported from some "coffee shop" in Amsterdam; and if
you didn't, I'd like to know how you can possibly think that any
flexible container library that could be good enough to use by the
majority of C users could ever _not_ be too large, complicated, and
highly over-specced to use.

Richard

Richard Bos · Mar 10, 2010

Ian Collins said:
One interesting question is why, after over 30 years of use, there
hasn't appeared a widely accepted, general and extensible container
interface for C?

The C++ STL appeared fairly early on in that language's evolution and
was rapidly and widely accepted by C++ developers. The widespread use
resulted in it being incorporated into the standard library in the first
language standard.

Why hasn't the same thing happened with C?

Because it's the Wrong Thing for C. C isn't a one-purpose language.

Richard

Nick · Mar 10, 2010

Andrew Poelstra said:
I'm not sure that the compiler is the biggest concern here. More
to help out the people reading the code, who might see

a * b + c * a

and think, oh, I can simplify that:

a * (b + c)

And move on, believing the code to be shorter and therefore easier
to comprehend, unaware that a, b, and c were all matrices and he
just caused an operator failure (which would mean...?) or possibly
a subtle and well-hidden bug if (a * b) ~= (b * a).

Although lots of languages do do something like this.

It's easy to think of one where if a=3 and b = "ab" and c = "xy" then

a * b + c * a

will give "abababxyxyxy"
but

a * (b + c)

will give "abxyabxyabxy"

I've not heard of people suffering that sort of problem though.

I've said this before, the problem I see with operator overloading is
that you are stuck with an arbitrary set of operators to work with. If
you really are just implementing, say, complex numbers or very large
integers, then it's useful - but you don't do that very often. This is
Jacob's position AIUI. But if you want to do something else you end up
forcing it into this mould. This is what tends to happen.

Really you need to be able to define new operators, even if this is just
syntactic sugar for functions. Give them all the same precedence and
off you go.

Andrew Poelstra · Mar 10, 2010

The linker doesn't have namespaces, C++ only has namespaces because
the compiler mangles names, and each compiler mangles differently.

you can do name mangling in C by using a header that #defines short
convenient names for inconveniently named things.

It wouldn't be difficult to standardize name mangling - hell, if you
just left the :: characters in there and forced linkers to handle :
as a token character that would be a solution.

Or, you could require namespace support to be included in linkers.
IIRC C99 required changes to linkers, so this isn't a taboo
requirement. (And really, almost all compilers come with their own
linkers these days. It wouldn't be a big deal if both needed to be
updated.)

Andrew Poelstra · Mar 10, 2010

Agreed, but adding namespaces to the standard would be a lot of work,
because there are a lot of corner cases to consider. There is also the
issue of preprocessor macros, which can't be coerced into namespaces.

Preprocessor macros would be an unfortunate edge case, but IMHO
not a fatal one. For function-like macros, you can use the inline
keyword on a function to get namespaces. For constants, you could
use an enum.
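
In today's C the two replacements look roughly like this. The namespace part remains hypothetical, and the names are invented for the example:

```c
/* Macro forms: handled by the preprocessor, so no scoping rule can see them. */
#define SQUARE(x) ((x) * (x))       /* evaluates its argument twice */
#define MAX_ITEMS_MACRO 16

/* Ordinary identifiers instead: these obey scope, so a namespace
   mechanism could apply to them. */
static inline int square(int x)     /* evaluates its argument once */
{
    return x * x;
}

enum { MAX_ITEMS = 16 };
```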

What other corner cases can you think of?

Feeling C++-ey would be simple enough without any standard library
changes:

namespace std {
#include <stdio.h>
#include <stdlib.h>
}

And boom, you've got std::malloc() and std::puts() and all that.

Though now that I think about it, using that idea would probably
cause a fair bit of confusion for linkers.

Ian Collins · Mar 10, 2010

Because it's the Wrong Thing for C. C isn't a one-purpose language.

Very few languages are, C++ certainly isn't.

Seebs · Mar 10, 2010

Those aren't any more secure than regular strcpy().

I don't entirely agree. Obviously, both can be used securely, but in
practice, I think strlcpy() is easier to use in a reliably safe way.
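
For readers who haven't used it: strlcpy() is not part of standard C, so the sketch below is a stand-in written for illustration, not OpenBSD's code. It follows the documented contract (copy at most size - 1 bytes, always NUL-terminate when size is nonzero, return the length of the source):

```c
#include <stddef.h>
#include <string.h>

/* Illustrative stand-in for strlcpy(); the name is made up. */
static size_t sketch_strlcpy(char *dst, const char *src, size_t size)
{
    size_t srclen = strlen(src);

    if (size > 0) {
        size_t n = srclen < size - 1 ? srclen : size - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';          /* result is always a valid string */
    }
    return srclen;              /* >= size means the copy was truncated */
}
```

The caller passes `sizeof buf` once and checks one return value, which is arguably where the "easier to use safely" claim comes from.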

-s

Seebs · Mar 10, 2010

I'm sorry, but... did you _really_ just write something to the tune of:
"C is a simple language. Adding a large, complicated gathering of data
structures to it won't make it any bigger and more complicated"?

I suspect Jacob is relying somewhat on the distinction between "language"
and "library" here. A standard container library does not make the "language"
part of the language more complicated. (It is a bug in English that the C
language has two components, language and library. Library functions are not
part of the language part of the language, but the library as a whole is
part of the language. This should be clear and simple to understand, and is
not because English is defective. Please file a defect report with the love
child of some random Norman/Saxon pairing in the 1500s.)

The issue, as I see it is:

Insofar as no complexity is really added, neither is any functionality; you
can't use the container library without knowing what it does.

Insofar as functionality is added, it creates additional complexity.

The designs I've seen for Jacob's container library, while reasonably
flexible, are IMHO too top-heavy for the sorts of things that normally make
it into the C standard library. They're very clearly influenced by things
learned during the development of C++ and the STL, but those things, I think,
make less sense in a pure C environment.

I guess the question is:

Were this part of a future spec, would we get more or fewer complaints from
confused newbies? I'd guess more. That, to me, means it is increasing
complexity.

-s

Ian Collins · Mar 10, 2010

You do operator overloading in c when you use an operator like
+ since the compiler will do all necessary conversions for you.

The only thing that operator overloading adds to the language is the possibility
for the USER to add its own numeric types. Besides, if you do not like that
feature, do not use it. The language stays the same otherwise.

C's conversion rules also provide a degree of function overloading there
as well (float+float, float+int etc..), do you support something similar?

Richard Bos · Mar 10, 2010

bartc said:
jacob said:

Ben Bacarisse wrote:

<snip>
I am not convinced of any application of operator overloading that
is not (1) Done with new numeric types
(2) Done with access to containers (overloading [ ] )

All others, like using + to "add" strings or << or >> to print stuff
are bad usages in my opinion.

What about a library for doing symbolic manipulation of expressions
-- for example polynomials. Would you not want to define + as the
operation of addition on polynomials?

In general we should have

a+b <--> b+a
a*b <--> b*a

What do you have against a-b and a/b?

And against vector multiplication?

Richard

bartc · Mar 10, 2010

Keith Thompson said:
As I'm sure you know, anything that will break existing code,
particularly that will silently break existing code, has no chance of
making it into a future C standard.

The OP suggested a new version called "C+" or some such name. That would be
a good opportunity to drop some baggage.

I'm skeptical that such a scheme could actually coexist with the
current one. I'm even more skeptical that a new scheme could be
*proven* to be compatible with the current scheme. In the best case,
there would almost inevitably be cases where a casual reader would
have difficulty figuring out whether a given declaration uses the
old scheme or the new one, and what it means.

The current syntax looks like this:

T x;

Where T is an existing type (standard or typedefed) and x is a name. And
optionally x can be surrounded by *,[] and () to build on top of T.

The left-to-right type-spec just sits in place of T, and you can still build
on it with *,[] and (), I think even when there is no name, as in a cast
(there might be a couple of loose-ends to do with the ambiguity of *, and ()
when a "function" keyword is not used).

The following would all be equivalent:

int x[10][20];
[20]int x[10];
[10][20]int x;

However this:

[10][20]int x,y;

would be: int x[10][20],y[10][20] in the old syntax.

It's easy enough to write a macro:

#define ARRLEN(arr) (sizeof (arr) / sizeof (arr)[0])

That's fine but it should be built in, otherwise everyone will use a slightly
different version.

On the other hand, it's easy to misuse that by applying it to a
pointer. Something that can only be applied to array expressions
could be useful.

This is where a compiler-aware tag would be useful (in another language I
might use x.len, as well as x.upb, x.lwb, which can only be applied to
appropriate types, and x.bytes -- ie. sizeof -- which works on anything).

Ada, for example, has a number of "attributes" that can be applied
to various entities: Typename'Size, Object'Size, Typename'First,
Typename'Last, Array'Length, and so forth. Something like that
in C could replace sizeof, offsetof, the above ARRLEN macro, and
probably a number of other things. On the other hand, creeping
featurism is always dangerous.

These are not really features, just basic requirements I would have thought.
And it is mainly syntax that is easy to implement.

I don't think the existing switch statement can be removed,
but a new form of selection statement might be added. (Again,
this runs into the creeping featurism problem.)

I liked the idea of using "switch!" (either from Ruby, or something I
dreamed up) which doesn't test for a default case, and therefore is faster
when you know the switch index is in range. Another featurette I suppose.

Can you be more specific?

A typical For statement iterates from A to B inclusively, or more often from
1 to N. C however would often need to iterate from 0 to N-1, already losing
some of its advantage in conciseness.

Unless it were designed to work from 0 to N-1 anyway, like Python's for i
in range(N):

for (i in 5) printf("%d ",i);

would display 0 1 2 3 4 .
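
Something close to that can be approximated in current C with a macro. `FOR_RANGE` and `sum_below` are invented names, and this is only a sketch of the idea, not the proposed syntax:

```c
/* Iterates i over 0 .. n-1; n is re-evaluated on every pass. */
#define FOR_RANGE(i, n) for (int i = 0; i < (n); i++)

/* Adds up 0 + 1 + ... + (n - 1) using the macro. */
static int sum_below(int n)
{
    int total = 0;
    FOR_RANGE(i, n)
        total += i;
    return total;
}
```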

Something related to this: if I were designing my own language,
declared objects would be read-only by default. If you want to be
able to change an object's value after declaring it, you need to
say so, perhaps with a "var" keyword.

(I do design my own languages; I have tried to have a read-only attribute,
but it's quite fiddly to implement. But I'm not too bothered because it
doesn't buy you any extra performance (not with my compilers anyway...)
and just introduces more clutter.)

Because many of them are implementation-specific. Though a standard
syntax for implementation-defined attributes wouldn't be a bad thing;
is that what you meant? gcc provides some precedent for this.

I mean something where you can guess what the attributes mean. Being able to
port between compilers would be useful too.

(In one of my designs such things might look like:

windows function ....
clang function ...

which are self-explanatory...)

Hey, you forgot to define a new meaning for "static"!

I didn't have time...
