How would you design C's replacement?

Rui Maciel

If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?


Rui Maciel
 
BartC

Rui Maciel said:
If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

Perhaps the problems and shortcomings should be summarised first. Then they
need to be agreed to be shortcomings; many of the experts here are quite
happy with the language as it is. Others will already be using alternatives.
 
Keith Thompson

Rui Maciel said:
If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

Just about every post-1978 language that uses curly braces to delimit
code blocks, and many that don't, is *somebody's* answer to that
question.
 
Kaz Kylheku

If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

void. I would get rid of void. No "void func()", no (void) argument list, no
void * pointers. Functions with no return value could be declared/defined with
the "proc" keyword. Generic pointer would be "byte *", where byte would be a
type built in to the language, exactly like unsigned char, and explicit
conversion by cast would be required in both directions.

More consistent syntax. For instance function parameter decls separated
by semicolons, not commas, allowing:

proc foo(int a, b, c; double *p)

I would smear the distinction between statements and expressions. Statements
would be allowed to return values, so for instance, this would be valid:

int x = if (a > b) c; else d;
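
For comparison, the closest thing in today's C is the conditional operator, which covers only simple selections like this one. A minimal sketch, where pick is a hypothetical helper name and not part of the proposal:

```c
/* Hypothetical helper: the value-returning "if" from the example above,
 * written with the conditional operator that C already has. */
int pick(int a, int b, int c, int d)
{
    int x = (a > b) ? c : d;   /* same result as: if (a > b) c; else d; */
    return x;
}
```

The proposal goes further than this, since it would let any statement, not just a two-way choice, yield a value.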

Conformance. I would distinguish warning and error diagnostics in the spec,
and forbid required diagnostics from being mere warnings. Implementations which
translate a unit that requires a diagnostic, allowing it to be linked and
executed, would be considered nonconforming.

Introspection. The language would have an API for run time introspection.
I don't want to start writing details about this, but suffice it to say that
the introspection would be sufficient that it could allow a program in the
language to implement a precisely tracing garbage collector (even in the
presence of threads) without resorting to any platform-specific assembly
language or other hacks. Introspection would allow the GC to walk stacks and
know exactly where the live variable values are.

I would strengthen arrays without the disadvantages of making them more
encapsulated. There would be a way to define an integral object which indicates
the size of an array referenced by a pointer:

struct vec {
double *dynamic : int size;
double *fixed : 1; /* pointer to just one double */
};

Such a definition introduces two objects, so that struct vec above has
a member called size of type int. The compiler understands that the
size of the array pointed at by v->dynamic is given by int size.

This would be in function parameters also:

double dotproduct(double *v1 : int size_v1, *v2 : int size_v2);

Now when we have a sized vector, we can do this:

dotproduct(v1->dynamic, v2->fixed);

The compiler knows that the size of v1->dynamic (where v1 points to a struct
vec) is given by v1->size. So, automatically, it passes v1->size as
the value for the size_v1 formal parameter. For size_v2, it passes 1.

If you pass an unsized pointer to a sized parameter, the size parameter is not
filled in. It must be specified explicitly:

dotproduct(some_pointer : 4, v2->fixed);

If a pointer is derived from an array, its size is inferred from the array.

All dataflows involving pointers would automatically carry the size
information, if any, propagating it between the size objects:

v1->dynamic = v2->dynamic; /* v1->size = v2->size is implicit! */
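
The size-carrying idea can be roughly approximated by hand in today's C by keeping the pointer and its length in one struct. This is only a sketch: struct sized_ptr and SIZED_FROM_ARRAY are hypothetical names, the propagation that the proposal makes implicit has to be written out explicitly, and summing over the shorter of the two vectors is an assumption about what dotproduct should do with mismatched sizes.

```c
#include <stddef.h>

/* Hypothetical stand-in for "double *p : int size":
 * the pointer and its element count travel together. */
struct sized_ptr {
    double *ptr;
    size_t  size;
};

/* Derive a sized pointer from a real array; the size comes from the array type. */
#define SIZED_FROM_ARRAY(a) ((struct sized_ptr){ (a), sizeof (a) / sizeof (a)[0] })

/* Dot product over the common prefix of the two vectors (an assumption). */
double dotproduct(struct sized_ptr v1, struct sized_ptr v2)
{
    size_t n = v1.size < v2.size ? v1.size : v2.size;
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += v1.ptr[i] * v2.ptr[i];
    return sum;
}
```

Assigning one struct sized_ptr to another copies both members, which is the hand-written equivalent of the implicit v1->size = v2->size above.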

I would drop variable length arrays. They could be replaced by sized pointers
which are initialized using a notation:

int *p : int s = [42];

This [42] is an initializer which means dynamically allocate automatic
storage, returning null if it is not possible, and the size is 42 times the
size of the type the pointer points to. The variable s receives the value 42 if
the allocation succeeds, otherwise zero.

Of course, the s is optional:

{
int *p = [42]; // make auto array of 42 ints, or fail with null
}
// out of scope: array is now gone

The nice thing is that we can now pass p into functions, and using the
size-propagation logic, functions can know the size. This is even if we wrap p
inside a data structure:

struct obj {
int *ptr : int size;
};

proc foo(struct obj);
int bar(int *ptr : int size);

/* ... */

{
int *p = [42];
struct obj obj;
obj.ptr = p; // implicit: obj.size = 42;
foo(obj); // foo knows size
bar(p); // bar receives size
}

Pointer arithmetic would also propagate the size, if possible. (Which is part
of the point.)

If p is a size-attributed pointer of constant size s, then p + n is either
erroneous, if this is out of bounds, or it produces a new pointer displaced
by n, whose size attribute is the value s - n.

If p is a size-attributed pointer with a variable size stored in object s, then
p + n evaluates s to determine whether p + n is out of bounds.
If it is not out of bounds, then the displaced pointer p + n is produced,
and its size attribute is the rvalue s - n. (No provision for safety for
negative displacements would be provided.)

Example:

{
int *p : int size = [42];

p++; /* size is now 41 */

p -= 2; /* UB. */
}
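
The p + n rule can be emulated in current C with an explicit helper. This is a sketch under assumptions: struct sized_int_ptr and sized_advance are hypothetical names, and assert stands in for whatever "erroneous" would mean in the proposed language.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for "int *p : int size". */
struct sized_int_ptr {
    int   *ptr;
    size_t size;
};

/* Emulates p + n: checks the displacement against the carried size,
 * then yields a pointer whose size attribute is size - n.
 * n is unsigned, so negative displacements cannot be expressed here. */
struct sized_int_ptr sized_advance(struct sized_int_ptr p, size_t n)
{
    assert(n <= p.size);   /* an out-of-bounds displacement is erroneous */
    struct sized_int_ptr q = { p.ptr + n, p.size - n };
    return q;
}
```

Advancing a 42-element pointer by one gives a pointer with size 41, matching the p++ case in the example.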


Arrays would be first class citizens: passed into functions, returned from
functions. Array syntax in a function argument list would not denote a
pointer. Size mismatches would be diagnosed. An explicit cast would override the
size mismatch, resulting in truncating or zero-padding semantics.

/* func takes one int, returns array of 3 int. */

int (func[3])(int a)
{
int x[2] = { 1, 2 };

return x; /* error, size mismatch */

return (int[3]) x; /* return value is padded with zero */
}
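
In today's C the usual workaround is to wrap the array in a struct, since structs can be passed and returned by value. A sketch of the padding cast under that workaround; struct int3 and int3_from are hypothetical names:

```c
#include <string.h>

/* Hypothetical wrapper: an array of 3 int that can be returned by value. */
struct int3 {
    int v[3];
};

/* Copies up to 3 ints from src: truncates a longer source,
 * zero-pads a shorter one (the "(int[3]) x" semantics). */
struct int3 int3_from(const int *src, size_t n)
{
    struct int3 out = { { 0, 0, 0 } };

    if (n > 3)
        n = 3;
    memcpy(out.v, src, n * sizeof src[0]);
    return out;
}

/* Counterpart of func above: takes one int, returns 3 ints by value. */
struct int3 func(int a)
{
    int x[2] = { 1, 2 };

    (void)a;                  /* unused, as in the original example */
    return int3_from(x, 2);   /* third element is padded with zero */
}
```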

I would provide a simple namespacing scheme based on textual gluing of prefixes
onto identifiers.

On the definition side:

prefix mylib_ {
int open(char *);
int close(int);
};

The above is precisely equivalent to:

int mylib_open(char *);
int mylib_close(int);

On the use side:

prefix mylib_ open, close;

This means that if the names mylib_open and mylib_close are visible in this
scope, the short names open and close now stand for mylib_open and mylib_close.
If no such names are visible, it is erroneous.

Something similar would be provided for preprocessor symbols (if I actually
kept the preprocessor as such).
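
The use side can be imitated today with plain macros, though without the visibility check or the scoping. In this sketch the bodies of mylib_open and mylib_close are made up purely for illustration (a tiny handle table), and the parameter is const char * rather than char *:

```c
#include <stddef.h>

/* Definition side, written out by hand with the prefix glued on. */
static int mylib_is_open[4];   /* hypothetical handle table */

int mylib_open(const char *name)
{
    if (name == NULL)
        return -1;
    for (int i = 0; i < 4; i++) {
        if (!mylib_is_open[i]) {
            mylib_is_open[i] = 1;
            return i;          /* first free handle */
        }
    }
    return -1;                 /* table full */
}

int mylib_close(int h)
{
    if (h < 0 || h >= 4 || !mylib_is_open[h])
        return -1;
    mylib_is_open[h] = 0;
    return 0;
}

/* Use side: roughly what "prefix mylib_ open, close;" would arrange. */
#define open  mylib_open
#define close mylib_close
```

Unlike the proposed prefix declaration, these macros apply to the rest of the translation unit and will silently rename any other open or close that appears after them.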

Control flow. I would provide a non-local dynamic return mechanism with
cleanup (unwind protect). Any function or block would be able to execute
cleanup code if it is terminated by a nonlocal transfer.
Some kind of exception handling would be provided.
Named blocks for breaking out of nested constructs.
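
What this has to replace in current C is hand-rolled setjmp/longjmp code. A single-threaded sketch of that pattern, where every function and variable name is hypothetical:

```c
#include <setjmp.h>
#include <stdlib.h>

static jmp_buf *current_handler;   /* innermost active handler */
int cleanup_count;                 /* how many times cleanup code has run */

/* Non-local exit to the innermost handler. */
static void throw_error(void)
{
    longjmp(*current_handler, 1);
}

/* Acquires a resource and releases it on both the normal and the unwinding path. */
static void inner(int should_fail)
{
    jmp_buf here;
    jmp_buf *outer = current_handler;   /* not modified after setjmp */
    char *buf = malloc(16);             /* not modified after setjmp */

    current_handler = &here;
    if (setjmp(here) == 0) {
        if (should_fail)
            throw_error();
        current_handler = outer;        /* normal exit */
        free(buf);
        cleanup_count++;
        return;
    }
    current_handler = outer;            /* arrived here via longjmp */
    free(buf);
    cleanup_count++;
    throw_error();                      /* keep unwinding to the outer handler */
}

/* Returns 0 if inner() completed, -1 if it exited non-locally. */
int run(int should_fail)
{
    jmp_buf top;
    jmp_buf *outer = current_handler;
    int result;

    current_handler = &top;
    if (setjmp(top) == 0) {
        inner(should_fail);
        result = 0;
    } else {
        result = -1;
    }
    current_handler = outer;
    return result;
}
```

The cleanup code has to be written twice and the handler chain maintained by hand, which is the boilerplate a built-in unwind-protect would remove.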
 
Rui Maciel

io_x said:
i propose assembly: 8 32 bits registers and 20 - 30 instruction on them
...

Why? What's wrong with writing routines in assembly and then calling those
routines from C?


Rui Maciel
 
FireXware

If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

I would keep C the same but enforce a universal 'package manager' type
of thing for installing and building with libraries. So that I don't
have to spend more time figuring out how to get the damn library to
compile with my code than it would take me to write the library myself.
 
Stefan Ram

Rui Maciel said:
If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

I'd decline this task. I am not capable of designing something
better. I'd propose to keep evolving C.

(However, I would suggest some small changes to the specification.
For example, footnote 197 seems to use the term »variable«,
but this term does not seem to be defined in N1570. So I'd
either change the wording of this footnote or define »variable«.
I would try to change implicit definitions of terms to a more
explicit wording of the form »An x is a y, so that ...«, where
»y« is a superordinate concept of x that is already known.)
 
BartC

BartC said:
Perhaps the problems and shortcomings should be summarised first. Then
they need to be agreed to be shortcomings; many of the experts here are
quite happy with the language as it is. Others will already be using
alternatives.

This has all been discussed before, for example "Has thought been
given to a cleaned up C? Possibly called C+" from March 5 2010 in clc.

Actually I've already designed a language, which does the same sorts of
things as C, and quite a bit more, but with hardly any of its
idiosyncrasies. But it's not a replacement for C, and is based on a
different syntax. I'm trying to implement it at the moment.

But I can tell you that, even if I created a curly-braced front-end for it,
it wouldn't be popular in this newsgroup. It's a struggle even to get people
here to accept binary literals in the language!

Another problem with C is that it isn't one language, it's several: (1) the
main C syntax; (2) type declaration syntax, which is in a class of its own;
(3) its macro language (a code-obfuscation tool); (4) compiler-switch options;
(5) make-file syntax. This is apart from various proprietary extensions.

If you download any open source C project, you need to become an expert in
all these, otherwise the project probably won't build.

I don't think giving the main language a make-over would make that much
difference to it. Any replacement would need to consider the whole
tool-chain, but still be targeted at the same application fields (mainly,
implementing everything else that other languages are not capable of).
 
Mark Storkamp

Rui Maciel said:
If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?


Rui Maciel

A Heuristic Algorithmic Language. Seems we're about 11 years behind
schedule on that as it is.
 
BGB

Perhaps the problems and shortcomings should be summarised first. Then
they need to be agreed to be shortcomings; many of the experts here are
quite happy with the language as it is. Others will already be using
alternatives.


a few thoughts:

"#include", maybe some sort of alternative mechanism could be devised,
like a standardized precompiled header mechanism.

a major issue is mostly that as-is, if you do:
#include <foo.h>
#include <bar.h>

then by definition declarations within "foo.h" may alter the behavior of
declarations within "bar.h", regardless of whether or not such behavior
is desired or relevant.

a better mechanism could be one which is "similar" to include, except it
leaves the matter undefined as to whether actual textual inclusion is
used, or if the headers are compiled independently (and their contents
imported later).

one idea here:
#pragma pch_standalone
or:
#pragma precompiled_header


(decided to mostly leave out analysis of mechanisms in several other
languages, namely Java and C#).

my own language uses a variant of "import" here, which mostly just
imports "modules" (more or less just "source files"), with a tree-based
organization scheme vaguely similar to the JVM package system, but sadly
also with a lot of special-case modifiers.


another issue:
the current declaration syntax (and also the cast syntax) is ambiguous
apart from knowing in advance what the types are. this makes parsing
both more problematic, and potentially slower.

one option is placing some restrictions on allowed constructions (such
as those similar to C#), which could allow for the parser to skip
looking up typedefs.

the concern here is that such a feature would not be strictly backwards
compatible, as there may be some code which could run into problems.

a specific problem with the strategy as used in C# is that it would
limit some constructions involving calling the result of a prior
expression, forcing the use of temporary variables for function pointers
in some cases.

in my own language, I had switched to an alternate cast syntax: "x as
type" and "x as! type", personally rather having the ability to curry
functions without a temporary at the cost of losing the traditional cast
syntax.


next issue:
the preprocessor allows some constructions which are only possible if
textual substitution is used, which hinders alternate strategies (such
as AST-based macro expansion).

the main imposition here (of changing this) would be being unable to
express macros with unmatched parentheses or braces.

also, block-macros would be nice.

so, rather than having to type:
#define FOO(x, y, z) \
first_line(x); \
second_line(y); \
third_line(z)

a declaration like:
#macro FOO(x, y, z)
first_line(x);
second_line(y);
third_line(z);
#endmacro

could be used.
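
for reference, the usual way to get a multi-statement macro in current C is the do { ... } while (0) idiom, which still needs the backslashes. a small sketch (SWAP_INT and order_pair are made-up names, just for illustration):

```c
/* Hypothetical multi-statement macro using the do/while(0) idiom,
 * so that the expansion behaves as a single statement. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Puts the pair in ascending order; returns 1 if a swap happened, else 0. */
int order_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);   /* one statement, so the else below still binds */
    else
        return 0;
    return 1;
}
```

a "#macro ... #endmacro" form would remove the line continuations, but the do/while wrapper (or something like it) would presumably still be needed to make the body act as one statement.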


yet another issue:
current handling in C for variable argument lists sucks.

nicer would be some alternative mechanism for these, which could
potentially allow:
not needing an argument before '...', or va_list / va_start() / va_end().

maybe some ability to determine the argument types (would be problematic
for current ABIs though).
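
for comparison, this is roughly what the current mechanism requires: a named argument before '...', the va_list / va_start() / va_end() dance, and the callee trusting the caller about the types. sum_ints is a made-up example function:

```c
#include <stdarg.h>

/* count is the mandatory named argument before '...';
 * the callee has no way to check that the rest really are ints. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);
    va_end(ap);
    return total;
}
```

a call such as sum_ints(3, 1, 2, 3) works, but passing a double or the wrong count is undefined behavior that the compiler is not required to catch.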


other things which could be nice:

ability to put "_" in numbers as a separator, as in "0x0123_4567_89AB_CDEF"

multi-line strings (like in Lua, Python, and my own language).

example:
"""
this is a string
which spans lines...
"""
or (a syntax my language uses):
<[[this is a string
which spans lines...]]>
rather than:
"this is a string\n"
"which spans lines..."
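
the "rather than" form relies on adjacent string literals being concatenated by the compiler, which C already does. a minimal sketch (spanning_text is a made-up name):

```c
/* Adjacent string literals are joined at translation time,
 * so this is a single string containing one embedded newline. */
static const char spanning_text_data[] =
    "this is a string\n"
    "which spans lines...";

/* Returns the concatenated literal. */
const char *spanning_text(void)
{
    return spanning_text_data;
}
```

it works, but every line needs its own quotes and an explicit "\n", which is the part a real multi-line literal would avoid.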



far less certain features (wouldn't work well with current compilers or
ABIs):

maybe references and operator overloading could be nice.
maybe support for variant types.

maybe also an object system, or features to aid in implementing an
object system (such as via operator overloading), but this would be
getting a bit far outside of C domain (as would be features like eval, ...).

admittedly, I am not fond of the C++ OO facilities, but would like it if
it were more possible for people to "roll their own", such as by
overloading "->" or similar.

for example, if this could work:
variant operator->(MyType obj, char *name)
{ ... }

or, further (syntax for defining a custom method handler):
variant operator->(MyType obj, char *name)
{ return function(variant[] args...) { ... }}

(where "function" would be a "true" closure, as in, one which retains
the parent scope following the parent function returning).

what this would look like in my own language's syntax:
function operator.(obj:MyType, name:string):variant
function(args...) { ... }

(although doing something like this isn't really needed in my case, as
there are more specialized features, like "function get*()" and similar).


likewise, it would also be nice if C could (optionally) support things
like late-binding and link-time type specialization (these would likely
be mostly transparent at the language level, but could have a more
significant impact on the compiler).


but, it is all not as big of a deal for me, since I can roll my own
language and have whatever features I want in it...

a downside is that my own language can't currently really compete with
C, especially at present WRT things like performance (or, sadly, being
entirely free of bugs in the implementation).

the performance issue is partly due to the problem of my own difficulties
creating or maintaining a reliable native code generator, leaving me
mostly stuck with much slower options: namely the use of interpreters
and threaded-code, and even then still trying to nail down all the bugs
(and having an annoyingly large bytecode ISA, roughly about 600 opcodes
at present).

even then, it is still likely that my language would require a fair
amount of run-time support, and so would be ill-suited to non-hosted
environments (unless restricted to a more strict C-like subset).

so, C remains as my primary development language.
 
Kaz Kylheku

I'd decline this task. I am not capable of designing something
better. I'd propose to keep evolving C.

I'd propose to hang C on the shelf and keep evolving C++.
 
BartC

BGB said:
On 4/29/2012 10:44 AM, BartC wrote:
"#include", maybe some sort of alternative mechanism could be devised,
like a standardized precompiled header mechanism.

a major issue is mostly that as-is, if you do:
#include <foo.h>
#include <bar.h>

then by definition declarations within "foo.h" may alter the behavior of
declarations within "bar.h", regardless of whether or not such behavior is
desired or relevant.

a better mechanism could be one which is "similar" to include, except it
leaves the matter undefined as to whether actual textual inclusion is
used, or if the headers are compiled independently (and their contents
imported later).

The #include mechanism can be left alone. But for using library functions,
there would just be something like:

#import bar

As I've implemented this elsewhere, this defines the imported names from bar
in a protective scope, forming a namespace that ought not to clash with
anything that happens in foo. This does require that the compilation of
bar/foo produces a special file containing all the exported names, and that
that compilation has already been done. (That makes mutual imports/exports
between two modules tricky. That's why you leave the old #include in
place...)

other things which could be nice:

ability to put "_" in numbers as a separator, as in
"0x0123_4567_89AB_CDEF"

At least two or three major versions of C over twenty years, and no-one
bothers to add in trivial stuff like this which takes five minutes.

multi-line strings (like in Lua, Python, and my own language).

Doesn't C already have that? You do need to have the portion on each line
fully quoted.

but, it is all not as big of a deal for me, since I can roll my own
language and have whatever features I want in it...

Same here...
 
BGB

The #include mechanism can be left alone. But for using library
functions, there would just be something like:

#import bar

As I've implemented this elsewhere, this defines the imported names from
bar in a protective scope, forming a namespace that ought not to clash
with anything that happens in foo. This does require that the
compilation of bar/foo produces a special file containing all the
exported names, and that that compilation has already been done. (That
makes mutual imports/exports between two modules tricky. That's why you
leave the old #include in place...)

except #import is already used by C++, and is technically different from
include.

a hack involving precompiled headers and a #pragma could allow C to use
a more efficient mechanism while remaining backwards compatible and
still conforming to the C standard.


example(mylib.h):
<--
#pragma precompiled_header //compiler hint to use PCH
#pragma once //compiler hint to only include once

#ifndef MYLIB_H //good old header guard (for everyone else)
#define MYLIB_H
....
#endif
-->

(bar.c)
<--
#include <mylib.h> //doesn't look any different than before
....
-->

so, what happens in this case is that the preprocessor starts going,
sees the header, and sees the #pragma, and is all like "my work is done
here" and inserts a command into the preprocessor output telling the
compiler to load the precompiled header instead (or however exactly it
ends up being implemented).

in the likely worst case, #include is no slower than before.

At least two or three major versions of C over twenty years, and no-one
bothers to add in trivial stuff like this which takes five minutes..

yeah, my language uses this syntax.
it would be nice in C for making things like 64-bit constants easier to
read.

Doesn't C already have that? You do need to have the portion on each
line fully quoted.

the problem in C is that it is ugly-looking, and makes it awkward to
type large multi-line string constants.

it also doesn't help that compilers like MSVC will barf if a string
constant is larger than about 4096 characters or so (so, then one has
the fun of not only having nasty multi-line strings, but such strings in
an array).

as-is, mostly these are used for inlining globs of ASM and BGBScript
code into C (sometimes tool-aided, mostly so that I can type the code
into stand-alone text files without needing all those nasty quotes and
"\n" escapes).


however, with multi-line string literals, I would no longer need a
separate tool for this, and could just be like:
char *mylib_bigGlobOfASM=
"""
....
lots of code...
....
lots more code...
....
""";


although, the compiler for my language currently has a limit of 64k
characters for a multi-line string literal, but I could potentially
change this later (such as maybe allowing several MB or more in a large
string literal), but at this point it would almost make sense to devise
some sort of data-compression feature (such as an LZ77 variant) for
large string literals.

however, even 64k is "pretty damn large" WRT a string literal (a person
is hard-pressed to exceed this by writing code by hand, whereas 4k is a
much easier limit to run into), but this could happen if folding
data-files into source-code or similar.


my language allows both triple-quotes and a custom "<[[ ... ]]>" syntax,
which mostly has the merit of being nestable provided it is evenly
matched. although slightly uglier IMO than the "[[ ... ]]" syntax used
in Lua, the merit is that it doesn't clash with the syntax for nested
arrays (since arrays use the syntax "[ ... ]" in my language, mostly due
to its ECMAScript heritage).

the reasoning also being that "<[[" and "]]>" were otherwise very
unlikely to show up normally, and even then, they are not real tokens by
themselves, and so will only be recognized in certain contexts (such as
when parsing a literal).

Same here...

yep.
 
James Kuyper

On 4/29/2012 4:37 PM, BartC wrote: .... ....
except #import is already used by C++, and is technically different from
include.

When did they add that, and what does it mean? I haven't kept up with
C++, my latest draft standard version is n3035, dated 2010-02-16; it
makes no mention of #import.
 
BGB

When did they add that, and what does it mean? I haven't kept up with
C++, my latest draft standard version is n3035, dated 2010-02-16; it
makes no mention of #import.

ok, it is not standard, but both MSVC and GCC seem to have support for
it, so either way...


both make mention of it:

http://msdn.microsoft.com/en-us/library/8etzzkb6(v=vs.71).aspx

http://gcc.gnu.org/onlinedocs/gcc-3.2/cpp/Obsolete-once-only-headers.html


it does basically the same thing as #include but only includes the file
once.


however, I would be less fond of using the directive, as either way:
it doesn't have exactly the intended semantics, so a secondary mechanism
would still be needed;
it isn't compatible with compilers which don't support it, and something
like:

#ifdef _SOMECC
#import <foo.h>
#else
#include <foo.h>
#endif

isn't really a desirable solution.
it seems better IMO if a single solution could be used, with the option
of optionally improving performance on certain implementations.
 
Guest

here i not find the google sys for NG
so google not support NG?

if you want a proper reply make it clearer what you are asking.

this was posted from Google Groups. Does that answer your question?
 
Malcolm McLean

On Sunday, 29 April 2012 17:17:29 UTC+1, Rui Maciel wrote:
If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?

Serialisation needs to be an inherent part of the language, with the ability to serialise to backing store, memory, or a port, in a cross-platform and interoperable way.
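
As things stand, interoperable serialisation has to be written by hand, one field at a time. A minimal sketch for a single 32-bit value, fixing the byte order so the result is the same on any host; put_u32_be and get_u32_be are hypothetical names, not from any standard library:

```c
#include <stdint.h>

/* Writes v as four bytes, most significant first,
 * regardless of the host's own byte order. */
void put_u32_be(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v >> 24);
    out[1] = (unsigned char)(v >> 16);
    out[2] = (unsigned char)(v >> 8);
    out[3] = (unsigned char)v;
}

/* Reads back a value written by put_u32_be. */
uint32_t get_u32_be(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

Every struct needs this kind of code repeated for each member, which is the work a language-level facility would take over.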

Libraries need to be a lot easier to install and use.

1) We need an end to the type problem. For instance every library that does 3d graphics will define some version of Point, point3, vector3, or so on. None of these work together.

2) We need an end to the stickiness problem. I recently asked for a "convolve" function in C which did fast convolution with Fourier transforms. I was directed to a massive library, OpenCV. My code is meant to run as a Matlab mex function (a Matlab-callable function written in C). In the event I got a KISS fft from the net and wrote my own.

3) We need a simple, consistent method for specifying libraries, installing them, and linking them.

We also need a bigger standard library. Things like hash tables, regular expressions, directory functions, port access for internet programming, SQL, and the ability to open a graphics window shouldn't have to rely on 3rd party libraries.
 
