If you were given the task to design a replacement for the C programming
language intended to fix all its problems and shortcomings, what would you
propose?
void. I would get rid of void. No "void func()", no (void) argument list, no
void * pointers. Functions with no return value could be declared/defined with
the "proc" keyword. Generic pointer would be "byte *", where byte would be a
type built in to the language, exactly like unsigned char, and explicit
conversion by cast would be required in both directions.
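For comparison, here is a rough sketch of how the "byte *" idea can be approximated in today's C. The typedef and the function names byte_copy and copy_int are made up for illustration; they are not part of the proposal, and C still needs void for the function with no return value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the proposed built-in type. */
typedef unsigned char byte;

/* Copy n bytes through the generic pointer type.
   In the proposed language this would be declared with "proc". */
void byte_copy(byte *dst, const byte *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Copy one int, casting explicitly in both directions. */
int copy_int(int value)
{
    int out = 0;
    byte *to = (byte *) &out;                 /* int * to byte *: explicit cast */
    const byte *from = (const byte *) &value;

    byte_copy(to, from, sizeof out);
    return *(int *) to;                       /* byte * back to int *: explicit cast */
}
```

Unlike void *, no conversion here happens silently: dropping either cast is a constraint violation that the compiler must diagnose.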
More consistent syntax. For instance function parameter decls separated
by semicolons, not commas, allowing:
proc foo(int a, b, c; double *p)
I would smear the distinction between statements and expressions. Statements
would be allowed to return values, so for instance, this would be valid:
int x = if (a > b) c; else d;
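In today's C the closest thing is the conditional operator, which only covers the if/else case. A small sketch showing the two forms side by side (the function name pick is just for illustration):

```c
#include <assert.h>

/* Returns c when a > b, otherwise d. */
int pick(int a, int b, int c, int d)
{
    /* Expression form: the only way to initialize x directly in C. */
    int x = (a > b) ? c : d;

    /* Statement form: needs a separate declaration and two assignments. */
    int y;
    if (a > b)
        y = c;
    else
        y = d;

    assert(x == y);   /* both forms compute the same value */
    return x;
}
```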
Conformance. I would distinguish warning and error diagnostics in the spec,
and forbid required diagnostics from being mere warnings. An implementation
which translates a unit that requires a diagnostic, allowing it to be linked
and executed, would be considered nonconforming.
Introspection. The language would have an API for run time introspection.
I don't want to start writing details about this, but suffice it to say that
the introspection would be sufficient that it could allow a program in the
language to implement a precisely tracing garbage collector (even in the
presence of threads) without resorting to any platform-specific assembly
language or other hacks. Introspection would allow the GC to walk stacks and
know exactly where the live variable values are.
I would strengthen arrays without the disadvantages of making them more
encapsulated. There would be a way to define an integral object which indicates
the size of an array referenced by a pointer:
struct vec {
    double *dynamic : int size;
    double *fixed : 1; /* pointer to just one double */
};
Such a definition introduces two objects, so that struct vec above has
a member called size of type int. The compiler understands that the
size of the array pointed at by v->dynamic is given by int size.
This would be available in function parameters also:
double dotproduct(double *v1 : int size_v1, *v2 : int size_v2);
Now when we have a sized vector, we can do this:
dotproduct(v1->dynamic, v2->fixed);
The compiler knows that the size of v1->dynamic (where v1 points to a struct
vec) is given by v1->size. So, automatically, it passes v1->size as
the value for the size_v1 formal parameter. For size_v2, it passes 1.
If you pass an unsized pointer to a sized parameter, the size parameter is not
filled in. It must be specified explicitly:
dotproduct(some_pointer : 4, v2->fixed);
If a pointer is derived from an array, its size is inferred from the array.
All dataflows involving pointers would automatically carry the size
information, if any, propagating it between the size objects:
v1->dynamic = v2->dynamic; /* v1->size = v2->size is implicit! */
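To make the bookkeeping concrete, here is roughly what the compiler would be doing on our behalf, written out by hand in today's C. The struct sized_dptr and the function dot_demo are hypothetical names for this sketch. The proposal does not say what dotproduct does when the two sizes differ; the sketch assumes it uses the shorter one.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written equivalent of "double *ptr : int size". */
struct sized_dptr {
    double *ptr;
    int size;
};

/* The size arguments travel with the pointers instead of being implicit. */
double dotproduct(struct sized_dptr v1, struct sized_dptr v2)
{
    assert(v1.size >= 0 && v2.size >= 0);

    /* Assumption of this sketch: use the shorter of the two vectors. */
    int n = v1.size < v2.size ? v1.size : v2.size;
    double sum = 0.0;

    for (int i = 0; i < n; i++)
        sum += v1.ptr[i] * v2.ptr[i];
    return sum;
}

double dot_demo(void)
{
    double a[3] = { 1.0, 2.0, 3.0 };
    double b = 4.0;
    struct sized_dptr dynamic = { a, 3 };   /* like v1->dynamic with v1->size == 3 */
    struct sized_dptr fixed = { &b, 1 };    /* like v2->fixed, whose size is 1 */
    struct sized_dptr copy;

    copy = dynamic;   /* pointer and size are copied together */
    return dotproduct(copy, fixed);
}
```

The struct assignment plays the role of the implicit v1->size = v2->size: the pointer can never be copied without its size.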
I would drop variable length arrays. They could be replaced by sized pointers
which are initialized using a notation:
int *p : int s = [42];
This [42] is an initializer which means: dynamically allocate automatic
storage for 42 objects of the pointed-to type (here 42 * sizeof (int) bytes),
yielding null if that is not possible. The variable s receives the value 42 if
the allocation succeeds, otherwise zero.
Of course, the s is optional:
{
    int *p = [42]; // make auto array of 42 ints, or fail with null
}
// out of scope: array is now gone
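Standard C has no way to ask for automatic storage and get null back on failure, so the following is only an emulation of the semantics, using a small fixed pool released in LIFO order. Everything here (POOL_INTS, struct auto_ints, auto_alloc, auto_release) is invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_INTS 64              /* arbitrary capacity for the sketch */

static int pool[POOL_INTS];       /* stands in for the stack */
static size_t pool_top;           /* number of ints currently in use */

/* Hand-written equivalent of "int *p : int s". */
struct auto_ints {
    int *ptr;
    int size;
};

/* Emulates "int *p : int s = [n];": null and size 0 on failure. */
struct auto_ints auto_alloc(int n)
{
    struct auto_ints r = { NULL, 0 };

    if (n > 0 && (size_t) n <= POOL_INTS - pool_top) {
        r.ptr = &pool[pool_top];
        r.size = n;
        pool_top += (size_t) n;
    }
    return r;
}

/* Emulates leaving the scope; must be called in reverse order of allocation. */
void auto_release(struct auto_ints a)
{
    assert(a.size >= 0 && (size_t) a.size <= pool_top);
    pool_top -= (size_t) a.size;
}
```

In the proposed language the release step would be implicit at the closing brace, which is the whole attraction.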
The nice thing is that we can now pass p into functions and, through the
size-propagation logic, the functions can know the size. This holds even if we
wrap p inside a data structure:
struct obj {
    int *ptr : int size;
};
proc foo(struct obj);
int bar(int *ptr : int size);
/* ... */
{
    int *p = [42];
    struct obj obj;
    obj.ptr = p; // implicit: obj.size = 42;
    foo(obj); // foo knows size
    bar(p); // bar receives size
}
Pointer arithmetic would also propagate the size, if possible. (Which is part
of the point.)
If p is a size-attributed pointer of constant size s, then p + n is either
erroneous, if this is out of bounds, or it produces a new pointer displaced
by n, whose size attribute is the value s - n.
If p is a size-attributed pointer with a variable size stored in object s, then
p + n evaluates s to determine whether p + n is out of bounds.
If it is not out of bounds, then the displaced pointer p + n is produced,
and its size attribute is the rvalue s - n. (No provision for safety for
negative displacements would be provided.)
Example:
{
    int *p : int size = [42];
    p++; /* size is now 41 */
    p -= 2; /* UB. */
}
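The rule for p + n can be written out by hand in today's C along these lines. The names struct sized_iptr and sized_add are hypothetical. Two choices are mine, not the proposal's: the sketch reports an out-of-bounds displacement through the return value rather than being erroneous, and it simply rejects negative n, since no safety is promised for that case. It also treats n equal to the size as in bounds, yielding a one-past-the-end pointer of size 0, as C does.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written equivalent of "int *ptr : int size". */
struct sized_iptr {
    int *ptr;
    int size;
};

/* Emulates p + n. Returns 1 and stores the displaced pointer in *out
   when the result is in bounds; returns 0 and leaves *out alone otherwise. */
int sized_add(struct sized_iptr p, int n, struct sized_iptr *out)
{
    assert(out != NULL);

    if (n < 0 || n > p.size)
        return 0;

    out->ptr = p.ptr + n;
    out->size = p.size - n;   /* the size attribute shrinks by n */
    return 1;
}
```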
Arrays would be first class citizens: passed into functions, returned from
functions. Array syntax in a function argument list would not denote a
pointer. Size mismatches would be diagnosed. An explicit cast would override the
size mismatch, resulting in truncating or zero-padding semantics.
/* func takes one int, returns array of 3 int. */
int (func[3])(int a)
{
    int x[2] = { 1, 2 };
    return x; /* error, size mismatch */
    return (int[3]) x; /* return value is padded with zero */
}
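Today's C can already pass and return arrays by value if they are wrapped in a struct, which gives a feel for the semantics. In this sketch the wrapper types int2 and int3 and the helper widen are invented names; widen stands in for the zero-padding cast.

```c
#include <assert.h>
#include <string.h>

/* Arrays wrapped in structs so that they can be copied by value. */
struct int2 { int v[2]; };
struct int3 { int v[3]; };

/* Emulates the cast (int[3]) x: copy the elements, pad the rest with zero. */
struct int3 widen(struct int2 x)
{
    struct int3 r = { { 0, 0, 0 } };

    assert(sizeof x.v <= sizeof r.v);
    memcpy(r.v, x.v, sizeof x.v);
    return r;
}

/* Takes one int, returns an array of 3 int (inside a struct). */
struct int3 func(int a)
{
    struct int2 x = { { a, a + 1 } };

    /* "return x;" would not compile here: struct int2 is not struct int3. */
    return widen(x);
}
```

The size mismatch is diagnosed for free because the two wrappers are distinct types, but each array size needs its own wrapper, which is the clumsiness the proposal would remove.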
I would provide a simple namespacing scheme based on textual gluing of prefixes
onto identifiers.
On the definition side:
prefix mylib_ {
    int open(char *);
    int close(int);
};
The above is precisely equivalent to:
int mylib_open(char *);
int mylib_close(int);
On the use side:
prefix mylib_ open, close;
This means that if the names mylib_open and mylib_close are visible in this
scope, the short names open and close now stand for mylib_open and mylib_close.
If no such names are visible, it is erroneous.
Something similar would be provided for preprocessor symbols (if I actually
kept the preprocessor as such).
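The use side can be crudely imitated with the existing preprocessor, which also shows why a real language feature would be better. In this sketch the bodies of mylib_open and mylib_close are placeholders of my own (the "handle" is just the length of the name), and use_short_names is a made-up caller.

```c
#include <assert.h>
#include <string.h>

/* Definition side, written out in full as the prefix block would generate it.
   The bodies are placeholders for the sketch. */
int mylib_open(char *name)
{
    assert(name != NULL);
    return (int) strlen(name);
}

int mylib_close(int handle)
{
    return handle >= 0 ? 0 : -1;
}

/* Use side: rough imitation of "prefix mylib_ open, close;". */
#define open mylib_open
#define close mylib_close

int use_short_names(void)
{
    char name[] = "abc";
    int h = open(name);    /* expands to mylib_open(name) */

    return close(h);       /* expands to mylib_close(h) */
}

/* Macros ignore scope, so the short names must be removed by hand. */
#undef open
#undef close
```

The macros do not check that mylib_open is actually visible and they leak past the block that wanted them, two things the proposed prefix declaration would handle properly.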
Control flow. I would provide a non-local dynamic return mechanism with
cleanup (unwind protect). Any function or block would be able to execute
cleanup code if it is terminated by a nonlocal transfer.
Some kind of exception handling would be provided.
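What has to be done by hand today looks something like the following setjmp/longjmp sketch, which is only one possible emulation and not how the feature would be specified. All names (handler, pending, raise_code, with_resource, run_protected, and the counter open_resources standing in for a real resource) are invented for illustration.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

jmp_buf *handler;      /* innermost active catch point, or NULL */
int pending;           /* code carried by the transfer in flight */
int open_resources;    /* stands in for files, locks, memory, ... */

/* Nonlocal transfer to the innermost catch point. */
void raise_code(int code)
{
    assert(handler != NULL);
    pending = code;
    longjmp(*handler, 1);
}

/* Unwind protect: the release step runs on normal and nonlocal exit. */
void with_resource(void (*work)(void))
{
    jmp_buf here;
    jmp_buf *outer = handler;   /* set before setjmp, never modified after */

    open_resources++;           /* acquire */
    handler = &here;
    if (setjmp(here) == 0) {
        work();
        handler = outer;
        open_resources--;       /* cleanup on normal exit */
    } else {
        handler = outer;
        open_resources--;       /* cleanup on nonlocal exit */
        raise_code(pending);    /* keep unwinding to the outer catch point */
    }
}

/* Crude "try": returns 0 on normal completion, else the raised code. */
int run_protected(void (*work)(void))
{
    jmp_buf here;
    jmp_buf *outer = handler;

    handler = &here;
    if (setjmp(here) == 0) {
        with_resource(work);
        handler = outer;
        return 0;
    }
    handler = outer;
    return pending;
}

void failing_work(void) { raise_code(7); }
void quiet_work(void) { }
```

Every function that owns a resource has to repeat the save, setjmp, restore dance, and forgetting it anywhere silently leaks; that is the argument for building the mechanism into the language.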
Named blocks for breaking out of nested constructs.
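In today's C the usual substitute is goto with a label placed after the outer loop. A small sketch (the function find_in_grid and its return convention are made up for the example):

```c
#include <assert.h>

/* Searches a rows x 3 grid for target.
   Returns row * 3 + column of the first match, or -1 if there is none. */
int find_in_grid(int grid[][3], int rows, int target)
{
    int found = -1;

    assert(rows >= 0);
    for (int r = 0; r < rows; r++) {
        for (int c = 0; c < 3; c++) {
            if (grid[r][c] == target) {
                found = r * 3 + c;
                goto done;   /* stands in for breaking out of a named block */
            }
        }
    }
done:
    return found;
}
```

With named blocks the label would sit on the outer loop itself and the exit would be a break naming it, so the reader would not have to hunt for where the goto lands.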