Why do you like C more than other programming languages?

santosh · Jul 26, 2008

Antoninus said:
Obviously for arithmetic types, +, -, etc. are clearer.

For abstract types, operator overloading does lead to completely
incomprehensible code. You've only got to look at the first operators
people learn in their C++ hello world program: << and >> absurdly
"generalized" to concatenation operators that have no conceptual
relationship to bit-shifts. If you give people the power to overload
operators on arbitrary types, then they *will* abuse it, even the
supposedly brilliant Strustrup.

You have a habit of intentionally misspelling names, don't you?

Ben Bacarisse · Jul 26, 2008

As I have said before, the real heart of computer programming is
abstraction, in which we go from "here is a series of very explicit
and tiny steps to take to solve some particular problem" to "please
solve this particular problem for me". Finding the right abstractions,
and putting them together into a single "problem solving box", as
it were, allows us to take the problem-solving boxes and put *those*
together, to solve bigger and more interesting problems. Soon,
instead of "compute quadratic root", you may be doing things like
"obtain this food I would like to eat" or "find better way to deal
with energy crisis". (Quadratic equations will probably be involved
somewhere along the way.)

Syntactic sugar, whether in the form of operator overloading or
function overloading or anything else, is merely a small (albeit
useful) tool in doing the abstraction that allows us to solve more
interesting problems. Try not to get too hung up over syntax;
semantics matter more, and proper abstraction makes the semantics
clearer.

As you say, semantics are the key. C is a bit thin when it comes to
semantics for expressing sophisticated abstractions, but then it is
not trying to be ML or Haskell.

There is one bit of semantics I would very much like: partial
application. In languages with higher-order functions, this comes for
"free" (free in that it must be there, but there is a cost to the
implementation and the programmer). C does not have nested functions
so a good part of the cost (often called building a lexical closure)
is not required and I am sure[1] that a reasonably cheap implementation
would be possible.

Why would one want it? The main reason is to avoid global variables
in functions that need to know more than their parameters (at the
point of call). The classic example is a comparison function. To
build a complex, controllable sort, one needs either an unfeasibly
large family of comparison functions or a single one controlled by
external data.
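A minimal sketch of the problem being described, assuming a hypothetical struct person and hypothetical names throughout: because qsort passes the comparison function only the two elements, anything else it needs to know has to come from file-scope state.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record type, used only for illustration. */
struct person { const char *name; int age; };

enum sort_key { BY_NAME, BY_AGE };

/* The "external data" that steers the single comparison function. */
static enum sort_key current_key = BY_NAME;
static int descending = 0;

static int compare_people(const void *a, const void *b)
{
    const struct person *pa = a, *pb = b;
    int r;

    if (current_key == BY_NAME)
        r = strcmp(pa->name, pb->name);
    else
        r = (pa->age > pb->age) - (pa->age < pb->age);
    return descending ? -r : r;
}

/* Sets the globals, then sorts. Not safe if two threads sort at once. */
void sort_people(struct person *v, size_t n, enum sort_key key, int desc)
{
    assert(v != NULL || n == 0);
    current_key = key;
    descending = desc;
    qsort(v, n, sizeof *v, compare_people);
}
```

A call such as sort_people(v, n, BY_AGE, 0) works, but the sort order now lives in shared state rather than in the call itself, which is exactly what partial application would avoid.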

Partial application solves this problem in the following way: Given a
function with this prototype:

RT function(T1 p1, T2 p2, T3 p3);

A function call like function(a1) is an expression that evaluates to a
pointer to a function. The type being:

RT (*)(T2, T3)

Similarly, function(a1)(a2) == function(a1, a2) are both pointers of
type RT (*)(T3).

Because the parameter passing/access mechanism is likely to be
slightly more expensive than for ordinary parameters, it would be
acceptable to identify those initial parameters that may be passed
in this new way by, for example, marking them off with a ";":

void my_func(int a, double b; char *s);

This fits the C philosophy of not paying for things you don't use. Of
course, if it is unimplementable without huge cost, then it is dead in
the water, but it would give C an expressive boost that is more
significant than operator overloading, particularly in the brave new
world of multi-threaded programs where "global" data is so dangerous.

I hope this wild speculation is permissible as a weekend c.l.c folly.

[1] I am not a compiler writer, but I did write a toy compiler for
a C-like language that had this facility and it seemed to be "cheap".
Maybe that is too little to be sure, but it is suggestive.

Chris Torek · Jul 26, 2008

(This is drifting fairly far off topic and probably should move
to comp.programming or some such. But for now...)

There is one bit of semantics I would very much like: partial
application. ... C does not have nested functions

C does not, but GNU C does.

so a good part of the cost (often called building a lexical closure)
is not required and I am sure[1] that a reasonably cheap implementation
would be possible.

The two most common (in my experience) parameter-passing
mechanisms used in C are "on the (typically hardware-provided) stack"
and "in registers". If one were to borrow GCC's "trampoline"
technique for its nested functions, I think you are correct here.
But there is one "gotcha".

Partial application solves this problem in the following way: Given a
function with this prototype:

RT function(T1 p1, T2 p2, T3 p3);

A function call like function(a1) is an expression that evaluates to a
pointer to a function. The type being:

RT (*)(T2, T3)

[and so on]

On machines with "parameters on the stack", the parameters are
typically pushed in reverse. On machines with "parameters in
registers", the first N parameters (for some machine-dependent
constant N) are in hardware registers -- though this gets more
complicated for machines with separate floating-point registers --
and then any excess parameters are delivered on a stack, or in a
"parameter block" pointed to by yet another register, or similar.

Note that in the proposal you have above, partial application
applies exclusively to the "front-most" (or left-most) parameters.
If you supply one argument, it must be the first one, and you are
left with a pointer to a function taking the remaining two arguments;
if you supply two, they are the first two, and you get a pointer
to a function taking just the third.

For stack-based systems, however, one "wants" (internally) for the
applied parameters to start at the end and work towards the beginning.
(For register-based systems, there are no such restrictions, as
long as you do not go past the number of "register-ized" parameters.)

Imagine we are generating code for a stack-based system. Suppose
also that we have the ability to create code at runtime, by shoving
instructions into memory and then using whatever facilities the
system provides to mark the new section as "executable code". When
the compiler sees function(a1), it can then emit, e.g.:

push #24 /* after calculating that we need 24 bytes */
call __get_memory_for_trampoline /* result in a0 */

mov a0, a1 /* save result */

mov #POP_TO_TMP_REG, (a1)+ /* pops return address */
mov #PUSH, (a1)+ /* push-with-value */
mov arg1, (a1)+ /* the parameter we want */
mov #PUSH_TMP_REG, (a1)+ /* restore return-address */
mov #JUMP, (a1)+ /* then jump-indirect */
mov #function, (a1) /* to target function */

/* now translate from "RAM whose address is in a0" to
"RAM containing executable code" (returned code-address
is also in a0, which may or may not differ from the
supplied data-address) */
push #24 /* size of region, again */
push a0
call __translate_to_executable

mov a0, -20(fp) /* save resulting function pointer */

What this means, though, is that our "pre-evaluated" argument is
now supplied as the *last* argument to the function, because the
"not pre-evaluated" arguments are pushed by the indirect call:

push arg2
push arg3
mov -20(fp),a0
call (a0) /* call the trampoline */

Presumably this is why you suggested adding a special marker syntax:

Because the parameter passing/access mechanism is likely to be
slightly more expensive than for ordinary parameters, it would be
acceptable to identify those initial parameters that may be passed
in this new way by, for example, marking them off with a ";":

void my_func(int a, double b; char *s);

(On the other hand, perhaps you were thinking of doing this without
trampolines, by "smuggling" an extra "void *" argument pointing to
the equivalent of a closure block.)
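The "smuggled void *" alternative can be approximated in standard C today, if the sorting routine cooperates. The following is only a sketch under that assumption: sort_with_context, cmp_ctx_fn and struct int_order are hypothetical names, and the insertion sort is chosen for brevity, not speed.

```c
#include <assert.h>
#include <stddef.h>

/* Comparison callback that also receives a caller-supplied context. */
typedef int (*cmp_ctx_fn)(const void *a, const void *b, void *ctx);

static void swap_bytes(unsigned char *a, unsigned char *b, size_t size)
{
    while (size--) {
        unsigned char t = *a;
        *a++ = *b;
        *b++ = t;
    }
}

/* Insertion sort that forwards ctx unchanged to every comparison. */
void sort_with_context(void *base, size_t n, size_t size,
                       cmp_ctx_fn cmp, void *ctx)
{
    unsigned char *p = base;

    assert(size > 0);
    for (size_t i = 1; i < n; i++)
        for (size_t j = i;
             j > 0 && cmp(p + (j - 1) * size, p + j * size, ctx) > 0;
             j--)
            swap_bytes(p + (j - 1) * size, p + j * size, size);
}

/* The "closure block": what the comparison needs beyond its operands. */
struct int_order { int descending; };

int compare_ints(const void *a, const void *b, void *ctx)
{
    const struct int_order *order = ctx;
    int x = *(const int *)a, y = *(const int *)b;
    int r = (x > y) - (x < y);

    return order->descending ? -r : r;
}
```

Here the context pointer plays the role of the pre-applied argument, with no global state and no run-time code generation. The cost is that every API taking a callback must be designed to carry the extra pointer.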

Ben Bacarisse · Jul 27, 2008

Chris Torek said:
(This is drifting fairly far off topic and probably should move
to comp.programming or some such. But for now...)

Maybe one more...

There is one bit of semantics I would very much like: partial
application. ... C does not have nested functions

so a good part of the cost (often called building a lexical closure)
is not required and I am sure[1] that a reasonably cheap implementation
would be possible.

The two most common (in my experience) parameter-passing
mechanisms used in C are "on the (typically hardware-provided) stack"
and "in registers". If one were to borrow GCC's "trampoline"
technique for its nested functions, I think you are correct here.
But there is one "gotcha".

I think there may be another... :-(

Partial application solves this problem in the following way: Given a
function with this prototype:

RT function(T1 p1, T2 p2, T3 p3);

A function call like function(a1) is an expression that evaluates to a
pointer to a function. The type being:

RT (*)(T2, T3)

[and so on]

On machines with "parameters on the stack", the parameters are
typically pushed in reverse. On machines with "parameters in
registers", the first N parameters (for some machine-dependent
constant N) are in hardware registers -- though this gets more
complicated for machines with separate floating-point registers --
and then any excess parameters are delivered on a stack, or in a
"parameter block" pointed to by yet another register, or similar.

Note that in the proposal you have above, partial application
applies exclusively to the "front-most" (or left-most) parameters.
If you supply one argument, it must be the first one, and you are
left with a pointer to a function taking the remaining two arguments;
if you supply two, they are the first two, and you get a pointer
to a function taking just the third.

For stack-based systems, however, one "wants" (internally) for the
applied parameters to start at the end and work towards the beginning.
(For register-based systems, there are no such restrictions, as
long as you do not go past the number of "register-ized" parameters.)

Imagine we are generating code for a stack-based system. Suppose
also that we have the ability to create code at runtime, by shoving
instructions into memory and then using whatever facilities the
system provides to mark the new section as "executable code". When
the compiler sees function(a1), it can then emit, e.g.:

push #24 /* after calculating that we need 24 bytes */
call __get_memory_for_trampoline /* result in a0 */

mov a0, a1 /* save result */

mov #POP_TO_TMP_REG, (a1)+ /* pops return address */
mov #PUSH, (a1)+ /* push-with-value */
mov arg1, (a1)+ /* the parameter we want */
mov #PUSH_TMP_REG, (a1)+ /* restore return-address */
mov #JUMP, (a1)+ /* then jump-indirect */
mov #function, (a1) /* to target function */

/* now translate from "RAM whose address is in a0" to
"RAM containing executable code" (returned code-address
is also in a0, which may or may not differ from the
supplied data-address) */
push #24 /* size of region, again */
push a0
call __translate_to_executable

mov a0, -20(fp) /* save resulting function pointer */

What this means, though, is that our "pre-evaluated" argument is
now supplied as the *last* argument to the function, because the
"not pre-evaluated" arguments are pushed by the indirect call:

push arg2
push arg3
mov -20(fp),a0
call (a0) /* call the trampoline */

Presumably this is why you suggested adding a special marker syntax:

Yes, I don't really mind if the "pre-applied" args were allowed only at
the end (though this rules out rather useful stdarg-based functions).

(On the other hand, perhaps you were thinking of doing this without
trampolines, by "smuggling" an extra "void *" argument pointing to
the equivalent of a closure block.)

My ideas were not even half baked. The trampoline is a neat idea (who
thought of it?) but it works for nested functions, I think, because
the trampoline only needs to exist during the execution of one lexical
scope. This means that it can be allocated on the stack, or if it
needs to be statically allocated, its "settings" can be adjusted at
entry to the scope in question.

The pointer that results from function(a1) can be squirreled away and
used at any time. This suggests that the trampoline has to be a
garbage-collected object. Not the right sort of idea for C. Forgive
me if I've missed something in your careful outline and you have
finessed this problem already. I confess to not understanding the
trampoline idea fully.

In my C-like language that caused me to suspect an efficient
implementation was possible, the pre-applied args were part of a sort
of extended function pointer. function(a1) was bigger than function
by exactly the size of a1 (aligned and so forth) and function(a1, a2)
was bigger still. This does not fit with C's rule of function
pointers being interchangeable (by cast).
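The "extended function pointer" idea can be imitated by hand in standard C. This is only a rough sketch, not the toy compiler's actual representation: fn3, struct bound1, struct bound2, bind1, bind2, call1, call2 and scale_add are all hypothetical names, with int, double, double standing in for T1, T2, T3.

```c
#include <assert.h>

/* Stand-in for RT function(T1, T2, T3). */
typedef double (*fn3)(int, double, double);

/* function(a1): the function pointer plus one stored argument. */
struct bound1 { fn3 fn; int a1; };

/* function(a1, a2): bigger again by the size of a2 (plus padding). */
struct bound2 { fn3 fn; int a1; double a2; };

struct bound1 bind1(fn3 fn, int a1)
{
    struct bound1 b = { fn, a1 };
    return b;
}

struct bound2 bind2(struct bound1 b, double a2)
{
    struct bound2 r = { b.fn, b.a1, a2 };
    return r;
}

/* Supplying the remaining arguments completes the call. */
double call1(struct bound1 b, double a2, double a3)
{
    return b.fn(b.a1, a2, a3);
}

double call2(struct bound2 b, double a3)
{
    return b.fn(b.a1, b.a2, a3);
}

/* Example target function. */
double scale_add(int k, double x, double y)
{
    return k * x + y;
}
```

With these, call1(bind1(scale_add, 2), 3.0, 1.0) and call2(bind2(bind1(scale_add, 2), 3.0), 1.0) both evaluate scale_add(2, 3.0, 1.0). The bound values are ordinary objects, so no garbage collection is needed, but they have their own struct types and cannot be passed where a plain function pointer is expected.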

Chris Torek · Jul 27, 2008

My ideas were not even half baked. The trampoline is a neat idea
(who thought of it?)

I do not know the origin of the idea. It has been around for a
long time, though, in various forms. (The signal delivery code in
Unix-like systems has long been called "sigtramp", the "signal
trampoline", and that was done in the early 1970s if not earlier.
The idea itself predates even this, though.)

but it works for nested functions, I think, because the
trampoline only needs to exist during the execution of one lexical
scope. This means that it can be allocated on the stack ...

Not in general: a number of CPUs (including the conventional x86,
when set up correctly) prohibit execution of "stack area" memory
(and, e.g., Linux does this under various security models). This
is why you must, in the general case, call some OS service to turn
code-stored-in-data-space into executable-code. (The OS call may
also "vet" the code, depending on security models, although I do
not believe any existing Linux implementations do this yet. One
would also do code-vetting on a system that takes ideas from the
old Burroughs A-series machines.)
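On a POSIX system, the "OS service" step might look roughly like the sketch below. It assumes mmap and mprotect are available and permitted by the local security policy; make_executable_copy is a hypothetical helper name, and the sketch only prepares the memory without ever jumping into it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

/*
 * Copy len bytes of already-generated machine code into fresh pages,
 * then ask the OS to make those pages executable (and read-only).
 * Returns NULL on failure. The pages are never unmapped, matching
 * the "lives until program exit" approach. Some CPUs also require an
 * instruction-cache flush before the code is run; that is omitted.
 */
void *make_executable_copy(const void *code, size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    size_t size;
    void *mem;

    if (page <= 0 || len == 0)
        return NULL;
    assert(code != NULL);

    /* Round the request up to a whole number of pages. */
    size = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

    mem = mmap(NULL, size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return NULL;

    memcpy(mem, code, len);

    if (mprotect(mem, size, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, size);
        return NULL;
    }
    return mem;
}
```

Two system calls per trampoline is what makes creation comparatively slow, which fits the remark that the cost has to be amortized.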

The pointer that results from function(a1) can be squirreled away and
used at any time. This suggests that the trampoline has to be a
garbage-collected object.

Or, as on actual implementations, dynamically created functions
simply live "forever" (until program exit). This generally works
reasonably well since they tend to be small and not all that heavily
used (if nothing else, the OS call to allow executing the code is
slow enough to make creating partial application expensive, so that
one does it only if the cost is sufficiently amortized otherwise).

David Thompson · Jul 28, 2008

<OT=very>

Like the old joke about object oriented cobol being named
"Add one to COBOL returning COBOL" ?

No, just ADD ONE TO COBOL .

ADD ONE TO COBOL GIVING COBOL -- not RETURNING -- is (also) valid
syntactically, but is not object-oriented, or at least not as much,
hence doesn't make the joke work. FSVO work.

- formerly david.thompson1 || achar(64) || worldnet.att.net

user923005 · Jul 28, 2008

As a side note, once we are talking about roots of quadratics
there are several issues to be dealt with. The first is that the
roots may be complex. The second is that the grade school
formula for the roots should only be used for one of the two
roots for numerical accuracy reasons.

If one has a function that returns a structure containing the
roots then all of these grubby details can be handled in the
function. If, however, one mindlessly transcribes the formula
one learned in school into one's code unpleasant results can
ensue.

A nice explanation is found here:
http://docs.sun.com/source/806-3568/ncg_goldberg.html

If we do a search for "quadratic formula" we can find it quickly.

Serve Lau · Jul 31, 2008

rio said:
it mean
copy(&i, add(&j, &k))
or better
add(&i, &j, &k)

After properly naming the variables the code gets clearer

name = firstname + ' ' + lastname;

is much clearer than
i = j + ' ' + k;

Instead, lame examples get fabricated now and then, and operator
overloading takes the blame. It's not hard to come up with good variable
names, and on top of that operator overloading fits perfectly with C's
"trust the programmer" philosophy.

Richard Bos · Aug 1, 2008

jacob navia said:
You raise a valid point. Within my proposal, there is a lot of
room for bad use, as in any proposal, as in the language
itself.

However, I see no reason to add yet another bag onto the side of C, just
for prettiness, when it's this easy to use confusingly.

Numbers adapt themselves VERY WELL to this notation. Problems
start when you use this notation for something else than numbers!

And guess what people will do?

If you can find a way to add overloading to the C Standard in such a way
that only types which normally would be seen as more-or-less compatible
can have their types overloaded - for example, only arithmetic types -
_and_ find a committeeable way of phrasing your proposal, feel free to
suggest it. Otherwise, I don't believe it's worth the trouble.

Another BIG application of operator overloading is access to generalized
containers using the [ ] notation

String a; // "String" is a counted character string

a[2] = 'b'; // overloaded operator [ ]

Yes, that's another way in which operator overloading can become
confusing. For example, now *(a+2) is no longer the same as a[2].
*Bang* goes the use of pointers. Bad, baaad idea.

Richard

jacob navia · Aug 1, 2008

Richard said:
jacob navia said:

Another BIG application of operator overloading is access to generalized
containers using the [ ] notation

String a; // "String" is a counted character string

a[2] = 'b'; // overloaded operator [ ]

Yes, that's another way in which operator overloading can become
confusing. For example, now *(a+2) is no longer the same as a[2].
*Bang* goes the use of pointers. Bad, baaad idea.

Richard

If you overload the addition operator for String + integer
you can return a pointer to the third character of the
data of the string. Then
*(a+2) = 'b';

is exactly the same as you would have with a pointer...
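Standard C has no operator overloading, so the following sketch uses a named function as a stand-in for the overloaded String + integer. The struct layout and the name String_plus are assumptions for illustration only, not lcc-win's actual definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counted string. */
struct String {
    size_t count;   /* number of characters stored */
    char *data;     /* the characters themselves */
};

/* Stand-in for overloaded "String + integer": pointer to character i. */
char *String_plus(struct String *s, size_t i)
{
    assert(s != NULL && i < s->count);
    return s->data + i;
}
```

With this, *String_plus(&a, 2) = 'b'; stores into the third character, much as *(p + 2) = 'b' would for a plain char pointer p. Note that the address of the String object and the address of its first character are different things here.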

*Bang* goes your objection. Good, gooood idea.

Richard Bos · Aug 1, 2008

jacob navia said:
Richard said:

jacob navia said:

Another BIG application of operator overloading is access to generalized
containers using the [ ] notation

String a; // "String" is a counted character string

a[2] = 'b'; // overloaded operator [ ]

Yes, that's another way in which operator overloading can become
confusing. For example, now *(a+2) is no longer the same as a[2].
*Bang* goes the use of pointers. Bad, baaad idea.

If you overload the addition operator for String + integer
you can return a pointer to the third character of the
data of the string. Then
*(a+2) = 'b';

is exactly the same as you would have with a pointer...

Now tell me, what does *(a+0) contain? The first byte of a? Or the first
character in a's string? In the first case, your implementation is
broken; in the second case, it's broken in a different way.

*Bang* goes your objection. Good, gooood idea.

There's this document called the "ISO/IEC International Standard;
Programming languages - C". It contains some interesting information
about what you can and cannot do with a C implementation. You might like
to read it some day.

Richard

Richard · Aug 1, 2008

Serve Lau said:
After properly naming the variables the code gets clearer

name = firstname + ' ' + lastname;

is much clearer than
i = j + ' ' + k;

Instead, lame examples get fabricated now and then, and operator
overloading takes the blame. It's not hard to come up with good variable
names, and on top of that operator overloading fits perfectly with C's
"trust the programmer" philosophy.

You seem immune to the main point - it's almost impossible to debug from
reading the code. You have to keep a complex mesh of classes and
operator overloading in your head. My own experience tells me that even
using a good debugger it's easy to step over "common operators" and have
to rewind in order to step into them on the rerun. All theory and
practice I guess.

Serve Lau · Aug 1, 2008

Richard said:
You seem immune to the main point - it's almost impossible to debug from
reading the code. You have to keep a complex mesh of classes and
operator overloading in your head. My own experience tells me that even
using a good debugger it's easy to step over "common operators" and have
to rewind in order to step into them on the rerun. All theory and
practice I guess.

I have stepped through operator overload code and examples like those given in
this thread do not cause problems provided that proper variable names were
chosen. It only gets rough when overloading casts and when operators are
abused. Suppose you see your code crash at the line name = first + last; It's
not hard to figure out there's an overloaded operator involved here. Same
with queue3 = queue1 + queue2; or other data structures like that.

jacob navia · Aug 1, 2008

Richard said:
jacob navia said:

Richard said:

Another BIG application of operator overloading is access to generalized
containers using the [ ] notation

String a; // "String" is a counted character string

a[2] = 'b'; // overloaded operator [ ]

Yes, that's another way in which operator overloading can become
confusing. For example, now *(a+2) is no longer the same as a[2].
*Bang* goes the use of pointers. Bad, baaad idea.

If you overload the addition operator for String + integer
you can return a pointer to the third character of the
data of the string. Then
*(a+2) = 'b';

is exactly the same as you would have with a pointer...

Now tell me, what does *(a+0) contain?
The first byte of a?
No.

Or the first character in a's string?
Yes.

In the first case, your implementation is
broken; in the second case, it's broken in a different way.

It is not "broken" in any way.

There's this document called the "ISO/IEC International Standard;
Programming languages - C". It contains some interesting information
about what you can and cannot do with a C implementation. You might like
to read it some day.

Structure '+' integer is not defined in the language.
This invokes UB, so lcc-win can do anything, for
example call a user-defined function for this purpose.

Richard · Aug 2, 2008

Ian Collins said:
No, it isn't. Not once you get used to thinking of operators as just
another function.

Yes it is. It is FAR harder to figure out the type of a variable and
then the correct operator being brought in than it is to glance across
at a single function.

Don't get me wrong - it's not impossible. A good C++ programmer learns to
think differently. But this is in the context of C.

Sounds like you have been exposed to poorly written code where
programmers have abused operator overloading.

Err, yes. It's the whole point of my argument.

Most projects I have reviewed or been brought in to troubleshoot had
poorly written code. Nothing new there.
