operators similar to functions?

Keith Thompson · Nov 29, 2007

jacob navia said:
Since basically there is no difference between operator usage
and a function call, there are heretics that propose to use the
simpler operator notation instead of the function call notation
to call functions.

Those heretics (that are looked upon with disdain by the law-abiding
regulars of this group) propose a sin called "operator overloading"
where you can use the operator notation to call your own functions,
imagine that.

You write this absolutely heretical notation:

result_type operator+(arg_type a,arg_type b);

Then, with suitable definitions of arg_type, the compiler will call
this function instead of doing a normal addition.

[...]

This talk of "heresy" is, of course, complete and utter nonsense.

Operator overloading is neither heretical nor sinful. C is not a
religion. Code that uses features that are not part of standard C is
not sinful, or bad, or even necessarily wrong, it's merely something
other than C.

Ironically enough, operator overloading is actually relevant to the
original poster's question. It's a very nice illustration of how
operators and functions *can* be very similar, and of the fact that
the distinction is more syntactic than semantic (though there are also
some semantic differences). It's a pity that jacob couldn't have made
that point without descending into his usual self-pitying nonsense.

cr88192 · Nov 29, 2007

CBFalconer said:
Actually I was quite surprised some years ago, when I found that
people thoroughly conversant with Lisp could write amazingly clear
source code in it.

yes.

the main problem though is that the syntax is not pretty, and is awkward to
type (absent special text editors).
otherwise, it is not that bad though...

for a calculator language of mine, I ended up doing something bizarre and
forcing infix math and a generally lisp-like syntax together into the same
language, but of course this loses the original point of this thread.

in this lang:
'{+ a b}' and 'a+b' were equivalent (infix was largely handled as
sugaring).

some special cases were handled via spacing rules.
'3-2' or '3 - 2' was different than '3 -2' (this being 2 expressions).
....

otherwise, operators were semantically equivalent to functions (and, this
language has assignable functions, so theoretically operator overloading is
really easy...).

parens were also used, but would serve both as parens, and for indicating
blocks.
(
{print "foo "}
{println "bar"}
)

many other things were done with lists, for example: '[2 -3 4 2+3]'.

....

so, a fairly simplistic, yet still fairly easy to look at and type,
syntax...

originally, I implemented it to work as a "better" language for an
interactive calculator, than making use of guile (me finding it frustrating
for these kinds of things, as it is tedious to be sure parens are just right,
....).

but, apart from all this, the language hasn't really gone anywhere...

Gordon Burditt · Nov 30, 2007

Yes, conceptually operators are exactly like functions. You could
rewrite all C with

int a,b,c,d;

a = div(add(b,c),sub(c,d));

I suspect it would be more like:

assign_signedint(&a, div_signedint(add_signedint(b,c), sub_signedint(c,d)));

C (built-in) operators are overloaded, but functions aren't (unlike
in C++, where both can be).

Spoon · Dec 3, 2007

Keith said:
One difference is syntax; operators such as "+" and "*" mimic common
mathematical notation. Addition *could* have been defined using
functional syntax, so you'd have to write ``add(x, y)'' rather than
``x + y''; the latter is just more convenient. Would you rather write
a + b + c + d
or
add(a, add(b, add(c, d)))
?

Nitpick.

"a + b + c + d" is equivalent to add(add(add(a, b), c), d)

(left-to-right associativity)

Richard Harter · Dec 3, 2007

This is a bit misleading. There really isn't much difference
between +(x y), (+ x y), and (x + y). [Whether the parentheses
and argument separating commas are necessary depends on context
and syntactical conventions.]

As for "a + b + c + d" in functional languages one would probably
write something like

(fold + a b c d)

or some variant thereof, or possibly even (+ a b c d). Moreover
the functional form specifies the order of summation, something
that is not true in C.

In short, the convenience of infix notation is easily overstated.
What is true is that it is familiar and perhaps intrinsically
easier to use.

Nitpick.

"a + b + c + d" is equivalent to add(add(add(a, b), c), d)

(left-to-right associativity)

IIANM in the expression "a + b + c + d" the compiler is free to
generate the sums in any order. So both translations are correct
renditions of what the compiler might produce, along with
add(d, add(c, add(a,b))).

Richard Harter, (e-mail address removed)
http://home.tiac.net/~cri, http://www.varinoma.com
In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die

Eric Sosman · Dec 3, 2007

Richard said:
IIANM in the expression "a + b + c + d" the compiler is free to
generate the sums in any order. So both translations are correct
renditions of what the compiler might produce, along with
add(d, add(c, add(a,b))).

Yes and no. There are two viewpoints: What the language
requires, and how the actual machine satisfies the requirements.

The language specifies left-to-right associativity (or its
equivalent in the form of a grammar). The compiler is free to
play games *if* it can produce the same result whenever the
original does not invoke undefined behavior. Thus, for example:
`1 + x + 2 + y' may actually be evaluated as `x + y + 3' or even
as `++TEMPRESULT' if the value of `x + y + 2' is already known.

But the compiler is still obliged to respect associativity
when it matters! `x - y + z' must be evaluated as `(x - y) + z'
and never as `x - (y + z)'! Only the rules of associativity
distinguish the two outcomes, and the rules must be followed.

Richard Harter · Dec 3, 2007

Yes and no. There are two viewpoints: What the language
requires, and how the actual machine satisfies the requirements.

The language specifies left-to-right associativity (or its
equivalent in the form of a grammar). The compiler is free to
play games *if* it can produce the same result whenever the
original does not invoke undefined behavior. Thus, for example:
`1 + x + 2 + y' may actually be evaluated as `x + y + 3' or even
as `++TEMPRESULT' if the value of `x + y + 2' is already known.

This is a bit misleading to me. Suppose, for example, we have a
sum of signed integers, a+b+c, a and c positive, b negative,
where a+c overflows but (a-b)+c does not. Then a+c invokes
undefined behaviour; however the compiler cannot know that. Ergo
the compiler can alter the order to a form where there is
undefined behaviour when the original left to right order did
not.

There are similar but worse problems with floating point numbers
- the number of significant figures depends on the actual order.

But the compiler is still obliged to respect associativity
when it matters! `x - y + z' must be evaluated as `(x - y) + z'
and never as `x - (y + z)'! Only the rules of associativity
distinguish the two outcomes, and the rules must be followed.

Quite true.

Walter Roberson · Dec 3, 2007

As for "a + b + c + d" in functional languages one would probably
write something like

(fold + a b c d)

or some variant thereof, or possibly even (+ a b c d). Moreover
the functional form specifies the order of summation, something
that is not true in C.

Hmmm, Maple is arguably a functional language (and also
arguably *not* a functional language), but in Maple, the order
of summation is unspecified and may change even during a single
Maple session. This is of particular note in Maple because
some summation orders "compact nicely" and some do not; for example,
sin(theta)^2 - 1 + cos(theta)^2 if required to be evaluated in
that order would not produce much useful, but Maple knows associative
properties and so can reorder this to (sin(theta)^2 + cos(theta)^2)-1
which is 0.

Keith Thompson · Dec 3, 2007

Spoon said:
Nitpick.

"a + b + c + d" is equivalent to add(add(add(a, b), c), d)

(left-to-right associativity)

Good catch, thanks.

Keith Thompson · Dec 3, 2007

This is a bit misleading. There really isn't much difference
between +(x y), (+ x y), and (x + y). [Whether the parentheses
and argument separating commas are necessary depends on context
and syntactical conventions.]

Right, but I was thinking of the difference between operators and *C*
function calls. Lisp's (+ x y) is yet another way of expressing the
same thing.

[...]

IIANM in the expression "a + b + c + d" the compiler is free to
generate the sums in any order. So both translations are correct
renditions of what the compiler might produce, along with
add(d, add(c, add(a,b))).

The compiler is allowed to re-order expressions *if* it yields the
same result. Consider, for example:
-1 + INT_MAX + 1
In the abstract machine, this is
add(add(-1, INT_MAX), 1)
Rearranging it to
add(-1, add(INT_MAX, 1))
creates an overflow. (The rearrangement is still allowed if the
compiler knows that it will produce the same result in spite of the
overflow, which will be the case on most two's-complement systems.)

Eric Sosman · Dec 3, 2007

Richard said:
[...]
The language specifies left-to-right associativity (or its
equivalent in the form of a grammar). The compiler is free to
play games *if* it can produce the same result whenever the
original does not invoke undefined behavior. Thus, for example:
`1 + x + 2 + y' may actually be evaluated as `x + y + 3' or even
as `++TEMPRESULT' if the value of `x + y + 2' is already known.


This is a bit misleading to me. Suppose, for example, we have a
sum of signed integers, a+b+c, a and c positive, b negative,
where a+c overflows but (a-b)+c does not.

I guess you mean (a+b)+c, or (a-|b|)+c.

Then a+c invokes
undefined behaviour; however the compiler cannot know that.

It can certainly know that an addition carries the risk of
overflow. Furthermore, the compiler for a particular machine
probably has a pretty good idea of the consequences of overflow
on that machine. But in any event, evaluating a+b+c does not
require evaluating a+c.

Ergo
the compiler can alter the order to a form where there is
undefined behaviour when the original left to right order did
not.

I'm unable to follow your reasoning. You seem to be saying
that some rearrangements of valid expressions render them
invalid (I agree), that the compiler cannot be aware of this (I
disagree, but that seems to be a side-issue), and that "ergo"
the compiler is free to make trash of correct programs (and
that's where I lose your thread, and disagree with the conclusion).

Let's take the (in)famous DeathStation 9000, whose builders
connected the overflow latch to the detonator of a small fission
bomb (they felt overflow was a serious error that should not stay
hidden). If the program evaluates `1 + -42 + INT_MAX' the system
is required to produce the value `INT_MAX - 41' and is not allowed
to blow the program to Kingdom Come. Implication: on the DS9000,
the compiler's freedom to exploit the (so-called) commutativity
of addition is limited to those cases where it can be sure the
rearrangement will not change the outcome.

The HAL StinkWad represents an opposite pole in system design.
It was built by a Big Company with Big Ideas, where the idea that
"overflow is just an opportunity for More Growth" was encouraged.
The designers therefore didn't feel it was worth bothering the
programmer when an overflow or underflow occurred, and in fact
decided to cut costs by eliminating the overflow latch entirely.
On this system overflow goes undetected and causes no harm, and
the compiler is free to evaluate `1 + INT_MAX + -42' if it so
desires: The two wrap-arounds cancel each other out, and the
result of `INT_MAX - 41' pops out as required, with minimal fuss
and, er, fallout.

There are similar but worse problems with floating point numbers
- the number of significant figures depends on the actual order.

This is (probably) part of the reason for the looseness of
the Standard's description of floating-point arithmetic. See also
#pragma STDC FP_CONTRACT (in C99).

Dik T. Winter · Dec 3, 2007

> Richard Harter wrote: ....
>
> Yes and no. There are two viewpoints: What the language
> requires, and how the actual machine satisfies the requirements.
>
> The language specifies left-to-right associativity (or its
> equivalent in the form of a grammar). The compiler is free to
> play games *if* it can produce the same result whenever the
> original does not invoke undefined behavior.

Which (for instance) means that the compiler in general can not do
such re-orderings for floating point.

pete · Dec 3, 2007

Richard said:
This is a bit misleading to me. Suppose, for example, we have a
sum of signed integers, a+b+c, a and c positive, b negative,
where a+c overflows but (a-b)+c does not. Then a+c invokes
undefined behaviour; however the compiler cannot know that. Ergo
the compiler can alter the order to a form where there is
undefined behaviour when the original left to right order did
not.

There are similar but worse problems with floating point numbers
- the number of significant figures depends on the actual order.

For floating point numbers, the case is simple:
((x + y) + z) is not guaranteed to be equal to (x + (y + z)).

/* BEGIN output from epsilon.c */

FLT_ROUNDS is 1

x = y = 4 * DBL_EPSILON / 3;
z = 2.0;

(x + y) + z < x + (y + z)
(x + y) + z == y + z

/* END output from epsilon.c */

/* BEGIN epsilon.c */

#include <stdio.h>
#include <float.h>

int main(void)
{
    double x, y, z;

    puts("/* BEGIN output from epsilon.c */\n");
    printf("FLT_ROUNDS is %d\n\n", FLT_ROUNDS);
    x = y = 4 * DBL_EPSILON / 3;
    puts("x = y = 4 * DBL_EPSILON / 3;");
    z = 2.0;
    puts("z = 2.0;\n");
    /* Compare the two groupings of the same three-term sum. */
    if ( (x + y) + z == x + (y + z) ) {
        puts("(x + y) + z == x + (y + z)");
    }
    if ( (x + y) + z > x + (y + z) ) {
        puts("(x + y) + z > x + (y + z)");
    }
    if ( (x + y) + z < x + (y + z) ) {
        puts("(x + y) + z < x + (y + z)");
    }
    /* Adding x first can make no difference at all to the final sum. */
    if ( (x + y) + z == y + z) {
        puts("(x + y) + z == y + z");
    }
    puts("\n/* END output from epsilon.c */");
    return 0;
}

/* END epsilon.c */

Philip Potter · Dec 4, 2007

Richard said:
This is a bit misleading to me. Suppose, for example, we have a
sum of signed integers, a+b+c, a and c positive, b negative,
where a+c overflows but (a-b)+c does not. Then a+c invokes
undefined behaviour; however the compiler cannot know that. Ergo
the compiler can alter the order to a form where there is
undefined behaviour when the original left to right order did
not.

What? You're saying that even though the language specifies
left-to-right associativity of addition, if /any/ order of summation
could overflow then an addition expression is undefined?

The implementation *can* know that a+c+b may overflow in a case where
a+b+c does not. Because a+b+c is what the programmer asked for, the
implementation must provide the correct, defined behaviour of a+b+c. If
the implementation cannot be sure that a+c+b will always provide the
same behaviour as a+b+c whenever the behaviour of a+b+c is defined, then
the implementation cannot use a+c+b as its order of addition.

If you argue that the compiler can do as much rearranging as it likes,
then where do you stop? a+INT_MAX+b-INT_MAX+c yields the same result in
maths, but in C carries a high risk of overflow. As a result, a
programmer cannot even add two numbers together without invoking
undefined behaviour.

Richard Harter · Dec 4, 2007

Richard said:
Richard said:

[...]
The language specifies left-to-right associativity (or its
equivalent in the form of a grammar). The compiler is free to
play games *if* it can produce the same result whenever the
original does not invoke undefined behavior. Thus, for example:
`1 + x + 2 + y' may actually be evaluated as `x + y + 3' or even
as `++TEMPRESULT' if the value of `x + y + 2' is already known.


This is a bit misleading to me. Suppose, for example, we have a
sum of signed integers, a+b+c, a and c positive, b negative,
where a+c overflows but (a-b)+c does not.


I guess you mean (a+b)+c, or (a-|b|)+c.

Oops. Right.

It can certainly know that an addition carries the risk of
overflow. Furthermore, the compiler for a particular machine
probably has a pretty good idea of the consequences of overflow
on that machine. But in any event, evaluating a+b+c does not
require evaluating a+c.

I'm unable to follow your reasoning. You seem to be saying
that some rearrangements of valid expressions render them
invalid (I agree), that the compiler cannot be aware of this (I
disagree, but that seems to be a side-issue), and that "ergo"
the compiler is free to make trash of correct programs (and
that's where I lose your thread, and disagree with the conclusion).

I said what I said; what I said is not much like what you said I
said.

You do raise an interesting point about making "trash of correct
programs". The thing is, if the compiler can reorder the
calculations using the as-if rule (which permits treating
expressions as though addition were associative and commutative)
and the resulting sequence of operations invokes undefined
behaviour, then the program isn't "correct" under the rules of C.

The "compiler cannot be aware" is the main issue. Technically
you are right and I am wrong - obviously the compiler could
generate checks for the possibility of overflow. However the
compiler is not obliged to do so, and I rather doubt that there
is any production compiler that does.

In short, the responsibility for avoiding overflow rests with the
programmer.


Richard Harter · Dec 4, 2007

What? You're saying that even though the language specifies
left-to-right associativity of addition, if /any/ order of summation
could overflow then an addition expression is undefined?

Correct.

The implementation *can* know that a+c+b may overflow in a case where
a+b+c does not. Because a+b+c is what the programmer asked for, the
implementation must provide the correct, defined behaviour of a+b+c. If
the implementation cannot be sure that a+c+b will always provide the
same behaviour as a+b+c whenever the behaviour of a+b+c is defined, then
the implementation cannot use a+c+b as its order of addition.

Well, no, that is not my understanding. I'm always open to
chapter and verse correction from the standard. However here is
what Harbison & Steele (C: A Reference Manual) has to say (7.12 Order
of Evaluation).

The assumption of commutativity and associativity is
always true for &, ^, and | on unsigned operands...
It may not be true for * and + because of the possibility
that the order indicated by the expression as written
might avoid overflow but another order might not.
Nevertheless, the compiler is allowed to exploit the
assumption. In such situations the programmer must use
assignments to temporary variable to force a particular
evaluation order:

If you argue that the compiler can do as much rearranging as it likes,
then where do you stop? a+INT_MAX+b-INT_MAX+c yields the same result in
maths, but in C carries a high risk of overflow.

As a result, a
programmer cannot even add two numbers together without invoking
undefined behaviour.

This is quite true. To quote myself, the responsibility for
avoiding overflow rests with the programmer and not with the
compiler.


jameskuyper · Dec 4, 2007

Richard Harter wrote:
....

You do raise an interesting point about making "trash of correct
programs". The thing is, if the compiler can reorder the
calculations using the as-if rule (which permits treating
expressions as though addition were associative and commutative)

That's rather the whole point: there is no such permission granted to
the implementation. If you think otherwise, please provide a
citation. Computer math differs from theoretical math in that it is,
in general, neither associative nor commutative, and the standard
nowhere permits implementations to assume that it is.

and the resulting sequence of operations invokes undefined
behaviour, then the program isn't "correct" under the rules of C.

Programs which rely upon the evaluation order specified by the C
standard to (among other things) avoid overflows are correct programs.

The "compiler cannot be aware" is the main issue. Technically
you are right and I am wrong - obviously the compiler could
generate checks for the possibility of overflow. However the
compiler is not obliged to do so, and I rather doubt that there
is any production compiler that does.

The implementation is allowed to generate code that overflows when the
operations are carried out in the order specified by the program,
interpreted in accordance with the C standard. The as-if rule allows
rearrangement of those operations, but requires that the results from
the rearrangement must be the same as from the correct arrangement,
whenever the correct arrangement has defined behavior, as it does
when, for instance, you evaluate (3-5+INT_MAX).

Overflow checks are not required, except when the implementation
rearranges the operations so that it can produce overflows when the
correct order would not. Then a conforming implementation is indeed
required to detect the overflow and respond by making sure that the
correct result is returned. If that's too difficult, the easy option
is to simply refrain from rearranging them.

In short, the responsibility for avoiding overflow rests with the
programmer.

Agreed - in part by relying upon the promises made by the standard.

jacob navia · Dec 4, 2007

Richard said:
The "compiler cannot be aware" is the main issue. Technically
you are right and I am wrong - obviously the compiler could
generate checks for the possibility of overflow. However the
compiler is not obliged to do so, and I rather doubt that there
is any production compiler that does.

In short, the responsibility for avoiding overflow rests with the
programmer.

For your information:

lcc-win invoked with the -overflowcheck option will check
each operation that can generate an overflow. Any operation that
does generate one will provoke the end of the program
with an error message.

Richard Harter · Dec 4, 2007

Richard Harter wrote:
...

That's rather the whole point: there is no such permission granted to
the implementation. If you think otherwise, please provide a
citation. Computer math differs from theoretical math in that it is,
in general, neither associative nor commutative, and the standard
nowhere permits implementations to assume that it is.

Cited elsewhere, Harbison & Steele, 1991, p 207, section 7.12,
Order of evaluation. The section begins:

In general, the compiler can rearrange the order in which an
expression is evaluated. The rearrangement may consist of
evaluating only the arguments or the two operands of a binary
operator, in some particular order other than the obvious
left-to-right order. The binary operators +, *, &, ^, and | are
assumed to be completely associative and commutative, and a
compiler is permitted to exploit this assumption.

Of course Harbison & Steele could be wrong, or the standard may
have been amended since then, but the text is quite clear. If
you disagree with the text feel free to point out where it is
wrong by citing the appropriate section of the standard.

Richard Harter, (e-mail address removed)
http://home.tiac.net/~cri, http://www.varinoma.com
The good news is that things could be worse; the bad
news is that things aren't as good as they should be.

Eric Sosman · Dec 4, 2007

Richard said:
[...]
You do raise an interesting point about making "trash of correct
programs". The thing is, if the compiler can reorder the
calculations using the as-if rule (which permits treating
expressions as though addition were associative and commutative)
and the resulting sequence of operations invokes undefined
behaviour, then the program isn't "correct" under the rules of C.

Where does the Standard say anything to support your
parenthesized claim? The "as if" rule does not sanction
the blind application of inapplicable rules, nor does it
permit transformations that turn correct programs into
incorrect ones. "As if" frees the implementation from
having to imitate the Standard's abstract machine, step
by painful step, but still requires it to produce the
same results if the results were well-defined in the
first place.

Implication: If a machine behaves badly on overflow
and if a rearrangement of some expression would overflow
in situations where the original would not, the compiler
must refrain from rearranging, or must somehow manage to
intercept the overflow and deliver the required result
in spite of it.

Implication: If the original expression was going to
overflow anyhow, the compiler *is* free to rearrange it
and the rearrangement is not obliged to overflow. "As if"
does not promise to leave undefined behavior unchanged.

Parting shot: If the compiler had the freedom you claim
for it, then `x = 0' could invoke undefined behavior. Why?
Because the compiler would be free to rewrite it as
`x = 3 * INT_MAX - 2 * INT_MAX - INT_MAX', and both of
the product terms overflow.
