Re: "Strong typing vs. strong testing"


Chris Rebert

I'd like to design a language like this. If you add a quantity in
inches to a quantity in centimetres you get a quantity in (say)
metres. If you multiply them together you get an area, if you divide
them you get a dimensionless scalar. If you divide a quantity in metres
by a quantity in seconds you get a velocity, if you try to subtract
them you get an error.

Sounds just like Frink:
http://futureboy.us/frinkdocs/
http://en.wikipedia.org/wiki/Frink
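For concreteness, here is a minimal, hypothetical sketch in Python of how such dimension-checked arithmetic can work. The Quantity class and its dimension dictionaries are invented for illustration and say nothing about Frink's actual implementation:

from fractions import Fraction

class Quantity:
    """Dimension-checked value: a magnitude in SI base units plus a dimension dict."""
    def __init__(self, value, dims):
        self.value = value              # magnitude, in SI base units
        self.dims = dims                # e.g. {'m': 1, 's': -1} for a velocity

    def _combine(self, other, sign):
        keys = set(self.dims) | set(other.dims)
        dims = {k: self.dims.get(k, 0) + sign * other.dims.get(k, 0) for k in keys}
        return {k: v for k, v in dims.items() if v}   # drop zero exponents

    def __add__(self, other):
        if self.dims != other.dims:
            raise TypeError("dimension mismatch: %r vs %r" % (self.dims, other.dims))
        return Quantity(self.value + other.value, self.dims)

    def __sub__(self, other):
        if self.dims != other.dims:
            raise TypeError("dimension mismatch: %r vs %r" % (self.dims, other.dims))
        return Quantity(self.value - other.value, self.dims)

    def __mul__(self, other):
        return Quantity(self.value * other.value, self._combine(other, +1))

    def __truediv__(self, other):
        return Quantity(self.value / other.value, self._combine(other, -1))

inch = Quantity(Fraction(254, 10000), {'m': 1})   # lengths stored in metres
cm   = Quantity(Fraction(1, 100),     {'m': 1})
sec  = Quantity(Fraction(1),          {'s': 1})

print((inch + cm).dims)    # {'m': 1} -- inches + centimetres: a length, in metres
print((inch * cm).dims)    # {'m': 2} -- an area
print((inch / cm).dims)    # {}       -- a dimensionless scalar
print((inch / sec).dims)   # {'m': 1, 's': -1} -- a velocity
# inch - sec  -->  TypeError: dimension mismatch

Storing every magnitude in SI base units is what makes the mixed-unit addition well defined: inches and centimetres are both converted to metres on construction.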

Cheers,
Chris
 

George Neuner

He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.
And when speaking about oil there isn't
even a simple conversion.

42 US gallons ≈ 34.9723 imp gal ≈ 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilde sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
kbarrels/day. (Please feel free to check my math.) That's
admittedly a lot of digits, but there's no need for approximations
(unless they're imposed by the numeric representation you're using).

I don't care to check it ... the fact that the SI unit involves 12
decimal places whereas the imperial unit involves 3 tells me the
conversion probably shouldn't be done in a program that wants
accuracy.
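The exact part of that claim is easy to check mechanically with rational arithmetic instead of floating point; a minimal sketch in Python:

from fractions import Fraction

# All three factors below are exact by definition.
inch_m    = Fraction(254, 10000)     # metres per inch
gallon_m3 = 231 * inch_m**3          # cubic metres per US liquid gallon
barrel_m3 = 42 * gallon_m3           # cubic metres per barrel of oil

print(barrel_m3)         # 9936705933/62500000000
print(float(barrel_m3))  # 0.158987294928 -- the 12-decimal figure quoted above

Whether those 12 decimals are meaningful is, of course, exactly the significant-figures question being argued here.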

George
 

Pascal J. Bourguignon

George Neuner said:
He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.
And when speaking about oil there isn't
even a simple conversion.

42 US gallons ≈ 34.9723 imp gal ≈ 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilde sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
kbarrels/day. (Please feel free to check my math.) That's
admittedly a lot of digits, but there's no need for approximations
(unless they're imposed by the numeric representation you're using).

I don't care to check it ... the fact that the SI unit involves 12
decimal places whereas the imperial unit involves 3 tells me the
conversion probably shouldn't be done in a program that wants
accuracy.


Because perhaps you're thinking that oil is sent over the oceans, and
sold retail in barrels of 42 gallons?

Actually, when I buy oil, it's from a pump that's graduated in liters!

It comes from trucks with cisterns containing 24 m³.

And these trucks get it from reservoirs of 23,850 m³.

"Tankers move approximately 2,000,000,000 metric tons" says the English
Wikipedia page...



Now perhaps it all depends on whether you buy your oil from Total or
from Texaco, but in my opinion, you're forgetting something: the last
drop. You never get exactly 42 gallons of oil, there's always a little
drop more or less, so what you get is perhaps 158.987 liters or
41.9999221 US gallons, or even 158.98 liters = 41.9980729 US gallons,
where you need more significant digits.
 

Paul Wallich

George Neuner said:
On 28 Sep 2010 12:42:40 GMT, Albert van der Horst wrote:
I would say the dimensional checking is underrated. It must be
complemented with a hard and fast rule about only using standard
(SI) units internally.

Oil output internal : m^3/sec
Oil output printed: kbarrels/day

"barrel" is not an SI unit.

He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.
And when speaking about oil there isn't
even a simple conversion.

42 US gallons ≈ 34.9723 imp gal ≈ 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilde sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
kbarrels/day. (Please feel free to check my math.) That's
admittedly a lot of digits, but there's no need for approximations
(unless they're imposed by the numeric representation you're using).

I don't care to check it ... the fact that the SI unit involves 12
decimal places whereas the imperial unit involves 3 tells me the
conversion probably shouldn't be done in a program that wants
accuracy.
[...]

Now perhaps it all depends on whether you buy your oil from Total or
from Texaco, but in my opinion, you're forgetting something: the last
drop. You never get exactly 42 gallons of oil, there's always a little
drop more or less, so what you get is perhaps 158.987 liter or
41.9999221 US gallons, or even 158.98 liter = 41.9980729 US gallons,
where you need more significant digits.

And even that pales in comparison to the expansion and contraction of
petroleum products with temperature. Compensation to standard temp is
required in some jurisdictions but not in others...
 

Keith Thompson

Erik Max Francis said:
I know. Hence why I wrote the comment "floating point accuracy aside"
when printing it.

Ok. I took the comment to be an indication that the figure was
subject to floating point accuracy concerns; in fact you meant just
the opposite.
 

Thomas A. Russ

George Neuner said:
That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.

I suppose that one has to choose between two fundamental designs for any
computational system of units. One can either store the results
internally in a canonical form, which generally means an internal
representation in SI units. Then all calculations are performed using
the internal units representation and conversion happens only on input or
output.

Or one can store the values in their original input form, and perform
conversions on the fly during calculations. For calculations one will
still need to have some canonical representation for cases where the
result value doesn't have a preferred unit provided. For internal
calculations this will often be the case.

Now whether one will necessarily have a loss of precision depends on
whether the conversion factors are exact or approximations. As long as
the factors are exact, one can have the internal representation be exact
as well. One method would be to use something like the Common Lisp
rational numbers or the Gnu mp library.

And a representation where one preserves the "preferred" unit for
display purposes based on the original data as entered is also nice.
Roman Cunis' Common Lisp library does that, and with the use of rational
numbers for storing values and conversion factors allows one to do nice
things like make sure that

30mph * 3h = 90mi

even when the internal representation is in SI units (m/s, s, m).
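That last property is easy to sketch with Python's exact rationals; the unit constants below are illustrative and are not Cunis' library:

from fractions import Fraction

MILE = Fraction(1609344, 1000)   # metres per mile (exact: 5280 ft * 0.3048 m/ft)
HOUR = Fraction(3600)            # seconds per hour

speed    = 30 * MILE / HOUR      # 30 mph, held internally as an exact m/s value
duration = 3 * HOUR              # 3 h, held internally as seconds
distance = speed * duration      # metres

print(distance / MILE)           # 90 -- exactly, with no rounding error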
 

George Neuner

Pascal J. Bourguignon said:
On 28 Sep 2010 12:42:40 GMT, Albert van der Horst wrote:
I would say the dimensional checking is underrated. It must be
complemented with a hard and fast rule about only using standard
(SI) units internally.

Oil output internal : m^3/sec
Oil output printed: kbarrels/day

"barrel" is not an SI unit.

He didn't say it was. Internal calculations are done in SI units (in
this case, m^3/sec); on output, the internal units can be converted to
whatever is convenient.

That's true. But it is a situation where the conversion to SI units
loses precision and therefore probably shouldn't be done.
And when speaking about oil there isn't
even a simple conversion.

42 US gallons ≈ 34.9723 imp gal ≈ 158.9873 L

[In case those marks don't render, they are meant to be the
double-tilde sign meaning "approximately equal".]

There are multiple different kinds of "barrels", but "barrels of oil"
are (consistently, as far as I know) defined as 42 US liquid gallons.
A US liquid gallon is, by definition, 231 cubic inches; an inch
is, by definition, 0.0254 meter. So a barrel of oil is *exactly*
0.158987294928 m^3, and 1 m^3/sec is exactly 13.7365022817792
kbarrels/day. (Please feel free to check my math.) That's
admittedly a lot of digits, but there's no need for approximations
(unless they're imposed by the numeric representation you're using).

I don't care to check it ... the fact that the SI unit involves 12
decimal places whereas the imperial unit involves 3 tells me the
conversion probably shouldn't be done in a program that wants
accuracy.


Because perhaps you're thinking that oil is sent over the oceans, and
sold retail in barrels of 42 gallons?

Actually, when I buy oil, it's from a pump that's graduated in liters!

It comes from trucks with cisterns containing 24 m³.

And these trucks get it from reservoirs of 23,850 m³.

"Tankers move approximately 2,000,000,000 metric tons" says the English
Wikipedia page...



Now perhaps it all depends on whether you buy your oil from Total or
from Texaco, but in my opinion, you're forgetting something: the last
drop. You never get exactly 42 gallons of oil, there's always a little
drop more or less, so what you get is perhaps 158.987 liters or
41.9999221 US gallons, or even 158.98 liters = 41.9980729 US gallons,
where you need more significant digits.


No. I'm just reacting to the "significant figures" issue. Real
world issues like US vs Eurozone and measurement error aside - and
without implying anyone here - many people seem to forget that
multiplying significant figures doesn't add them, and results to 12
decimal places are not necessarily any more accurate than results to 2
decimal places.

It makes sense to break the macro unit (barrels) into micro units only when
necessary. When a refinery purchases 500,000 barrels, it is charged a
barrel price, not some multiple of gallon or liter price and
regardless of drop over/under. The refinery's process is continuous
and it needs a delivery if it has less than 20,000 barrels - so the
current reserve figure of 174,092 barrels is as accurate as is needed
(they need to order by tomorrow because delivery will take 10 days).
OTOH, because the refinery sells product to commercial vendors of
gasoline/petrol and heating oil in gallons or liters, it does make
sense to track inventory and sales in (large multiples of) those
units.

Similarly, converting everything to m³ simply because you can does not
make sense. When talking about the natural gas reserve of the United
States, the figures are given in km³ - a few thousand m³ either way is
irrelevant.
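To put numbers on the significant-figures point (the +/- 1,000-barrel uncertainty below is assumed purely for illustration):

BARREL_M3 = 0.158987294928             # exact cubic metres per barrel

reserve = 174_092                      # barrels, assume known to +/- 1,000
low  = (reserve - 1_000) * BARREL_M3
high = (reserve + 1_000) * BARREL_M3
print(f"{low:.1f} .. {high:.1f} m^3")  # roughly 27519.4 .. 27837.4 m^3

The spread is over 300 m^3, so digits past the hundreds place in the converted figure carry no real information, however many decimals the conversion factor has.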

George
 

MRAB

Thomas A. Russ said:
I suppose that one has to choose between two fundamental designs for any
computational system of units. One can either store the results
internally in a canonical form, which generally means an internal
representation in SI units. Then all calculations are performed using
the internal units representation and conversion happens only on input or
output.

Or one can store the values in their original input form, and perform
conversions on the fly during calculations. For calculations one will
still need to have some canonical representation for cases where the
result value doesn't have a preferred unit provided. For internal
calculations this will often be the case.

Now whether one will necessarily have a loss of precision depends on
whether the conversion factors are exact or approximations. As long as
the factors are exact, one can have the internal representation be exact
as well. One method would be to use something like the Common Lisp
rational numbers or the Gnu mp library.

And a representation where one preserves the "preferred" unit for
display purposes based on the original data as entered is also nice.
Roman Cunis' Common Lisp library does that, and with the use of rational
numbers for storing values and conversion factors allows one to do nice
things like make sure that

30mph * 3h = 90mi

even when the internal representation is in SI units (m/s, s, m).

You could compare it to handling strings, where Unicode is used
internally and the original encoding can be preserved for output.
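In code, the analogy looks something like this tiny sketch:

raw = b"caf\xc3\xa9"              # bytes as they arrived, UTF-8 encoded
text = raw.decode("utf-8")        # decode once at the boundary; str is canonical
shouted = text.upper()            # all processing uses the internal form
print(shouted.encode("latin-1"))  # encode on output however the consumer needs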
 

Squeamizh

that is a lie.

Compilation only makes sure that values provided at compilation-time
are of the right datatype.

What happens though is that in the real world, pretty much all
computation depends on user-provided values at runtime.  See where we
are heading?

this works at compilation time without warnings:
int m=numbermax( 2, 6 );

this too:
int a, b, m;
scanf( "%d", &a );
scanf( "%d", &b );
m=numbermax( a, b );

no compiler issues, but it will fail just as it would in Python if the
user provides "foo" and "bar" for a and b.

What do you do if you're feeling insecure and paranoid?  Just what
dynamically typed languages do:  add runtime checks.  Unit tests are
great to assert those.

Fact is:  almost all user data from the external world comes into
programs as strings.  No type system or compiler handles this fact all
that gracefully...

I disagree with your conclusion. Sure, the data was textual when it
was initially read by the program, but that should only be relevant to
the input processing code. The data is likely converted to some
internal representation immediately after it is read and validated,
and in a sanely-designed program, it maintains this representation
throughout its lifetime. If the structure of some data needs to
change during development, the compiler of a statically-typed language
will automatically tell you about any client code that was not updated
to account for the change. Dynamically typed languages do not provide
this assurance.
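That convert-at-the-boundary idea might look like the following minimal Python sketch (in a statically typed language, a and b would simply be declared with integer types after the same parse step):

def read_int(prompt):
    # Conversion and validation happen here, once, at the input boundary.
    text = input(prompt)
    try:
        return int(text)
    except ValueError:
        raise SystemExit(f"not an integer: {text!r}")

a = read_int("a? ")
b = read_int("b? ")
print(max(a, b))   # past the boundary, a and b are known-good integers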
 

RG

Squeamizh said:
I disagree with your conclusion. Sure, the data was textual when it
was initially read by the program, but that should only be relevant to
the input processing code. The data is likely converted to some
internal representation immediately after it is read and validated,
and in a sanely-designed program, it maintains this representation
throughout its lifetime. If the structure of some data needs to
change during development, the compiler of a statically-typed language
will automatically tell you about any client code that was not updated
to account for the change. Dynamically typed languages do not provide
this assurance.

This is a red herring. You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler. For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
printf("%d\n", maximum(foo(1), 1));
return 0;
}
[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1


Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg
 

Squeamizh

I disagree with your conclusion.  Sure, the data was textual when it
was initially read by the program, but that should only be relevant to
the input processing code.  The data is likely converted to some
internal representation immediately after it is read and validated,
and in a sanely-designed program, it maintains this representation
throughout its lifetime.  If the structure of some data needs to
change during development, the compiler of a statically-typed language
will automatically tell you about any client code that was not updated
to account for the change.  Dynamically typed languages do not provide
this assurance.

This is a red herring.  You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler.  For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}

[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1

Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

In short, static typing doesn't solve all conceivable problems.

We are all aware that there is no perfect software development process
or tool set. I'm interested in minimizing the number of problems I
run into during development, and the number of bugs that are in the
finished product. My opinion is that statically typed languages are
better at this for large projects, for the reasons I stated in my
previous post.
 

RG

Squeamizh said:
On 27 Sep, 05:46, TheFlyingDutchman wrote:
On Sep 27, 12:58 am, Pascal J. Bourguignon wrote:
TheFlyingDutchman wrote:
This might have been mentioned here before, but I just came across it:
a 2003 essay by Bruce Eckel on how reliable systems can get built in
dynamically-typed languages. It echoes things we've all said here, but
I think it's interesting because it describes a conversion experience:
Eckel started out in the strong-typing camp and was won over.

If you are writing a function to determine the maximum of two numbers
passed as arguments in a dynamically typed language, what is the normal
procedure used by Eckel and others to handle someone passing in invalid
values - such as a file handle for one variable and an array for the
other?

The normal procedure is to hit such a person over the head with a stick
and shout "FOO".

Moreover, the functions returning the maximum may be able to work on
non-numbers, as long as they're comparable. What's more, there are
numbers that are NOT comparable by the operator you're thinking about!

So to implement your specifications, that function would have to be
implemented for example as:

(defmethod lessp ((x real) (y real)) (< x y))
(defmethod lessp ((x complex) (y complex))
  (or (< (real-part x) (real-part y))
      (and (= (real-part x) (real-part y))
           (< (imag-part x) (imag-part y)))))

(defun maximum (a b)
  (if (lessp a b) b a))

And then the client of that function could very well add methods:

(defmethod lessp ((x symbol) (y t)) (lessp (string x) y))
(defmethod lessp ((x t) (y symbol)) (lessp x (string y)))
(defmethod lessp ((x string) (y string)) (string< x y))

and call:

(maximum 'hello "WORLD") --> "WORLD"

and who are you to forbid it!?

in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer. If someone
tries to call it incorrectly it is a compile error.

In a dynamically typed language maximum(a, b) can be called with
incorrect datatypes. Even if I make it so it can handle many types as
you did above, it could still be inadvertently called with a file
handle for one parameter or some other type not provided for. So do
Eckel and others, when they are writing their dynamically typed code,
advocate just letting the function blow up or give a bogus answer, or
do they check for valid types passed? If they are checking for valid
types it would seem that any benefits gained by not specifying type
are lost by checking for type. And if they don't check for type it
would seem that their code's error handling is poor.

that is a lie.

Compilation only makes sure that values provided at compilation-time
are of the right datatype.

What happens though is that in the real world, pretty much all
computation depends on user-provided values at runtime. See where we
are heading?

this works at compilation time without warnings:
int m=numbermax( 2, 6 );

this too:
int a, b, m;
scanf( "%d", &a );
scanf( "%d", &b );
m=numbermax( a, b );

no compiler issues, but it will fail just as it would in Python if the
user provides "foo" and "bar" for a and b.

What do you do if you're feeling insecure and paranoid? Just what
dynamically typed languages do: add runtime checks. Unit tests are
great to assert those.

Fact is: almost all user data from the external world comes into
programs as strings. No type system or compiler handles this fact all
that gracefully...

I disagree with your conclusion.  Sure, the data was textual when it
was initially read by the program, but that should only be relevant to
the input processing code.  The data is likely converted to some
internal representation immediately after it is read and validated,
and in a sanely-designed program, it maintains this representation
throughout its lifetime.  If the structure of some data needs to
change during development, the compiler of a statically-typed language
will automatically tell you about any client code that was not updated
to account for the change.  Dynamically typed languages do not provide
this assurance.

This is a red herring.  You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler.  For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}

[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1

Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

In short, static typing doesn't solve all conceivable problems.

More specifically, the claim made above:
in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer.

is false. And it is not necessary to invoke the vagaries of run-time
input to demonstrate that it is false.
We are all aware that there is no perfect software development process
or tool set. I'm interested in minimizing the number of problems I
run into during development, and the number of bugs that are in the
finished product. My opinion is that statically typed languages are
better at this for large projects, for the reasons I stated in my
previous post.

More power to you. What are you doing here on cll then?

rg
 

Keith Thompson

RG said:
This is a red herring.  You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler.  For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}

[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1

Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

In short, static typing doesn't solve all conceivable problems.

More specifically, the claim made above:
in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer.

is false. And it is not necessary to invoke the vagaries of run-time
input to demonstrate that it is false.

But the above maximum() function does exactly that. The program's
behavior happens to be undefined or implementation-defined for reasons
unrelated to the maximum() function.

Depending on the range of type int on the given system, either the
behavior of the addition in foo() is undefined (because it overflows),
or the implicit conversion of the result to int either yields an
implementation-defined result or (in C99) raises an
implementation-defined signal; the latter can lead to undefined
behavior.

Since 9223372036854775807 is 2**63-1, what *typically* happens is that
the addition wraps and the conversion to int yields the value 0, but
the C language doesn't require that
particular result. You then call maximum with arguments 0 and 1, and
it quite correctly returns 1.
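What "typically happens" can be simulated; the sketch below assumes a 64-bit long long, a 32-bit int, and two's-complement wraparound, none of which the C standard guarantees:

def to_signed(value, bits):
    # Interpret the low `bits` bits of value as a two's-complement integer.
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value

LLONG_MAX = 2**63 - 1
sum64  = to_signed(LLONG_MAX + 1, 64)  # the long long addition wraps to -2**63
as_int = to_signed(sum64, 32)          # conversion to int keeps the low 32 bits
print(as_int)                          # 0 -- hence maximum(0, 1) returning 1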
More power to you. What are you doing here on cll then?

This thread is cross-posted to several newsgroups, including
comp.lang.c.
 

Pascal J. Bourguignon

Squeamizh said:
In short, static typing doesn't solve all conceivable problems.

We are all aware that there is no perfect software development process
or tool set. I'm interested in minimizing the number of problems I
run into during development, and the number of bugs that are in the
finished product. My opinion is that statically typed languages are
better at this for large projects, for the reasons I stated in my
previous post.

Our experience is that a garbage collector and native bignums are much
more important to minimize the number of problems we run into during
development and the number of bugs that are in the finished products.
 

RG

Keith Thompson said:
RG said:
This is a red herring.  You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler.  For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}

[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1

Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

In short, static typing doesn't solve all conceivable problems.

More specifically, the claim made above:
in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer.

is false. And it is not necessary to invoke the vagaries of run-time
input to demonstrate that it is false.

But the above maximum() function does exactly that. The program's
behavior happens to be undefined or implementation-defined for reasons
unrelated to the maximum() function.

Depending on the range of type int on the given system, either the
behavior of the addition in foo() is undefined (because it overflows),
or the implicit conversion of the result to int either yields an
implementation-defined result or (in C99) raises an
implementation-defined signal; the latter can lead to undefined
behavior.

Since 9223372036854775807 is 2**63-1, what *typically* happens is that
the addition wraps and the conversion to int yields the value 0, but
the C language doesn't require that
particular result. You then call maximum with arguments 0 and 1, and
it quite correctly returns 1.

This all hinges on what you consider to be "a function maximum(int a,
int b) that ... always work ... [and] never give an invalid
answer." But if you don't consider an incorrect answer (according to
the rules of arithmetic) to be an invalid answer then the claim becomes
vacuous. You could simply ignore the arguments and return 0, and that
would meet the criteria.

If you try to refine this claim so that it is both correct and
non-vacuous you will find that static typing does not do nearly as much
for you as most of its adherents think it does.
This thread is cross-posted to several newsgroups, including
comp.lang.c.

Ah, so it is. My bad.

rg
 

Keith Thompson

RG said:
Keith Thompson said:
RG said:
This is a red herring.  You don't have to invoke run-time input to
demonstrate bugs in a statically typed language that are not caught by
the compiler.  For example:

[ron@mighty:~]$ cat foo.c
#include <stdio.h>

int maximum(int a, int b) {
  return (a > b ? a : b);
}

int foo(int x) { return 9223372036854775807+x; }

int main () {
  printf("%d\n", maximum(foo(1), 1));
  return 0;
}

[ron@mighty:~]$ gcc -Wall foo.c
[ron@mighty:~]$ ./a.out
1

Even simple arithmetic is Turing-complete, so catching all type-related
errors at compile time would entail solving the halting problem.

rg

In short, static typing doesn't solve all conceivable problems.

More specifically, the claim made above:

in C I can have a function maximum(int a, int b) that will always
work. Never blow up, and never give an invalid answer.

is false. And it is not necessary to invoke the vagaries of run-time
input to demonstrate that it is false.

But the above maximum() function does exactly that. The program's
behavior happens to be undefined or implementation-defined for reasons
unrelated to the maximum() function.

Depending on the range of type int on the given system, either the
behavior of the addition in foo() is undefined (because it overflows),
or the implicit conversion of the result to int either yields an
implementation-defined result or (in C99) raises an
implementation-defined signal; the latter can lead to undefined
behavior.

Since 9223372036854775807 is 2**63-1, what *typically* happens is that
the addition wraps and the conversion to int yields the value 0, but
the C language doesn't require that
particular result. You then call maximum with arguments 0 and 1, and
it quite correctly returns 1.

This all hinges on what you consider to be "a function maximum(int a,
int b) that ... always work ... [and] never give an invalid
answer."


int maximum(int a, int b) { return a > b ? a : b; }
But if you don't consider an incorrect answer (according to
the rules of arithmetic) to be an invalid answer then the claim becomes
vacuous. You could simply ignore the arguments and return 0, and that
would meet the criteria.

I don't believe it's possible in any language to write a maximum()
function that returns a correct result *when given incorrect argument
values*.

The program (assuming a typical implementation) calls maximum() with
arguments 0 and 1. maximum() returns 1. It works. The problem
is elsewhere in the program.

(And on a hypothetical system with INT_MAX >= 9223372036854775808,
the program's entire behavior is well defined and mathematically
correct. C requires INT_MAX >= 32767; it can be as large as the
implementation chooses. In practice, the largest value I've ever
seen for INT_MAX is 9223372036854775807.)
If you try to refine this claim so that it is both correct and
non-vacuous you will find that static typing does not do nearly as much
for you as most of its adherents think it does.

Speaking only for myself, I've never claimed that static typing solves
all conceivable problems. My point is only about this specific example
of a maximum() function.

[...]
 
