# strange warning

Discussion in 'C Programming' started by Bill Cunningham, May 10, 2014.

1. ### Ben BacarisseGuest

Yes, it should. I edited the original with brain in neutral. Thanks.

Ben Bacarisse, May 14, 2014

2. ### Malcolm McLeanGuest

It's like the terms "bit" and "real", as in "real number".
Real numbers are represented in the computer by bits, so in a sense you
could say that reals are a subset of the bits. But that's only when you're
thinking in pure memory layout terms. Reals aren't a subset of bits
in any higher level or mathematical sense. And in fact a real on a
digital computer isn't really real. This bit me yesterday. (A line
was represented by y1 = y2 - epsilon. So of course it wasn't horizontal.
But there was no ray y = (y1 + y2)/2 which would intersect it.
You could do it like this.

void buffer[1024];

buff[0] = 0; /* illegal, it's a buffer, not an array */

int mclean_array[] = buffer; /* we've got a McLean array, it's
empty and of size 0, it uses the memory space buffer */

mclean_array[0] = 0; /* illegal, the mclean array has length 0 */

mclean_array.push_back(0); /* legal, as in C++ */
mclean_array[0] = 42; /* legal now */

char mclean_array2[] = buffer + mclean_array.memsize(); /*create another
mclean_array, in the buffer, after the first array, this time holding
chars */

mclean_array2.push_back('x'); /* put some data into it */
mclean_array.push_back(123); /* legal, but we've overwritten mclean_array 2*/

printf("%c", mclean_array2[0]); /* accepted, technically undefined behaviour,
we've set up two arrays over the same space in the same buffer, but it's
C and presumably we know what we're doing, prints a garbage character */

printf("%d", mclean_array[1]); /* defined and legal, must print 123 */

Malcolm McLean, May 14, 2014

3. ### Ian CollinsGuest

I was going to suggest you shift to C++ in response to one of your
earlier posts, but seeing this it looks like C++'s std::vector and

Ian Collins, May 14, 2014
4. ### Malcolm McLeanGuest

Basically yes.
The idea that an array is a data-representation level object, a buffer
an implementation-level object, isn't one that I invented, and there
are plenty of languages which offer arrays but discourage treating them
as buffers.

Malcolm McLean, May 14, 2014
5. ### Malcolm McLeanGuest

All programming languages are fundamentally equivalent.
So in a high-level language, say Matlab, I can declare a matrix
of 64K real values, then treat each real as a byte value 0-255.
Then I can write Z80 instructions, in Matlab code, to move
Zx81 ROM from the web, and write a little Basic program,

So I'm using the Matlab matrix as an array, in one context, as
a buffer in another, context. To my Matlab Z80 simulator, it's
an array, even if I set all the values to random on startup, they
have to have defined values, the Z80 instructions have defined
effects on them. But to the Space Invaders, it's a buffer, we
can peek and poke bytes to move the Space Invaders about, it
doesn't inherently represent or mean anything until we
give it a meaning.

Malcolm McLean, May 14, 2014
6. ### Malcolm McLeanGuest

It's a basic fundamental theorem of computer science. Interestingly, you don't
have to know it to be a perfectly competent programmer.
Matlab provides a C interface, so you can implement Matlab functions in
C, and of course matrices are arrays of doubles, contiguous in memory.
But Matlab itself has no address of operator. You could rewrite the matrices
to be column-major, or to use strides, and you wouldn't break any Matlab
code.
A Matlab matrix is, in my terminology, an "array" rather than a buffer.
You can't create one with uninitialised values, at least not easily - there
might be a hack that allows you to defeat the system in Matlab code
but I'm not aware of it. So typically Matlab programmers create matrices
initialised to zeroes, or ones, or the identity matrix, whatever makes
most sense in context.But all programming languages are fundamentally equivalent. Though
Matlab sees the matrix as defined object, holding data, not as a buffer,
we can treat it as a buffer again in higher level code. So we can write a
Z80 simulator, in Matlab, run the ZX81 ROM on it, which includes a
Basic interpreter, and implement peek and poke. Now in terms of the
Z80 simulator, we've got a buffer, not an array.

Malcolm McLean, May 14, 2014
7. ### James KuyperGuest

But no such array object exists yet. I consider this a flaw in the
standard's definition of pointer arithmetic, not in your example code.

James Kuyper, May 14, 2014
8. ### Ben BacarisseGuest

No, not as stated. As most people use the term "programming language",
it is, in effect, a definition not a theorem.

If I present you with a new notation for expressing rote sequences of
computational steps (let's call it BB/1) and I ask you "is this a
programming language?" you will want to determine two things about it.
First, is it Turing complete? Can it express every computation that a
Turing machine can perform? If it isn't, you won't call it a
programming language. The second thing you'd do is determine if it is
Turing equivalent, i.e. that there is no computation it can describe
that can't be computed by a Turning machine. If it passes both tests,
BB/1 will be declared to be a programming language.

Maybe you have some other definition of programming language, in which
case it might well be a theorem, but it's not one I've come across.

There are various fundamental theorems about what minimal constructs are
needed for a notation to be Turing complete, and it is widely believed
that any practical Turing complete notation would be Turing equivalent
(that's the Church-Turing thesis) but that's not a theorem.

<snip>

Ben Bacarisse, May 14, 2014
9. ### Malcolm McLeanGuest

Most English speakers can use the term "programming language" perfectly
competently, and they mean a notation that is used to give instructions
to computers.
They might well know that Fortran is designed for numerical programming,
whilst Snobol is designed for non-numerical problems. Then won't know,
unless they've done a smidgen of computer science, that any Snobol
program can be rewritten as a Fortran program, and vice versa. It's not
obvious that this is the case, and as a practical proposition, often it isn't
possible.

Maths is tautology, ultimately. I'm not a mathematician, and I might lack a
deep insight into the nature of mathematical proof and what a theorem
is. But, like most native English speakers, I can use the term "theorem"
competently enough for general use.

Malcolm McLean, May 14, 2014
10. ### Ben BacarisseGuest

Using that definition what you stated as a theorem is not a theorem.
There are many notations for giving instructions to computers which are
not Turing complete (many macro processors (C's for example), regular
expressions, SQL...). You will, maybe, counter by saying "but most
people don't consider those as programming languages" which was exactly
my point. Or maybe you will complain that these don't "run" on
hardware, but that means that, say, Python is not a programming
language.
Yes, in that narrow sense I was wrong: the claim that the members of the set
of even numbers are all divisible by two is a theorem.
Does the "all programming languages are equivalent" theorem have a name?

Ben Bacarisse, May 14, 2014
11. ### Malcolm McLeanGuest

And they managed to work Turing completeness into C++ template notation.
But most people wouldn't consider the subset of C++ that consists of the
template system as a "programming language", except in a geeky conversation.

Things like SQL and regular expressions are marginal cases, also html, a
lot of people use the term "html programming". So in that sense, the claim
that "all programming languages are equivalent" was wrong. It extends to the
core members of the set, and many of the marginal members, but not all
marginal members. That's commonly the case with natural language. Terms
denote what computer scientists call "fuzzy sets".
It follows very simply from Turing equivalence. But that wasn't quite what I
meant when I first said "all programming languages are equivalent". I meant
that all popular languages in current use are based on a procedural paradigm
that allows for loops, integer indices, and so on. All (existing, practical, in
common use) programming languages are fundamentally equivalent in ways
that go deeper than Turing equivalency.
What that means is that the ideas "array" and "buffer" are not tied to C, they
are language independent. In a straightforwards way for the vast majority
of languages, and kind of for every language, So "all" is the appropriate word.

Then we have
That's a kind of misunderstanding. Structures are essentially independent of the
implementation details. And it's a deep sort of misunderstanding.

Malcolm McLean, May 14, 2014
12. ### Ben BacarisseGuest

I was not commenting on that. I was commenting on what you said was a
fundamental theorem of computer science. What is it's conventional
name? I'd like to know which theorem you were thinking of and I can't
work that out from what you've said about it.

[At this point the quoting goes wrong and you attribute to me things that
other people said so I'll just snip the rest.]

Ben Bacarisse, May 15, 2014
13. ### Malcolm McLeanGuest

The conventional name is Turing equivalence. This means that any computing
device many emulate any other, given only unlimited memory or storage
space. It follows from that that, given a few conditions, any computer
language may emulate any other. So all languages have a fundamental
similarity.

Now you might object that, if we implement a simple Turing machine with
a pencil, rubber, read head, and infinite paper tape, we can write a
C compiler for it, but we can't possibly provide O(constant) time
are at root the same thing. The real observation is that most if not
all rational, practical, in-use etc programming languages on real,
existing devices will operate in much the same way. That follows
from the fundamental theorem, but it's not actually imposed by it.
Everyone who implements an array is going to calculate a memory
index then index into it, for example. The array may be broken up,
it may have strides it it, it may have a layer of indirection under
it via a hash table. But it's still at bottom going to work in the
same way.

If you're going to argue, you have to understand natural language.
You speak English fluently, probably as first language, so you
do have an intuitive understanding of language which is good enough
to discuss technical points. Your understanding is not good
enough for you to participate competently in a philosophical debate.
You try to nitpick on the word theorem. "Theorem" means, "something
which is known form pure logical deduction, as opposed to know
from observation", in normal usage. It might have a technical
mathematical definition that I an the normal person is unaware
of, and you can point that out. But you have to be careful that
you're not just being a clever dick.
Similarly "programming language" is a class which has core members
and marginal members. The word "all" doesn't necessarily cover
the marginal members of the set, it also excludes some special
case ("All Britons are subjects of the Queen" - well no, not the
Queen herself, it doesn't mean that the claim is in any normal
sense false).

Now I don't believe you don't know I'm talking about Turing
equivalence. But what the main thrust of what I was saying
wasn't a claim about mathematical equivalence. You have to
understand that the mathematical equivalence is there before
you can really appreciate what I am saying, however.

Malcolm McLean, May 15, 2014
14. ### Ian CollinsGuest

Nice!

Ian Collins, May 15, 2014
15. ### Malcolm McLeanGuest

"As in elephant, so in E. coli". All biochemistry is fundamentally "the same",
because all organisms are facing essentially the same problem, and they
have essentially the same toolset to deal with it.
It's a similar situation with programming languages. They are bound by
logical laws rather than physical laws, which gives a limited set of
fundamental operations. Then, just as all organisms evolved from a common
ancestor, all processing chips were built by engineers from the same culture,
using each other's designs. So the "toolkit" is the same.

Malcolm McLean, May 15, 2014
16. ### Keith ThompsonGuest

Turing equivalence is not a theorem, it's a characteristic that a given
language may or may not have. There might be a theorem that says that a
given language is Turing equivalent, but I know of no actual theorem
that says that all languages are Turing equivalent (and in fact there
could be no such theorem, since not all languages *are* Turing
equivalent).

I think what you're talking about might be a definition, not a theorem.

[...]

Keith Thompson, May 15, 2014
17. ### James KuyperGuest

It's not what the standard defines that's a problem: it's what the
standard fails to define. It's not a failure to match my model, it's a
failure by the standard to specify what happens in certain cases. It's
will happen in those cases, and as far as I know, every compiler I've
ever used has conformed to those expectations.
I would have it modified so that it makes sense even when referring to
pointers that, according to the standard, point at objects that have
neither a declared type nor an effective type, and which therefore
cannot, in particular, have an array type. The simplest approach would
be to specify that there's always an underlying array of char type, and
to define arithmetic on T* objects in terms of movement though that
underlying array in steps of sizeof(T).
It isn't defined at all, formally or not. Adding such a definition would
resolve the issue I'm talking about, but it has some unfortunate side
effects.

Currently, if you write to a block of dynamically allocated memory using
an lvalue of float type, it acquires the effective type of 'float'. If
you subsequently try to read that memory using an lvalue of 'int' type,
the behavior is undefined (6.5p7). This rule enables optimizations that
are based upon the assumption that, for example, an lvalue of float type
never aliases an lvalue of int type. As a result, a function that takes
a float* argument and an int* argument can read a float value using the
first pointer, and then write things through the second pointer, and
never have to worry about having to reload the float value. The int*
might alias the same memory location as the float*, so that such writes
might change the value represented when the same memory is read as a
float. However, the compiler is relieved of the responsibility of
necessary, because 6.5p7 says that it won't be necessary.

Unions can render such optimizations invalid, but only when the union
type is in scope, and only for pairs of types both involved in the same
union. Your approach would have the union always in scope, and since
it's a universal union, it would affect all possible pairs of object types.

James Kuyper, May 15, 2014
18. ### Malcolm McLeanGuest

You're making the same mistake as Ben.

Of course Turing equivalence is a "theorem". Just as the theorem that
any two squares are identical part from translation, rotation, and scaling.
Ok, but what if we're not talking about the Euclidean plane? What about
Trafalgar square - that's a square, isn't it?

It becomes tautological. If a computer isn't Turing complete, or if it is
capable of determining a non-computable function (e.g. if it can use
natural language), then it's not a "computer". It doesn't mean that
Turing was saying nothing.

Malcolm McLean, May 15, 2014
19. ### Kenny McCormackGuest

I don't think you are allowed to say that Kiki is wrong.

Not in this newsgroup.

--
Republicans/Conservatives/Independents/Liberatarians/whatevers)
don't hate big government. They *love* big government as a means for
them to get rich, sucking off the public teat. What they don't like is
*democracy* - little people actually having the right to vote and stuff
like that.

Kenny McCormack, May 15, 2014
20. ### Keith ThompsonGuest

Can you give a brief statement of this "theorem"?

A given language or system either is or is not Turing equivalent.
A theorem might say that some system or set of systems *are*
Turing equivalent.

Using your analogy, "identical to a square apart from translation,
rotation, and scaling" is not a theorem. "Some particular entity is
identical ..." or "some set of entities are identical ..." might be a
theorem.

You're just misusing the word "theorem", that's all.

Keith Thompson, May 15, 2014