Array indexing

Malcolm

Keith Thompson said:
Malcolm said:
Keith Thompson said:
To overcome this,
is an implementation allowed to cast array indexes to long long?

(This, of course, is why the "best" type for array indexing is
generally size_t, or a signed variant of size_t: size_t can [and
should] be "unsigned long long" on implementations with memory
sizes exceeding UINT_MAX.)

If there is a problem using an integer to index an array, in this
case that array sizes can go beyond the range, then really the
implementation needs to be fixed, or the standard.

You said "integer". Did you mean "int"?
Due to this size_t nonsense integers are no longer ints, which is
the heart of the problem.

I don't see that it's a problem.

If you want a language in which "int" is the only integer type,
C is not that language, and I don't believe it ever has been.
The earliest C reference manual I can find, which predates K&R1,
has both char (8 bits) and int (16 bits).
char is for character data. Admittedly it has been overloaded to represent
bytes for where you need direct control of bits in memory, which was a
mistake.
short and long are concessions to machine efficiency, reluctantly allowed and
only for special use. Rarely do you need integers of a special length.
size_t is an integer designed by a committee, an interloper which has the
capacity to severely damage our language.
Once again, the term "integer" refers to a number of distinct types:
(signed|unsigned|plain) char, (signed|unsigned) (short|int|long|long
long),
and zero or more extended types. size_t is merely an alias for one of
them. That's just not going to change. If you dislike it, that's
certainly your right (though I'm at a loss to understand why you think
it's a problem), but it's how the language works.
Integer means either int, which is an abbreviation, or a number which is a
whole number of units, or a number which for physical reasons can only be a
whole number of units. int is the normal integer type; the other types are
for special purposes.
Or was, until it was decided to use size_t as an array index.
Perhaps you'd prefer B or BCPL?

You're free to use a different language that meets your needs if you
can find one, or design and implement one, or get someone else to do
so, if you're technically and/or financially able. The result is
likely to be something that *I* wouldn't want to use, but that's
perfectly fine.
If you've got problems knocking up a little language as an experiment I
recommend my book "MiniBasic, how to write a script interpreter". MiniBasic
simply has numbers, implemented as doubles, and strings. However it isn't
meant to be efficient.
But if you're going to post here, I suggest that keeping the
distinction between "int" and "integer" clear is going to make
communication much easier.
Language isn't like that.
KISS. Keep it simple, stupid. We don't need this distinction, so let's get
rid of size_t and start a campaign for 64-bit ints.
 
Keith Thompson

Malcolm said:
char is for character data. Admittedly it has been overloaded to represent
bytes for where you need direct control of bits in memory, which was a
mistake.

I agree that tying "char" and "byte" together was a mistake. I would
have preferred to keep "char" for character data only, and introduce
a distinct integer type called "byte" (or maybe something more generic
like "storage_unit").

Yes, I'm talking about adding yet another integer type. But it's too
late anyway; we're stuck with char==byte.

[...]
Integer means either int, which is an abbreviation, or a number which is a
whole number of units, or a number which for physical reasons can only be a
whole number of units. int is the normal integer type; the other types are
for special purposes.

Integer, in C, means any of a specified set of types; see my previous
article or the standard for details.

I simply disagree with your assertion that int is "the normal" integer
type. It is one of several.

Or was, until it was decided to use size_t as an array index.

size_t was added in the first C standard (ANSI, 1989). It's a bit too
late to change it now.

[...]
If you've got problems knocking up a little language as an experiment I
recommend my book "MiniBasic, how to write a script interpreter". MiniBasic
simply has numbers, implemented as doubles, and strings. However it isn't
meant to be efficient.

Um, it's not my problem. I was suggesting what *you* might want to do
if you feel the need for a language that meets your stated
requirements.

[...]
Language isn't like that.

Clearly, the C language *is* like that.
KISS. Keep it simple, stupid. We don't need this distinction, so let's get
rid of size_t and start a campaign for 64-bit ints.

No thanks.

The bottom line is this: I think I understand what you're saying.
You're entitled to your own preferences; I don't share them. I
disagree with almost everything you're saying. It happens that the C
language, as it's currently defined, meets my own preferences much
more closely than yours. Realistically, I believe there is little or
no chance that the language will be changed to meet your preferences.

I'm not necessarily saying that I'm right and you're wrong, just that
you're not likely to get any support for your positions, and if you
want to pursue something that does meet your requirements, standard C
isn't likely to be, or to become, what you're looking for.
 
Keith Thompson

Malcolm said:
Because on normal machines there simply isn't enough memory to overflow the
range of a signed integer, so this isn't a problem.

Perhaps by your own definition of "normal".

I have two machines sitting in my office, each of which has 32-bit int
(in the compiler that came with the operating system) and 4 gigabytes
of memory. A malloc with an int argument could only support 2
gigabyte allocations.

I don't know whether either system will actually allocate more than 2
gigabytes at a time, but it's certainly plausible that it could.
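
To make the 2-gigabyte figure concrete, here is a minimal sketch, assuming a C99 compiler. The helper name fits_in_int is invented for illustration and does not come from the thread; it reports whether a byte count could be passed through an int parameter without loss.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): returns 1 if a byte count
   n could be handed to a function that takes int, 0 if it would not fit. */
static int fits_in_int(size_t n)
{
    /* Compare in uintmax_t so that neither operand is truncated. */
    return (uintmax_t)n <= (uintmax_t)INT_MAX;
}
```

On an implementation with 32-bit int and a size_t of at least 32 bits, a request of 3 gigabytes would make fits_in_int return 0, even though a size_t parameter could carry the value unchanged.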

Making int 64 bits would leave a gap in the type system. Currently,
char is 8 bits, short is 16 bits, and int is 32 bits; if int were 64
bits, it wouldn't be possible to have both a 16-bit and a 32-bit
integer type. (Unless you added a C99-style extended integer type,
but I presume that wouldn't be to your liking.)
 
Old Wolf

Malcolm said:
When was the last time you had a negative amount of money in your pocket?
However you can have a negative amount in your account. Intermediate
calculations of the sizes of memory objects may give negative results.

So what if they do? The way that unsigned arithmetic is defined
in C, the final positive result will be correct even if some of the
intermediate results are represented as large positive numbers.

size_t s = 3u - 5 + 5;
printf("%u\n", (unsigned int)s);

will give you 3.
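
The same point can be sketched as a small function, assuming nothing beyond standard C. The name sub_then_add is made up for illustration; it does the whole computation in size_t rather than in unsigned int as the snippet above does.

```c
#include <stddef.h>

/* Computes a - b + b entirely in size_t. If b > a, the intermediate
   a - b wraps modulo SIZE_MAX + 1, and adding b wraps back to a. */
static size_t sub_then_add(size_t a, size_t b)
{
    size_t tmp = a - b;  /* may be a very large positive value */
    return tmp + b;      /* equals a again */
}
```

The final result is exact, but the wrapped intermediate is not meaningful on its own: comparing tmp against zero, for example, would never detect that b was larger than a.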
It is another type swilling about. The fact that it probably has the same
number of bits as another type is neither here nor there.

No, it is an alias for an existing type. If it happens to be an alias
for unsigned long then the following code will work; you might
like to try it:

#include <stddef.h>

void foo(size_t *p) { (void)p; }
int main(void)
{ unsigned long bar; foo(&bar); return 0; }

If they were different types with the same representation then
this code would always fail (try it with char and signed char).
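
For readers who would rather ask the compiler directly, here is a rough sketch assuming a C11 compiler (for _Generic). The macro and function names are invented for illustration. The first function reports which standard unsigned type size_t aliases on the implementation at hand; the second relies on char, signed char, and unsigned char being three distinct types, since a _Generic list may not name the same type twice.

```c
#include <stddef.h>

/* Hypothetical macro: names the standard unsigned type of an expression. */
#define UNSIGNED_TYPE_NAME(x) _Generic((x), \
    unsigned char:      "unsigned char", \
    unsigned short:     "unsigned short", \
    unsigned int:       "unsigned int", \
    unsigned long:      "unsigned long", \
    unsigned long long: "unsigned long long", \
    default:            "other")

/* Which existing type is size_t an alias for on this implementation? */
static const char *size_t_alias(void)
{
    return UNSIGNED_TYPE_NAME((size_t)0);
}

/* This only compiles because the three character types are distinct;
   plain char selects the first association. */
static int plain_char_kind(void)
{
    return _Generic((char)0, char: 1, signed char: 2, unsigned char: 3);
}
```

If size_t were a separate type that merely shared a representation with unsigned long, size_t_alias would fall through to "other" instead of naming one of the listed types.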
Now Bloggs writes

setpixels(size_t *x, size_t *y, size_t N)

because size_t is the right thing for an index, right?

Muggins writes
int *xvals;
int *yvals;
int N;

Because his pixel co-ordinates are integers.
Now he calls Bloggs's setpixels() routine.

Muggins was foolish for not writing his code to be compatible
with the API he was about to use!
Oh dear. The code becomes a mass of castings and type
conversions.

If it does, then Muggins is a very poor coder. The code would
have to declare values of size_t in order to pass them to
setpixels(), and then copy values out of xvals and yvals
before the call and copy them back afterwards. Which begs
the question: why not just declare the values size_t in the first
place?
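
The copying described above might look roughly like this. It is only a sketch under stated assumptions: the thread gives nothing but the signature of setpixels, so the body here is a stand-in that counts its points, its size_t return type is assumed, and call_with_int_coords is an invented name. The copy back afterwards is left out because the stand-in never modifies its inputs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for Bloggs's routine: the real body is unknown,
   so this one just reports how many points it was handed. */
static size_t setpixels(size_t *x, size_t *y, size_t N)
{
    (void)x;
    (void)y;
    return N;
}

/* Muggins's side: int coordinates are copied into size_t arrays first.
   Returns the count from setpixels, or -1 on bad input or allocation
   failure. */
static long call_with_int_coords(const int *xvals, const int *yvals, int N)
{
    if (N <= 0)
        return -1;
    size_t n = (size_t)N;
    size_t *x = malloc(n * sizeof *x);
    size_t *y = malloc(n * sizeof *y);
    long result = -1;
    if (x != NULL && y != NULL) {
        size_t i;
        for (i = 0; i < n; i++) {
            if (xvals[i] < 0 || yvals[i] < 0)
                break;  /* a negative value has no size_t equivalent */
            x[i] = (size_t)xvals[i];
            y[i] = (size_t)yvals[i];
        }
        if (i == n)
            result = (long)setpixels(x, y, n);
    }
    free(x);
    free(y);
    return result;
}
```

Declaring the coordinates as size_t from the start would remove the allocation, the loop, and the sign check altogether.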
It's the two standards problem. We need
one standard way of representing integers in the machine.
Maybe you don't have the intellectual capacity to
appreciate the strength of the point being made.

Maybe you don't have the intellectual capacity to hold
more than one integer type in your brain at once? The
rest of us find that sometimes we want integers that use
less storage than long long.
 
Keith Thompson

Malcolm said:
It's the two standards problem. We need one standard way of
representing integers in the machine.

No, we really don't. (And I think that's the core of our disagreement.)
 
santosh

Malcolm wrote:
It's the two standards problem. We need
one standard way of representing integers in the machine.
<snip>

No. The basic purpose of high level languages is to provide various
abstractions, (both of data and code), to make programming portable and
easier.

Maybe assembly language would suit you?
 
Frederick Gotham

Malcolm:
No. It is ugly.


Have you considered "underscore t" counselling?

When code looks like line
noise it becomes unreadable, and then it gets hard to maintain, and you
get bugs.


Geez you'd want a good programmer on the job so.

As happened to Perl and C++. size-t is admittedly only a small step in
that direction, but it's a step in the wrong direction.


"hyphen t"? Is that new?

However you can have a negative amount in your account.


Depends which bank you ask. However, yes, it isn't too far-fetched to have
a negative bank balance.

Intermediate calculations of the sizes of memory objects may give
negative results.


Yes, something like:

char buf[2];

ptrdiff_t i = buf - (buf+1);
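
Wrapped in a function, the fragment above can be compiled as it stands. This is just a sketch; the function name is invented for illustration.

```c
#include <stddef.h>

/* Distance from the second element of buf back to the first. The
   result is a negative element count, which is why pointer
   subtraction yields the signed type ptrdiff_t rather than size_t. */
static ptrdiff_t backward_distance(void)
{
    char buf[2];
    return buf - (buf + 1);  /* -1 */
}
```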

Yes it does. C frequently has to call non-C code, or be called by it.

Relevance?


Yes I would. The fact that char and byte are the same thing in C is a
major headache to anyone who has had to use a non-English language.
However that is something we are stuck with. size_t is a newcomer which
can still, just, be suppressed if we are determined enough to squeeze
it out of our nice C code.


And use what instead? size-t?

Now Bloggs writes

setpixels(size_t *x, size_t *y, size_t N)

because size_t is the right thing for an index, right?

Muggins writes
int *xvals;
int *yvals;
int N;


Bad Muggins! Go stand in the corner!

Because his pixel co-ordinates are integers.
Now he calls Bloggs's setpixels() routine. Oh dear. The code becomes a
mass of castings and type conversions. It's the two standards problem.
We need one standard way of representing integers in the machine.


Either:

(1) Don't use "size_t" for the function parameters.
(2) Slap Muggins for using "int".

Standard:
malloc() takes a numerical type as its argument which must be an
integer.

Normal application

void *malloc(int N)
{
assert(N >= 0);
}

Because on normal machines there simply isn't enough memory to overflow
the range of a signed integer, so this isn't a problem.


(1) There's plenty of machines that have lots and lots and lots and lots
and lots and lots and lots and lots and lots of memory. Ask the chaps that
run Google.

Also, why the hell would you want to use a signed integer type when the
number should always be positive?

I realise that some programmers are like old dogs -- i.e. there isn't a
snowball's chance in hell of getting them to change their ways -- but the
use of "int" willy-nilly wherever an integer type is needed is poor style
to me.

void *malloc(double N)
{
assert(N == floor(N));
}
Don't be arrogant. Maybe you don't have the intellectual capacity to
appreciate the strength of the point being made.


If we're going to use signed integer types to specify memory sizes, we may
as well start using "double" for it too.
 
