32 or 64 bit processor info in C

Richard Bos

Malcolm McLean said:
I want a type called "int" and signed that is the size of the address bus.

Why on earth? I want a type called int which is signed. I have this,
because the Standard requires it. If I want this type to be the size of
any bus at all (and usually I don't give a hoot), I want it to be the
size of the _data_ bus, not of the address bus.
To be fernickety about it, it really needs an extra bit, but in practice if
you need over half the address space for a single array then you can code
access to it specially.

If you need an array that size, you use the feature which the language
has given you for dealing with it: pointers and size_t.
That has been the convention until now.

No, it has not.
The proposal is to change it, to make int no longer the natural integer
size for the machine.

No, it is not.

It is usual, but not required, to make int the natural integer for a
machine - and this has always meant the size of the _data_ bus. The
address bus has never been relevant to the size of int.

There have been many machines on which int _could not_ be the natural
size integer for that machine, because it had an 8-bit chip. There have
been many machines on which int was 16 bits but the address bus 32, and
TTBOMK even /vice versa/. There have also been machines on which the
data and address bus were the same size, but anyone who relied on this
attribute was a fool. There are even machines on which there are
_several_ data and several address buses, and I wouldn't be surprised at
all at a machine on which the data buses themselves were of different
sizes, and ditto for the address buses.
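
To see how loosely these widths are tied together on a given implementation, a minimal sketch along the following lines can report them. The helper names are made up for illustration, and the figures are storage widths chosen by the compiler, which say nothing certain about the underlying buses.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helpers, not from the thread: storage width in bits
   (including any padding bits) of each type on this implementation. */
size_t int_bits(void)     { return sizeof(int) * CHAR_BIT; }
size_t size_t_bits(void)  { return sizeof(size_t) * CHAR_BIT; }
size_t pointer_bits(void) { return sizeof(void *) * CHAR_BIT; }
```

On many current 64-bit systems these would be expected to differ (int narrower than size_t and void *), but that is an observation about common platforms, not something the Standard promises.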
The problem with size_t was that

....amateur BASIC programmers can't get their heads around it.
Virtually every integer, pace Flash Gordon, is used as an array
index or in intermediate calculations to derive indices.

This is complete bollocks.
But since size_t is unsigned, awkward to read, and most numbers are
much less than 2 billion, people won't use it consistently.

You mean _you_ won't use it consistently. Well, excuse me, but that's
your problem, not mine.

Richard
 
Malcolm McLean

Richard Bos said:
Why on earth? I want a type called int which is signed. I have this,
because the Standard requires it. If I want this type to be the size of
any bus at all (and usually I don't give a hoot), I want it to be the
size of the _data_ bus, not of the address bus.
Then it can index all of memory. I like to use integers to index into
arrays.
If an int isn't guaranteed to be able to index an arbitrary array, that
creates difficulties.
This is complete bollocks.
This is turning into a good litmus test to sort out those who understand
from those who don't.
You mean _you_ won't use it consistently. Well, excuse me, but that's
your problem, not mine.
If int was deprecated so that everyone used size_t all the time, that
wouldn't be so bad. Effectively we would have 64-bit ints. They would have a
silly name and be unsigned, but it would be tolerable.
The problem is that people won't. They will say "an int is big enough to
index any conceivable array of this type" and use it. So code will no longer
fit together easily.
 
Keith Thompson

Malcolm McLean said:
Then it can index all of memory. I like to use integers to index
into arrays. If an int isn't guaranteed to be able to index an
arbitrary array, that creates difficulties.

You like to use integers to index into arrays. size_t is an integer
type. There's no real problem *as long as you know how to use it*.

You continue to conflate the concepts of "int" and "integer". The
type "int" is one of a set of types which are collectively called
"integer" types. If you're going to argue for a change in the
language, that's fine -- but *please* use the correct terminology.

On the other hand, sometimes it's ok to use int to index into an
array, if, for example, you happen to know that the array cannot have
more than 32768 elements. The indexing operation takes one pointer
operand (often a converted array name) and one integer operand; the
integer operand can be of any integer type, including int, size_t, and
even _Bool if your compiler supports it.
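
As an illustration of the point about operand types, the sketch below reads one array through indices of three different integer types. The array and function names are assumptions made up for the example.

```c
#include <stddef.h>

/* Illustrative only: the same array read through indices of
   three different integer types. */
static const char letters[] = "abcdef";

char index_with_int(int i)             { return letters[i]; }
char index_with_size_t(size_t i)       { return letters[i]; }
char index_with_uchar(unsigned char i) { return letters[i]; }
```

All three are valid as long as the index value is within the bounds of the array; the type of the index only limits how large an array it can reach.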

This thread seems to be about one thing: *you* personally have
problems using distinct integer types for distinct purposes, and the
rest of us don't. I suggest that the solution is not to change the
language to meet your personal requirements; the solution is for you
to learn to use the language as it exists. You might even like it.
 
Malcolm McLean

Keith Thompson said:
On the other hand, sometimes it's ok to use int to index into an
array, if, for example, you happen to know that the array cannot have
more than 32768 elements. The indexing operation takes one pointer
operand (often a converted array name) and one integer operand; the
integer operand can be of any integer type, including int, size_t, and
even _Bool if your compiler supports it.
This is the problem.

int count_spaces(char *str)
{
    int i;
    int answer = 0;

    for(i=0;str[i];i++)
        if(str[i] == ' ')
            answer++;

    return answer;
}

Perfectly unexceptional code. In fact it is unlikely that anyone is going to
want that routine on anything other than a line limited elsewhere to be a
thousand or so characters.
But someone might, just might, pass in a really long string. So those ints
have got to be size_ts to be strictly correct. So size_t is propagating up
out type.
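
For comparison, a sketch of the "strictly correct" variant, with size_t for both the index and the count. The name count_spaces_z is hypothetical, and const has been added to the parameter as an assumption; nothing else changes.

```c
#include <stddef.h>

/* Hypothetical size_t variant of count_spaces: neither the index nor
   the count can overflow for any string that fits in memory. */
size_t count_spaces_z(const char *str)
{
    size_t i;
    size_t answer = 0;

    for (i = 0; str[i] != '\0'; i++)
        if (str[i] == ' ')
            answer++;

    return answer;
}
```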


Now consider this

foo(char *str1, char *str2)
{
hackme = strlen(str1) - count_spaces(str2);
if(hackme < 0)
bar(str1, str2);
else
baz(str12, str2);
}

Again nothing too exceptional about this code. Typical rather horrid flow
control logic. What type would you say hackme should be?
 
Ian Collins

Malcolm said:
Then it can index all of memory. I like to use integers to index into
arrays.
If an int isn't guaranteed to be able to index an arbitrary array, that
creates difficulties.
It (int) never has been guaranteed to be able to index an arbitrary
array. Being a signed type, even if it is the width of the bus, it
could only index half of the address space.
This is turning into a good litmus test to sort out those who understand
from those who don't.
No, it's turned into complete bollocks.
 
CBFalconer

Malcolm said:
.... snip ...

Now consider this

foo(char *str1, char *str2)
{
hackme = strlen(str1) - count_spaces(str2);
if(hackme < 0)
bar(str1, str2);
else
baz(str12, str2);
}

Again nothing too exceptional about this code. Typical rather
horrid flow control logic. What type would you say hackme should
be?

You can simplify, because strlen can never return a value less than
0. It returns a size_t, which is an unsigned quantity. Thus you
can reliably replace the whole routine with:

bar(str1, str2);

and save a minimum of 9 lines of useless source code.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 
Ian Collins

Malcolm said:
On the other hand, sometimes it's ok to use int to index into an
array, if, for example, you happen to know that the array cannot have
more than 32768 elements. The indexing operation takes one pointer
operand (often a converted array name) and one integer operand; the
integer operand can be of any integer type, including int, size_t, and
even _Bool if your compiler supports it.
This is the problem.

int count_spaces(char *str)
{
    int i;
    int answer = 0;

    for(i=0;str[i];i++)
        if(str[i] == ' ')
            answer++;

    return answer;
}

Perfectly unexceptional code. In fact it is unlikely that anyone is
going to want that routine on anything other than a line limited
elsewhere to be a thousand or so characters.
But someone might, just might, pass in a really long string. So those
ints have got to be size_ts to be strictly correct. So size_t is
propagating up out type.

So? Consider a 16 bit machine with a 20 bit address space, should int
on that machine be 21 bits?

Why would a function that counts something return a signed type?

*Please* don't quote signatures.
 
Eric Sosman

Malcolm McLean wrote On 04/17/07 16:25,:
On the other hand, sometimes it's ok to use int to index into an
array, if, for example, you happen to know that the array cannot have
more than 32768 elements. The indexing operation takes one pointer
operand (often a converted array name) and one integer operand; the
integer operand can be of any integer type, including int, size_t, and
even _Bool if your compiler supports it.

This is the problem.

int count_spaces(char *str)
{
    int i;
    int answer = 0;

    for(i=0;str[i];i++)
        if(str[i] == ' ')
            answer++;

    return answer;
}

Perfectly unexceptional code. In fact it is unlikely that anyone is going to
want that routine on anything other than a line limited elsewhere to be a
thousand or so characters.
But someone might, just might, pass in a really long string. So those ints
have got to be size_ts to be strictly correct. So size_t is propagating up
out type.


Now consider this

foo(char *str1, char *str2)
{
hackme = strlen(str1) - count_spaces(str2);
if(hackme < 0)
bar(str1, str2);
else
baz(str12, str2);
}

Again nothing too exceptional about this code. Typical rather horrid flow
control logic. What type would you say hackme should be?


"No type at all."

foo(char *str1, char *str2) {
if (strlen(str1) < count_spaces(str2))
bar(str1, str2);
else
baz(str12 /* sic */, str2);
}
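
Why the comparison form is the safer one can be sketched as follows: with unsigned operands the subtraction wraps around instead of going negative, whereas the direct comparison gives the intended answer. The function names are hypothetical, and both operands are assumed to be size_t for simplicity (in the code above count_spaces returns int, which on typical implementations is converted to size_t for the comparison).

```c
#include <stddef.h>

/* Subtraction in size_t is done modulo SIZE_MAX + 1, so the result
   is never negative, even when spaces > len. */
size_t wrapped_difference(size_t len, size_t spaces)
{
    return len - spaces;
}

/* Comparing the operands directly needs no signed intermediate. */
int len_less_than_spaces(size_t len, size_t spaces)
{
    return len < spaces;
}
```

With len = 2 and spaces = 3, wrapped_difference yields the largest size_t value, so a test of the form "difference < 0" can never succeed, while len_less_than_spaces yields 1.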

.... but this is just your most recent salvo in a spasmodic
series of ill-aimed broadsides. Elsewhere in this thread
you have argued that lack of a 64-bit int dooms C *and* that
"most numbers are less than 2 billion." You have argued for
a 64-bit int *and* desired that it correctly represent all
signed pointer differences (hence, a 63-bit or narrower
pointer). You have argued for a 64-bit int *and* for an
int the same width as "the address bus," hence requiring
a 64-bit address bus to go along with a <=63-bit address.

And all, it seems, because you have an unshakeable
loyalty to the spelling eye enn tee. Loyalty can be a fine
thing at times, but slavishness is another thing altogether.
 
Richard Heathfield

CBFalconer said:
You can simplify, because strlen can never return a value less than
0. It returns a size_t, which is an unsigned quantity. Thus you
can reliably replace the whole routine with:

bar(str1, str2);

and save a minimum of 9 lines of useless source code.

But your "replacement" does not actually have the same functionality as
Malcolm's code. You might want to read it again, more carefully.
 
CBFalconer

Richard said:
CBFalconer said:

But your "replacement" does not actually have the same functionality
as Malcolm's code. You might want to read it again, more carefully.

So it doesn't. Another good idea hits the dust.

 
Malcolm McLean

Ian Collins said:
Malcolm said:
On the other hand, sometimes it's ok to use int to index into an
array, if, for example, you happen to know that the array cannot have
more than 32768 elements. The indexing operation takes one pointer
operand (often a converted array name) and one integer operand; the
integer operand can be of any integer type, including int, size_t, and
even _Bool if your compiler supports it.
This is the problem.

int count_spaces(char *str)
{
    int i;
    int answer = 0;

    for(i=0;str[i];i++)
        if(str[i] == ' ')
            answer++;

    return answer;
}

Perfectly unexceptional code. In fact it is unlikely that anyone is
going to want that routine on anything other than a line limited
elsewhere to be a thousand or so characters.
But someone might, just might, pass in a really long string. So those
ints have got to be size_ts to be strictly correct. So size_t is
propagating up out type.

So? Consider a 16 bit machine with a 20 bit address space, should int
on that machine be 21 bits?

Ideally yes. In practice it is better to have the extra efficiency and put
burdens on those people who want an array spanning over half the address
space. Once an architecture gets weird there is no answer - in practice such
machines had near and far pointers and constrained single arrays to be less
than 64K, allowing 16 bit indexing. It worked, but it was far from ideal,
and threw a hammer in portability.
Why would a function that counts something return a signed type?
Because intermediate calculations using it might go below zero.
 
Ian Collins

Malcolm said:
Malcolm said:
This is the problem.

int count_spaces(char *str)
{
    int i;
    int answer = 0;

    for(i=0;str[i];i++)
        if(str[i] == ' ')
            answer++;

    return answer;
}

So? Consider a 16 bit machine with a 20 bit address space, should int
on that machine be 21 bits?

Ideally yes. In practice it is better to have the extra efficiency and
put burdens on those people who want an array spanning over half the
address space.


So there you have it - a 32 bit int on a typical contemporary 64 bit
machine *does* provide extra efficiency over a 64 bit int.
Because intermediate calculations using it might go below zero.
But does it make any sense?
 
Malcolm McLean

Ian Collins said:
So there you have it - a 32 bit int on a typical contemporary 64 bit
machine *does* provide extra efficiency over a 64 bit int.
Engineering is like that. A bad design decision usually has something going
for it, and there are good arguments we can use in its favour.
I'd like to see some benchmarks on real applications to see whether hitting
the cache capacity has a big effect or a small effect.
 
Malcolm McLean

Eric Sosman said:
Malcolm McLean wrote On 04/17/07 16:25,:

"No type at all."

foo(char *str1, char *str2) {
if (strlen(str1) < count_spaces(str2))
bar(str1, str2);
else
baz(str12 /* sic */, str2);
}
That's acceptable code, but you are reversing my decision to improve
readability by splitting the computation into two statements.
Whilst in this case you might be right, readability is highly important, and
layout should not be dictated by the weaknesses of the type system. Anyway,
we could easily modify the example so that hackme is required twice.
 
Eric Sosman

Malcolm McLean wrote On 04/18/07 15:06,:
That's acceptable code, but you are reversing my decision to improve
readability by splitting the computation into two statements.

You "improve" readability by using many statements
where one suffices? Does your family tree blossom with
lawyers, politicians, marketeers, and spin doctors?
Whilst in this case you might be right, readability is highly important, and
layout should not be dictated by the weaknesses of the type system. Anyway,
we could easily modify the example so that hackme is required twice.

int hackme = strlen(str1) < count_spaces(str2);

(Optional: Use _Bool instead of int for C99.)
 
Richard Heathfield

Eric Sosman said:
Malcolm McLean wrote On 04/18/07 15:06,:

You "improve" readability by using many statements
where one suffices?

This is sometimes the case, yes, although it may depend on what you mean
by "suffices".

Consider, for example, the following (genuine) code:

new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y]);
if(new->pixel[y] == NULL)


Now, you /could/ rewrite this as:

if((new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y])) == NULL)

....which certainly suffices, but I am not convinced that readability is
improved by the contraction.

Does your family tree blossom with
lawyers, politicians, marketeers, and spin doctors?

I think that's a touch unfair. I'm actually on your side in this
discussion, but I think you overstate the case here.
 
William Ahern

Richard Heathfield said:
Now, you /could/ rewrite this as:

if((new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y])) == NULL)

...which certainly suffices, but I am not convinced that readability is
improved by the contraction.

I would agree with you. Which is why I generally swap the operands:

if (0 != (new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y])))
...

Operands are used in order of most importance to thine eyes and head.
Personally, this reads much more easily than multiple statements.
 
Eric Sosman

Richard Heathfield wrote On 04/18/07 15:43,:
Eric Sosman said:

Malcolm McLean wrote On 04/18/07 15:06,:


You "improve" readability by using many statements
where one suffices?


This is sometimes the case, yes, although it may depend on what you mean
by "suffices".

Consider, for example, the following (genuine) code:

new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y]);
if(new->pixel[y] == NULL)


Now, you /could/ rewrite this as:

if((new->pixel[y] = clc_malloc(width * sizeof *new->pixel[y])) == NULL)

...which certainly suffices, but I am not convinced that readability is
improved by the contraction.

Yah. "Bless you, it all depends." It is useful to
decompose lengthy statements into smaller, more easily-
comprehended pieces (not just for the reader's sake, but
for the writer's as well!), but decomposition for its own
sake is, well, compost.

McLean's example showed a perfectly simple, obvious
and easily-comprehended test, which he chose to decompose
into two statements. In the process he introduced a brand-
new variable (the point of his example was to try to expose
a difficulty concerning that unnecessary variable), plus two
additional operators. That sort of rearrangement is not
only obfuscatory (hiding a comparison behind a subtraction,
a la FORTRAN's three-way IF), but bug-inducing.

As for the malloc example, I myself usually write an
assignment statement and a separate test. This is not so
much out of a concern that the whole thing would be too
long, but to direct the focus: "I will now allocate some
memory. (By the way, I'll also check for failure.)" But
sometimes I'll gang the whole thing together, particularly
during an initialization where I'm just going to exit the
program on a failure:

if ( (buff1 = malloc(N1 * sizeof *buff1)) == NULL
  || (buff2 = malloc(N2 * sizeof *buff2)) == NULL
  || (buff3 = malloc(N3 * sizeof *buff3)) == NULL ) {
    perror ("malloc");
    fputs ("No memory; bye-bye!\n", stderr);
    exit (EXIT_FAILURE);
}

I think this is easier to read than three assignments, three
tests, and three error-exits, or than shuffling the test-and-
exit off to a wrapper function -- although I do *that* too,
sometimes. (Note that three assignments followed by one
three-way test and one error-exit is not quite the same: if
malloc() sets errno, the successful allocation of buff3 could
obscure why buff2's allocation failed. malloc() need not set
errno and some do not, but I take the optimistic view and try
to give the poor user all the available diagnoses, even if
they're suspect.)
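
The wrapper-function alternative mentioned above might look roughly like this. The name xmalloc and the exact messages are assumptions chosen for illustration, not code from anyone's post.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical allocate-or-exit wrapper: callers need no failure
   check of their own. A NULL result for a zero-size request is
   passed through, since malloc(0) may legitimately return NULL. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0) {
        perror("malloc");   /* informative only if malloc() set errno */
        fputs("No memory; bye-bye!\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

A call such as buff1 = xmalloc(N1 * sizeof *buff1); then stands alone as a single statement, at the cost of hiding the exit path inside the wrapper.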
I think that's a touch unfair. I'm actually on your side in this
discussion, but I think you overstate the case here.

Tastes vary. Or, "There's no point arguing with Gus."
 
