32 or 64 bit processor info in C

Richard Bos

Malcolm McLean said:
I know. Most systems are not conforming. Lack of 64 bit ints also has the
potential to wreck our beloved C language. Soon it will be necessary to use
a gibberish type every time you need an integer, and cast to another
gibberish type to pass to a library function.

My dear boy, you're welcome to leave this rapidly sinking ship if you
fear for your BASIC wetsuit. Just don't complain about the timber we're
using, and stop being frightened by the rigging.

Richard
 
Eric Sosman

Malcolm McLean wrote on 04/12/07 03:00:
You've made this post several times over the past few months. I don't know
what size most of the types are on the machines I work on.

Then why did you write on April 10

"Lack of 64 bit ints also has the
potential to wreck our beloved C language."

? Or to put it another way: If you don't know the sizes of
the types you use, how do you know you lack a 64-bit type?
"Who told thee that thou wast naked?"
 
Flash Gordon

Malcolm McLean wrote, On 12/04/07 08:07:
The convention is that int is not a fixed-size type but represents the
natural integer size for the machine. It is also a clause in one of the
standards.

Most "64 bit" processors are just as happy working with 32 bit integers
as with 64 bit integers. I don't know about x86 based processors, but if
Intel/AMD have ripped off ideas that have been used on DSPs for over 10
years they will even be able to do some stuff faster if using 32 bit
integers, since they will use the two halves of the ALU for two
independent operations!
That is creeping in. It is part of where the size_t nonsense leads you.

A lot of people do not consider it to be nonsense.
People should be educated to realise that it doesn't matter how many
bits an integer has as long as it is big enough to hold the biggest
number you need.

True, and if you want to hold lots of numbers that fit into 32 bits you
might not want to waste the space and memory bandwidth that using 64 bit
integers would introduce.
> Since most integers are used in array indexes, i.e. to
count things in the computer's memory, that tells you what the natural
integer size is on any particular machine.

You've claimed that before and failed to support it. In over 20 years of
professional programming most of my use of integers has not had anything
to do with array indexes. On the stuff I currently work on we use
integers for money, for example, and I've never needed to use cost as an
index into an array!
A gibberish type is one that is adapted to the needs of compilers and/or
standards bodies rather than the human programmers using the system.

Apart from a lot of people on the standards body being humans, a lot of
other people seem to think that size_t suits their purposes quite
nicely. It even has the advantage of documenting the fact you are
working with a size!
Read any Windows code to see where going down this path leads.

Well, it has not yet managed to destroy the language, although it was
only standardised in 1989. Perhaps we should wait until the hundredth
anniversary of the standard in 2989.
 
Malcolm McLean

Eric Sosman said:
Malcolm McLean wrote on 04/12/07 03:00:

Then why did you write on April 10

"Lack of 64 bit ints also has the
potential to wreck our beloved C language."

? Or to put it another way: If you don't know the sizes of
the types you use, how do you know you lack a 64-bit type?
"Who told thee that thou wast naked?"
I don't. I know without checking that int will be 32 bits on my Beowulf
machine, but I've never actually bothered to verify this. What long and
long long are, I don't know.
I use integers mainly to count things, and mainly to count items held in
arrays within the computer's memory. Whilst my Beowulf has many gigabytes
installed, it is only available in 2GB chunks on each node. So a 32 bit
integer can count every plain char stored therein.
This is typical. For a brief moment we have the situation where int is the
only type you need, except for the exceptional case where space is at a
premium, or the other case where for some reason you need numbers over 2
billion. However it won't last. 2GB costs only about 200 dollars, and soon
machines with 40 or 50 GB will be commonplace.
You could argue that all that memory won't be used for arrays. It is true
that to fill 1% of a 1GB machine with 32 bit ints, you need 250,000 data
points. That probably represents a substantial research investment, and you
won't have many such arrays. I've got sympathy with the argument that in
reality we almost always need low integers, and 32 bits is faster. Nothing
is cut and dried. But the simplicity of the 64 bit int approach recommends
itself. Time is ticking. If we don't get 64 bit ints now, it will be too
late, and the language will be size-underscored-ted up.
 
Malcolm McLean

Flash Gordon said:
Malcolm McLean wrote, On 12/04/07 08:07:

Most "64 bit" processors are just as happy working with 32 bit integers as
with 64 bit integers. I don't know about x86 based processors, but if
Intel/AMD have ripped off ideas that have been used on DSPs for over 10
years they will even be able to do some stuff faster if using 32 bit
integers, since they will use the two halves of the ALU for two
independent operations!


A lot of people do not consider it to be nonsense.


True, and if you want to hold lots of numbers that fit into 32 bits you
might not want to waste the space and memory bandwidth that using 64 bit
integers would introduce.


You've claimed that before and failed to support it. In over 20 years of
professional programming most of my use of integers has not had anything
to do with array indexes. On the stuff I currently work on we use integers
for money, for example, and I've never needed to use cost as an index
into an array!
I posted a study for you. You do need a little bit of insight to interpret
data, unless you have the resources to study everything by brute force.
 
Kelsey Bjarnason

[snips]

I have never tried to boot a 32 bit Linux OS on a 64 bit CPU, thus
this may be wrong:
On a linux platform booted with a 32 bit OS version,
one could open file /proc/cpuinfo (plain text file) and check if the
CPU supports the flag lm (long mode = 64 bit).

It's there, at least on this box, which is a 32-bit distro on a 64-bit
Intel chip.
So yes, by using standard C one can figure out whether certain CPUs support
64 bit by parsing a plain text file - the latter being hardly portable and
decidedly non-standard.

Indeed; one cannot do the task in general using standard C.
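
For what it's worth, a minimal sketch of that check (Linux-specific, relying
on the non-standard /proc/cpuinfo layout, so entirely outside standard C)
might look like this; the buffer size and the exact meaning of the "lm" flag
are assumptions about the kernel, not anything the language guarantees:

#include <stdio.h>
#include <string.h>

/* Return 1 if /proc/cpuinfo lists the "lm" (long mode) flag,
   0 if it does not, -1 if the file cannot be read.
   Assumes the whole flags line fits in the buffer. */
static int cpu_has_long_mode(void)
{
    FILE *f = fopen("/proc/cpuinfo", "r");
    char line[4096];
    int found = 0;

    if (f == NULL)
        return -1;

    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "flags", 5) == 0) {
            char *tok = strtok(line, " \t\n");
            while (tok != NULL) {
                if (strcmp(tok, "lm") == 0)
                    found = 1;
                tok = strtok(NULL, " \t\n");
            }
        }
    }
    fclose(f);
    return found;
}

int main(void)
{
    int lm = cpu_has_long_mode();
    if (lm < 0)
        puts("could not read /proc/cpuinfo");
    else
        printf("CPU %s long mode (64 bit)\n",
               lm ? "reports" : "does not report");
    return 0;
}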
 
Flash Gordon

Malcolm McLean wrote, On 12/04/07 22:03:
I posted a study for you. You do need a little bit of insight to
interpret data, unless you have the resources to study everything by
brute force.

I explained why the study did not prove the point. Amongst other things,
it explicitly excluded major classes of code, such as server applications.
 
Keith Thompson

Malcolm McLean said:
I use integers mainly to count things, and mainly to count items held
in arrays within the computer's memory. Whilst my Beowulf has many
gigabytes installed, it is only available in 2GB chunks on each
node. So a 32 bit integer can count every plain char stored therein.
This is typical. For a brief moment we have the situation where int
is the only type you need, except for the exceptional case where space
is at a premium, or the other case where for some reason you need
numbers over 2 billion. However it won't last. 2GB costs only about
200 dollars, and soon machines with 40 or 50 GB will be commonplace.
You could argue that all that memory won't be used for arrays. It is
true that to fill 1% of a 1GB machine with 32 bit ints, you need
250,000 data points. That probably represents a substantial research
investment, and you won't have many such arrays. I've got sympathy
with the argument that in reality we almost always need low integers,
and 32 bits is faster. Nothing is cut and dried. But the simplicity of
the 64 bit int approach recommends itself. Time is ticking. If we
don't get 64 bit ints now, it will be too late, and the language will
be size-underscored-ted up.

It's already too late, and it has been for many years.

C has had multiple sizes of integers since before K&R1. If you want a
type that's at least 32 bits, you can easily find one. If you want a
type that's at least 64 bits, you can *probably* find one, typically
called "long long" (standard in C99, a common extension in pre-C99
implementations).
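
For instance, a quick sketch (assuming a C99-ish implementation for
long long and %zu) of how you would see what you actually have:

#include <stdio.h>
#include <limits.h>

int main(void)
{
    long      at_least_32 = 2147483647L;           /* guaranteed to fit in long */
    long long at_least_64 = 9223372036854775807LL; /* guaranteed to fit in long long (C99) */

    /* sizeof counts storage bits; padding bits, if any, are included. */
    printf("int:       at least 16 bits, here %zu\n", sizeof(int) * CHAR_BIT);
    printf("long:      at least 32 bits, here %zu\n", sizeof at_least_32 * CHAR_BIT);
    printf("long long: at least 64 bits, here %zu\n", sizeof at_least_64 * CHAR_BIT);
    return 0;
}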

Your opinion that type int should be 64 bits is, I suppose, perfectly
valid, but the vast majority of C programmers (everyone except you, as
far as I can tell) don't share that opinion. C's type system isn't
perfect, but it is reasonably consistent, and it works.

I'm not saying you're wrong, merely that the language isn't going to
be changed in the way you want it to be changed. You can accept that,
or you can find or invent another language. In the meantime, of
course, you're free to advocate the changes you want, but you need to
be aware that it would be a major change, and that it would *not* be
in keeping with the language as it currently exists, and has existed
for at least 20 years.
 
Ian Collins

Keith said:
It's already too late, and it has been for many years.

C has had multiple sizes of integers since before K&R1. If you want a
type that's at least 32 bits, you can easily find one. If you want a
type that's at least 64 bits, you can *probably* find one, typically
called "long long" (standard in C99, a common extension in pre-C99
implementations).
C99 fixed size types finally provide a standard means for selecting the
required integer size without having to know the target system's choice
of size.

http://www.unix.org/version2/whatsnew/lp64_wp.html

Provides an interesting footnote to this sub-thread.
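
As an illustration of the point, a small sketch assuming a hosted C99
implementation (the exact-width types are optional, the least-width ones
are always present; the values are made up for the example):

#include <stdint.h>
#include <inttypes.h>
#include <stdio.h>

int main(void)
{
    int32_t  samples   = 48000;                   /* exactly 32 bits, where provided  */
    int64_t  total     = (int64_t)samples * 3600; /* exactly 64 bits, where provided  */
    uint_least64_t big = UINT64_C(1) << 40;       /* at least 64 bits, always present */

    printf("%" PRId64 " samples per hour\n", total);
    printf("%" PRIuLEAST64 " is representable regardless of int's size\n", big);
    return 0;
}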
 
Mark L Pappin

Flash Gordon said:
Perhaps we should wait until the hundredth
anniversary of the standard in 2989.

At least one of these two lines contains an incorrect number, Flash.

mlp
 
Malcolm McLean

Keith Thompson said:
C has had multiple sizes of integers since before K&R1. If you want a
type that's at least 32 bits, you can easily find one. If you want a
type that's at least 64 bits, you can *probably* find one, typically
called "long long" (standard in C99, a common extension in pre-C99
implementations).

Your opinion that type int should be 64 bits is, I suppose, perfectly
valid, but the vast majority of C programmers (everyone except you, as
far as I can tell) don't share that opinion. C's type system isn't
perfect, but it is reasonably consistent, and it works.


I'm not saying you're wrong, merely that the language isn't going to
be changed in the way you want it to be changed. You can accept that,
or you can find or invent another language. In the meantime, of
course, you're free to advocate the changes you want, but you need to
be aware that it would be a major change, and that it would *not* be
in keeping with the language as it currently exists, and has existed
for at least 20 years.
I want a signed type called "int" that is the size of the address bus.
To be pernickety about it, it really needs an extra bit, but in practice if
you need over half the address space for a single array then you can code
access to it specially.
That has been the convention until now. The proposal is to change it, to
make int no longer the natural integer size for the machine.

The problem with size_t was that the implications weren't thought through. I
am sure the committee thought that they were adding a minor patch to
malloc() and documenting variables that hold sizes of memory. They didn't
realise the implications. The problem is that if int is 32 bits and memory
is 64 bits, then every array index to an arbitrary array needs to be a
size_t. Virtually every integer, pace Flash Gordon, is used as an array
index or in intermediate calculations to derive indices. But since size_t is
unsigned, awkward to read, and most numbers are much less than 2 billion,
people won't use it consistently. A type is a standard for passing data
between functions. So we'll have the two plugs problem. It will seriously
impact the reusability of C code.
If int is 64 bits on 64-bit machines size_t will gradually fade away.
I'm not proposing removing the symbol from C code just yet, merely
suggesting that it isn't used except as a fix to library functions, or in
special cases. Eventually we will return sizeof() to generating an int, and
remove it from the string functions, so it will survive merely as a fossil
in malloc().
C does not have fixed size basic types. Rightly or wrongly. By making
int a fixed size of 32 bits compiler vendors are the ones making the change,
not me. As 64 bit machines hit the mass market, we have a brief chance to
save our language from ANSI doing to C what they did to C++.
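
For what the contrast looks like in code, here is a hedged sketch, assuming
a typical LP64 platform where int is 32 bits and size_t is 64; on other
models the numbers printed will differ:

#include <stdio.h>
#include <stdint.h>
#include <limits.h>
#include <stddef.h>

/* size_t can index every byte of any object the implementation allows;
   a plain int cannot once an object exceeds INT_MAX bytes. */
static void fill(unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = (unsigned char)(i & 0xFF);
}

int main(void)
{
    unsigned char buf[16];
    fill(buf, sizeof buf);

    printf("INT_MAX:  %d\n", INT_MAX);
    printf("SIZE_MAX: %zu\n", (size_t)SIZE_MAX);
    return 0;
}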
 
Malcolm McLean

Flash Gordon said:
Malcolm McLean wrote, On 12/04/07 22:03:

I explained why the study did not prove the point. Amongst other things,
it explicitly excluded major classes of code, such as server applications.
This is an important point.
Finding an objection to evidence doesn't make that evidence disappear, or
turn the opposite case into a "mere assertion" position.
You seldom have exactly the evidence you would like. For instance the
example I gave was Java, not C, it was from a selection of about ten
applications, and the study wasn't designed to show what I asserted, the
conclusion had to be derived by looking at the evidence with some
assumptions.

However whilst your objections were legitimate, in fact the ideas that Java
code is very different from C code in its use of arrays, that server
applications use arrays in a very different way from most general
applications, that array access instructions don't bear a very tight
relationship to index calculations, and so on, aren't all that plausible.
You would need to do a proper study to establish the matter beyond all
possible doubt. However do you think I could get a research grant for people
to sit tagging up C code to establish how much indexing is done? It would be
expensive and wouldn't tell us very much that we don't know already. Most
scientific studies have tight cost limits and aren't as rigorous as we would
like. That doesn't mean that they are useless.

You'll often come across this fallacy in a much more gross way than you have
made it. I can find an objection to the evidence, therefore it disappears.
 
Ian Collins

Malcolm said:
I want a signed type called "int" that is the size of the address
bus. To be pernickety about it, it really needs an extra bit, but in
practice if you need over half the address space for a single array then
you can code access to it specially.
That has been the convention until now. The proposal is to change it, to
make int no longer the natural integer size for the machine.
What proposal?
The problem with size_t was that the implications weren't thought
through. I am sure the committee thought that they were adding a minor
patch to malloc() and documenting variables that hold sizes of memory.
They didn't realise the implications. The problem is that if int is 32
bits and memory is 64 bits, then every array index to an arbitrary array
needs to be a size_t. Virtually every integer, pace Flash Gordon, is
used as an array index or in intermediate calculations to derive
indices.

How many of those index more than 2^31 bytes?
But since size_t is unsigned, awkward to read, and most
numbers are much less than 2 billion, people won't use it consistently.

It has to be unsigned to address the full range.
C does not have fixed size basic types. Rightly or wrongly. By
making int a fixed size of 32 bits compiler vendors are the ones making
the change, not me. As 64 bit machines hit the mass market, we have a
brief chance to save our language from ANSI doing to C what they did to
C++.
As of C99 there are fixed size types. I for one am a huge fan of
standardised fixed size types; just like common utility code that
belongs in the standard library, they save everyone from reinventing
them. With the move to 64 bit, the use of these types is pretty much a
requirement for system library functions.

The sizes of the basic types are not specified by the C standard, or
compiler authors alone, but by the operating system vendors. The two
dominant desktop/server factions already have these locked down and they
aren't going to change. 64 bit has been in the data centre for over a
decade and the choice of 64 bit programming model was made long before
that. Your moment is long gone.
 
Richard Heathfield

Malcolm McLean said:

<snip>

Virtually every integer, pace Flash
Gordon, is used as an array index or in intermediate calculations to
derive indices.

Maybe for you, but not for everyone.
But since size_t is unsigned,

That's an advantage.
awkward to read,

That's a matter of opinion.
and most numbers are much less than 2 billion,

And that's false. There are as many numbers greater than 2 billion as
there are less than 2 billion. (We know this because they can be put
into one-to-one correspondence with each other.) I think you mean most
of the numbers most programmers actually want to *use* are less than 2
billion, and I'll accept that.
people won't use it consistently.

Well, some people won't. Others, however, will. People are inconsistent
like that.
A type is a standard for passing data between functions.

Among other things, yes. This doesn't really affect anything, though.

C does not have fixed size basic types. Rightly or wrongly.

Rightly. The silly thing about fixed size types is that, every few
years, you need to introduce a new type. C did this properly right from
the start.
By making
int a fixed size of 32 bits compiler vendors are the ones making the
change, not me. As 64 bit machines hit the mass market, we have a
brief chance to save our language from ANSI doing to C what they did
to C++.

I agree that int should be the natural size for the machine, and there
is certainly an argument that long int should be longer and short
shorter (where possible). I see no particular value in fixing exact
type sizes, though - in fact, quite the opposite.
 
Malcolm McLean

Richard Heathfield said:
Malcolm McLean said:
I agree that int should be the natural size for the machine, and there
is certainly an argument that long int should be longer and short
shorter (where possible). I see no particular value in fixing exact
type sizes, though - in fact, quite the opposite.
The campaign for 64 bit ints is actually a campaign for ints the size of the
address bus. For instance if someone brought out a machine that had 48 bits
of addressable memory space then the natural size for an int would be 48
bits. The idea is that an int should be the type to use for an array index -
except in rare circumstances when memory is at a premium or intermediate
calculations overflow its range. Then effectively every integer is an int and
everything works together.
The only time I ever really need a fixed size type is for defining pixel
values. Sometimes it is nice to be able to access a pixel as a single
variable, and 32 bits gives you rgba. However nowadays buffer calculations
aren't the bottleneck they were - high end rasterisation is done in hardware
and low-end is fast enough anyway - so my recent code uses mainly arrays of
unsigned chars.
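
For the pixel case, a hedged sketch of the 32-bit packing (the channel
order in the shifts is an assumption for illustration, not taken from any
particular library):

#include <stdint.h>
#include <stdio.h>

/* Pack four 8-bit channels into one 32-bit pixel, and pull one back out. */
static uint32_t pack_rgba(uint8_t r, uint8_t g, uint8_t b, uint8_t a)
{
    return ((uint32_t)r << 24) | ((uint32_t)g << 16) |
           ((uint32_t)b <<  8) |  (uint32_t)a;
}

static uint8_t alpha_of(uint32_t pixel)
{
    return (uint8_t)(pixel & 0xFFu);
}

int main(void)
{
    uint32_t p = pack_rgba(255, 128, 0, 64);
    printf("pixel = 0x%08X, alpha = %u\n", (unsigned)p, (unsigned)alpha_of(p));
    return 0;
}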
 
Ian Collins

Malcolm said:
The campaign for 64 bit ints is actually a campaign for ints the size of
the address bus. For instance if someone brought out a machine that had
48 bits of addressable memory space then the natural size for an int
would be 48 bits. The idea is that an int should be the type to use for
an array index - except in rare circumstances when memory is at a
premium or intermediate calculations overflow its range.

Even at the cost of enlarging the data set and slowing down a large number
of applications that do not require an int larger than 32 bits?
 
Richard Heathfield

Ian Collins said:
Malcolm McLean wrote:

Even at the cost of enlarging the dataset slowing down a large number
of applications that do not require int larger than 32 bits?

On such systems, you could always use short int. There's nothing magical
about 32, you know. Not that int is the right type for an array index,
of course - it's just another possible source of bugs.
 
Army1987

The problem with size_t was that the implications weren't thought through.
I am sure the committee thought that they were adding a minor patch to
malloc() and documenting variables that hold sizes of memory. They didn't
realise the implications. The problem is that if int is 32 bits and memory
is 64 bits, then every array index to an arbitrary array needs to be a
size_t.
Why? Unless you have an array larger than 2 GB, a 32-bit signed int will do
that, regardless of any architecture details.
Virtually every integer, pace Flash Gordon, is used as an array index or
in intermediate calculations to derive indices. But since size_t is
unsigned, awkward to read, and most numbers are much less than 2 billion,
people won't use it consistently.

If the problem is that size_t is unsigned, use ptrdiff_t. But, first, ask
yourself whether you *really* need it.
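
A small sketch of why the signedness matters: ptrdiff_t is the standard
signed type here, and the countdown loop below is the classic place where
an unsigned index goes wrong (with an unsigned i, i >= 0 is always true
and the loop never terminates).

#include <stddef.h>
#include <stdio.h>

/* Sum an array back to front using a signed index type. */
static double sum_reverse(const double *a, ptrdiff_t n)
{
    double total = 0.0;
    for (ptrdiff_t i = n - 1; i >= 0; i--)
        total += a[i];
    return total;
}

int main(void)
{
    double v[] = { 1.0, 2.0, 3.5 };
    printf("%g\n", sum_reverse(v, 3));
    return 0;
}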
 
Flash Gordon

Malcolm McLean wrote, On 14/04/07 10:08:
The campaign for 64 bit ints is actually a campaign for ints the size of
the address bus. For instance if someone brought out a machine that had
48 bits of addressable memory space then the natural size for an int
would be 48 bits. The idea is that an int should be the type to use for
an array index - except in rare circumstances when memory is at a
premium or intermediate calculations overflow its range. Then effectively
every integer is an int and everything works together.

Apart from all those processors with 24 bit address busses but only 16
bit data busses and registers. Those then have to jump through major
hoops and int suddenly becomes a very inefficient type. If you think it
does not happen, look at the processors by that little company called
Intel, specifically the rarely used x86 series of processors.
The only time I ever really need a fixed size type is for defining pixel
values. Sometimes it is nice to be able to access a pixel as a single
variable, and 32 bits gives you rgba. However nowadays buffer
calculations aren't the bottleneck they were - high end rasterisation is
done in hardware and low-end is fast enough anyway - so my recent code
uses mainly arrays of unsigned chars.

Some stuff is done in hardware, some in software, some in a mix. It all
depends. For example, I doubt that Paintshop Pro on the PC is using
custom hardware for a lot of its image processing, and the same goes for a
lot of other software. I also have other uses for known fixed size
integer types, and accept that some SW will need changing if ported to
a system not supporting those types.
 
Flash Gordon

Malcolm McLean wrote, On 14/04/07 09:09:
This is an important point.
Finding an objection to evidence doesn't make that evidence disappear,

If the objections are valid then it means the evidence is not relevant.
or turn the opposite case into a "mere assertion" position.

If you have no evidence for which there are not valid objections then
your position is merely your opinion.
You seldom have exactly the evidence you would like.

That is why people conduct studies. Sometimes those studies can be
done by taking the raw data from a number of other studies, but you
have to design your study first and then go and collect the data from the
other studies.
> For instance the
example I gave was Java, not C, it was from a selection of about ten
applications, and the study wasn't designed to show what I asserted, the
conclusion had to be derived by looking at the evidence with some
assumptions.

Assumptions which have to be justified, especially if challenged. You
have not given any justification for your assumptions (which you did not
state) or explanation of why my challenges are wrong.
However whilst your objections were legitimate, in fact the ideas that
Java code is very different from C code in its use of arrays, that
server applications use arrays in a very different way from most general
applications, that array access instructions don't bear a very tight
relationship to index calculations, and so on, aren't all that
plausible.

I gave specific reasons why certain classes of applications do a lot of
integer work that is not to do with indexing. Here is another: a number
of the SQL data types are either integer types (which will be implemented
as integer types) or scaled integer types (which will be implemented as
integer types with the scaling stored as part of the column definition).
The manipulation of those integer types will be on the basis of SQL
queries, summing costs etc. On financial applications, such as Sun
Financials, they make use of such types a lot, and hardly any use of
arrays. I suspect that there is a LOT of financial SW out there where
they do not want to use floating point for numbers because the law does
not permit them to have rounding errors (or specifies specific rounding
which you won't be able to guarantee with floating point numbers).
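
By way of a hedged sketch of that scaled-integer style (the type name, the
basis-point rate and the rounding rule are illustrative assumptions, not
taken from any real financial package):

#include <stdio.h>

typedef long long cents_t;   /* C99 guarantees at least 64 bits */

/* Add a tax given in basis points (hundredths of a percent),
   rounding half away from zero - exactly, with no floating point. */
static cents_t add_tax(cents_t amount, int basis_points)
{
    long long num = amount * basis_points;
    cents_t tax = (num + (num >= 0 ? 5000 : -5000)) / 10000;
    return amount + tax;
}

int main(void)
{
    cents_t price = 19999;                /* 199.99 stored in cents */
    cents_t total = add_tax(price, 825);  /* 8.25% tax              */
    printf("%lld.%02lld\n", total / 100, total % 100);
    return 0;
}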

Probably most "indexing" done by a database is indexing into a file,
not memory, so by that argument we need int to be large enough to store
the size of the largest table that might be required, and that is
independent of the processor!
> You would need to do a proper study to establish the matter
beyond all possible doubt.

You would need to do rather more than look at one very biased sample to
have any significant evidence.
> However do you think I could get a research
grant for people to sit tagging up C code to establish how much indexing
is done? It would be expensive and wouldn't tell us very much that we
don't know already.

So far the only person to jump in on either side is Richard Heathfield,
and he seems to take the opposing view. So another biased sampling shows
that most C developers either have no opinion on the subject or disagree
with you.
> Most scientific studies have tight cost limits and
aren't as rigorous as we would like. That doesn't mean that they are
useless.

True, but they have vastly more rigour than looking at a sample that
says it was biased and then using the data for something the collection
was not designed for.
You'll often come across this fallacy in a much more gross way than you
have made it. I can find an objection to the evidence, therefore it
disappears.

If a sample can be shown to be biased for the use it is being put to
then you cannot use it on its own as evidence for your point. If you
collect data from enough other studies with different biases and can
show they cancel out, or you can come up with a way to counteract the
bias (with evidence supporting your method) then you can use it.

Note that I do not claim to know whether overall integer types are used
more for indexing and related operations or for other purposes. I only
claim that my experience contradicts it and that you have no significant
evidence to support your personal opinion.
 
