32 or 64 bit processor info in C

Malcolm McLean

Ian Collins said:
If you don't specify the type, everyone else who wants it has to add their
own, so we ended up with typedefs like U8, INT16 and just about every
other possible naming of fixed types. At least the standard now sets a
naming convention.
People should be taught that it is almost never necessary to write code like
this.

Data is almost always either real numbers, strings, booleans, enumerated
symbols, or indices into arrays (technically a subset of keys). So double or
float, char *, and int should be basically all you need.
Almost always isn't always, and just occasionally you need an integer
of a different type. But only very occasionally.

A cynic like me might argue that once you admit the size_t and ptrdiff_t
gibberish into your code it is unreadable anyway and more gibberish won't
make any difference.
 
Ian Collins

Malcolm said:
People should be taught that it is almost never necessary to write code
like this.

Data is almost always either real numbers, strings, booleans, enumerated
symbols, or indices into arrays (technically a subset of keys). So
double or float, char *, and int should be basically all you need.

Never written any device drivers or protocol stacks then?
 
Walter Roberson

Malcolm McLean said:
Data is almost always either real numbers, strings, booleans, enumerated
symbols, or indices into arrays (technically a subset of keys). So double or
float, char *, and int should be basically all you need.
Almost always isn't always, and just occasionally you need an integer
of a different type. But only very occasionally.

I don't know what kind of programming you do, but the majority of
my programming over the years has been for integer data. Accounting
information for computer usage (time, disk, I/O). Network usage
analysis. Network trouble-shooting (e.g., number of late collisions).
Security log analysis (yes, that involves strings, but there are also
fundamental statistical summarization phases.) Compression programs
(string input, yes, but the output is integer arithmetic). And so on.

I have also worked on a number of scientific programs which used
real numbers extensively, so I am certainly not attempting to imply
that real numbers are unimportant, but the -majority- of my programs
have involved manipulating integral values.
 
CBFalconer

Ian said:
Malcolm McLean wrote:
.... snip ...
all you need.

Never written any device drivers or protocol stacks then?

(val & 0xff) is fairly well guaranteed to carry exactly 8 bits.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
<http://www.securityfocus.com/columnists/423>
<http://www.aaxnet.com/editor/edit043.html>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discoverer of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 
Flash Gordon

Walter Roberson wrote, On 21/04/07 13:02:
I don't know what kind of programming you do, but the majority of
my programming over the years has been for integer data.

For me there has been a lot of fixed point stuff. I really do mean fixed
point, floating point would have been inappropriate for several reasons.
Accounting
information for computer usage (time, disk, I/O).

I can add accounting information in the sense accountants are interested
in as well. Also various other quantities, such as weights and measures,
for accounting, billing, invoicing reasons you want in fixed point with
a defined scaling rather than floating point.
Network usage
analysis. Network trouble-shooting (e.g., number of late collisions).
Security log analysis (yes, that involves strings, but there are also
fundamental statistical summarization phases.) Compression programs
(string input, yes, but the output is integer arithmetic). And so on.

In a lot of embedded work, in my experience, you want fixed point for
efficiency. Also fixed point or integer is appropriate for certain types
of image processing, and for stability reasons (accurate control of
rounding and precision) you may well prefer fixed point for all sorts of
control loops.
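
As a minimal sketch of "fixed point with a defined scaling" (the names cents_t and apply_rate_bp are made up for illustration, and the scaling of whole cents and basis points is an assumption, not anything from the posts above): amounts are held as integers, and the rounding rule is spelled out in the code rather than left to floating point.

```c
#include <stdint.h>

/* Hypothetical fixed-point money type: amounts are whole cents. */
typedef int64_t cents_t;

/* Apply a rate given in basis points (1 bp = 1/10000) to an amount,
 * rounding half away from zero. Assumes amount * rate_bp does not
 * overflow int64_t; real billing code would check that. */
cents_t apply_rate_bp(cents_t amount, int32_t rate_bp)
{
    int64_t scaled = amount * (int64_t)rate_bp;
    const int64_t half = 5000;          /* half of the 10000 divisor */

    if (scaled >= 0)
        return (scaled + half) / 10000;
    return (scaled - half) / 10000;     /* division truncates toward zero */
}
```

For example, apply_rate_bp(1999, 825) applies 8.25% to 19.99 and gives 165 cents, with the rounding decided by the code rather than by the binary representation of 0.0825.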
I have also worked on a number of scientific programs which used
real numbers extensively, so I am certainly not attempting to imply
that real numbers are unimportant, but the -majority- of my programs
have involved manipulating integral values.

Same here.
 
Flash Gordon

Ian Collins wrote, On 21/04/07 01:10:
No, it can't do that, int is a signed type, so it can never index the
largest possible array.

Malcolm is also forgetting the Motorola 68000 series of processors. They
had address busses larger than 16 bits but smaller than 32 bits, they
were also flat address spaces. C compilers for those used 16 bit ints.

Some of the DSPs with 24 or 48 bit integer registers and ALUs will have
had 16/32 bit address busses. However, I suppose Malcolm might be happy
with an int larger than the address bus by a few bits.
Considering 64 bit systems have been in widespread use for over a
decade, short is a relative term.

Indeed.
 
Walter Roberson

(val & 0xff) is fairly well guaranteed to carry exactly 8 bits.

But (val & 0xff) is not a "key" nor an "enumerated symbol", so
according to Malcolm it does not fall into the class of
"data is almost always". And there are a lot of IPv4 addresses
(and netmasks) still around; those are certain to fit into a C unsigned
long, but not certain to fit into a C "int"; to me it seems unnatural to
deliberately write the code in terms of arrays of unsigned char
or unsigned int just to fit into someone's notion that int
"should be basically all you need".
 
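To make the IPv4 point concrete, here is a small sketch (the helper names ipv4_pack, ipv4_netmask and ipv4_same_subnet are invented for this example). It uses uint32_t from <stdint.h>, which is assumed to be available; an unsigned long would also be wide enough, whereas a plain int is not guaranteed to be.

```c
#include <stdint.h>

/* Pack four octets into one 32-bit address value (a.b.c.d, a in the top byte). */
uint32_t ipv4_pack(uint8_t a, uint8_t b, uint8_t c, uint8_t d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8)  |  (uint32_t)d;
}

/* Build a netmask from a prefix length; values above 32 are treated as 32. */
uint32_t ipv4_netmask(unsigned prefix_len)
{
    if (prefix_len == 0)
        return 0;                       /* avoid shifting by the full width */
    if (prefix_len >= 32)
        return UINT32_MAX;
    return (uint32_t)(UINT32_MAX << (32 - prefix_len));
}

/* Two addresses are on the same subnet if their masked values agree. */
int ipv4_same_subnet(uint32_t x, uint32_t y, uint32_t mask)
{
    return (x & mask) == (y & mask);
}
```

The whole address is one integer, so masking and comparison are single operations instead of loops over an array of unsigned char.
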
Malcolm McLean

Flash Gordon said:
Some of the DSPs with 24 or 48 bit integer registers and ALUs will have
had 16/32 bit address busses. However, I suppose Malcolm might be happy
with an int larger than the address bus by a few bits.
Ideally you'd have one extra bit so that signed arithmetic couldn't fail to
have enough resolution. However at least in C the convention is that indices
are scaled by the data type. So the problem only arises for char arrays
taking up over half the address space. That happens rarely enough for it to
be reasonable to say "many functions may not work on your dataset, you will
have to code specially with unsigned types".
Most architectures don't distinguish between address and data integer
registers, so it makes sense to use the same registers to hold pointers and
ints.
 
Walter Roberson

Most architectures don't distinguish between address and data integer
registers, so it makes sense to use the same registers to hold pointers and
ints.

"Most" maybe, but some well known and widely implemented architectures
do, such as the Motorola 680x0 and the MIPS R2000/ 3000/ 4000/ 5000/
6000/ 8000/ 10000/ 12000 lines (MIPS is used in a lot of embedded
situations, I hear.)

Are we talking about C for general purpose computing, or are we talking
about imposing non-trivial architecture restrictions on the machines
that will use this modified C?
 
Malcolm McLean

Walter Roberson said:
"Most" maybe, but some well known and widely implemented architectures
do, such as the Motorola 680x0 and the MIPS R2000/ 3000/ 4000/ 5000/
6000/ 8000/ 10000/ 12000 lines (MIPS is used in a lot of embedded
situations, I hear.)

Are we talking about C for general purpose computing, or are we talking
about imposing non-trivial architecture restrictions on the machines
that will use this modified C?
We're talking about what int should be on a typical 64-bit machine. I'm
arguing 64 bits, the emerging convention is 32 bits, which I oppose. We're
not talking about removing latitude from the language so that DSP chips and
the like can't use funny integer sizes if it is appropriate for them, nor
are we talking about modifying the standard.
Except that I've indicated that as a long-term goal I'd like size_t removed
from the language, or at least generate a mandatory ugliness warning.
However that's for the future. At the moment the fight is for 64 bits.
 
Francine.Neary

Richard Heathfield wrote On 04/18/07 15:43,:
As for the malloc example, I myself usually write an
assignment statement and a separate test. This is not so
much out of a concern that the whole thing would be too
long, but to direct the focus: "I will now allocate some
memory. (By the way, I'll also check for failure.)" But
sometimes I'll gang the whole thing together, particularly
during an initialization where I'm just going to exit the
program on a failure:

if ( (buff1 = malloc(N1 * sizeof *buff1)) == NULL
  || (buff2 = malloc(N2 * sizeof *buff2)) == NULL
  || (buff3 = malloc(N3 * sizeof *buff3)) == NULL ) {
    perror ("malloc");
    fputs ("No memory; bye-bye!\n", stderr);
    exit (EXIT_FAILURE);
}

I think this is easier to read than three assignments, three
tests, and three error-exits, or than shuffling the test-and-
exit off to a wrapper function -- although I do *that* too,
sometimes. (Note that three assignments followed by one
three-way test and one error-exit is not quite the same: if
malloc() sets errno, the successful allocation of buff3 could
obscure why buff2's allocation failed. malloc() need not set
errno and some do not, but I take the optimistic view and try
to give the poor user all the available diagnoses, even if
they're suspect.)
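
For comparison, the "wrapper function" alternative mentioned above might look roughly like this. It is only a sketch: the name xmalloc is a common convention rather than a standard function, and the exit-on-failure policy is an assumption that suits initialization code but not every program.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper: return the allocation, or report and exit.
 * A NULL result for a zero-byte request is not treated as a failure,
 * since malloc(0) is allowed to return NULL. */
void *xmalloc(size_t size)
{
    void *p = malloc(size);

    if (p == NULL && size != 0) {
        perror("malloc");               /* errno may or may not be meaningful */
        fputs("No memory; bye-bye!\n", stderr);
        exit(EXIT_FAILURE);
    }
    return p;
}
```

Each call site then shrinks to a single assignment such as buff1 = xmalloc(N1 * sizeof *buff1), and perror() runs immediately after the failing call, before a later allocation can disturb errno.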

This doesn't seem very sensible to me. I mean, you already know why
malloc() has failed (how many different ways can you say "out of
memory"?), so if I was implementing malloc I definitely wouldn't
bother setting errno, and if I'm recovering from a failed malloc then
why would I risk giving the user a mystifying spurious error message
resulting from errno being set and never cleared half a page of code
above?
 
Eric Sosman

[...] if malloc() sets errno, the successful allocation of buff3 could
obscure why buff2's allocation failed. malloc() need not set
errno and some do not, but I take the optimistic view and try
to give the poor user all the available diagnoses, even if
they're suspect.)

This doesn't seem very sensible to me. I mean, you already know why
malloc() has failed (how many different ways can you say "out of
memory"?), so if I was implementing malloc I definitely wouldn't
bother setting errno, and if I'm recovering from a failed malloc then
why would I risk giving the user a mystifying spurious error message
resulting from errno being set and never cleared half a page of code
above?

How many ways can you say "out of memory?" More than one,
I'm sure. A few possibilities:

- "Out of memory" (the basic bleat)

- "Memory quota exceeded" (maybe if the user petitions the
sysadmin for an increased quota all will be well)

- "No more swap space" (maybe if more swap can be allocated
all will be well)

- "Resource temporarily unavailable" (yes, I've seen this one)

- "No error" (I've seen this one, too)

.... and probably others, too. The point is that a library function
may have several reasons for failing, and different reasons may
suggest different responses or corrections. IMHO it's better to
pass along whatever diagnostic information the implementation is
willing to provide than to throw a blanket over it and force the
user to guess about the reasons. Sometimes the diagnostic data is
misleading ("malloc: connection reset by peer"), but when it isn't
it can be most helpful.
 
Ian Collins

CBFalconer said:
.... snip ...


(val & 0xff) is fairly well guaranteed to carry exactly 8 bits.
True, but it's a bit silly when one can use a fixed size type to
represent a register or fields in a packet header.
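
A rough sketch of the two approaches side by side, using an invented 4-byte header layout (version, flags, 16-bit big-endian length) that does not correspond to any real protocol; struct pkt_header and pkt_parse are likewise made-up names. The fixed-size types document the field widths, while the masks and shifts keep the parsing independent of host byte order.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header: 1 byte version, 1 byte flags, 2 bytes length (big-endian). */
struct pkt_header {
    uint8_t  version;
    uint8_t  flags;
    uint16_t length;
};

/* Decode the header from raw bytes. Returns 0 on success, -1 if the
 * buffer is too short. */
int pkt_parse(const unsigned char *buf, size_t len, struct pkt_header *out)
{
    if (len < 4)
        return -1;

    out->version = (uint8_t)(buf[0] & 0xff);
    out->flags   = (uint8_t)(buf[1] & 0xff);
    out->length  = (uint16_t)((((unsigned)buf[2] & 0xff) << 8) |
                               ((unsigned)buf[3] & 0xff));
    return 0;
}
```

The struct is filled field by field instead of being overlaid on the buffer, so padding and endianness of the host do not matter.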
 
Ian Collins

Malcolm said:
We're talking about what int should be on a typical 64-bit machine. I'm
arguing 64 bits, the emerging convention is 32 bits, which I oppose.

You appear to consistently miss the point that 64 bit systems and the
LP64 integer model have been in widespread use for over a decade. So the
convention has well and truly emerged, settled down and had kids.
 
Keith Thompson

This doesn't seem very sensible to me. I mean, you already know why
malloc() has failed (how many different ways can you say "out of
memory"?), so if I was implementing malloc I definitely wouldn't
bother setting errno, and if I'm recovering from a failed malloc then
why would I risk giving the user a mystifying spurious error message
resulting from errno being set and never cleared half a page of code
above?

On one system (Solaris 9), the malloc man page says that a failing
malloc() can set errno to either of two values:

The malloc(), calloc(), and realloc() functions will fail
if:

ENOMEM
The physical limits of the system are exceeded by size
bytes of memory which cannot be allocated.

EAGAIN
There is not enough memory available to allocate size
bytes of memory; but the application could try again
later.

On another (Red Hat Linux), the man page says:

The Unix98 standard requires malloc(), calloc(), and realloc()
to set errno to ENOMEM upon failure. Glibc assumes that this is
done (and the glibc versions of these routines do this); if you
use a private malloc implementation that does not set errno,
then certain library routines may fail without having a reason
in errno.

The gory details are strictly off-topic, of course, but the point is
that it is possible for malloc() to provide more information in errno
than just "not enough memory". Or it can not bother setting errno at
all. But if you set errno to 0 before the call, you can *probably*
assume that if malloc() failed *and* errno != 0, then the value of
errno is meaningful.
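
The "set errno to 0 before the call" idiom could be sketched like this. The helper name malloc_reporting and its message format are invented for the example, and whether errno carries anything useful after a failed malloc() remains implementation-dependent, as described above.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocate size bytes; on failure, print whatever
 * diagnostic errno offers and return NULL to the caller. */
void *malloc_reporting(size_t size, const char *what)
{
    void *p;

    errno = 0;                          /* so a stale value cannot mislead us */
    p = malloc(size);

    if (p == NULL && size != 0) {
        int saved = errno;              /* capture before calling anything else */

        if (saved != 0)
            fprintf(stderr, "%s: malloc failed: %s\n", what, strerror(saved));
        else
            fprintf(stderr, "%s: malloc failed (no errno detail)\n", what);
    }
    return p;
}
```

Saving errno into a local first matters because fprintf() itself is allowed to change it.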

That's not necessarily true for library functions in general, though.
A function might use other functions internally; those functions might
set errno on failure even if the calling function doesn't. It's not
uncommon for a successful fopen() call to set errno to some non-zero
value; it's also possible for a failing function to indirectly set
errno to a non-zero value that doesn't reflect the actual cause of the
error.

<OT>I think POSIX makes more guarantees in this area.</OT>
 
Malcolm McLean

Eric Sosman said:
How many ways can you say "out of memory?" More than one,
I'm sure. A few possibilities:
If you ask for a trivial amount of memory on a big system then it is much
more likely that the computer has broken than that it is genuinely out of
memory.

If you ask for a large amount then it may not have enough installed to do
the calculation.

The message you want to send to the user is different. Also in the second
case it is worth anticipating the failure and having a recovery strategy; in
the first it is probably futile - barring safety critical systems and the
like.
 
Flash Gordon

Malcolm McLean wrote, On 21/04/07 17:45:
Ideally you'd have one extra bit so that signed arithmetic couldn't fail
to have enough resolution. However at least in C the convention is that
indices are scaled by the data type. So the problem only arises for char
arrays taking up over half the address space. That happens rarely enough
for it to be reasonable to say "many functions may not work on your
dataset, you will have to code specially with unsigned types".

You entirely missed the point that on those processors an int is WIDER
than the address bus. You also ignored (and snipped) my other points.
Most architectures don't distinguish between address and data integer
registers,

Most that you have come across possibly, but I doubt that you have come
across most architectures.
so it makes sense to use the same registers to hold pointers
and ints.

It is not simply the width of the address register that is important, it
is also the memory bandwidth. Since memory is slow using 64 bit integers
when you do not need them can slow things significantly.
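
A back-of-the-envelope sketch of the bandwidth argument (function names are invented, and the 64-byte cache line in the usage note is an assumed, common figure rather than a property of any particular machine): the same number of counters takes twice the memory traffic when stored as 64-bit values.

```c
#include <stddef.h>
#include <stdint.h>

/* Bytes occupied by n counters stored at each width. */
size_t bytes_as_int32(size_t n) { return n * sizeof(int32_t); }
size_t bytes_as_int64(size_t n) { return n * sizeof(int64_t); }

/* Number of elements of a given size that fit in one cache line. */
size_t elems_per_line(size_t line_bytes, size_t elem_bytes)
{
    return line_bytes / elem_bytes;
}
```

With a 64-byte line and 8-bit bytes, elems_per_line(64, sizeof(int32_t)) is 16 and elems_per_line(64, sizeof(int64_t)) is 8, so a sequential scan of the wider array touches twice as many lines for the same element count.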
 
Walter Roberson

We're talking about what int should be on a typical 64-bit machine.

Is that a specific "typical 64-bit machine", or a generalized 64
bit machine?
I'm
arguing 64 bits, the emerging convention is 32 bits, which I oppose.

The machine I'm using right now is a 64 bit machine, a very typical
one at the time it was made. int, long and pointer are all 32 bits
on it; long long is 64 bits (and fully supported by the architecture.)
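
A quick way to see which convention a given compiler follows is to look at the storage widths directly. This is a sketch only: the function name is made up, the labels are the conventional data-model names, and sizeof times CHAR_BIT counts storage bits including any padding.

```c
#include <limits.h>
#include <stddef.h>

/* Classify the implementation's data model from the storage widths
 * of int, long and void *. Anything unusual is reported as "other". */
const char *data_model_name(void)
{
    size_t i = sizeof(int) * CHAR_BIT;
    size_t l = sizeof(long) * CHAR_BIT;
    size_t p = sizeof(void *) * CHAR_BIT;

    if (i == 32 && l == 32 && p == 32) return "ILP32";
    if (i == 32 && l == 64 && p == 64) return "LP64";
    if (i == 32 && l == 32 && p == 64) return "LLP64";
    if (i == 64 && l == 64 && p == 64) return "ILP64";
    return "other";
}
```

Printing the result with puts(data_model_name()) is enough to check a platform; a machine like the one described here, with 32-bit int, long and pointers, would report ILP32 regardless of how wide long long or the hardware registers are.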

I could complain to the designers about them following your
so-called "emerging convention", but I would have to pull some of them
out of retirement to do so, considering that the model line was
introduced to the market in 1993 and they stopped selling this particular
edition of it in 1996. Yes, LL64 machines have already been on the
market for 14 years, and Yes, my deskside 64 bit machine is 12 years old.

The company that made my machine has made some of the largest
single-image compute clusters in the world (i.e., a single operating
system instance is controlling the entire cluster), and oddly those compute
clusters all use int of 32 bits. We're talking machines with multiple
terabytes of cache-coherent RAM (accessible from any program),
and petabytes of disk storage. But somehow in that decade+ of
building record-breaking computers, they missed that simple trick
of just making int 64 bits.

Boy I bet they're sorry in retrospect -- just think, if they had had
your wisdom, then instead of merely building the biggest computers on
Earth, they could have built the biggest computers in the Solar System!
(Oh wait, they did that. Nevermind.)
We're
not talking about removing latitude from the language so that DSP chips and
the like can't use funny integer sizes if it is appropriate for them, nor
are we talking about modifying the standard.

Let's see if I have this straight: you don't want to modify the
standard, you just want the major compiler and OS and chip vendors to
come to their senses and modify their software and instruction
architectures to -de facto- standardize on 64 bit int, because
that's The Right Thing To Do? Is that like, "I would never legislate
a state religion: I would just organize a large-scale boycott campaign
to convince people to voluntarily see the error of their ways if
they don't adopt mine!" ?
 
Malcolm McLean

Walter Roberson said:
Is that a specific "typical 64-bit machine", or a generalized 64
bit machine?


The machine I'm using right now is a 64 bit machine, a very typical
one at the time it was made. int, long and pointer are all 32 bits
on it; long long is 64 bits (and fully supported by the architecture.)

I could complain to the designers about them following your
so-called "emerging convention", but I would have to pull some of them
out of retirement to do so, considering that the model line was
introduced to the market in 1993 and they stopped selling this particular
edition of it in 1996. Yes, LL64 machines have already been on the
market for 14 years, and Yes, my deskside 64 bit machine is 12 years old.

The company that made my machine has made some of the largest
single-image compute clusters in the world (i.e., a single operating
system instance is controlling the entire cluster), and oddly those compute
clusters all use int of 32 bits. We're talking machines with multiple
terabytes of cache-coherent RAM (accessible from any program),
and petabytes of disk storage. But somehow in that decade+ of
building record-breaking computers, they missed that simple trick
of just making int 64 bits.

Boy I bet they're sorry in retrospect -- just think, if they had had
your wisdom, then instead of merely building the biggest computers on
Earth, they could have built the biggest computers in the Solar System!
(Oh wait, they did that. Nevermind.)


Let's see if I have this straight: you don't want to modify the
standard, you just want the major compiler and OS and chip vendors to
come to their senses and modify their software and instruction
architectures to -de facto- standardize on 64 bit int, because
that's The Right Thing To Do? Is that like, "I would never legislate
a state religion: I would just organize a large-scale boycott campaign
to convince people to voluntarily see the error of their ways if
they don't adopt mine!" ?
Never heard of the software crisis? The hardware people have got their act
together and are giving us lots of cheap processing power. The software
people, us, haven't. So all this bluster about mighty mainframes is beside
the point.

One important reason why projects fail or go over budget is that it is too
difficult to get components to talk to each other, usually because of the
lack of standard conventions for interfacing. The reduction in the number of
data types swilling round the pond is a small part, though only a small
part, in alleviating this. It is relatively rare for a project to fail
because the hardware is 10% too slow to run it, and therefore an improvement
in cache coherency might fix it.

Though mainframes are an important part of the computing world, they tend to
be staffed by teams of professional programmers, and run software specially
written for them. The programs installed and the data run through them can
be strictly controlled.
In a consumer environment things are very different. Small companies are
trying to write software with very limited resources which will be run on
hardware they can't exactly specify, and will have to exchange data with
programs they know nothing about. If there is a bug you don't have a list of
installations and an agreed patch schedule. If Microsoft decide that your
compiler will no longer run on Windows Vista, you have no choice but to run
the code through a different compiler.
However if your mainframe has 32 bit pointers then it has a 32-bit limit on
the size of data objects, and so 32-bit ints are OK. Presumably it rations
processes to 4GB each of memory, even if underneath it is doing all its
memory calculations in 64 bits.
 
Ian Collins

Malcolm said:
Though mainframes are an important part of the computing world, they
tend to be staffed by teams of professional programmers, and run
software specially written for them. The programs installed and the data
run through them can be strictly controlled.

I built my first 64 bit desktop system over 10 years ago and I still use
it. You are still deliberately avoiding the fact that 64 bit systems, both
LL64 and the more common LP64, have been in widespread use for a long
time; it's too late to start whinging now.
 
