Use of Long and Long Long


James Kuyper

Bart said:
Jack Klein wrote: ....
It means the function could have been compiled with a different C compiler
(maybe even a different language) with a specification published using the
local semantics (in this case the exact meaning of 'long').

I had in mind dynamic linking but I think the problem can occur with static
linking too, if using object files compiled elsewhere. (You will say some of
these terms are not in the C standard but this is a practical problem.)

An implementation of C includes not only the compiler but the standard
library and the linker, even if those parts did not all come from the
same vendor. If two different compilers have different sizes for the
same type, and that type is used as an argument to or return value from
the C standard library, then the two compilers are generating code that
must be linked with different versions of the C standard library. If the
compiler automatically invokes the linker (as is the case with most of
the compilers I'm familiar with), it's the compiler's responsibility to
automatically link to the correct version of the standard library.
Provided a binary library is for the correct platform (object format,
processor and so on), surely I can use any C compiler to compile the header
files? Otherwise it seems overly restrictive.

Note: a single platform can support many different object formats; the
one I use most often supports at least four different formats, possibly
more.

The C standard makes no such guarantees. Even using a single compiler,
it's possible to compile code modules that can't be linked together. The
compiler I use most often supports 3 different ABIs (I don't remember
what ABI is an acronym for, but it corresponds to having different sizes
for various types, and possibly different function call protocols). It
supports one 64-bit ABI, and two mutually incompatible 32-bit ABIs, and
you can't link code compiled for one ABI with code compiled for another
ABI. However, you can run programs built with different ABIs at the same
time in the same shell.
The one overriding theme in this newsgroup seems to be that of portability,
yet if I distribute a library I need to specify a particular compiler or
supply a version for every possible compiler?

If you distribute your library as compilable source code, and if you can
write that source code using strictly portable constructs, you only need
to produce one version for all conforming compilers. If you distribute
it as an object library, you're going to have to create a separate
library for each supported object code format, and you'll have to decide
which formats to support.
Wouldn't it be better, in published header files, to use more specific type
designators than simply 'long' (since, once compiled, the binary will not ...

For the libraries I use most often, the installation script uses various
sophisticated techniques to determine the characteristics of the
compiler it's installed with, and it creates header files that set up
typedefs for the library's interfaces. Code which uses that library must
be compiled using a compiler compatible with the one used during
installation of the library, but the user code just uses the typedef,
and doesn't need to be aware of the fact that it's a different type when
using different compilers.
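
A rough sketch of the kind of header such an installation script might
emit; every name here (mylib_int32, mylib_checksum, MYLIB_TYPES_H) is
made up, and a real script would probe the compiler it is run with
rather than rely on a single #if test:

/* mylib_types.h - hypothetical header written by the install script */
#ifndef MYLIB_TYPES_H
#define MYLIB_TYPES_H

#include <limits.h>

#if INT_MAX >= 2147483647
typedef int mylib_int32;            /* int has at least 32 bits here      */
typedef unsigned int mylib_uint32;
#else
typedef long mylib_int32;           /* otherwise fall back to long, which */
typedef unsigned long mylib_uint32; /* is guaranteed at least 32 bits     */
#endif

/* The library's interfaces are declared in terms of the typedefs, so
   user code never needs to know which fundamental type was chosen.   */
mylib_int32 mylib_checksum(const unsigned char *buf, mylib_uint32 len);

#endif /* MYLIB_TYPES_H */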
 

Flash Gordon

James Kuyper wrote, On 10/01/08 12:14:
Flash said:
Bart C wrote, On 10/01/08 01:16:
Flash Gordon wrote:
Bart C wrote, On 09/01/08 20:33:

Integer widths that obey the rule short < int < long int < long long
int (instead of short<=int<=long int or whatever) would be far more
intuitive and much more useful (as it is now, changing int x to long
int x is not guaranteed to change anything, so is pointless)
Now look at a 32 bit DSP which cannot deal with anything below 32
bits. Given your definition it would have to do at least
sizeof(char) == 1 - 32 bits
sizeof(short) == 2 - 64 bits
sizeof(int) == 3 - 96 bits
sizeof(long) == 4 - 128 bits
sizeof(long long) == 5 - 160 bits

Yes in this example it sort of makes sense, if the compiler does not
try too hard to impose anything else.

However it's not impossible either for the compiler to impose 8, 16,
32, 64-bit word sizes (don't know how capable DSPs are or whether
this is even desirable). So a 128K array of 8-bit data for example
could take 128KB instead of 512KB.

It is not impossible, but on a significant number of mainstream DSP
processors, using anything smaller than the sizes the processor understands
(16, 24, 32 and 48 bits being common) would make the smaller types very
inefficient, since the compiler would have to keep masking etc.

On any given processor, types that are too small are inefficient because
they must be emulated by bit-masking operations, while types that are
too big are inefficient because they must be emulated using multiple
instances of the largest efficient type. If signed char, short, int,
long and long long were all required to have different sizes, a
processor with hardware support for fewer than 5 different sizes would
have to choose one form of inefficiency or the other; efficient
implementation of all 5 types would not be an option.

On these processors it does not make sense to do the work to provide
smaller types: given what the processors are used for and the consequent
inefficiencies, those types would simply not be used. Nor would it make
sense for int to be larger than one C byte, because that would make int
impracticable for most purposes and prevent you from reusing source code
not specifically targeted at the processor without changing all the int
and unsigned int variables to signed and unsigned char. Thus C would be
very difficult to use on those processors.
In practice, what would be provided would depend upon what's needed, and
not just what's efficient.

Yes, and in practice that means making char, short and int all the same
size, that being 16, 24, 32 or 48 bits depending on what the processor
supports.
There's a lot more need for 8 or 16 bit types

Not on the processors I'm talking about or the purposes they are used
for. The reason they do not have the concept of types that small is that
they are designed for applications which only require larger types and
require the maximum speed available from them. If smaller types were
required the processors would not sell.
than there is for 128 or 160 bit types (though I'm not saying there's no
need for those larger types - I gather that the cryptographic community
loves them).

Not only there I believe, but the world is a *lot* larger than desktop
applications. Certainly on the types of image and video processing I've
done I would have been more likely to use larger types than smaller
types, although I did my best to stick to the natural size of the
processors and hence used int a lot.
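
A trivial probe like the one below shows what any particular
implementation actually provides; on the DSPs discussed above it would
typically report CHAR_BIT as 16, 24 or 32 with sizeof(int) == 1, while a
desktop compiler reports CHAR_BIT as 8:

/* typeprobe.c - print the byte width and type sizes of this implementation */
#include <limits.h>
#include <stdio.h>

int main(void)
{
    printf("CHAR_BIT          = %d\n", CHAR_BIT);
    printf("sizeof(short)     = %u\n", (unsigned)sizeof(short));
    printf("sizeof(int)       = %u\n", (unsigned)sizeof(int));
    printf("sizeof(long)      = %u\n", (unsigned)sizeof(long));
    printf("sizeof(long long) = %u\n", (unsigned)sizeof(long long));
    return 0;
}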
 

Eric Sosman

Bart said:
[...]
Provided a binary library is for the correct platform (object format,
processor and so on), surely I can use any C compiler to compile the header
files? Otherwise it seems overly restrictive.

No, because implementation-provided headers may (often do)
contain implementation-specific "magic."

/* <setjmp.h> for Frobozz Magic C */
typedef _registerblock jmp_buf[1];
_restrictedvalue int setjmp(jmp_buf);
_nonreturning void longjmp(jmp_buf, int);

Compile this header with a non-Frobozz compiler, and it's quite
likely not to recognize the three special tokens. (If you are
particularly unlucky, it *will* recognize them ...)
 

Richard Tobin

(I don't remember
what ABI is an acronym for, but it corresponds to having different sizes
for various types, and possibly different function call protocols)

It's "Application Binary Interface". Contrast with "API", where the "P"
stands for "Programming", which is typically a source-level specification.

-- Richard
 

Keith Thompson

As long as they can represent the minimum value for MIN/MAX they are
okay.

If you are talking about sizes, there is no such rule.
Here are the evaluation rules for sizeof:
sizeof (char) == 1
sizeof (anything) >= 1 && sizeof (anything) <= SIZE_MAX
Very little is guaranteed, but I don't see the problem here.
You can use <stdint.h> and (u)intN_t where N is 8,16,32,64. You may
find that does not guarantee much either.

What the standard does guarantee is certain minimum *ranges* for the
predefined integer types. Specifically:

signed char is at least -127 .. +127
unsigned char is at least 0 .. 255
char is the same as either signed char or unsigned char

short is at least -32767 .. +32767
unsigned short is at least 0 .. 65535

int and unsigned int have the same minimal ranges as short and
unsigned short

long is at least -2**31+1 .. +2**31-1
unsigned long is at least 0 .. 2**32-1

long long is at least -2**63+1 .. +2**63-1
unsigned long long is at least 0 .. 2**64-1

(In the following, "<=" means that the range of the second
type includes the range of the first type.)

signed char <= short <= int <= long <= long long

unsigned char <= unsigned short <= unsigned int <= unsigned long
<= unsigned long long

This implies a minimum number of bits for each type (there have to be
enough bits to represent each distinct value in the range).
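
Since these are only minimums, code that needs a particular range can
consult <limits.h> and pick a type at compile time. A minimal sketch,
assuming a C99 preprocessor (the typedef names wide_count and
uwide_count are purely illustrative):

#include <limits.h>

#if LONG_MAX >= 9223372036854775807    /* is long already a 64-bit-range type? */
typedef long wide_count;
typedef unsigned long uwide_count;
#else
typedef long long wide_count;          /* guaranteed at least -2**63+1..2**63-1 */
typedef unsigned long long uwide_count;
#endif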

The reason most of these guarantees don't apply to sizeof(whatever) is
that a type can have padding bits which don't contribute to the value.
The vast majority of implementations you're likely to run into will
have no padding bits but you shouldn't assume that there are none
(unless the assumption makes your task significantly easier *and*
you're willing to give up some degree of portability).

[...]
The proper format specifier for size_t is '%zu'

But a significant number of implementations still don't support "%zu".
You can also cast to unsigned long and use "%lu" (this can fail if
size_t is wider than unsigned long *and* the value you're printing
happens to exceed ULONG_MAX).
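
A small sketch of both approaches; the __STDC_VERSION__ test is only a
rough hint that "%zu" might be available, since (as noted) a library can
still lag behind the compiler:

#include <stddef.h>
#include <stdio.h>

int main(void)
{
    size_t n = sizeof(long);

    /* Conservative: cast to unsigned long. This can misreport only if
       size_t is wider than unsigned long AND n exceeds ULONG_MAX.     */
    printf("sizeof(long) = %lu\n", (unsigned long)n);

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
    /* C99's own conversion specifier, where the library supports it. */
    printf("sizeof(long) = %zu\n", n);
#endif
    return 0;
}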
'sizeof (type) * CHAR_BIT' is not an accurate way to calculate the
number of bits 'type' uses in your system.

It is *if* you can assume there are no padding bits (see above). But
it is an accurate way to calculate the number of bits 'type' will
*occupy*.
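
A short sketch of that distinction, using unsigned int as the example:
counting the bits in UINT_MAX gives the value bits, while sizeof times
CHAR_BIT gives the bits the object occupies; the two differ only when
padding bits are present.

#include <limits.h>
#include <stdio.h>

/* Count the value bits of unsigned int by shifting UINT_MAX down to zero;
   padding bits, if any, never show up in this count. */
static int uint_value_bits(void)
{
    unsigned int max = UINT_MAX;
    int bits = 0;

    while (max != 0) {
        bits++;
        max >>= 1;
    }
    return bits;
}

int main(void)
{
    printf("unsigned int occupies   %u bits\n",
           (unsigned)(sizeof(unsigned int) * CHAR_BIT));
    printf("unsigned int represents %d value bits\n", uint_value_bits());
    return 0;
}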
 

Bart C

Bart said:
I've always had a problem knowing exactly how wide my integer
variables were in C...
Integer widths that obey the rule short < int < long int < long long
int (instead of short<=int<=long int or whatever) would be far more
intuitive and much more useful...

Just been looking at a book on the C# language and it has these signed
datatypes:

sbyte 8 bits
short 16 bits
int 32 bits
long 64 bits

With unsigned versions byte, ushort, uint, ulong. Perfect and unambiguous,
exactly what I wanted in C. By comparison C's numeric types are a minefield.
(And visually untidy, like _Bool and size_t)
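
For what it's worth, C99's <stdint.h> already offers much the same
vocabulary where the implementation can supply those widths (the
exact-width names are optional); a rough sketch of the correspondence:

#include <stdint.h>   /* C99; the exact-width types exist only where the
                         implementation has suitable types to back them  */

int8_t   sb;   /* roughly C#'s sbyte  */
int16_t  s;    /* roughly C#'s short  */
int32_t  i;    /* roughly C#'s int    */
int64_t  l;    /* roughly C#'s long   */

uint8_t  ub;   /* byte   */
uint16_t us;   /* ushort */
uint32_t ui;   /* uint   */
uint64_t ul;   /* ulong  */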

Of course C# is a fairly recent language devised by MS that didn't have to
worry about old code or running on obscure processors.

Is there anywhere a development of C that has been brought up-to-date and
tidied up, to run on modern 32/64-bit CPUs (with a stack of course :)?


Bart
 

William Ahern

Just been looking at a book on the C# language and it has these signed
datatypes:

sbyte 8 bits
short 16 bits
int 32 bits
long 64 bits

With unsigned versions byte, ushort, uint, ulong. Perfect and unambiguous,
exactly what I wanted in C. By comparison C's numeric types are a minefield.
(And visually untidy, like _Bool and size_t)

Of course C# is a fairly recent language devised by MS that didn't have to
worry about old code or running on obscure processors.

Or worry about new code, either. In some respects all that mattered was the
instant impressions and opinions at the moment of inception. Only time can
tell whether they're "perfect and unambiguous" over a longer duration of
computing evolution. Same thing happened with UCS-2/UTF-16. I highly doubt
that C# "got it right".

Only a few years out and C# has already become disheveled. Delegates weren't
the panacea once touted (maybe not by the creators). There are many other
warts and such popping up, especially in relation to newer and more modern
languages.

Many (most?) will argue that C's path is, on balance, preferable. A little
"ambiguity" never hurt anybody. Certainly it forces you to focus more on the
end and not the means. In any event, I think many here would agree that
unjustified concern with the layout of data types is a sign of inexperience.
When I began programming in C I struggled for years with appropriate usage
of data types. Autoconf tests, macros, typedefs. I was convinced that it
mattered, if only because so many other programmers seemingly expended great
effort wrestling with the "problem".

The implications of C# and Java's decisions aren't as clear cut as you might
think.

Eventually you become comfortable with the notion that you can create
software in a world where all your constraints aren't predefined and
absolutely fixed for you. Amount of memory changes, CPUs get faster (or
slower in some cases). Networks provide higher (or lower) throughput; higher
(or lower) latency. And yes, in some languages even the properties of data
types can change. And if they didn't, it wouldn't matter one iota. At some
point you have to learn how to program in a manner which isn't at odds with
reality, unless you enjoy re-writing the same code over and over again. You
use the tools to solve the abstract problem; you shouldn't invent problems
where none need exist.

At the very least, that some programmers aren't bothered by this aspect of C
should give you pause.
 

Peter Nilsson

No, your problem is that you think you need precise fixed
width integers.

Is there anywhere a development of C that has been brought
up-to-date and tidied up, to run on modern 32/64-bit CPUs
...

Yes. It was done back in the late 80's. Pity you missed it.
Still, it's not too late to catch up.
 
