Specific sizes of variable types

gamehack

Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers (2-byte integers and 4-byte longs). Is there
a portable way of being sure that each integer will be exactly 2 bytes?
Should I have a platform-dependent header file which checks for
CHAR_BIT and sizeof(int) etc. and has a few typedefs like int16, long32
etc.?
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

Thanks
 
Mike Wahler

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?

You can find out the sizes of types, but you can't change them.
See the 'sizeof' operator.
Should I have a platform dependent header file which checks for
CHAR_BIT

You already have such a header as part of your implementation
(<limits.h>). CHAR_BIT gives the number of bits in a character
(byte). 'INT_MAX' gives the largest possible value for type
'int', etc.
and sizeof(int) etc

'sizeof' will give the number of bytes a particular type
uses.

and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?

The only way before compilation is to read your documentation.
At compile-time, use the 'sizeof' operator. E.g. you can
use a 'sizeof' expression for the 'size' argument of 'fread()'.

C99 implementations can provide exact-sized types, but you'll
still need to check documentation to see if/which one(s) are
available for a given implementation.

-Mike
 
Flash Gordon

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

If you want true portability with a binary file you also have to think
about endianness, so reading the file bytewise is best. If you are using
C99, <stdint.h> will help; if not, it is easy to fake up, and people have
posted links to such things here before.

Remember also that the standard does not require an implementation to
actually have a 16 bit integer type, even if CHAR_BIT is 8.
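
As a rough illustration of the bytewise approach, here is a minimal sketch that assembles a 16-bit unsigned value from two octets already read into a buffer. The function names get_u16_be and get_u16_le are made up for this example, and it assumes each file octet arrives in one unsigned char.

```c
/* Hypothetical helpers: decode a 16-bit unsigned value from two octets.
   The result depends only on the order of the octets in the buffer,
   never on the byte order of the machine running the code. */

/* Most significant octet first (big-endian file layout). */
unsigned int get_u16_be(const unsigned char *p)
{
    return ((unsigned int)(p[0] & 0xFFu) << 8) | (unsigned int)(p[1] & 0xFFu);
}

/* Least significant octet first (little-endian file layout). */
unsigned int get_u16_le(const unsigned char *p)
{
    return ((unsigned int)(p[1] & 0xFFu) << 8) | (unsigned int)(p[0] & 0xFFu);
}
```

Because the value is built with shifts, the same source should give the same answer on big-endian and little-endian hosts; only the file format decides which of the two functions to call.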
 
Keith Thompson

gamehack said:
I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

C99 has a standard header called <stdint.h>, which provides typedefs
for int8_t, uint8_t, int16_t, uint16_t, int32_t, etc. Those are
exact-width types, and the signed ones are required to be
2's-complement with no padding bits; they may not exist if the
implementation doesn't provide types with the required attributes. It
also provides "least" types (the smallest type with at least the
specified number of bits) and "fast" types (the "fastest" type with at
least the specified number of bits); it's not always clear what
"fastest" really means.

If your implementation doesn't provide this header, you can roll your
own, or you can use Doug Gwyn's C90-compatible public-domain
implementation q8 at <http://www.lysator.liu.se/c/q8/>. You'll
probably need to do some tailoring for your system.
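
To make the difference between the type families concrete, here is a small sketch, assuming a C99 implementation with <stdint.h>. The function names least16_bits and fast16_bits are invented for the example. It uses only the "least" and "fast" types, which C99 requires, so it should compile even where int16_t does not exist.

```c
#include <limits.h>
#include <stdint.h>

/* Storage size in bits of int_least16_t: the smallest type with at
   least 16 bits. This is sizeof times CHAR_BIT, so it would include
   padding bits if the type had any. */
unsigned int least16_bits(void)
{
    return (unsigned int)(sizeof(int_least16_t) * CHAR_BIT);
}

/* Storage size in bits of int_fast16_t: whatever type the
   implementation considers fastest with at least 16 bits. */
unsigned int fast16_bits(void)
{
    return (unsigned int)(sizeof(int_fast16_t) * CHAR_BIT);
}
```

On many desktop systems the first function returns 16 and the second something larger, but neither result is guaranteed beyond "at least 16".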
 
gamehack

Thanks

Mike said:
You can find out the sizes of types, but you can't change them.
See the 'sizeof' operator.


You already have such a headers as part of your implementation.
(<limits.h>). CHAR_BIT gives the number of bits in a character
(byte). 'INT_MAX' gives the largest possible value for type
'int', etc.


'sizeof' will give the number of bytes a particular type
uses.

? and have a few typedefs like int16, long32

The only way before compilation is to read your documentation.
At compile-time, use the 'sizeof' operator. E.g. you can
Use a 'sizeof' expression for the 'size' argument of 'fread()'.

C99 implementations can provide exact-sized types, but you'll
still need to check documentation to see if/which one(s) are
available for a given implementation.

-Mike
 
Gordon Burditt

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there

2 byte integers is *NOT* a definite size.
I think you mean 2-octet integers or 16-bit integers.
C bytes may have any number of bits >= 8.
a portable way in being sure that each integer will be exactly 2 bytes?

No. And there's no portable way to ensure that an integer will be
exactly 2 octets, either. And even if there were, there's the
bit-order issue, which WILL be an issue if you're reading out of
binary files.
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.

There's no guarantee that there will be a type that is EXACTLY 16, 19,
23, 29, 32, 37, or whatever bits long.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?

You can compile and run a program which outputs the size in bits
of each type you care about, then build a header based on the
results.
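
A minimal sketch of such a generator follows. The names value_bits and emit_uint16_typedef, and the typedef name uint16, are invented for this example. It counts value bits from the <limits.h> maxima rather than using sizeof, so padding bits are not counted.

```c
#include <limits.h>
#include <stdio.h>

/* Count the value bits in an unsigned maximum such as UINT_MAX. */
unsigned int value_bits(unsigned long max)
{
    unsigned int n = 0;

    while (max != 0) {
        max >>= 1;
        n++;
    }
    return n;
}

/* Write one line of a generated header to `out`: a typedef for the
   first unsigned type with exactly 16 value bits, or a comment if the
   implementation has none. Returns what fprintf returns. */
int emit_uint16_typedef(FILE *out)
{
    if (value_bits(UCHAR_MAX) == 16)
        return fprintf(out, "typedef unsigned char uint16;\n");
    if (value_bits(USHRT_MAX) == 16)
        return fprintf(out, "typedef unsigned short uint16;\n");
    if (value_bits(UINT_MAX) == 16)
        return fprintf(out, "typedef unsigned int uint16;\n");
    return fprintf(out, "/* no exact 16-bit unsigned type found */\n");
}
```

A build step could call emit_uint16_typedef on a file opened for writing and then compile the rest of the program against the generated header.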

Gordon L. Burditt
 
Keith Thompson

Mike Wahler said:
and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?

The only way before compilation is to read your documentation.

Well, that's not really true. You can write a program that looks at
At compile-time, use the 'sizeof' operator. E.g. you can
Use a 'sizeof' expression for the 'size' argument of 'fread()'.

For maximum portability, you can't assume that sizeof gives you all
the information you need. There could be padding bits.
C99 implementations can provide exact-sized types, but you'll
still need to check documentation to see if/which one(s) are
available for a given implementation.

Assuming you have <stdint.h>, the exact-width types may or may not be
available; you can determine this during compilation by using the
*_MAX macros. For example:

#include <stdint.h>

#ifdef INT16_MAX
/* ... int16_t is available ... */
#else
/* ... int16_t is not available ... */
#endif

But this is unlikely to be useful. If the program really requires
int16_t, you might as well just use it and let the compilation fail if
it doesn't exist. If you'll settle for a larger type if int16_t
doesn't exist, you can just use int_least16_t.

If you don't have <stdint.h>, as I mentioned elsethread, you can use
Doug Gwyn's q8.

If you wanted to do this yourself from scratch, you could do something
like this:

#if SCHAR_MAX >= 32767
typedef signed char int16;
#elif SHRT_MAX >= 32767
typedef short int16;
#else /* INT_MAX is guaranteed to be at least 32767 */
typedef int int16;
#endif

#if SCHAR_MAX >= 2147483647
typedef signed char int32;
#elif SHRT_MAX >= 2147483647
typedef short int32;
#elif INT_MAX >= 2147483647
typedef int int32;
#else /* LONG_MAX is guaranteed to be at least 2147483647 */
typedef long int32;
#endif

But since the work has already been done for you, there's no need to
bother.
 
gamehack

Hey Keith,

Thanks a lot! I was just exploring my /usr/include and stumbled upon
stdint.h and saw all these typedefs. Exactly what I needed. One more
question. What happens if you use signed char or unsigned char for
input/output? Does it matter which one I'm using?

Regards
 
gamehack

Hello,

I'm quite aware of the endianness problem and the need to swap octets
if necessary. BTW, when I say byte I'm referring to an octet even though
I know that a byte doesn't mean an octet. As I said, stdint.h is doing
the job for me now.

Thanks
 
Rod Pemberton

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?

No. Most C compilers are still C89, and are slowly working toward C99 which
standardized exact sizes. If you are reading in data in sizes larger than a
byte (long, short, etc), you'll also need to be concerned about endianness of
the data.
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32

I'm not sure if you mean 'OS' or 'compiler' by 'platform'. You'll need a
compiler-dependent header file. Each compiler implements its own sizes.
You'll probably want to avoid automatic sizes such as 'int' or 'auto'.
You'll also want to avoid 'unions' of different size data because of the
endianness issue. If you download a classic portable C game, like perhaps
UMORIA, you should be able to 'cut and paste' much of the needed code...
Or, you could look at other portable code such as Info-ZIP's Unzip, etc...

There are a number of programs which will give you the necessary info when
compiled on a specific compiler. One is Steven Pemberton's (no relation)
enquire.c.


Rod Pemberton
 
Keith Thompson

gamehack said:
Hey Keith,

Thanks a lot! I was just exploring my /usr/include and stumbled upon
stdint.h and saw all these typedefs. Exactly what I needed. One more
question. What happens if you use signed char or unsigned char for
input/output? Does it matter which one I'm using?

Regards

(Please don't top-post. Your response goes below, or interspersed
with, any quoted text, and you should trim anything that's not
relevant to your followup.)

You should use whatever type the function you're using expects. Read
the documentation.
 
CBFalconer

Rod said:
No. Most C compilers are still C89, and are slowly working
toward C99 which standardized exact sizes. If you are reading
in data in sizes larger than a byte (long, short, etc), you'll
also need to be concerned about endianness of the data.

C99 only standardizes some names for systems that have components
of those exact sizes. Using stdint.h is not really portable. It
may be convenient.

The only truly portable way to handle binary quantities is as files
of bytes. You have to pack and unpack things yourself into units
of no more than 8 bits, and control all endianness in the file.
You can write functions to do all these chores in a completely
portable manner.

Once you have all that working you can create and read your binary
file. Now you can consider system-dependent optimizations, by
replacing the un/packing functions under conditionals such as:

#if defined MYWACKYSYSTEM
....
#elif defined YOURWACKYSYSTEM
....
#else
.... the portable code already developed ....
#endif

and the routines created for WACKYSYSTEM will take advantage of the
(presumably) known endianness, byte sizes, and sizeof values for
it.

Note that the portable code, by dealing with values, need never
concern itself with the actual system endianness.
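
As one possible shape for those portable un/packing functions, here is a sketch for a 32-bit unsigned quantity stored most significant octet first. The names put_u32_be and get_u32_be are hypothetical, and big-endian file order is just the choice made for this example.

```c
/* Store the low 32 bits of v into four octets, most significant first.
   Each element receives a value in the range 0..255 regardless of
   CHAR_BIT. */
void put_u32_be(unsigned char *p, unsigned long v)
{
    p[0] = (unsigned char)((v >> 24) & 0xFFu);
    p[1] = (unsigned char)((v >> 16) & 0xFFu);
    p[2] = (unsigned char)((v >> 8) & 0xFFu);
    p[3] = (unsigned char)(v & 0xFFu);
}

/* Rebuild the value from four octets written by put_u32_be.
   unsigned long is guaranteed to hold at least 32 bits. */
unsigned long get_u32_be(const unsigned char *p)
{
    return ((unsigned long)(p[0] & 0xFFu) << 24)
         | ((unsigned long)(p[1] & 0xFFu) << 16)
         | ((unsigned long)(p[2] & 0xFFu) << 8)
         |  (unsigned long)(p[3] & 0xFFu);
}
```

The four octets can then be written with fwrite or fputc and read back the same way; the shifts work on values, so the host's own byte order never enters into it.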

--
"If you want to post a followup via groups.google.com, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
 
Jordan Abel

C99 only standardizes some names for systems that have components
of those exact sizes. Using stdint.h is not really portable. It
may be convenient.

The only truly portable way to handle binary quantities is as files
of bytes. You have to pack and unpack things yourself into units
of no more than 8 bits, and control all endianness in the file.
You can write functions to do all these chores in a completely
portable manner.

If you're going to bother doing that, why not go all the way and make it
base64, then it can be moved between different systems via text
conversion tools that presumably already exist for interchange between
such systems.
 
Thad Smith

gamehack said:
I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?

Do you mean ensuring that the size of int for a particular compiler is 2
bytes? Properly written, the application shouldn't need that. An int
in C is at least 16 bits. If that is what you mean by 2 bytes, that
should be sufficient. Extra bits of precision shouldn't be a problem.
 
Malcolm

gamehack said:
I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
This is actually on-topic. Discussion of platform-specific libraries is OT,
but discussing features of compilers which can differ from platform to
platform is perfectly OK.
If you have a binary file, what you want to do is read integers portably.
Therefore

fread(&x, 1, sizeof(int), fp);

is not OK, because you don't know the endianness, and the size might not be
right.

int fget16(FILE *fp)
{
    int answer;

    answer = fgetc(fp);
    answer <<= 8;
    answer |= fgetc(fp);

    return answer;
}

is the way to go.

There are a few issues. Firstly, you need to sign extend the return if you
want to support negative numbers. Secondly, you might want to detect EOF.

Lastly, the code is portable enough, but not 100% strictly portable. The
file might not be read in chunks of 8 bits, and the coding system might not
be twos complement. These possibilities are sufficiently remote that,
usually, you can just forget about them.
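
For the first two issues, one possible variant is sketched below. The name fget_s16_be is made up, and the sketch assumes the file stores the value as 16-bit two's complement, most significant octet first. It reports EOF through the return value and converts the sign arithmetically instead of relying on how the host represents negative numbers.

```c
#include <stdio.h>

/* Read a 16-bit two's-complement value, high octet first.
   Stores the result in *out and returns 0 on success;
   returns -1 if EOF or a read error occurs before both octets arrive. */
int fget_s16_be(FILE *fp, long *out)
{
    int hi;
    int lo;
    unsigned long u;

    hi = fgetc(fp);
    if (hi == EOF)
        return -1;
    lo = fgetc(fp);
    if (lo == EOF)
        return -1;

    u = ((unsigned long)(hi & 0xFF) << 8) | (unsigned long)(lo & 0xFF);

    /* Bit 15 set means the stored value is u - 65536. */
    *out = (u & 0x8000UL) ? (long)u - 65536L : (long)u;
    return 0;
}
```

The result is returned as a long simply because long is guaranteed to hold the whole range -32768 to 32767 with room to spare.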

Once you've got the data into the computer, you are unlikely to care about
whether an integer is 16, 32, or 64 bits, as long as you know that the
maximum value can be represented. If the integers are, say, 4 digit
Gregorian calendar years, it won't do any harm to have extra zeros in the
significant bits. It may waste a little space, but you can deal with that
problem when you come to it.
 
Christian Bau

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

Thanks

#if UINT_MAX != 65535
#error This code is non-portable and doesn't work.
#endif

You have to be a bit careful if you check for 32 bit integers, to write
the check in a way that will work on a 16 bit machine. For example if
you write

#if UINT_MAX != 0xffffffff
#error This code is non-portable and doesn't work.
#endif

you can't be sure what a sixteen bit compiler would do. It might not be
able to handle 0xffffffff and interpret it as 0xffff, and the #error
statement would not be compiled. This will work:

#if ((UINT_MAX >> 15) >> 15) != 3
#error This code is non-portable and doesn't work
#endif
 
pete

gamehack said:
Hi all,

I'm writing an application which will be opening binary files with
definite sizes for integers(2 byte integers and 4 byte longs). Is there
a portable way in being sure that each integer will be exactly 2 bytes?
Should I have a platform dependent header file which checks for
CHAR_BIT and sizeof(int) etc and have a few typedefs like int16, long32
etc.
<OT>
If I have to go non-portable, then what's the way to determine the
sizes before compilation and typedef accordingly?
</OT>

Binary files aren't closely associated with portability.
Text files are.
 
CBFalconer

Malcolm said:
This is actually on-topic. Discussion of platform-specific
libraries is OT, but discussing features of compilers which can
differ from platform to platform is perfectly OK.
If you have a binary file, what you want to do is read integers
portably. Therefore

fread(&x, 1, sizeof(int), fp);

is not OK, because you don't know the endianness, and the size
might not be right.

int fget16(FILE *fp)
{
    int answer;

    answer = fgetc(fp);
    answer <<= 8;
    answer |= fgetc(fp);
    return answer;
}

is the way to go.

Not good enough. You haven't allowed for the possible variation of
CHAR_BIT. Also you must not create an integer overflow. Thus:

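unsigned int fget16(FILE *fp) {
    unsigned int ans;

    /* Convert to unsigned before shifting, so that 0xff << 8
       cannot overflow a 16-bit int. */
    ans = ((unsigned int)fgetc(fp) & 0xffu) << 8;
    ans |= ((unsigned int)fgetc(fp) & 0xffu);
    return ans;
}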
There are a few issues. Firstly, you need to sign extend the
return if you want to support negative numbers. Secondly, you
might want to detect EOF.

Exactly. Except sign-extend is not the right term.
Lastly, the code is portable enough, but not 100% strictly
portable. The file might not be read in chunks of 8 bits,
and the coding system might not be twos complement. These
possibilities are sufficiently remote that, usually, you can just
forget about them.

Those are so easily handled (see above) that there is no excuse for
failure.
Once you've got the data into the computer, you are unlikely to
care about whether an integer is 16, 32, or 64 bits, as long as
you know that the maximum value can be represented. If the
integers are, say, 4 digit Gregorian calendar years, it won't do
any harm to have extra zeros in the significant bits. It may waste a
little space, but you can deal with that problem when you come to it.


 
Joe Wright

pete said:
Binary files aren't closely associated with portability.
Text files are.

Go pete! That's why source code is in text files!

Not to talk down to anyone but ASCII is American Standard Code for
Information Interchange. It is text. It is Open Standard.

It is the right way, usually, to communicate among disparate systems and
architectures.

If you will communicate with 'binary' files, you must know exquisitely
the format of the file you are reading. You cannot 'detect' the format
by reading the file.

Trying to write programs to handle format differences among binary files
from disparate systems will make you very tired, frustrated and old.
 
Keith Thompson

Christian Bau said:
#if UINT_MAX != 65535
#error This code is non-portable and doesn't work.
#endif

You have to be a bit careful if you check for 32 bit integers, to write
the check in a way that will work on a 16 bit machine. For example if
you write

#if UINT_MAX != 0xffffffff
#error This code is non-portable and doesn't work.
#endif

you can't be sure what a sixteen bit compiler would do. It might not be
able to handle 0xffffffff and interpret it as 0xffff, and the #error
statement would not be compiled.

Actually, that shouldn't be a problem. In C99, the preprocessor
evaluates integers as if they were of type intmax_t or uintmax_t,
which is guaranteed to be at least 64 bits. I don't have my copy of
the C90 standard handy, but I'm fairly sure the rule is similar,
except that only 32 bits are guaranteed.

If you want to go beyond 32 bits, and you can't assume a C99
implementation, you might have to resort to some ugly tricks.
 
