C++ integer data types


Ioannis Vranos

Shailesh said:
I wrote a short page as a quick reference to c++ integer data types. Any
feedback welcome: http://www.somacon.com/blog/page11.php



"The size of a char is guaranteed to be at least eight bits. The actual
size is an unspecified, system-dependent unit that can represent the
implementation's character set."


Wrong! The size of char is *always* 1 byte, but a byte is not always
8 bits (although most of the time it is).

We use char, signed char and unsigned char, not only to deal with
characters, but also to deal with bytes.


For example any object of C++ can be considered (and read) as a sequence
of unsigned chars safely.


Any POD object can be read as a sequence of plain chars too. And you can
copy it to an unsigned char/plain char array of the same size and create
a (shallow) copy of the first object.
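
As a rough illustration of the byte-wise copy described above, here is a minimal sketch. The struct `Sample` and the function `copy_via_bytes` are made-up names for this example only; the point is just that a POD object can be copied through an unsigned char array of the same size.

```cpp
#include <cstring>   // std::memcpy, std::memcmp

// Hypothetical POD type, used only for illustration.
struct Sample {
    int id;
    double value;
};

// Copies src into dst by going through a raw byte buffer.
// Returns true if both objects end up byte-for-byte identical.
bool copy_via_bytes(const Sample& src, Sample& dst)
{
    unsigned char buffer[sizeof(Sample)];       // same size as the object
    std::memcpy(buffer, &src, sizeof(Sample));  // object -> bytes
    std::memcpy(&dst, buffer, sizeof(Sample));  // bytes -> second object
    return std::memcmp(&src, &dst, sizeof(Sample)) == 0;
}
```

Calling `copy_via_bytes(a, b)` on two `Sample` objects leaves `b` as a shallow copy of `a`. This is only safe for POD types; objects with non-trivial constructors or destructors should not be copied this way.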


-------------------------------------------------------------------------


"A new type has been introduced, long long, that is guaranteed to be at
least 64 bits. See limits.h to get the ranges on your system."

Wrong, long long is not part of C++98 and also you omitted <limits>. So
the above would be better this way:


"Use numeric_limits in <limits> (or the C subset <climits> constants) to
get the ranges of all built in types in your system".
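
A small sketch of what that suggestion could look like in practice. The function names `print_ranges` and `limits_agree` are invented for this example; the `numeric_limits` members and the `<climits>` macros are standard.

```cpp
#include <climits>   // INT_MIN, INT_MAX, LONG_MAX, UCHAR_MAX
#include <limits>    // std::numeric_limits
#include <ostream>

// Writes the ranges of a few built-in integer types to out.
void print_ranges(std::ostream& out)
{
    out << "int:  " << std::numeric_limits<int>::min() << " .. "
        << std::numeric_limits<int>::max() << '\n';
    out << "long: " << std::numeric_limits<long>::min() << " .. "
        << std::numeric_limits<long>::max() << '\n';
    // Character types are streamed as characters, so convert first.
    out << "unsigned char: 0 .. "
        << static_cast<unsigned int>(std::numeric_limits<unsigned char>::max())
        << '\n';
}

// True when <limits> and the C subset <climits> report the same values.
bool limits_agree()
{
    return std::numeric_limits<int>::max() == INT_MAX
        && std::numeric_limits<int>::min() == INT_MIN
        && std::numeric_limits<long>::max() == LONG_MAX
        && std::numeric_limits<unsigned char>::max() == UCHAR_MAX;
}
```

Passing `std::cout` to `print_ranges` prints the ranges for the current implementation; the actual numbers depend on the platform.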
 

Alf P. Steinbach

* Ioannis Vranos:
"The size of a char is guaranteed to be at least eight bits. The actual
size is an unspecified, system-dependent unit that can represent the
implementation's character set."

Wrong! The size of char is *always* 1 byte, but a byte is not always
8 bits (although most of the time it is).

I don't see anything _wrong_ in the quoted passage; it doesn't mention
bytes, and since C++ adopts CHAR_MIN and CHAR_MAX from C it (frantic
handwaving to distract academics) inherits C's guarantee of >= 8 bits.
 

Ioannis Vranos

Alf said:
I don't see anything _wrong_ in the quoted passage; it doesn't mention
bytes, and since C++ adopts CHAR_MIN and CHAR_MAX from C it (frantic
handwaving to distract academics) inherits C's guarantee of >= 8 bits.


Yes. Perhaps he should replace the word "size" with the word "width" or
"bit-size".
 

Jack Klein

I wrote a short page as a quick reference to c++ integer data types.
Any feedback welcome: http://www.somacon.com/blog/page11.php

First, you confuse C99's <stdint.h> and <inttypes.h> headers, although
I admit the names are not terribly intuitive.

It is <stdint.h> that provides for various integer types.
<inttypes.h> provides macros for performing formatted character and
wide character input/output operations on the types.

I find your list of "Integer Data Types" likely to be very confusing
for someone not already familiar with the types. You list "short int"
and "long int" on the same line with int, when they are different
types. Then your list of synonyms is incomplete.

Finally, your recommendation is seriously flawed as far as I'm
concerned. The whole point of <stdint.h> is to allow writing code
that is portable to different implementations with different widths
for the integer types. Note that it is possible to write one's own
<stdint.h> for any conforming C++ compiler, with the possible omission
of the 64 bit types.

This is of more than academic concern to some of us, especially those
who work in multiple and less-common environments, such as digital
signal processors and embedded systems of various types.

About a year ago I wrote code to deal with data from a CAN bus
interface. The hardware level driver code is of course off-topic
here, but the code that packs data into packets to transmit, and
unpacks data from received packets to parse, was 100% standard.

The CAN packet in memory consisted of 128 bits in contiguous bytes.
The data in the middle 64 bits represented data that could be any
combination of 8-bit, 16-bit, and 32-bit values.

The processors on each end of the link had very different hardware
restrictions and integer types.

The master was an ARM microcontroller that requires alignment of
16-bit data to even addresses and 32-bit data to addresses evenly
divisible by four. The slave was a DSP with 16-bit bytes that can't
address 8-bit values in memory at all, and required 32-bit data to be
aligned to even addresses.

The ARM compiler came with a <stdint.h> header, the DSP compiler did
not so I wrote my own. Then I wrote packetizing, depacketizing, and
parsing routines using that <stdint.h> that compiled and executed
unchanged on both sides.
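
The following is not the code from the project described above, only a minimal sketch of the general technique: packing and unpacking values one octet per array element using shifts and masks, so that alignment and the native byte width do not matter. The function names and the choice of most-significant-octet-first order are assumptions made for this example. It uses unsigned long (at least 32 bits) rather than the <stdint.h> types so that it does not depend on that header being available.

```cpp
// Stores the low 16 bits of value as two octets, most significant first.
void put_u16_be(unsigned char* out, unsigned int value)
{
    out[0] = static_cast<unsigned char>((value >> 8) & 0xFFu);
    out[1] = static_cast<unsigned char>(value & 0xFFu);
}

// Stores the low 32 bits of value as four octets, most significant first.
void put_u32_be(unsigned char* out, unsigned long value)
{
    out[0] = static_cast<unsigned char>((value >> 24) & 0xFFul);
    out[1] = static_cast<unsigned char>((value >> 16) & 0xFFul);
    out[2] = static_cast<unsigned char>((value >> 8) & 0xFFul);
    out[3] = static_cast<unsigned char>(value & 0xFFul);
}

// Reads two octets, most significant first, into a 16-bit value.
unsigned int get_u16_be(const unsigned char* in)
{
    return ((in[0] & 0xFFu) << 8) | (in[1] & 0xFFu);
}

// Reads four octets, most significant first, into a 32-bit value.
unsigned long get_u32_be(const unsigned char* in)
{
    return ((in[0] & 0xFFul) << 24)
         | ((in[1] & 0xFFul) << 16)
         | ((in[2] & 0xFFul) << 8)
         |  (in[3] & 0xFFul);
}
```

Each array element carries exactly one octet even where a byte is wider than 8 bits, and every access is to a single unsigned char, so no alignment requirement comes into play.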

You might want to take a look at my page:
http://www.jk-technology.com/c/inttypes.html
 

Ioannis Vranos

Jack said:
You might want to take a look at my page:
http://www.jk-technology.com/c/inttypes.html


Very interesting information. However, regarding that page, I think that
placing C and C++ information together can be very confusing, since they
are two different languages with competing features (like complex types)
and different characteristics (like the meaning of const), and I guess
they will diverge even more in the future.


In any case, this kind of mixed-language information can only confuse a
newcomer to C or C++.
 

Shailesh Humbad

Hi Jack,

I'm glad you replied. I had skimmed through your page and found the
facts about odd/even alignment and addressing quite interesting. I
didn't claim my page was right or even good. It's just my effort to
understand what's going on. If even the experts contradict each
other on this topic, then forgive me for being hopeless. I will have to
look into the synonym issues further. I'll also have to use "width" or
"bit-size" where I'm using "size". You're right about the header files.
Unfortunately, neither stdint.h nor inttypes.h is included in MSVC
2003, so I haven't been exposed to either. At--

http://www.opengroup.org/onlinepubs/009695399/basedefs/stdint.h.html

--it says "The <stdint.h> header is a subset of the <inttypes.h> header
more suitable for use in freestanding environments, which might not
support the formatted I/O functions. In some environments, if the
formatted conversion support is not wanted, using this header instead of
the <inttypes.h> header avoids defining such a large number of macros."

As for the recommendation to only use int, double, char, and bool until
otherwise required, you'll have to take that up with Bjarne Stroustrup.
I based the recommendation, and most of my other information, on what
it says in Section 4.1.1 of his book, _The C++ Programming Language_.

"For most applications, one could simply use bool for logical values,
char for characters, int for integer values, and double for
floating-point values. The remaining fundamental types are variations
for optimizations and special needs that are best ignored until such
needs arise. They must be known, however, to read old C and C++ code."

BTW, I'm writing a protocol that runs over TCP/IP. A major assumption
of TCP is that a byte is always 8 bits. ("... we constrain the length
of a segment to an integral number of 8-bit bytes." --
http://cs.mills.edu/180/reading/CK74.pdf -- "A Protocol for Packet
Network Intercommunication" -- Don't you love the Internet?) Therefore,
in order to define my protocol in C++ and increase its cross-platform
compilability, I need certainty regarding bit-widths. Would you
recommend using stdint.h for this purpose, and if so, where can I get it
from?

Thanks,
Shailesh
 

Ioannis Vranos

Shailesh said:
Unfortunately, neither stdint.h nor inttypes.h is included in MSVC
2003, so I haven't been exposed to either.


These are also not part of the C++98 standard, and VC++2003 supports
C++98 and C90 (and I guess it will never support C99).


BTW, I'm writing a protocol that runs over TCP/IP. A major assumption
of TCP is that a byte is always 8 bits. ("... we constrain the length
of a segment to an integral number of 8-bit bytes." --
http://cs.mills.edu/180/reading/CK74.pdf -- "A Protocol for Packet
Network Intercommunication" -- Don't you love the Internet?) Therefore,
in order to define my protocol in C++ and increase its cross-platform
compilability, I need certainty regarding bit-widths. Would you
recommend using stdint.h for this purpose, and if so, where can I get it
from?


stdint.h is not part of C++. To get the number of bits in a byte, or the
properties of any other built-in type, you can use numeric_limits,
defined in <limits>.

More precisely, the number of bits of a byte in a given implementation is:


std::numeric_limits<unsigned char>::digits



unsigned char and char are guaranteed to have no padding bits in C++ (in
C99 char may have, but let's forget C here).
 

Shailesh Humbad

Ioannis said:
These are also not part of the C++98 standard, and VC++2003 supports
C++98 and C90 (and I guess it will never support C99).






stdint.h is not part of C++. To get the number of bits in a byte, or the
properties of any other built-in type, you can use numeric_limits,
defined in <limits>.

More precisely, the number of bits of a byte in a given implementation is:


std::numeric_limits<unsigned char>::digits



unsigned char and char are guaranteed to have no padding bits in C++ (in
C99 char may have, but let's forget C here).

That gives me the number of bits of a byte at run time, but I need
certainty of it at compile time. Maybe I will try <boost/cstdint.hpp>.
I think that would be better than what I'm doing now, which is to use
the Microsoft-specific unsigned __int8 type.
 

Lionel B

Shailesh said:
That gives me the number of bits of a byte at run time, but
I need certainty of it at compile time.

std::numeric_limits<unsigned char>::digits is surely - as a static
const member - known at compile-time? My compiler (gcc 3.3.3 Cygwin)
evidently thinks so:

<code>

#include <iostream>
#include <limits>

int main()
{
    std::cout << "Bits per char = "
              << std::numeric_limits<unsigned char>::digits;

    // An array bound must be a compile-time constant, so this line
    // only compiles if digits is one.
    int array[std::numeric_limits<unsigned char>::digits];
}

</code>

Output:

Bits per char = 8
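
Since digits is usable as a constant expression, it can also drive a check that stops compilation outright when a byte is not 8 bits wide. The following is only a sketch of one C++98-style way to do that; the names `CompileTimeCheck`, `byte_is_octet` and `octet` are invented for this example.

```cpp
#include <limits>

// Declared for both values but defined only for true, so using
// CompileTimeCheck<false>::ok is a compile-time error.
template <bool Condition> struct CompileTimeCheck;
template <> struct CompileTimeCheck<true> { enum { ok = 1 }; };

// Fails to compile unless unsigned char has exactly 8 value bits.
enum {
    byte_is_octet =
        CompileTimeCheck<std::numeric_limits<unsigned char>::digits == 8>::ok
};

// With the check above in place, unsigned char can stand in for an
// 8-bit unsigned type.
typedef unsigned char octet;
```

On an implementation with wider bytes the enum definition refers to an incomplete type and the translation unit is rejected, which is the compile-time certainty asked for above.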

Regards,
 

Ioannis Vranos

Lionel said:
std::numeric_limits<unsigned char>::digits is surely - as a static
const member - known at compile-time? My compiler (gcc 3.3.3 Cygwin)
evidently thinks so:

<code>

#include <iostream>
#include <limits>

int main()
{
    std::cout << "Bits per char = "
              << std::numeric_limits<unsigned char>::digits;

    // An array bound must be a compile-time constant, so this line
    // only compiles if digits is one.
    int array[std::numeric_limits<unsigned char>::digits];
}

</code>

Output:

Bits per char = 8


You are right. However, I couldn't make it work in #if directives with my
compilers, since the preprocessor cannot evaluate numeric_limits members.
He can also use the C90-subset macro CHAR_BIT, defined in <climits>:


#include <iostream>
#include <climits>

int main()
{
#if CHAR_BIT == 8
    std::cout << "Byte 8 bits\n";
#else
    std::cout << "Byte not 8 bits!\n";
#endif
}




CHAR_BIT provides the bits of a byte in both C90 and C++98, but not in
C99, since in the latter a char is allowed to have padding bits.

However, why should we mention C99 all the time in here? :)
 

Ioannis Vranos

Ioannis said:
CHAR_BIT provides the bits of a byte in both C90 and C++98, but not in
C99, since in the latter a char is allowed to have padding bits.

However, why should we mention C99 all the time in here? :)


Actually it may work the same in C99 too, since C99 also states that
CHAR_BIT provides the number of bits of a byte.

So in summary regarding C++98, as far as I know, we can get the bits of
a byte by using numeric_limits<unsigned char>::digits (defined in
<limits>) and CHAR_BIT (defined in <climits> and <limits.h>).


Important: Since numeric_limits<T>::digits provides the bits without the
sign bit, numeric_limits<char>::digits does not provide the bits of a byte.
 

Ioannis Vranos

Ioannis said:
Actually it may work the same in C99 too, since C99 also states that
CHAR_BIT provides the number of bits of a byte.

So in summary regarding C++98, as far as I know, we can get the bits of
a byte by using numeric_limits<unsigned char>::digits (defined in
<limits>) and CHAR_BIT (defined in <climits> and <limits.h>).


Important: Since numeric_limits<T>::digits provides the bits without the
sign bit, numeric_limits<char>::digits does not provide the bits of a byte.


... when char is implemented as signed.
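
To make the point about the sign bit concrete, here is a small sketch. The helper `total_bits` is a made-up name; it simply adds the sign bit back to digits. It relies on the character types having no padding bits in C++, as discussed earlier in the thread, so it should not be assumed to hold for wider integer types.

```cpp
#include <climits>   // CHAR_BIT
#include <limits>    // std::numeric_limits

// Value bits plus the sign bit (if any) of type T.
template <typename T>
int total_bits()
{
    return std::numeric_limits<T>::digits
         + (std::numeric_limits<T>::is_signed ? 1 : 0);
}
```

For example, `total_bits<unsigned char>()`, `total_bits<signed char>()` and `total_bits<char>()` should all equal CHAR_BIT, while `std::numeric_limits<signed char>::digits` alone is CHAR_BIT - 1.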
 
