strange use of format specifier in printf


Andrey Tarasevich

Money said:
Here in this thread
http://groups.google.co.in/group/co...?q=finding+endianness&rnum=1#e3b0dbf76f3e02e3

Tydr Schnubbis, in the 3rd reply, used %d for printing chars... isn't
that wrong?

On a platform with signed 'char' type, when 'char' values are passed as
arguments for '...' (ellipsis) parameters, they are first promoted to 'int'
values. So it's 'int' values that are actually passed. And there's nothing wrong
with using '%d' format specifier with 'int' values.

On a platform with unsigned 'char' type, it is possible that 'int' is not large
enough to hold all values of 'char' and 'char' will be promoted to 'unsigned
int' instead. In this particular case the code would lead to undefined behavior,
since it is illegal to use '%d' format specifier with 'unsigned int' values.
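
A minimal sketch of what that means in practice (the variable name is
made up for illustration):

#include <stdio.h>

int main(void)
{
    char c = 'A';
    /* 'c' undergoes the default argument promotions here: it is passed
       to printf as an 'int' (or 'unsigned int' in the exotic case
       described above), so '%d' matches the promoted type on a typical
       implementation and prints 65 on an ASCII system. */
    printf("%d\n", c);
    return 0;
}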
 

Money

Tom said:
No. It will print them in decimal though, not as characters.

I suggest you look up what char is promoted to when passed as an
argument.

Tom

But we are telling printf to print 32 bits (on my system), and char is
not 32 bits (I know it can be, but at least it's not on my system).
 

Money

Andrey said:
On a platform with signed 'char' type, when 'char' values are passed as
arguments for '...' (ellipsis) parameters, they are first promoted to 'int'
values. So it's 'int' values that are actually passed. And there's nothing wrong
with using '%d' format specifier with 'int' values.

On a platform with unsigned 'char' type, it is possible that 'int' is not large
enough to hold all values of 'char' and 'char' will be promoted to 'unsigned
int' instead. In this particular case the code would lead to undefined behavior,
since it is illegal to use '%d' format specifier with 'unsigned int' values.

Thanks... I got it.
 

Tom St Denis

Money said:
But we are telling printf to print 32 bits (on my system), and char is
not 32 bits (I know it can be, but at least it's not on my system).
AHEM

Thanks for playing the usenet game. Can you now play the research
game?

Tom
 

Gordon Burditt

I suggest you look up what char is promoted to when passed as an
argument.
But we are telling printf to print 32 bits (on my system), and char is
not 32 bits (I know it can be, but at least it's not on my system).

But what is char promoted to when passed as an argument on your
system?

Gordon L. Burditt
 

Peter Nilsson

Andrey said:
...
On a platform with unsigned 'char' type, it is possible that 'int' is not
large enough to hold all values of 'char' and 'char' will be promoted
to 'unsigned int' instead. In this particular case the code would lead
to undefined behavior, since it is illegal to use '%d' format specifier
with 'unsigned int' values.

On such a hosted implementation, you'll likely find considerably
more problems than just printing a char. The C standards seem
to let the QoI (quality of implementation) gods rule out the
possibility of such implementations existing.

[Different story for freestanding implementations though. Real
implementations with CHAR_MAX == UINT_MAX exist, though
none which include <stdio.h> support, AFAIK.]
 

Money

How about this solution

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
    int x = 0x0001;
    char *y = (char *) &x;
    return (y[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}
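
A minimal driver, assuming the macros and function above are in scope:

#include <stdio.h>

int main(void)
{
    /* Prints 1 (LITTLE_ENDIAN) or 0 (BIG_ENDIAN) as defined above. */
    printf("%d\n", TestByteOrder());
    return 0;
}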
 

Richard Heathfield

Money said:
How about this solution

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
    int x = 0x0001;
    char *y = (char *) &x;
    return (y[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This fails to identify middle-endian systems.
 

Money

Richard said:
Money said:
How about this solution

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
    int x = 0x0001;
    char *y = (char *) &x;
    return (y[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This fails to identify middle-endian systems.

--
Richard Heathfield
"Usenet is a strange place" - dmr 29/7/1999
http://www.cpax.org.uk
email: rjh at above domain (but drop the www, obviously)


Forgetting about any ordering other than these two, will it work
fine if char is more than 8 bits?
 

Keith Thompson

Money said:
Richard said:
Money said:
How about this solution

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
    int x = 0x0001;
    char *y = (char *) &x;
    return (y[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This fails to identify middle-endian systems.

Forgetting about any ordering other than these two, will it work
fine if char is more than 8 bits?

Maybe.

A very minor point: I'd write the initializer for x as "0x1" or just
"1". The three leading zeros seem to imply that int is 16 bits, which
of course it may or may not be.

If char is at least 16 bits, then it's possible that sizeof(int)==1;
in that case, int has no meaningful byte order, but your function will
return LITTLE_ENDIAN.

If int has at least CHAR_BIT padding bits at its lowest address, your
function will return BIG_ENDIAN if the padding bits happen to be set to
0, or possibly some meaningless result if the padding bits are set to
some arbitrary value.

Your function tests the byte order of type int. It's not
inconceivable that other integer types could have different byte
orders.

None of these problems are likely to turn up on any modern hosted
system.
 

Frederick Gotham

Richard Heathfield posted:
Money said:
How about this solution

#define BIG_ENDIAN 0
#define LITTLE_ENDIAN 1

int TestByteOrder()
{
    int x = 0x0001;
    char *y = (char *) &x;
    return (y[0] ? LITTLE_ENDIAN : BIG_ENDIAN);
}

This fails to identify middle-endian systems.


Start off with an unsigned integer, and set its value to zero.

Then set its second LSB to 1 (achieve this by taking 1 and shifting it
CHAR_BIT places to the left, and then OR-ing it with the original
variable).

Then set its third LSB to 2. Then set its fourth LSB to 3. And so on.

Then use a char pointer to go through the unsigned integer's bytes. The
byte with the value 0 is the LSB. The byte with the value 1 is the second
LSB. And so on. (But beware of padding within the unsigned integer!).

I've already written such code many times but I'm working off a laptop
and don't have my code directory with me...
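
Off the top of my head, the approach would look something like this
(an untested sketch; it assumes unsigned int has no padding bits):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned int x = 0;
    unsigned char *p = (unsigned char *) &x;
    size_t i;

    /* Give byte i the value i: the LSB keeps 0, the second LSB gets 1,
       the third LSB gets 2, and so on. */
    for (i = 1; i < sizeof x; i++)
        x |= (unsigned int) i << (i * CHAR_BIT);

    /* Walk the object representation from the lowest address and print
       the significance rank stored at each position. */
    for (i = 0; i < sizeof x; i++)
        printf("address offset %u holds byte rank %u\n",
               (unsigned) i, (unsigned) p[i]);

    return 0;
}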
 

Money

Keith said:
If char is at least 16 bits, then it's possible that sizeof(int)==1;
in that case, int has no meaningful byte order, but your function will
return LITTLE_ENDIAN.

How is it possible for sizeof(int)==1? I am not able to understand.
If int has at least CHAR_BIT padding bits at its lowest address, your
function will BIG_ENDIAN if the padding bits happen to be set to 0, or
possibly some meaningless result if the padding bits are set to some
arbitrary value.

I really didn't understand that. Please can you explain it in simpler words?
 

Keith Thompson

Money said:
How is it possible for sizeof(int)==1? I am not able to understand.

char must be at least 8 bits (CHAR_BIT >= 8).

int must be at least 16 bits (CHAR_BIT * sizeof(int) >= 16). [1]

An implementation with CHAR_BIT==16 and sizeof(int)==1 would satisfy
these requirements.

Note that sizeof yields the size of its argument in bytes. In C, a
"byte" is by definition the size of a char, so sizeof(char) == 1 by
definition, however many bits that happens to be. (It's common these
days to use the term "byte" to mean exactly 8 bits, but that's not how
C uses the term; a better word for exactly 8 bits is "octet".)
I really didn't understand that. Please can you explain it in simpler words?

Here's an example. Suppose CHAR_BIT==8, and sizeof(int)==4 (32 bits),
but only the high-order 24 bits contribute to the value; the low-order
8 bits are ignored. These 8 bits are called "padding bits". Suppose
the byte order is little-endian. Then the value 0x654321, for example,
would be represented by the byte values (0x00, 0x21, 0x43, 0x65), shown
from lowest to highest addresses within the word.

The proposed code sets an int to the value 1, which on our
hypothetical system would be represented as (0x00, 0x01, 0x00, 0x00).
It then looks at the first byte (at the lowest address) of the
representation. Seeing the value 0x00, it assumes, incorrectly, that
the 1 byte was stored at the other end of the word, and that the
machine is big-endian.

(I *think* I got this right.)

[1] The standard doesn't actually say directly that int is at least 16
bits. It says that the range of values it can represent is at
least -32767 .. +32767. That, and the fact that a binary
representation is required, imply that it's at least 16 bits.
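
Incidentally, if you want to see what your own implementation actually
stores, you can dump the bytes of an int yourself (this only shows the
object representation; it can't tell you which bits, if any, are
padding):

#include <stdio.h>

int main(void)
{
    int x = 1;
    unsigned char *p = (unsigned char *) &x;
    size_t i;

    /* Print each byte of x, lowest address first. */
    for (i = 0; i < sizeof x; i++)
        printf("%02X ", (unsigned) p[i]);
    putchar('\n');
    return 0;
}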
 

Andrew Poelstra

How is it possible for sizeof(int)==1? I am not able to understand.

int is guaranteed to be at least 16 bits. char is guaranteed to be
at least 8 bits, and sizeof (char) will always equal one. Therefore,
if both a char and an int are 16 bits (or any number above that),
sizeof (int) will equal one.
I really didn't understand that. Please can you explain it in simpler words?

Padding bits are unused bits in a variable that don't actually hold
a value. They might contain metadata about the variable, and so if
they are corrupted, no one knows what will happen. (It won't be UB
that I know of, but it may not be what you expect.)
 

Keith Thompson

Andrew Poelstra said:
Padding bits are unused bits in a variable that don't actually hold
a value. They might contain metadata about the variable, and so if
they are corrupted, no one knows what will happen. (It won't be UB
that I know of, but it may not be what you expect.)

Padding bits are defined only for integer types. (For other types,
the standard doesn't say enough about their representation for the
concept to be meaningful.)

Padding bits do not contribute to the value of an object. Certain
values of padding bits might create a trap representation; accessing
an object that contains a trap representation invokes undefined
behavior.
 

Jack Klein

On a platform with signed 'char' type, when 'char' values are passed as
arguments for '...' (ellipsis) parameters, they are first promoted to 'int'
values. So it's 'int' values that are actually passed. And there's nothing wrong
with using '%d' format specifier with 'int' values.

On a platform with unsigned 'char' type, it is possible that 'int' is not large
enough to hold all values of 'char' and 'char' will be promoted to 'unsigned
int' instead. In this particular case the code would lead to undefined behavior,
since it is illegal to use '%d' format specifier with 'unsigned int' values.

That is not entirely true. It is quite legal to pass an unsigned int
type to *printf() with a conversion specifier for the corresponding
signed type, and vice versa, provided that the value is within the
range of values that can be held in both types.

This snippet, assuming correct header inclusion:

void func(void)
{
    int si = 2;
    unsigned int ui = 2;

    printf("\n%d %u\n", ui, si);
}

...must produce the output "2 2".

Passing a negative signed integer type to printf() with an unsigned
conversion specifier, or passing an unsigned integer type with a value
greater than TYPE_MAX with a signed conversion specifier is undefined.

You can also deduce from the standard that this common-value-range
rule actually applies to any function in C.

You can pass a signed int type (int or larger) to a function expecting
the corresponding unsigned type, or an unsigned type (int or larger)
to a function expecting the corresponding signed type, as long as the
value of type actually passed is in the range 0 through TYPE_MAX (not
UTYPE_MAX) inclusive.
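
To put that last point in concrete terms (the values here are chosen
just for illustration):

#include <stdio.h>

int main(void)
{
    /* Fine: 42 is representable in both int and unsigned int, so the
       signed/unsigned mismatch with the conversion specifier is
       harmless. */
    printf("%u\n", 42);    /* 42 is an int, printed with %u    */
    printf("%d\n", 42u);   /* 42u is unsigned, printed with %d */

    /* Not defined (don't do this):
       printf("%u\n", -1);           negative value with %u
       printf("%d\n", 3000000000u);  value above INT_MAX (with a
                                     32-bit int) with %d          */
    return 0;
}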
 
