portably printing a size_t


Francois Grieu

How can we portably print (or convert to a string representation)
the value of a variable of type size_t?

Francois Grieu
 

Francois Grieu

Beej Jorgensen wrote :

The FAQ suggests: "Use a cast to convert the value to a known,
conservatively-sized type, then use the printf format matching
that type"

and shows (paraphrased)
printf("my_len = %lu", (unsigned long)my_len);

The above is portable in the sense that it compiles and runs without
causing undefined behavior; but not in the sense that it shows the
value of my_len<offtopic>; one platform where it fails is 64-bit
windows</offtopic>.

Can we portably choose a conservatively-sized type, and the
associated printf format?

> Also, C99 has %zu (where z tells printf() that the unsigned
> argument is a size_t.)

I know environments with C99 extensions which do not support this
particular C99 extension.


Francois Grieu
 

James Kuyper

Francois said:
Beej Jorgensen wrote :

The FAQ suggests: "Use a cast to convert the value to a known,
conservatively-sized type, then use the printf format matching
that type"

and shows (paraphrased)
printf("my_len = %lu", (unsigned long)my_len);

The above is portable in the sense that it compiles and runs without
causing undefined behavior; but not in the sense that it shows the
value of my_len<offtopic>; one platform where it fails is 64-bit
windows</offtopic>.

That shouldn't be a problem on any conforming implementation of C90, and
it shouldn't be a problem on a conforming implementation of C99 unless
my_len > ULONG_MAX. Do you know whether it was?

> Can we portably choose a conservatively-sized type, and the
> associated printf format?

Not if you need portability to C99. In C99, SIZE_MAX can be too large to
be printed out using any format string compatible with C90. The best you
can do in the way of portability is:

#if __STDC_VERSION__ >= 199901L
printf("%zu", mylen);
#else
printf("%lu", (unsigned long)mylen);
#endif
If you need to do this many times in the same program, you could
probably wrap that in some fancy macros or a function.

> I know environments with C99 extensions which do not support this
> particular C99 extension.

It's not properly an extension, it's a new feature.

If you need portability to an implementation that doesn't conform fully
to either standard, you're going to have to get into the details of that
implementation to find out what will work.
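
The wrapping James mentions might look roughly like the sketch below. The names SIZE_FMT, SIZE_ARG and format_size are made up for illustration and do not come from the thread; the version test is the same one used above, so it simply trusts what the implementation claims about itself.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper macros: choose the format and the argument
   conversion once, based on the language version the compiler claims. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define SIZE_FMT "%zu"
#define SIZE_ARG(n) ((size_t)(n))
#else
#define SIZE_FMT "%lu"
#define SIZE_ARG(n) ((unsigned long)(n))
#endif

/* Write the decimal form of n into buf and return the number of
   characters written; buf is assumed to be large enough. */
int format_size(char *buf, size_t n)
{
    return sprintf(buf, SIZE_FMT, SIZE_ARG(n));
}
```

A call site would then read printf("my_len = " SIZE_FMT "\n", SIZE_ARG(my_len)); and the choice of format lives in one place.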
 

Rui Maciel

Francois said:
I know environments with C99 extensions which do not support this
particular C99 extension.

Then those compilers fail to comply with the C99 standard.

Rui Maciel
 

Keith Thompson

Rui Maciel said:
Then those compilers fail to comply with the C99 standard.

Of course they do. Nevertheless, they exist, and some programmers
need to deal with them.

Note that printf's "%zu" is implemented by the library, which is often
(but not always) provided by a different entity than the compiler.
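
Because the library and the compiler can disagree, one pragmatic option is to probe the library itself, in the way a configure-style build check would. The sketch below is only an illustration: the function name is invented, and passing an unsupported conversion to sprintf is strictly undefined behavior, so this belongs in a throwaway build-time test rather than in production code.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical probe: returns 1 if this library's sprintf appears to
   understand "%zu", 0 otherwise.  A library without "%zu" typically
   emits something other than "42", but nothing guarantees that. */
int printf_seems_to_support_zu(void)
{
    char buf[32];

    memset(buf, 0, sizeof buf);
    sprintf(buf, "%zu", (size_t)42);
    return strcmp(buf, "42") == 0;
}
```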
 

Francois Grieu

At the end of the day, I settle for the admittedly pedantic:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

/* output the decimal representation of a size_t on stdout */
show_size_t(size_t n)
{
    if ((size_t)~(size_t)0 <= UINT_MAX)
        /* size_t fits an "unsigned int" */
        printf("%u", (unsigned int)n);
    else
        if (ULONG_MAX > UINT_MAX)
            /* "unsigned long" wider than "unsigned int" */
            if ((size_t)~(size_t)0 <= ULONG_MAX || n <= ULONG_MAX)
                printf("%lu", (unsigned long)n);
            else /* n does not fit an unsigned long */
            {
                show_size_t(n / 1000000000ul);
                printf("%09lu", (unsigned long)(n % 1000000000ul));
            }
        else /* "unsigned int" equivalent to "unsigned long" */
            if (n <= UINT_MAX)
                printf("%u", (unsigned int)n);
            else /* n does not fit an "unsigned int" */
            {
                show_size_t(n / 1000000000ul);
                printf("%09u", (unsigned int)(n % 1000000000ul));
            }
}


I think it is portable to virtually all systems where printf
is available and supports "unsigned long".
The code struggles to avoid pointless use of recursion,
or of "unsigned long" when "unsigned int" would do.
Although the source has 5 comparisons, all are resolved at
compile time, except on systems where size_t can't be printed
as a single unsigned long, in which case one test remains.

I rejected several other options because some very real
environments (such as the current MinGW) have support for the
type "unsigned long long" but not for the format specifiers
"%wu" or "%llu". I admit I never used a system like that which
also has "size_t" wider than "unsigned long", but that is
conceivable.
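
To make the splitting step concrete, here is a small sketch of the same idea in isolation: the value is cut into a leading part and groups of nine decimal digits, and every group after the first is zero-padded with "%09lu" so that interior zeros are not lost. The function name is invented, and it takes the split path for any value of 1000000000 or more purely so the effect can be seen with an ordinary 32-bit unsigned long.

```c
#include <stdio.h>

/* Hypothetical illustration of the chunking technique: write n into buf
   in groups of nine decimal digits and return the length written.
   buf is assumed to be large enough for the result. */
int format_in_chunks(char *buf, unsigned long n)
{
    int len;

    if (n >= 1000000000ul) {
        /* leading digits first, then exactly nine zero-padded digits */
        len = format_in_chunks(buf, n / 1000000000ul);
        len += sprintf(buf + len, "%09lu", n % 1000000000ul);
    } else {
        len = sprintf(buf, "%lu", n);
    }
    return len;
}
```

For 4000000007 the leading part is 4 and the remainder is 7; without the zero padding the output would be "47" rather than "4000000007".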


Francois Grieu
 

Francois Grieu

Richard Heathfield wrote :
You could replace this line with:

if(1)

or simply remove it altogether. ITYM: if((size_t)-1 <= UINT_MAX).

<snip>

Is it meant to be humorous? Or maybe the characters '!' '-' and/or
'~' are mysteriously swapped by whatever we use to communicate?

If not, allow me to remark that "size_t" can be "unsigned long"
even if "unsigned int" is limited to 65535. In this case
(size_t)~(size_t)0 will equal ULONG_MAX, which
is at least 4294967295, thus (size_t)~(size_t)0<=UINT_MAX will
evaluate to false.

Francois Grieu
 

Francois Grieu

I wrote :
Richard Heathfield wrote :

Is it meant to be humorous? Or maybe the characters '~' '-' and/or
'!' are mysteriously swapped by whatever we use to communicate?

If not, allow me to remark that "size_t" can be "unsigned long"
even if "unsigned int" is limited to 65535. In this case
(size_t)~(size_t)0 will equal ULONG_MAX, which
is at least 4294967295, thus (size_t)~(size_t)0<=UINT_MAX will
evaluate to false.
---------------/
Wrong wording for the group, read: evaluate to 0 :)

Francois Grieu
 

jacob navia

Francois Grieu a écrit :
I wrote :
---------------/
Wrong wording for the group, read: evaluate to 0 :)

Francois Grieu

Why?

Standard C defines "false" in stdbool.h. That some people here claim that
"it is not portable" etc because they want to stay in 1989 doesn't mean
that this *whole* group follows them.
 

Keith Thompson

jacob navia said:
Francois Grieu a écrit :
I wrote : [...]
If not, allow me to remark that "size_t" can be "unsigned long"
even if "unsigned int" is limited to 65535. In this case
(size_t)~(size_t)0 will equal ULONG_MAX, which
is at least 4294967295, thus (size_t)~(size_t)0<=UINT_MAX will
evaluate to false.
---------------/
Wrong wording for the group, read: evaluate to 0 :)

Francois Grieu

Why?

Standard C defines "false" in stdbool.h. That some people here claim that
"it is not portable" etc because they want to stay in 1989 doesn't mean
that this *whole* group follows them.

In the context of a discussion about how to print a size_t under
an implementation that might not have either "%zu" or unsigned
long long, it seems reasonable to avoid referring to "false"
(which is simply a macro that expands to 0 anyway).
 

Keith Thompson

Francois Grieu said:
At the end of the day, I settle for the admittedly pedantic:

[code snipped]
I think it is portable to virtually all systems where printf
is available and supports "unsigned long".
The code struggles to avoid pointless use of recursion,
or of "unsigned long" when "unsigned int" would do.
[...]

I don't see the point of avoiding unsigned long. If there are any
surviving compilers that don't support unsigned long (they'd have to
be pre-ANSI), it's likely they don't support <limits.h> either.
 

Francois Grieu

Keith Thompson a écrit :
Francois Grieu said:
At the end of the day, I settle for the admittedly pedantic:

[code snipped]
I think it is portable to virtually all systems where printf
is available and supports "unsigned long".
The code struggles to avoid pointless use of recursion,
or of "unsigned long" when "unsigned int" would do.
[...]

I don't see the point of avoiding unsigned long. If there are any
surviving compilers that don't support unsigned long (they'd have to
be pre-ANSI), it's likely they don't support <limits.h> either.

I agree that virtually all surviving C compilers now have support
for unsigned long. However
- many compilers for devices with 16-bit size_t incur a significant
code and data size penalty when using 32-bit types.
- some have only optional support of unsigned long in the
(f)printf library, to save code space; in this case making a
single use of unsigned long in printf is likely to require an
abrupt code & data size increase.

And thus I think it is useful that the code reduce to
printf("%u",(unsigned int)n);
on such systems.

Francois Grieu



Again the admittedly pedantic code:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

/* output the decimal representation of a size_t on stdout */
show_size_t(size_t n)
{
    if ((size_t)~(size_t)0 <= UINT_MAX)
        /* size_t fits an "unsigned int" */
        printf("%u", (unsigned int)n);
    else
        if (ULONG_MAX > UINT_MAX)
            /* "unsigned long" wider than "unsigned int" */
            if ((size_t)~(size_t)0 <= ULONG_MAX || n <= ULONG_MAX)
                printf("%lu", (unsigned long)n);
            else /* n does not fit an unsigned long */
            {
                show_size_t(n / 1000000000ul);
                printf("%09lu", (unsigned long)(n % 1000000000ul));
            }
        else /* "unsigned int" equivalent to "unsigned long" */
            if (n <= UINT_MAX)
                printf("%u", (unsigned int)n);
            else /* n does not fit an "unsigned int" */
            {
                show_size_t(n / 1000000000ul);
                printf("%09u", (unsigned int)(n % 1000000000ul));
            }
}


I think it is portable to virtually all systems where printf
is available and supports "unsigned long".
The code struggles to avoid pointless use of recursion,
or of "unsigned long" when "unsigned int" would do.
Although the source has 5 comparisons, all are resolved at
compile time, except on systems where size_t is wider than
"unsigned long", in which case one runtime test remains.

I rejected several other options because some very real
environments (such as the current MinGW) have support for the
type "unsigned long long" but not for the format specifiers
"%wu" or "%llu". I admit I never used a system like that which
also has "size_t" wider than "unsigned long", but that is
conceivable.
 

Richard Bos

Keith Thompson said:
Of course they do. Nevertheless, they exist, and some programmers
need to deal with them.

Yes; the problem is, though, where do you stop? If you take all kinds of
semi-compatible implementations into account, sooner or later you're
going to run into one which supports some, but not all, parts of the C90
Standard, and the part which it doesn't support is the one where
unsigned long is the largest integer type. Or that size_t is an integer
at all.

Basically, we have to assume _something_. The most practical option,
ISTM, is to assume that we have two Standards, and any implementation
which does not choose either C90, or C99, or C90 with a set of
_coherent_ C99 extensions, is too much trouble to be worth taking into
account unless we specifically have to.
In this case, we assume that any implementation worth considering has
either %zu (C99), or a size_t smaller or equal to unsigned long (C90),
or both. Any implementation writer who has picked longer-than-long
size_t from C99, but _not_ %zu, is being incoherent, and thereby doing
his users sufficient disservice to ignore him.
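
The assumption Richard describes can be written down so that the build fails loudly when it does not hold, instead of printing wrong values at run time. The following is only a sketch: the typedef and macro names are invented, and the negative-array-size trick is a common C90 idiom for a compile-time assertion, not something proposed in this thread.

```c
#include <limits.h>
#include <stddef.h>

#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
/* The implementation claims C99 or later: rely on "%zu". */
#define SIZE_T_VIA_ZU 1
#else
/* C90: refuse to compile unless size_t fits in unsigned long.
   An array of size -1 is a constraint violation, so this typedef
   acts as a compile-time assertion. */
typedef char assume_size_t_fits_ulong[((size_t)-1 <= ULONG_MAX) ? 1 : -1];
#define SIZE_T_VIA_ZU 0
#endif
```

An implementation of the "incoherent" kind (size_t wider than unsigned long, no C99 claim) would then be rejected at compile time.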

Richard
 

Keith Thompson

Yes; the problem is, though, where do you stop? If you take all kinds of
semi-compatible implementations into account, sooner or later you're
going to run into one which supports some, but not all, parts of the C90
Standard, and the part which it doesn't support is the one where
unsigned long is the largest integer type. Or that size_t is an integer
at all.

Basically, we have to assume _something_. The most practical option,
ISTM, is to assume that we have two Standards, and any implementation
which does not choose either C90, or C99, or C90 with a set of
_coherent_ C99 extensions, is too much trouble to be worth taking into
account unless we specifically have to.
In this case, we assume that any implementation worth considering has
either %zu (C99), or a size_t smaller or equal to unsigned long (C90),
or both. Any implementation writer who has picked longer-than-long
size_t from C99, but _not_ %zu, is being incoherent, and thereby doing
his users sufficient disservice to ignore him.

Agreed, mostly. But the implementer doesn't always have a choice.
Since size_t is generally defined by the compiler implementer, and
"%zu" is implemented by the library implementer, coherence isn't
always an option.

I don't know of any implementation where size_t is bigger than
unsigned long but %zu isn't supported, but such an implementation is
quite possible.

If you don't need 100% reliability and are willing to settle for an
error message in the odd case, you could do something like this:

void print_size_t(size_t size)
{
#if __STDC_VERSION__ >= 199901L
printf("size = %zu\n", size);
#else
if (size > ULONG_MAX) {
puts("size exceeds ULONG_MAX");
}
else {
printf("size = %lu\n", (unsigned long)size);
}
#endif
}

This will fail cleanly if size exceeds ULONG_MAX; it can fail
and print incorrect information if size exceeds ULONG_MAX and the
implementation falsely claims to conform to C99.
 

Peter Nilsson

Francois Grieu said:
At the end of the day, I settle for the admittedly pedantic:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

/* output the decimal representation of a size_t on stdout */
show_size_t(size_t n)

You're using implicit int here!
{
if ((size_t)~(size_t)0<=UINT_MAX)

if ((size_t) -1 <= UINT_MAX)
/* size_t fits an "unsigned int" */
printf("%u",(unsigned int)n);
else
if (ULONG_MAX>UINT_MAX)
/* "unsigned long" wider than "unsigned int" */
if ((size_t)~(size_t)0<=ULONG_MAX || n<=ULONG_MAX)
printf("%lu",(unsigned long)n);
else /* n does not fit an unsigned long */
{
show_size_t(n/1000000000ul);
printf("%09lu",(unsigned long)(n%1000000000ul));
}
else /* "unsigned int" equivalent to "unsigned long" */
if (n<=UINT_MAX)
printf("%u",(unsigned int)n);
else /* n does not fit an "unsigned int */
{
show_size_t(n/1000000000ul);
printf("%09u",(unsigned int)(n%1000000000ul));
}
}

If you're going to include <limits.h>, why not simply do...

#include <limits.h>
#include <stdio.h>

void fputz(size_t z, FILE *fp)
{
#if SIZE_MAX > ULONG_MAX
fprintf(fp, "%zu", z);
#else
fprintf(fp, "%lu", (unsigned long) z);
#endif
}

> I think it is portable to virtually all systems where printf
> is available and supports "unsigned long". ...

You use unsigned long in the code above, so do you actually mean
that printf does (or doesn't) support %lu?

I rejected several other options because some very real
environments (such as the current MinGW) have support for the
type "unsigned long long" but not for the format specifiers
"%wu" or "%llu".

Surely 2 unsigned longs worth is enough?! If you can find
a hosted system that supports 800+ petabyte objects or
files, but doesn't support %lu, then I'd say you have more
issues than just being able to print a size_t. ;)

Of course, you could just roll your own %zu printer...

#include <limits.h>
#include <stdio.h>
int fputz(size_t z, FILE *fp)
{
char b[(sizeof(z) * CHAR_BIT + 5)/ 3];
char t, *p, *q;

q = b;
do
{
*q++ = '0' + (z % 10u);
} while (z /= 10u);
*q = 0;

for (p = b; p < --q; p++)
{
t = *p;
*p = *q;
*q = t;
}

return fputs(b, fp);
}
 

Francois Grieu

Peter Nilsson wrote :
You're using implicit int here!
Oooops !
if ((size_t) -1 <= UINT_MAX)

I have always wondered if these are equivalent. To reduce to a
simpler thing, how do we demonstrate the following holds?
(unsigned long)~(unsigned long)0 == (unsigned long)-1

If you're going to include <limits.h>, why not simply do...

#include <limits.h>
#include <stdio.h>

void fputz(size_t z, FILE *fp)
{
#if SIZE_MAX > ULONG_MAX
fprintf(fp, "%zu", z);
#else
fprintf(fp, "%lu", (unsigned long) z);
#endif
}

I have environments with no SIZE_MAX, and some issue a warning
when an undefined identifier is used in #if.
Besides, conceivably, SIZE_MAX could be bigger than ULONG_MAX,
but %z support may be missing. I have this MinGW which supports
unsigned long long, but can't print it with "%llu" and can't
print a size_t with "%zu", so this is not very far-fetched.

Also, I want to avoid "long" usage when possible, because on
very low-end platforms the code for my function will be a bit
bigger, and it will force me to use a (f)printf with long
support, which will be much bigger.

> You use unsigned long in the code above, so do you actually mean
> that printf does (or doesn't) support %lu?

Support for "%l" in the stdio library is sometimes optional
and incurs a code and data size penalty. Some of my recent
targets have less than 1KB of RAM, and use size_t as the
type for the difference of two pointers.
I rejected several other options because some very real
environments (such as the current MinGW) have support for the
type "unsigned long long" but not for the format specifiers
"%wu" or "%llu".

Surely 2 unsigned longs worth is enough?! If you can find
a hosted system that supports 800+ petabyte objects or
files, but doesn't support %lu, then I'd say you have more
issues than just being able to print a size_t. ;)
True.

Of course, you could just roll your own %zu printer...

#include <limits.h>
#include <stdio.h>
int fputz(size_t z, FILE *fp)
{
char b[(sizeof(z) * CHAR_BIT + 5)/ 3];
char t, *p, *q;

q = b;
do
{
*q++ = '0' + (z % 10u);

This is portable to every machine I know, but in this group,
*q++ = "0123456789"[z % 10u];
might be even less objectionable.
} while (z /= 10u);
*q = 0;

for (p = b; p < --q; p++)
{
t = *p;
*p = *q;
*q = t;
}

return fputs(b, fp);
}

Francois Grieu


Again the admittedly pedantic (now fixed):

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

/* output the decimal representation of a size_t on stdout */
void show_size_t(size_t n)
{
    if ((size_t)~(size_t)0 <= UINT_MAX)
        /* size_t fits an "unsigned int" */
        printf("%u", (unsigned int)n);
    else
        if (ULONG_MAX > UINT_MAX)
            /* "unsigned long" wider than "unsigned int" */
            if ((size_t)~(size_t)0 <= ULONG_MAX || n <= ULONG_MAX)
                printf("%lu", (unsigned long)n);
            else /* n does not fit an unsigned long */
            {
                show_size_t(n / 1000000000ul);
                printf("%09lu", (unsigned long)(n % 1000000000ul));
            }
        else /* "unsigned int" equivalent to "unsigned long" */
            if (n <= UINT_MAX)
                printf("%u", (unsigned int)n);
            else /* n does not fit an "unsigned int" */
            {
                show_size_t(n / 1000000000ul);
                printf("%09u", (unsigned int)(n % 1000000000ul));
            }
}


I think it is portable to virtually all systems where printf
is available and supports "unsigned long".
The code struggles to avoid pointless use of recursion,
or of "unsigned long" when "unsigned int" would do.
Although the source has 5 comparisons, all are resolved at
compile time, except on systems where size_t is wider than
"unsigned long", in which case one runtime test remains.

I rejected several other options because some very real
environments (such as the current MinGW) have support for the
type "unsigned long long" but not for the format specifiers
"%wu" or "%llu". I admit I never used a system like that which
also has "size_t" wider than "unsigned long", but that is
conceivable.
 

Francois Grieu

pete said:
((unsigned long)-1) equals ULONG_MAX.

N869
6.2.5 Types

[#9]

A computation involving unsigned operands
can never overflow, because a result that cannot be
represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the
largest value that can be represented by the resulting type.

This shows that
((unsigned long)0-(unsigned long)1) equals ULONG_MAX
and even possibly that
(-(unsigned long)1) equals ULONG_MAX
although many compilers at least emit a warning when unary minus
is applied to an unsigned type.

But I fail to see how one goes from here to
((unsigned long)-1) equals ULONG_MAX.


Francois Grieu
 

Ben Bacarisse

Francois Grieu said:
pete said:
((unsigned long)-1) equals ULONG_MAX.

N869
6.2.5 Types

[#9]

A computation involving unsigned operands
can never overflow, because a result that cannot be
represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the
largest value that can be represented by the resulting type.

This shows that
((unsigned long)0-(unsigned long)1) equals ULONG_MAX
and even possibly that
(-(unsigned long)1) equals ULONG_MAX
although many compilers at least emit a warning when unary minus
is applied to an unsigned type.

But I fail to see how one goes from here to
((unsigned long)-1) equals ULONG_MAX.

pete quoted the wrong paragraph. You want 6.3.1.3 paragraph 2 which
says pretty much the same but in the context of converting a signed
integer to unsigned.
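
The two paragraphs can be checked side by side with a few lines of C. This is just a sketch with an invented function name: the cast of -1 relies on the conversion rule in 6.3.1.3 paragraph 2, the subtraction relies on the modular arithmetic of 6.2.5 paragraph 9, and the complement form is the one used in show_size_t.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical check: returns 1 if the different spellings of
   "largest value of the unsigned type" all agree. */
int max_spellings_agree(void)
{
    /* conversion of a signed value: -1 plus (ULONG_MAX + 1) */
    unsigned long a = (unsigned long)-1;
    /* bitwise complement of zero in the unsigned type */
    unsigned long b = (unsigned long)~(unsigned long)0;
    /* unsigned arithmetic wraps modulo ULONG_MAX + 1 */
    unsigned long c = (unsigned long)0 - (unsigned long)1;

    return a == ULONG_MAX && b == ULONG_MAX && c == ULONG_MAX
        && (size_t)-1 == (size_t)~(size_t)0;
}
```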
 
