my_itoa, any comments?

C

ceo

Hi,

I picked up the itoa example code from K&R and am trying to modify it
to as per these conditions:
1) input integer is always +ve
2) cannot assume the length of the integer parameter

Following is the modified code, please comment.

Thanks,
ceo

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* my_itoa: convert n to string */
char* my_itoa(unsigned int n) {
int i=0,c=0,j=0;
char *s = NULL;
do {
s = (char *) realloc(s, (i+1)*sizeof( char));
s[i++] = n % 10 + '0';
} while ((n /= 10) > 0);
s = (char *) realloc(s, i*sizeof( char));
s = '\0';
for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {
c = s;
s = s[j];
s[j] = c;
}
return s;
}

int main(int agrv, char *argv) {

char *p = NULL;
p = my_itoa(23456);
printf("(%s)\n", p);
if(!p) free(p);
return 0;
}
 
C

ceo

int main(int agrv, char *argv) {
that should be:
int main(int agrv, char *argv[]) {
 
K

kyle york

Greetings,
Hi,

I picked up the itoa example code from K&R and am trying to modify it
to as per these conditions:
1) input integer is always +ve
2) cannot assume the length of the integer parameter

Following is the modified code, please comment.

Thanks,
ceo

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* my_itoa: convert n to string */
char* my_itoa(unsigned int n) {
int i=0,c=0,j=0;

i & j are indices into an array so should be size_t

since s is an array of char, c should be char
char *s = NULL;

The realloc's() are going to fragement memory something fierce, as well
as be time consuming. A single malloc here would be good. Something like:

s = malloc(CHAR_BIT * sizeof(n) / 2.3 + 1);

You should also check for NULL here.

do {
s = (char *) realloc(s, (i+1)*sizeof( char));

A few problems:
* don't cast the return from realloc, it can hide errors
* if realloc fails, you've leaked memory because you've lost the
original value of s
* sizeof(char) is 1. always.
s[i++] = n % 10 + '0';

Does this assume ASCII? I don't know if the digits are guarenteed to be
contiguous.
} while ((n /= 10) > 0);

n is unsigned, so the '> 0' bit is superfluous
s = (char *) realloc(s, i*sizeof( char));

Same comments about realloc. Also, it should be (i+1), not i
s = '\0';
for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {


You already know the length of s -- it's (i) so you can either reverse
the i/j definitions or change the for statement do

for (j = i, i = 0; ...
c = s;
s = s[j];
s[j] = c;
}
return s;
}

int main(int agrv, char *argv) {

char *p = NULL;


since you set p in the next statement, this bit is unnecessary
p = my_itoa(23456);
printf("(%s)\n", p);

p could be null. is printf() guarenteed to handle that?
if(!p) free(p);

this should either be ``if (p) free (p)'' or, more compactly `free(p)'
since free(NULL) is well defined.
return 0;

the returned value must either be EXIT_SUCCESS or EXIT_FAILURE
 
E

Eric Sosman

ceo said:
Hi,

I picked up the itoa example code from K&R and am trying to modify it
to as per these conditions:
1) input integer is always +ve
2) cannot assume the length of the integer parameter

Following is the modified code, please comment.

Thanks,
ceo

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* my_itoa: convert n to string */
char* my_itoa(unsigned int n) {
int i=0,c=0,j=0;

Why bother to initialize `c' and `j'? You're never
going to make any use of the initial values; the first
thing you actually do with them is set them to different
values altogether.
char *s = NULL;
do {
s = (char *) realloc(s, (i+1)*sizeof( char));

You do not need the `(char*)' cast. Also, multiplying
by `sizeof(char)' is unnecessary because `sizeof(char)' is
one by definition -- however, this isn't as objectionable
as the cast.

The most objectionable thing, though, is that realloc()
can fail and if it does you won't detect the failure: you'll
press on with `s' equal to NULL, and your program will boldly
go where altogether too many programs have gone before, that
is, into outer space. The super-clean way to do this is

do {
char *new_s = realloc(s, ...);
if (new_s == NULL) {
free (s);
return NULL;
}
s = new_s;
...

In some programs you can get away with the quick and dirty

do {
s = realloc(s, ...);
if (s == NULL)
die_horribly();

However, the *wrong* way to is

do {
s = realloc(s, ...);
if (s == NULL)
return s;

Can you see why? (Hint: What has happened to the memory `s'
pointed to before the realloc() failed?)
s[i++] = n % 10 + '0';
} while ((n /= 10) > 0);
s = (char *) realloc(s, i*sizeof( char));

What's the point of this realloc() call? Unless I've
missed something, it requests that `s' be resized to the
size it already has.
s = '\0';
for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {
c = s;
s = s[j];
s[j] = c;
}
return s;
}

int main(int agrv, char *argv) {


You corrected this in a follow-up.
char *p = NULL;

Why bother setting `p' to NULL when you're about to
overwrite it with another value?
p = my_itoa(23456);
printf("(%s)\n", p);
if(!p) free(p);

Odd that you're so cavalier about the possibility of
realloc() failure inside the function, and yet are careful
to test for it here. Odder still that the test comes after
the printf() rather than before, where it might do some good.
Oddest of all: the test as written is unnecessary, because
free(NULL) is entirely legal (and does nothing)!
return 0;
}

If you knew how many digits the largest possible number
would require, you could get rid of all those realloc() calls
by using an appropriately-sized array of `char' instead. How
can you know this quantity? There's a handy header called
<limits.h> that defines, among other things, a macro named
CHAR_BIT that gives the number of bits in a byte. The binary
representation of the largest possible `unsigned int' value
cannot need more than `CHAR_BIT * sizeof(unsigned int)' binary
digits, right? And the octal representation can't require more
than thbit count divided by three (rounded up), right? And the
decimal representation cannot be longer than the octal; still
with me? So: if you calculate the number of bits in `n', divide
by three and round up, and then add one more spot for the '\0'
you'll have enough characters for the longest possible decimal
number. Here goes:

char buff[(CHAR_BIT * sizeof(n) + 2) / 3 + 1];

Now you can get rid of all that re-re-realloc()'ing: just
deposit the digits in buff[] as you generate them, confident
in the knowledge that there's enough space. Then make just
one call to malloc() to obtain the amount of memory actually
needed, copy the generated digits into it, and return it.

With just a tiny bit more cleverness you can even get rid
of the digit-reversing loop. "Exercise for the reader," as
they say.

Note 1: The decimal-is-no-longer-than-octal calculation
could be made more precise. Another way to look at it is that
you're using 3 as an approximation to lg(10) = 3.3219+, and
you could use a more exact formula like

char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

which approximates log(10) ~= 33/10 = 3.3. As long as the
approximation is smaller than the true value (so the digit
count is over- rather than under-estimated), all will be well.
Personally, I don't think it's worth the trouble.

Note 2: There's a theoretical possibility that `n' might
have fewer than `CHAR_BIT * sizeof n' value bits; the C Standard
allows for "padding bits" that occupy space in the variable but
don't contribute to the value. Again, this could lead to a
slight overestimate and make buff[] a couple characters longer
than it absolutely needs to be -- but again, I don't think it's
anything to worry about.

Note 3: Some people advocate initializing variables --
especially pointers -- when you declare them, in order (so they
say) to make sure of never running the risk of using an
uninitialized variable. To me, the practice seems wrong-headed.
If the program has an execution path that causes it to use a
variable without first storing a value in it, the fact that
the value is "known" is little consolation given that it is
"wrong." Also, some compilers can analyze the execution paths
and warn about variables that are used before it's sure they've
been written to; if you indulge in wholesale initialization you
suppress those helpful warnings. Not a good trade, IMHO: give
the compiler every opportunity to be helpful.
 
E

Eric Sosman

kyle york wrote:

I was going to let this one pass by, but there are
just too many misteaks ...
Greetings,



i & j are indices into an array so should be size_t

Matter of opinion. If an `unsigned int' could have
more than 32766 decimal digits you might have a point --
but what's the likelihood of finding a 16-bit `int' on
a machine with 108848-bit `unsigned int'?
since s is an array of char, c should be char

"Could be," not "should be." I'd stick with `int',
myself, but it doesn't make much difference.
The realloc's() are going to fragement memory something fierce, as well
as be time consuming. A single malloc here would be good. Something like:

s = malloc(CHAR_BIT * sizeof(n) / 2.3 + 1);

ITYM 3.2, not 2.3. 2.3 will work, but ...
s[i++] = n % 10 + '0';


Does this assume ASCII? I don't know if the digits are guarenteed to be
contiguous.

The digits are guaranteed to have consecutive
positive values.
n is unsigned, so the '> 0' bit is superfluous

Do you understand the difference between `>' and `>='?
If not, have you considered another line of work? ;-)
Same comments about realloc. Also, it should be (i+1), not i

Aha! Good catch; I missed that one in my initial review.
(Looked at it, too, and spotted another issue -- but missed
the biggie. Sigh.)
[... and in main() ...]
return 0;

the returned value must either be EXIT_SUCCESS or EXIT_FAILURE

Or zero. Actually, the returned value can be any `int';
it's just that the meanings of values other than these three
(which might actually be fewer than three distinct values) are
system-specific: if you return `-42' your program will probably
be saying different things on different platforms.
 
P

pete

ceo said:
Hi,

I picked up the itoa example code from K&R and am trying to modify it
to as per these conditions:
1) input integer is always +ve
2) cannot assume the length of the integer parameter

Following is the modified code, please comment.
char* my_itoa(unsigned int n)

You've changed the interface completely.
K&R itoa takes a pointer as one of it's arguments
and returns type void, instead of allocating memory.

void itoa(int n, char s[]);
 
N

Netocrat

Greetings,


i & j are indices into an array so should be size_t

No need for them to be, but no harm if they are.
since s is an array of char, c should be char

Again, no need since sizeof(int) >= sizeof(char), but probably more
appropriate if it is.
The realloc's() are going to fragement memory something fierce, as well
as be time consuming. A single malloc here would be good. Something
like:

s = malloc(CHAR_BIT * sizeof(n) / 2.3 + 1);

That's a good suggestion; it would also be good to perform a realloc once
the actual size is known.

s[i++] = n % 10 + '0';

Does this assume ASCII? I don't know if the digits are guarenteed to be
contiguous.

C does guarantee this.

p could be null. is printf() guarenteed to handle that?

It's undefined behaviour.

the returned value must either be EXIT_SUCCESS or EXIT_FAILURE

0 is also portable.
 
K

kyle york

Greetings,

Eric said:
kyle york wrote:

I was going to let this one pass by, but there are
just too many misteaks ...




Matter of opinion. If an `unsigned int' could have
more than 32766 decimal digits you might have a point --
but what's the likelihood of finding a 16-bit `int' on
a machine with 108848-bit `unsigned int'?

Whenever I index arrays I use size_t. Nothing else is guarenteed to work
is it? You're right of course, in this case it makes little difference.
I also tend to stay clear of int and use unsigned whenever possible.
"Could be," not "should be." I'd stick with `int',
myself, but it doesn't make much difference.

Granted, but I like to use like-types when possible. It eliminates one
source of error.
ITYM 3.2, not 2.3. 2.3 will work, but ...

I *knew* you were going to catch that as soon as I read your response :(
s[i++] = n % 10 + '0';


Does this assume ASCII? I don't know if the digits are guarenteed to be
contiguous.


The digits are guaranteed to have consecutive
positive values.

Ah, thank you. I just found it. I should have looked earlier.
Do you understand the difference between `>' and `>='?
If not, have you considered another line of work? ;-)

Eek, but I don't understand the comment I'm afraid. >= 0 is clearly
always true, but how is > 0 not superfluous? Since this is an unsigned
compare, the opposite would be while (!(x <= 0)). Since clearly < 0
cannot happen, that leaves while (!(x == 0)) which is the same as while (x)

I think. But it's getting late....


Good point about return values from main also. I'll need to remember that.
 
B

Barry Schwarz

Hi,

I picked up the itoa example code from K&R and am trying to modify it
to as per these conditions:
1) input integer is always +ve
2) cannot assume the length of the integer parameter

Following is the modified code, please comment.

Thanks,
ceo

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* my_itoa: convert n to string */
char* my_itoa(unsigned int n) {
int i=0,c=0,j=0;
char *s = NULL;

snip code that others have commented on
s = (char *) realloc(s, i*sizeof( char));
s = '\0';


If realloc succeeds, s now points to an area capable of holding i
char. These are s[0], s[1], ..., s[i-1]. The above statement invokes
undefined behavior by attempting to reference a char that does not
exist.
for (i = 0, j = strlen(s) - 1; i < j; i++, j--) {
c = s;
s = s[j];
s[j] = c;
}
return s;
}


snip


<<Remove the del for email>>
 
A

Anton Petrusevich

kyle said:
Whenever I index arrays I use size_t. Nothing else is guarenteed to work
is it?

char arr[10], *p;
p = arr + 5;
....
p[-1] is guaranteed to work.
 
C

Chris Barts

int main(int agrv, char *argv) {
that should be:
int main(int agrv, char *argv[]) {

Almost. The canonical definition is :
int main(int argc, char *argv[])
argc - ARGument Count: How many arguments you have.
argv - ARGument Values: What those arguments are.
 
C

Christian Kandeler

kyle said:
Eek, but I don't understand the comment I'm afraid. >= 0 is clearly
always true, but how is > 0 not superfluous? Since this is an unsigned
compare, the opposite would be while (!(x <= 0)). Since clearly < 0
cannot happen, that leaves while (!(x == 0)) which is the same as while
(x)

There seems to have been a misunderstanding: Your original post could be
interpreted as to imply that n /= 10 could never become zero.
Still, even with that out of the way, not everyone likes to use arithmetic
expressions (or pointers, for that matter) as booleans. I would always
leave the ">0" part in. It just seems more conceptually clean to me,
although this surely is a matter of taste.


Christian
 
E

Eric Sosman

kyle said:
Greetings,



Eek, but I don't understand the comment I'm afraid. >= 0 is clearly
always true, but how is > 0 not superfluous? Since this is an unsigned
compare, the opposite would be while (!(x <= 0)). Since clearly < 0
cannot happen, that leaves while (!(x == 0)) which is the same as while (x)

Ah. I think I misread the intent of your original remark:
I thought you were saying `(n /= 10) > 0' could never be false,
but in fact you were suggesting `while ((n /= 10));'. My
apologies.

Personally, I feel that writing an explicit comparison to
zero leads to clearer code. I write `if (ptr == NULL)' instead
of `if (!ptr)' -- they mean exactly the same thing, but the
former seems to read more smoothly, more trippingly on the
tongue, than the latter. Similarly, I'd write the O.P.'s test
the same way he did, or possibly as `(n /= 10) != 0' -- again,
it means the same thing but reads better.

I have, on occasion, fixed bugs that were caused by too
much enthusiasm for the more telegraphic style. Most of these,
it seems, involved the sequence `!strcmp(...)' as in

if (!strcmp(ptr, "secret"))
printf ("Incorrect password\n");
else
speak_friend_and_enter();

.... which does the opposite of the probable intent.

The only time I'll write a "bare" test is when the thing
being tested is obviously Boolean: `if (islower(...))' or
`if (feature_enabled)' or some such. Still and all, it's
really just a matter of taste. Eye of the beholder, y'know.
 
P

pete

Eric Sosman wrote:
Note 1: The decimal-is-no-longer-than-octal calculation
could be made more precise. Another way to look at it is that
you're using 3 as an approximation to lg(10) = 3.3219+, and
you could use a more exact formula like

char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

which approximates log(10) ~= 33/10 = 3.3.

The 3.3 constant is log(10) / log(2).
As long as the
approximation is smaller than the true value (so the digit
count is over- rather than under-estimated), all will be well.
Personally, I don't think it's worth the trouble.

I think this is slightly simpler.

The number of digits is CHAR_BIT * sizeof(n) / 3.3 + 1.
buff needs one more byte than that, for a null character.

char buff[(size_t)(CHAR_BIT * sizeof n / 3.3) + 2];
 
E

Eric Sosman

pete wrote On 10/09/05 10:57,:
Eric Sosman wrote:

Note 1: The decimal-is-no-longer-than-octal calculation
could be made more precise. Another way to look at it is that
you're using 3 as an approximation to lg(10) = 3.3219+, and
you could use a more exact formula like

char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

which approximates log(10) ~= 33/10 = 3.3.

The 3.3 constant is log(10) / log(2).

... aka lg(10), as I wrote first before succumbing to
Fat Finger Syndrome. Thanks for the correction.
 
P

pete

Eric said:
pete wrote On 10/09/05 10:57,:
Eric Sosman wrote:

Note 1: The decimal-is-no-longer-than-octal calculation
could be made more precise. Another way to look at it is that
you're using 3 as an approximation to lg(10) = 3.3219+, and
you could use a more exact formula like

char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

which approximates log(10) ~= 33/10 = 3.3.

The 3.3 constant is log(10) / log(2).

... aka lg(10), as I wrote first before succumbing to
Fat Finger Syndrome. Thanks for the correction.

In C99, log2(10)
 
P

pete

Eric Sosman wrote:
char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

char itoa_buff[(size_t)((sizeof(int) * CHAR_BIT - 1) / 3.3) + 3];
char utoa_buff[(size_t)((sizeof(int) * CHAR_BIT ) / 3.3) + 2];
 
O

Old Wolf

pete said:
Eric said:
char buff[(CHAR_BIT * sizeof(n) * 10 + 32) / 33 + 1];

char itoa_buff[(size_t)((sizeof(int) * CHAR_BIT - 1) / 3.3) + 3];
char utoa_buff[(size_t)((sizeof(int) * CHAR_BIT ) / 3.3) + 2];

I was going to suggest:

#include <limits.h>
#define STR(X) #X
#define STR2(X) STR(X)

char itoa_buff[ sizeof STR2(INT_MAX) ];
char utoa_buff[ sizeof STR2(UINT_MAX)];

but it actually doesn't work on my system, because UINT_MAX
is defined like so:

#define UINT_MAX (2147483647 * 2U + 1U)

so the utoa_buff gets a size of 23! Is there any way to
make the preprocessor calculate this out?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,011
Latest member
AjaUqq1950

Latest Threads

Top