implement atoi

L

Lew Pitcher

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
does anyone know how to implement this function efficiently?

Yes.

And if you knew how to use Google Groups effectively, you would have found that
this has been discussed to death (along with code examples galore) here in
comp.lang.c.

It is a minor effort to implement atoi(), such that any student of C should be
able to do it with no problem.

- --
Lew Pitcher
IT Consultant, Enterprise Data Systems,
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed are my own, not my employers')
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (MingW32)

iD8DBQFB7/fdagVFX4UWr64RApWZAKCgmok2s6Xrrfqky/04U6lxuHk/YwCgqdzX
igkeTZ8f3NJpYnBtvRMRpQs=
=Zh9L
-----END PGP SIGNATURE-----
 
A

Allan Bruce

puzzlecracker said:
does anyone know how to implement this function efficiently?

Here is my fast implementation of atoi. Please bear in mind that this has
no error checking, and since I have total control of the inputs to this
program - it works for me. If you need error checking then you would need
to add it. Also, this is a #define for optimal speed, it might not give
best filesize but my program was reading thousands, if not millions of
numbers from ascii so this was the best way (I also have an equivalent
strtod).

Allan

#define ATOI(X, result, leng) \
do{ \
char *lptr = X; \
result = 0; \
leng = 0; \
while (1) \
{ \
if ((*lptr >= '0') && (*lptr <= '9')) \
{ \
result *= 10; \
result += *lptr - '0'; \
lptr++; \
leng++; \
} \
else \
{ \
break; \
} \
} \
}while(0)
 
E

Eric Sosman

Allan said:
does anyone know how to implement this function efficiently?


Here is my fast implementation of atoi. Please bear in mind that this has
no error checking, and since I have total control of the inputs to this
program - it works for me. If you need error checking then you would need
to add it. Also, this is a #define for optimal speed, it might not give
best filesize but my program was reading thousands, if not millions of
numbers from ascii so this was the best way (I also have an equivalent
strtod).
[code snipped]

<irony>

Good move. I just ran a quick and dirty timing comparison
of your macro with my system's atoi() function. On the rather
old and slow machine that happens to sit in front of me, one
million executions of your macro ran 0.093 seconds faster than
one million executions of atoi(). If it took you, say, five
minutes to code the macro, you'll break even as soon as you've
converted 3.2 billion numbers -- and after that, it's all gain.

But that comparison isn't really fair: after all, your
macro also reports the length of the converted digit string,
which is more than atoi() does. The comparison really should
be against strtol(), so I tried that one as well. A million
executions of your macro clocked in at a whopping 0.414 seconds
faster than a million strtol() calls, so the break-even point
is really more like 0.7 billion conversions. Macroz rulez!

</irony>

Of course, timings on your machine will differ from those
on mine. Still, I strongly doubt that the speed of atoi() will
make any noticeable difference in the performance of what sounds
like an I/O-bound program. Were you able to measure an
improvement?
 
A

Allan Bruce

Eric Sosman said:
Allan said:
does anyone know how to implement this function efficiently?


Here is my fast implementation of atoi. Please bear in mind that this
has
no error checking, and since I have total control of the inputs to this
program - it works for me. If you need error checking then you would
need
to add it. Also, this is a #define for optimal speed, it might not give
best filesize but my program was reading thousands, if not millions of
numbers from ascii so this was the best way (I also have an equivalent
strtod).
[code snipped]

<irony>

Good move. I just ran a quick and dirty timing comparison
of your macro with my system's atoi() function. On the rather
old and slow machine that happens to sit in front of me, one
million executions of your macro ran 0.093 seconds faster than
one million executions of atoi(). If it took you, say, five
minutes to code the macro, you'll break even as soon as you've
converted 3.2 billion numbers -- and after that, it's all gain.

But that comparison isn't really fair: after all, your
macro also reports the length of the converted digit string,
which is more than atoi() does. The comparison really should
be against strtol(), so I tried that one as well. A million
executions of your macro clocked in at a whopping 0.414 seconds
faster than a million strtol() calls, so the break-even point
is really more like 0.7 billion conversions. Macroz rulez!

</irony>

Of course, timings on your machine will differ from those
on mine. Still, I strongly doubt that the speed of atoi() will
make any noticeable difference in the performance of what sounds
like an I/O-bound program. Were you able to measure an
improvement?

To be honest I wrote this macro as I was heavily using atoi and strtod. At
the time, I believed that the atoi was being slow so thats why I chose it,
but there wasnt much of a speed gain. However when I use me strtod macro,
it is much much faster. Here is the code and you can have a speed test,
would be interesting to see if the results are similar to what I achieved:

/* Faster version of strtod(). no error checking. */
#define STRTOD(X, result, leng) \
do{ \
boolean neg = FALSE; \
int fraction= 0; \
int after = 0; \
char *lptr = X; \
result = 0.0f; \
leng = 0; \
\
if (*lptr == '-') \
{ \
lptr++; leng++; neg = TRUE; \
} \
while(1) \
{ \
if ((*lptr >= '0') && (*lptr <='9')) \
{ \
result *= 10.0f; \
result += (float)*lptr - '0'; \
lptr++; leng++; \
} \
else if ((*lptr == 'e') || (*lptr == 'E')) \
{ \
result = (float)strtod(X, &lptr); \
leng = lptr - X; \
break; \
} \
else \
break; \
} \
if (*lptr == '.')/* fraction part follows*/ \
{ \
lptr++; leng++; /*skip fraction part */ \
while(1) \
{ \
if ((*lptr >= '0') && (*lptr <='9')) \
{ \
after *= 10; \
after += *lptr -'0'; \
fraction++; lptr++; leng++; \
} \
else if ((*lptr == 'e') || (*lptr == 'E')) \
{ \
result = (float)strtod(X, &lptr); \
leng = lptr - X; \
break; \
} \
else \
{ \
result += (float)((double)after)/pow(10, fraction);\
break; \
} \
} \
} \
if (neg) result = -result; \
}while(0)


p.s. I am using a boolean type here which I believe might be only useful on
my compiler but it could be replaced by another datatype I'm sure.
Allan
 
T

Thomas Matthews

puzzlecracker said:
does anyone know how to implement this function efficiently?

If you are going to post in and please put them both
in the header of your post. That way we can
see the replies from both groups.

The same applies to other newsgroups that you
have posted to.

An efficient implementation of atoi can be written
in assembly. But since that is off-topic, I won't
discuss it here.

Why do you need an efficient version?

--
Thomas Matthews

C++ newsgroup welcome message:
http://www.slack.net/~shiva/welcome.txt
C++ Faq: http://www.parashift.com/c++-faq-lite
C Faq: http://www.eskimo.com/~scs/c-faq/top.html
alt.comp.lang.learn.c-c++ faq:
http://www.comeaucomputing.com/learn/faq/
Other sites:
http://www.josuttis.com -- C++ STL Library book
http://www.sgi.com/tech/stl -- Standard Template Library
 
E

Eric Sosman

Allan said:
[...]
To be honest I wrote this macro as I was heavily using atoi and strtod. At
the time, I believed that the atoi was being slow so thats why I chose it,
but there wasnt much of a speed gain. However when I use me strtod macro,
it is much much faster. Here is the code and you can have a speed test,
would be interesting to see if the results are similar to what I achieved:
[code snipped; see up-thread]

This may well be faster than strtod(), at least on some
machines. You'd need to measure to find out. The speed can
probably be improved -- that call to pow() can probably be
done away with, and several micro-optimizations suggest
themselves. Again, you'd need to measure to see whether any
such changes actually represent improvements.

One thing worth noting is that the macro is a little
sloppy about accuracy -- more than "a little" sloppy with
inputs containing a lot of fraction digits! This may be good
enough for the application at hand, but I wouldn't recommend
this macro for use as a general-purpose strtod() replacement.

(Interesting that it falls back on strtod() in some cases;
given the way pow() is being used one would think those cases
would be easy to handle in line.)

Stepping back a bit, I'd suggest that anyone interested in
re-implementing Standard library functions (either "as specified"
or in streamlined reduced-capability forms) should study P.J.
Plauger's "The Standard C Library" for an exposition of the many
non-obvious concerns that can arise.
 
C

CBFalconer

Eric said:
Allan said:
[...]
To be honest I wrote this macro as I was heavily using atoi and
strtod. At the time, I believed that the atoi was being slow so
thats why I chose it, but there wasnt much of a speed gain.
However when I use me strtod macro, it is much much faster. Here
is the code and you can have a speed test, would be interesting
to see if the results are similar to what I achieved:
[code snipped; see up-thread]

This may well be faster than strtod(), at least on some
machines. You'd need to measure to find out. The speed can
probably be improved -- that call to pow() can probably be
done away with, and several micro-optimizations suggest
themselves. Again, you'd need to measure to see whether any
such changes actually represent improvements.

One thing worth noting is that the macro is a little
sloppy about accuracy -- more than "a little" sloppy with
inputs containing a lot of fraction digits! This may be good
enough for the application at hand, but I wouldn't recommend
this macro for use as a general-purpose strtod() replacement.

(Interesting that it falls back on strtod() in some cases;
given the way pow() is being used one would think those cases
would be easy to handle in line.)

Stepping back a bit, I'd suggest that anyone interested in
re-implementing Standard library functions (either "as specified"
or in streamlined reduced-capability forms) should study P.J.
Plauger's "The Standard C Library" for an exposition of the many
non-obvious concerns that can arise.

Rather than reimplement standard library stuff, I have some
routines here designed for numeric input from text streams. They
are intended to avoid any fuss with buffers, be capable of skipping
to numeric input, possibly over blank lines, and to universally
report overflows. It is always possible to discover what character
terminated the numeric field because it is always pushed back.
This also means there is no need to wonder whether a line remainder
needs flushing - it always does.
 
E

Eric Sosman

CBFalconer said:
Rather than reimplement standard library stuff, I have some
routines here designed for numeric input from text streams. They
are intended to avoid any fuss with buffers, be capable of skipping
to numeric input, possibly over blank lines, and to universally
report overflows. It is always possible to discover what character
terminated the numeric field because it is always pushed back.
This also means there is no need to wonder whether a line remainder
needs flushing - it always does.

I was trying to raise a somewhat more general issue,
namely, that it is sometimes -- only sometimes -- worth
while to implement a function that does X% of the job of
some Standard library function but runs Y% faster on the
machine at hand. People confronting a decision to do or
not do such a thing would likely benefit from perusing
Plauger.

ISTR that one of Jon Bentley's "Programming Pearls"
columns dealt with just this issue. He'd tracked a
performance bottleneck down to a sqrt(x*x + y*y) call.
He replaced it with a hypotenuse(x, y) function that
exploited its knowledge of the x and y values to come
up with the answer faster than was possible with sqrt(),
whose single argument provides less information. Further,
he drew on his knowledge of the program to find that he
didn't need to handle NaNs, infinities, or outrageous
argument values; this allowed him to take shortcuts not
available to the general-purpose sqrt() implementation.

(Some systems provide a non-Standard hypot() function
to perform this calculation with better speed and accuracy
than the classical Pythagorean formulation can yield.)

The fact that a Standard library function accomplishes
some task does not imply that it's always the best way to
do so. We keep telling ourselves not to re-invent the
wheel -- and yet, don't Firestone and Michelin employ
high-paid specialists to do exactly that? Improvements
are usually possible, and occasionally worth seeking.
 
E

Eric Sosman

Eric said:
>
(Some systems provide a non-Standard hypot() function
to perform this calculation with better speed and accuracy
than the classical Pythagorean formulation can yield.)

Assailed by sudden doubt, I checked the Standard: lo!
C99 added the hypot() function, which was absent from C90.
The adjective "non-Standard" is thus version-dependent.
 
A

Allan Bruce

Eric Sosman said:
Allan said:
[...]
To be honest I wrote this macro as I was heavily using atoi and strtod.
At
the time, I believed that the atoi was being slow so thats why I chose
it,
but there wasnt much of a speed gain. However when I use me strtod
macro,
it is much much faster. Here is the code and you can have a speed test,
would be interesting to see if the results are similar to what I
achieved:
[code snipped; see up-thread]

This may well be faster than strtod(), at least on some
machines. You'd need to measure to find out. The speed can
probably be improved -- that call to pow() can probably be
done away with, and several micro-optimizations suggest
themselves. Again, you'd need to measure to see whether any
such changes actually represent improvements.

One thing worth noting is that the macro is a little
sloppy about accuracy -- more than "a little" sloppy with
inputs containing a lot of fraction digits! This may be good
enough for the application at hand, but I wouldn't recommend
this macro for use as a general-purpose strtod() replacement.

(Interesting that it falls back on strtod() in some cases;
given the way pow() is being used one would think those cases
would be easy to handle in line.)

Stepping back a bit, I'd suggest that anyone interested in
re-implementing Standard library functions (either "as specified"
or in streamlined reduced-capability forms) should study P.J.
Plauger's "The Standard C Library" for an exposition of the many
non-obvious concerns that can arise.

You are correct about the pow() function - I actually have changed that in a
more recent version of my program. I achieve a massive speed increase using
this. I tested 7 million numbers and with the library strtod it takes
approximately 10.71 secs wheras with my macro it takes only 0.241 seconds to
read through the same numbers, a speed increase of nearly 45 times!

I am interested to know about your accuracy, I tested the following numbers:
"389742.24321","844.4654684","0.3478734","987987987.87","34324.32432","0.0000002323234","98782.9278"
and they all come out fine with my STROTOD, can you elaborate? and perhaps
provide an example? (I must also add that my newer version uses a long for
my fractional part. I have copied the updated implementation below.

Thanks,
Allan

#define STRTOD(X, result, leng) \
{ \
bool neg = FALSE; \
double fraction = 1; \
long after = 0; \
char *lptr = X; \
result = 0.0f; \
leng = 0; \
\
if (*lptr == '-') \
{ \
lptr++; leng++; neg = TRUE; \
} \
while(1) \
{ \
if ((*lptr >= '0') && (*lptr <='9')) \
{ \
result *= 10; \
result += *lptr - '0'; \
lptr++; leng++; \
} \
else if ((*lptr == 'e') || (*lptr == 'E')) \
{ \
result = (float)strtod(X, &lptr); \
leng = lptr - X; \
break; \
} \
else \
break; \
} \
if (*lptr == '.')/* fraction part follows*/ \
{ \
lptr++; leng++; /*skip fraction part */ \
while(1) \
{ \
if ((*lptr >= '0') && (*lptr <='9')) \
{ \
after *= 10; \
after += *lptr -'0'; \
fraction *= 10; lptr++; leng++; \
} \
else if ((*lptr == 'e') || (*lptr == 'E')) \
{ \
result = (float)strtod(X, &lptr); \
leng = lptr - X; \
break; \
} \
else \
{ \
result += (float)(((double)after)/fraction);\
break; \
} \
} \
} \
if (neg) result = -result; \
}
 
C

Chris Croughton

Assailed by sudden doubt, I checked the Standard: lo!
C99 added the hypot() function, which was absent from C90.
The adjective "non-Standard" is thus version-dependent.

And thus introduced another incompatibility between the versions (there
was no way to predict that a hypot() function might be added to the
standard header math.h). The math functions don't even have their own
namespace like the str* and mem* ones do.

I'm coming to the conclusion that the only way to be safe is to include
at least one uppercase letter in every function and variable name,
otherwise it could conflict with some future 'standard' function or
keyword (perfectly valid C89 programs using a variable called 'restrict'
for instance)...

Chris C
 
C

CBFalconer

Chris said:
And thus introduced another incompatibility between the versions
(there was no way to predict that a hypot() function might be
added to the standard header math.h). The math functions don't
even have their own namespace like the str* and mem* ones do.

I'm coming to the conclusion that the only way to be safe is to
include at least one uppercase letter in every function and
variable name, otherwise it could conflict with some future
'standard' function or keyword (perfectly valid C89 programs
using a variable called 'restrict' for instance)...

Keywords (reserved words) are one thing, function names another.
Apart from a few cases it is usually possible to replace a system
function with your own by simply linking it before the library
search. This is definitely not standard compliant, but is
practicable in most systems known to me. Areas such as the math
library could easily be forced to adhere to this capability in a
future standards revision. There could simply be a list of header
names, whose functions are replaceable. Or the individual
functions could have another characteristic, "replaceable". Some
functions would need to be grouped, so that either none or all of
the group must be replaced.

Crossed to comp.std.c, because I think the above paragraph has
merit :)
 
C

Chris Croughton

Keywords (reserved words) are one thing, function names another.
Apart from a few cases it is usually possible to replace a system
function with your own by simply linking it before the library
search.

Only if it happens to have the same prototype. For instance, for speed
the application could well have declared float hypot(float, float), so
when they include math.h it wouldn't even compile. Or it could be a
variable:

double hypot = sqrt(x*x + y*y);

Or even a typedef...

Perfectly valid C89, fails on C90.
This is definitely not standard compliant, but is
practicable in most systems known to me. Areas such as the math
library could easily be forced to adhere to this capability in a
future standards revision. There could simply be a list of header
names, whose functions are replaceable. Or the individual
functions could have another characteristic, "replaceable". Some
functions would need to be grouped, so that either none or all of
the group must be replaced.

But whichever way it is done it will still break perfectly good code.
The code for hypot(), for example, in the application's original it
might well need to include math.h for implementing the function.

(And hypot() wasn't even an essential function, it was an optimisation
at best, trivially implemented using standard library functions...)

Chris C
 
R

Randy Howard

You are correct about the pow() function - I actually have changed that in a
more recent version of my program. I achieve a massive speed increase using
this. I tested 7 million numbers and with the library strtod it takes
approximately 10.71 secs wheras with my macro it takes only 0.241 seconds to
read through the same numbers, a speed increase of nearly 45 times!

If you think that is nice, try working on replacing memcpy() sometime.
ESPECIALLY on MSVC. The library version is pitifully slow, particularly on
large copies.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,065
Latest member
OrderGreenAcreCBD

Latest Threads

Top