# Prime number algorithm in C

Discussion in 'C Programming' started by Cmorriskuerten, Nov 17, 2003.

1. ### Cmorriskuerten (Guest)

HI,

is this a correct solution to test whether a number is prime or not:

/*
 * Is n a prime number?
 * Return TRUE (1): n is a prime number
 * Return FALSE (0): n is *not* a prime number
 */
int is_prime_number(int n)
{
    long c;

    if (n < 2) return FALSE;

    for (c = 2; c < n; c++)
    {
        if ((n % c) == 0) return FALSE;
    }
    return TRUE;
}

Tia,

Chris

Cmorriskuerten, Nov 17, 2003

2. ### Sheldon Simms (Guest)

On Mon, 17 Nov 2003 16:28:10 +0000, Cmorriskuerten wrote:

> HI,
>
> is this a correct solution to test whether a number is prime or not:
>
> /*
> * Is n a prime number?
> * Return TRUE (1): n is a prime number
> * Return FALSE (0): n is *not* a prime number
> */
> int is_prime_number(int n)
> {
> long c;
>
> if (n < 2) return FALSE;
>
> for (c = 2; c < n; c++)
> {
> if ((n % c) == 0) return FALSE;
> }
> return TRUE;
> }

It would be better to ask questions like this in comp.programming
next time.

However, it does appear that the function would work. It would be
quite slow, though.

Sheldon Simms, Nov 17, 2003

3. ### Christopher Benson-Manica (Guest)

osmium <> spoke thus:

> You could improve it a lot by avoiding the test for even numbers except 2 by
> changing the increment. Also, once you have reached the square root of n,
> any further testing is doomed to failure so you might as well stop there.
> But if you insert the square root test in the obvious way you are depending
> on the compiler to be smart enough to optimize it away. So it *could*
> actually get slower. :-(

He could stop at n/2 and still cut the run time in half...

--
Christopher Benson-Manica | I *should* know what I'm talking about - if I
ataru(at)cyberspace.org | don't, I need to know. Flames welcome.

Christopher Benson-Manica, Nov 17, 2003
4. ### James Hu (Guest)

On 2003-11-17, Christopher Benson-Manica <> wrote:
> osmium <> spoke thus:
>
>> Also, once you have reached the square root of n, any further testing
>> is doomed to failure so you might as well stop there. But if you
>> insert the square root test in the obvious way you are depending on
>> the compiler to be smart enough to optimize it away.

>
> He could stop at n/2 and still cut the run time in half...

In classic C:

#define leq_sqrt(v, n) ((v) <= (n)/(v))

In C99:

inline _Bool leq_sqrt(unsigned v, unsigned n) { return v <= n/v; }

<ot>
Why stopping at sqrt is enough:

Property:
    If a, b, and n are non-negative integers with a*b == n and a <= b,
    then a <= sqrt(n) and b >= sqrt(n).

Proof:
    Case 1: a == sqrt(n).
        Then b == sqrt(n) as well, so both conditions hold.

    Case 2: a != sqrt(n).
        If a < sqrt(n), then
            a*b < sqrt(n)*b, but since a*b == n,
            n < sqrt(n)*b, which implies
            sqrt(n) < b, in other words
            b > sqrt(n).
        Now assume a > sqrt(n). Then, by logic similar to the above,
        we arrive at b < sqrt(n). This implies b < a,
        which contradicts a <= b.

    There are no other cases to consider.

Thus we have shown that a*b == n and a <= b imply
a <= sqrt(n) and b >= sqrt(n)
for non-negative integers a, b, and n.

QED.
</ot>

-- James

James Hu, Nov 17, 2003
5. ### Dan Pop (Guest)

In <> (Cmorriskuerten) writes:

>is this a correct solution to test whether a number is prime or not:
>
>/*
>* Is n a prime number?
>* Return TRUE (1): n is a prime number
>* Return FALSE (0): n is *not* a prime number
>*/
>int is_prime_number(int n)
>{
> long c;
>
> if (n < 2) return FALSE;
>
> for (c = 2; c < n; c++)
> {
> if ((n % c) == 0) return FALSE;
> }
> return TRUE;
>}

Last time I checked, 2 was a prime number, but your function claims
otherwise.

Apart from that, both the "algorithm" and the implementation are
incredibly naive. You should have certainly spent more time designing
the algorithm, rather than simply using a stupid, brute force approach.

Implementation-wise: if n is int, what's the point in making c long?
Do you really want your function to run as slow as possible?

On the algorithm side: after testing divisibility by 2, there is no
point in checking divisibility by any other even number. And there is
no point in checking any number up to n - 1 as a possible divisor, you
can safely stop at sqrt(n).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email:

Dan Pop, Nov 17, 2003
6. ### Dan Pop (Guest)

In <bpb3ek$6c2$> I wrote:

>In <> (Cmorriskuerten) writes:
>
>>is this a correct solution to test whether a number is prime or not:
>>
>>/*
>>* Is n a prime number?
>>* Return TRUE (1): n is a prime number
>>* Return FALSE (0): n is *not* a prime number
>>*/
>>int is_prime_number(int n)
>>{
>> long c;
>>
>> if (n < 2) return FALSE;
>>
>> for (c = 2; c < n; c++)
>> {
>> if ((n % c) == 0) return FALSE;
>> }
>> return TRUE;
>>}

>
>Last time I checked, 2 was a prime number, but your function claims
>otherwise.

Please ignore this comment. The case when n == 2 is properly handled,
because the for loop is skipped.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email:

Dan Pop, Nov 17, 2003
7. ### osmium (Guest)

Cmorriskuerten writes:

> is this a correct solution to test whether a number is prime or not:
>
> /*
> * Is n a prime number?
> * Return TRUE (1): n is a prime number
> * Return FALSE (0): n is *not* a prime number
> */
> int is_prime_number(int n)
> {
> long c;
>
> if (n < 2) return FALSE;
>
> for (c = 2; c < n; c++)
> {
> if ((n % c) == 0) return FALSE;
> }
> return TRUE;
> }

The logic looks OK to me, but the acid test is to try it. I mention logic
because I don't know how TRUE and FALSE obtained their values in this
snippet. I am not fond of the mix of int and long. I don't like the name.
We already *know* it's a number. How about is_prime? Why 'c'? Candidate?

You could improve it a lot by avoiding the test for even numbers except 2 by
changing the increment. Also, once you have reached the square root of n,
any further testing is doomed to failure so you might as well stop there.
But if you insert the square root test in the obvious way you are depending
on the compiler to be smart enough to optimize it away. So it *could*
actually get slower. :-(

osmium, Nov 17, 2003
8. ### Lorenzo J. Lucchini (Guest)

(Dan Pop) wrote in message news:<bpb3ek$6c2$>...
> In <> (Cmorriskuerten) writes:
>
>> [prime number algorithm]

>
> [snip]
>
> On the algorithm side: after testing divisibility by 2, there is no
> point in checking divisibility by any other even number.

And after testing divisibility by 3, there is no point in checking
divisibility by any other multiple of 3.
Sure, it's easier to put your observation about 2 into an algorithm than
it is to put mine about 3; but personally, I would either live with an
algorithm that tests every positive integer up to sqrt(n) or - after
realizing I don't need to test multiples of numbers I've already tested -
devise another algorithm that takes this into account.
Don't you think simply incrementing the counter by 2 instead of by 1
is in a way a micro-optimization?

> [snip]

by LjL

Lorenzo J. Lucchini, Nov 18, 2003
9. ### Dan Pop (Guest)

In <> (Lorenzo J. Lucchini) writes:

> (Dan Pop) wrote in message news:<bpb3ek$6c2$>...
>> In <> (Cmorriskuerten) writes:
>>
>>> [prime number algorithm]

>>
>> [snip]
>>
>> On the algorithm side: after testing divisibility by 2, there is no
>> point in checking divisibility by any other even number.

>
>Don't you think simply incrementing the counter by 2 instead of by 1
>is in a way a micro-optimization?

Nope, I don't: it accelerates the *algorithm*, not the implementation,
by a factor of 2. If this is a micro-optimisation in your book, I would
be really interested to know what counts as a real optimisation to you.

Micro-optimisations don't touch the algorithm, they merely accelerate
its implementation by taking advantage of certain properties of the
implementation and/or the underlying hardware. Replacing a multiplication
by a power of 2 with a left shift is a micro-optimisation if the
latter is faster and if the compiler is not smart enough to do the
replacement itself.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email:

Dan Pop, Nov 19, 2003
10. ### Lorenzo J. Lucchini (Guest)

(Dan Pop) wrote in message news:<bpfi2g$8up$>...
> In <> (Lorenzo J. Lucchini) writes:
>
> (Dan Pop) wrote in message news:<bpb3ek$6c2$>...
> >> In <> (Cmorriskuerten) writes:
> >>
> >>> [prime number algorithm]
> >>
> >> [snip]
> >>
> >> On the algorithm side: after testing divisibility by 2, there is no
> >> point in checking divisibility by any other even number.

> >
> >Don't you think simply incrementing the counter by 2 instead of by 1
> >is in a way a micro-optimization?

>
> Nope, I don't: it accelerates the *algorithm*, not the implementation,
> by a factor of 2. If this is a micro-optimization in your book, I would
> be really interested to know what counts as a real optimisation to you.

I stand corrected. However, I must say that personally I would still
make a choice between testing every number and testing only those
numbers that are not multiples of a number I've already tested.
Testing every number says "I want to make the algorithm as
straightforward as possible".
Testing non-multiples of tested numbers says "I want to make the
algorithm good".
Testing every number except multiples of two says "At the beginning
I thought I would need to test every number; then I realized I did
not need to test multiples of two; who knows, maybe one day I'll even
realize I don't need to test multiples of *any* number I've already
tested for".

At least that's what I'd tend to deduce after reading the three
different algorithms.

> [snip]

by LjL

Lorenzo J. Lucchini, Nov 19, 2003
11. ### Dan Pop (Guest)

In <> (Lorenzo J. Lucchini) writes:

> (Dan Pop) wrote in message news:<bpfi2g$8up$>...
>> In <> (Lorenzo J. Lucchini) writes:
>>
>> > (Dan Pop) wrote in message news:<bpb3ek$6c2$>...
>> >> In <> (Cmorriskuerten) writes:
>> >>
>> >>> [prime number algorithm]
>> >>
>> >> [snip]
>> >>
>> >> On the algorithm side: after testing divisibility by 2, there is no
>> >> point in checking divisibility by any other even number.
>> >
>> >Don't you think simply incrementing the counter by 2 instead of by 1
>> >is in a way a micro-optimization?

>>
>> Nope, I don't: it accelerates the *algorithm*, not the implementation,
>> by a factor of 2. If this is a micro-optimization in your book, I would
>> be really interested to know what counts as a real optimisation to you.

>
>I stand corrected. However, I must say that personally I would still
>make a choice between testing for every number and only testing for
>those numbers that are not multiples of a number I've already tested
>for.
>Testing for every number says "I want to make the algorithm as
>straight-forward as possible".
>Testing for non-multiples of tested numbers says "I want to make the
>algorithm good"
>Testing for every number except for multiples of two says "At the
>beginning I thought I would need to test for every number; then I
>realized I did not need to test for multiples of two; who knows, maybe
>one day I'll even realize I don't need to test for multiples of *any*
>number I've already tested for".

Welcome to the real world, where compromises are a fact of life.
You won't get very far without understanding this and acting accordingly.

The best choice is the one that is reasonably fast and reasonably easy to
implement. Your ideal algorithm may prove harder to implement and slower.

Your algorithm could be used as an alternative to Eratosthenes' sieve, if
the purpose was to find out *all* the prime numbers smaller than N, but
it would be overkill to test a single number for primeness.

Eratosthenes' sieve is going to be faster on dumb processors, where % is
typically a very slow operation, because it can be implemented using
additions only. It's also a very cache unfriendly algorithm, but dumb
processors seldom use cache ;-)

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email:

Dan Pop, Nov 20, 2003
12. ### James Hu (Guest)

On 2003-11-20, Dan Pop <> wrote:
> In <> (Lorenzo J. Lucchini) writes:
>> Testing for every number says "I want to make the algorithm as
>> straight-forward as possible".
>> Testing for non-multiples of tested numbers says "I want to make the
>> algorithm good"
>> Testing for every number except for multiples of two says "At the
>> beginning I thought I would need to test for every number; then I
>> realized I did not need to test for multiples of two; who knows,
>> maybe one day I'll even realize I don't need to test for multiples of
>> *any* number I've already tested for".

>
> Your algorithm could be used as an alternative to Eratosthenes' sieve, if
> the purpose was to find out *all* the prime numbers smaller than N, but
> it would be overkill to test a single number for primeness.

The definition of the function may be to test a single number for
primeness, but it may be called many times for many numbers. It may
still be worth it to make the computation.

A shot at an implementation is in my signature. Not extensively tested.

-- James
--
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

#define ISQRT_G_1(N) (N>>((sizeof(N)*CHAR_BIT)/2))
#define ISQRT_G_2(N) ((ISQRT_G_1(N)+N/ISQRT_G_1(N))/2)
#define ISQRT_G_3(N) ((ISQRT_G_2(N)+N/ISQRT_G_2(N))/2)
#define ISQRT_G_4(N) ((ISQRT_G_3(N)+N/ISQRT_G_3(N))/2)
#define ISQRT_G_5(N) ((ISQRT_G_4(N)+N/ISQRT_G_4(N))/2)
#define ISQRT_G_6(N) ((ISQRT_G_5(N)+N/ISQRT_G_5(N))/2)
#define ISQRT(N) ISQRT_G_6(N)

static unsigned long
isqrt(unsigned long N)
{
    unsigned long r;

    if (N < 2) return N;   /* avoid division by zero below */

    r = N/2;
    r = (r + N/r)/2;
    r = (r + N/r)/2;
    r = (r + N/r)/2;
    r = (r + N/r)/2;
    r = (r + N/r)/2;

    return r;
}

static int prime_table_initialized;
static unsigned char prime_table[(ISQRT(ULONG_MAX)+(CHAR_BIT/2))/CHAR_BIT];
static unsigned prime_table_max = sizeof(prime_table)*CHAR_BIT;

#define MAKE_PRIME(N)   (void)(prime_table[N/CHAR_BIT] |= (1U<<(N%CHAR_BIT)))
#define MAKE_UNPRIME(N) (void)(prime_table[N/CHAR_BIT] &= ~(1U<<(N%CHAR_BIT)))
#define IS_PRIME(N)     ((prime_table[N/CHAR_BIT] & (1U<<(N%CHAR_BIT))) != 0)

static void
isprime_sieve(void)
{
    unsigned i;
    unsigned j;

    for (i = 0; i < prime_table_max; i++) {
        MAKE_PRIME(i);
    }
    MAKE_UNPRIME(0);
    MAKE_UNPRIME(1);
    for (i = 2; i < prime_table_max; i++) {
        if (IS_PRIME(i)) {
            for (j = i+i; j < prime_table_max; j += i) {
                MAKE_UNPRIME(j);
            }
        }
    }
    prime_table_initialized = 1;
}

static int
isprime(unsigned long n, unsigned long *f)
{
    unsigned i;
    unsigned long factor;

    if (f == 0) f = &factor;
    *f = 1;
    if (prime_table_initialized == 0) isprime_sieve();
    if (n < 4) return IS_PRIME(n);
    for (i = 2; i < prime_table_max; i++) {
        if (! IS_PRIME(i)) continue;
        if (n % i == 0) return (*f = i), 0;
        if (n/i <= i) return 1;
    }
    return 1;
}

int
main(int argc, char *argv[])
{
    unsigned long n;
    unsigned long m;
    unsigned long f;
    int i;

    m = 0;
    for (i = 1; i < argc; i++) {
        n = strtoul(argv[i], 0, 0);
        if (n > m) m = n;
    }
    prime_table_max = (isqrt(m)<prime_table_max) ? isqrt(m)+1 : prime_table_max;
    for (i = 1; i < argc; i++) {
        n = strtoul(argv[i], 0, 0);
        if (isprime(n,&f)) printf("%lu is prime\n", n);
        else printf("%lu is not prime (f=%lu)\n", n, f);
    }

    return 0;
}

James Hu, Nov 20, 2003
13. ### Dan Pop (Guest)

In <> James Hu <> writes:

>On 2003-11-20, Dan Pop <> wrote:
>> In <> (Lorenzo J. Lucchini) writes:
>>> Testing for every number says "I want to make the algorithm as
>>> straight-forward as possible".
>>> Testing for non-multiples of tested numbers says "I want to make the
>>> algorithm good"
>>> Testing for every number except for multiples of two says "At the
>>> beginning I thought I would need to test for every number; then I
>>> realized I did not need to test for multiples of two; who knows,
>>> maybe one day I'll even realize I don't need to test for multiples of
>>> *any* number I've already tested for".

>>
>> Your algorithm could be used as an alternative to Eratosthenes' sieve, if
>> the purpose was to find out *all* the prime numbers smaller than N, but
>> it would be overkill to test a single number for primeness.

>
>The definition of the function may be to test a single number for
>primeness, but it may be called many times for many numbers. It may
>still be worth it to make the computation.

Then, it's much cleaner to use two functions: one that computes all
primes less than N, and another that uses the results of the first to
test whether a number less than N is prime or not. At its first
invocation, the latter calls the former, then it performs a bsearch on
the array containing the results. The other calls simply perform the
bsearch.

Things get more complicated if N is not known when the test function is
called for the first time. And the approach is suboptimal if the
function is called once for testing a number above 1 million and all the
other times for numbers less than 100.

So, it's hard to say what's best without knowing the full story.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email:

Dan Pop, Nov 20, 2003
14. ### Eric Sosman (Guest)

James Hu wrote:
> [...]
> #define ISQRT_G_1(N) (N>>((sizeof(N)*CHAR_BIT)/2))
> #define ISQRT_G_2(N) ((ISQRT_G_1(N)+N/ISQRT_G_1(N))/2)
> #define ISQRT_G_3(N) ((ISQRT_G_2(N)+N/ISQRT_G_2(N))/2)
> #define ISQRT_G_4(N) ((ISQRT_G_3(N)+N/ISQRT_G_3(N))/2)
> #define ISQRT_G_5(N) ((ISQRT_G_4(N)+N/ISQRT_G_4(N))/2)
> #define ISQRT_G_6(N) ((ISQRT_G_5(N)+N/ISQRT_G_5(N))/2)
> #define ISQRT(N) ISQRT_G_6(N)

This isn't going to work very well for N less than
the square root of its type's maximum value: the initial
guess will be zero, and after that ...

There was a thread about three years ago regarding
implementations of integer square root functions without
resorting to a floating-point conversion. The fastest
used a 256-place table and an `if' ladder to produce an
initial guess close enough that only two Newton-Raphson
iterations were needed to refine it, and many smaller
values needed only one iteration. I was able to speed it
up just a hair more, and will provide it to anyone who's
interested. As coded it works only for 32-bit `unsigned
int', but the technique could be extended to wider types
with a fairly small expenditure of skull sweat.

--

Eric Sosman, Nov 20, 2003
15. ### Debashish Chakravarty (Guest)

(Dan Pop) wrote in message news:<bpitmn$d20$>...

> Then, it's much cleaner to use two functions: one that computes all
> primes less than N, and another that uses the results of the first to
> test whether a number less than N is prime or not. At its first
> invocation, the latter calls the former, then it performs a bsearch on
> the array containing the results. The other calls simply perform the
> bsearch.

While building a table of primes it should be sufficient to test
whether the current candidate is divisible by one of the primes found
so far.

Here's my attempt at the problem (I don't think it would be of much
use for testing an individual number for primeness):

#include <stdio.h>

#define MAXPRIMES 100

struct
{
    int nprimes;
    int primes[MAXPRIMES];
} prime_list = {1, {2}};

int
main(void)
{
    int i;
    int num = 10; /* build a table of the first "num" primes */

    for (i = 3; prime_list.nprimes < num; i += 2)
    {
        int j;
        int isprime = 1;

        /* test whether i is divisible by any prime found so far */
        for (j = 0; j < prime_list.nprimes; j++)
        {
            if (i % prime_list.primes[j] == 0)
            {
                isprime = 0;
                break;
            }
        }
        if (isprime) /* new prime found, push it on the list */
        {
            prime_list.primes[prime_list.nprimes++] = i;
        }
    }
    /* print out the table of the first num primes */
    for (i = 0; i < prime_list.nprimes; i++)
    {
        printf("%d\n", prime_list.primes[i]);
    }
    return 0;
}

Debashish Chakravarty, Nov 21, 2003
16. ### James Hu (Guest)

On 2003-11-20, Eric Sosman <> wrote:
> James Hu wrote:
>> [...]
>> #define ISQRT_G_1(N) (N>>((sizeof(N)*CHAR_BIT)/2))
>> #define ISQRT_G_2(N) ((ISQRT_G_1(N)+N/ISQRT_G_1(N))/2)
>> #define ISQRT_G_3(N) ((ISQRT_G_2(N)+N/ISQRT_G_2(N))/2)
>> #define ISQRT_G_4(N) ((ISQRT_G_3(N)+N/ISQRT_G_3(N))/2)
>> #define ISQRT_G_5(N) ((ISQRT_G_4(N)+N/ISQRT_G_4(N))/2)
>> #define ISQRT_G_6(N) ((ISQRT_G_5(N)+N/ISQRT_G_5(N))/2)
>> #define ISQRT(N) ISQRT_G_6(N)

>
> This isn't going to work very well for N less than
> the square root of its type's maximum value: the initial
> guess will be zero, and after that ...

I wrote that macro specifically to find the square root of ULONG_MAX.

> There was a thread about three years ago regarding
> implementations of integer square root functions without
> resorting to a floating-point conversion. ...

My macro is certainly faster for computing the square root of ULONG_MAX
than any function since it is computed at compile time, rather than
runtime.

-- James

James Hu, Nov 21, 2003
17. ### Anupam (Guest)

Eric Sosman <> wrote in message news:<>...
> James Hu wrote:
> > [...]
> > #define ISQRT_G_1(N) (N>>((sizeof(N)*CHAR_BIT)/2))
> > #define ISQRT_G_2(N) ((ISQRT_G_1(N)+N/ISQRT_G_1(N))/2)
> > #define ISQRT_G_3(N) ((ISQRT_G_2(N)+N/ISQRT_G_2(N))/2)
> > #define ISQRT_G_4(N) ((ISQRT_G_3(N)+N/ISQRT_G_3(N))/2)
> > #define ISQRT_G_5(N) ((ISQRT_G_4(N)+N/ISQRT_G_4(N))/2)
> > #define ISQRT_G_6(N) ((ISQRT_G_5(N)+N/ISQRT_G_5(N))/2)
> > #define ISQRT(N) ISQRT_G_6(N)

>
> This isn't going to work very well for N less than
> the square root of its type's maximum value: the initial
> guess will be zero, and after that ...
>
> There was a thread about three years ago regarding
> implementations of integer square root functions without
> resorting to a floating-point conversion. The fastest
> used a 256-place table and an `if' ladder to produce an
> initial guess close enough that only two Newton-Raphson
> iterations were needed to refine it, and many smaller
> values needed only one iteration. I was able to speed it
> up just a hair more, and will provide it to anyone who's
> interested. As coded it works only for 32-bit `unsigned
> int', but the technique could be extended to wider types
> with a fairly small expenditure of skull sweat.

Hi,
I would be very much interested in the article. Could you
provide a link to that thread? And if you can, then the
speeded-up code too.
Regards and thanks,
Anupam

Anupam, Nov 21, 2003
18. ### James Hu (Guest)

On 2003-11-21, Anupam <> wrote:
> Hi,
> I would be very much interested in the article. Could you
> provide a link to that thread? And if you can, then the
> speeded-up code too.
> Regards and thanks,
> Anupam

I don't have a pointer to the thread, but this web page covers the topic:

http://www.azillionmonkeys.com/qed/sqroot.html

-- James

James Hu, Nov 21, 2003
19. ### Eric Sosman (Guest)

James Hu wrote:
>
> On 2003-11-20, Eric Sosman <> wrote:
> > James Hu wrote:
> >> [...]
> >> #define ISQRT_G_1(N) (N>>((sizeof(N)*CHAR_BIT)/2))
> >> #define ISQRT_G_2(N) ((ISQRT_G_1(N)+N/ISQRT_G_1(N))/2)
> >> #define ISQRT_G_3(N) ((ISQRT_G_2(N)+N/ISQRT_G_2(N))/2)
> >> #define ISQRT_G_4(N) ((ISQRT_G_3(N)+N/ISQRT_G_3(N))/2)
> >> #define ISQRT_G_5(N) ((ISQRT_G_4(N)+N/ISQRT_G_4(N))/2)
> >> #define ISQRT_G_6(N) ((ISQRT_G_5(N)+N/ISQRT_G_5(N))/2)
> >> #define ISQRT(N) ISQRT_G_6(N)

> >
> > This isn't going to work very well for N less than
> > the square root of its type's maximum value: the initial
> > guess will be zero, and after that ...

>
> I wrote that macro specifically to find the square root of ULONG_MAX.

Aha. I hadn't realized the limited intent; sorry.

--

Eric Sosman, Nov 21, 2003
20. ### Eric Sosman (Guest)

Anupam wrote:
>
> Eric Sosman <> wrote
> >
> > There was a thread about three years ago regarding
> > implementations of integer square root functions without
> > resorting to a floating-point conversion. [...]

>
> I would be very much interested in the article. Could you
> provide a link to that thread?

Anupam: There's a wonderful service on the Internet,
known as Google. Among other things, Google maintains
searchable archives of Usenet newsgroups, including our
own comp.lang.c. Now, what do you suppose you find if
you go to http://groups.google.com and search for "integer
square root" in the comp.lang.c archives?

> And if you can, then the
> speeded-up code too.

It's just a few simple modifications to the code in
the first article of the cited thread. I also threw in
a whole lot of comments to describe what was going on;
it was essentially a "demonstration piece." Here it is,
except that I've blotted some E-mail addresses out of
the source citations so as not to increase anyone's
exposure to spam:

/* HISTORY --
* 04-Jul-2002: Renamed the header file.
* 29-Jun-2000: Added a lot of comments.
* 19-May-2000: Original (see DESCRIPTION).
*/

/* DESCRIPTION --
* This function is a slight improvement on code posted to comp.lang.c by
* Lawrence Kirby (), which seems to have been based on code
* by Dann Corbit (), which he in turn traces back to
* an 80386 assembly routine by one Arne Steinarson. Arne got the method from
* a fragment of temple wall retrieved from sunken Atlantis by Coq Jasteau;
* the stony text refers to the method as a gift from the "Ancient Astronauts"
* and hints of dire consequences to impious Atlanteans who affronted the
* Heavens by converting the method to binary from its original base seven.
*
* My principal improvement was to use a table of rounded rather than
* truncated first approximations; this reduced the number of cases requiring
* two Newton-Raphson iterations instead of just one. I also eliminated some
* needless additions of unity here and there, and got rid of a pessimization
* which tested for x >= 65535*65535. Finally, I rearranged the range tests
* along the lines suggested by Hermann Kremer (),
* although whether this is a good idea or not depends on what sort of input
* distribution is anticipated.
*/

#include <limits.h>
#include "imath.h"

#if UINT_MAX != 0xFFFFFFFF
#error "This code needs a 32-bit unsigned integer"
#endif

unsigned int
isqrt(unsigned int x)
{
/* This table holds the rounded square roots of the integers 0..255, each
* expressed as a fixed-point binary number with four bits to the left and
* four bits to the right of the binary point. For example, sqrt(2) is
* 1.414... decimal or 1.01101... binary, which is rounded to 1.0111 and
* represented in the table as 0x17 == 23.
*/
static unsigned char est[0x100] = {
0, 16, 23, 28, 32, 36, 39, 42, 45, 48, 51, 53, 55, 58, 60, 62, 64, 66,
68, 70, 72, 73, 75, 77, 78, 80, 82, 83, 85, 86, 88, 89, 91, 92, 93, 95,
96, 97, 99, 100, 101, 102, 104, 105, 106, 107, 109, 110, 111, 112, 113,
114, 115, 116, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127, 128,
129, 130, 131, 132, 133, 134, 135, 136, 137, 138, 139, 139, 140, 141,
142, 143, 144, 145, 146, 147, 148, 148, 149, 150, 151, 152, 153, 153,
154, 155, 156, 157, 158, 158, 159, 160, 161, 162, 162, 163, 164, 165,
166, 166, 167, 168, 169, 169, 170, 171, 172, 172, 173, 174, 175, 175,
176, 177, 177, 178, 179, 180, 180, 181, 182, 182, 183, 184, 185, 185,
186, 187, 187, 188, 189, 189, 190, 191, 191, 192, 193, 193, 194, 195,
195, 196, 197, 197, 198, 199, 199, 200, 200, 201, 202, 202, 203, 204,
204, 205, 206, 206, 207, 207, 208, 209, 209, 210, 210, 211, 212, 212,
213, 213, 214, 215, 215, 216, 216, 217, 218, 218, 219, 219, 220, 221,
221, 222, 222, 223, 223, 224, 225, 225, 226, 226, 227, 227, 228, 229,
229, 230, 230, 231, 231, 232, 232, 233, 234, 234, 235, 235, 236, 236,
237, 237, 238, 238, 239, 239, 240, 241, 241, 242, 242, 243, 243, 244,
244, 245, 245, 246, 246, 247, 247, 248, 248, 249, 249, 250, 250, 251,
251, 252, 252, 253, 253, 254, 254, 255, 255 };
    unsigned int r;

    if (x >= 1U << 14) {
        /* Some values in this range require at least one Newton-Raphson step
         * to refine the initial estimate.
         */
        if (x >= 1U << 30) {
            r = est[x >> 24] << 8;
            /* Some values in this range require two Newton-Raphson iterations
             * instead of just one, so perform the first one here.
             */
            r = (r + x / r) / 2;
        }
        else if (x >= 1U << 28)
            r = est[x >> 22] << 7;
        else if (x >= 1U << 26)
            r = est[x >> 20] << 6;
        else if (x >= 1U << 24)
            r = est[x >> 18] << 5;
        else if (x >= 1U << 22)
            r = est[x >> 16] << 4;
        else if (x >= 1U << 20)
            r = est[x >> 14] << 3;
        else if (x >= 1U << 18)
            r = est[x >> 12] << 2;
        else if (x >= 1U << 16)
            r = est[x >> 10] << 1;
        else
            r = est[x >> 8] << 0;

        /* Refine the estimate with one (or one more) Newton-Raphson step.
         */
        r = (r + x / r) / 2;
    }
    else {
        /* In this range, the initial estimate (after scaling and rounding)
         * is very close and needs at most a small final tweak.
         */
        if (x >= 1U << 12)
            r = (est[x >> 6] + 1) >> 1;
        else if (x >= 1U << 10)
            r = (est[x >> 4] + 2) >> 2;
        else if (x >= 1U << 8)
            r = (est[x >> 2] + 4) >> 3;
        else {
            /* In this range, the initial estimate is exact and requires no
             * tweaking at all.
             */
            return est[x >> 0] >> 4;
        }
    }

    /* Final tweak: The estimate (original or once- or twice-refined) is either
     * correct as it stands or is one too large. (Don't worry about overflow
     * in the multiplication; `r' is strictly less than 65536 here.)
     */
    if (r * r > x)
        --r;

    return r;
}

--

Eric Sosman, Nov 21, 2003