# [Algorithm] Sum of Primes < 1000000

Discussion in 'Java' started by Luc The Perverse, Jan 5, 2007.

1. ### Luc The Perverse (Guest)

I was messing around on a site where you answer questions for points. (No
prizes or anything)

Anyway, they say our programs *should* execute in less than 1 minute if well
formed.

Mine didn't - it was less than 10 minutes though and I got the right answer.

I was supposed to sum all primes less than 1 million.

Is there an algorithm out there, which I will be able to understand, which
works significantly faster than this one?

//import java.lang.*;
import java.util.*;
import java.math.*;

public class prob10{
    public static void main(String[] x){
        if(x.length<1){
            System.out.println("enter a number");
            return;
        }

        int n = Integer.parseInt(x[0]);
        int c = 0;
        int[] pc = new int[n];
        int[] p2 = new int[n];
        if(n<=1){
            System.out.println(2);
            return;
        }
        c++;
        pc[0] = 2;
        p2[0] = 2;
        int cn = 1;
        long g = 2;
        while(c<n){
            cn+=2;
            boolean Prime = true;
            for(int i=1;i<c;i++){
                p2[i] -= 2;
                if(p2[i] == 0)
                    Prime = false;
                if(p2[i] < 1)
                    p2[i] += pc[i];
            }
            if(Prime){
                pc[c] = cn;
                p2[c++] = cn;
                if((c%2048) == 0)
                    System.out.println(cn);
                if(cn<1000000)
                    g+=cn;
                else{
                    System.out.println("Yay 1 million reached!");
                    break;
                }
            }
        }
        System.out.println("Grand Total " + g);
        System.out.println(pc[n-1]);
    }
}

I specifically tried to avoid ever dividing or modding (except in my status
reporter, but that only checks when a prime has been found)

I "count down" until the next multiple of a prime is detected.

The algorithm works... but it is n^2 complexity, and kinda sucks.

--
LTP

Luc The Perverse, Jan 5, 2007

2. ### Simon (Guest)

Luc The Perverse schrieb:
> I was messing around on a site where you answer questions for points. (No
> prizes or anything)
>
> Anyway, they say our programs *should* execute in less than 1 minute if well
> formed.
>
> Mine didn't - it was less than 10 minutes though and I got the right answer.
>
> I was supposed to sum all primes less than 1 million.
>
> Is there an algorithm out there, which I will be able to understand, which
> works significantly faster than this one?

You may want to have a look at the Sieve of Eratosthenes (and its successors)
(see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
to n in time n log n. On my computer my implementation takes 150 milliseconds
for n=1000000.
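A minimal sketch of such a sieve, for reference (an illustration with made-up names, not the implementation Simon timed):

```java
// Illustrative Sieve of Eratosthenes (assumed class/method names)
public class SieveSketch {

    // Sum all primes strictly below n.
    static long sumPrimesBelow(int n) {
        boolean[] composite = new boolean[n];   // composite[i] == true => i is not prime
        long sum = 0;
        for (int i = 2; i < n; i++) {
            if (composite[i]) continue;         // i survived all crossings: prime
            sum += i;
            // cross off multiples, starting at i*i (smaller ones are already marked)
            for (long j = (long) i * i; j < n; j += i) {
                composite[(int) j] = true;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimesBelow(1000000)); // prints 37550402023
    }
}
```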

> I specifically tried to avoid ever dividing or modding (except in my status
> reporter, but that only checks when a prime has been found)

I doubt that such micro-optimisations are of much use here.

Cheers,
Simon

Simon, Jan 5, 2007

3. ### Patricia Shanahan (Guest)

Luc The Perverse wrote:
> I was messing around on a site where you answer questions for points. (No
> prizes or anything)
>
> Anyway, they say our programs *should* execute in less than 1 minute if well
> formed.
>
> Mine didn't - it was less than 10 minutes though and I got the right answer.
>
> I was supposed to sum all primes less than 1 million.
>
> Is there an algorithm out there, which I will be able to understand, which
> works significantly faster than this one?

....

The usual algorithm is the Sieve of Eratosthenes,
http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

The sieve is O(n*n), but with small constant factor.

Patricia

Patricia Shanahan, Jan 5, 2007
4. ### Patricia Shanahan (Guest)

Patricia Shanahan wrote:
> Luc The Perverse wrote:
>> I was messing around on a site where you answer questions for points.
>> (No prizes or anything)
>>
>> Anyway, they say our programs *should* execute in less than 1 minute
>> if well formed.
>>
>> Mine didn't - it was less than 10 minutes though and I got the right
>> answer.
>>
>> I was supposed to sum all primes less than 1 million.
>>
>> Is there an algorithm out there, which I will be able to understand,
>> which works significantly faster than this one?

> ...
>
> The usual algorithm is the Sieve of Eratosthenes,
> http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
>
> The sieve is O(n*n), but with small constant factor.

I was wrong about the complexity. I was not allowing for only doing the
inner loop for primes.

Patricia

Patricia Shanahan, Jan 5, 2007
5. ### Luc The Perverse (Guest)

"Patricia Shanahan" <> wrote in message
news:Cernh.6752\$...
> Patricia Shanahan wrote:
>> Luc The Perverse wrote:
>>> I was messing around on a site where you answer questions for points.
>>> (No prizes or anything)
>>>
>>> Anyway, they say our programs *should* execute in less than 1 minute if
>>> well formed.
>>>
>>> Mine didn't - it was less than 10 minutes though and I got the right
>>> answer.
>>>
>>> I was supposed to sum all primes less than 1 million.
>>>
>>> Is there an algorithm out there, which I will be able to understand,
>>> which works significantly faster than this one?

>> ...
>>
>> The usual algorithm is the Sieve of Eratosthenes,
>> http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
>>
>> The sieve is O(n*n), but with small constant factor.

>
> I was wrong about the complexity. I was not allowing for only doing the
> inner loop for primes.

Looks like I had better check this out!

hehe cool

Thanks guys! (And I mean Simon too)

--
LTP

Luc The Perverse, Jan 5, 2007
6. ### Patricia Shanahan (Guest)

Simon wrote:
> Luc The Perverse schrieb:
>> I was messing around on a site where you answer questions for points. (No
>> prizes or anything)
>>
>> Anyway, they say our programs *should* execute in less than 1 minute if well
>> formed.
>>
>> Mine didn't - it was less than 10 minutes though and I got the right answer.
>>
>> I was supposed to sum all primes less than 1 million.
>>
>> Is there an algorithm out there, which I will be able to understand, which
>> works significantly faster than this one?

>
> You may want to have a look at the Sieve of Eratosthenes (and its successors)
> (see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
> to n in time n log n. On my computer my implementation takes 150 milliseconds
> for n=1000000.

As a matter of curiosity, did you use a boolean[] or BitSet?

I started with a boolean[], and mine was slower than yours, about 200
milliseconds. It dropped to under 100 milliseconds when I switched to
BitSet. I believe the improvement is due to more efficient caching.
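A rough way to compare the two representations on a given machine (an illustrative sketch, not Patricia's or Simon's actual code):

```java
import java.util.BitSet;

// Compare boolean[] vs BitSet sieves; both should return the same sum.
public class SieveCompare {

    static long sieveBool(int n) {
        boolean[] comp = new boolean[n];
        long sum = 0;
        for (int i = 2; i < n; i++) {
            if (!comp[i]) {
                sum += i;
                for (long j = (long) i * i; j < n; j += i) comp[(int) j] = true;
            }
        }
        return sum;
    }

    static long sieveBitSet(int n) {
        BitSet comp = new BitSet(n);
        long sum = 0;
        // nextClearBit skips straight to the next unmarked candidate
        for (int i = 2; i < n; i = comp.nextClearBit(i + 1)) {
            sum += i;
            for (long j = (long) i * i; j < n; j += i) comp.set((int) j);
        }
        return sum;
    }

    public static void main(String[] args) {
        int n = 1000000;
        long t0 = System.nanoTime();
        long s1 = sieveBool(n);
        long t1 = System.nanoTime();
        long s2 = sieveBitSet(n);
        long t2 = System.nanoTime();
        System.out.println("boolean[]: " + s1 + " in " + (t1 - t0) / 1000000 + "ms");
        System.out.println("BitSet   : " + s2 + " in " + (t2 - t1) / 1000000 + "ms");
    }
}
```

Timing a single run like this is crude; as discussed below, the winner depends heavily on cache configuration.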

Patricia

Patricia Shanahan, Jan 5, 2007
7. ### Daniel Pitts (Guest)

Patricia Shanahan wrote:
> Simon wrote:
> > Luc The Perverse schrieb:
> >> I was messing around on a site where you answer questions for points. (No
> >> prizes or anything)
> >>
> >> Anyway, they say our programs *should* execute in less than 1 minute if well
> >> formed.
> >>
> >> Mine didn't - it was less than 10 minutes though and I got the right answer.
> >>
> >> I was supposed to sum all primes less than 1 million.
> >>
> >> Is there an algorithm out there, which I will be able to understand, which
> >> works significantly faster than this one?

> >
> > You may want to have a look at the Sieve of Eratosthenes (and its successors)
> > (see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
> > to n in time n log n. On my computer my implementation takes 150 milliseconds
> > for n=1000000.

>
> As a matter of curiosity, did you use a boolean[] or BitSet?
>
> I started with a boolean[], and mine was slower than yours, about 200
> milliseconds. It dropped to under 100 milliseconds when I switched to
> BitSet. I believe the improvement is due to more efficient caching.
>
> Patricia

I personally got 66ms from:

public class SumPrimes {
    public static void main(String... args) {
        final int bounds = Integer.parseInt(args[0]) + 1;
        final java.util.BitSet composits = new java.util.BitSet(bounds);
        long sum = 1;
        for (int prime = 2; prime < bounds;
                prime = composits.nextClearBit(prime + 1)) {
            sum += prime;
            for (int composit = prime + prime; composit < bounds;
                    composit += prime) {
                composits.set(composit);
            }
        }
        System.out.println(sum);
    }
}

Daniel Pitts, Jan 5, 2007
8. ### MikeNereson (Guest)

Your original algorithm worked for me in 141ms. I made a couple of
seemingly minor changes. Can you find them? Why are they so important?

http://pastebin.com/852736

OUTPUT:
Yay 1 million reached!
Grand Total 797162
0
only took 141 ms

MikeNereson, Jan 6, 2007
9. ### MikeNereson (Guest)

Whoops, I didn't validate that output. My bad. The bitset solution is
great.

MikeNereson wrote:
> Your original algorithm worked for me in 141ms. I made a couple of
> seemingly minor changes. Can you find them? Why are they so important?
>
> http://pastebin.com/852736
>
> OUTPUT:
> Yay 1 million reached!
> Grand Total 797162
> 0
> only took 141 ms

MikeNereson, Jan 6, 2007
10. ### Chris Uppal (Guest)

MikeNereson wrote:

> Yay 1 million reached!
> Grand Total 797162

Seems unlikely to be the correct answer, since 999983 is just one of the
primes in that range...

-- chris

Chris Uppal, Jan 6, 2007
11. ### Abhi (Guest)

Hi Luc,
I found the primes < 1000000 and stored them in an array, and also summed
them up. I am pasting my code. Please have a look. It took 1782 ms on a
P4 2.4 GHz processor with 512 MB RAM.
If this code is not ok then let me know.

/*
 * Created on Jan 7, 2007
 *
 * To change the template for this generated file go to
 * Window>Preferences>Java>Code Generation>Code and
 */
package src;

//import java.lang.Math;
/**
* @author Jboss
*calculate sum of prime nos less than 1 million
*/
public final class Sum_Prime {

private static int currNo;
private static int currSqrt;
private static int[] primeArr;
private static long currSum = 0;
private static final int range = 1000000; // calc till 1000000;
private static long startTime;

public static void main(String[] args) {
//Math.sqrt()
int currArrSize = range / 20;//holds the curr size of the array
if (currArrSize == 0)
{
primeArr = new int[10];
currArrSize=10;
}
else
primeArr = new int[currArrSize];

startTime = System.currentTimeMillis();
int k = 0; //index of the array which stores the prime nos.

currNo = 2; //start looking from 2
System.out.print("\n arr size \t"+primeArr.length);
for (int i = 2; i <= range; i++) {
boolean isPrime = true;
currSqrt = (int) Math.sqrt(currNo);
for (int j = 2;
j <= currSqrt;
j++) //chk for 3 whose sqrt is 1.5....
{
if (currNo % j == 0) {

isPrime = false;
break;
}
}
if (isPrime) {
try {
primeArr[k] = currNo;
currSum = currSum + currNo;
k++;
} catch (ArrayIndexOutOfBoundsException arrex) {
currArrSize=currArrSize*2;
int[] tempArr = new int[currArrSize];
System.arraycopy(primeArr, 0, tempArr, 0, k); // copies the old arr into the new one
primeArr=tempArr;
//System.out.print("\n primeArr size \t"+primeArr.length);
//System.out.print("\n tempArr size \t"+tempArr.length);
primeArr[k]=currNo;
currSum = currSum + currNo;
System.out.print("\n new arr size \t"+primeArr.length);
k++;
}

}
currNo++;
}
System.out.print("\n sum \t" + currSum);
System.out.print("\n curr val \t" + currNo);
System.out.print("\n nos found \t" + k);
System.out.print(
"\n Time taken to execute in millisecs \t"
+ (System.currentTimeMillis() - startTime));
System.exit(0);
}
}
Tell me the name of the site where you saw this.

Luc The Perverse wrote:
> I was messing around on a site where you answer questions for points. (No
> prizes or anything)
>
> Anyway, they say our programs *should* execute in less than 1 minute if well
> formed.
>
> Mine didn't - it was less than 10 minutes though and I got the right answer.
>
> I was supposed to sum all primes less than 1 million.
>
> Is there an algorithm out there, which I will be able to understand, which
> works significantly faster than this one?
>
> [code snipped; same as above]

Abhi, Jan 8, 2007
12. ### Simon (Guest)

Patricia Shanahan schrieb:
> Simon wrote:
> As a matter of curiosity, did you use a boolean[] or BitSet?
>
> I started with a boolean[], and mine was slower than yours, about 200
> milliseconds. It dropped to under 100 milliseconds when I switched to
> BitSet. I believe the improvement is due to more efficient caching.

Well, that is strange. I have been using a boolean[], and using BitSet actually
slows it down on my machine. I am using a Laptop, so CPU speed may vary. Today
it is even faster than yesterday. The output was (code see below):

Time : 42ms
Sum : 37551402022

Apart from that, from your other post I saw that you have used a different
argument to obtain the n log n upper bound.

- Your argument: There are only log n primes below n and each iteration takes
time at most n.
- My argument: Consider the iteration in which we erase prime i. We erase n/i
composite numbers. In total, we need time

sum(i=1..n) n/i
= n * H(n)
= O(n * log n)

(Here, H(n) is the n-th harmonic number.) Can we combine both arguments to
obtain an upper bound below n * log n? In my sum, only log n terms are actually
necessary, as has been argued by Patricia. Unfortunately, the terms that are
actually present are the larger ones (those for small i). Still, some Googling
reveals that it is possible to get the time down to n * log log n.
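For reference, that n * log log n bound follows from Mertens' second theorem once the sum is restricted to primes (a sketch, not from the thread):

```latex
\sum_{\substack{p \le n \\ p\ \mathrm{prime}}} \frac{n}{p}
  \;=\; n \sum_{p \le n} \frac{1}{p}
  \;=\; n \left( \log\log n + M + o(1) \right)
  \;=\; O(n \log\log n)
```

where M is the Meissel-Mertens constant.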

Cheers,
Simon

This is my code:

public class Primes {

    private boolean[] numbers;
    private long sum = 0;

    public Primes(int size) {
        numbers = new boolean[size];
    }

    public void eraseComposite() {
        int prime = 2;
        while (prime < numbers.length) {
            sum += prime;
            // mark multiples of prime
            for (int i = 2 * prime; i < numbers.length; i += prime) {
                // (we could start with prime*prime here, but I didn't bother
                // since it would be larger than Integer.MAX_INTEGER)
                numbers[i] = true;
            }
            // move forward to next prime
            prime++;
            while (prime < numbers.length-1 && (numbers[prime])) {
                prime++;
            }
        }
    }

    public static void main(String[] argv) {
        long t = System.currentTimeMillis();
        Primes p = new Primes(1000000);
        p.eraseComposite();
        System.out.println("Time : " + (System.currentTimeMillis() - t) + "ms");
        System.out.println("Sum : " + p.sum);
    }
}

Simon, Jan 8, 2007
13. ### Patricia Shanahan (Guest)

Simon wrote:
> Patricia Shanahan schrieb:
>> Simon wrote:
>> As a matter of curiosity, did you use a boolean[] or BitSet?
>>
>> I started with a boolean[], and mine was slower than yours, about 200
>> milliseconds. It dropped to under 100 milliseconds when I switched to
>> BitSet. I believe the improvement is due to more efficient caching.

>
> Well, that is strange. I have been using a boolean[], and using BitSet actually
> slows it down on my machine. I am using a Laptop, so CPU speed may vary. Today
> it is even faster than yesterday. The output was (code see below):

The most likely cause of the speed up I saw is more efficient caching,
because BitSet stores the data more densely. That is a very hardware
dependent effect, though less a matter of CPU speed than of cache
configuration.

> public void eraseComposite() {
> int prime = 2;
> while (prime < numbers.length) {
> sum += prime;
> // mark multiples of prime
> for (int i = 2 * prime; i < numbers.length; i += prime) {
> // (we could start with prime*prime here, but I didn't bother
> // since it would be larger than Integer.MAX_INTEGER)
> numbers[i] = true;
> }
> // move forward to next prime
> prime++;
> while (prime < numbers.length-1 && (numbers[prime])) {
> prime++;
> }
> }
> }
>

There is one optimization you missed. When prime is greater than the
square root of numbers.length, there is no need to do the composite scan.

If a number less than 1,000,000 is composite, at least one prime factor
is less than 1000.
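A sketch of Patricia's suggestion (an illustration with assumed names, not her code): only cross off for p up to the square root, then sum whatever is left unmarked.

```java
public class SqrtCutoff {

    // Sieve that only does the composite scan for p <= sqrt(n).
    static long sumPrimesBelow(int n) {
        boolean[] comp = new boolean[n];
        int limit = (int) Math.sqrt(n);  // compute the root once, outside the loop
        for (int p = 2; p <= limit; p++) {
            if (comp[p]) continue;
            // p <= sqrt(n) here, so p*p stays within int range for any array-sized n
            for (int j = p * p; j < n; j += p) {
                comp[j] = true;
            }
        }
        long sum = 0;
        for (int i = 2; i < n; i++) {
            if (!comp[i]) sum += i;      // everything unmarked is prime
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimesBelow(1000000)); // prints 37550402023
    }
}
```

This is correct because, as Patricia notes, any composite below n has a prime factor no larger than sqrt(n).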

Patricia

Patricia Shanahan, Jan 8, 2007
14. ### Simon (Guest)

Patricia Shanahan schrieb:
> Simon wrote:
>> Patricia Shanahan schrieb:
>>> Simon wrote:
>>> As a matter of curiosity, did you use a boolean[] or BitSet?
>>>
>>> I started with a boolean[], and mine was slower than yours, about 200
>>> milliseconds. It dropped to under 100 milliseconds when I switched to
>>> BitSet. I believe the improvement is due to more efficient caching.

>>
>> Well, that is strange. I have been using a boolean[], and using BitSet
>> actually
>> slows it down on my machine. I am using a Laptop, so CPU speed may
>> vary. Today
>> it is even faster than yesterday. The output was (code see below):

>
> The most likely cause of the speed up I saw is more efficient caching,
> because BitSet stores the data more densely. That is a very hardware
> dependent effect, though less a matter of CPU speed than of cache
> configuration.
>
>> public void eraseComposite() {
>> int prime = 2;
>> while (prime < numbers.length) {
>> sum += prime;
>> // mark multiples of prime
>> for (int i = 2 * prime; i < numbers.length; i += prime) {
>> // (we could start with prime*prime here, but I didn't bother
>> // since it would be larger than Integer.MAX_INTEGER)
>> numbers[i] = true;
>> }
>> // move forward to next prime
>> prime++;
>> while (prime < numbers.length-1 && (numbers[prime])) {
>> prime++;
>> }
>> }
>> }
>>

>
> There is one optimization you missed. When prime is greater than the
> square root of numbers.length, there is no need to do the composite scan.
>
> If a number less than 1,000,000 is composite, at least one prime factor
> is less than 1000.

This is what I tried to express in the comment above. It was more laziness,
though, than the reason stated there that prevented me from implementing it.

Cheers,
Simon

Simon, Jan 8, 2007
15. ### Patricia Shanahan (Guest)

Simon wrote:
> Patricia Shanahan schrieb:
>> Simon wrote:
>>> Patricia Shanahan schrieb:
>>>> Simon wrote:
>>>> As a matter of curiosity, did you use a boolean[] or BitSet?
>>>>
>>>> I started with a boolean[], and mine was slower than yours, about 200
>>>> milliseconds. It dropped to under 100 milliseconds when I switched to
>>>> BitSet. I believe the improvement is due to more efficient caching.
>>> Well, that is strange. I have been using a boolean[], and using BitSet
>>> actually
>>> slows it down on my machine. I am using a Laptop, so CPU speed may
>>> vary. Today
>>> it is even faster than yesterday. The output was (code see below):

>> The most likely cause of the speed up I saw is more efficient caching,
>> because BitSet stores the data more densely. That is a very hardware
>> dependent effect, though less a matter of CPU speed than of cache
>> configuration.
>>
>>> public void eraseComposite() {
>>> int prime = 2;
>>> while (prime < numbers.length) {
>>> sum += prime;
>>> // mark multiples of prime
>>> for (int i = 2 * prime; i < numbers.length; i += prime) {
>>> // (we could start with prime*prime here, but I didn't bother
>>> // since it would be larger than Integer.MAX_INTEGER)
>>> numbers[i] = true;
>>> }
>>> // move forward to next prime
>>> prime++;
>>> while (prime < numbers.length-1 && (numbers[prime])) {
>>> prime++;
>>> }
>>> }
>>> }
>>>

>> There is one optimization you missed. When prime is greater than the
>> square root of numbers.length, there is no need to do the composite scan.
>>
>> If a number less than 1,000,000 is composite, at least one prime factor
>> is less than 1000.

>
> This is what I tried to express in the above comment. It was more laziness than
> the reason stated above though that prevented me from implementing it
>
> Cheers,
> Simon

Ah, I see. You were thinking of checking by asking whether the square of
the latest prime is less than numbers.length.

The way I did it, there is no possibility of overflow - I stored
Math.sqrt(numbers.length) as an integer. Inside the loop, I just
compared to the square root.

Patricia

Patricia Shanahan, Jan 8, 2007
16. ### John Ersatznom (Guest)

So to make a long story short, the algorithm is:

* Make an array of booleans of length max_num - 1, whose zeroth element
corresponds to 2 and whose (max_num - 2)nd element corresponds to max_num.
Clear it (all false). (Java clears it for you if you use Java.)
* Take square root of maxnum and store it somewhere.
* for i = 2 to that square root
* if (array[i - 2]) continue;
* for j = i*i to max_num step i
* set array[j - 2]
* end
* end

Note the optimization of starting the inner loop at i*i. All the
composites smaller than i*i have a prime factor smaller than i and so
are already marked at that point. Note also skipping the inner loop for
composite values of i.
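The pseudocode above renders into Java along these lines (a sketch with assumed names, including the offset-by-2 array):

```java
public class OffsetSieve {

    // Sum primes up to and including maxNum, using array[k] to represent k + 2.
    static long sumPrimes(int maxNum) {
        boolean[] array = new boolean[maxNum - 1]; // all false initially
        int root = (int) Math.sqrt(maxNum);        // one square root, outside all loops
        for (int i = 2; i <= root; i++) {
            if (array[i - 2]) continue;            // composite i: skip the inner loop
            for (int j = i * i; j <= maxNum; j += i) {
                array[j - 2] = true;               // mark composites starting at i*i
            }
        }
        long sum = 0;
        for (int k = 0; k < array.length; k++) {
            if (!array[k]) sum += k + 2;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumPrimes(999999)); // primes < 1000000: prints 37550402023
    }
}
```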

In fact, using the square root of max_num is a less significant
optimization after the i*i optimization, as it just stops the outer loop
testing every remaining value for compositeness and then entering the
inner loop for the primes but without the inner loop actually doing
anything (as i*i >= max_num the loop terminates immediately).

The algorithm does sqrt(max_num) bit tests (using the sqrt(max_num) opt)
and sum over primes p up to sqrt(max_num) of (max_num/p). The number of primes is
O(log sqrt(max_num)) from what I've seen of primes, which is O(log
max_num) since the log of the square root is simply half the log and the
half disappears into the constant factor. The average prime from this
set will be O(1) (integral of log x is log x - x, approximates sum of
primes; divide by log x for 1 - 1/log(x) is O(1)). So the number of bit
sets is O(max_num log max_num) -- O(max_num) per prime as it's max_num/p
with average p O(1), and O(log max_num) primes.

So the whole thing has complexity O(n log n) -- as good as quicksort,
and without the quadratic worst-case behavior.

Removing the squaring/square rooting optimizations doesn't increase the
complexity but does increase the running time by a factor of 2 or so.
Note that square roots are expensive, but the only square root occurs
outside any loops...i*i is just a multiplication, and multiplication of
integers is cheap on any modern architecture. The inner loop does one
imul, three iadds (one's actually the subtract in the loop test and one
computes the array index), one icmp (the rest of the loop test), and one
boolean assignment. I expect the compiler to optimize "set array[j - 2]"
to "*(foo + j) = true" or something equivalent -- i.e. to use a point
two before the actual start of the array as the base for the pointer
adds in this loop. Of course, using a BitSet instead of a boolean[] may
change this a lot. A sensible implementation plus jit compiler will
probably reduce it to a similar add and a bit mask operation, but I
expect an idiv will sneak in there to get the array index into the
underlying array, which may make BitSet worse-performing, and the
subtraction of 2 definitely has to be done separately. Likely *(foo + (j
- 2)/64) |= 2^(j - 2) under the hood with 2^(j - 2) sucked out of a
lookup table, for a grand total of two extra iadds (j - 2 and the lookup
into the 2^n table), one extra idiv, and two extra fetches (the lookup
into the 2^n table and retrieving the stored word to |= with something)
versus storing in a boolean[] which is probably the one add and a flat
store, without bothering to retrieve whatever was in the array cell
originally.

I'd be *very* surprised if BitSet actually comes out faster after you
run it a few hundred times to get it all JITted and then a few hundred
more to measure and average.

John Ersatznom, Jan 8, 2007
17. ### Richard F.L.R.Snashall (Guest)

John Ersatznom wrote:
> So to make a long story short, the algorithm is:
>
> * Make array of booleans of length max_num - 1, whose zeroth element
> corresponds to 2 and whose max_num - 2nd element corresponds to maxnum.
> Clear it (all false). (Java clears it for you if you use Java.)
> * Take square root of maxnum and store it somewhere.
> * for i = 2 to that square root
> * if (array[i - 2]) continue;
> * for j = i*i to max_num step i
> * set array[j - 2]
> * end
> * end
>
> Note the optimization of starting the inner loop at i*i. All the
> composites smaller than i*i have a prime factor smaller than i and so
> are already marked at that point. Note also skipping the inner loop for
> composite values of i.

What if you used your array to find "open" values at i and above
and considered only those multiples of i?

Richard F.L.R.Snashall, Jan 8, 2007
18. ### Patricia Shanahan (Guest)

John Ersatznom wrote:
....
> I'd be *very* surprised if BitSet actually comes out faster after you
> run it a few hundred times to get it all JITted and then a few hundred
> more to measure and average.

The potential gain from BitSet is more a matter of hardware than
anything JIT can affect.

Especially in the early stages, the sieve has a very cache-unfavorable
memory access pattern, relatively small stride access to a range of a
million elements.

If you use a boolean[], those million elements occupy a million bytes of
memory, because, at least in the implementations I've examined, each
boolean in an array gets its own byte.

BitSet uses a long[], and packs 64 bits into each element, so a million
elements only take 125,000 bytes of memory.

That may make a significant difference if at least one cache in the
system has a size in the range of a hundred kilobytes to a megabyte.

Patricia

Patricia Shanahan, Jan 8, 2007
19. ### Jean-Francois Briere (Guest)

The sum should be 37550402023 and not 37551402022

// move forward to next prime
prime++;
while (prime < numbers.length && (numbers[prime])) {
prime++;
}
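With that one-line fix (the bound is numbers.length, not numbers.length - 1), Simon's sieve no longer sums the composite last index 999999, which accounts exactly for the difference 37551402022 - 37550402023 = 999999. A self-contained reconstruction (a sketch, not Jean-Francois's full code; the getter is added for convenience):

```java
// Simon's sieve with the one-line fix applied
public class PrimesFixed {

    private boolean[] numbers;   // numbers[i] == true => i is composite
    private long sum = 0;

    public PrimesFixed(int size) {
        numbers = new boolean[size];
    }

    public void eraseComposite() {
        int prime = 2;
        while (prime < numbers.length) {
            sum += prime;
            // mark multiples of prime
            for (int i = 2 * prime; i < numbers.length; i += prime) {
                numbers[i] = true;
            }
            prime++;
            // the fix: scan to numbers.length, so a composite final index
            // (such as 999999) is skipped rather than summed
            while (prime < numbers.length && numbers[prime]) {
                prime++;
            }
        }
    }

    public long sum() {
        return sum;
    }

    public static void main(String[] args) {
        PrimesFixed p = new PrimesFixed(1000000);
        p.eraseComposite();
        System.out.println("Sum : " + p.sum()); // prints Sum : 37550402023
    }
}
```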

Regards

Jean-Francois Briere, Jan 8, 2007
20. ### Simon (Guest)

John Ersatznom wrote:
> The algorithm does sqrt(max_num) bit tests (using the sqrt(max_num) opt)
> and sum{primes p up to sqrt(max_num}(max_num/p). The number of primes is
> O(log sqrt(max_num)) from what I've seen of primes, which is O(log
> max_num) since the log of the square root is simply half the log and the
> half disappears into the constant factor. The average prime from this
> set will be O(1) (integral of log x is log x - x, approximates sum of
> primes; divide by log x for 1 - 1/log(x) is O(1)). So the number of bit
> sets is O(max_num log max_num) -- O(max_num) per prime as it's max_num/p
> with average p O(1), and O(log max_num) primes.
>
> So the whole thing has complexity O(n log n) -- as good as quicksort,
> and without the quadratic worst-case behavior.

I'm not sure what you are saying. We have already seen two arguments for an
upper bound of O(n*log n). Is that a third argument? In particular, I don't
understand what you mean by "The average prime from this set will be O(1)".
(Every set of (distinct) positive integers (prime or not) with at least k
elements has an average of at least k/4, since at least k/2 of its elements are
>= k/2.) Furthermore, the integral of log x is not log x - x, but rather
x * log x - x. Then, you seem to have this number p in the denominator (in
max_num/p) and use p=O(1) to obtain an upper bound on the term. This, however,
is impossible. You would actually need p=Omega(1) (which is obviously
true, in contrast to the O(1)). Also, you cannot just take the average and then
multiply by it as if every element equaled the average.

Or are you trying to give a lower bound? This should also be impossible:
http://wwwhomes.uni-bielefeld.de/achim/prime_sieve.html
claims that the truth is O(n * log log n) which I tend to believe as I already
posted (There are only log n terms in the harmonic series).

Cheers,
Simon

Simon, Jan 9, 2007