[Algorithm] Sum of Primes < 1000000

Discussion in 'Java' started by Luc The Perverse, Jan 5, 2007.

  1. I was messing around on a site where you answer questions for points. (No
    prizes or anything)

    Anyway, they say our programs *should* execute in less than 1 minute if well
    formed.

    Mine didn't - it was less than 10 minutes though and I got the right answer.

    I was supposed to sum all primes less than 1 million.

    Is there an algorithm out there, which I will be able to understand, which
    works significantly faster than this one?

    //import java.lang.*;
    import java.util.*;
    import java.math.*;

    public class prob10{
        public static void main(String[] x){
            if(x.length<1){
                System.out.println("enter a number");
                return;
            }

            int n = Integer.parseInt(x[0]);
            int c = 0;
            int[] pc = new int[n];  // primes found so far
            int[] p2 = new int[n];  // p2[i] counts down to the next multiple of pc[i]
            if(n<=1){
                System.out.println(2);
                return;
            }
            c++;
            pc[0] = 2;
            p2[0] = 2;
            int cn = 1;
            long g = 2;
            while(c<n){
                cn+=2;
                boolean Prime = true;
                for(int i=1;i<c;i++){
                    p2[i] -= 2;
                    if(p2[i] == 0)
                        Prime = false;
                    if(p2[i] < 1)
                        p2[i] += pc[i];
                }
                if(Prime){
                    pc[c] = cn;
                    p2[c++] = cn;
                    if((c%2048) == 0)
                        System.out.println(cn);
                    if(cn<1000000)
                        g+=cn;
                    else{
                        System.out.println("Yay 1 million reached!");
                        break;
                    }
                }
            }
            System.out.println("Grand Total " + g);
            System.out.println(pc[n-1]);
        }
    }

    I specifically tried to avoid ever dividing or modding (except in my status
    reporter, but that only checks when a prime has been found).

    I "count down" until the next multiple of a prime is detected.

    The algorithm works . . . but it is n^2 complexity, and kinda sucks.

    --
    LTP

    :)
     
    Luc The Perverse, Jan 5, 2007
    #1

  2. Luc The Perverse

    Simon Guest

    Luc The Perverse schrieb:
    > I was messing around on a site where you answer questions for points. (No
    > prizes or anything)
    >
    > Anyway, they say our programs *should* execute in less than 1 minute if well
    > formed.
    >
    > Mine didn't - it was less than 10 minutes though and I got the right answer.
    >
    > I was supposed to sum all primes less than 1 million.
    >
    > Is there an algorithm out there, which I will be able to understand, which
    > works significantly faster than this one?


    You may want to have a look at the Sieve of Eratosthenes (and its successors)
    (see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
    to n in time n log n. On my computer my implementation takes 150 milliseconds
    for n=1000000.

    > I specifically tried to avoid ever dividing or modding (except in my status
    > reporter, but that only checks when a prime has been found)


    I doubt that such micro-optimisations are of much use here.

    Cheers,
    Simon
     
    Simon, Jan 5, 2007
    #2

  3. Luc The Perverse wrote:
    > I was messing around on a site where you answer questions for points. (No
    > prizes or anything)
    >
    > Anyway, they say our programs *should* execute in less than 1 minute if well
    > formed.
    >
    > Mine didn't - it was less than 10 minutes though and I got the right answer.
    >
    > I was supposed to sum all primes less than 1 million.
    >
    > Is there an algorithm out there, which I will be able to understand, which
    > works significantly faster than this one?

    ....

    The usual algorithm is the Sieve of Eratosthenes,
    http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes

    The sieve is O(n*n), but with small constant factor.

    Patricia
     
    Patricia Shanahan, Jan 5, 2007
    #3
  4. Patricia Shanahan wrote:
    > Luc The Perverse wrote:
    >> I was messing around on a site where you answer questions for points.
    >> (No prizes or anything)
    >>
    >> Anyway, they say our programs *should* execute in less than 1 minute
    >> if well formed.
    >>
    >> Mine didn't - it was less than 10 minutes though and I got the right
    >> answer.
    >>
    >> I was supposed to sum all primes less than 1 million.
    >>
    >> Is there an algorithm out there, which I will be able to understand,
    >> which works significantly faster than this one?

    > ...
    >
    > The usual algorithm is the Sieve of Eratosthenes,
    > http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
    >
    > The sieve is O(n*n), but with small constant factor.


    I was wrong about the complexity. I was not allowing for only doing the
    inner loop for primes.

    Patricia
     
    Patricia Shanahan, Jan 5, 2007
    #4
  5. "Patricia Shanahan" <> wrote in message
    news:Cernh.6752$...
    > Patricia Shanahan wrote:
    >> Luc The Perverse wrote:
    >>> I was messing around on a site where you answer questions for points.
    >>> (No prizes or anything)
    >>>
    >>> Anyway, they say our programs *should* execute in less than 1 minute if
    >>> well formed.
    >>>
    >>> Mine didn't - it was less than 10 minutes though and I got the right
    >>> answer.
    >>>
    >>> I was supposed to sum all primes less than 1 million.
    >>>
    >>> Is there an algorithm out there, which I will be able to understand,
    >>> which works significantly faster than this one?

    >> ...
    >>
    >> The usual algorithm is the Sieve of Eratosthenes,
    >> http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes
    >>
    >> The sieve is O(n*n), but with small constant factor.

    >
    > I was wrong about the complexity. I was not allowing for only doing the
    > inner loop for primes.


    Looks like I had better check this out!

    hehe cool

    Thanks guys! (And I mean Simon too)

    --
    LTP

    :)
     
    Luc The Perverse, Jan 5, 2007
    #5
  6. Simon wrote:
    > Luc The Perverse schrieb:
    >> I was messing around on a site where you answer questions for points. (No
    >> prizes or anything)
    >>
    >> Anyway, they say our programs *should* execute in less than 1 minute if well
    >> formed.
    >>
    >> Mine didn't - it was less than 10 minutes though and I got the right answer.
    >>
    >> I was supposed to sum all primes less than 1 million.
    >>
    >> Is there an algorithm out there, which I will be able to understand, which
    >> works significantly faster than this one?

    >
    > You may want to have a look at the Sieve of Eratosthenes (and its successors)
    > (see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
    > to n in time n log n. On my computer my implementation takes 150 milliseconds
    > for n=1000000.


    As a matter of curiosity, did you use a boolean[] or BitSet?

    I started with a boolean[], and mine was slower than yours, about 200
    milliseconds. It dropped to under 100 milliseconds when I switched to
    BitSet. I believe the improvement is due to more efficient caching.

    Patricia
     
    Patricia Shanahan, Jan 5, 2007
    #6
  7. Luc The Perverse

    Daniel Pitts Guest

    Re: Sum of Primes < 1000000

    Patricia Shanahan wrote:
    > Simon wrote:
    > > Luc The Perverse schrieb:
    > >> I was messing around on a site where you answer questions for points. (No
    > >> prizes or anything)
    > >>
    > >> Anyway, they say our programs *should* execute in less than 1 minute if well
    > >> formed.
    > >>
    > >> Mine didn't - it was less than 10 minutes though and I got the right answer.
    > >>
    > >> I was supposed to sum all primes less than 1 million.
    > >>
    > >> Is there an algorithm out there, which I will be able to understand, which
    > >> works significantly faster than this one?

    > >
    > > You may want to have a look at the Sieve of Eratosthenes (and its successors)
    > > (see http://en.wikipedia.org/wiki/Sieve_of_Eratosthenes). It finds all primes up
    > > to n in time n log n. On my computer my implementation takes 150 milliseconds
    > > for n=1000000.

    >
    > As a matter of curiosity, did you use a boolean[] or BitSet?
    >
    > I started with a boolean[], and mine was slower than yours, about 200
    > milliseconds. It dropped to under 100 milliseconds when I switched to
    > BitSet. I believe the improvement is due to more efficient caching.
    >
    > Patricia


    I personally got 66ms from:

    public class SumPrimes {
        public static void main(String... args) {
            final int bounds = Integer.parseInt(args[0]) + 1;
            final java.util.BitSet composits = new java.util.BitSet(bounds);
            long sum = 0;
            // nextClearBit skips the marked composites, so prime walks the primes in order
            for (int prime = 2; prime < bounds; prime = composits.nextClearBit(prime + 1)) {
                sum += prime;
                for (int composit = prime + prime; composit < bounds; composit += prime) {
                    composits.set(composit);
                }
            }
            System.out.println(sum);
        }
    }
     
    Daniel Pitts, Jan 5, 2007
    #7
  8. Luc The Perverse

    MikeNereson Guest

    Re: Sum of Primes < 1000000

    Your original algorithm worked for me in 141ms. I made a couple
    seemingly minor changes. Can you find them? Why are they so important?

    http://pastebin.com/852736

    OUTPUT:
    Yay 1 million reached!
    Grand Total 797162
    0
    only took 141 ms
     
    MikeNereson, Jan 6, 2007
    #8
  9. Luc The Perverse

    MikeNereson Guest

    Re: Sum of Primes < 1000000

    Whoops, I didn't validate that output. My bad. The bitset solution is
    great.


    MikeNereson wrote:
    > Your original algorithm worked for me in 141ms. I made a couple
    > seemingly minor changes. Can you find them? Why are they so important?
    >
    > http://pastebin.com/852736
    >
    > OUTPUT:
    > Yay 1 million reached!
    > Grand Total 797162
    > 0
    > only took 141 ms
     
    MikeNereson, Jan 6, 2007
    #9
  10. Luc The Perverse

    Chris Uppal Guest

    Re: Sum of Primes < 1000000

    MikeNereson wrote:

    > Yay 1 million reached!
    > Grand Total 797162


    Seems unlikely to be the correct answer, since 999983 is just one of the
    primes in that range...

    -- chris
     
    Chris Uppal, Jan 6, 2007
    #10
  11. Luc The Perverse

    Abhi Guest

    Re: Sum of Primes < 1000000

    Hi Luc,
    I found the primes < 1000000, stored them in an array, and summed them up.
    I am pasting my code below; please have a look. It took 1782 msecs on a
    P4 2.4 GHz processor with 512 MB RAM.
    If this code is not ok then let me know :)

    /*
     * Created on Jan 7, 2007
     *
     * To change the template for this generated file go to
     * Window>Preferences>Java>Code Generation>Code and Comments
     */
    package src;

    //import java.lang.Math;
    /**
     * @author Jboss
     * calculate sum of prime nos less than 1 million
     */
    public final class Sum_Prime {

        private static int currNo;
        private static int currSqrt;
        private static int[] primeArr;
        private static long currSum = 0;
        private static final int range = 1000000; // calc till 1000000
        private static long startTime;

        public static void main(String[] args) {
            //Math.sqrt()
            int currArrSize = range / 20; // holds the curr size of the array
            if (currArrSize == 0) {
                primeArr = new int[10];
                currArrSize = 10;
            } else
                primeArr = new int[currArrSize];

            startTime = System.currentTimeMillis();
            int k = 0; // index of the array which stores the prime nos.

            currNo = 2; // start looking from 2
            System.out.print("\n arr size \t" + primeArr.length);
            for (int i = 2; i <= range; i++) {
                boolean isPrime = true;
                currSqrt = (int) Math.sqrt(currNo);
                for (int j = 2; j <= currSqrt; j++) // chk for 3 whose sqrt is 1.5....
                {
                    if (currNo % j == 0) {
                        isPrime = false;
                        break;
                    }
                }
                if (isPrime) {
                    try {
                        primeArr[k] = currNo;
                        currSum = currSum + currNo;
                        k++;
                    } catch (ArrayIndexOutOfBoundsException arrex) {
                        currArrSize = currArrSize * 2;
                        int[] tempArr = new int[currArrSize];
                        System.arraycopy(primeArr, 0, tempArr, 0, k); // copies the old arr into new one
                        primeArr = tempArr;
                        //System.out.print("\n primeArr size \t" + primeArr.length);
                        //System.out.print("\n tempArr size \t" + tempArr.length);
                        primeArr[k] = currNo;
                        currSum = currSum + currNo;
                        System.out.print("\n new arr size \t" + primeArr.length);
                        k++;
                    }
                }
                currNo++;
            }
            System.out.print("\n sum \t" + currSum);
            System.out.print("\n curr val \t" + currNo);
            System.out.print("\n nos found \t" + k);
            System.out.print(
                "\n Time taken to execute in millisecs \t"
                + (System.currentTimeMillis() - startTime));
            System.exit(0);
        }
    }
    Tell me the name of the site where you saw this.

    Luc The Perverse wrote:
    > I was messing around on a site where you answer questions for points. (No
    > prizes or anything)
    >
    > Anyway, they say our programs *should* execute in less than 1 minute if well
    > formed.
    >
    > Mine didn't - it was less than 10 minutes though and I got the right answer.
    >
    > I was supposed to sum all primes less than 1 million.
    >
    > Is there an algorithm out there, which I will be able to understand, which
    > works significantly faster than this one?
    >
    > //import java.lang.*;
    > import java.util.*;
    > import java.math.*;
    >
    > public class prob10{
    > public static void main(String[] x){
    > if(x.length<1){
    > System.out.println("enter a number");
    > return;
    > }
    >
    > int n = Integer.parseInt(x[0]);
    > int c = 0;
    > int[] pc = new int[n];
    > int[] p2 = new int[n];
    > if(n<=1){
    > System.out.println(2);
    > return;
    > }
    > c++;
    > pc[0] = 2;
    > p2[0] = 2;
    > int cn = 1;
    > long g = 2;
    > while(c<n){
    > cn+=2;
    > boolean Prime = true;
    > for(int i=1;i<c;i++){
    > p2[i] -= 2;
    > if(p2[i] == 0)
    > Prime = false;
    > if(p2[i] < 1)
    > p2[i] += pc[i];
    > }
    > if(Prime){
    > pc[c] = cn;
    > p2[c++] = cn;
    > if((c%2048) == 0)
    > System.out.println(cn);
    > if(cn<1000000)
    > g+=cn;
    > else{
    > System.out.println("Yay 1 million reached!");
    > break;
    > }
    >
    > }
    > }
    > System.out.println("Grand Total " + g);
    > System.out.println(pc[n-1]);
    > }
    >
    > }
    >
    > I specifically tried to avoid ever dividing or modding (except in my status
    > reporter, but that only checks when a prime has been found)
    >
    > I "count down" until the next multiple a prime is detected.
    >
    > The algorithm works . . but it is n^2 complexity, and kinda sucks.
    >
    > --
    > LTP
    >
    > :)
     
    Abhi, Jan 8, 2007
    #11
  12. Luc The Perverse

    Simon Guest

    Patricia Shanahan schrieb:
    > Simon wrote:
    > As a matter of curiosity, did you use a boolean[] or BitSet?
    >
    > I started with a boolean[], and mine was slower than yours, about 200
    > milliseconds. It dropped to under 100 milliseconds when I switched to
    > BitSet. I believe the improvement is due to more efficient caching.


    Well, that is strange. I have been using a boolean[], and using BitSet actually
    slows it down on my machine. I am using a Laptop, so CPU speed may vary. Today
    it is even faster than yesterday. The output was (code see below):

    Time : 42ms
    Sum : 37551402022

    Apart from that, from your other post I saw that you have used a different
    argument to obtain the n log n upper bound.

    - Your argument: There are only log n primes below n and each iteration takes
    time at most n.
    - My argument: Consider the iteration in which we erase prime i. We erase n/i
    composite numbers. In total, we need time

    sum(i=1..n) n/i
    = n * H(n)
    = O(n * log n)

    (Here, H(n) is the n-th harmonic number.) Can we combine both arguments to
    obtain an upper bound below n * log n? In my sum, only log n terms are actually
    necessary, as has been argued by Patricia. Unfortunately, the terms that are
    actually present are the larger ones (those for small i). Still, some Googling
    reveals that it is possible to get the time down to n * log log n.
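
    (One way to see the n * log log n bound: by Mertens' second theorem, the
    sum of 1/p over the primes p <= n is ln ln n + O(1), so erasing only for
    the primes costs

    sum(primes p <= n) n/p
    = n * (ln ln n + O(1))
    = O(n * log log n).)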

    Cheers,
    Simon



    This is my code:

    public class Primes {

        private boolean[] numbers;
        private long sum = 0;

        public Primes(int size) {
            numbers = new boolean[size];
        }

        public void eraseComposite() {
            int prime = 2;
            while (prime < numbers.length) {
                sum += prime;
                // mark multiples of prime
                for (int i = 2 * prime; i < numbers.length; i += prime) {
                    // (we could start with prime*prime here, but I didn't bother
                    // since it would be larger than Integer.MAX_VALUE)
                    numbers[i] = true;
                }
                // move forward to next prime
                prime++;
                while (prime < numbers.length-1 && (numbers[prime])) {
                    prime++;
                }
            }
        }

        public static void main(String[] argv) {
            long t = System.currentTimeMillis();
            Primes p = new Primes(1000000);
            p.eraseComposite();
            System.out.println("Time : " + (System.currentTimeMillis() - t) + "ms");
            System.out.println("Sum : " + p.sum);
        }
    }
     
    Simon, Jan 8, 2007
    #12
  13. Simon wrote:
    > Patricia Shanahan schrieb:
    >> Simon wrote:
    >> As a matter of curiosity, did you use a boolean[] or BitSet?
    >>
    >> I started with a boolean[], and mine was slower than yours, about 200
    >> milliseconds. It dropped to under 100 milliseconds when I switched to
    >> BitSet. I believe the improvement is due to more efficient caching.

    >
    > Well, that is strange. I have been using a boolean[], and using BitSet actually
    > slows it down on my machine. I am using a Laptop, so CPU speed may vary. Today
    > it is even faster than yesterday. The output was (code see below):


    The most likely cause of the speed up I saw is more efficient caching,
    because BitSet stores the data more densely. That is a very hardware
    dependent effect, though less a matter of CPU speed than of cache
    configuration.

    > public void eraseComposite() {
    > int prime = 2;
    > while (prime < numbers.length) {
    > sum += prime;
    > // mark multiples of prime
    > for (int i = 2 * prime; i < numbers.length; i += prime) {
    > // (we could start with prime*prime here, but I didn't bother
    > // since it would be larger than Integer.MAX_INTEGER)
    > numbers[i] = true;
    > }
    > // move forward to next prime
    > prime++;
    > while (prime < numbers.length-1 && (numbers[prime])) {
    > prime++;
    > }
    > }
    > }
    >


    There is one optimization you missed. When prime is greater than the
    square root of numbers.length, there is no need to do the composite scan.

    If a number less than 1,000,000 is composite, at least one prime factor
    is less than 1000.
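
    Something like the following sketch (untested, reusing the numbers and sum
    fields from the class above; it also starts marking at prime*prime, which
    is safe once the marking stops at the square root, and "limit" is just an
    illustrative name):

    public void eraseComposite() {
        // no prime above sqrt(numbers.length) needs a marking pass
        int limit = (int) Math.sqrt(numbers.length);
        for (int prime = 2; prime < numbers.length; prime++) {
            if (numbers[prime]) {
                continue;               // already marked composite
            }
            sum += prime;
            if (prime > limit) {
                continue;               // no marking pass needed for large primes
            }
            // mark multiples of prime, starting at prime*prime
            for (int i = prime * prime; i < numbers.length; i += prime) {
                numbers[i] = true;
            }
        }
    }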

    Patricia
     
    Patricia Shanahan, Jan 8, 2007
    #13
  14. Luc The Perverse

    Simon Guest

    Patricia Shanahan schrieb:
    > Simon wrote:
    >> Patricia Shanahan schrieb:
    >>> Simon wrote:
    >>> As a matter of curiosity, did you use a boolean[] or BitSet?
    >>>
    >>> I started with a boolean[], and mine was slower than yours, about 200
    >>> milliseconds. It dropped to under 100 milliseconds when I switched to
    >>> BitSet. I believe the improvement is due to more efficient caching.

    >>
    >> Well, that is strange. I have been using a boolean[], and using BitSet
    >> actually
    >> slows it down on my machine. I am using a Laptop, so CPU speed may
    >> vary. Today
    >> it is even faster than yesterday. The output was (code see below):

    >
    > The most likely cause of the speed up I saw is more efficient caching,
    > because BitSet stores the data more densely. That is a very hardware
    > dependent effect, though less a matter of CPU speed than of cache
    > configuration.
    >
    >> public void eraseComposite() {
    >> int prime = 2;
    >> while (prime < numbers.length) {
    >> sum += prime;
    >> // mark multiples of prime
    >> for (int i = 2 * prime; i < numbers.length; i += prime) {
    >> // (we could start with prime*prime here, but I didn't
    >> bother
    >> // since it would be larger than Integer.MAX_INTEGER)
    >> numbers[i] = true;
    >> }
    >> // move forward to next prime
    >> prime++;
    >> while (prime < numbers.length-1 && (numbers[prime])) {
    >> prime++;
    >> }
    >> }
    >> }
    >>

    >
    > There is one optimization you missed. When prime is greater than the
    > square root of numbers.length, there is no need to do the composite scan.
    >
    > If a number less than 1,000,000 is composite, at least one prime factor
    > is less than 1000.


    This is what I tried to express in the above comment. It was more laziness,
    though, than the reason stated above that prevented me from implementing it :)

    Cheers,
    Simon
     
    Simon, Jan 8, 2007
    #14
  15. Simon wrote:
    > Patricia Shanahan schrieb:
    >> Simon wrote:
    >>> Patricia Shanahan schrieb:
    >>>> Simon wrote:
    >>>> As a matter of curiosity, did you use a boolean[] or BitSet?
    >>>>
    >>>> I started with a boolean[], and mine was slower than yours, about 200
    >>>> milliseconds. It dropped to under 100 milliseconds when I switched to
    >>>> BitSet. I believe the improvement is due to more efficient caching.
    >>> Well, that is strange. I have been using a boolean[], and using BitSet
    >>> actually
    >>> slows it down on my machine. I am using a Laptop, so CPU speed may
    >>> vary. Today
    >>> it is even faster than yesterday. The output was (code see below):

    >> The most likely cause of the speed up I saw is more efficient caching,
    >> because BitSet stores the data more densely. That is a very hardware
    >> dependent effect, though less a matter of CPU speed than of cache
    >> configuration.
    >>
    >>> public void eraseComposite() {
    >>> int prime = 2;
    >>> while (prime < numbers.length) {
    >>> sum += prime;
    >>> // mark multiples of prime
    >>> for (int i = 2 * prime; i < numbers.length; i += prime) {
    >>> // (we could start with prime*prime here, but I didn't
    >>> bother
    >>> // since it would be larger than Integer.MAX_INTEGER)
    >>> numbers[i] = true;
    >>> }
    >>> // move forward to next prime
    >>> prime++;
    >>> while (prime < numbers.length-1 && (numbers[prime])) {
    >>> prime++;
    >>> }
    >>> }
    >>> }
    >>>

    >> There is one optimization you missed. When prime is greater than the
    >> square root of numbers.length, there is no need to do the composite scan.
    >>
    >> If a number less than 1,000,000 is composite, at least one prime factor
    >> is less than 1000.

    >
    > This is what I tried to express in the above comment. It was more laziness than
    > the reason stated above though that prevented me from implementing it :)
    >
    > Cheers,
    > Simon


    Ah, I see. You were thinking of checking by asking whether the square of
    the latest prime is less than numbers.length.

    The way I did it, there is no possibility of overflow - I stored
    Math.sqrt(numbers.length) as an integer. Inside the loop, I just
    compared to the square root.

    Patricia
     
    Patricia Shanahan, Jan 8, 2007
    #15
  16. So to make a long story short, the algorithm is:

    * Make an array of booleans of length max_num - 1, whose zeroth element
    corresponds to 2 and whose (max_num - 2)nd element corresponds to max_num.
    Clear it (all false). (Java clears it for you.)
    * Take the square root of max_num and store it somewhere.
    * for i = 2 to that square root
    * if (array[i - 2]) continue;
    * for j = i*i to max_num step i
    * set array[j - 2]
    * end
    * end

    Note the optimization of starting the inner loop at i*i. All the
    composites smaller than i*i have a prime factor smaller than i and so
    are already marked at that point. Note also skipping the inner loop for
    composite values of i.

    In fact, using the square root of max_num is a less significant
    optimization after the i*i optimization, as it just stops the outer loop
    testing every remaining value for compositeness and then entering the
    inner loop for the primes but without the inner loop actually doing
    anything (as i*i >= max_num the loop terminates immediately).
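
    In Java, a direct (untested) rendering of the above might look like this;
    the summing loop at the end is extra, since the original problem wants the
    sum of the primes, and the names are just illustrative:

    // composite[k] records whether the number (k + 2) is composite
    static long sumPrimesBelow(int max_num) {
        boolean[] composite = new boolean[max_num - 1]; // covers 2 .. max_num, all false
        int root = (int) Math.sqrt(max_num);
        for (int i = 2; i <= root; i++) {
            if (composite[i - 2]) continue;            // skip composite i
            for (int j = i * i; j <= max_num; j += i) {
                composite[j - 2] = true;               // mark multiples of i, starting at i*i
            }
        }
        long sum = 0;
        for (int k = 2; k < max_num; k++) {            // primes strictly below max_num
            if (!composite[k - 2]) sum += k;
        }
        return sum;
    }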

    The algorithm does sqrt(max_num) bit tests (using the sqrt(max_num) opt)
    and sum{primes p up to sqrt(max_num}(max_num/p). The number of primes is
    O(log sqrt(max_num)) from what I've seen of primes, which is O(log
    max_num) since the log of the square root is simply half the log and the
    half disappears into the constant factor. The average prime from this
    set will be O(1) (integral of log x is log x - x, approximates sum of
    primes; divide by log x for 1 - 1/log(x) is O(1)). So the number of bit
    sets is O(max_num log max_num) -- O(max_num) per prime as it's max_num/p
    with average p O(1), and O(log max_num) primes.

    So the whole thing has complexity O(n log n) -- as good as quicksort,
    and without the quadratic worst-case behavior. :)

    Removing the squaring/square rooting optimizations doesn't increase the
    complexity but does increase the running time by a factor of 2 or so.
    Note that square roots are expensive, but the only square root occurs
    outside any loops...i*i is just a multiplication, and multiplication of
    integers is cheap on any modern architecture. The inner loop does one
    imul, three iadds (one's actually the subtract in the loop test and one
    computes the array index), one icmp (the rest of the loop test), and one
    boolean assignment. I expect the compiler to optimize "set array[j - 2]"
    to "*(foo + j) = true" or something equivalent -- i.e. to use a point
    two before the actual start of the array as the base for the pointer
    adds in this loop. Of course, using a BitSet instead of a boolean[] may
    change this a lot. A sensible implementation plus jit compiler will
    probably reduce it to a similar add and a bit mask operation, but I
    expect an idiv will sneak in there to get the array index into the
    underlying array, which may make BitSet worse-performing, and the
    subtraction of 2 definitely has to be done separately. Likely *(foo + (j
    - 2)/64) |= 2^(j - 2) under the hood with 2^(j - 2) sucked out of a
    lookup table, for a grand total of two extra iadds (j - 2 and the lookup
    into the 2^n table), one extra idiv, and two extra fetches (the lookup
    into the 2^n table and retrieving the stored word to |= with something)
    versus storing in a boolean[] which is probably the one add and a flat
    store, without bothering to retrieve whatever was in the array cell
    originally.

    I'd be *very* surprised if BitSet actually comes out faster after you
    run it a few hundred times to get it all JITted and then a few hundred
    more to measure and average.
     
    John Ersatznom, Jan 8, 2007
    #16
  17. John Ersatznom wrote:
    > So to make a long story short, the algorithm is:
    >
    > * Make array of booleans of length max_num - 1, whose zeroth element
    > corresponds to 2 and whose max_num - 2nd element corresponds to maxnum.
    > Clear it (all false). (Java clears it for you if you use Java.)
    > * Take square root of maxnum and store it somewhere.
    > * for i = 2 to that square root
    > * if (array[i - 2]) continue;
    > * for j = i*i to max_num step i
    > * set array[j - 2]
    > * end
    > * end
    >
    > Note the optimization of starting the inner loop at i*i. All the
    > composites smaller than i*i have a prime factor smaller than i and so
    > are already marked at that point. Note also skipping the inner loop for
    > composite values of i.


    What if you used your array to find "open" values at i and above
    and considered only those multiples of i?
     
    Richard F.L.R.Snashall, Jan 8, 2007
    #17
  18. John Ersatznom wrote:
    ....
    > I'd be *very* surprised if BitSet actually comes out faster after you
    > run it a few hundred times to get it all JITted and then a few hundred
    > more to measure and average.


    The potential gain from BitSet is more a matter of hardware than
    anything JIT can affect.

    Especially in the early stages, the sieve has a very cache-unfavorable
    memory access pattern, relatively small stride access to a range of a
    million elements.

    If you use a boolean[], those million elements occupy a million bytes of
    memory, because, at least in the implementations I've examined, each
    boolean in an array gets its own byte.

    BitSet uses a long[], and packs 64 bits into each element, so a million
    elements only take 125,000 bytes of memory.

    That may make a significant difference if at least one cache in the
    system has a size between the two footprints, in the range of hundreds
    of kilobytes to about a megabyte.

    Patricia
     
    Patricia Shanahan, Jan 8, 2007
    #18
  19. Re: Sum of Primes < 1000000

    The sum should be 37550402023 and not 37551402022
    (You're adding 999999 in error).
    Your code should read:

    // move forward to next prime
    prime++;
    while (prime < numbers.length && (numbers[prime])) {
    prime++;
    }

    Regards
     
    Jean-Francois Briere, Jan 8, 2007
    #19
  20. Luc The Perverse

    Simon Guest

    John Ersatznom wrote:
    > The algorithm does sqrt(max_num) bit tests (using the sqrt(max_num) opt)
    > and sum{primes p up to sqrt(max_num}(max_num/p). The number of primes is
    > O(log sqrt(max_num)) from what I've seen of primes, which is O(log
    > max_num) since the log of the square root is simply half the log and the
    > half disappears into the constant factor. The average prime from this
    > set will be O(1) (integral of log x is log x - x, approximates sum of
    > primes; divide by log x for 1 - 1/log(x) is O(1)). So the number of bit
    > sets is O(max_num log max_num) -- O(max_num) per prime as it's max_num/p
    > with average p O(1), and O(log max_num) primes.
    >
    > So the whole thing has complexity O(n log n) -- as good as quicksort,
    > and without the quadratic worst-case behavior. :)


    I'm not sure what you are saying. We have already seen two arguments for an
    upper bound of O(n*log n). Is that a third argument? In particular, I don't
    understand what you mean by "The average prime from this set will be O(1)".
    (Every set of (distinct) positive integers (prime or not) with at least k
    elements has an average of at least k/4 since at least k/2 of its elements are
    >= k/2.) Furthermore, the integral of log x is not log x - x, but rather
    x * log x - x. Then, you seem to have this number p in the denominator (in
    max_num/p) and use p=O(1) to obtain an upper bound on the term. This, however,
    is impossible. You would actually need it to be p=Omega(1) (which is obviously
    true in contrast to the O(1)). Also, you cannot just take the average and then
    multiply with that as if every element would equal the average.

    Or are you trying to give a lower bound? This should also be impossible:
    http://wwwhomes.uni-bielefeld.de/achim/prime_sieve.html
    claims that the truth is O(n * log log n) which I tend to believe as I already
    posted (There are only log n terms in the harmonic series).


    Cheers,
    Simon
     
    Simon, Jan 9, 2007
    #20