How much time does it need to sort 1 million random 64-bit/32-bit integers?

Discussion in 'VHDL' started by Weng Tianxiang, Jul 6, 2006.

  1. Hi,
    Can I find a website that lists the time for sorting 1 million
    64-bit/32-bit random data using the fastest software sorting algorithm
    in a PC?

    Thank you.

    Weng
     
    Weng Tianxiang, Jul 6, 2006
    #1

  2. Weng Tianxiang said:

    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?


    Apparently not.

    Note that, even if you could, such results would be highly
    hardware-dependent (although the /relative/ ordering of various sorting
    algorithms should be fairly consistent on any given platform). PCs vary
    wildly in performance characteristics. The first PC I used was an Atari ST,
    running at - what was it, 8 MHz? The first IBM-compatible I used was an
    8088 running at - er, I forget, but it was in the 8-12MHz range, I think.
    The first PC I /owned/ was an 80386-SX running at 20MHz. Whoa, that baby
    was *fast*. Currently, I'm using a PIII which is probably a brazillion
    times faster than the 386. The term "a PC" doesn't mean a lot when it comes
    to benchmarking.

    I suggest you just write up the algorithms yourself, or grab them off the
    Net, and try them on your own hardware.

    Beware of the results being corrupted by caching. Do not measure I/O times.
    Do each test a few hundred times.
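
    For concreteness, here is a minimal timing harness of that kind in
    standard C. It is only a sketch: the array size, run count, comparison
    function, and quick-and-dirty random generator are illustrative
    assumptions, and qsort() merely stands in for whatever algorithm you
    want to measure.

        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>

        #define N    1000000   /* items per sort - adjust to taste */
        #define RUNS 100       /* repeat the test many times */

        static int cmp_u32(const void *a, const void *b)
        {
            unsigned long x = *(const unsigned long *)a;
            unsigned long y = *(const unsigned long *)b;
            return (x > y) - (x < y);
        }

        int main(void)
        {
            unsigned long *data = malloc(N * sizeof *data);
            unsigned long *work = malloc(N * sizeof *work);
            clock_t elapsed = 0;
            size_t i;
            int run;

            if (data == NULL || work == NULL)
                return EXIT_FAILURE;

            for (i = 0; i < N; i++)   /* generate once, outside the timed region */
                data[i] = ((unsigned long)rand() << 16) ^ (unsigned long)rand();

            for (run = 0; run < RUNS; run++) {
                clock_t start;
                memcpy(work, data, N * sizeof *data);  /* fresh unsorted copy */
                start = clock();                       /* time only the sort */
                qsort(work, N, sizeof *work, cmp_u32);
                elapsed += clock() - start;
            }

            printf("mean: %.3f ms per sort\n",
                   1000.0 * elapsed / RUNS / CLOCKS_PER_SEC);
            free(data);
            free(work);
            return 0;
        }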

    --
    Richard Heathfield
    "Usenet is a strange place" - dmr 29/7/1999
    http://www.cpax.org.uk
    email: rjh at above domain (but drop the www, obviously)
     
    Richard Heathfield, Jul 6, 2006
    #2

  3. Weng Tianxiang <> wrote:

    >Can I find a website that lists the time for sorting 1 million
    >64-bit/32-bit random data using the fastest software sorting algorithm


    Weng,

    Just about every "good" sorting algorithm has an execution time that
    scales roughly as N.log(N) with the number of items to be sorted.
    There are, of course, many tradeoffs about speed vs. space
    vs. hardware cost, but for any given algorithm that's not too
    far from right.
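
    To put a rough number on that: for N = 1,000,000, N*log2(N) is about
    1,000,000 * 19.93, i.e. roughly 2*10^7 elementary steps, times whatever
    constant factor the particular algorithm and machine carry.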

    A million data items will fit into main memory easily on a modern PC,
    so there's no need to worry about file I/O time.

    Donald Knuth's book "Sorting and Searching" is about as good as
    it gets on this topic.
    --
    Jonathan Bromley, Consultant

    DOULOS - Developing Design Know-how
    VHDL * Verilog * SystemC * e * Perl * Tcl/Tk * Project Services

    Doulos Ltd., 22 Market Place, Ringwood, BH24 1AW, UK

    http://www.MYCOMPANY.com

    The contents of this message may contain personal views which
    are not the views of Doulos Ltd., unless specifically stated.
     
    Jonathan Bromley, Jul 6, 2006
    #3
  4. Re: How much time does it need to sort 1 million random 64-bit/32-bit integers?

    Weng Tianxiang wrote:
    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?


    That depends on a lot of parameters, but I can give you one datapoint as
    an example:
    - Sorting 16777216 pairwise different 24-bit numbers takes no time at all.
    (There are exactly 2^24 = 16777216 different 24-bit numbers, so such an
    input must contain every one of them once, and the sorted output is
    simply 0, 1, 2, ..., 2^24-1 regardless of the input order.)

    Kolja Sulimma
     
    Kolja Sulimma, Jul 6, 2006
    #4
  5. Hi,
    Actually I once read an article on a website that listed the timings
    measured by the author, but I couldn't find it again.

    For example, he listed the times alongside the number of data items,
    the clock frequency of his computer, and all the related parameters
    that may affect performance; he also listed the performance of
    different algorithms for comparison and drew some conclusions.

    Thank you.

    Weng
     
    Weng Tianxiang, Jul 6, 2006
    #5
  6. Weng Tianxiang

    John_H Guest

    "Weng Tianxiang" <> wrote in message
    news:...
    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?
    >
    > Thank you.
    >
    > Weng


    Since you posted on fpga and vhdl newsgroups in addition to programming,
    are you interested in an FPGA solution? Most of the information that one
    *can* find on the web deals with serial implementations. Parallel
    implementations for an FPGA, such as the bitonic sort, can speed things
    up significantly.
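
    For reference, the compare-exchange structure of the bitonic sort is
    easy to sketch; this illustrative serial C version is not an FPGA
    implementation, but it shows that every compare-exchange within one
    (k, j) stage touches a disjoint pair of elements, which is exactly what
    a parallel FPGA implementation exploits.

        #include <stddef.h>

        /* Bitonic sort; n must be a power of two. All compare-exchanges
           in the innermost loop of one (k, j) stage touch disjoint pairs,
           so in hardware they can all run in the same clock. */
        static void bitonic_sort(unsigned int *a, size_t n)
        {
            size_t k, j, i;
            for (k = 2; k <= n; k <<= 1)          /* size of runs being built */
                for (j = k >> 1; j > 0; j >>= 1)  /* compare distance */
                    for (i = 0; i < n; i++) {
                        size_t l = i ^ j;         /* partner index */
                        if (l > i) {
                            int up = ((i & k) == 0);   /* direction of this run */
                            if (up ? (a[i] > a[l]) : (a[i] < a[l])) {
                                unsigned int t = a[i];
                                a[i] = a[l];
                                a[l] = t;
                            }
                        }
                    }
        }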
     
    John_H, Jul 6, 2006
    #6
  7. Weng Tianxiang

    JJ Guest

    Weng Tianxiang wrote:
    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?
    >
    > Thank you.
    >
    > Weng


    A post to c.p, c.a.f, c.l.v for 3 different sets of answers.

    Wikipedia is probably a good place to start for a software-only solution.

    It largely depends on overall memory performance, since very little
    work is done for the cost of each memory access. It also depends on
    what you really want to sort for: whether the data is already mostly
    sorted or always randomly ordered, and whether you are only counting
    frequencies rather than doing a true sort.

    Research quicksort and mergesort, plus radix sort and counting sorts
    for the larger cases, and bubble or insertion sort for the tiny cases.
    If you have Knuth's 3-volume set, that will help; the material has
    been repeated many times in every CS text. Most give you the answer in
    terms of ideal machines using big-O notation. Knuth's reference books
    were written in the 60s, when machines ran below 1 MIPS with memory in
    the same ballpark as the CPU, so no caches.

    For the PC case:
    For very small sorts, say up to many thousands of items, the entire
    sort can be done from the L1 and L2 caches, and is likely to be a good
    test of the CPU and cache system. Bigger sorts can use these smaller
    sorts in succession with some sort of merging. 1 million random items
    isn't even remotely going to take longer than a blink, since L2 caches
    are big enough to hold a good fraction of the 1M dataset.

    On any *nix system you already have a basic quicksort on the command
    line, but IIRC its performance is not stable for data that is already
    nearly sorted. If you code up quicksort, you can randomize the input
    order to keep quicksort at its expected O(N log N) time; otherwise it
    can degrade badly. Mergesort always has a larger constant on its
    O(N log N) time, since it does more work, but it doesn't degrade with
    data patterns. A quicksort with pre-randomization probably comes out
    even.
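
    A sketch of that pre-randomization in C: a Fisher-Yates shuffle run
    over the keys before quicksort proper (rand() stands in for a better
    generator).

        #include <stdlib.h>

        /* Fisher-Yates shuffle. Randomizing the input order first keeps
           quicksort at its expected O(N log N) time regardless of any
           pre-existing order, e.g. nearly sorted input. */
        static void shuffle(unsigned long *a, size_t n)
        {
            size_t i;
            for (i = n - 1; i > 0; i--) {
                size_t j = (size_t)((double)rand() / ((double)RAND_MAX + 1.0)
                                    * (double)(i + 1));
                unsigned long t = a[i];
                a[i] = a[j];
                a[j] = t;
            }
        }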

    Now if computers still had a form of memory with the performance of
    SRAM but the cost and size of DRAM, the answer would be much easier to
    predict, and radix sort would come out handily, since its sort time
    follows (number of partition passes over the word) * (number of
    items). Most radix sorts use byte digits, so that would mean 4 passes
    of testing a byte and writing each item to 1 of 256 respective
    buckets, which can get very large, especially if the data is mostly
    the same. If you have enough of this flat memory to hold the data on
    both input and output, then you can do it in 2 passes with 65536
    buckets; this will break modern PC caches though.
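
    A sketch of that byte-wise radix sort for 32-bit keys, as a plain C
    function (the scratch array plays the role of the 256 buckets; the
    digit width is fixed at 8 bits here purely for illustration):

        #include <stdlib.h>
        #include <string.h>

        /* LSD radix sort on 32-bit keys: four passes, one per byte.
           Each pass is a histogram of the current byte, a prefix sum
           turning counts into bucket end offsets, and a stable
           back-to-front scatter into a scratch array. */
        static void radix_sort_u32(unsigned int *a, size_t n)
        {
            unsigned int *tmp = malloc(n * sizeof *tmp);
            size_t count[256];
            size_t i;
            int pass;

            if (tmp == NULL)
                return;
            for (pass = 0; pass < 4; pass++) {
                int shift = pass * 8;
                memset(count, 0, sizeof count);
                for (i = 0; i < n; i++)            /* histogram this byte */
                    count[(a[i] >> shift) & 0xFF]++;
                for (i = 1; i < 256; i++)          /* bucket end offsets */
                    count[i] += count[i - 1];
                for (i = n; i-- > 0; )             /* stable scatter */
                    tmp[--count[(a[i] >> shift) & 0xFF]] = a[i];
                memcpy(a, tmp, n * sizeof *a);
            }
            free(tmp);
        }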

    Since most PCs don't have SRAM-like memory, they simulate it with a
    cache hierarchy, and the answer gets more complicated to predict.
    Fully random uncached accesses into PC main memory are surprisingly
    slow, towards several hundred ns. That's called the memory wall.

    If you sort N randomly ordered items and sweep N from very small to
    large to enormous, quicksort and the others follow O(N log N) time,
    but the constant factor takes abrupt steps upward as the hardware
    works ever harder to simulate flat memory access times. IIRC the
    constant can vary by over a factor of 10; it isn't anywhere near fixed
    except on ideal machines with something like RLDRAM or infinitely big
    caches. I believe that mergesort has a much more consistent constant.



    For the FPGA case:
    Each compare-and-swap could be done as an atomic operation needing 2
    reads and possibly 2 conditional writes in merge sort. With a
    dual-port memory that might take 1 or 2 clocks in pipelined fashion
    for sorts that fit into BlockRAM. And if the data is spread over many
    sort engines, the parallelism may make up for the much lower clock.

    I might favor a 3-pass radix sort, since you could effectively use
    BlockRAMs as temporary buckets. Basically, stream all the DRAM values
    through to 2048 distinct buckets, keyed in turn on the upper, middle,
    and lower 11-bit fields. As each bucket fills, save it back to DRAM
    and reset it to empty. The same can be done more slowly with 4 passes
    and 256 buckets, or even faster with 2 passes and 65536 buckets on a
    really big FPGA. The sort time will approach 6M (or respectively 8M,
    4M) memory cycles at the fastest possible rate for just the keys. Some
    complexity comes in managing the buckets in the output stream into
    seemingly contiguous buckets.

    If you are just doing a counting sort of values limited to x distinct
    values, things get much simpler: just funnel a memory read stream
    through x counters. Does x have to be 2^32?
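
    In software terms, that counter sort is a few lines of C; this sketch
    assumes the values are limited to x = 65536 (an assumption purely for
    illustration, so the counters fit comfortably in memory):

        #include <stddef.h>
        #include <string.h>

        /* Counting ("counter") sort for keys limited to 65536 distinct
           values: one pass funnels the read stream through the counters,
           a second pass over the counters emits the keys in order. */
        static void counting_sort_u16(unsigned short *a, size_t n)
        {
            static size_t count[65536];   /* one counter per possible value */
            size_t i, v, out = 0;

            memset(count, 0, sizeof count);
            for (i = 0; i < n; i++)
                count[a[i]]++;
            for (v = 0; v < 65536; v++)
                for (i = 0; i < count[v]; i++)
                    a[out++] = (unsigned short)v;
        }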


    John Jakson
    transputer guy
     
    JJ, Jul 6, 2006
    #7
  8. Weng Tianxiang

    Joe Wright Guest

    Re: How much time does it need to sort 1 million random 64-bit/32-bit integers?

    Weng Tianxiang wrote:
    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?
    >
    > Thank you.
    >
    > Weng
    >

    About a half second on my machine. Is that what you really wanted to know?


    --
    Joe Wright
    "Everything should be made as simple as possible, but not simpler."
    --- Albert Einstein ---
     
    Joe Wright, Jul 6, 2006
    #8
  9. Hi,
    What I really want to know is whether the following formula for the
    best sorting time is correct (it is copied from Knuth's "Sorting and
    Searching") and not too far away from PC reality:
    14.5* N * (lg N).

    There are about 20 algorithms, and I really don't know what formula
    should be selected as representative of the best software algorithm
    running on 1 million random data items.

    14.5* N * (lg N) is likely the best one.

    Thank you.

    Weng
     
    Weng Tianxiang, Jul 6, 2006
    #9
  10. Weng Tianxiang

    Oliver Wong Guest

    "Weng Tianxiang" <> wrote in message
    news:...
    > Hi,
    > Can I find a website that lists the time for sorting 1 million
    > 64-bit/32-bit random data using the fastest software sorting algorithm
    > in a PC?


    Here's a ballpark figure:

    Let n = 1000000

    n*log2(n) / 3 GHz = 6.64385619 milliseconds

    - Oliver
     
    Oliver Wong, Jul 6, 2006
    #10
  11. Weng Tianxiang

    Dann Corbit Guest

    "Weng Tianxiang" <> wrote in message
    news:...
    > Hi,
    > What I really want to know is whether the following formula for the
    > best sorting time is correct (it is copied from Knuth's "Sorting and
    > Searching") and not too far away from PC reality:
    > 14.5* N * (lg N).
    >
    > There are about 20 algorithms, and I really don't know what formula
    > should be selected as representative of the best software algorithm
    > running on 1 million random data items.
    >
    > 14.5* N * (lg N) is likely the best one.


    That will not be optimal.

    Integers would probably be a good candidate for Knuth's 'merge insertion'
    algorithm.

    This article, "Implementing HEAPSORT with (n log n - 0.9n) and QUICKSORT
    with (n log n + 0.2n) comparisons",

    may be of some interest to you. He also supplies the source code and driver
    program for the article. It would be fairly simple to test on your
    platform, I imagine.

    In addition, this algorithm (especially for integer data type) may be of
    special interest for your particular case:
    http://www.nada.kth.se/~snilsson/public/code/nloglogn.c

    Of course, it is a distribution-based sort (a comparison sort operating
    that quickly is provably impossible).

    > Thank you.
    >
    > Weng
    >
     
    Dann Corbit, Jul 6, 2006
    #11
  12. Hi Wong,
    n*log2(n) is roughly the minimum number of comparisons that a
    comparison-based sorting algorithm must perform.

    On top of each comparison there is overhead: running the loop and
    swapping the compared data into place.

    That is the reason why the formula has a 14.5 coefficient.

    Weng

    Oliver Wong wrote:
    > "Weng Tianxiang" <> wrote in message
    > news:...
    > > Hi,
    > > Can I find a website that lists the time for sorting 1 million
    > > 64-bit/32-bit random data using the fastest software sorting algorithm
    > > in a PC?

    >
    > Here's a ballpark figure:
    >
    > Let n = 1000000
    >
    > n*log2(n) / 3 GHz = 6.64385619 milliseconds
    >
    > - Oliver
     
    Weng Tianxiang, Jul 7, 2006
    #12
  13. Weng Tianxiang

    Logan Shaw Guest

    Re: How much time does it need to sort 1 million random 64-bit/32-bit integers?

    Weng Tianxiang wrote:
    > What I really want to know is whether the following formula for the
    > best sorting time is correct (it is copied from Knuth's "Sorting and
    > Searching") and not too far away from PC reality:
    > 14.5* N * (lg N).


    No, the formula is not correct. The formula does not have any units;
    therefore, the constant factor is meaningless and the formula cannot
    be correct.

    Also, it's not clear whether lg() is a natural logarithm (the usual
    symbol for that is ln()) or a base-2 logarithm or some other base.
    So that adds in a second unknown constant factor.

    Consider, for a moment, if you sort 1 million numbers, each 32 bits
    in size. Should I let N = 1 million? If so, then when I evaluate
    that formula, I get some number like 200324903, or about 200 million.
    But what does this number mean? Is it clock cycles? Microseconds?
    Memory fetches? Microjoules of electrical energy used to power
    the computer while it performs this computation?

    - Logan
     
    Logan Shaw, Jul 7, 2006
    #13
  14. Hi,

    Weng Tianxiang wrote:
    > What I really want to know is whether the following formula for the
    > best sorting time is correct (it is copied from Knuth's "Sorting and
    > Searching") and not too far away from PC reality:
    > 14.5* N * (lg N).
    >
    > There are about 20 algorithms, and I really don't know what formula
    > should be selected as representative of the best software algorithm
    > running on 1 million random data items.
    >
    > 14.5* N * (lg N) is likely the best one.


    That's totally wrong for 1 million 32-bit integers. In a hardware group
    you should be aware of space-time tradeoffs.
    A simple FPGA with enough memory (4G) would need only O(N) to sort that
    many, just by storing each value at the address given by the value
    itself. You may need to be aware of folding two numbers with the same
    value: keeping a counter per value would increase your memory to a
    maximum of 120 Gbit (a 30-bit counter for each of the 2^32 values, a
    2^32 x 30 memory) and would increase your runtime from O(N) to O(N)+1
    for a pipelined lookup.

    Knuth is applicable for really large N, but nowadays we should consider
    1 million numbers small.

    bye Thomas
     
    Thomas Stanka, Jul 7, 2006
    #14
  15. Re: How much time does it need to sort 1 million random 64-bit/32-bit integers?

    Weng Tianxiang wrote:
    > Hi Wong,
    > n*log2(n) is roughly the minimum number of comparisons that a
    > comparison-based sorting algorithm must perform.


    Forget about comparisons on PC hardware. You are interested in the
    number of loads/stores, because these are a lot slower than comparisons.
    A typical comparison sort will have two loads and two stores per
    comparison.

    Radix sort with 11-bit digits requires three O(N) passes, for a total of
    about 12M loads and 6M stores.
    If your cache is big enough you could also do only two passes using
    16-bit digits.
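
    (As a quick check of those pass counts against the key width: three
    11-bit digits cover 33 >= 32 bits, and two 16-bit digits cover exactly
    32 bits, so three and two passes respectively suffice for 32-bit keys.)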

    Kolja Sulimma
     
    Kolja Sulimma, Jul 7, 2006
    #15
  16. Weng Tianxiang

    Guest

    Weng Tianxiang wrote:
    > Hi,
    > What I really want to know is whether the following formula for the
    > best sorting time is correct (it is copied from Knuth's "Sorting and
    > Searching") and not too far away from PC reality:
    > 14.5* N * (lg N).
    >
    > There are about 20 algorithms, and I really don't know what formula
    > should be selected as representative of the best software algorithm
    > running on 1 million random data items.
    >
    > 14.5* N * (lg N) is likely the best one.



    If you're referring to the table I think you are, you're completely
    misinterpreting it. Knuth compares a number of different internal sort
    algorithms, and gives average and worst case timing formulas based on
    MIX assembler implementations of the algorithms.

    So you can, at best, get estimates of the average and maximum runtimes
    for a MIX implementation. And then you need to define "best." The
    table shows a number of algorithms that have better time complexity
    than (14.5n*lg n), although they may impose requirements on the input
    data or require extra space. Do you define best as average or maximum
    run time, stable or not, using minimum memory or not...?

    As others have pointed out, there are plenty of sorting algorithms that
    are not bound by the (n*lg n) canard; it's just that they tend not to
    be as useful as the old standbys.

    FWIW, for 32- or 64-bit integers, some extra memory (a bit more than
    double what you need to hold the data) and a high-order radix sort will
    trivially lead to O(N). Is that "best" in some scenario?
     
    , Jul 7, 2006
    #16
  17. Hi Kolja,
    I am talking about real-world measurements, not theoretical ones:
    1. Input data are random;
    2. Input data are 32-bit or 64-bit;
    3. Sorting must be stable.

    Radix sort cannot be applied to my case: how can you sort by 11 bits
    for a data set of 32 bits? You can imagine a set of data all with the
    same lowest 11 bits, but with different upper 21 bits.

    Now I have a clearer idea of what I really want:
    As indicated by someone above, 14.5*N*(lg N) is for the MIX machine
    referenced by Knuth and is not useful for real-world PC computation.

    So now I want to know the coefficient in the following formula for the
    real PC world, replacing Knuth's formula:
    Coefficient * N * (lg N) for a real PC environment.

    PC clock frequency and cache size are not my concern, because once a
    coefficient is given, a PC frequency and cache size may be appended to
    the result, and everyone knows their relations.

    If a table of coefficients were available, based on frequency and cache
    size, it would be convenient for everyone to get a sense of how fast a
    sorting algorithm is.

    Thank you.

    Weng
     
    Weng Tianxiang, Jul 7, 2006
    #17
  18. Logan Shaw wrote:
    > No, the formula is not correct. The formula does not have any units;
    > therefore, the constant factor is meaningless and the formula cannot
    > be correct.


    Maybe you should google a bit for Landau symbols and complexity
    theory.
    In general complexity theory, constants are ignored, but for more
    detailed (or practically oriented) inspections they are still used.

    bye Thomas
     
    Thomas Stanka, Jul 7, 2006
    #18
  19. Weng Tianxiang

    Oliver Wong Guest

    "Thomas Stanka" <> wrote in message
    news:...
    >
    > Logan Shaw wrote:
    >> No, the formula is not correct. The formula does not have any units;
    >> therefore, the constant factor is meaningless and the formula cannot
    >> be correct.

    >
    > Maybe you should google a bit for Landau symbols and complexity
    > theory.
    > In general complexity theory, constants are ignored, but for more
    > detailed (or practically oriented) inspections they are still used.


    I think the point Logan is making is that the original question was "How
    long will this algorithm take to run?" and someone else said (essentially)
    "200 million." To which Logan responds "200 million what? Milliseconds?
    Clock cycles? Something else?"

    Therefore, "200 million" is an "incorrect" answer in that it does not
    actually answer the question asked.

    - Oliver
     
    Oliver Wong, Jul 7, 2006
    #19
  20. Re: How much time does it need to sort 1 million random 64-bit/32-bit integers?

    Weng Tianxiang wrote:
    > Hi Kolja,
    > I am talking about real-world measurements, not theoretical ones:
    > 1. Input data are random;
    > 2. Input data are 32-bit or 64-bit;
    > 3. Sorting must be stable.
    >
    > Radix sort cannot be applied to my case: how can you sort by 11 bits
    > for a data set of 32 bits? You can imagine a set of data all with the
    > same lowest 11 bits, but with different upper 21 bits.


    Maybe this would be the time to read up on how radix sort works?
    Hint: Try Google.

    Radix sort is stable. It can be applied to all sorting problems; it
    just might not be the fastest one. E.g., unlike other methods, it will
    run at half the speed for 64-bit as for 32-bit keys.
    Hint: In my post I spoke about three passes with 11 bits or two passes
    with 16 bits. How does this relate to the 32 bits of the values?


    > PC clock frequency and cache size are not my concern, because once a
    > coefficient is given, a PC frequency and cache size may be appended to
    > the result, and everyone knows their relations.

    Definitely not.
    Cache access time is access-pattern dependent.

    Kolja Sulimma
     
    Kolja Sulimma, Jul 7, 2006
    #20
