About parallel sorting...


A

aminer

Hello,

Look down the the following link , it's about parallel partition:

http://www.redgenes.com/Lecture-Sorting.pdf


I have tried to simulate this parallel partition method ,
but i don't think it will scale cause we have to do a merging,
which essentially is an array-copy operation but this array-copy
operations will be expensive compared to an integer compare
operation that you find inside the partition fuinction, and it will still
be expensive compared to a string compare operation that you find
inside the partition function. So since it's not scaling i have abondoned
the idea to implement this parallel partition method in my parallel
quicksort.

I have also just read the following paper about Parallel Merging:

http://www.economyinformatics.ase.ro/content/EN4/alecu.pdf

And i have implemented this algorithm just to see what is its performance.
and i have noticed that the serial algorithm is 8 times slower than the
merge
function that you find in the serial mergesort algorithm.So 8 times slower,
it's too slow.

So the only way to do a somewhat better parallel sorting algorithm,
it's to use the following algorithm;

http://www.drdobbs.com/parallel/parallel-merge/229204454?queryText=parallel+sort


The idea is simple:

Let's assume we want to merge sorted arrays X and Y. Select X[m]
median element in X. Elements in X[ .. m-1] are less than or equal to
X[m]. Using binary search find index k of the first element in Y greater
than X[m]. Thus Y[ .. k-1] are less than or equal to X[m] as well.
Elements in X[m+1..] are greater than or equal to X[m] and Y[k .. ]
are greater. So merge(X, Y) can be defined as
concat(merge(X[ .. m-1], Y[ .. k-1]), X[m], merge(X[m+1.. ], Y[k .. ]))
now we can recursively in parallel do merge(X[ .. m-1], Y[ .. k-1]) and
merge(X[m+1 .. ], Y[k .. ]) and then concat results.



Thank you,
Amine Moulay Ramdane.
 
Ad

Advertisements

A

aminer

Hello,


There is another parallel method for parallel partition, here it's:

http://www.cs.sunysb.edu/~rezaul/Spring-2012/CSE613/CSE613-lecture-9.pdf

but as you will notice it's still too expensive, causeyou have to create
3 arrays and copy in them:

3. array B[ 0: n ? 1 ], lt[ 0: n ? 1 ], gt[ 0: n ? 1 ]

You can use SIMD instructions on the parallel-prefix-sum function

8. lt [ 0: n ? 1 ] ¬ Par-Prefix-Sum ( lt[ 0: n ? 1 ], + )
:
But the algorithm is still expensive i think on a quad or eight cores or
even
16 cores, you have to have much more than 16 cores to be able to benefit
from this method i think.




Thank you,
Amine Moulay Ramdane..
 
Ad

Advertisements

A

aminer

Hello,

Read this:

"Parallel hybrid merge algorithm was developed that outperformed
sequential simple merge as well as STL merge by 0.9-5.8 times overall
and by over 5 times for larger arrays"

http://www.drdobbs.com/parallel/parallel-merge/229204454?pgno=3


This paper as you have noticed has fogot to tell that this method is
dependant on the distribution of the data

Read for exemple this:

"Select X[m] median element in X. Elements in X[ .. m-1] are less than
or equal to X[m]. Using binary search find index k of the first element in
Y greater than X[m]."

So if "median element:" of X is not near or equal to the "median
element" of Y so this method will have bad parallel performance
and it will not scale.



Merci,
Amine Moulay Ramdamne
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top