Recursion

kokolo

I found an implementation without sub-arrays:
http://www.thescripts.com/forum/thread49795.html
I changed it so it works and tried it on 99999 elements.
It did it in 13 seconds, while my code does it in 14.
How fast can it possibly get?

kokolo
I won't spam on this subject any more. I checked a little and realized
that the code from the link above is
just rewritten C code from
http://linux.wku.edu/~lamonml/algor/sort/quick.html

It doesn't have any Perl flavor and is much harder to read.
Yes, it's faster, but only by 1 second for 100000 elements (13 vs. 14). I like this
one much more for the flavor (push, pop) and readability:
____________________________________________________________________________
print "Enter the number of elements to sort: \n";
chomp( $size = <STDIN> );
foreach (1..$size) { push @array, int(rand(1000)) }

$start = time();
@array = qs(@array);

#print "The sorted array:\n @array \n";
print "$size elements QuickSorted in ", time() - $start, " seconds.\n";

sub qs {
    my @array = @_;
    my $pivot = $#array;
    my @left;
    my @right;

    for ( $i = 0; $i < $pivot; $i++ ) {
        if ( $red[$i] <= $red[$pivot] ) { unshift @left, $red[$i] }
        else                            { push @right, $red[$i] }
    }

    if ( $#left > 0 )  { @left  = qs(@left) }
    if ( $#right > 0 ) { @right = qs(@right) }

    push @left, ( $red[$pivot], @right );
    return @left;
}
____________________________________________________________________________
I appreciate all the advice, the advice about referencing in particular, but I'd
really appreciate it if somebody could help me find out how fast Perl can get
with quicksort.
C does 100000 elements in less than a second. I'm aware Perl has to be
slower and that there are builtins, etc., but I'm not asking for general
advice.
The C-like code does it in 13 seconds. I believe this is far from Perl's limit.
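For a rough reference point, one could time Perl's built-in numeric sort on the
same kind of data, with the same time()-based approach as the script above:

# baseline: built-in numeric sort, timed like the script above
# (coarse, since time() only has one-second resolution)
my @data = map { int(rand(1000)) } 1 .. 100_000;
my $start = time();
my @sorted = sort { $a <=> $b } @data;
print "built-in sort: ", time() - $start, " seconds\n";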

kokolo
 

xhoster

That said, your program does seem to be N**2, rather than NlogN, and I
see no reason that it should be. Even with all the badness built in, it
should still be NlogN, or maybe N(logN)**2, not N**2. I don't quite get
it.

OK, I solved the problem. Your sort has bad worst-case performance (on
an array that is already sorted, reverse sorted, all ties, etc.) because it
falls into an N**2 pattern of just picking off one element per recursion.
Because of the way you generate data, you have only 1000 distinct values,
and therefore you have a huge number of ties. Once all those ties come
together into a subarray, sorting that subarray has bad performance. That
happens over and over again.

I took out the "int" from your data generation, and bang, things got much
better.
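That is, the data-generation line becomes something like this, so the values are
floating point and essentially all distinct:

# rand(1000) without int(): effectively no ties in the test data
foreach (1..$size) { push @array, rand(1000) }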

Xho
 

anno4000

kokolo said:
[quoted post snipped]

Oh puleeze! You posted this code without ever checking it. It isn't
strict-safe (@red and $i are undeclared), and it doesn't work (@array
and @red should be the same array). The way it is, it returns an
array of undefs instead of a sorted list. It is annoying to have
to spend time correcting trivial coding mistakes, time *you* should
have spent before posting.

The code still makes copies (@left, @right) of partial lists and
glues them together after sorting. The hallmark of standard quicksort
implementations is that they sort *in place*, moving elements around
in the original list.

kokolo said:
I appreciate all the advice, the advice about referencing in particular, but I'd
really appreciate it if somebody could help me find out how fast Perl can get
with quicksort.
C does 100000 elements in less than a second. I'm aware Perl has to be
slower and that there are builtins, etc., but I'm not asking for general
advice.
The C-like code does it in 13 seconds.

There's nothing particularly C-like in the implementation.

kokolo said:
I believe this is far from Perl's limit.

Okay, here is an implementation. It is not highly optimized. In
particular it uses an auxiliary sub partition() to do the pivot
operation. That is because I found that step difficult to get
right, so I isolated it.

sub qs {
    my @a = @_;
    qs_recurs( \@a, 0, $#_ );
    @a;
}

sub qs_recurs {
    my ( $a, $from, $to ) = @_;
    return unless $from < $to;
    my $mid = partition( $a, $a->[$from], $from, $to );
    @$a[ $from, $mid ] = @$a[ $mid, $from ];    # move pivot to middle
    qs_recurs( $a, $from, $mid - 1 );           # sort lower part
    qs_recurs( $a, $mid + 1, $to );             # sort upper part
}

sub partition {
    my ( $a, $pivot, $from, $to ) = @_;
    while ( $from <= $to ) {
        if ( $a->[$from] > $pivot ) {
            @$a[ $from, $to ] = @$a[ $to, $from ];
            --$to;
        }
        else {
            ++$from;
        }
    }
    return $to;
}
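For instance, a quick check on a small input:

my @t = qs( 5, 2, 8, 1 );
print "@t\n";    # prints: 1 2 5 8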

Anno
 

kokolo

anno4000 said:
Oh puleeze! You posted this code without ever checking it. It isn't
strict-safe (@red and $i are undeclared), and it doesn't work (@array
and @red should be the same array). The way it is, it returns an
array of undefs instead of a sorted list. It is annoying to have
to spend time correcting trivial coding mistakes, time *you* should
have spent before posting.

I hate myself for quick-posting. I had several versions for testing purposes
and copied a bad one. I'm really sorry;
I'm wasting other people's time that way.

anno4000 said:
The code still makes copies (@left, @right) of partial lists and
glues them together after sorting. The hallmark of standard quicksort
implementations is that they sort *in place*, moving elements around
in the original list.
Okay, here is an implementation. It is not highly optimized. In
particular it uses an auxiliary sub partition() to do the pivot
operation. That is because I found that step difficult to get
right, so I isolated it.

[quoted implementation snipped]

Very, very nice. The sub partition() is the nicest part and is the crucial
improvement.
I was struggling with that part; I didn't want messy code, and I chose to
go with sub-arrays to keep it simple.
Your solution is both elegant and efficient: 100000 elements in 3 seconds.
Thanks a lot for a very educational solution.

kokolo
 

kokolo

xhoster said:
OK, I solved the problem. Your sort has bad worst-case performance (on
an array that is already sorted, reverse sorted, all ties, etc.) because it
falls into an N**2 pattern of just picking off one element per recursion.
Because of the way you generate data, you have only 1000 distinct values,
and therefore you have a huge number of ties. Once all those ties come
together into a subarray, sorting that subarray has bad performance. That
happens over and over again.

I took out the "int" from your data generation, and bang, things got much
better.

Xho

Now, that was remarkable.
I really appreciate that you actually tested the code, gave it some
thought, and found what the problem was.
Now it looks like N log N and is pretty fast.

thx again
kokolo
 

anno4000

[snipped, revised version below]

kokolo said:
Very, very nice. The sub partition() is the nicest part and is the crucial
improvement.

It is also the part that has a serious design flaw. Two, really.
It takes an arbitrary number for the pivot, but quicksort only
works when the pivot is a known element of the unsorted array.
Also, the code in qs_recurs() relies on the fact that the pivot
element won't change its position at the beginning of the array
during partitioning. That's difficult to guarantee in general and
also nails the choice of the pivot to being the first element.

kokolo said:
I was struggling with that part; I didn't want messy code, and I chose to
go with sub-arrays to keep it simple.
Your solution is both elegant and efficient: 100000 elements in 3 seconds.
Thanks a lot for a very educational solution.

Here is a revised version. The pivot *index* can now be chosen
arbitrarily in the call to partition(), at the cost of an additional
exchange. All moving-about of elements happens in partition();
qs_recurs() only organizes the calls by choosing the pivot index and
the sub-array limits.

sub qs {
    my @a = @_;
    qs_recurs( \@a, 0, $#_ );
    @a;
}

sub qs_recurs {
    my ( $a, $from, $to ) = @_;
    return unless $from < $to;
    my $i_pivot = ( $from + $to ) / 2;    # anything between $from and $to
    my $mid = partition( $a, $from, $i_pivot, $to );
    qs_recurs( $a, $from, $mid - 1 );
    qs_recurs( $a, $mid + 1, $to );
}

sub partition {
    my ( $a, $from, $i_pivot, $to ) = @_;

    # put pivot in first position (it will remain there)
    @$a[ $from, $i_pivot ] = @$a[ $i_pivot, $from ];
    $i_pivot = $from;                     # save for later
    my $pivot = $a->[$i_pivot];

    while ( $from <= $to ) {
        if ( $a->[$from] > $pivot ) {
            @$a[ $from, $to ] = @$a[ $to, $from ];
            --$to;
        }
        else {
            ++$from;
        }
    }

    # move pivot element in the middle
    @$a[ $i_pivot, $to ] = @$a[ $to, $i_pivot ];
    return $to;
}

As mentioned, this isn't particularly optimized, and the change doesn't
make it faster. Obvious micro-optimizations are

-- avoid unnecessary recursive calls:

qs_recurs( $a, $from, $mid - 1) if $from < $mid - 1; # etc

-- avoid unnecessary exchange operations

@$a[ $from, $to] = @$a[ $to, $from] if $from != $to; # etc

-- inline partition() into qs_recurs()

None of these will make a dramatic difference, and each should be
tried and benchmarked individually. The second one in particular
could be counter-productive.
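For instance, one of the variants could be compared against the version above
with the core Benchmark module (qs_opt() here is just a stand-in name for a copy
of qs() with one of the changes applied):

use Benchmark qw(cmpthese);

my @data = map { rand(1000) } 1 .. 50_000;

cmpthese( -3, {                                 # run each for at least 3 CPU seconds
    plain => sub { my @s = qs(@data) },         # the version above
    opt   => sub { my @s = qs_opt(@data) },     # copy with one micro-optimization
});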

Anno
 

Henry Law

Bernard said:
The usefulness of this group has become questionable at best.

Not my experience at all. I have several times recently posted here
with some Perl problem, and always had prompt, accurate and wise advice.
From time to time people have put themselves out to help me, e.g.
installing modules in order to run my problem code and so on.

Bernard said:
Most questions asked here are rejected immediately because they don't conform to *every*
*single* *fucking* *point* in the guidelines, even if they are perfectly clear. The only ones left
are so simple, they can be answered with a perldoc reference.

Do you understand the truth - and therefore the irony - of what you've
just written? If a poster spends the time required to follow the
guidelines (Googling, pruning the problem program down to its essence,
providing good test data, checking code levels and so forth), time after
time the solution becomes clear.

In other words the whole point of the guidelines is not so much that
they've been followed: it's that following them helps clarify the
problem and makes it easier for someone - often the poster - to find the
solution.
 
