How to improve this sort?

A

At_sea_with_C

Hello all,

I have written an ascending sort routine for floats. This seems to do
the job, but for elements over 10,000, it gets awfully slow. A lot of
useless comparisions with previously sorted elements are made. I was
thinking of using a function, that kept an index, to selectively iterate
over unsorted elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?
Thanks to all.

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define MAX_ALLOC 1000000UL

void sortf_asc(float *a, unsigned long elem) {
int sort_occured = 0;
unsigned long start_offset = 0, current_pos, c1;
float tmp;
while(start_offset < elem - 2) {
for(c1 = 1 + (current_pos = start_offset); c1 < elem; c1++) {
if(a[current_pos] > a[c1]) {
tmp = a[c1];
a[c1] = a[current_pos];
a[current_pos] = tmp;
current_pos = c1;
sort_occured = 1;
}
}
if(sort_occured) sort_occured = 0;
else start_offset++;
}
return;
}

int main(int argc, char **argv) {
char *usage = "Usage:\n\nPROGRAM ELEMENTS\n\nELEMENTS - Number of "
"elements to sort, (max. = %lu)\n",
*fopenfail = "Failed to open file.\n",
*tail = NULL;
unsigned long elements = 0, c1;
float *farray = NULL;
FILE *outfp = NULL;

if(argc < 2){ fprintf(stderr, usage, MAX_ALLOC); return EXIT_FAILURE; }
else {
errno = 0;
elements = strtoul(argv[1], &tail, 0);
if(errno == EINVAL || errno == ERANGE) {
fprintf(stderr, usage, MAX_ALLOC);
return EXIT_FAILURE;
}
else if(elements < 2 || elements > MAX_ALLOC) {
fprintf(stderr, usage, MAX_ALLOC);
return EXIT_FAILURE;
}
}

farray = malloc(elements * sizeof *farray);
if(!farray) {
fprintf(stderr, "Memory allocation failed.\n");
return EXIT_FAILURE;
}
else {
srand((unsigned int)time(NULL));
for(c1 = 0; c1 < elements; c1++) farray[c1] = (float)rand();
}

outfp = fopen("unsorted", "w");
if(!outfp) {
fprintf(stderr, fopenfail);
free(farray);
return EXIT_FAILURE;
}
else {
for(c1 = 0; c1 < elements; c1++)
fprintf(outfp, "[%lu]: %f\n", c1, farray[c1]);
fclose(outfp);
outfp = NULL;
}

sortf_asc(farray, elements);

outfp = fopen("sorted", "w");
if(!outfp) {
fprintf(stderr, fopenfail);
free(farray);
return EXIT_FAILURE;
}
else {
for(c1 = 0; c1 < elements; c1++)
fprintf(outfp, "[%lu]: %f\n", c1, farray[c1]);
fclose(outfp);
}

return 0;
}
 
R

Richard Heathfield

At_sea_with_C said:
Hello all,

I have written an ascending sort routine for floats. This seems to do
the job, but for elements over 10,000, it gets awfully slow. A lot of
useless comparisions with previously sorted elements are made. I was
thinking of using a function, that kept an index, to selectively iterate
over unsorted elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?

You seem to be using Bubble Sort, which is one of the slowest sorts around.
If you are not allowed to call qsort, I suggest taking a look at page 62 of
"The C Programming Language", 2nd edition, by Kernighan and Ritchie, which
implements D L Shell's sorting algorithm. Change int v[] to float v[] -
that should be the only necessary change.
 
A

At_sea_with_C

Richard said:
At_sea_with_C said:
Hello all,

I have written an ascending sort routine for floats. This seems to do
the job, but for elements over 10,000, it gets awfully slow. A lot of
useless comparisions with previously sorted elements are made. I was
thinking of using a function, that kept an index, to selectively iterate
over unsorted elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?

You seem to be using Bubble Sort, which is one of the slowest sorts around.
If you are not allowed to call qsort, I suggest taking a look at page 62 of
"The C Programming Language", 2nd edition, by Kernighan and Ritchie, which
implements D L Shell's sorting algorithm. Change int v[] to float v[] -
that should be the only necessary change.

I worked out the logic on paper. We were just told to sort an array of N
integers or floats. I chose floats for no particular reason. It's just
an introductory class, and we haven't covered sorting yet, which is why
I had to think for a while for this one. I will see if the book you
mention is available in our library.

Just one question. Is sorting integers different from sorting floats? In
other words, can I just substitute int for float everywhere in my code
and have it work okay? I can't think why not, but I am just starting C,
so your advice would be helpful.

Thanks for the reply.
 
F

Flash Gordon

At_sea_with_C wrote, On 30/01/07 12:16:
Richard said:
At_sea_with_C said:
Hello all,

I have written an ascending sort routine for floats. This seems to do
the job, but for elements over 10,000, it gets awfully slow. A lot of
useless comparisions with previously sorted elements are made. I was
thinking of using a function, that kept an index, to selectively iterate
over unsorted elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?

You seem to be using Bubble Sort, which is one of the slowest sorts
around. If you are not allowed to call qsort, I suggest taking a look
at page 62 of "The C Programming Language", 2nd edition, by Kernighan
and Ritchie, which implements D L Shell's sorting algorithm. Change
int v[] to float v[] - that should be the only necessary change.

I worked out the logic on paper. We were just told to sort an array of N
integers or floats. I chose floats for no particular reason. It's just
an introductory class, and we haven't covered sorting yet, which is why
I had to think for a while for this one. I will see if the book you
mention is available in our library.

If you are at all serious about C I would recommend buying a copy. I've
been programming in C for over 10 years (and other languages for even
longer) and I still find it a useful reference.
Just one question. Is sorting integers different from sorting floats? In
other words, can I just substitute int for float everywhere in my code
and have it work okay? I can't think why not, but I am just starting C,
so your advice would be helpful.

As long as you don't test for equality (which you do not need to do for
sorting) yes. However, in general I would recommend using double rather
than float except where there is a specific reason for using float. In
most situations when you use a float it will be converted to a double so
using floats can lead to a lot of extra conversions for no real benefit
in most situations.
 
R

Richard Heathfield

At_sea_with_C said:
Richard said:
At_sea_with_C said:
Hello all,

I have written an ascending sort routine for floats. This seems to do
the job, but for elements over 10,000, it gets awfully slow. A lot of
useless comparisions with previously sorted elements are made. I was
thinking of using a function, that kept an index, to selectively iterate
over unsorted elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?

You seem to be using Bubble Sort, which is one of the slowest sorts
around. If you are not allowed to call qsort, I suggest taking a look at
page 62 of "The C Programming Language", 2nd edition, by Kernighan and
Ritchie, which implements D L Shell's sorting algorithm. Change int v[]
to float v[] - that should be the only necessary change.

I worked out the logic on paper.

....and (re)invented Bubble Sort. Lots of us do that. :) But, as you
discovered, it gets appallingly slow appallingly fast.
We were just told to sort an array of N integers or floats. I chose
floats for no particular reason. It's just
an introductory class, and we haven't covered sorting yet, which is why
I had to think for a while for this one. I will see if the book you
mention is available in our library.

That's a good plan. Or see, for example:
http://www.nist.gov/dads/HTML/shellsort.html
Just one question. Is sorting integers different from sorting floats? In
other words, can I just substitute int for float everywhere in my code
and have it work okay? I can't think why not, but I am just starting C,
so your advice would be helpful.

Yes, you can do that.
 
D

David T. Ashley

At_sea_with_C said:
Hello all,

I have written an ascending sort routine for floats. This seems to do the
job, but for elements over 10,000, it gets awfully slow. A lot of useless
comparisions with previously sorted elements are made. I was thinking of
using a function, that kept an index, to selectively iterate over unsorted
elements. But I am unable to think of a way to do it.

Any ideas to improve the routine and get a better grade?

I don't know if you've covered big-O notation yet. You may find this
interesting.

http://en.wikipedia.org/wiki/Big_O_notation

The bubble sort is O(N^2), which means that, for example, a sort involving
100,000 records may take 100 times as long as a sort involving 10,000
records. This is as bad as sort algorithms get. Phrased differently, it
does not "scale up" well.

Here is a comparison of sort algorithms:

http://en.wikipedia.org/wiki/Sorting_algorithm

Quicksort is considered the best general-purpose sorting algorithm. Unless
the data is ordered very antagonistically, it performs well. As others have
pointed out, it is supported directly by the library.

It would be, of course, instructional for you to implement quicksort
yourself rather than use the library function. This would tend to impress
the instructor if it wasn't assigned.

Now, as far as sorting doubles ...

Floating-point numbers are sortable in the sense that there is a strict
ordering of them -- the computer hardware is always able to determine
unambiguously a<b, a==b, or a>b.

However, the practical difficulty in dealing with floating-point numbers is
that they only approximate real numbers, and so often there are small
residual errors which lead to programming errors by humans.

For example, consider this:

BEGIN

double six=6.0, three=3.0, two=2.0, one=1.0, result1, result2;

result1 = one/two - one/three;
result2 = one/six;

if (result1 == result2) ... /* Danger, Will Robinson! Danger! */

END

1/2 - 1/3 = 1/6, right? Well, not always with computer arithmetic. There
is some inexactness to deal with.

You may find this URL useful:

http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm

As far as comparing two floats with a unique ordering -- that is guaranteed.
It is just all the steps leading up to that which cause ambiguity ...
 
B

Ben Pfaff

David T. Ashley said:
Quicksort is considered the best general-purpose sorting algorithm. Unless
the data is ordered very antagonistically, it performs well. As others have
pointed out, it is supported directly by the library.

If by "the library" you mean the standard C library qsort()
function, quicksort is merely a common implementation for
qsort(), not a specified or required algorithm. The GNU libc
implementation, for example, uses merge sort when enough memory
is available.
 
R

Richard Tobin

Ben Pfaff said:
If by "the library" you mean the standard C library qsort()
function, quicksort is merely a common implementation for
qsort(), not a specified or required algorithm.

It does also explain the "q" in the name.

-- Richard
 
L

Lane Straatman

David T. Ashley said:
I don't know if you've covered big-O notation yet. You may find this
interesting.

http://en.wikipedia.org/wiki/Big_O_notation

The bubble sort is O(N^2), which means that, for example, a sort involving
100,000 records may take 100 times as long as a sort involving 10,000
records. This is as bad as sort algorithms get. Phrased differently, it
does not "scale up" well.

Here is a comparison of sort algorithms:

http://en.wikipedia.org/wiki/Sorting_algorithm

Quicksort is considered the best general-purpose sorting algorithm.
Unless the data is ordered very antagonistically, it performs well. As
others have pointed out, it is supported directly by the library.
Is there a way where a sort algorithm can ask itself "is this going well?"
and switch gears? For example, could it take a cursory look at the data and
if it doesn't like what it sees, just permute the data randomly? LS
 
M

matevzb

Is there a way where a sort algorithm can ask itself "is this going well?"
and switch gears? For example, could it take a cursory look at the data and
if it doesn't like what it sees, just permute the data randomly?
Depending on your definition of "take a cursory look at the data and
if it doesn't like what it sees", bogosort could considered. It does
the second part exceptionally well =]
 
S

Stephen Howe

Is there a way where a sort algorithm can ask itself "is this going well?"
and switch gears? For example, could it take a cursory look at the data and
if it doesn't like what it sees, just permute the data randomly? LS

Yes. That is what David Musser's Introsort does.
Primarily it has been used for implenting C++'s sort() but there is no
reason why it cannot be used for C or even qsort() can be used for it. It
uses quicksort, but if it discovers a portion of the array which is
partitioning badly it switches to heapsort for that portion of the array.
The net result is that this hybrid sort is always guranteed O(N * log N) in
the worse case and gives quicksort like performance in the average case.

See
http://en.wikipedia.org/wiki/Introsort
for details.

David Musser also talked about Introselect which is a hybrid version of
quickselect - the algorithm used to find the kth element in an unsorted
array and place it into the kth position. quickselect can also go "bad" and
so Introselect is used to guarantee O(N) behaviour.

Stephen Howe
 
D

David T. Ashley

Ben Pfaff said:
If by "the library" you mean the standard C library qsort()
function, quicksort is merely a common implementation for
qsort(), not a specified or required algorithm. The GNU libc
implementation, for example, uses merge sort when enough memory
is available.

Wow. I learn something new every day. I just assumed that qsort was
mnemonic for "quick-sort".

Interestingly enough, *nix and Microsoft documentation seem to differ.

On my *nix box, "man qsort" gets me information that doesn't commit to a
specific algorithm.

However, on Microsoft's website, searching by "qsort" gets me:

<BEGIN>
The qsort function implements a quick-sort algorithm to sort an array of num
elements, each of width bytes.
<END>

which seems to be more committal towards a specific algorithm.

They need to rename the sort function. My suggestions:

faster_than_youre_likely_to_do_it_yourself_sort()

we_might_use_quicksort_and_we_might_not_sort()

dont_ask_whats_in_the_black_box_sort()
 
I

Ian Collins

I

Ian Collins

Ben said:
Eww. Static variables.

Not so bad, they are only used to reduce the number of parameters passed
to the lower level sort routine.

Anyway, what's wrong with using static variables?
 
B

Ben Pfaff

Ian Collins said:
Not so bad, they are only used to reduce the number of parameters passed
to the lower level sort routine.

Anyway, what's wrong with using static variables?

They make code non-reentrant, unsafe in the presence of threads,
and generally brittle. Witness strtok, for example.
 
R

Richard Tobin

Ian Collins said:
Anyway, what's wrong with using static variables?

Isn't qsort() required to be re-entrant? That is, can't the comparison
function call qsort()?

And in many practical environments, qsort() would have to be
thread-safe.

-- Richard
 
C

CBFalconer

Ian said:
Not so bad, they are only used to reduce the number of parameters
passed to the lower level sort routine.

Anyway, what's wrong with using static variables?

Try re-entering that routine. System routines should be
re-entrant. C doesn't guarantee that.

--
<http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>

"A man who is right every time is not likely to do very much."
-- Francis Crick, co-discover of DNA
"There is nothing more amazing than stupidity in action."
-- Thomas Matthews
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,770
Messages
2,569,584
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top