Bubble Sort in C

Seebs · Sep 15, 2009

Nevertheless, an art world aficionado
would be able to distinguish them easily (even if he couldn't
actually be sure that it was by Picasso): "THIS one is the work of a
master; but THAT one shouldn't even be in this gallery", he might
say. And the reason is simple. Picasso didn't start doing cubism (and
other abstract art forms, for all I know) until he had first mastered
what I would call "proper" painting. He learned how to paint
portraits and landscapes and still life and stuff. He learned the
"rules" of art, and learned them well. THEN he was in a position to
know how and when to break them.

Yes!

Good point.

Okay, now for the moral (which is by now, I hope, obvious). Make it a
rule NEVER to play games with #include. Use it like the books say -
at the top, headers only. One day (and it might be months or it might
be decades from now - that's up to you), when you have mastered C to
the point where high quality C code flows from your brain into your
source without hesitation or flaw, you may find yourself in a
situation a little like that of Peter Seebach, and you may find
yourself thinking "perhaps it's time to break that rule; let's see if
I can't think of an alternative first, though". And, if there is no
sensible alternative, it may then be time to go ahead and break that
rule.

The code in question violated at least a dozen of the hard and fast
rules of programming.

It's a library which is inserted in front of the standard system C library
(using "LD_PRELOAD" for those of you using systems where that makes sense)
which intercepts and modifies essentially all filesystem operations, using
network connections to a server to modify their behavior. The ultimate effect
is that a program running under it *thinks* it has root privileges, and can
do things like changing file permissions, and see the results it expects,
without actually having root privileges. (People familiar with the Debian
build system might think this sounds like fakeroot; it's similar in purpose
but quite different in implementation.)

So I have a couple dozen functions which happen to have the exact same
prototypes *and names* as standard library functions. (You see the madness
already, yes?) And they have preambles and epilogues which are needed
to perform Dire Magics. For instance, handling the fact that open(2) has
two incompatible prototypes; I handle this, very poorly, by relying on
the knowledge that *in fact* you can treat it as a "..." on the systems
we care about.

And then I have things like:

static int
wrap_open(const char *path, int flags, ...) {
int rc = -1;
mode_t mode;

va_list ap;
va_start(ap, flags);
mode = va_arg(ap, mode_t);
va_end(ap);

#include "guts/open.c"

return rc;
}

And the code generates a file looking like:

/*
* int
* wrap_open(const char *path, int flags, ...mode_t mode) {
* int rc = -1;
*/

/* return rc;
* }
*/

Once this is generated, I write the internals for a function with the
given prototype (note the "..." as a warning that there was magic
involved).

Now, the thing is, the code AROUND guts/open.c is subject to change
without notice. If I did some kind of clever thing to rewrite the
source file and try to preserve my code, it would almost certainly
fail eventually, creating nightmarish problems. Worse, there are
real live sound technical reasons to make sure that every one of
these functions is actually in a single file all together, but I don't
want the source for them all in one file necessarily -- in particular,
depending on the system you're building for, some of these functions
might not exist in the machine-generated file. (Also, having them
all in one file, and a few other things in that file, I can declare
a whole mess of things static - very important, because I don't want
to pollute the namespace of programs using my code except on purpose.)

Solution: Use #include of source files which contain only the lines
of code which would go inside a function body, and which are put in
between other lumps of code which are also in that function body, but
which aren't in a user-edited file.

Is this, in general, good style? Not in the least. Is it an effective
way to keep the substantive implementation in clearly-defined source files,
while preserving the Dire Magic of the all-in-one-file.

For another example, look at the SQLite build, where the primary form
of the build (and the only one that is "supported" in most contexts)
is that they have a script carefully read through all their source
and assemble a Giant File Of Source, because with gcc, having all the
source in a single file improves performance somewhere around 10% on
common targets, and reduces the size of the resulting library noticably.

In short: It ain't portable, and it ain't pretty, but there are reasons
to do it for production systems in some cases.

-s

Mark · Sep 15, 2009

Richard said:
Because he's using it to get the offset to the last character.

Ahh, I'm always in rush. Thanks, Richard, the answer is obvious.

A better question would be: "why don't you check that strlen(c) is
non-zero before subtracting 1 from it?", to which the answer will
probably be along the lines of "oops".

That's a good point. Perhaps assert(c != NULL) may help ?

Nick Keighley · Sep 15, 2009

I'm not sure I agree with this

You are saying that the exercise I did was purely academic (not
practical).

writing almost *any* code when you are learnign is "practical".
I don't need a farenheit to centigrade converter program (I have
a table next to my heating controls) but it's still a useful
excercise in K&R. I don't understand why "academic" is used
as a synonym for "useless".

Remember I am not student anymore

well I am. I'm still learnign things and I don't plan to stop.
I *graduated* yonks ago.

I am not thrashing you. I have academic exercises which theoretical to
the point of totally unrelated to the real world programming. So, if
you think my ways of learning algorithms are not practical, you must
have something better to offer.

well you have chosen a rather poor algorithm to play with.
I'm not sure reinventing wheels like this is going to teach you much.
Get a decent book on algorithms. Sedgewick (or if brave, Knuth)
they'll explain the various trade-offs the algorithms exhibit.

If other regular posters say that Bubble-Sort is inherently bad idea
(except of the 2 cases you mention in an earlier reply) then I will
not use in the code I write for my employer. BTW, next algorithm in
Ullman's book is insertion sort

Bubble sort is bad. For small numbers of elements Insertion Sort beats
it
and for large numbers of elements Quick Sort beats it. Insertion Sort
is simpler to code.

"the bubble sort seems to have nothing to recommend it, except a
catchy name and the fact that it leads to some interesting
theoretical
problems" D.E.Knuth

Nick Keighley · Sep 15, 2009

Since a lot of real-world work involves working with code other people
wrote, it's not usually enough to say "I wouldn't do that". You'll end up
working on code where someone else did.

I'm still waiting to see code with #include inside a function body.
I'm not saying I wouldn't do it in circumstances similar to yours
but it's pretty rare

Charlton Wilbur · Sep 15, 2009

a> If other regular posters say that Bubble-Sort is inherently bad
a> idea (except of the 2 cases you mention in an earlier reply) then
a> I will not use in the code I write for my employer. BTW, next
a> algorithm in Ullman's book is insertion sort

Bubble sort is an excellent first sort to teach because even a beginner
at the study of algorithms can see how it works. Also, because it's
O(n^2), it offers a point of comparison against mergesort, for instance,
which is O(n ln n).

But I'm not going to say that it's bad. You've just spent part of a
thread explaining to us that you're an educated, knowledgeable
professional, and your employer is paying you for that expertise -- do
*you* think bubble sort is a poor choice? Why or why not?

Charlton

Keith Thompson · Sep 15, 2009

Richard Heathfield said:
Only if this is a one-off function for a program known never to
process empty strings when working correctly. If it's a general
purpose routine (i.e. if the function could ever legitimately receive
an empty string), it's better to check and report an error to the
caller:

What error? Sorting an empty string should quietly do nothing.

void sort_bubble(char *c) /* academic exercise only - do not
use in production environment */
{
if(c != NULL && *c != '\0')
{ [snip]
}
}

See?

Seebs · Sep 15, 2009

I'm still waiting to see code with #include inside a function body.

I posted my example.

I'm not saying I wouldn't do it in circumstances similar to yours
but it's pretty rare

Considering that I have one example I can think of in twenty years or so
of using C, I'd agree that "pretty rare" is a safe statement.

But given that there is such an example, it becomes interesting. Say there's
a standard header that I need in only one of the twenty or thirty functions
I'm implementing that way. What *does* happen if I #include a standard header
in that file, which turns out to be #included inside a function definition?

(Answer: Honestly, I'm not totally sure; I'd have to look it up. I believe
off the top of my head that #defines persist for the remainder of the
translation units, but that declarations don't. The killer, though, is
that I am pretty sure standard headers are allowed to define inline
functions, and that would produce nested functions, which would kill me.
So I think the answer is that in practice it would likely be a fatal error.)

-s

Seebs · Sep 15, 2009

It violates a "shall" that does not occur in a constraints section; as
such the behavior is undefined.

That would fit my expectations. There's no requirement that it be made
to work, and there's likely circumstances where it wouldn't without a great
deal of effort, so I'd expect it would not work.

-s

jameskuyper · Sep 15, 2009

Seebs wrote:
....

But given that there is such an example, it becomes interesting. Say there's
a standard header that I need in only one of the twenty or thirty functions
I'm implementing that way. What *does* happen if I #include a standard header
in that file, which turns out to be #included inside a function definition?

(Answer: Honestly, I'm not totally sure; I'd have to look it up. I believe
off the top of my head that #defines persist for the remainder of the
translation units, but that declarations don't.

Correct, in general, but for standard headers you've got a problem:

"If used, a header shall be included outside of any external
declaration or definition, ..." (7.1.2p4), which means in particular
that it can't occur inside a function definition.

So I think the answer is that in practice it would likely be a fatal error.)

It violates a "shall" that does not occur in a constraints section; as
such the behavior is undefined.

Squeamizh · Sep 15, 2009

UNLESS
1. The list is very tiny (e.g. 20 items or less)
AND
2. The list is very nearly sorted already (e.g. 1 or 2 items in the
full list out of sequence)
THEN
Bubble sort is not only the wrong answer, it is tragically the wrong
answer.

Most of the time, for sorting a small list of items, insertion list is
a good choice.
Insertion sort is O(N*N) just like bubblesort, but the constant of
proportionality is much smaller.

For the rare condition list stated above, bubblesort is a good choice.

I don't see how bubble sort is faster than insertion sort under the
conditions you stated. I don't think bubble sort is faster than
insertion sort under any circumstances.

user923005 · Sep 15, 2009

I don't see how bubble sort is faster than insertion sort under the
conditions you stated. I don't think bubble sort is faster than
insertion sort under any circumstances.

Experiments have shown that with a list that averages 5 items in
length that is almost always nearly sorted, bubble sort worked out
faster than insertion sort and binary insertion sort in this routine:

#include "chess.h"
#include "data.h"
/* last modified 11/24/08 */
/*

*******************************************************************************

*
*
* Quiesce() is the recursive routine used to implement the alpha/
beta *
* negamax search (similar to minimax but simpler to code.) Quiesce
() is *
* called whenever there is no "depth" remaining so that only
capture moves *
* are searched
deeper. *

*
*

*******************************************************************************
*/
int Quiesce(TREE * RESTRICT tree, int alpha, int beta, int wtm, int
ply)
{
register int o_alpha, value;
register int *next_move;
register int *goodmv, *movep, moves = 0, *sortv, temp;

/*
************************************************************
* *
* initialize. *
* *
************************************************************
*/
if (ply >= MAXPLY - 1)
return (beta);
tree->nodes_searched++;
#if defined(NODES)
temp_search_nodes--;
if (temp_search_nodes <= 0) {
time_abort++;
abort_search = 1;
return (0);
}
#endif
if (tree->thread_id == 0)
next_time_check--;
tree->last[ply] = tree->last[ply - 1];
o_alpha = alpha;
/*
************************************************************
* *
* now call Evaluate() to produce the "stand-pat" score *
* that will be returned if no capture is acceptable. *
* if this score is > alpha, then we also have to save *
* the "path" to this node as it is the PV that leads *
* to this score. *
* *
************************************************************
*/
value = Evaluate(tree, ply, wtm, alpha, beta);
#if defined(TRACE)
if (trace_level >= 99)
Trace(tree, ply, value, wtm, alpha, beta, "quiesce", EVALUATION);
#endif
if (value > alpha) {
if (value >= beta)
return (value);
alpha = value;
tree->pv[ply].pathl = ply - 1;
tree->pv[ply].pathh = 0;
tree->pv[ply].pathd = iteration_depth;
}
/*
************************************************************
* *
* generate captures and sort them based on (a) the value *
* of the captured piece - the value of the capturing *
* piece if this is > 0; or, (b) the value returned by *
* Swap(). if the capture leaves the opponent with no *
* minor pieces, then we search that capture always since *
* the endgame might be won or lost with no pieces left. *
* *
* once we confirm that the capture is not losing any *
* material, we sort these non-losing captures into *
* MVV/LVA order which appears to be a slightly faster *
* move ordering idea. *
* *
************************************************************
*/
tree->last[ply] = GenerateCaptures(tree, ply, wtm, tree->last[ply -
1]);
goodmv = tree->last[ply - 1];
sortv = tree->sort_value;
for (movep = tree->last[ply - 1]; movep < tree->last[ply]; movep++)
{
if (Captured(*movep) == king)
return (beta);
if (pc_values[Piece(*movep)] < pc_values[Captured(*movep)] ||
((wtm) ? TotalPieces(black, occupied) : TotalPieces(white,
occupied)) - p_vals[Captured(*movep)] == 0) {
*goodmv++ = *movep;
*sortv++ = 128 * pc_values[Captured(*movep)] - pc_values[Piece
(*movep)];
moves++;
} else {
temp = Swap(tree, From(*movep), To(*movep), wtm);
if (temp >= 0) {
*goodmv++ = *movep;
*sortv++ = 128 * pc_values[Captured(*movep)] - pc_values[Piece
(*movep)];
moves++;
}
}
}
if (!moves) {
if (alpha != o_alpha) {
memcpy(&tree->pv[ply - 1].path[ply], &tree->pv[ply].path[ply],
(tree->pv[ply].pathl - ply + 1) * sizeof(int));
memcpy(&tree->pv[ply - 1].pathh, &tree->pv[ply].pathh, 3);
tree->pv[ply - 1].path[ply - 1] = tree->curmv[ply - 1];
}
return (value);
}
/*
************************************************************
* *
* don't disdain the lowly bubble sort here. the list of *
* captures is always short, and experiments with other *
* algorithms are always slightly slower. this is very *
* cache-friendly and runs quickly. *
* *
************************************************************
*/
if (moves > 1) {
register int done;
register int *end = tree->last[ply - 1] + moves - 1;

do {
done = 1;
sortv = tree->sort_value;
for (movep = tree->last[ply - 1]; movep < end; movep++, sortv++)
if (*sortv < *(sortv + 1)) {
temp = *sortv;
*sortv = *(sortv + 1);
*(sortv + 1) = temp;
temp = *movep;
*movep = *(movep + 1);
*(movep + 1) = temp;
done = 0;
}
} while (!done);
}
next_move = tree->last[ply - 1];
/*
************************************************************
* *
* now iterate through the move list and search the *
* resulting positions. *
* *
************************************************************
*/
while (moves--) {
tree->curmv[ply] = *(next_move++);
#if defined(TRACE)
if (ply <= trace_level)
Trace(tree, ply, 0, wtm, alpha, beta, "quiesce", CAPTURE_MOVES);
#endif
MakeMove(tree, ply, tree->curmv[ply], wtm);
value = -Quiesce(tree, -beta, -alpha, Flip(wtm), ply + 1);
UnmakeMove(tree, ply, tree->curmv[ply], wtm);
if (value > alpha) {
if (value >= beta)
return (value);
alpha = value;
}
if (tree->stop)
return (0);
}
/*
************************************************************
* *
* all moves have been searched. return the search *
* result that was found. if the result is not the *
* original alpha score, then we need to return the PV *
* that is associated with this score. *
* *
************************************************************
*/
if (alpha != o_alpha) {
memcpy(&tree->pv[ply - 1].path[ply], &tree->pv[ply].path[ply],
(tree->pv[ply].pathl - ply + 1) * sizeof(int));
memcpy(&tree->pv[ply - 1].pathh, &tree->pv[ply].pathh, 3);
tree->pv[ply - 1].path[ply - 1] = tree->curmv[ply - 1];
}
return (alpha);
}

Bart · Sep 15, 2009

I'm still waiting to see code with #include inside a function body.
I'm not saying I wouldn't do it in circumstances similar to yours
but it's pretty rare

int function(...) {
SomeStructType table[] = {
#include "machine_generated_table.c"
};
<do something with the table>

}

In an actual example, the table (one of 6 generated by a utility),
included one entry per function definition in some other modules
(several hundred entries).

Phil Carmody · Sep 16, 2009

Seebs said:
Bubble-sort is basically awful.
Absolutely.

It's certainly obvious,

Whawhawhat? Give someone a dozen cards, but no surface upon which to
lay them out, and ask them to order them. They will never ever use a
bubble sort. They'll either use a selection sort or an insertion
short. Bubble-sort is highly counter-intuitive.

Bubble-sort is an abomination; it doesn't even have paedagogical
merits as there are simpler and quicker O(n^2) sorts out there.

Phil

Phil Carmody · Sep 16, 2009

Seebs said:
Yes.

Because bubble-sort is bad enough that a good bubble-sort is still useless.

Oddly:

I have produced a hunk of code which does precisely that.

There's a hunk of machine-generated code, and it has a whole lot of
things of the form

int
foo(foo's arguments) {
// preamble happens here
#include guts/foo.c
// epilogue happens here
}

Thinking about it, according to the theoretical model, by the time
it's got to compiling the function, there isn't any #include inside
it, that's already been replaced and preprocessed!

(Likewise, you'd not have any comments in your function.)

There are sound technical reasons for this.

Absolutely. I've done way worse. Like that, but foo.c makes
use of macros that it doesn't define in order to modify what
it does. Then again, I wanted modular reduction techniques
which would word on both balanced representations and unsigned
representations, when the system was configured for all of the
different IEEE rounding modes, and only wanted to write the
functions once, yet compiled 8 different ways.

Phil

Richard Tobin · Sep 16, 2009

Phil Carmody said:
Whawhawhat? Give someone a dozen cards, but no surface upon which to
lay them out, and ask them to order them. They will never ever use a
bubble sort.

I suppose there are practical physical uses for bubble sort, taking
advantage of its property that only adjacent elements are compared.

"If you're taller than the person on your right, swap places with
them."

-- Richard

Nick Keighley · Sep 16, 2009

I suppose there are practical physical uses for bubble sort, taking
advantage of its property that only adjacent elements are compared.

"If you're taller than the person on your right, swap places with
them."

probably quite efficient too. The swops will take place in parrallel

James Kuyper · Sep 16, 2009

Richard said:
Give a programming newbie a computer, however, and tell them to write
a sort routine, and they may well end up with bubble sort as their
first attempt.

I wouldn't expect that to be the common result unless the newbie had
already heard of bubble-sort.

The earliest program I can remember writing implemented an algorithm
which makes even bubble sort look good by comparison. It was written in
Fortran I, and I certainly don't have the text of that program any more,
but here's the same algorithm implemented in C99:

void sort_int(int array[], int n)
{
// I am quite certain my Fortran I version
// had nothing like the following statement:
if(array==NULL || n<2)
return;

for(int start=0; start < n - 1; start++)
{
int min = array[start];
int location = start;

// Find smallest element in rest of list
for(int next = start+1; next < n; next++)
if(array[next] < min)
{
min = array[next];
location = next;
}

if(location != start)
{ // Swap elements
int temp = array[start];
array[start] = array[location];
array[location] = temp;
}
}
}

That is an algorithm I could imagine a newbie inventing, but oddly
enough, that's not what actually happened - that's the algorithm our
teacher instructed us to implement.

More than three decades after the fact, I can't be sure, but I think
that if I'd been asked to choose my own sorting algorithm, I might have
come up with something like an insertion sort.

Keith Thompson · Sep 16, 2009

Nick Keighley said:
probably quite efficient too. The swops will take place in parrallel

A naive parallel implementation will have problems: the person you're
trying to swap places with might simultaneously be swapping places
with someone else.

I suppose you could do something like this:

Loop
Everyone in an even-numbered position and taller than the
person to your right, swap
Everyone in an odd-numbered position and taller than the
person to your right, swap
Stop when nobody has swapped

user923005 · Sep 16, 2009

Phil Carmody wrote: said:
Phil Carmody wrote: said:

Bubble-sort [...] doesn't even have paedagogical
merits as there are simpler and quicker O(n^2) sorts out there.

Click to expand...

I said I didn't have time to address this, but something failed to
crop up after all.

As I said before, although (as you rightly point out) BubbleSort is
most certainly not how normal people intuitively sort cards, it is
quite likely to be the first *programmatical* sort they invent for
themselves.

When I first became a programmer, I was asked to sort some data that
was read from tape. Being super-green around the gills, I had no idea
that there was such a thing as a library sort routine so I wrote my
own sort. The wheel that I re-invented was selection sort. So that
was what was most natural to me.

Seebs · Sep 16, 2009

I wouldn't expect that to be the common result unless the newbie had
already heard of bubble-sort.

I don't know about common, but it's essentially what I came up with when
I first tried to figure out how you'd do it.

-s

Selection-Sort in C	35	Sep 29, 2009
insertion sort in C	14	Sep 15, 2009
selection-sort in C	22	Aug 16, 2011
pointer to pointer	7	Dec 21, 2010
bubble sort in structure	5	Mar 18, 2011
Binary Search in C	7	Dec 27, 2010
Problems in creating libraries	0	Nov 3, 2024
Unable to read input from keyboard, in below C code, for a BST.	0	Jul 20, 2025

Bubble Sort in C

Seebs

Mark

Nick Keighley

Nick Keighley

Charlton Wilbur

Keith Thompson

Seebs

Seebs

jameskuyper

Squeamizh

user923005

Bart

Phil Carmody

Phil Carmody

Richard Tobin

Nick Keighley

James Kuyper

Keith Thompson

user923005

Seebs

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads