# selection-sort in C

A

#### arnuld

Just wrote selection-sort in C. Is it the right way to do this in C ?

(definition of selection-sort taken from exercise 2.2-2 of Introduction
to Algorithms, 2e)

/* Exercise 2.2-2, page 29, Selection Sort */

#include <stdio.h>

enum { SIZE_ARR = 6 };

void selectionSort(char s[], int len);
int findSmallest(char s[], const int idx_cur, const int len);

int main(void)
{
char arr[SIZE_ARR+1] = {'a', 'r', 'n', 'u', 'l', 'd'};

printf("%s\n", arr);
selectionSort(arr, SIZE_ARR);
printf("%s\n", arr);

return 0;
}

void selectionSort(char s[], int len)
{
int i, idx;
char tmp;
for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);
/* printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;
s = s[idx];
s[idx] = tmp;
}
}

int findSmallest(char s[], const int idx_cur, const int len)
{
int j, ret_index;
char smallest = s[idx_cur];
ret_index = idx_cur;

for(j = idx_cur+1; j != len; ++j)
{
/* printf("idx_cur = %d, j = %d\n", idx_cur, j); */
if(smallest > s[j])
{
smallest = s[j];
ret_index = j;
}
}

return ret_index;
}

===================== OUTPUT =======================
[[email protected] Intro-to-Algos]\$ gcc -ansi -pedantic -Wall -Wextra 2.2-2.c
[[email protected] Intro-to-Algos]\$ ./a.out
arnuld
[[email protected] Intro-to-Algos]\$

-- arnuld
www.LispMachine.Wordpress.com

P

#### Phil Carmody

arnuld said:
Just wrote selection-sort in C. Is it the right way to do this in C ?

(definition of selection-sort taken from exercise 2.2-2 of Introduction
to Algorithms, 2e)

/* Exercise 2.2-2, page 29, Selection Sort */

#include <stdio.h>

enum { SIZE_ARR = 6 };

void selectionSort(char s[], int len);
int findSmallest(char s[], const int idx_cur, const int len);

The consts on those parameters don't do much, but aren't actually
wrong.
int main(void)
{
char arr[SIZE_ARR+1] = {'a', 'r', 'n', 'u', 'l', 'd'};

printf("%s\n", arr);
selectionSort(arr, SIZE_ARR);
printf("%s\n", arr);

return 0;
}

void selectionSort(char s[], int len)
{
int i, idx;
char tmp;

Schools of thought vary, but some prefer variables that are only
going to be used in a small scope to be declared within that scope.
for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);

so that could be
int idx = findSmallest(s, i, len);
/* printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;

and that could be
char tmp = s;
s = s[idx];
s[idx] = tmp;

Note that the swap is only necessary if(i != idx).
}
}

int findSmallest(char s[], const int idx_cur, const int len)
{
int j, ret_index;
char smallest = s[idx_cur];
ret_index = idx_cur;

for(j = idx_cur+1; j != len; ++j)
{
/* printf("idx_cur = %d, j = %d\n", idx_cur, j); */
if(smallest > s[j])

Again, it's a matter of taste, but some prefer the object
you're testing to be first in the comparison. And here you're
testing s[j]. So:

if(s[j] < smallest)
{
smallest = s[j];
ret_index = j;
}
}

return ret_index;
}

The sort looks fine. You might want to write a simple test
harness to test arbitrary inputs.

Phil

A

#### arnuld

void selectionSort(char s[], int len); int findSmallest(char s[], const
int idx_cur, const int len);
The consts on those parameters don't do much, but aren't actually wrong.

Just to show that function is not supposed to change its arguments.

and that could be
char tmp = s;

I like to keep things in as small scope as possible but then here it will
initialize the variable on each looping. I thought assignment is less
expensive than initialization.

s = s[idx];
s[idx] = tmp;

Note that the swap is only necessary if(i != idx).

humm.. did not think about it

for(j = idx_cur+1; j != len; ++j)
{
/* printf("idx_cur = %d, j = %d\n", idx_cur, j); */
if(smallest > s[j])
Again, it's a matter of taste, but some prefer the object you're testing
to be first in the comparison. And here you're testing s[j]. So:

I will add this advice to my own style (unless I am comparing with some
enum constant)

-- arnuld
www.LispMachine.Wordpress.com

B

#### Ben Bacarisse

arnuld said:
void selectionSort(char s[], int len); int findSmallest(char s[], const
int idx_cur, const int len);
The consts on those parameters don't do much, but aren't actually wrong.

Just to show that function is not supposed to change its arguments.

The terminology is wrong here. No C function can change its
arguments -- arguments are the things you pass to a function. Your
declarations tell the reader that the function does not change its
*parameters*.
and that could be
char tmp = s;

I like to keep things in as small scope as possible but then here it will
initialize the variable on each looping. I thought assignment is less
expensive than initialization.

Why do you think that? I am genuinely interested. I never thought
about it one way or the other so I wonder where the idea comes from.

With automatic variables of aggregate types, any initialisation implies
full initialisation of whole object. I.e.

char str[100] = "";

is more expensive than

char str[100];
str[0] = 0;

but that's because the two don't do the same thing. It's not assignment
is any cheaper.

<snip>

G

#### Gene

Just wrote selection-sort in C. Is it the right way to do this in C ?

(definition of selection-sort taken from exercise 2.2-2 of Introduction
to Algorithms, 2e)

/* Exercise 2.2-2, page 29, Selection Sort */

#include <stdio.h>

enum { SIZE_ARR = 6 };

void selectionSort(char s[], int len);
int findSmallest(char s[], const int idx_cur, const int len);

int main(void)
{
char arr[SIZE_ARR+1] = {'a', 'r', 'n', 'u', 'l', 'd'};

printf("%s\n", arr);
selectionSort(arr, SIZE_ARR);
printf("%s\n", arr);

return 0;

}

void selectionSort(char s[], int len)
{
int i, idx;
char tmp;
for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);
/*      printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;
s = s[idx];
s[idx] = tmp;
}

}

int findSmallest(char s[], const int idx_cur, const int len)
{
int j, ret_index;
char smallest = s[idx_cur];
ret_index = idx_cur;

for(j = idx_cur+1; j != len; ++j)
{
/*      printf("idx_cur = %d, j = %d\n", idx_cur, j); */
if(smallest > s[j])
{
smallest = s[j];
ret_index = j;
}
}

return ret_index;

}

,
The C style looks fine, but the most general answer to your question
must include the fact that if you are implementing an O(n^2) sort,
insertion sort will do no worse than the others, and it is
significantly better on many data.

A

#### arnuld

The terminology is wrong here. No C function can change its arguments
-- arguments are the things you pass to a function. Your declarations
tell the reader that the function does not change its *parameters*.

Okay, today my confusion of parameter vs argument is cleared.

Why do you think that? I am genuinely interested. I never thought
about it one way or the other so I wonder where the idea comes from.

genuinely, I don't know but see down here.

With automatic variables of aggregate types, any initialisation implies
full initialisation of whole object. I.e.

char str[100] = "";

is more expensive than

char str[100];
str[0] = 0;

but that's because the two don't do the same thing. It's not assignment
is any cheaper.

Holy cow, on different occasions, I was told by few experienced
programmers that 1st version is much, much more expensive than 2nd. And I
think this is from where my initialization vs assignment idea came from.

M

#### Malcolm McLean

Just wrote selection-sort in C. Is it the right way to do this in C ?

Write a sort like this

void selection_sort(void *data, size_t N, size_t elsize, int (*comp)
(const void *e1, const void *e2));

Now you have several advantages. You can sort any data as long as it
fits in an array and you can write a comparator for it. Also, the
paramaters match qsort. So when testing your function, you can toggle
between selection_sort and qsort()

A

#### Anand Hariharan

On Tue, 16 Aug 2011 13:22:59 +0100, Ben Bacarisse wrote: (...)
With automatic variables of aggregate types, any initialisation implies
full initialisation of whole object.  I.e.
char str[100] = "";
is more expensive than
char str[100];
str[0] = 0;
but that's because the two don't do the same thing.  It's not assignment
is any cheaper.

Holy cow, on different occasions, I was told by few experienced
programmers that 1st version is much, much more expensive than 2nd. And I
think this is from where my initialization vs assignment idea came from.

Like Ben indicated, the two statements do different things, so you
cannot compare them and make conclusions regarding which is "cheaper".

In the sibling language C++, the distinction between assignment and
initialisation is much more important than it is in C, and assignment
is typically more expensive than initialisation.

Regardless of any performance issues, giving variables a value at
definition (i.e., initialisation) is a good habit to develop. Which
is why Phil's advice of declaring your variables in the tightest scope
possible also makes sense because you will define them only just
before you need them (and thereby give them a 'good' starting value).

- Anand

T

#### Thomas Boell

arnuld said:
[...]

I like to keep things in as small scope as possible but then here it will
initialize the variable on each looping. I thought assignment is less
expensive than initialization.

Why do you think that? I am genuinely interested. I never thought
about it one way or the other so I wonder where the idea comes from.

If the compiler would do the allocation (i. e. reserving space on the
stack) exactly where it's requested, it would be more expensive to
declare the variable inside the loop. But I doubt that any optimizing
compiler would do that in this case; the allocation would be moved
outside the loop, or possibly the variable is allocated in a free
register, so allocating the variable inside the loop isn't any more
expensive than outside.

K

#### karnath

arnuld said:
[...]

I like to keep things in as small scope as possible but then here it will
initialize the variable on each looping. I thought assignment is less
expensive than initialization.

Why do you think that? I am genuinely interested. I never thought
about it one way or the other so I wonder where the idea comes from.

If the compiler would do the allocation (i. e. reserving space on the
stack) exactly where it's requested, it would be more expensive to
declare the variable inside the loop. But I doubt that any optimizing
compiler would do that in this case; the allocation would be moved
outside the loop, or possibly the variable is allocated in a free
register, so allocating the variable inside the loop isn't any more
expensive than outside.

If we are using compiler without optimization it would be right to
put initialization before scope in topic case, won't it?

K

#### Keith Thompson

Ben Bacarisse said:
arnuld said:
On Tue, 16 Aug 2011 14:24:45 +0300, Phil Carmody wrote: [...]
The consts on those parameters don't do much, but aren't actually wrong.

Just to show that function is not supposed to change its arguments.

The terminology is wrong here. No C function can change its
arguments -- arguments are the things you pass to a function. Your
declarations tell the reader that the function does not change its
*parameters*.

(I think the attribution levels got messed up a bit.)

Here's a program that illustrates the distiction:

#include <stdio.h>

void func_with_const(const int param)
{
printf("In func_with_const, param = %d\n", param);
/* param = 100; */ /* would be illegal */
*(int*)&param = 200; /* undefined behavior! */
printf(" Now param = %d\n", param);
}

void func_without_const(int param)
{
printf("In func_with_const, param = %d\n", param);
param = 300; /* perfectly legal */
printf(" Now param = %d\n", param);
}

int main(void)
{
int argument = 10;
printf("In main, argument = %d\n", argument);
func_with_const(argument);
func_without_const(argument);
printf("Back in main, argument is still %d\n", argument);
return 0;
}

In func_with_const(), the "const" on the parameter declaration is a
promise not to modify the value of the parameter, which is a local
variable to the function. It's relevant only within the body of the
function itself, and is of no interest to the caller. In this case, the
fucntion uses a pointer cast to violate that promise, but even that
doesn't affect the argument.

func_without_const() doesn't promise not to modify its parameter, but
again, the modification has no effect on the caller.

You could have a *declaration* like
void func1(int param);
and a *definition* like
void func1(const int param) { ... param = ...; ... }
so the promise is made only where it's relevant, but it's questionable
whether that's worth the effort and potential confusion.

Note that "const" *can* meaningfully appear in a pointer parameter
declaration:

void func2(const int *param);

Here the "const" applies to the object (if any) that the parameter
points to, not to the parameter itself. A caller can pass func2() a
pointer to some data with the assurance that the pointed-to data won't
be modified.

You could make the parameter itself read-only like this:

void func3(int *const param);

but, like the "const param" declaration, this isn't meaningful for the
caller.

W

#### Willem

karnath wrote:
)> If the compiler would do the allocation (i. e. reserving space on the
)> stack) exactly where it's requested, it would be more expensive to
)> declare the variable inside the loop. But I doubt that any optimizing
)> compiler would do that in this case; the allocation would be moved
)> outside the loop, or possibly the variable is allocated in a free
)> register, so allocating the variable inside the loop isn't any more
)> expensive than outside.
)>
)
)
) If we are using compiler without optimization it would be right to
) put initialization before scope in topic case, won't it?

No.

If you are using compiler without optimization you obviously don't care
about performance, so it would not be right. You should write code in
the most clear way, and keeping declarations close to usage is clearer
according to most opinions.

SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT

B

#### Ben Bacarisse

Thomas Boell said:
arnuld said:
[...]

I like to keep things in as small scope as possible but then here it will
initialize the variable on each looping. I thought assignment is less
expensive than initialization.

Why do you think that? I am genuinely interested. I never thought
about it one way or the other so I wonder where the idea comes from.

If the compiler would do the allocation (i. e. reserving space on the
stack) exactly where it's requested, it would be more expensive to
declare the variable inside the loop. But I doubt that any optimizing
compiler would do that in this case; the allocation would be moved
outside the loop, or possibly the variable is allocated in a free
register, so allocating the variable inside the loop isn't any more
expensive than outside.

That's relevant if you include allocation with initialisation. I
assumed that arnuld meant exactly what he said. To me that means he
thought that

int x;
x = 1;

is less expensive than

int x = 1;

The allocation occurs in both cases.

B

#### BGB

arnuld said:
Just wrote selection-sort in C. Is it the right way to do this in C ?

(definition of selection-sort taken from exercise 2.2-2 of Introduction
to Algorithms, 2e)

/* Exercise 2.2-2, page 29, Selection Sort */

#include<stdio.h>

enum { SIZE_ARR = 6 };

void selectionSort(char s[], int len);
int findSmallest(char s[], const int idx_cur, const int len);

The consts on those parameters don't do much, but aren't actually
wrong.

yep.

int main(void)
{
char arr[SIZE_ARR+1] = {'a', 'r', 'n', 'u', 'l', 'd'};

printf("%s\n", arr);
selectionSort(arr, SIZE_ARR);
printf("%s\n", arr);

return 0;
}

void selectionSort(char s[], int len)
{
int i, idx;
char tmp;

Schools of thought vary, but some prefer variables that are only
going to be used in a small scope to be declared within that scope.
for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);

so that could be
int idx = findSmallest(s, i, len);

however, this will not work on compilers which don't support C99, where
declarations have to be at the start of a block and can only be
initialized to compile-time constants.

IMO, it is better to stick with C89/C90 conventions unless there is a
good reason to do otherwise, since then one can build, say, on Windows
while using a certain Microsoft compiler (much like how one avoids MSVC
or GCC specific extensions except when necessary, and typically guarded
with #ifdef's).

/* printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;

and that could be
char tmp = s;

same issue...

s = s[idx];
s[idx] = tmp;

Note that the swap is only necessary if(i != idx).

yep.

}
}

int findSmallest(char s[], const int idx_cur, const int len)
{
int j, ret_index;
char smallest = s[idx_cur];
ret_index = idx_cur;

for(j = idx_cur+1; j != len; ++j)
{
/* printf("idx_cur = %d, j = %d\n", idx_cur, j); */
if(smallest> s[j])

Again, it's a matter of taste, but some prefer the object
you're testing to be first in the comparison. And here you're
testing s[j]. So:

if(s[j]< smallest)
{
smallest = s[j];
ret_index = j;
}
}

return ret_index;
}

The sort looks fine. You might want to write a simple test
harness to test arbitrary inputs.

yep.

typically, if I were writing it I would have done it all inline, for
example:
for(i=0; i<len; i++)
for(j=i+1; j<len; j++)
if(str[j]<str)
{ tmp=str; str=str[j]; str[j]=tmp; }

or maybe this:
stre=str+len;
for(s=str; s<stre; s++) for(t=s+1; t<stre; t++)
if(*t<*s) { tmp=*s; *s=*t; *t=tmp; }

but, again, this is also a matter of personal preference...

granted, this would usually be because if I am resorting to selection
sort, it means:
I want a fairly simple / easy-to-type sort;
I don't really care whether or not it performs all that well.

or such...

I

#### Ian Collins

arnuld said:
void selectionSort(char s[], int len)
{
int i, idx;
char tmp;

Schools of thought vary, but some prefer variables that are only
going to be used in a small scope to be declared within that scope.
for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);

so that could be
int idx = findSmallest(s, i, len);

however, this will not work on compilers which don't support C99, where
declarations have to be at the start of a block and can only be
initialized to compile-time constants.

Not so, the above is perfectly legal old C. You are confusing file
scoped and function scoped variables, where the same rules apply to both
old C and C99.
/* printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;

and that could be
char tmp = s;

same issue...

Same non-issue!

B

#### BGB

void selectionSort(char s[], int len)
{
int i, idx;
char tmp;

Schools of thought vary, but some prefer variables that are only
going to be used in a small scope to be declared within that scope.

for(i = 0; i != len; ++i)
{
idx = findSmallest(s, i, len);

so that could be
int idx = findSmallest(s, i, len);

however, this will not work on compilers which don't support C99, where
declarations have to be at the start of a block and can only be
initialized to compile-time constants.

Not so, the above is perfectly legal old C. You are confusing file
scoped and function scoped variables, where the same rules apply to both
old C and C99.

interesting, I had thought all initializers had to be constant in C89,
oh well...

/* printf("i = %d, Smallest Index = %d, s[%d] = %c\n", i, idx,
idx, s[idx]); */
tmp = s;

and that could be
char tmp = s;

same issue...

Same non-issue!

A

#### Alan Curry

interesting, I had thought all initializers had to be constant in C89,
oh well...

The subtle difference that you're half-remembering is that C89 required
aggregate initializers to be constant (resolvable at link-time).

int get_an_int(void);

void foo(void)
{
int item0 = get_an_int(), item1 = get_an_int(); /* valid C89 and C99 */
int item[] = { get_an_int(), get_an_int() }; /* valid C99 only */
}

A

#### arnuld

That loop is too long.

To sort an array of (N) elements,
you only have to sort the first (N - 1) elements.

I wonder how it happens but I will use this

For an array of (N) elements
you could decrease the number of swaps to less than (N), and you could
also lighten the load on the findSmallest() function.

void selectionSort(char s[], int len) {
int i, idx;
char tmp;

for(i = 0; i != len - 1; ++i) {
idx = findSmallest(s, i + 1, len);
if (s > s[idx]) {
tmp = s;
s = s[idx];
s[idx] = tmp;
}
}
}

This was good. i don't understand why these techniques don't come to my
mind:

(1) to sort N elements you need to sort only N-1 elements
(2) i and i != idx replaced by i+1 and s > s[idx]

Is there some special kind of practice/study needed to think like that ?
(I have just started Cormen et al. )

A

#### Angel

This was good. i don't understand why these techniques don't come to my
mind:

(1) to sort N elements you need to sort only N-1 elements
(2) i and i != idx replaced by i+1 and s > s[idx]

Is there some special kind of practice/study needed to think like that ?
(I have just started Cormen et al. )

Knowing the basics of algebra and formal Boolean logic helps a great deal,
IMHO. But other than that, it's just experience and practice. Much like
learning a human language, actually. You can learn the basics in school
but the only way to speak it fluently is to use it for real.

Or, I suppose, you could be "Apsie" to whom this sort of thinking comes
naturally. At least, that's how things are for me. ;-)

P

#### Phil Carmody

arnuld said:
I wonder how it happens but I will use this

It's perhaps better to say that in order to sort N elements
you only need to place N-1 of them in the correct positions.
There's nowhere else for the N-th element to go apart from
its correct position. However, you are actually still sorting
all N items.

Phil