# strange problem of sorting

Discussion in 'C++' started by abracadabra, Jul 6, 2007.

I am reading an old book - Programming Pearls 2nd edition recently. It
says, "Even though the general C++ program uses 50 times the memory
and CPU time of the specialized C program, it requires just half the
code and is much easier to extend to other problems." in a sorting
problem of the very first chapter.

I modified the codes in the answer part, ran it and found it is almost
the contrary case. The qsortints.c is the C code that uses the qsort
function defined in stdlib.h. The sortints.cpp uses STL sort. The
bitsort.c uses the method of bitmap in the book. To my surprise, the
bitmap method is slowest! The qsort method slower, the sort fastest.
It is totally different from what is said in the book!

I repeated the experiment using Visual Studio 2005 on a Windows XP
machine, and mingw32(gcc 4.2.0), the result is the same:
sort>qsort>bitmap.

Here goes the code.

/* qsortints.c -- Sort input set of integers using qsort */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define iMax 2000000
#define iSize 1000000

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a + ( 32767*myrand() + myrand()) % (b + 1 - a);
}

int intcomp(int *x, int *y)
{
return *x - *y;
}

int a[iMax];

int main()
{
int i, p, t;
clock_t t_begin, t_end;

srand((unsigned) time(NULL));

for (i = 0; i < iSize+2000; i++)
a = i;

t_begin = clock();
for (i = 0; i < iSize; i++)
{
p = randint(i, iSize-1);
t = a[p]; a[p] = a; a = t;
}
qsort(a, iSize, sizeof(int), intcomp);
t_end = clock();
printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);

/*system("pause");

for (i = 0; i < iSize; i++)
printf("%d ", a);*/

return 0;
}
/* Output is 1.270000 on my SuSE 10 with gcc -O3 */
/* End of qsortints.c */

/* sortints.cpp -- Sort input set of integers using STL set */
#include <iostream>
#include <ctime>
#include <algorithm>

using namespace std;

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a + ( 32767*myrand() + myrand()) % (b + 1 - a);
}

const int iMax = 2000000;
const int iSize = 1000000;

int a[iMax];

int main()
{
int i, p, t;
clock_t t_begin, t_end;
srand((unsigned) time(NULL));

for (i = 0; i < iSize+2000; i++)
a = i;

t_begin = clock();
for (i = 0; i < iSize; i++)
{
p = randint(i, iSize-1);
t = a[p]; a[p] = a; a = t;
}
sort(a, a+iSize-1);
t_end = clock();
cout << (t_end - t_begin)/1.0/CLOCKS_PER_SEC << endl;

/*system("pause");

for (i = 0; i < iSize; i++)
cout << a << " ";*/
}
/* Output is 0.38 on SuSE */
/* End of sortints.cpp */

/* bitsort.c Sort input set of integers using bitmap */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BITSPERWORD 32
#define SHIFT 5
#define N 10000000
int a[1 + N/BITSPERWORD];

void set(int i)
{
}

void clr(int i)
{
}

int test(int i)
{
return a[i>>SHIFT] & (1<<(i & MASK));
}

int myrand()
{
return rand()%32767;
}

int randint(int a, int b)
{
return a+(32767*myrand()+myrand())%(b+1-a);
}

int main()
{
int i;
clock_t t_begin, t_end;

srand((unsigned)time(NULL));

for (i = 0; i < N; i++)
clr(i);

t_begin = clock();
for (i = 0; i< N-2000; i++)
set(randint(i, N-1));
t_end = clock();
printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);

/*for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);*/

return 0;
}
/* Output is 1.45000 on SuSE */
/* End of bitsort.c */

Any suggestions?

2. ### osmiumGuest

>I am reading an old book - Programming Pearls 2nd edition recently. It
> says, "Even though the general C++ program uses 50 times the memory
> and CPU time of the specialized C program, it requires just half the
> code and is much easier to extend to other problems." in a sorting
> problem of the very first chapter.
>
> I modified the codes in the answer part, ran it and found it is almost
> the contrary case. The qsortints.c is the C code that uses the qsort
> function defined in stdlib.h. The sortints.cpp uses STL sort. The
> bitsort.c uses the method of bitmap in the book. To my surprise, the
> bitmap method is slowest! The qsort method slower, the sort fastest.
> It is totally different from what is said in the book!
>
> I repeated the experiment using Visual Studio 2005 on a Windows XP
> machine, and mingw32(gcc 4.2.0), the result is the same:
> sort>qsort>bitmap.
>
> Here goes the code.
>
> /* qsortints.c -- Sort input set of integers using qsort */
> #include <stdio.h>

massive snippage

> printf("%d\n", i);*/
>
> return 0;
> }
> /* Output is 1.45000 on SuSE */
> /* End of bitsort.c */
>
> Any suggestions?

My first question would be, how long did the three methods take in MKS
units? (clocks/sec in not an MKS unit). I think you may be trying to
extract too much information from a small sample. ISTM the author's point
was simply that STL sort was easier to use than qsort. Given a programmer
with sufficient background, I guess that is marginally true.

osmium, Jul 6, 2007

3. ### =?ISO-8859-1?Q?Erik_Wikstr=F6m?=Guest

> I am reading an old book - Programming Pearls 2nd edition recently. It
> says, "Even though the general C++ program uses 50 times the memory
> and CPU time of the specialized C program, it requires just half the
> code and is much easier to extend to other problems." in a sorting
> problem of the very first chapter.
>
> I modified the codes in the answer part, ran it and found it is almost
> the contrary case. The qsortints.c is the C code that uses the qsort
> function defined in stdlib.h. The sortints.cpp uses STL sort. The
> bitsort.c uses the method of bitmap in the book. To my surprise, the
> bitmap method is slowest! The qsort method slower, the sort fastest.
> It is totally different from what is said in the book!
>
> I repeated the experiment using Visual Studio 2005 on a Windows XP
> machine, and mingw32(gcc 4.2.0), the result is the same:
> sort>qsort>bitmap.
>
> Here goes the code.

[snip]

> Any suggestions?

The book is outdated?

It's really no surprise that std::sort is faster than qsort, it's much
easier to optimise and (to my understanding) often uses introspective
sort, which has slightly better worst case performance than quicksort
(though there's probably nothing stopping someone to use the same
algorithm in qsort).

--
Erik Wikström

=?ISO-8859-1?Q?Erik_Wikstr=F6m?=, Jul 6, 2007
4. ### Marcus KwokGuest

Erik WikstrÃ¶m <> wrote:
> It's really no surprise that std::sort is faster than qsort, it's much
> easier to optimise

Yes, probably due to it being much easier to inline the comparison
function in std::sort() than in qsort().

--
Marcus Kwok
Replace 'invalid' with 'net' to reply

Marcus Kwok, Jul 6, 2007

On Jul 6, 11:23 pm, Erik Wikström <> wrote:
> On 2007-07-06 16:22, abracadabra wrote:
>
>
>
> > I am reading an old book - Programming Pearls 2nd edition recently. It
> > says, "Even though the general C++ program uses 50 times the memory
> > and CPU time of the specialized C program, it requires just half the
> > code and is much easier to extend to other problems." in a sorting
> > problem of the very first chapter.

>
> > I modified the codes in the answer part, ran it and found it is almost
> > the contrary case. The qsortints.c is the C code that uses the qsort
> > function defined in stdlib.h. The sortints.cpp uses STL sort. The
> > bitsort.c uses the method of bitmap in the book. To my surprise, the
> > bitmap method is slowest! The qsort method slower, the sort fastest.
> > It is totally different from what is said in the book!

>
> > I repeated the experiment using Visual Studio 2005 on a Windows XP
> > machine, and mingw32(gcc 4.2.0), the result is the same:
> > sort>qsort>bitmap.

>
> > Here goes the code.

>
> [snip]
>
> > Any suggestions?

>
> The book is outdated?
>
> It's really no surprise that std::sort is faster than qsort, it's much
> easier to optimise and (to my understanding) often uses introspective
> sort, which has slightly better worst case performance than quicksort
> (though there's probably nothing stopping someone to use the same
> algorithm in qsort).
>
> --
> Erik Wikström

Ah, yes. I typed a gcc -pg, and gprof. I saw lots of introsort
something in the output. But I don't know why the bitmap method is the
slowest. The book referred it as the fastest.

6. ### Greg HerlihyGuest

On Jul 6, 7:22 am, abracadabra <> wrote:
> I am reading an old book - Programming Pearls 2nd edition recently. It
> says, "Even though the general C++ program uses 50 times the memory
> and CPU time of the specialized C program, it requires just half the
> code and is much easier to extend to other problems." in a sorting
> problem of the very first chapter.
>
> I modified the codes in the answer part, ran it and found it is almost
> the contrary case. The qsortints.c is the C code that uses the qsort
> function defined in stdlib.h. The sortints.cpp uses STL sort. The
> bitsort.c uses the method of bitmap in the book. To my surprise, the
> bitmap method is slowest! The qsort method slower, the sort fastest.
> It is totally different from what is said in the book!
>
> I repeated the experiment using Visual Studio 2005 on a Windows XP
> machine, and mingw32(gcc 4.2.0), the result is the same:
> sort>qsort>bitmap.
>
> Here goes the code.
>
> /* sortints.cpp -- Sort input set of integers using STL set */
> #include <iostream>
> #include <ctime>
> #include <algorithm>
>
> using namespace std;

....
> int a[iMax];
>
> int main()
> {
> int i, p, t;
> clock_t t_begin, t_end;
> srand((unsigned) time(NULL));
>
> for (i = 0; i < iSize+2000; i++)
> a = i;
>
> t_begin = clock();
> for (i = 0; i < iSize; i++)
> {
> p = randint(i, iSize-1);
> t = a[p]; a[p] = a; a = t;
> }
> sort(a, a+iSize-1);
> t_end = clock();

....
> /* End of sortints.cpp */
>
> /* bitsort.c Sort input set of integers using bitmap */

....
> int main()
> {
> int i;
> clock_t t_begin, t_end;
>
> srand((unsigned)time(NULL));
>
> for (i = 0; i < N; i++)
> clr(i);
>
> t_begin = clock();
> for (i = 0; i< N-2000; i++)
> set(randint(i, N-1));
> t_end = clock();
> printf("%f\n", (t_end - t_begin)/1.0/CLOCKS_PER_SEC);
>
> /*for (i = 0; i < N; i++)
> if (test(i))
> printf("%d\n", i);*/
>
> return 0;}
>
> /* Output is 1.45000 on SuSE */
> /* End of bitsort.c */
>
> Any suggestions?

These programs are including the time taken to generate and fill in
the array of ints - and not just the actual time spent to sort the
array; whereas a more accurate measurement of sorting performance
would exclude this kind of preparation time. Note that in the case of
the bit sort, it will be necessary to allocate an array for the
generated ints, since the current version both generates the int
values and "sorts" them at the same time.

I found that once the preparation time was excluded from the sorting
timings, that the bit sort program was about twice as fast as the STL
program in sorting the int values. Note that this outcome is not at
all surprising - since the bit sort is actually just flipping bits in
a bit vector and is not "sorting" the int values - at least not in the
way that most people usually think of sorting..

Greg

Greg Herlihy, Jul 7, 2007

Because in the bitmap method, sorting and setting the integers are the
same step, I had to add the randint function into the time counting
for three method. I guess it might be the bit set operation that takes
a long time.

Because in the bitmap method, sorting and setting the integers are the
same step, I had to add the randint function into the time counting
for three method. I guess it might be the bit set operation that takes
a long time.