logical correctness?

S

Simon Biber

I would appreciate some critique of the code below. I have attempted to
avoid the undefined behaviour that would happen if pointers went off the
beginning of the arrays. The boolean logic is rather complex, though,
and I'm hoping it can be simplified.

I have written a "merge" function, which takes two sorted arrays. The
former of which must have enough extra space at the end to accomodate
the elements of the latter. Keeping array a sorted, it merges the
contents of array b into array a, in O(n) time.

The idea came from kitts' post "how can i do this in o(n)", and the
algorithm from Richard Heathfield's reply. I have not posted it in that
thread because this is a new question that I am asking myself, not a
reply intended for kitts.

#include <stdio.h>
#include <string.h>
#include <assert.h>
#define N(arr) (sizeof (arr) / sizeof *(arr))


void merge(void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compar)(const void *, const void *))
{
char *a = dest;
char *b = src;
char *ap = a + (dest_n - 1) * size;
char *bp = b + ( src_n - 1) * size;
char *cp = a + (dest_n + src_n - 1) * size;

while(cp)
{
if((ap && !bp) || ((ap || !bp) && compar(ap, bp) == 1))
{
printf("moving a[%d] to a[%d]\n", (ap-a)/size, (cp-a)/size);
memcpy(cp, ap, size);
if(ap > a) ap -= size; else ap = NULL;
}
else
{
printf("moving b[%d] to a[%d]\n", (bp-b)/size, (cp-a)/size);
memcpy(cp, bp, size);
if(bp > b) bp -= size; else bp = NULL;
}
if(cp > a) cp -= size; else cp = NULL;
}
}

int compare_int(const void *va, const void *vb)
{
const int *a = va;
const int *b = vb;
if(!a) { printf("Whoops, A was null!\n"); return 0; }
if(!b) { printf("Whoops, B was null!\n"); return 0; }
return *a < *b ? -1 : *a > *b;
}

int main(void)
{
int a[11] = {1, 3, 5, 7};
int b[7] = {2, 4, 6, 8, 10, 12, 14};
size_t i;

merge(a, 4, b, 7, sizeof(int), compare_int);

for(i = 0; i < N(a); i++) printf("%d ", a);
putchar('\n');
return 0;
}
 
T

Tim Rentsch

Simon Biber said:
I would appreciate some critique of the code below. I have attempted to
avoid the undefined behaviour that would happen if pointers went off the
beginning of the arrays. The boolean logic is rather complex, though,
and I'm hoping it can be simplified.

I have written a "merge" function, which takes two sorted arrays. The
former of which must have enough extra space at the end to accomodate
the elements of the latter. Keeping array a sorted, it merges the
contents of array b into array a, in O(n) time.
[snip]

void merge(void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compar)(const void *, const void *))
{
char *a = dest;
char *b = src;
char *ap = a + (dest_n - 1) * size;
char *bp = b + ( src_n - 1) * size;
char *cp = a + (dest_n + src_n - 1) * size;

while(cp)
{
if((ap && !bp) || ((ap || !bp) && compar(ap, bp) == 1))
{
printf("moving a[%d] to a[%d]\n", (ap-a)/size, (cp-a)/size);
memcpy(cp, ap, size);
if(ap > a) ap -= size; else ap = NULL;
}
else
{
printf("moving b[%d] to a[%d]\n", (bp-b)/size, (cp-a)/size);
memcpy(cp, bp, size);
if(bp > b) bp -= size; else bp = NULL;
}
if(cp > a) cp -= size; else cp = NULL;
}
}

The logic here may be ok. The way the loop tests are
done, with setting the pointers to NULL, seems complicated
to me. How about this writing instead? (Please ignore
formatting/layout differences, I'm not making a style
comment, just using a style that's more like what I'm
used to.)

void
merge(
void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compare)(const void *, const void *)
){
char *a = dest;
char *b = src;
char *ap = a + (dest_n ) * size;
char *bp = b + ( src_n) * size;
char *cp = a + (dest_n + src_n) * size;
char *p;

while( ap > a && bp > b ){
if( compare( ap - size, bp - size ) == 1 ) p = ap -= size;
else p = bp -= size;
memcpy( cp -= size, p, size );
}

while( bp > b ){
memcpy( cp -= size, bp -= size, size );
}
}

Disclaimer: I haven't compiled the above, just typed it in.

In the code above, it's easy (for me at least) to reason my way
through what the code is doing. The invariants seem simple and
easy to state. It's clear that each loop iteration has something
to copy, that exactly one element is copied in each iteration,
and after both loops are done that bp == b. So I have confidence
both in the code and in my understanding of it.

So that's my suggestion, for what it's worth.
 
A

August Karlstrom

Tim said:
The logic here may be ok. The way the loop tests are
done, with setting the pointers to NULL, seems complicated
to me. How about this writing instead? (Please ignore
formatting/layout differences, I'm not making a style
comment, just using a style that's more like what I'm
used to.)

Your code layout is very unique, that's for sure.
void
merge(
void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compare)(const void *, const void *)
){
char *a = dest;
char *b = src;
char *ap = a + (dest_n ) * size;
char *bp = b + ( src_n) * size;
char *cp = a + (dest_n + src_n) * size;
char *p;

while( ap > a && bp > b ){
if( compare( ap - size, bp - size ) == 1 ) p = ap -= size;
else p = bp -= size;
memcpy( cp -= size, p, size );
}

while( bp > b ){
memcpy( cp -= size, bp -= size, size );
}
}


August
 
P

pete

Tim said:
Simon Biber said:
I would appreciate some critique of the code below. I have attempted to
avoid the undefined behaviour that would happen if pointers went off the
beginning of the arrays. The boolean logic is rather complex, though,
and I'm hoping it can be simplified.

I have written a "merge" function, which takes two sorted arrays. The
former of which must have enough extra space at the end to accomodate
the elements of the latter. Keeping array a sorted, it merges the
contents of array b into array a, in O(n) time.
[snip]

void merge(void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compar)(const void *, const void *))
{
char *a = dest;
char *b = src;
char *ap = a + (dest_n - 1) * size;
char *bp = b + ( src_n - 1) * size;
char *cp = a + (dest_n + src_n - 1) * size;

while(cp)
{
if((ap && !bp) || ((ap || !bp) && compar(ap, bp) == 1))
{
printf("moving a[%d] to a[%d]\n", (ap-a)/size, (cp-a)/size);
memcpy(cp, ap, size);
if(ap > a) ap -= size; else ap = NULL;
}
else
{
printf("moving b[%d] to a[%d]\n", (bp-b)/size, (cp-a)/size);
memcpy(cp, bp, size);
if(bp > b) bp -= size; else bp = NULL;
}
if(cp > a) cp -= size; else cp = NULL;
}
}

The logic here may be ok. The way the loop tests are
done, with setting the pointers to NULL, seems complicated
to me. How about this writing instead? (Please ignore
formatting/layout differences, I'm not making a style
comment, just using a style that's more like what I'm
used to.)

void
merge(
void *dest,
size_t dest_n,
void *src,
size_t src_n,
size_t size,
int (*compare)(const void *, const void *)
){
char *a = dest;
char *b = src;
char *ap = a + (dest_n ) * size;
char *bp = b + ( src_n) * size;
char *cp = a + (dest_n + src_n) * size;
char *p;

while( ap > a && bp > b ){
if( compare( ap - size, bp - size ) == 1 )

I think it would be more better to have
( compare( ap - size, bp - size ) > 0 )
in case you want to use a qsort/bsearch style compar function.
 
T

Tim Rentsch

pete said:
Tim Rentsch wrote: [snip]
while( ap > a && bp > b ){
if( compare( ap - size, bp - size ) == 1 )

I think it would be more better to have
( compare( ap - size, bp - size ) > 0 )
in case you want to use a qsort/bsearch style compar function.

Yeah... The OP had this equality comparison to 1 and I just
copied that, but comparing greater than 0 is usually better
style in cases like this one.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,434
Messages
2,571,690
Members
48,796
Latest member
Greg L.

Latest Threads

Top