Any undefined behavior???

A

Alan Schroeder

The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.


#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)
{
int index;
float *fptr;
float minValue;

index = -1;
fptr = bsearch(value, data, size, width, compare_window);
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {
index--;
}
if (index) index++; /* We may have gone one too far */
}
return index;
}

/*
* NOTE:
* **ONLY** use this function to sort an array of floats; it won't work
* as expected if searching for a arbitrary value.
*/
int
compare_exact(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;

if (delta < 0.0) return -1 ;
if (delta > 0.0) return 1;
return 0;
}

/*
* This function can be use with 'bsearch' to find a float value within
* a given +/- window
*/
int
compare_window(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;
if (fabs(delta) <= window) return 0;

if (delta < 0.0) return -1;
if (delta > 0.0) return 1;
/* NOTE: We should **NEVER** get here */
return 0;
}


Thanks,
ALS
 
T

Tim Prince

Alan said:
The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.

It doesn't compile with any of 3 versions of gcc on my PCC. I suppose
that qualifies as no undefined behavior.
 
E

Eric Sosman

Alan said:
The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.

I see only one probable blunder, but there are a few other
peculiarities, too. See below.
#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/

"When the comments and the code disagree, both are
probably wrong." Maybe not in this case; I think it's
likely that the comment is just garbled and that the code
works as intended. Still, it's worth examining.

(My objection is that if window is non-negative, as
suggested by the way compare_window() is written, then
the condition value+window<=data[index] implies the condition
data[index]>=value-window. The search actually looks for a
value satisfying a pair of conditions that aren't redundant,
so there's something wrong here.)
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)
{
int index;
float *fptr;
float minValue;

index = -1;
fptr = bsearch(value, data, size, width, compare_window);
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {
index--;
}

This looks weird. I can imagine three circumstances:

- width is always sizeof(float), and all the elements of
the data array are searched equally. If this is the case,
width should not be a function argument; what happens if
some caller supplies width=3?

- width is a multiple of sizeof(float), and bsearch looks
only at the first of each group of N=width/sizeof(float)
elements. (Maybe data really contains N-place vectors
strung end-to-end.) If so, then you should be stepping
index by N, not by 1. Also, it would be better to accept
the width argument as N rather than as N*sizeof(float),
and make the adjustment in the bsearch call -- again,
think about the caller providing width=42.

- width is N*sizeof(float) as above, but the intermediate
un-bsearched values are "bunched" within window of the
bsearched values. That's a peculiar enough arrangement
that it deserves a big fat comment.
> if (index) index++; /* We may have gone one too far */

Here's what I think is the blunder. If you bsearch for a
value that is exactly window greater than one of the array
elements, bsearch may find that array element. ("May," not
"will," because it could find a nearby element if they're
spaced less than window apart.) Then the while loop will not
iterate, and index will be as it was when calculated from the
bsearch result -- and then if index>0 you'll increment it.
This will violate the "lowest index" part of the function's
contract, and may also violate the range conditions (if the
original data[index]==value-window, then data[++index] might
equal value+1000*window.) If bsearch finds the topmost array
element, index is computed as size-1 and perhaps incremented
to size, which smells very much like an off-the-end error.
(If the "active" part of the data array is supposed to be
followed by one or more "inactive" values, that'a another
special situation deserving of a comment.)
}
return index;
}

/*
* NOTE:
* **ONLY** use this function to sort an array of floats; it won't work
* as expected if searching for a arbitrary value.
*/
int
compare_exact(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;

Possible overflow here if the array contains large positive
and large negative values. Possible loss of significance, which
I think is probably harmless in this setting (but which I haven't
studied thoroughly).
if (delta < 0.0) return -1 ;
if (delta > 0.0) return 1;
return 0;

Consider avoiding all those worries about overflow and
significance loss by sticking to the comparison operators and
getting rid of the subtraction. You could write simply

if (*f1 < *f2) return -1;
if (*f1 > *f2) return +1;
return 0;

or you might opt for a one-liner only a C addict could love

return (*f1 > *f2) - (*f1 < *f2);
}

/*
* This function can be use with 'bsearch' to find a float value within
* a given +/- window
*/
int
compare_window(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;

Possible overflow, as above.
if (fabs(delta) <= window) return 0;

if (delta < 0.0) return -1;
if (delta > 0.0) return 1;
/* NOTE: We should **NEVER** get here */

If you're so certain, call abort(). ;-)

Better, whenever the arrangement of your tests produces
a branch that cannot be taken, it's usually a sign that there
is something amiss with the branching logic. In this case,
you've observed that one of the tests above must fire, which
means there's no reason to make the third test: the fact that
the first two didn't fire makes the result of the third a
foregone conclusion, right?

Consider rearranging the tests, perhaps along these lines:

if (delta < -window) return -1;
if (delta > +window) return +1;
return 0;

or the C addict's methadone

return (*f1 >= *f2-window) - (*f1 <= *f2+window);
return 0;
}

If I were writing this, I think I'd just forget about
bsearch and write my own binary search in open code. Yes,
they'd both be "binary search" -- but bsearch is a binary
search specialized for strict equality, which isn't what
you want. You're not guilty of using a hammer to drive
screws, but you may be using a tack-hammer when you ought
to use a ball-peen hammer.
 
P

pete

Eric said:
Alan said:
The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.

I see only one probable blunder, but there are a few other
peculiarities, too. See below.
#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/

"When the comments and the code disagree, both are
probably wrong." Maybe not in this case; I think it's
likely that the comment is just garbled and that the code
works as intended. Still, it's worth examining.

(My objection is that if window is non-negative, as
suggested by the way compare_window() is written, then
the condition value+window<=data[index] implies the condition
data[index]>=value-window. The search actually looks for a
value satisfying a pair of conditions that aren't redundant,
so there's something wrong here.)
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)
{
int index;
float *fptr;
float minValue;

index = -1;
fptr = bsearch(value, data, size, width, compare_window);
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {
index--;
}

This looks weird. I can imagine three circumstances:

- width is always sizeof(float), and all the elements of
the data array are searched equally. If this is the case,
width should not be a function argument; what happens if
some caller supplies width=3?

- width is a multiple of sizeof(float), and bsearch looks
only at the first of each group of N=width/sizeof(float)
elements. (Maybe data really contains N-place vectors
strung end-to-end.) If so, then you should be stepping
index by N, not by 1. Also, it would be better to accept
the width argument as N rather than as N*sizeof(float),
and make the adjustment in the bsearch call -- again,
think about the caller providing width=42.

- width is N*sizeof(float) as above, but the intermediate
un-bsearched values are "bunched" within window of the
bsearched values. That's a peculiar enough arrangement
that it deserves a big fat comment.
if (index) index++; /* We may have gone one too far */

Here's what I think is the blunder. If you bsearch for a
value that is exactly window greater than one of the array
elements, bsearch may find that array element. ("May," not
"will," because it could find a nearby element if they're
spaced less than window apart.) Then the while loop will not
iterate, and index will be as it was when calculated from the
bsearch result -- and then if index>0 you'll increment it.
This will violate the "lowest index" part of the function's
contract, and may also violate the range conditions (if the
original data[index]==value-window, then data[++index] might
equal value+1000*window.) If bsearch finds the topmost array
element, index is computed as size-1 and perhaps incremented
to size, which smells very much like an off-the-end error.
(If the "active" part of the data array is supposed to be
followed by one or more "inactive" values, that'a another
special situation deserving of a comment.)
}
return index;
}

/*
* NOTE:
* **ONLY** use this function to sort an array of floats; it won't work
* as expected if searching for a arbitrary value.
*/
int
compare_exact(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;

Possible overflow here if the array contains large positive
and large negative values. Possible loss of significance, which
I think is probably harmless in this setting (but which I haven't
studied thoroughly).
if (delta < 0.0) return -1 ;
if (delta > 0.0) return 1;
return 0;

Consider avoiding all those worries about overflow and
significance loss by sticking to the comparison operators and
getting rid of the subtraction. You could write simply

if (*f1 < *f2) return -1;
if (*f1 > *f2) return +1;
return 0;

or you might opt for a one-liner only a C addict could love

return (*f1 > *f2) - (*f1 < *f2);

or my favorite:

return *f2 > *f1 ? -1 : *f2 != *f1;
 
B

Barry Schwarz

The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.


#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)
{
int index;
float *fptr;
float minValue;

index = -1;
fptr = bsearch(value, data, size, width, compare_window);

compare_window is not declared at this point.
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {

If multiple entries have the value minValue, you stop too soon (or the
comment about >= is incorrect).
index--;
}
if (index) index++; /* We may have gone one too far */
}
return index;
}
snip


Remove del for email
 
E

ena8t8si

Alan said:
The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.


#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)

If what you want is: given window >= 0, find least index so that
*value-window <= data[index] && data[index] <= *value+window
(with entries in data being in non-decreasing order)

{
double vm = *value - window;
double vp = *value + window;
int a, b, c;
assert( window>=0 );
if (width<=0) return -1;

for ( a = -1, c = width-1; a+1 < c; ) {
b = (a+c)/2;
if (data < vm) a = b;
else c = b;
}
assert( a == -1 || data[a] < vm);
assert( c == width-1 || vm <= data[c] );
if (vm <= data[c] && data[c] <= vp) return c;
return -1;
}


If what you want is: given window >= 0, find least index so that
*value-window <= data[index] && *value+window <= data[index]
(with entries in data being in non-decreasing order)

{
double vp = *value + window;
int a, b, c;
assert( window>=0 );
if (width<=0) return -1;

for ( a = -1, c = width-1; a+1 < c; ) {
b = (a+c)/2;
if (data < vp) a = b;
else c = b;
}
assert( a == -1 || data[a] < vp);
assert( c == width-1 || vp <= data[c] );
assert( *value - window <= vp );
if (vp <= data[c]) return c;
return -1;
}
 
B

Ben C

The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior but
would like another set of eyes to check.


#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which contains
* a value such that data[index]>=value-window, or -1 if a value can not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/
int
find_first_index(const float *value, const float *data, size_t size, size_t
width)
{


index = -1;
fptr = bsearch(value, data, size, width, compare_window);

I know you didn't ask for style comments, only about undefined
behaviour, but I think this would be clearer like this:

int
find_first_index(float value, const float *data, size_t n)
{
...
fptr = bsearch(&value, data, n, sizeof *data, compare_window);
}

We're always searching a float array here, so having the "width"
parameter just seems to make life unnecessarily difficult for the
caller. It's not telling us anything we don't know.

The const float * parameter also looks at first glance like an array
(why else would a const pointer to a builtin type be in an argument
list?). bsearch takes a pointer because it's generic, but you can just
take the address at the point of calling it.
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {
index--;
}
if (index) index++; /* We may have gone one too far */
}
return index;

I think here you could use ptrdiff_t for index instead of int. After all
you're using size_t for the array lengths. But there's nothing undefined
about using int (as far as I know).
int
compare_exact(const void *e1, const void *e2)
{
const float *f1 = e1;
const float *f2 = e2;
float delta;

delta = *f1 - *f2;

if (delta < 0.0) return -1 ;
if (delta > 0.0) return 1;

This is not undefined, but you might want to put 0.0f instead of 0.0 if
you anticipate porting to a machine with fast floats but software double
precision routines.
 
A

Alan Schroeder

Eric Sosman said:
Alan said:
The following code produces the expected results on a PC using gcc, but I
need to port it (or least something similar) to a different
platform/compiler. I don't think I've introduced any undefined behavior
but would like another set of eyes to check.

I see only one probable blunder, but there are a few other
peculiarities, too. See below.
#include <math.h>
#include <stdlib.h>

extern float window;

/*
* NOTE:
* This function will return the lowest index into 'data' which
contains
* a value such that data[index]>=value-window, or -1 if a value can
not
* be found such that data[index]>=value-window and
value+window<=data[index].
*/

"When the comments and the code disagree, both are
probably wrong." Maybe not in this case; I think it's
likely that the comment is just garbled and that the code
works as intended. Still, it's worth examining.

(My objection is that if window is non-negative, as
suggested by the way compare_window() is written, then
the condition value+window<=data[index] implies the condition
data[index]>=value-window. The search actually looks for a
value satisfying a pair of conditions that aren't redundant,
so there's something wrong here.)

It's probaly the comment. I didn't think I had stated the intention
correctly. I'll get the comment right.
int
find_first_index(const float *value, const float *data, size_t size,
size_t width)
{
int index;
float *fptr;
float minValue;

index = -1;
fptr = bsearch(value, data, size, width, compare_window);
if (fptr != NULL) {
minValue = *value - window;
index = (int)(fptr-data);
while (index > 0 && data[index] > minValue) {
index--;
}

This looks weird. I can imagine three circumstances:

- width is always sizeof(float), and all the elements of
the data array are searched equally. If this is the case,
width should not be a function argument; what happens if
some caller supplies width=3?

- width is a multiple of sizeof(float), and bsearch looks
only at the first of each group of N=width/sizeof(float)
elements. (Maybe data really contains N-place vectors
strung end-to-end.) If so, then you should be stepping
index by N, not by 1. Also, it would be better to accept
the width argument as N rather than as N*sizeof(float),
and make the adjustment in the bsearch call -- again,
think about the caller providing width=42.

- width is N*sizeof(float) as above, but the intermediate
un-bsearched values are "bunched" within window of the
bsearched values. That's a peculiar enough arrangement
that it deserves a big fat comment.

I simplified the code to just use an array of floats. The actual data will
be different structs with the first element of each struct a float. You're
probably right and I'll comment the code better.
if (index) index++; /* We may have gone one too far */

Here's what I think is the blunder. If you bsearch for a
value that is exactly window greater than one of the array
elements, bsearch may find that array element. ("May," not
"will," because it could find a nearby element if they're
spaced less than window apart.) Then the while loop will not
iterate, and index will be as it was when calculated from the
bsearch result -- and then if index>0 you'll increment it.
This will violate the "lowest index" part of the function's
contract, and may also violate the range conditions (if the
original data[index]==value-window, then data[++index] might
equal value+1000*window.) If bsearch finds the topmost array
element, index is computed as size-1 and perhaps incremented
to size, which smells very much like an off-the-end error.
(If the "active" part of the data array is supposed to be
followed by one or more "inactive" values, that'a another
special situation deserving of a comment.)

That's another area I had problems with, if data[index]==value-window,
should I return index or index-1. I guess I'll have to check and get a
little better requirement definition. My guess is it's better to return a
range too large and eliminate the extra data later than to miss a possible
match.
Possible overflow here if the array contains large positive
and large negative values. Possible loss of significance, which
I think is probably harmless in this setting (but which I haven't
studied thoroughly).


Consider avoiding all those worries about overflow and
significance loss by sticking to the comparison operators and
getting rid of the subtraction. You could write simply

if (*f1 < *f2) return -1;
if (*f1 > *f2) return +1;
return 0;

or you might opt for a one-liner only a C addict could love

return (*f1 > *f2) - (*f1 < *f2);


Possible overflow, as above.


If you're so certain, call abort(). ;-)

Better, whenever the arrangement of your tests produces
a branch that cannot be taken, it's usually a sign that there
is something amiss with the branching logic. In this case,
you've observed that one of the tests above must fire, which
means there's no reason to make the third test: the fact that
the first two didn't fire makes the result of the third a
foregone conclusion, right?

Consider rearranging the tests, perhaps along these lines:

if (delta < -window) return -1;
if (delta > +window) return +1;
return 0;

or the C addict's methadone

return (*f1 >= *f2-window) - (*f1 <= *f2+window);


If I were writing this, I think I'd just forget about
bsearch and write my own binary search in open code. Yes,
they'd both be "binary search" -- but bsearch is a binary
search specialized for strict equality, which isn't what
you want. You're not guilty of using a hammer to drive
screws, but you may be using a tack-hammer when you ought
to use a ball-peen hammer.

I'd thought about rolling my own binary serarch function, but I've seen
enough in-house versions mess up the search, I decided to go with the
library function. I'll have to think about coding the search routine after
I fix the other problems.

Thanks again for your help,
ALS
 
E

ena8t8si

Alan said:
I'd thought about rolling my own binary serarch function, but I've seen
enough in-house versions mess up the search, I decided to go with the
library function. ...

Then this is an opportunity not only to write a shorter and
simpler function but also raise the level of understanding
for your coworkers. The original code has problems not only
because of the little glitches but because it's overly
complicated. Also, as Eric Sosman pointed out, if bsearch()
deals with values that are "equal", which one it finds is
unspecified.

/* Method to find first index in array of structured
* floating values so that vm <= array[index] <= vp,
* where vm = *value - window
* and vp = *value + window
*
* Establish invariant:
*
* 0 <= a && a <= c && c <= n
* a == 0 || float_data[a-1] < vm
* c == n || vm <= float_data[c-1]
*
* where float_data == *(float*)(fp + i*size)
*
* Then do the while loop that does the binary search, whereupon
*
* n < 1 || a == c-1
*
* The above assertion, plus the invariant, plus
* the array being sorted, means float_data[a] is
* the only possible candidate.
*/

int find_first_index( const float *value,
const float *data,
size_t size,
size_t n )
{
unsigned a, b, c;
const double vm = *value - window;
const double vp = *value + window;
const unsigned char *const fp = (const unsigned char*)data;

a = 0, c = n;
assert( a <= c && c <= n );
assert( a == 0 || *(float*)(fp + (a-1)*size) < vm );
assert( c == n || *(float*)(fp + (c-1)*size) >= vm );

while (a+1 < c) {
b = a + (c-a)/2;
assert( a < b && b < c );

if (*(float*)(fp + (b-1)*size) < vm) a = b;
else c = b;

assert( a <= c && c <= n );
assert( a == 0 || *(float*)(fp + (a-1)*size) < vm );
assert( c == n || *(float*)(fp + (c-1)*size) >= vm );
}

assert( n < 1 || a == c-1 );

if (n < 1) return -1;
if (*(float*)(fp + a*size) < vm) return -1;
if (*(float*)(fp + a*size) > vp) return -1;

return a;
}
 
P

pete

Ben said:
This is not undefined,
but you might want to put 0.0f instead of 0.0 if
you anticipate porting to a machine with fast floats
but software double precision routines.

Comparing zero to delta is dopey.
There's no point in optimizing that comparison.

*f1 should be compared to *f2 directly.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top