using a float as the index in an STL map?


JDT

Hi,

It seems that using floats as the key for an STL map (shown below) can
cause problems, but I don't remember what the problems were. Is it common
practice to use a structure like below? I would appreciate if you can share
your experience using such a structure. Thanks.

std::map<float, int> m;

JD
 

Victor Bazarov

JDT said:
It seems that using floats as the key for an STL map (shown below) can
cause problems, but I don't remember what the problems were.

The only thing I can think of is that if you expect two different
results of a calculation to lead to the same number (and ultimately
to the same value stored in the map), you may be in for a surprise.
Due to rounding errors in the FPU, mathematically equivalent
calculations can lead to different numbers internally.

Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
(a + 2.f/3) and (b - 1.f/3) _can_ give you different results.
Is it common practice to use a structure like below? I would
appreciate if you can share your experience using such a structure.
Thanks.
std::map<float, int> m;

I have _never_ seen a 'map' where 'float' was the Key type. I cannot
claim to have seen all the code in the world, or even a significant part
of it, so I cannot attest to the "commonality" of the practice.

V
 

Juha Nieminen

Victor said:
Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
(a + 2.f/3) and (b - 1.f/3) _can_ give you different results.

Concrete example:

double d1 = 1.0;
double d2 = 0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1;
std::cout << d1 << " " << d2 << " " << (d1 == d2) << std::endl;

That prints (at least on a regular PC):

1 1 0

Printing the doubles with more decimal places would show the
difference. It's very small, but it's there, thus (d1 == d2) is
false.
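
For what it's worth, an untested sketch that shows the difference by
printing more digits (only <iomanip> and std::setprecision are added to
the snippet above):

#include <iostream>
#include <iomanip>

int main()
{
    double d1 = 1.0;
    double d2 = 0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1;

    // 17 significant digits are enough to tell any two doubles apart
    std::cout << std::setprecision(17)
              << d1 << " " << d2 << " " << (d1 == d2) << std::endl;
}

On a typical PC that prints something like "1 0.99999999999999989 0".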
 

JDT

Victor said:
The only thing I can think of is that if you expect two different
results of a calculation to lead to the same number (and ultimately
to the same value stored in the map), you may be in for a surprise.
Due to rounding errors in the FPU, mathematically equivalent
calculations can lead to different numbers internally.

Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
(a + 2.f/3) and (b - 1.f/3) _can_ give you different results.




I have _never_ seen a 'map' where 'float' was the Key type. I cannot
claim to have seen all the code in the world, or even a significant part
of it, so I cannot attest to the "commonality" of the practice.

V

Hi Victor:

In the following scenario, I think using float as the key to a map makes
sense. For example, the following table needs to be sorted in the
ascending order of the first column (i.e. Values). So I can simply insert
the pairs into a map and then get a list of sorted pairs automatically.
Do people often have similar needs or are there other better ways to
accomplish this purpose? Any further help is much appreciated.

Values # of items
3.5 5
4.7 9
9.3 7
......
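
Roughly what I have in mind (rough, untested sketch):

#include <iostream>
#include <map>

int main()
{
    std::map<float, int> m;
    m[3.5f] = 5;
    m[4.7f] = 9;
    m[9.3f] = 7;

    // iterating over the map visits the keys in ascending order
    for (std::map<float, int>::const_iterator it = m.begin();
         it != m.end(); ++it)
        std::cout << it->first << " " << it->second << "\n";
}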

JD
 

Victor Bazarov

JDT said:
Hi Victor:

In the following scenario, I think using float as the key to a map
makes sense. For example, the following table needs to be sorted in
the ascending order of the first column (i.e. Values). So I can simply
insert the pairs into a map and then get a list of sorted pairs
automatically. Do people often have similar needs or are there other
better ways to accomplish this purpose? Any further help is much
appreciated.
Values # of items
3.5 5
4.7 9
9.3 7
.....

That's fine. Another way is to have a vector<pair<float,int> >, stuff
it with all the pairs you encounter, then sort them based on the pair's
'.first' member. You'd have better space economy that way. Of course,
in that case if you need to extract the correct order at any moment,
you'd have to sort right there (whereas 'std::map' always keeps them
sorted).
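
An untested sketch of what I mean, using the values from your table:

#include <algorithm>
#include <iostream>
#include <utility>
#include <vector>

int main()
{
    std::vector<std::pair<float, int> > v;
    v.push_back(std::make_pair(3.5f, 5));
    v.push_back(std::make_pair(4.7f, 9));
    v.push_back(std::make_pair(9.3f, 7));

    // std::pair's operator< compares '.first' before '.second',
    // so a plain sort orders the table by the Values column
    std::sort(v.begin(), v.end());

    for (std::vector<std::pair<float, int> >::const_iterator it = v.begin();
         it != v.end(); ++it)
        std::cout << it->first << " " << it->second << "\n";
}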

V
 

zeppethefake

That's fine. Another way is to have a vector<pair<float,int> >, stuff
it with all the pairs you encounter, then sort them based on the pair's
'.first' member. You'd have better space economy that way. Of course,
in that case if you need to extract the correct order at any moment,
you'd have to sort right there (whereas 'std::map' always keeps them
sorted).


Uhm, I don't think that would be the best idea, because every time you
add a number you have to perform a search that is linear, while access
to the map is logarithmic.

I think a good alternative would be to manually define the comparison
operator "less". Considering that the equality

(a == b)

on a map is given by

!less(a,b) && !less(b,a)

it should be sufficient to write less() as something like

less(a, b) {
    if (b - a > threshold)   // a is below b by more than the threshold
        return true;
    else
        return false;
}

with threshold a proper small number. Sry for the loose syntax :)

you should then use map<double, unsigned, less>.

Regards,

Zeppe
 

James Kanze

Uhm, I don't think it would be the best idea, because every time you
have to add a number you have to perform a search that is linear,
while the access to the map is logarithmic.

Why would the search have to be linear? I've used lower_bound
to keep a vector sorted on more than a few occasions.
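
Something along these lines (untested sketch; 'insertSorted' is just a
name I'm making up for the illustration):

#include <algorithm>
#include <utility>
#include <vector>

typedef std::pair< float, int > Entry ;

void
insertSorted( std::vector< Entry >& v, Entry const& entry )
{
    // the search for the insertion point is logarithmic; only the
    // insertion itself (shifting the tail) remains linear
    std::vector< Entry >::iterator pos
        = std::lower_bound( v.begin(), v.end(), entry ) ;
    v.insert( pos, entry ) ;
}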

But the real question concerns the actual use. I got the vague
impression that the ordering was only relevant once the
collection had been constructed. If that's the case, the best
solution is probably to just add the values to a vector, and
sort once at the end, when the vector is full. Otherwise, it
depends.
I think a good alternative would be to manually define the comparison
operator "less" considering that the equality
on a map is given by
!less(a,b) && !less(b,a)
it should be sufficient to write less() as something like
less(a, b){
if (a - b > threshold)
return false;
else
return true;
}
with threshold a proper small number.

That's a nice way to get undefined behavior. Since such a
function doesn't define an ordering relationship (which must be
transitive), it doesn't meet the requirements for std::map or
std::set.
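
A small, untested illustration of why, with a threshold of 0.6 (an
arbitrary value chosen just to keep the numbers simple):

#include <cmath>
#include <iostream>

bool nearlyEqual( double a, double b, double threshold )
{
    return std::fabs( a - b ) <= threshold ;
}

int main()
{
    double const threshold = 0.6 ;
    // 1.0 "equals" 1.5, and 1.5 "equals" 2.0, but 1.0 does not
    // "equal" 2.0, so the induced equivalence is not transitive
    std::cout << nearlyEqual( 1.0, 1.5, threshold ) << " "
              << nearlyEqual( 1.5, 2.0, threshold ) << " "
              << nearlyEqual( 1.0, 2.0, threshold ) << "\n" ;   // 1 1 0
}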
 

Rae Westwood

Hi,

It seems that using floats as the key for an STL map (shown below) can
cause problems, but I don't remember what the problems were. Is it common
practice to use a structure like below? I would appreciate if you can share
your experience using such a structure. Thanks.

std::map<float, int> m;

JD

I'd be interested in authoritative and intelligent input on this as well.
My understanding is that for a type to be a key for a map, it only needs
to be "sortable"; in other words, it overloads the relational operators so
that two values of the same type can be compared.

As to how you generate your keys... well, that could pose a problem, since
IEEE floating point numbers are approximations of actual floating point
values.

float n = 2.0f;

Is it stored internally as 2.0 or 1.9999999997?

A way around this would be to "box" the float key into its own class and
make your relational operators inexact comparators.

btw: I HAVE actually found uses for maps that have floating point keys. I
use them when doing histograms of numerical data.
 

kostas

In the following scenario, I think using float as the key to a map makes
sense. For example, the following table needs to be sorted in the
ascending order of the first column (i.e. Values). So I can simply insert
the pairs into a map and then get a list of sorted pairs automatically.
Do people often have similar needs or are there other better ways to
accomplish this purpose? Any further help is much appreciated.

Values # of items
3.5 5
4.7 9
9.3 7
.....

A safer approach that would solve some (but not all) of the problems
already discussed would be to insert a new key only if it is not close
enough to the already existing ones, that is, to the upper_bound() and its
previous iterator. Of course, inserting in a different order may result
in different keys (not with the clean numbers of your example). Do the
same when searching.
In other words, don't use the convenient brackets []; use upper_bound()
instead.
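
Roughly like this (untested sketch; the function name and the tolerance
parameter are mine, and here I simply merge the count into the nearby key):

#include <cmath>
#include <map>
#include <utility>

// insert 'count' under 'value' only if no existing key is within 'tol';
// otherwise reuse the nearby key, checking upper_bound() and its
// previous iterator
void insertApprox(std::map<float, int>& m, float value, int count, float tol)
{
    std::map<float, int>::iterator hi = m.upper_bound(value);

    if (hi != m.end() && std::fabs(hi->first - value) <= tol) {
        hi->second += count;                  // close to the next larger key
        return;
    }
    if (hi != m.begin()) {
        std::map<float, int>::iterator lo = hi;
        --lo;                                 // the previous iterator
        if (std::fabs(lo->first - value) <= tol) {
            lo->second += count;              // close to the next smaller key
            return;
        }
    }
    m.insert(std::make_pair(value, count));   // nothing close enough: new key
}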
 

red floyd

James said:
That's a nice way to get undefined behavior. Since such a
function doesn't define an ordering relationship (which must be
transitive), it doesn't meet the requirements for std::map or
std::set.

On the other hand, assuming a reasonable threshold (EPSILON), what's
wrong with:

bool less(float a, float b)
{
    if (fabs(a - b) > EPSILON)  // difference big enough to be significant
        return a < b;
    else
        return false;           // elements are "equal"
}
 

kostas

On the other hand, assuming a reasonable threshold (EPSILON), what's
wrong with:

less(float a, float b)
{
if (fabs(a - b) > EPSILON) // difference big enough to be
return a < b; // significant
else
return false; // elements are "equal"

}

Say you already have the keys 1. and 2., and EPSILON = 1.
Which key is this solution guaranteed to return when searching with 1.1?
 

James Kanze

On the other hand, assuming a reasonable threshold (EPSILON), what's
wrong with:
less(float a, float b)
{
if (fabs(a - b) > EPSILON) // difference big enough to be
return a < b; // significant
else
return false; // elements are "equal"
}

What have you changed? You still haven't defined a strict
ordering relationship, so you have undefined behavior.
 

James Kanze

A safer approach

Excuse me, but without knowing the source of his values, how can
you say that it is a safer approach?
that would solve some(but not all) of the problems
already discussed would be to insert a new key only if it's not close
enough to already existing ones, that is to the upper_bound() and its
previous iterator. Of course inserting with different order may result
in different keys(no with the clean numbers of your example). Do the
same when searching.

I can't think off hand of a case where that would be
appropriate, but there probably exists one. A more likely
solution would be to round the keys before insertion. In a lot
of cases (most, I suspect, if one knows what one is doing),
there isn't a problem.
 

James Kanze

I'd be interested in authoritative and intelligent input on this as well.
My understanding is that for a type to be a key for a map, it only need be
(sortable)..iow: it overloads the relational operators so that two values
of the same type can be compared.

The term used by the standard is that there must be a comparison
function which "induces a strict weak ordering on the values".
The standard then specifies what it means by a strict weak
ordering:

The term strict refers to the requirement of an irreflexive
relation (!comp (x, x) for all x), and the term weak to
requirements that are not as strong as those for a total
ordering, but stronger than those for a partial ordering. If
we define equiv(a, b) as !comp (a, b) && !comp (b, a), then
the requirements are that comp and equiv both be transitive
relations:

-- comp (a, b) && comp (b, c) implies comp (a, c)

-- equiv(a, b) && equiv(b, c) implies equiv(a, c)
[Note: Under these conditions, it can be shown that

. equiv is an equivalence relation

. comp induces a well-defined relation on the
equivalence classes determined by equiv

. The induced relation is a strict total ordering.
-- end note ]
As to how you generate your keys...well, that could pose a problem since
ieee floating point numbers are approximations of actual floating point
values.

Not really. Floating point numbers are exact
representations of floating point numbers. In many
applications, floating point numbers are used to model real
numbers; the approximation there is not very exact (and many
of the rules of real arithmetic do not hold).
float n=2.0f
is it stored internally as 2.0 or 1.9999999997?

Neither. It's stored internally as 0x40000000. Required by
the IEEE standard. As it happens, this value represents the
real number 2.0 exactly---you picked a very bad example:).

The problem is, of course, that floating point can only
represent a very small subset of the real numbers, and a
non-contiguous subset at that. (int can only represent a
very small subset of the integers, but it is a contiguous
subset.) And that while precisely defined by IEEE, floating
point arithmetic doesn't follow some of the expected
rules---addition is not associative, for example---and
often doesn't give the same results as the same operation
in real arithmetic. And of course, there's the fact that we enter
floating point constants in the form of decimal numbers, and that
the values we "see" often aren't present in the set of values
representable in floating point.
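
(A quick, untested illustration of the non-associativity on a typical
IEEE-754 machine:

#include <iomanip>
#include <iostream>

int main()
{
    double x = (0.1 + 0.2) + 0.3 ;   // grouped one way
    double y = 0.1 + (0.2 + 0.3) ;   // grouped the other way

    // with 17 significant digits any rounding difference becomes visible
    std::cout << std::setprecision(17)
              << x << "\n" << y << "\n" << (x == y) << "\n" ;
}

On most machines the two sums differ in the last bit, and the comparison
prints 0.)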
A way around this would be to (box) the float key into its own class and
make your relational operators be inexact comparitors.

See above. I suspect that most naïve inexact comparators
would fail to define a strict weak ordering.
btw: I HAVE actually found uses for maps that have floating point keys. I
use them when doing histograms of numerical data.

Just guessing, but in such cases, you would define
equivalence classes over ranges of floating point values,
no? Something along the lines of:

struct FPCmp
{
    bool operator()( double a, double b ) const
    {
        return floor( 100.0 * a ) < floor( 100.0 * b ) ;
    }
} ;

(In such cases, I'd probably use a canonic representation of
each equivalence class as the key, i.e. floor(100.0 *
value), in the above example.)
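
An untested sketch of that variant, with some made-up sample values
(bucket width 0.01, matching the factor of 100 above):

#include <cmath>
#include <iostream>
#include <map>

int main()
{
    // the canonic representative floor( 100.0 * value ) is the key,
    // so every value in a bucket of width 0.01 maps to the same key
    std::map< double, int > hist ;

    double const samples[] = { 3.141, 3.142, 2.718, 3.149 } ;
    for ( int i = 0 ; i != 4 ; ++ i ) {
        hist[ std::floor( 100.0 * samples[ i ] ) ] += 1 ;
    }

    for ( std::map< double, int >::const_iterator it = hist.begin() ;
          it != hist.end() ; ++ it ) {
        std::cout << it->first / 100.0 << ": " << it->second << "\n" ;
    }
}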
 

kostas

I can't think off hand of a case where that would be
appropriate, but there probably exists one. A more likely
solution would be to round the keys before insertion. In a lot
of cases (most, I suspect, if one knows what one is doing),
there isn't a problem.

Round to what, you forgot to say. I think that whatever rounding scheme
you choose, there is the possibility that slightly different values will
round to different keys. Say you have chosen to round to units and
your input is 1.4 and 1.6 (not to mention 1.499... and 1.50...1). I suspect
that the first will round to 1. and the second to 2. You could say that
my proposal was rounding to already existing values.

I had also taken my precautions.
 

zeppe

James said:
That's a nice way to get undefined behavior. Since such a
function doesn't define an ordering relationship (which must be
transitive), it doesn't meet the requirements for std::map or
std::set.

oh gosh, my bad! :)

Regards,

Zeppe
 

James Kanze

Round to what you forgot to say. I think that whatever rounding scheme
you choose, there is the possibility that slight different values will
round to different keys.

Certainly. That's why you have rules for rounding.
Say you have chosen to round to units and
your input is 1.4 and 1.6(not to say 1.499... , 1.50...1). I suspect
that the first will round to 1. and the second to 2. You can say that
my proposal was a round to already existing values.

It's not rounding in the classical sense, although as you say,
in some ways, it works like it. The reason I say that I can't
think of a case off hand for it is precisely that the choice of
the central value is more or less random. I'm also not too sure
if it will work; there's certainly no way to write a Compare
operator which reflects it.
 

kostas

It's not rounding in the classical sense, although as you say,
in some ways, it works like it. The reason I say that I can't
think of a case off hand for it is precisely that the choice of
the central value is more or less random.

Probably there are cases where it's not so important that the map ends
up having the same keys regardless of the insertion order (which is the
main advantage of an arbitrary rounding method, as far as I can see),
but rather that these keys are closer to the given ones (actually, they
will be some of them). To continue my simple example, either 1.49... or
1.50...1 may be better than 1. or 2. Then, you would probably think,
1.1 may come next, which I would be forced to round to one of the above
values, while rounding to units would be better. It may come, it may
not. If you already know, use your rounding scheme. Furthermore, there
are some interesting cases where this will not happen. Consider for
example an algorithm that produces values quite far apart from one
another (in theory) but with some fluctuations due to numerical
calculations (in practice) that you would like to filter out. Why should
you introduce in this case some extra arbitrary rounding error
(rounding guess, if you prefer)?
 

Greg Herlihy

What have you changed? You still haven't defined a strict
ordering relationship, so you have undefined behavior.

There is no undefined behavior. The less() function does produce a
total ordering:

For any a, b, and c such that less(a, b) returns true and less(b, c)
returns true, it is the case that less(a, c) also returns true.

Greg
 
