using a float as the index in an STL map?

Discussion in 'C++' started by JDT, Apr 20, 2007.

1. JDTGuest

Hi,

It seems that using floats as the first tupelo for an STL map (shown
below) can cause problems but I don't remember what the problems were.
Is it common practice to use a structure like below? I would appreciate
if you can share your experience using such a structure. Thanks.

std::map<float, int> m;

JD

JDT, Apr 20, 2007

2. Victor BazarovGuest

JDT wrote:
> It seems that using floats as the first tupelo for an STL map (shown
> below) can cause problems but I don't remember what the problems were.

The only thing I can think of is that if you expect two different
results of a calculation to lead to the same number (and ultimately
to the same stored in the map value), you may be in for a surprise.
Due to rounding errors in the FPU the mathematically equivalent
calculations can lead to different numbers internally.

Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
(a + 2.f/3) and (b - 1.f/3) _can_ give you different results.

> Is it common practice to use a structure like below? I would
> appreciate if you can share your experience using such a structure.
> Thanks.
> std::map<float, int> m;

I _never_ saw a 'map' where 'float' would be the Key type. I cannot
claim to have seen all code in the world, and even a significant part
of it, so I cannot attest to "commonality" of the practice.

V
--

Victor Bazarov, Apr 20, 2007

3. Juha NieminenGuest

Victor Bazarov wrote:
> Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
> (a + 2.f/3) and (b - 1.f/3) _can_ give you different results.

Concrete example:

double d1 = 1.0;
double d2 = 0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1+0.1;
std::cout << d1 << " " << d2 << " " << (d1 == d2) << std::endl;

That prints (at least in a regular PC):

1 1 0

Printing the doubles with more decimal places would show the
difference. It's very small, but it's there, thus (d1 == d2) is
false.

Juha Nieminen, Apr 20, 2007
4. JDTGuest

Victor Bazarov wrote:

> JDT wrote:
>
>>It seems that using floats as the first tupelo for an STL map (shown
>>below) can cause problems but I don't remember what the problems were.

>
>
> The only thing I can think of is that if you expect two different
> results of a calculation to lead to the same number (and ultimately
> to the same stored in the map value), you may be in for a surprise.
> Due to rounding errors in the FPU the mathematically equivalent
> calculations can lead to different numbers internally.
>
> Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
> (a + 2.f/3) and (b - 1.f/3) _can_ give you different results.
>
>
>>Is it common practice to use a structure like below? I would
>>appreciate if you can share your experience using such a structure.
>>Thanks.
>>std::map<float, int> m;

>
>
> I _never_ saw a 'map' where 'float' would be the Key type. I cannot
> claim to have seen all code in the world, and even a significant part
> of it, so I cannot attest to "commonality" of the practice.
>
> V

Hi Victor:

In the following scenario, I think using float as the key to a map makes
sense. For example, the following table needs to be sorted in the
ascending order of the 1st tuple (i.e. Values). So I can simply insert
the pairs into a map and then get a list of sorted pairs automatically.
Do people often have similar needs or are there other better ways to
accomplish this purpose? Any further help is much appreciated.

Values # of items
3.5 5
4.7 9
9.3 7
......

JD

JDT, Apr 20, 2007
5. Victor BazarovGuest

JDT wrote:
> Victor Bazarov wrote:
>
>> JDT wrote:
>>
>>> It seems that using floats as the first tupelo for an STL map (shown
>>> below) can cause problems but I don't remember what the problems
>>> were.

>>
>>
>> The only thing I can think of is that if you expect two different
>> results of a calculation to lead to the same number (and ultimately
>> to the same stored in the map value), you may be in for a surprise.
>> Due to rounding errors in the FPU the mathematically equivalent
>> calculations can lead to different numbers internally.
>>
>> Example, given that 'a' is 1.f and 'b' is 2.f, the expressions
>> (a + 2.f/3) and (b - 1.f/3) _can_ give you different results.
>>
>>
>>> Is it common practice to use a structure like below? I would
>>> appreciate if you can share your experience using such a structure.
>>> Thanks.
>>> std::map<float, int> m;

>>
>>
>> I _never_ saw a 'map' where 'float' would be the Key type. I cannot
>> claim to have seen all code in the world, and even a significant part
>> of it, so I cannot attest to "commonality" of the practice.
>>
>> V

>
> Hi Victor:
>
> In the following scenario, I think using float as the key to a map
> makes sense. For example, the following table needs to be sorted in
> the ascending order of the 1st tuple (i.e. Values). So I can simply
> insert the pairs into a map and then get a list of sorted pairs
> automatically. Do people often have similar needs or are there other
> better ways to accomplish this purpose? Any further help is much
> appreciated.
> Values # of items
> 3.5 5
> 4.7 9
> 9.3 7
> .....

That's fine. Another way is to have a vector<pair<float,int> >, stuff
it with all the pairs you encounter, then sort them based on the pair's
'.first' member. You'd have better space economy that way. Of course,
in that case if you need to extract the correct order at any moment,
you'd have to sort right there (whereas 'std::map' always keeps them
sorted).

V
--

Victor Bazarov, Apr 20, 2007
6. Guest

On 20 Apr, 22:33, "Victor Bazarov" <> wrote:

> That's fine. Another way is to have a vector<pair<float,int> >, stuff
> it with all the pairs you encounter, then sort them based on the pair's
> '.first' member. You'd have better space economy that way. Of course,
> in that case if you need to extract the correct order at any moment,
> you'd have to sort right there (whereas 'std::map' always keeps them
> sorted).

Uhm, I don't think it would be the best idea, because every time you
have to add a number you have to perform a search that is linear,

I think a good alternative would be to manually define the comparison
operator "less" considering that the equality

(a == b)

on a map is given by

!less(a,b) && !less(b,a)

it should be sufficient to write less() as something like

less(a, b){
if (a - b > threshold)
return false;
else
return true;
}

with threshold a proper small number. Sry for the loose syntax

you should then use map<double, unsigned, less>.

Regards,

Zeppe

, Apr 20, 2007
7. James KanzeGuest

On Apr 21, 12:36 am, wrote:
> On 20 Apr, 22:33, "Victor Bazarov" <> wrote:

> > That's fine. Another way is to have a vector<pair<float,int> >, stuff
> > it with all the pairs you encounter, then sort them based on the pair's
> > '.first' member. You'd have better space economy that way. Of course,
> > in that case if you need to extract the correct order at any moment,
> > you'd have to sort right there (whereas 'std::map' always keeps them
> > sorted).

> Uhm, I don't think it would be the best idea, because every time you
> have to add a number you have to perform a search that is linear,

Why would the search have to be linear? I've used lower_bounds
to keep a vector sorted on more than a few occasions.

But the real question concerns the actual use. I got the vague
impression that the ordering was only relevant once the
collection had been constructed. If that's the case, the best
solution is probably to just add the values to a vector, and
sort once at the end, when the vector is full. Otherwise, it
depends.

> I think a good alternative would be to manually define the comparison
> operator "less" considering that the equality

> (a == b)

> on a map is given by

> !less(a,b) && !less(b,a)

> it should be sufficient to write less() as something like

> less(a, b){
> if (a - b > threshold)
> return false;
> else
> return true;
> }

> with threshold a proper small number.

That's a nice way to get undefined behavior. Since such a
function doesn't define an ordering relationship (which must be
transitive), it doesn't meet the requirements for std::map or
std::set.

--
James Kanze (Gabi Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze, Apr 20, 2007
8. Rae WestwoodGuest

On Fri, 20 Apr 2007 19:52:01 +0000, JDT wrote:

> Hi,
>
> It seems that using floats as the first tupelo for an STL map (shown
> below) can cause problems but I don't remember what the problems were.
> Is it common practice to use a structure like below? I would appreciate
> if you can share your experience using such a structure. Thanks.
>
> std::map<float, int> m;
>
> JD

I'd be interested in authoritative and intelligent input on this as well.
My understanding is that for a type to be a key for a map, it only need be
(sortable)..iow: it overloads the relational operators so that two values
of the same type can be compared.

As to how you generate your keys...well, that could pose a problem since
ieee floating point numbers are approximations of actual floating point
values.

float n=2.0f

is it stored internally as 2.0 or 1.9999999997?

A way around this would be to (box) the float key into its own class and
make your relational operators be inexact comparitors.

btw: I HAVE actually found uses for maps that have floating point keys. I
use them when doing histograms of numerical data.

Rae Westwood, Apr 21, 2007
9. kostasGuest

On Apr 21, 12:12 am, JDT <> wrote:
> In the following scenario, I think using float as the key to a map makes
> sense. For example, the following table needs to be sorted in the
> ascending order of the 1st tuple (i.e. Values). So I can simply insert
> the pairs into a map and then get a list of sorted pairs automatically.
> Do people often have similar needs or are there other better ways to
> accomplish this purpose? Any further help is much appreciated.
>
> Values # of items
> 3.5 5
> 4.7 9
> 9.3 7
> .....

A safer approach that would solve some(but not all) of the problems
already discussed would be to insert a new key only if it's not close
enough to already existing ones, that is to the upper_bound() and its
previous iterator. Of course inserting with different order may result
in different keys(no with the clean numbers of your example). Do the
same when searching.
In other words don't use the convenient brackets [] use upper_bound()

kostas, Apr 21, 2007
10. red floydGuest

James Kanze wrote:
> On Apr 21, 12:36 am, wrote:
>
>> I think a good alternative would be to manually define the comparison
>> operator "less" considering that the equality

>
>> (a == b)

>
>> on a map is given by

>
>> !less(a,b) && !less(b,a)

>
>> it should be sufficient to write less() as something like

>
>> less(a, b){
>> if (a - b > threshold)
>> return false;
>> else
>> return true;
>> }

>
>> with threshold a proper small number.

>
> That's a nice way to get undefined behavior. Since such a
> function doesn't define an ordering relationship (which must be
> transitive), it doesn't meet the requirements for std::map or
> std::set.
>

On the other hand, assuming a reasonable threshold (EPSILON), what's
wrong with:

less(float a, float b)
{
if (fabs(a - b) > EPSILON) // difference big enough to be
return a < b; // significant
else
return false; // elements are "equal"
}

red floyd, Apr 21, 2007
11. kostasGuest

On Apr 21, 4:07 am, red floyd <> wrote:
> On the other hand, assuming a reasonable threshold (EPSILON), what's
> wrong with:
>
> less(float a, float b)
> {
> if (fabs(a - b) > EPSILON) // difference big enough to be
> return a < b; // significant
> else
> return false; // elements are "equal"
>
> }

Say you have already the keys 1.,2. and EPSILON =1.
Which key is this solution guaranteed to return when searching whith
1.1?

kostas, Apr 21, 2007
12. James KanzeGuest

On Apr 21, 3:07 am, red floyd <> wrote:
> James Kanze wrote:
> > On Apr 21, 12:36 am, wrote:

> >> I think a good alternative would be to manually define the comparison
> >> operator "less" considering that the equality

> >> (a == b)

> >> on a map is given by

> >> !less(a,b) && !less(b,a)

> >> it should be sufficient to write less() as something like

> >> less(a, b){
> >> if (a - b > threshold)
> >> return false;
> >> else
> >> return true;
> >> }

> >> with threshold a proper small number.

> > That's a nice way to get undefined behavior. Since such a
> > function doesn't define an ordering relationship (which must be
> > transitive), it doesn't meet the requirements for std::map or
> > std::set.

> On the other hand, assuming a reasonable threshold (EPSILON), what's
> wrong with:

> less(float a, float b)
> {
> if (fabs(a - b) > EPSILON) // difference big enough to be
> return a < b; // significant
> else
> return false; // elements are "equal"
> }

What have you changed? You still haven't defined a strict
ordering relationship, so you have undefined behavior.

--
James Kanze (Gabi Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze, Apr 21, 2007
13. James KanzeGuest

On Apr 21, 1:43 am, kostas <> wrote:
> On Apr 21, 12:12 am, JDT <> wrote:

> > In the following scenario, I think using float as the key to a map makes
> > sense. For example, the following table needs to be sorted in the
> > ascending order of the 1st tuple (i.e. Values). So I can simply insert
> > the pairs into a map and then get a list of sorted pairs automatically.
> > Do people often have similar needs or are there other better ways to
> > accomplish this purpose? Any further help is much appreciated.

> > Values # of items
> > 3.5 5
> > 4.7 9
> > 9.3 7
> > .....

> A safer approach

Excuse me, but without knowing the source of his values, how can
you say that it is a safer approach.

> that would solve some(but not all) of the problems
> already discussed would be to insert a new key only if it's not close
> enough to already existing ones, that is to the upper_bound() and its
> previous iterator. Of course inserting with different order may result
> in different keys(no with the clean numbers of your example). Do the
> same when searching.

I can't think off hand of a case where that would be
appropriate, but there probably exists one. A more likely
solution would be to round the keys before insertion. In a lot
of cases (most, I suspect, if one knows what one is doing),
there isn't a problem.

--
James Kanze (Gabi Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze, Apr 21, 2007
14. James KanzeGuest

On Apr 21, 1:31 am, Rae Westwood <>
wrote:
> On Fri, 20 Apr 2007 19:52:01 +0000, JDT wrote:

> > It seems that using floats as the first tupelo for an STL map (shown
> > below) can cause problems but I don't remember what the problems were.
> > Is it common practice to use a structure like below? I would appreciate
> > if you can share your experience using such a structure. Thanks.

> > std::map<float, int> m;

> I'd be interested in authoritative and intelligent input on this as well.
> My understanding is that for a type to be a key for a map, it only need be
> (sortable)..iow: it overloads the relational operators so that two values
> of the same type can be compared.

The term used by the standard is that there must be a comparison
function which "induces a strict weak ordering on the values".
The standard then specifies what it means by a strict weak
ordering:

The term strict refers to the requirement of an irreflexive
relation (!comp (x, x) for all x), and the term weak to
requirements that are not as strong as those for a total
ordering, but stronger than those for a partial ordering. If
we define equiv(a, b) as !comp (a, b) && !comp (b, a), then
the requirements are that comp and equiv both be transitive
relations:

-- comp (a, b) && comp (b, c) implies comp (a, c)

-- equiv(a, b) && equiv(b, c) implies equiv(a, c)
[Note: Under these conditions, it can be shown that

. equiv is an equivalence relation

. comp induces a well-defined relation on the
equivalence classes determined by equiv

. The induced relation is a strict total ordering.
-- end note ]

> As to how you generate your keys...well, that could pose a problem since
> ieee floating point numbers are approximations of actual floating point
> values.

Not really. Floating point numbers are exact
representations of floating point numbers. In many
applications, floating point numbers are used to model real
numbers; the approximation there is not very exact (and many
of the rules of real arithmetic do not hold).

> float n=2.0f

> is it stored internally as 2.0 or 1.9999999997?

Neither. It's stored internally as 0x40000000. Required by
the IEEE standard. As it happens, this value represents the
real number 2.0 exactly---you picked a very bad example.

The problem is, of course, that floating point can only
represent a very small subset of the real numbers, and a
non-contiguous subset at that. (int can only represent a
very small subset of the integers, but it is a contiguous
subset.) And that while precisely defined by IEEE, floating
point arithmetic doesn't follow some of the expected
rules---addition is not associative, for example---and
often doesn't give the same results as the same operation
in real arithmetic. And of course, the fact that we enter
floating point constants in the form of decimal numbers, and
that the values we "see" often aren't present in the set of
values representable in floating point.

> A way around this would be to (box) the float key into its own class and
> make your relational operators be inexact comparitors.

See above. I suspect that most naïve inexact comparitors
would fail to define a strick weak ordering.

> btw: I HAVE actually found uses for maps that have floating point keys. I
> use them when doing histograms of numerical data.

Just guessing, but in such cases, you would define
equivalences classes over ranges of floating point values,
no? Something along the lines of:

struct FPCmp
{
bool operator()( double a, double b ) const
{
return floor( 100.0 * a ) < floor( 100.0 * b ) ;
}
} ;

(In such cases, I'd probably use a canonic representation of
each equivalence class as the key, i.e. floor(100.0 *
value), in the above example.)

--
James Kanze (Gabi Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze, Apr 21, 2007
15. kostasGuest

On Apr 21, 1:02 pm, James Kanze <> wrote:
> I can't think off hand of a case where that would be
> appropriate, but there probably exists one. A more likely
> solution would be to round the keys before insertion. In a lot
> of cases (most, I suspect, if one knows what one is doing),
> there isn't a problem.

Round to what you forgot to say. I think that whatever rounding scheme
you choose, there is the possibility that slight different values will
round to different keys. Say you have chosen to round to units and
your input is 1.4 and 1.6(not to say 1.499... , 1.50...1). I suspect
that the first will round to 1. and the second to 2. You can say that
my proposal was a round to already existing values.

I had also taken my precautions.

kostas, Apr 21, 2007
16. zeppeGuest

James Kanze wrote:

> That's a nice way to get undefined behavior. Since such a
> function doesn't define an ordering relationship (which must be
> transitive), it doesn't meet the requirements for std::map or
> std::set.

Regards,

Zeppe

zeppe, Apr 21, 2007
17. James KanzeGuest

On Apr 21, 2:56 pm, kostas <> wrote:
> On Apr 21, 1:02 pm, James Kanze <> wrote:

> > I can't think off hand of a case where that would be
> > appropriate, but there probably exists one. A more likely
> > solution would be to round the keys before insertion. In a lot
> > of cases (most, I suspect, if one knows what one is doing),
> > there isn't a problem.

> Round to what you forgot to say. I think that whatever rounding scheme
> you choose, there is the possibility that slight different values will
> round to different keys.

Certainly. That's why you have rules for rounding.

> Say you have chosen to round to units and
> your input is 1.4 and 1.6(not to say 1.499... , 1.50...1). I suspect
> that the first will round to 1. and the second to 2. You can say that
> my proposal was a round to already existing values.

It's not rounding in the classical sense, although as you say,
in some ways, it works like it. The reason I say that I can't
think of a case off hand for it is precisely that the choice of
the central value is more or less random. I'm also not too sure
if it will work; there's certainly no way to write a Compare
operator which reflects it.

--
James Kanze (Gabi Software) email:
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

James Kanze, Apr 22, 2007
18. rep_movsdGuest

Maybe just use a class which represents fractions and use that instead
of floats

rep_movsd, Apr 22, 2007
19. kostasGuest

On Apr 22, 12:02 pm, James Kanze <> wrote:
> It's not rounding in the classical sense, although as you say,
> in some ways, it works like it. The reason I say that I can't
> think of a case off hand for it is precisely that the choice of
> the central value is more or less random.

Probably there are cases that it's not so important that the map will
end up having the same keys regardless of the insertion order (which
is the main advantage of an arbitrary rounding method as far as i can
see) but that these keys will be closer to the given ones(actually
will be some of them). To continue my simple example either 1.49... or
1.50...1 may be better that 1. or 2. Then, you would probably think ,
it may next come 1.1 which i would be forced to round to some of the
above values while rounding to unit is better. It may come, it may
are some interesting cases that this will not happen. Consider for
example an algorithm that produces some values quite far apart one
from the other(in theory) but with some fluctuations due to numerical
calculations(in practice) that you would like to filter out. Why you
should introduce in this case some extra arbitrary rounding
error(rounding guess, if you prefer) ?

kostas, Apr 22, 2007
20. Greg HerlihyGuest

On Apr 21, 2:51 am, James Kanze <> wrote:
> On Apr 21, 3:07 am, red floyd <> wrote:
>
>
>
> > James Kanze wrote:
> > > On Apr 21, 12:36 am, wrote:
> > >> I think a good alternative would be to manually define the comparison
> > >> operator "less" considering that the equality
> > >> (a == b)
> > >> on a map is given by
> > >> !less(a,b) && !less(b,a)
> > >> it should be sufficient to write less() as something like
> > >> less(a, b){
> > >> if (a - b > threshold)
> > >> return false;
> > >> else
> > >> return true;
> > >> }
> > >> with threshold a proper small number.
> > > That's a nice way to get undefined behavior. Since such a
> > > function doesn't define an ordering relationship (which must be
> > > transitive), it doesn't meet the requirements for std::map or
> > > std::set.

> > On the other hand, assuming a reasonable threshold (EPSILON), what's
> > wrong with:
> > less(float a, float b)
> > {
> > if (fabs(a - b) > EPSILON) // difference big enough to be
> > return a < b; // significant
> > else
> > return false; // elements are "equal"
> > }

>
> What have you changed? You still haven't defined a strict
> ordering relationship, so you have undefined behavior.

There is no undefined behavior. The less() function does produce a
total ordering:

For any a, b, and c such that less(a, b) returns true and less(b, c)
returns true, then it is the case that less( a, c) also returns true.

Greg

Greg Herlihy, Apr 23, 2007