Efficient searching through stl map?

H

Henrik Goldman

I have a std::map which consist of time_t as first and an associated class
as second:

map<time_t, myclass>

I would like to do calculations (min and max values) for a set of 'second'
for a specific time range.

Given the fact that 'first' can hold any time I would like to do the
calculation based on values which falls within a range of e.g. yesterday
(all 24 hours). So I'd like to efficiently find the first value of time_t
which meets this requirement.

Until now I've scanned forward by an iterator but if you have alot of values
this linear search becomes quite slow. This has become a problem to me. The
problem boils down to that I don't know the exact 'first' value that I'm
looking for but know within what timerange it should fall.

Are there any stl algorithms which can let me define my own compare
algorithm so I can do this time range checking in order to find the first
valid element? Naturally since time_t is just an integer the algorithm
should be faster then linear search.

Thanks in advance.

-- Henrik
 
D

Daniel T.

"Henrik Goldman said:
I have a std::map which consist of time_t as first and an associated class
as second:

map<time_t, myclass>

I would like to do calculations (min and max values) for a set of 'second'
for a specific time range.

Given the fact that 'first' can hold any time I would like to do the
calculation based on values which falls within a range of e.g. yesterday
(all 24 hours). So I'd like to efficiently find the first value of time_t
which meets this requirement.

Until now I've scanned forward by an iterator but if you have alot of values
this linear search becomes quite slow. This has become a problem to me. The
problem boils down to that I don't know the exact 'first' value that I'm
looking for but know within what timerange it should fall.

Are there any stl algorithms which can let me define my own compare
algorithm so I can do this time range checking in order to find the first
valid element? Naturally since time_t is just an integer the algorithm
should be faster then linear search.

I'm not quite sure you are asking for but I'm going to assume that
lower_bound, upper_bound, and equal_range won't work.

Are you trying to find two instances of myclass who's associated time_t
object are within a certain range?

Maybe if you put the data in a multimap<myclass, time_t>, then
equal_range will give you all the myclass objects and you can compare
their time_ts... If I'm not even close, then you are going to have to
show some code. Give an example that works but is too slow and someone,
i'm sure, will be able to produce an example that is faster and still
does the same thing.
 
H

Henrik Goldman

I'm not quite sure you are asking for but I'm going to assume that
lower_bound, upper_bound, and equal_range won't work.

Yes I'm sorry about beeing unclear. It's very easy to be quite unclear when
you do something quite technical and have been sitting with it for a while.

I think lower_bound might solve my problem though.

The basic idea is that I do statistics at specific intervals e.g. every 10
min. The gathered values are stored as 'second' for the specific time_t.
This way I can go through each record and see at what time it was gathered.

Here is the original (slow!) algorithm:

int CStatistics::GetMaxValueInPeriod(time_t tFrom, time_t tTo)

{

UseMap::iterator it;

int nMaxValue = 0;

for (it = m_pUseMap->begin();

it != m_pUseMap->end();

it++)

{

if (it->first < tFrom)

continue;

if (it->first > tTo)

break;

if (nMaxValue < it->second.m_nTotalFoo)

nMaxValue = it->second.m_m_nTotalFoo;

}

return nMaxValue;

}



Assume it contains values from since 2 years ago which is the worst case
scenario for me.

Then in order to find the first values which fits in my range I scan forward
by omitting all values which were earlier then the time I want.

This takes a lot of time even for a few values. Can lower_bound help me?
Are you trying to find two instances of myclass who's associated time_t
object are within a certain range?

No. I'm trying to find all instances of myclass which falls within a
specific timerange in order to perform caluculations on them. The algorithm
above is just one of many.
Maybe if you put the data in a multimap<myclass, time_t>, then
equal_range will give you all the myclass objects and you can compare
their time_ts... If I'm not even close, then you are going to have to
show some code. Give an example that works but is too slow and someone,
i'm sure, will be able to produce an example that is faster and still
does the same thing.

Perhaps multimap would work as well. I havn't spent enough time on multimap
structure to be able to tell. Theres always a complexity vs. speed limit.
I'd rather have easy to read code naturally so if possible I'll stick to
map.

Thanks in advance.
-- Henrik
 
D

Daniel T.

I'm not quite sure you are asking for but I'm going to assume that
lower_bound, upper_bound, and equal_range won't work.

Yes I'm sorry about beeing unclear. It's very easy to be quite unclear when
you do something quite technical and have been sitting with it for a while.

I think lower_bound might solve my problem though.

The basic idea is that I do statistics at specific intervals e.g. every 10
min. The gathered values are stored as 'second' for the specific time_t.
This way I can go through each record and see at what time it was gathered.

Here is the original (slow!) algorithm:

int CStatistics::GetMaxValueInPeriod(time_t tFrom, time_t tTo)

{

UseMap::iterator it;

int nMaxValue = 0;

for (it = m_pUseMap->begin();

it != m_pUseMap->end();

it++)

{

if (it->first < tFrom)

continue;

if (it->first > tTo)

break;

if (nMaxValue < it->second.m_nTotalFoo)

nMaxValue = it->second.m_m_nTotalFoo;

}

return nMaxValue;

}



Assume it contains values from since 2 years ago which is the worst case
scenario for me.

Then in order to find the first values which fits in my range I scan forward
by omitting all values which were earlier then the time I want.

This takes a lot of time even for a few values. Can lower_bound help me?[/QUOTE]

bool comp( const UseMap::value_type& lhs, const UseMap::value_type& rhs )
{
return lhs.second.m_m_nTotalFoo < rhs.second.m_m_nTotalFoo;
}

int CStatistics::GetMaxValueInPeriod( time_t tFrom, time_t tTo )
{
assert( tFrom <= tTo );
UseMap::iterator begin = m_pUseMap->lower_bound( tFrom );
UseMap::iterator end = m_pUseMap->upper_bound( tTo );
if ( begin == end )
throw runtime_error( "no such period" );
return max_element( begin, end, &comp )->second.m_m_nTotalFoo;
}

I think you will find something like the above much faster. Of course
I'm guessing as to some of the types because you didn't give compilable
code.

Depending on how many elements are in the map, and how big the range is:

int CStatistics::GetMaxValueInPeriod( time_t tFrom, time_t tTo )
{
assert( tFrom <= tTo );
UseMap::iterator begin = m_pUseMap->lower_bound( tFrom );
UseMap::iterator end = begin;
while ( end->first < tTo )
++end;
if ( begin == end )
throw runtime_error( "no such period" );
return max_element( begin, end, &comp )->second.m_m_nTotalFoo;
}

Might be faster, you'll have to measure the two and see which is better
for you.

Finally there is:

int CStatistics::GetMaxValueInPeriod( time_t tFrom, time_t tTo )
{
assert( tFrom <= tTo );
UseMap::iterator begin = m_pUseMap->lower_bound( tFrom );
if ( begin->first < tTo )
throw runtime_error( "no such period" );
int result = begin->second.m_m_nTotalFoo;
++begin;
while ( begin->first < tTo ) {
if ( begin->second.m_m_nTotalFoo > result )
result = begin->second.m_m_nTotalFoo;
++begin;
}
return result;
}

Which is a little harder to read, but may be a hair faster than the
second example.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,756
Messages
2,569,534
Members
45,007
Latest member
OrderFitnessKetoCapsules

Latest Threads

Top