Clean ways to count remove_if() removals?

Jorgen Grahn · Jun 23, 2009

I'm looking at std::remove_if(Iter, Iter, Pred) and
std::list::remove_if(Pred). I'd like to use it to remove elements from
a std::list *and* find out how many elements were removed.

Is there a clean way?

At first I wrote a predicate class which counted how many times its
operator() returned true -- but then I realized that predicates are
passed by value, so if I did

MyPredicate p;
my_list.remove_if(p);
cout << "removed " << p.size << " elements\n";

'p' was copied, and never used again.
I can pass a pointer or reference to a counter:

unsigned size = 0;
MyPredicate p(&size);
...

but that doesn't seem very elegant. Have I missed something? Is there
a standard idiom for dealing with this?

/Jorgen

Victor Bazarov · Jun 23, 2009

Jorgen said:
I'm looking at std::remove_if(Iter, Iter, Pred) and
std::list::remove_if(Pred). I'd like to use it to remove elements from
a std::list *and* find out how many elements were removed.

Is there a clean way?

At first I wrote a predicate class which counted how many times its
operator() returned true -- but then I realized that predicates are
passed by value, so if I did

MyPredicate p;
my_list.remove_if(p);
cout << "removed " << p.size << " elements\n";

'p' was copied, and never used again.
I can pass a pointer or reference to a counter:

unsigned size = 0;
MyPredicate p(&size);
...

but that doesn't seem very elegant. Have I missed something? Is there
a standard idiom for dealing with this?

How about

auto szBefore = my_list.size();
remove_if ...
auto szAfter = my_list.size();
cout << "removed " << szBefore - szAfter << " elements\n";

V

Marcel Müller · Jun 23, 2009

Victor said:
How about

auto szBefore = my_list.size();
remove_if ...
auto szAfter = my_list.size();

If list is what it looks like, .size() is O(n). This might be not intended.

Marcel

Marcel Müller · Jun 23, 2009

Jorgen said:
I can pass a pointer or reference to a counter:

unsigned size = 0;
MyPredicate p(&size);
...

but that doesn't seem very elegant. Have I missed something? Is there
a standard idiom for dealing with this?

I would say the way you showed above. This is for instance exaclty the
way that C# effectively uses when local variables are captured by lambda
expressions.

Marcel

Victor Bazarov · Jun 23, 2009

Marcel said:
If list is what it looks like, .size() is O(n). This might be not intended.

Counting how many times the predicate is called is also O(n). And the
list *could* cache the size somewhere (if it's smart, that is).

BTW, 'remove_if' doesn't change the size. That's my fault. 'erase'
does. So, to be semantically correct, one needs to call 'std::distance'
for the two iterators:

auto itNewEnd = remove_if...
auto removed = std::distance(itNewEnd, my_list.end());

IIRC, 'remove_if' only places the elements to be removed to the end of
the container.

V

Jorgen Grahn · Jun 24, 2009

That's right -- I was about to point that out in my first posting. It
feels stupid to walk through the list element-by-element three times.

Still, that is what I do as a workaround. And in this particular case
it doesn't really matter much for speed -- my lists have a mean size
of 2 and a max size of 11 ...

Counting how many times the predicate is called is also O(n). And the
list *could* cache the size somewhere (if it's smart, that is).

BTW, 'remove_if' doesn't change the size. That's my fault. 'erase'
does. So, to be semantically correct, one needs to call 'std::distance'
for the two iterators:

auto itNewEnd = remove_if...
auto removed = std::distance(itNewEnd, my_list.end());

IIRC, 'remove_if' only places the elements to be removed to the end of
the container.

You are right about std::remove_if, but there is also a
list::remove_if which *does* erase the elements.

/Jorgen

James Kanze · Jun 24, 2009

Counting how many times the predicate is called is also O(n).
And the list *could* cache the size somewhere (if it's smart,
that is).

BTW, 'remove_if' doesn't change the size.

std::list<>::remove_if does (and typically, should be a lot
faster than std::remove_if followed by erase on a list).

Jorgen Grahn · Jun 24, 2009

I would say the way you showed above. This is for instance exaclty the
way that C# effectively uses when local variables are captured by lambda
expressions.

That still doesn't make it more attractive to me ... or does it?
My problem with the code below is that I have an object 'p' with a
worthless state.

But it looks a bit nicer formulated like this:

unsigned size = 0;
baz = std::remove_if(a, b, MyPredicate(&size));

I guess I could live with that.

/Jorgen

SG · Jun 24, 2009

I'm looking at std::remove_if(Iter, Iter, Pred) and
std::list::remove_if(Pred). I'd like to use it to remove
elements from a std::list *and* find out how many elements were
removed.

Is there a clean way?

At first I wrote a predicate class which counted how many times its
operator() returned true -- but then I realized that predicates are
passed by value, so if I did

MyPredicate p;
my_list.remove_if(p);
cout << "removed " << p.size << " elements\n";

'p' was copied, and never used again.
I can pass a pointer or reference to a counter:

unsigned size = 0;
MyPredicate p(&size);
...

but that doesn't seem very elegant. Have I missed something? Is there
a standard idiom for dealing with this?

If remove_if returned a copy of the predicate you could have written

p = list.remove_if(p);

That's how it works fo std::for_each. Since list::remove_if returns
void I think you don't have much of a choice. In C++0x you will be
able to write

list.remove_if(ref(p));

reference_wrapper<T> has an overloaded function call operator which
perfectly forwards parameters. I don't know whether Boost.Ref has any
magic in it to do something similar.

Cheers!
SG

Michael Doubez · Jun 24, 2009

That's right -- I was about to point that out in my first posting. It
feels stupid to walk through the list element-by-element three times.

Still, that is what I do as a workaround. And in this particular case
it doesn't really matter much for speed -- my lists have a mean size
of 2 and a max size of 11 ...

You could also use std::count_if() with your predicate to count the
elements that will be erased. That would make it only 2 walkthrough.

You are right about std::remove_if, but there is also a
list::remove_if which *does* erase the elements.

You could also roll your own loop (in a function):

template<class T>
size_t list_erase_if(std::list<T>& my_list, MyPredicate p)
{
typedef std::list<T>::iterator iterator;
size_t counter = 0;

const iterator begin = my_list.begin();
const iterator end = my_list.end();
for( iterator it = std::find_if(begin,end,p) ;
it != end ;
it = std::find(it,end,p)
)
{
it = my_list.erase( it );
++counter;
}

return counter;
}

Michael Doubez · Jun 24, 2009

If remove_if returned a copy of the predicate you could have written

p = list.remove_if(p);

Which is not a big help since AFAIK there is no guarantee that the
same predicate instance is called on all elements or even that it is
called only once.

In addition to that, there is not much sense in returning a predicate,
it is not like it is supposed to hold a value.

That's how it works fo std::for_each.

But for_each() takes a functor, not a predicate and provides the
guarantee that it returns the function object after it has been
applied to each element (and local state are allowed).

Since list::remove_if returns
void I think you don't have much of a choice. In C++0x you will be
able to write

list.remove_if(ref(p));

reference_wrapper<T> has an overloaded function call operator which
perfectly forwards parameters. I don't know whether Boost.Ref has any
magic in it to do something similar.

Which doesn't help either.

Maxim Yegorushkin · Jun 24, 2009

James said:
The standard requires list<>::size() to have constant time,
which means that it must be cached.

Well, here is stlport4 std::list<>::size() (bundled with Sun Studio 12):

size_type size() const {
size_type __result = distance(begin(), end());
return __result;
}

SG · Jun 24, 2009

Which is not a big help since AFAIK there is no guarantee that the
same predicate instance is called on all elements or even that it is
called only once.

In addition to that, there is not much sense in returning a predicate,
it is not like it is supposed to hold a value.
Right.

But for_each() takes a functor, not a predicate and provides the
guarantee that it returns the function object after it has been
applied to each element (and local state are allowed).

Yeah, I know. If you check out the requirements of the concept
Predicate you'll note only two restrictions: A predicate returns
something that is implicitly convertible to bool and it is not allowed
to modify its arguments. You still can have "state" (i.e. pointer).

I just forgot about lambdas:

int count = 0;
list.remove_if([&count](my_type const& x)->bool{...});

Which doesn't help either.

Please elaborate. I don't have the C++03 standard documents available
but the C++0x draft explicitly mentions

Complexity: Exactly distance(begin(), end()) applications of the
corresponding predicate.

which seems enough a guarantee to me to make the ref(p) or lambda
version work. It doesn't even matter if the function call operator is
applied on the /same/ predicate object.

Cheers!
SG

Michael Doubez · Jun 24, 2009

I just forgot about lambdas:

int count = 0;
list.remove_if([&count](my_type const& x)->bool{...});

Which doesn't help either.

Click to expand...

Please elaborate. I don't have the C++03 standard documents available
but the C++0x draft explicitly mentions

Complexity: Exactly distance(begin(), end()) applications of the
corresponding predicate.

which seems enough a guarantee to me to make the ref(p) or lambda
version work. It doesn't even matter if the function call operator is
applied on the /same/ predicate object.

So it seems.

And C++03 has the same guarantee 23.2.2.4/18:
Complexity: Exactly size() applications of the corresponding predicate

joseph cook · Jun 24, 2009

The standard requires list<>::size() to have constant time,
which means that it must be cached.

So it does. The documentation for SGI STL states that it can be O(N),
and warns that one should always use "empty()" instead of "size()
==0". I wonder why that is...

Joe

Michael Doubez · Jun 24, 2009

The standard says it "should have constant complexity". That's a
recommendation, not a requirement. One issue here is the requirement
that list::splice have constant complexity, which can't be done if size
is also constant complexity.

The size can be memorized, invalidated upon splice and recomputed at
first call of size().

That would make it amortized constant time at worse (at the cost of a
comparison in size() which is small enough).

Jerry Coffin · Jun 24, 2009

[ ... ]

The standard requires list<>::size() to have constant time,
which means that it must be cached.

Where is that requirement? The only thing I know of is the infamouse
"note A" attached to Table 65, which only says it: "should have
constant complexity."

Jerry Coffin · Jun 24, 2009

[ ... ]

The standard says it "should have constant complexity". That's a
recommendation, not a requirement. One issue here is the requirement
that list::splice have constant complexity, which can't be done if size
is also constant complexity.

I was looking at that, but after looking carefully, I believe size()
can be made constant complexity while meeting all the complexity
requirements for splice().

There are three overloads of splice:
void splice(iterator position, list<T, Allocator> &x);
Inserts the entirety of x at before position. The cached size can be
updated in constant time:
cached_size += x.size();
x.cached_size = 0;

void splice(iterator position, list<T, Allocator>& x, iterator i);

This removes x and inserts it as this[position]. The sizes can be
updated in consant time:
++cached_size;
--x.cached_size;

void splice(iterator position, list<T, Allocator> &x, iterator first,
iterator last);

This is the (somewhat) tricky one: the requirement is:
Constant time if &x == this; otherwise, linear time

If &x==this, we're just moving elements around in the same container
-- so the container's size doesn't change. We can just modify the
pointers in constant time, and never have to touch the cached size at
all.

If we're moving elements from one container to another, linear time
is allowed. This allows walking the elements in the list from first
to last to count the elements.

If there was no requirement for updating a cached count, all the
overloads of splice could always have constant complexity.

As an aside (and perhaps I should post this in comp.std.c++ instead,
but since you're apparently reading this thread...) a quick look at
N2857 shows that it still uses the same wording, mentioning "time".
Might it be better to change that from "time" to "complexity"?

Alf P. Steinbach · Jun 24, 2009

* joseph cook:

So it does.

It doesn't.

There are two possible in practice list implementations: one where size is O(1),
and one where splicing is O(1).

The standard allows both.

The documentation for SGI STL states that it can be O(N),
and warns that one should always use "empty()" instead of "size()
==0". I wonder why that is...

In order to make splicing O(1).

Cheers & hth.,

- Alf

Hans Bos · Jun 24, 2009

Alf P. Steinbach said:
* joseph cook:

It doesn't.

There are two possible in practice list implementations: one where size is
O(1), and one where splicing is O(1).

So when using the library I cannot rely on the fact that either operation
may be O(1).
I have to assume that both size() and splice() are O(N).

It would be better if the standard specifies which one is O(1).

Greetings,
Hans.

how to implement a remove_if predicate with iterators vector	0	May 28, 2004
Pruning a std::map efficiently?	14	Apr 1, 2009
Removing elements from std::vector.	6	Nov 15, 2004
Modified answer to Accelerated C++ Exercise 11-6 stops linking?	5	Jan 5, 2010
A different kind of adaptor iterator	4	Dec 5, 2003
First try at a std::vector from 1 to n	17	May 3, 2005
For Peer Review	1	Apr 2, 2010
New flyweight template class	0	Dec 10, 2007

Clean ways to count remove_if() removals?

Jorgen Grahn

Victor Bazarov

Marcel Müller

Marcel Müller

Victor Bazarov

Jorgen Grahn

James Kanze

Jorgen Grahn

SG

Michael Doubez

Michael Doubez

Maxim Yegorushkin

SG

Michael Doubez

joseph cook

Michael Doubez

Jerry Coffin

Jerry Coffin

Alf P. Steinbach

Hans Bos

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads