several newbie questions about stl

G

goodbyeera

1. Does std::vector has any chance to shrink its capacity?
My understanding of the member functions relating to capacity:
reserve(): will only reallocate when the passed in argument is larger
than current capacity
insert(): will only reallocate when size is about to exceed current
capacity
erase(): will only invalidate iterators pointing to elements after the
erased one, so it cannot reallocate, quotes from standard:
resize(): simply call insert() or erase(), so won't reallocate

So, if we don't take operator=() and swap() into account, it seems
that during the lifetime of a vector, its capacity won't shrink
forever?

But the standard declares:
A vector is a kind of sequence that supports random access iterators.
In addition, it supports (amortized) constant time insert and erase
operations at the end; insert and erase in the middle take linear
time.

If its capacity won't shrink, then erase at the end operation should
be pure constant time, not just amortized constant time.
Could anyone tell me where my understanding goes wrong?

2. Why std::partition() requires its arguments to be bi-directional
iterator? Why isn't forward iterator enough?
partition(l, u, p)
m = l;
for i = [l, u]
if (p(*i))
swap(m++, i);
return m;

3. The standard says std::binary_search() only requires forward
iterator, and claims its complexity as below:
Complexity: At most log(last - first) + 2 comparisons.
So it seems that it only counts comparison as complexity, but ignores
the cost of advance operation when the iterator is just forward
iterator, but not random access iterator.
Based on the same assumption, why not std::reverse()/std::reverse_copy
() just requires forward iterator instead of bi-directional iterator,
after all, advance operation doesn't count? See below what standard
says about their complexity, respectively.
Complexity: Exactly (last - first)/2 swaps.
Complexity: Exactly last - first assignments.

4. Why the first two arguments of std::find_first_of() requires
forward iterator, but not just input iterator?

5. Difference between std::istream_iterator and
std::istreambuf_iterator?
template <class T, class charT = char, class traits =
char_traits<charT>, class Distance = ptrdiff_t>
class istream_iterator: public iterator<input_iterator_tag, T,
Distance, const T*, const T&>
{...};

template<class charT, class traits = char_traits<charT>>
class istreambuf_iterator: public iterator<input_iterator_tag, charT,
typename traits::eek:ff_type, charT*, charT&>
{...};
Why the former has const modifier for the pointer and reference type,
while the latter doesn't?

6. Difference of argument type of std::generate() and
std::random_shuffle()?
Why the functor argument of the former is passed by value, while the
equivalent of latter is passed by reference? What's the rationale for
such difference?
My understanding is that pass-by-value functor cannot retain its usage
information (e.g. internal state change in the RNG), thus calling
std::generate() again with the same functor will generate the same
sequence of numbers.
Pass-by-reference functor doesn't have this problem, but on the other
hand, it prevents user from passing a temporary object.
It seems that it will be better to use a pass-by-value functor
argument and return the used functor as the function's return value,
just as what std::for_each() does.

7. The prototype of bsearch() is:
void *bsearch(const void *key, const void *base, size_t nmemb, size_t
size, int (*compar)(const void *, const void *));
The base argument is of type (void*) and the return value is of type
(void*). This seems to be of C-like style, which is for the sake of
ease of usage of the return value, such as the strchr() in C:
char *strchr(const char *s, int c);
But in C++, most of such functions have been changed to a pair of
overloaded functions:
const char* strchr(const char* s, int c);
char* strchr(char* s, int c);
So, why the same change is not done to bsearch()?

8. As for std::basic_ostream and std::basic_istream, why the << and >>
operation of bool, short, int, long are all defined as member
function, but only the equivalent of char is defined as non-member
function?
 
A

Alf P. Steinbach

* (e-mail address removed):
[snipped explanations of the questions]
1. Does std::vector has any chance to shrink its capacity?
2. Why std::partition() requires its arguments to be bi-directional
3. The standard says std::binary_search() only requires forward
4. Why the first two arguments of std::find_first_of() requires
forward iterator, but not just input iterator?
5. Difference between std::istream_iterator and
std::istreambuf_iterator?
6. Difference of argument type of std::generate() and
std::random_shuffle()?
7. The prototype of bsearch() is:
void *bsearch(const void *key, const void *base, size_t nmemb, size_t
size, int (*compar)(const void *, const void *));
The base argument is of type (void*) and the return value is of type
(void*). This seems to be of C-like style, which is for the sake of
ease of usage of the return value, such as the strchr() in C:
char *strchr(const char *s, int c);
But in C++, most of such functions have been changed to a pair of
overloaded functions:
const char* strchr(const char* s, int c);
char* strchr(char* s, int c);
So, why the same change is not done to bsearch()?
8. As for std::basic_ostream and std::basic_istream, why the << and >>
operation of bool, short, int, long are all defined as member
function, but only the equivalent of char is defined as non-member
function?

Regarding 1 I think chances are that you're correct, that the time complexity of
erase is needlessly kept open-ended as "amortized" instead of flat-out linear.

Regarding the other questions, except 5, it seems the answer is mostly
"historical accident".

However, those are *good* questions that might help to improve the standard
library definition, if not for first version of C++0x then perhaps in a
corrigendum or technical report, and so I suggest you post this [comp.std.c++],
which as of a few months ago is up and running again, after being down since
early January 2008. Alternatively post it to [comp.lang.c++.moderated], where
far more Really Knowledgable Folks are present. About the only ones more or less
frequently reading and posting here that I know are qualified to answer your
questions -- without having to engage in the same long-winded research that
you and I would have to -- are Andrew Koenig, James Kanze and Pete Becker, and
you're not guaranteed that either of them will read your article or take the
considerable time to respond, or respond to all questions.


Cheers & hth.,

- Alf
 
G

goodbyeera

Thank you, Alf. I'm new to C++ and I got these questions when I'm
reading The C++ Programming Language by Bjarne Stroustrup recently. I
don't know whether these questions are good or silly, because I've
never used STL in any real code yet and these questions are simply my
immediate thoughts during the book reading. I've followed your
suggestion and posted the same article on the c++.moderated and std.c+
+ group. Hope someone could clear up my confusion there.

Thanks,
goodbyeera
 
J

James Kanze

1. Does std::vector has any chance to shrink its capacity?
My understanding of the member functions relating to capacity:
reserve(): will only reallocate when the passed in argument is larger
than current capacity
insert(): will only reallocate when size is about to exceed current
capacity
erase(): will only invalidate iterators pointing to elements after the
erased one, so it cannot reallocate, quotes from standard:
resize(): simply call insert() or erase(), so won't reallocate
So, if we don't take operator=() and swap() into account, it seems
that during the lifetime of a vector, its capacity won't shrink
forever?
Correct.

But the standard declares:
A vector is a kind of sequence that supports random access
iterators. In addition, it supports (amortized) constant time
insert and erase operations at the end; insert and erase in
the middle take linear time.
If its capacity won't shrink, then erase at the end operation
should be pure constant time, not just amortized constant
time. Could anyone tell me where my understanding goes wrong?

Nowhere, really. It's just that the standard uses the
expression amortized constant time a lot elsewhere, so the
amortized ends up attached to constant even in places where it's
not necessary.
2. Why std::partition() requires its arguments to be
bi-directional iterator? Why isn't forward iterator enough?
partition(l, u, p)
m = l;
for i = [l, u]
if (p(*i))
swap(m++, i);
return m;

Because the Hoare's original algorithm (1961) requires
bi-directional iterators. (It may also result in a few less
swaps.)
3. The standard says std::binary_search() only requires
forward iterator, and claims its complexity as below:
Complexity: At most log(last - first) + 2 comparisons. So it
seems that it only counts comparison as complexity, but
ignores the cost of advance operation when the iterator is
just forward iterator, but not random access iterator. Based
on the same assumption, why not
std::reverse()/std::reverse_copy () just requires forward
iterator instead of bi-directional iterator, after all,
advance operation doesn't count? See below what standard says
about their complexity, respectively.
Complexity: Exactly (last - first)/2 swaps.
Complexity: Exactly last - first assignments.

History, probably. There are a lot of seemingly random choices
in the STL.
4. Why the first two arguments of std::find_first_of()
requires forward iterator, but not just input iterator?

Good question. (Typically, it's not very useful over an input
iterator, but there's no reason to forbid it.)
5. Difference between std::istream_iterator and
std::istreambuf_iterator?
template <class T, class charT = char, class traits =
char_traits<charT>, class Distance = ptrdiff_t>
class istream_iterator: public iterator<input_iterator_tag, T,
Distance, const T*, const T&>
{...};

The difference between reading through a streambuf, and reading
through an istream. In particular, with the streambuf_iterator:

-- you can only read characters (char, wchar_t),
-- there are no format flags (note that in istream, skipws is
on by default, so an istream_iterator<char> will not copy
any white space),
-- and the error flags aren't set when you use an
istreambuf_iterator.

In practice, the stream iterators aren't that useful, and the
input stream iterators even less so.
template<class charT, class traits = char_traits<charT>>
class istreambuf_iterator: public iterator<input_iterator_tag, charT,
typename traits::eek:ff_type, charT*, charT&>
{...};
Why the former has const modifier for the pointer and
reference type, while the latter doesn't?

Good question. You'd probably have to ask in comp.std.c++.
6. Difference of argument type of std::generate() and
std::random_shuffle()?
Why the functor argument of the former is passed by value,
while the equivalent of latter is passed by reference? What's
the rationale for such difference?

In general, in the standard library, functional objects are
passed by value. (Why, I don't know, but I suspect that it has
to do with const: if passed by const reference, they have to be
immutable, but if passed by non const reference, you can't pass
a temporary.) In the case of a random number generator, of
course, the object needs state, so pass by value doesn't work
unless an additional level of indirection is introduced.
My understanding is that pass-by-value functor cannot retain
its usage information (e.g. internal state change in the RNG),
thus calling std::generate() again with the same functor will
generate the same sequence of numbers. Pass-by-reference
functor doesn't have this problem, but on the other hand, it
prevents user from passing a temporary object. It seems that
it will be better to use a pass-by-value functor argument and
return the used functor as the function's return value, just
as what std::for_each() does.

Many solutions are possible. The standard pretty much assumes
that functional objects don't have mutable state, since how
often and when they are copied isn't specified; in practice, if
you need mutable state, you need an additional level of
indirection (formally, even for for_each, although in practice,
probably not). Random generators obviously need mutable state,
and for a lot of them, significant mutable state (making copying
expensive). So different rules apply.
7. The prototype of bsearch() is:
void *bsearch(const void *key, const void *base, size_t nmemb, size_t
size, int (*compar)(const void *, const void *));
The base argument is of type (void*) and the return value is of type
(void*). This seems to be of C-like style, which is for the sake of
ease of usage of the return value, such as the strchr() in C:
char *strchr(const char *s, int c);
But in C++, most of such functions have been changed to a pair of
overloaded functions:
const char* strchr(const char* s, int c);
char* strchr(char* s, int c);
So, why the same change is not done to bsearch()?

Probably oversight.
8. As for std::basic_ostream and std::basic_istream, why the
<< and >> operation of bool, short, int, long are all defined
as member function, but only the equivalent of char is defined
as non-member function?

To confuse people.

Seriously, they were all members in the classical iostream. And
changing this broke a lot of code.
 
Z

Zeppe

James said:
Good question. (Typically, it's not very useful over an input
iterator, but there's no reason to forbid it.)

There actually is. An input iterator can read elements only once. If you
make a copy of an input iterator, the copy and the original iterators
will read different values when incremented. There is no way to
implement find_first_of with input_iterators.
Probably oversight.

or because it was useless. You cannot assign a void* anyway, you need to
make a cast, and therefore there is not a big difference to me between a
void* and a const void*, as a return type (as input it allows you to
give const void* if used in the code).
To confuse people.

Seriously, they were all members in the classical iostream. And
changing this broke a lot of code.

It looks like they tried to minimise the number of member functions for
basic_ostream. Since the output of characters has several overloading,
it was probably easier, and more efficient at compilation time, to
provide a simple template put and implement all the others using it.

Regards,

Zeppe
 
J

Juha Nieminen

James said:
Nowhere, really. It's just that the standard uses the
expression amortized constant time a lot elsewhere, so the
amortized ends up attached to constant even in places where it's
not necessary.

Does the standard guarantee that the capacity of a std::vector will
never decrease (other than when swapping or assigning from another vector)?

Or, in other words, does the standard allow or forbid, for example,
pop_back() to decrement the capacity of the vector?
 
G

goodbyeera

There actually is. An input iterator can read elements only once. If you
make a copy of an input iterator, the copy and the original iterators
will read different values when incremented. There is no way to
implement find_first_of with input_iterators.

I am referring only to the first two parameters of std::find_first_of
(), but not the the other two parameters. For the 3rd and 4th
parameters, it has to be forward iterators. But for the first two
parameters, it looks like they are just the same as the parameters of
std::find(). If std::find() requires just input iterator, why
std::find_first_of() requires anything different?
 
Z

Zeppe

Juha said:
Does the standard guarantee that the capacity of a std::vector will
never decrease (other than when swapping or assigning from another vector)?

that's right, there is no way to decrease the capacity of a vector. This
guarantees you that if you do not exceed the vector capacity, the vector
is not reallocated and iterators and references to the content are not
invalidated.

The exception to this rule is swap(), since you can swap the content of
a vector with that of another (possibly smaller) vector, of course
invalidating references and iterators.

Best wishes,

Zeppe
 
Z

Zeppe

I am referring only to the first two parameters of std::find_first_of
(), but not the the other two parameters. For the 3rd and 4th
parameters, it has to be forward iterators. But for the first two
parameters, it looks like they are just the same as the parameters of
std::find(). If std::find() requires just input iterator, why
std::find_first_of() requires anything different?

Sorry, I was thinking at search rather than find_first_of. You're right
for the first two iterators, and I'll tell you more: in the c++ 0x draft
the function signature is changed to accept input iterators
(http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2798.pdf page 926)

Best wishes,

Zeppe
 
J

James Kanze

There actually is. An input iterator can read elements only
once. If you make a copy of an input iterator, the copy and
the original iterators will read different values when
incremented. There is no way to implement find_first_of with
input_iterators.

template< typename Iter1, typename Iter2 >
Iter1
find_first_of(
Iter1 begin1,
Iter1 end1,
Iter2 begin2,
Iter2 end2 )
{
while ( begin1 != end1
&& find( begin2, end2, *begin1 ) == end2 )
++ begin1 ;
}
return begin1 ;
}

Where's the problem is Iter1 is an input iterator?
or because it was useless. You cannot assign a void* anyway,
you need to make a cast, and therefore there is not a big
difference to me between a void* and a const void*, as a
return type (as input it allows you to give const void* if
used in the code).

You need a static_cast, but not a const_cast.
It looks like they tried to minimise the number of member
functions for basic_ostream. Since the output of characters
has several overloading, it was probably easier, and more
efficient at compilation time, to provide a simple template
put and implement all the others using it.

They didn't minimize very much. There's no real reason why <<
int, for example, has to be a member.

There are several advantages in making all of the overloaded
operators members. To do so means, however, that users must be
able to add members---in today's C++, this could be done by
making operator<< a member function template, and allowing users
to specialize it on their types, but this option wasn't
available when the classical iostream was designed.

If you decide to make as few of the overloads members as
possible, then none of them would be members; all of them are
easily defined in terms of the rest of the public interface.
 
J

James Kanze

Does the standard guarantee that the capacity of a std::vector
will never decrease (other than when swapping or assigning
from another vector)?
Or, in other words, does the standard allow or forbid, for
example, pop_back() to decrement the capacity of the vector?

Indirectly, yes, since it guarantees that pop_back will not
invalidate iterators. And that it has constant complexity.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,567
Members
45,042
Latest member
icassiem

Latest Threads

Top