Alternatives to the C++ Standard Library?


Dietmar Kuehl

Michael said:
The question then becomes how to get the lengths of the sequences, which
are needed to initialize a table for dynamic programming.

Do you need the length of both sequences up front? If so, the best
bet would be asking for two multi-pass sequences, i.e. forward cursors
for both sequences. To determine the actual length, you would always
use

distance(b, e);

where 'b' and 'e' are the beginning and the end of the corresponding
sequence. This would count all elements within the range if the
user of the algorithm does not pass random access cursors. For random
access cursors a simple subtraction would be used automatically. That
is, although you only ask for multi-pass iterators, the algorithm
would take advantage of optimizations available for the actual
parameters.
I suppose one answer is to make them parameters also.

I would not go down this route! The restriction to multi-pass
iterators is not that bad unless the sequences are really tremendous
(i.e. they won't fit into memory together with the other necessary
data). Of course, you might provide an interface to the algorithm
which takes the length as an argument but which only requires input
iterators. This algorithm would then be used to implement one which
determines the length on its own.
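
Something along these lines (a minimal sketch with hypothetical names,
not Michael's actual cursor/property-map version; the single-row
dynamic programming formulation only needs the length of the second
sequence, and the convenience overload measures it with 'distance()',
which subtracts for random access iterators and counts otherwise):

#include <algorithm>
#include <iterator>
#include <vector>

// core: the caller supplies the length; the first sequence is
// traversed only once (input iterators suffice), the second must be
// multi-pass because the inner loop walks it once per element of the
// first.
template <typename InIt, typename FwdIt>
unsigned edit_distance_n(InIt b1, InIt e1,
                         FwdIt b2, FwdIt e2, std::size_t n2)
{
    std::vector<unsigned> row(n2 + 1);
    for (std::size_t j = 0; j <= n2; ++j)
        row[j] = static_cast<unsigned>(j);
    for (unsigned i = 1; b1 != e1; ++b1, ++i) {
        unsigned diag = row[0];     // previous row's value at column j-1
        row[0] = i;
        std::size_t j = 1;
        for (FwdIt it = b2; it != e2; ++it, ++j) {
            unsigned next = std::min(std::min(row[j], row[j - 1]) + 1,
                                     diag + (*b1 == *it ? 0u : 1u));
            diag = row[j];
            row[j] = next;
        }
    }
    return row[n2];
}

// convenience: requires multi-pass iterators and determines the length
// on its own.
template <typename FwdIt1, typename FwdIt2>
unsigned edit_distance(FwdIt1 b1, FwdIt1 e1, FwdIt2 b2, FwdIt2 e2)
{
    return edit_distance_n(b1, e1, b2, e2,
        static_cast<std::size_t>(std::distance(b2, e2)));
}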
 

Dietmar Kuehl

Steven said:
Dietmar Kuehl wrote:
There are places where the wording is overly "artful". A very good
example is with the use of the term "header".

I don't think "header" is can be replaced with anything which better
describes it to the target audience of the C++ standard (people who
know C++ and just want to find out details of the actual specification).
The term "header" for sets of declarations is sufficiently old and
established that everybody fluent in C++ (or C) knows what it is about.
The standard does not live in isolation nor is it intended to do so.
Actually, there are known problems with the standard which are more
harmful. For example, some things are not really specified because it
is assumed that the reader of the standard magically knows what the
actual intend was.
You aren't seeing what I'm seeing. And I mean in terms of my sent mail,
and my inbox.

You mean, you get requests from committee members pleading for a job? I'm
convinced that none of the committee members intended to secure his
job by deliberately making the standard the way it is.
I really wasn't thinking in terms of education. I was thinking in terms
of formal development of ideas. Mathematical rigor.

Ah, you mean the standard committee should have rather gone into some
suitable ivory tower instead of practically working on a document. Of
course, it would have been desirable to be absolutely precise, have
no contradictions, have definitions for everything, etc. The committee
tried and actually still tries to do so. Practical considerations are,
however, still important. For example, the time we can work on the
standard is rather limited since everybody does something else
for a living. I'm not aware of anybody who has the liberty
to work full-time on the standard although some organizations devote
substantial amounts of time to the process already. ... and the
community wants results faster than they can be delivered, too.

I was, to some extent speaking metaphorically. More in the sense of ivory
tower than economics.

Well, the C++ standard was not and never will be produced in an ivory
tower. It is driven by practical and economical considerations and
this is not going to change.
Not sure what you mean here. The reason I mentioned the Java-style
iterator is because I believe in reality it is just using the three
operations you had enumerated, and I was suggesting that it could
therefore be optimized in much the same way.

I don't think that Java enumerators can be substantially optimized.
For one, it is hard enough to determine that the object's type cannot
change in the given context and even if this can be determined, the
three operations (advance, access, and check) are too intertwined to
be separable at all. In addition, Java enumerators lack the interface
to take advantage of additional operations.
From my reading of TC++PL, the best performance is often
obtained by using the member function for a particular container
template.

You got it wrong. The generic algorithms are not applicable in all cases
but in some of these cases the internal structure of a certain container
still allows an implementation with good performance. In cases like this,
the container provides a corresponding operation as a member. For
example, the generic algorithms for binary search (aka 'lower_bound()',
'upper_bound()', and 'equal_range()') are only applicable to random
access iterators. However, 'set's and 'map's can provide efficient
searches [for the key type only, though] although they only support
bidirectional iteration. Thus, there are specialized versions as member
functions.
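
For instance (a minimal sketch; both calls are well-formed, but only
the member has the intended complexity on a 'set'):

#include <algorithm>
#include <set>
#include <vector>

int main()
{
    std::vector<int> v;
    v.push_back(1); v.push_back(3); v.push_back(5);
    // generic algorithm: O(log n) with random access iterators
    std::vector<int>::iterator vi =
        std::lower_bound(v.begin(), v.end(), 3);

    std::set<int> s(v.begin(), v.end());
    // member algorithm: O(log n) by walking the tree; the generic
    // algorithm applied to s.begin()/s.end() would still give the
    // right answer but needs O(n) iterator steps since the iterators
    // are only bidirectional
    std::set<int>::iterator si = s.lower_bound(3);

    return (vi != v.end() && si != s.end()) ? 0 : 1;
}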

I thought that's what I said.

Maybe you should understand my wording, then review your wording, and
then realize that they actually state rather different things with
mine reflecting what Nico almost certainly wanted to state. Essentially,
I'm saying that the generic (non-member) algorithms have the best
performance for all sequences to which they are applicable (*). The
container classes may provide specialized (member) algorithms in cases
where the generic algorithm is not applicable but an algorithm with
good (or for the given data structure best) performance can be
implemented due to the knowledge of the data structure. Essentially,
your statement implies that there is a choice between algorithms used
(generic or specific ones). This is not the case: containers only have
specialized algorithms if the generic ones cannot be applied.

(*) Actually, this statement is too strong in some cases: it is sometimes
possible to create a faster algorithm based on special properties of
the used data structures. This is, however, essentially only due to
incomplete sets of concepts and correspondingly incomplete implementations
of algorithms. Conceptually, the generic algorithms should be as good
as any special algorithm - and in general even better because having to
maintain only one (or a few) generic implementations of an
algorithm is more effective than implementing the same algorithms for
each data structure and should thus provide the freedom to aggressively
optimize the generic implementation.
I was thinking of a strategy implemented at a higher level. That is, the
compiler actually swapping out the algorithm to get the best performance.

... and I explained that generic algorithms actually indeed determine
the best algorithm (at compile time) but do so based on the sequence
concepts, entirely ignoring any possibly involved container. Of course,
I've been trying to bring home the idea that containers are entirely
irrelevant in STL for at least half a dozen articles in this thread
(and many more before this thread...).
I don't see why the compiler is obligated to perform all these operations
if the same behavior can be obtained by eliminating one of them.

Actually, the compiler does not really do it at all. It is the
implementer of the generic algorithm who does. The compiler is then
forced to choose the right algorithm by various rules of the C++
language, e.g. overload resolution or SFINAE.
I've done some interesting things with them, but unless I'm using them
every day, they are difficult to understand. They take me more time to
understand and modify than does the slightly more verbose approach they
replace.

["they" and "them" referring to Java's anonymous classes]

I agree with this. However, the envisioned lambda expressions are
only semantically similar, not syntactically, anyway. In some sense,
you can view the envisioned lambda as nothing other than a block.
The only news is that blocks can also be passed to functions, in which
case they become nothing other than a functor. To allow arguments to
these functors, you can use the "special" variables "_1", "_2", ... I
don't consider this overly complicated. Of course, it assumes that
the user has at least a vague idea about what a "functor" is. However,
this is a pretty central concept and without knowing it a user would
not be very effective in using STL algorithms anyway.
There's also the problem of not seeing everything relevant to the
construct where it is located. I would have to think about the topic a
bit to provide a concrete example, but I am confident in what I asserted.

I'm also confident in what I have asserted.
The proposed foreach construct has the ability to take bounds as arguments
so I could (probably) make this work. This is just something I have handy
that shows the kind of complexity I am thinking about. I wouldn't even
know where to begin doing it with a lambda expression:

The funny thing is that it wouldn't look that much different at all
- despite the fact that I wasn't able to locate any use of a "for_each"
construct... To make use of lambda expressions in this ordinary C++
code, you would essentially replace the "for" statements by function
templates (which can be provided by a library) which essentially call
each block with a corresponding index. The index would not be called
"i", "j", ... but "_1", "_2", ..., e.g.

for_each_index(_radii, {
    osg::Vec2 unv(_radii[_1 + 1] - _radii[_1]);
    unv.normalize();

    for_each_index(_radii, _1, _ring, {
        osg::Vec3 v0(_radii[_1][0], _radii[_1][1] * _ring[_2][0],
                     _radii[_1][1] * _ring[_2][1]);
        // ... (I think it is clear how to modify the remainder to
        // achieve the same effect of the loop)
    });
});


Of course, this uses two different overloads of the 'for_each_index()'
function:
- one taking a unary functor as argument.
- one taking a binary functor as argument (the first argument is
fixed by the function template's second argument, the second is
the varying index)
But I don't see how this distinction is any different for Java than it is
for C++. My point in saying that you can call toArray on a Java container
was that you can impose a random access mapping over the collection, and
use random access algorithms to operate on it. Mind you, with Java you
will most likely be sorting pointers.

Imposing any random access structure (one where all elements can be
accessed in O(1)) on a container where you can insert in O(1) at
arbitrary positions involves at least an O(n) operation. Although it
reduces the constant when only using pointers, it does not change
the asymptotic complexity. Thus, doing such a conversion necessarily
imposes a substantial overhead, especially if the actual operation
can be done with a complexity smaller than O(n), like searching in
a sorted sequence, which can be done in O(log n). I assume you know
enough about complexity theory to make sense of these funny O(f(n))
notations (if not you should read up on them; this is a pretty
important topic if you want to do serious software development).
That has advantages and disadvantages.
One advantage is that it's possible to impose multiple mappings at the
same time on the same collection.

Although it is possible with the current STL concepts (if you are
trying hard enough) this is something which is trivially done using
cursors and property maps as the basic concepts - one of the reasons
why I'm trying to move the committee to refine the current STL
concepts to use cursors and property maps as the basic sequence
abstraction instead of iterators.
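
Roughly, the split looks like this (a sketch with made-up names, not
the actual proposal): the cursor handles only movement and comparison,
the property maps handle access, so several mappings over one
collection can coexist:

#include <cstddef>

struct index_cursor {                   // movement only
    std::size_t i;
    void next() { ++i; }
    bool operator!=(const index_cursor& o) const { return i != o.i; }
};

template <typename T>
struct array_property_map {             // access only
    T* base;
    T& get(const index_cursor& c) const { return base[c.i]; }
};

// two property maps over the same cursor range give two simultaneous
// mappings of one underlying collection, e.g.:
//   double x_data[10], y_data[10];
//   array_property_map<double> xs = { x_data };
//   array_property_map<double> ys = { y_data };
//   index_cursor c = { 0 }, e = { 10 };
//   for (; c != e; c.next())
//       ys.get(c) = 2 * xs.get(c);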
I was saying that you could convert the linked list to an Array in order
to get random access. But that was assuming we were talking about C++
random access algorithms.

At least I was talking about generic algorithms which do some funny
things to provide the best performance achievable for a given sequence.
Converting between different kinds of sequence abstraction, like
converting between an item-based one (e.g. the forward or bidirectional
access provided by linked lists) and a random access based one (i.e. you
can move between arbitrary elements in O(1) time), is not one of these
funny things, however. The key to generic algorithms is to provide
efficient implementations based on what you pass to them. If you pass
them a list iterator (i.e. you get forward or bidirectional access) they
use just this and do the best possible on the basis of this restriction.
If you pass them a vector iterator (i.e. you get random access) they
may be able to use a more efficient algorithm due to the enhanced
capabilities.
I would have to read up on exactly what determines the backing data
structure of the various Java collections. They may actually hold
everything with an Array of pointers, and merely manipulate pointers to
provide the different interfaces. If you add an element to a linked list
you could put it at the end of the array and put its pointer somewhere
else in the linked list. Yes, you will pay an indirection cost. And Java
may also leave holes on removal of elements, and pack it when it feels
like it. I don't recall, if I ever knew.

You cannot get both: insertion at an arbitrary position and random
access. Well, you can get insertion at an arbitrary position and
random access if you don't need the random access order to retain
the original order. Of course, losing the original order
effectively kills your chances to use a more efficient algorithm
taking advantage of random access. That is, simply adding things to
the end of an array does not help you when you want to pass the
sequence to an algorithm.

I think rather than reading up on Java stuff you should at least
consider trying to understand what I'm saying. I'm convinced that
this provides more interesting insights into implementation of
algorithms than finding out about details of Java's internal data
structures...
I don't know how their so-called generics work, but supposedly they
address some of that. I'm sure the designers of Java understood the
tradeoff; they just opted for sloppy type checking to get flexible
behavior.

I don't think that the Java designers understood anything about generic
programming at all, otherwise they would not have done what they did.
I understand the restrictions they felt themselves caught in but I have
seen a much better approach to generics on a JVM than what Java now has!
They opted for something which was done in C++ something like 15 years
ago and which became what templates are now. Essentially, the Java
designers refused to learn the lessons learned in the C++ context (of
course, they also refused to learn from other experience before, too).
The one feature of
Java Arrays that makes them superior to C++ built-ins is the .size() (or
is that .length()?) method.

It is trivial to create a class which looks like a built-in array but
adds a 'size()' member. The standard library does not do it because it
is deemed sufficient to provide 'std::vector': the two extra pointers
per object are not considered to be too expensive and I doubt that
there are indeed many applications where they are harmful. In the few
cases where they are, it is sufficiently easy to create a suitable
class [template].
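
A minimal sketch of such a class (essentially what boost::array,
mentioned later in this thread, provides): built-in array storage and
layout, no extra pointers, plus a 'size()' member and iterator
typedefs:

#include <cstddef>

template <typename T, std::size_t N>
struct sized_array
{
    T elems[N];                      // aggregate: brace-initializable

    typedef T*       iterator;
    typedef const T* const_iterator;

    std::size_t size() const { return N; }
    iterator begin() { return elems; }
    iterator end()   { return elems + N; }
    const_iterator begin() const { return elems; }
    const_iterator end() const   { return elems + N; }
    T& operator[](std::size_t i) { return elems[i]; }
    const T& operator[](std::size_t i) const { return elems[i]; }
};

// usage: sized_array<int, 3> a = { { 1, 2, 3 } };
//        std::size_t n = a.size();   // no sizeof tricks needed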
You can do it with sizeof, but that gets really
weird when you start passing them as parameters.

Of course, I have rare occasion to pass arrays in the first place. I
generally pass iterator pairs to suitable function templates. Maybe you
should try to understand what STL is about and you will realize that
there are few situations where you want to restrict your functions to
accept arrays anyway.
Other than the fact that the guys at SuSE change my keyboard mapping once
a week, I can do a reasonable job of typing in Old Norse, so the umlauts
come through, no problem.

Of course, I assume that others benefit from this thread, too (?).
See my code example above.
Ditto.

When I was able to divide the STL conceptually into two groups of things,
collections and algorithms, I was able to think about it in finite terms.
It no longer seemed open-ended. That's what I mean. That's what I
consider the first step in really understanding something.

Except that neither the set of collections nor the set of algorithms
is finite in the STL! Part of understanding STL is that although you
have an infinite set of collections, you only need to implement each
algorithm a finite (actually small) number of times. That is, if you
want an algorithm which is applicable to all containers, you only
need to provide one implementation thereof (possibly, you need a few
more implementations if you want to take advantage of special structures
for providing better performance but this is only about optimization).
On the other hand, if you want to create a container which can be used
with all algorithms, you only need to provide a few entities abstracting
your container (e.g. iterators). That is, both the set of containers
and the set of algorithms are open. The only thing which is indeed
conceptually finite in STL is the set of concepts (and even this is
only finite per domain addressed by the algorithms and data structures).

BTW, STL conceptually consists of at least four different things:
- concepts
- algorithms
- iterators and functors
- containers
(listed in decreasing order of importance).
What I said is completely accurate regarding the iterators that are
associated with containers. I have no idea what you are talking about.

It wouldn't have been necessary to state the last sentence since this is
indeed obvious. In several messages I tried to explain that iterators are
a concept which is entirely independent of containers. You don't need any
container but you can still have a sequence which is accessed by
iterators. You are right that there are iterators associated with
containers but this only means that containers have associated iterators
and that they may indeed sometimes depend on them. On the other hand, you
can create iterators for containers which are ignorant of iterators, i.e.
you can apply STL algorithms to containers which were created in complete
unawareness of STL in the first place! All you need to do is to create
suitable iterators.
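
For example (a minimal sketch): given a C-style linked list written in
complete ignorance of STL, one small iterator class makes every
algorithm requiring no more than forward iteration applicable to it:

#include <algorithm>
#include <iterator>

struct node { int value; node* next; };   // the STL-unaware container

class node_iterator
    : public std::iterator<std::forward_iterator_tag, int>
{
    node* p_;
public:
    explicit node_iterator(node* p = 0) : p_(p) {}
    int& operator*() const { return p_->value; }
    node_iterator& operator++() { p_ = p_->next; return *this; }
    node_iterator operator++(int)
        { node_iterator t(*this); ++*this; return t; }
    bool operator==(const node_iterator& o) const { return p_ == o.p_; }
    bool operator!=(const node_iterator& o) const { return p_ != o.p_; }
};

// usage: std::find(node_iterator(head), node_iterator(), 42)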

... and maybe you will eventually understand what I'm talking about and
realize that your statement was not as accurate as you apparently think
it was.
Every STL container other than bitset has iterator typedefs. They have an
iterator, a const_iterator, a reverse_iterator, and a
const_reverse_iterator. The sequence types have a difference type for
storing the difference between iterator positions.

You aren't telling me about STL containers, are you? If you do, you might
want to read the second paragraph in the acknowledgments of Nico's book...
(although since then I have not only implemented the locales and IOStreams
library but most of the standard C++ library).

[relatively irrelevant although correct stuff about particular iterators
removed]
But it's important to bear in mind that iterators are a pure abstraction,
so anything that acts like an iterator is an iterator.

The above sentence is part of the key to STL. I just have the strong
impression you have not yet digested its implications. Also, it is
a major oversimplification when it comes to concept hierarchies.
Perhaps I'm wasting my time reading Josuttis's The C++ Standard Library?

Perhaps. It would not be due to Nico's presentation, though.
I only have 8 more pages left in Chapter 13. But perhaps I'm wasting my
time. Can you suggest a *_good_* book. Guess I'll skip chapter 14. I'm
sure the quality of presentation is not up to your high standards.

This is an interesting statement, especially taking the last sentence
of the first paragraph of Chapter 14 into account... I'm undecided on
whether you are ridiculing me or you didn't know what it says. In any
case, you are right with your assumption that I would present the
contents differently today (but I still think it is appropriate).
No. And that's not really what I'm saying. I don't believe I would be
able to write intuitive code using the feature.

Can you point out where

for_each(vec, { std::cout << _1; });

is more complex than

for_each(int i: vec) { std::cout << i; }

? These two statements are indeed different but I seriously don't see
where they are substantially different to justify that the first
code is not intuitive...

There is clearly a lot of tricky stuff to learn here:
http://www.boost.org/doc/html/lambda/le_in_details.html

I'm well aware of Boost::Lambda and how it is implemented, including
the more complicated scenarios. Still, this library does not cover
everything, nor can it.
 

Michael Olea

Dietmar said:
... you would always use

distance(b, e);

[which does a simple subtraction when it can, otherwise counts]

Perfect. Done.

(I was wondering how "distance" could specialize on iterator type, but
looking at a g++ version I see it simply dispatches to an overloaded
function taking a 3rd argument of type xxx_iterator_tag.)
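
The dispatch looks roughly like this (a sketch of the common
implementation technique with hypothetical names, not any particular
library's code):

#include <iterator>

template <typename It>
typename std::iterator_traits<It>::difference_type
my_distance_impl(It b, It e, std::input_iterator_tag)
{
    typename std::iterator_traits<It>::difference_type n = 0;
    for (; b != e; ++b)
        ++n;                     // count element by element
    return n;
}

template <typename It>
typename std::iterator_traits<It>::difference_type
my_distance_impl(It b, It e, std::random_access_iterator_tag)
{
    return e - b;                // simple subtraction
}

template <typename It>
typename std::iterator_traits<It>::difference_type
my_distance(It b, It e)
{
    // the tag's type selects the overload at compile time; forward and
    // bidirectional tags derive from input_iterator_tag, so they fall
    // back to the counting version
    return my_distance_impl(b, e,
        typename std::iterator_traits<It>::iterator_category());
}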

Now I have a generic edit_distance algorithm that cleanly separates the
notions of movement and access (a real issue in sequence analysis), and is
as efficient as hand crafted versions would be. Gotta love generic code!

The interface, though, is a little cumbersome - 6 parameters to specify two
sequences. A higher level abstraction - "Sequence" - might be called for:

http://boost-consulting.com/projects/mtl4/libs/sequence/doc/html/

(I know you are aware of this link, I am assuming there is some non-zero
chance others may be reading this thread.) Then maybe the interface would
look like:

template<class ReadableForwardSeq1, class ReadableForwardSeq2>
unsigned int edit_distance(ReadableForwardSeq1 s1, ReadableForwardSeq2 s2);

I don't know if that would simplify client code much, but there is another
motive: there will be a functor version of edit_distance, one of a number
of metric functors to be passed to a handful of generic cluster analysis
algorithms. The idea is that these algorithms do not need to know anything
at all about the objects being clustered (vectors, tuples, strings,
graphs...) other than some notion of proximity, which is provided by a
metric (or alternatively a similarity) functor. And in general the call
my_metric(x, y) seems more natural than my_metric(b1, e1, b2, e2, rpm1,
rpm2).
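
Such a functor might look like this (a sketch with hypothetical names;
it assumes an iterator-pair edit_distance like the one discussed
earlier in the thread):

// assumed to exist: the iterator-pair edit_distance sketched earlier
template <typename FwdIt1, typename FwdIt2>
unsigned edit_distance(FwdIt1 b1, FwdIt1 e1, FwdIt2 b2, FwdIt2 e2);

// adapts the iterator-pair interface to the my_metric(x, y) form used
// by the clustering algorithms
struct edit_distance_metric
{
    template <typename Sequence>
    unsigned operator()(const Sequence& a, const Sequence& b) const
    {
        return edit_distance(a.begin(), a.end(), b.begin(), b.end());
    }
};

// usage with any clustering code that only needs my_metric(x, y):
//   edit_distance_metric my_metric;
//   unsigned d = my_metric(std::string("kitten"),
//                          std::string("sitting"));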

Just thinking out loud here. Anyway, I am going out of town tomorrow morning
for a meeting - and it is *very nice* to be walking into that meeting with
a working version of edit_distance based on the cursor/property_map idea,
where edit_distance is serving as a model for many other functions like it.

- Michael
 

Steven T. Hatton

Dietmar said:
I don't think "header" is can be replaced with anything which better
describes it to the target audience of the C++ standard (people who
know C++ and just want to find out details of the actual specification).
The term "header" for sets of declarations is sufficiently old and
established that everybody fluent in C++ (or C) knows what it is about.

In C++ "header" does not mean the same thing that it means on page 82 of
K&R. I try do distinguish between 'header' meaning any header file, and
Standard Header, meaning a member of a particular set of entities specified
in the C++ Standard. I treat "Standard Header" as a proper noun. I'm sure
the English professors can throw their Latin words at me for it, but I
don't care.
The standard does not live in isolation nor is it intended to do so.
Actually, there are known problems with the standard which are more
harmful. For example, some things are not really specified because it
is assumed that the reader of the standard magically knows what the
actual intent was.

One thing I found lacking, though it seems clearly to have been intended,
is a specification of what it means for one program element to appear
after another. There is, AFAIK, no stipulation that a file be parsed in
ascending address order, that is, that source text should be read top to
bottom, left to right. That may seem to be a foregone conclusion. Perhaps
it's best left as such, rather than trying to specify what top to bottom,
left to right means in terms of a translation unit. Doing otherwise could
result in volumes of philosophical essays.
I'm
convinced that none of the committee members intended to secure his
job by deliberately making the standard the way it is.

It's not the big guys with the trillion dollar market capitalization.
Ah, you mean the standard committee should have rather gone into some
suitable ivory tower instead of practically working on a document. Of
course, it would have been desirable to be absolutely precise, have
no contradictions, have definitions for everything, etc. The committee
tried and actually still tries to do so.

There is a big difference between saying 'the Committee should have done
such and such', and saying 'I was surprised to find such and such'. I did
not say it was wrong that the Standard is constructed as it is. I was just
surprised to find that it is so constructed. Yes, I made the comment in the
context of discussing the understandability of the Standard, but I only
made it as an observation about the structure, offered for consideration.
It was not a criticism.

Computers offer us a powerful collection of tools, and the ability to build
new tools for analysing the structure of documents such as the C++
Standard. Personally, I find the prospect attractive as an intellectual
exercise, and believe it could have practical benefit.
Practical considerations are,
however, still important. For example, the time we can work on the
standard is rather limited since everybody does something else
for a living. I'm not aware of anybody who has the liberty
to work full-time on the standard although some organizations devote
substantial amounts of time to the process already. ... and the
community wants results faster than they can be delivered, too.

Understood. A careful analysis of the structure of the Standard is probably
something better done in the offseason. I've suggested the text of the
document be converted to XML in order to facilitate such a study. Working
with it in PDF is beyond painful for any purpose other than simply reading
it or printing it. I suspect copying from the document when it's open in
Acrobat is easier under Windows than under X, but that would force me to
work in an environment lacking all the other tools I rely on.
Well, the C++ standard was not and never will be produced in an ivory
tower. It is driven by practical and economical considerations and
this is not going to change.

Actually, C and C++ are the products of a different culture than the one
that exists today. Bell Labs used to have all the reverence bestowed
on a major university. Likewise for XEROX PARC. But I don't care to play
language lawyer in defending my choice of words. Let them fall as they
will.
I don't think that Java enumerators can be substantially optimized.
For one, it is hard enough to determine that the object's type cannot
change in the given context and even if this can be determined, the
three operations (advance, access, and check) are too intertwined to
be separable at all. In addition, Java enumerators lack the interface
to take advantage of additional operations.

There is a difference between what can be done in Java as implemented on
a JVM and what can be done with the syntax of Java-style iterators. I'm
not sure of the distinction between iterators and enumerators, and don't
have time to look it up, but these are two distinct interfaces in Java.
My point was regarding the relationship between the syntax of the
programming language and the ability of the compiler to optimize it. It's
easy to lose sight of the fact that the source language does not
necessarily dictate what the compiler can optimize. As regards flexibility
of syntax at the source level, I agree that Java-style iterators are not
as flexible. I often found myself casting things to Array (a rather
arcane exercise in Java) to get a more comfortable interface than that
provided by Java containers.
Maybe you should understand my wording, then review your wording, and
then realize that they actually state rather different things with
mine reflecting what Nico almost certainly wanted to state. Essentially,
I'm saying that the generic (non-member) algorithms have the best
performance for all sequences to which they are applicable (*). The
container classes may provide specialized (member) algorithms in cases
where the generic algorithm is not applicable but an algorithm with
good (or for the given data structure best) performance can be
implemented due to the knowledge of the data structure. Essentially,
your statement implies that there is a choice between algorithms used
(generic or specific ones). This is not the case: containers only have
specialized algorithms if the generic ones cannot be applied.

There are some cases where he states that std::list provides member
functions which are superior to the namespace local (he says "global", but
I try to avoid using foul language in formal discussions) algorithms. See
for example §9.8.1. You are thinking of the cases where the namespace
local algorithms are not applicable to associative containers. Bear in mind
that the highlighter ink is still wet on my copy of TC++SL.
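
The std::list case in a nutshell (a minimal example; the commented-out
line would not compile):

#include <algorithm>
#include <list>
#include <vector>

int main()
{
    std::vector<int> v(3, 1);
    std::sort(v.begin(), v.end());   // fine: random access iterators

    std::list<int> l(3, 1);
    // std::sort(l.begin(), l.end()); // error: only bidirectional
    l.sort();                         // member algorithm, O(n log n)
    return 0;
}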

Agreed.
Actually, the compiler does not really do it at all. It is the
implementer of the generic algorithm who does. The compiler is then
forced to choose the right algorithm by various rules of the C++
language, e.g. overload resolution or SFINAE.

I'd have to really get down into the weeds to reply intelligently, and I
don't believe that would be practical for a person with my limited
understanding of these matters. As for SFINAE, I thought that was the
silliest "look at the neat tricks we can do with templates" example
possible, until I saw the paper on overloading operator.().
I've done some interesting things with them, but unless I'm using them
every day, they are difficult to understand. They take me more time to
understand and modify than does the slightly more verbose approach they
replace.

["they" and "them" referring to Java's anonymous classes]

I agree with this. However, the envisioned lambda expressions are
only semantically similar, not syntactically, anyway. In some sense,
you can view the envisioned lambda as nothing other than a block.
The only news is that blocks can also be passed to functions, in which
case they become nothing other than a functor. To allow arguments to
these functors, you can use the "special" variables "_1", "_2", ... I
don't consider this overly complicated. Of course, it assumes that
the user has at least a vague idea about what a "functor" is. However,
this is a pretty central concept and without knowing it a user would
not be very effective in using STL algorithms anyway.

Perhaps I'm too closed-minded on the issue, but when I started reading
about delayed evaluation, etc., I had the impression that lambda can get
pretty tricky.
There's also the problem of not seeing everything relevant to the
construct where it is located. I would have to think about the topic a
bit to provide a concrete example, but I am confident in what I asserted.

I'm also confident in what I have asserted.
The proposed foreach construct has the ability to take bounds as arguments
so I could (probably) make this work. This is just something I have handy
that shows the kind of complexity I am thinking about. I wouldn't even
know where to begin doing it with a lambda expression:

The funny thing is that it wouldn't look that much different at all
- despite the fact that I wasn't able to locate any use of a "for_each"
construct... To make use of lambda expressions in this ordinary C++
code, you would essentially replace the "for" statements by function
templates (which can be provided by a library) which essentially call
each block with a corresponding index. The index would not be called
"i", "j", ... but "_1", "_2", ..., e.g.

for_each_index(_radii, {
    osg::Vec2 unv(_radii[_1 + 1] - _radii[_1]);
    unv.normalize();

    for_each_index(_radii, _1, _ring, {
        osg::Vec3 v0(_radii[_1][0], _radii[_1][1] * _ring[_2][0],
                     _radii[_1][1] * _ring[_2][1]);
        // ... (I think it is clear how to modify the remainder to
        // achieve the same effect of the loop)
    });
});


Of course, this uses two different overloads of the 'for_each_index()'
function:
- one taking a unary functor as argument.
- one taking a binary functor as argument (the first argument is
fixed by the function template's second argument, the second is
the varying index)

Perhaps I should study it more carefully. Thanks.
Imposing any random access structure (one where all elements can be
accessed in O(1)) on a container where you can insert in O(1) at
arbitrary positions involves at least an O(n) operation. Although it
reduces the constant when only using pointers, it does not change
the asymptotic complexity. Thus, doing such a conversion necessarily
imposes a substantial overhead, especially if the actual operation
can be done with a complexity smaller than O(n), like searching in
a sorted sequence, which can be done in O(log n). I assume you know
enough about complexity theory to make sense of these funny O(f(n))
notations (if not you should read up on them; this is a pretty
important topic if you want to do serious software development).

I'm not fully convinced of this. If the elements actually are stored in an
array with the possibility of holes, providing random access *may* not be
as expensive as you are suggesting.
You cannot get both: insertion at an arbitrary position and random
access. Well, you can get insertion at an arbitrary position and
random access if you don't need the random access order to retain
the original order. Of course, losing the original order
effectively kills your chances to use a more efficient algorithm
taking advantage of random access. That is, simply adding things to
the end of an array does not help you when you want to pass the
sequence to an algorithm.

I'm hesitant to accept absolute statements about whether it can "help".
There may be a way to leverage the order of the linked list mapping in
order to optimize the production of a packed, ordered array.

I don't think that the Java designers understood anything about generic
programming at all, otherwise they would not have done what they did.
I was talking about the original tradeoff. Basically Gosling took the Emacs
Lisp idea that a list is a list is a list, and applied it to an OOPL that
looks like C++ and acts like Smalltalk.
I understand the restrictions they felt themselves caught in but I have
seen a much better approach to generics on a JVM than what Java now has!
They opted for something which was done in C++ something like 15 years
ago and which became what templates are now. Essentially, the Java
designers refused to learn the lessons learned in the C++ context (of
course, they also refused to learn from other experience before, too).

I can't comment on what they now have. About two years ago I suggested they
leave things alone and simply focus on effectively using what they had.
The one feature of
Java Arrays that makes them superior to C++ built-ins is the .size() (or
is that .length()?) method.

It is trivial to create a class which looks like a built-in array but
adds a 'size()' member. The standard library does not do it because it
is deemed sufficient to provide 'std::vector': the two extra pointers
per object are not considered to be too expensive and I doubt that
there are indeed many applications where they are harmful. In the few
cases where they are, it is sufficiently easy to create a suitable
class [template].

The syntax in C++ is not as elegant for creating multidimensional jagged
arrays of std::vector as it is in Java for creating their smooth
counterpart.
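
For comparison, a minimal sketch (the Java analogue being roughly
'int[][] a = new int[3][];' followed by per-row allocation):

#include <vector>

int main()
{
    std::vector<std::vector<int> > jagged(3);  // three empty rows
    jagged[0].resize(2);
    jagged[1].resize(5);
    jagged[2].resize(1);                       // each row its own length
    return jagged.size() == 3 ? 0 : 1;
}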
Of course, I have rare occasion to pass arrays in the first place. I
generally pass iterator pairs to suitable function templates. Maybe you
should try to understand what STL is about and you will realize that
there are few situations where you want to restrict your functions to
accept arrays anyway.

It's more the case that the data I'm working with lends itself naturally to
being expressed as multidimensional arrays. The same can be done using
std::valarray and std::gslice, but it is not as intuitive for me.

Suggesting that there may be an STL approach to the same problem may be
useful. Suggesting that I am not trying to learn to use the STL is wrong.
It may well be that STL iterators will do very much the same thing that
using boost::array<> will do. When I pass a boost::array<>* I'm passing a
reference to the data, and information about its bounds. I typically want
to iterate over all elements of the array, and often want to do that for
each member array. This is easy to do if I can simply pass the whole data
structure by reference, and have the called function determine the bounds.
STL iterators may give me more flexibility in such a situation, but I'm not
sure I gain much from that. It would seem more productive to find examples
where the array approach is used, and show the advantage of using the STL
approach instead, than it would be to suggest that a person is not trying
to learn the STL approach.
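
A minimal sketch of the pattern just described (hypothetical names;
the bounds are deduced from the reference, and the same deduction works
for boost::array<T, N>& parameters):

#include <cstddef>

template <typename T, std::size_t N>
void zero_all(T (&a)[N])                // built-in array: N deduced
{
    for (std::size_t i = 0; i != N; ++i)
        a[i] = T();
}

// a two-level overload covers "for each member array":
template <typename T, std::size_t N, std::size_t M>
void zero_all(T (&a)[N][M])
{
    for (std::size_t i = 0; i != N; ++i)
        zero_all(a[i]);                 // a[i] is T[M]: M deduced
}

// usage:
//   double grid[3][4];
//   zero_all(grid);   // N = 3, M = 4 determined by the callee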
Except that neither the set of collections nor the set of algorithms
is finite in the STL! Part of understanding STL is that although you
have an infinite set of collections, you only need to implement each
algorithm a finite (actually small) number of times. That is, if you
want an algorithm which is applicable to all containers, you only
need to provide one implementation thereof (possibly, you need a few
more implementations if you want to take advantage of special structures
for providing better performance but this is only about optimization).
On the other hand, if you want to create a container which can be used
with all algorithms, you only need to provide a few entities abstracting
your container (e.g. iterators). That is, both the set of containers
and the set of algorithms are open. The only thing which is indeed
conceptually finite in STL is the set of concepts (and even this is
only finite per domain addressed by the algorithms and data structures).

BTW, STL conceptually consists of at least four different things:
- concepts

Does the Standard currently provide a formal definition (I'll tell you when
I've stopped laughing) of "concept" in the STL sense? I do believe the
term has a specific meaning in the context of the STL, and I believe it is
close to what I call a meta-type.
- algorithms
- iterators and functors
- containers
(listed in decreasing order of importance).

When I was talking about the STL being finite, I meant the actual number of
specifications in the C++ Standard. I agree your list is more complete
than mine. I suggest that iterators should be placed on a separate line
above functors. Helper functions (for lack of a better term) may also be
worth mentioning. That is mem_fun vs mem_fun_t. You may prefer to leave
the notion of traits implied by the 'T' in STL.
This is an interesting statement, especially taking the last sentence
of the first paragraph of Chapter 14 into account... I'm undecided on
whether you are ridiculing me or you didn't know what it says. In any
case, you are right with your assumption that I would present the
contents differently today (but I still think it is appropriate).

Well, since I've been getting a lot of negative feedback on the idea of
viewing the Standard Headers sans comments as an instructive exercise (and
you seem to be aware of that discussion), and you suggested I stop looking
at the headers and, _instead_, read a book on the subject, your comment
didn't come across very well. I'm a slow reader. I have what some people
call learning disabilities. It's very hard work for me to get through a
book like that.

I will share this observation about my experience in reading the book. The
reason I was eight pages from the end of Chapter 13 is because it wasn't
"clicking". There's something that's difficult to explain which happens to
me when I fail to understand some key concept in such a development. It
happened when I was reading a book on the physics of elasticity. I had not
fully understood the difference between force and stress. There are some
fairly elementary trigonometric derivations following the discussion of
stress which I simply stared at without being able to follow the
derivations. Once I figured out the conceptual part of what distinguishes
stress from force, the derivations were very easy to follow. This may not
seem rational to most people. It doesn't seem rational to me.
Nonetheless, it's how my mind works.

I cannot claim to fully understand every step in detail regarding the
discussion on pages 676-677, but after examining the various I/O Header
files I've extracted from the Standard, the discussion began to make sense.

Obviously the most important of these is the one containing:

namespace std {
    template <class charT, class traits = char_traits<charT> >
    class basic_streambuf { ... };
}
Can you point out where

for_each(vec, { std::cout << _1; });

is more complex than

for_each(int i: vec) { std::cout << i; }

? These two statements are indeed different but I seriously don't see
where they are substantially different to justify that the first
code is not intuitive...

In the latter I can see where 'i' is coming from.
 
