don't understand the STL source code!

P

Peyman

Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/
 
K

Kai-Uwe Bux

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster!

Have you measured?
and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?

I bet that the list is implemented as a ring with one node as the past-end
node. That node is realized as a _List_node_base since it does not carry
data. All other nodes carry data. The reason to make this particular node
special is so that construction of a std::list<T> does not require T to be
default constructible. Similar considerations apply to the past-end nodes
for all tree-based structures.


Best

Kai-Uwe Bux
 
G

Gianni Mariani

Peyman wrote:
....
I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/

I have no idea what the author did so but I would do it to reduce the
number of specializations the compiler has to do - resulting in less
code. If you know that the objects are always going to have a
_List_node_base then the methods to insert/erase/splice/iterate along
the notes are the same no matter what derived class you have. If you
can use non-template functions to do most of the hard work, it means
that the compiler may be able to use the same subset methods for all
instantiations of the std::list template.

I however, don't worry about doing that, it's an optimization. If it
turns out that I have a severe case of code bloat on a particular
template class I might look at it but in the last 10 years I have never
had to worry about code bloat even when I had a code size budget. The
were a few times code bloat almost became an issue but the resolution
was to get rid of some of the bloated libraries we were linking in.
Other times, the marketers were on something they didn't share with me
(comments like "The install binary must be less than 100k" usually fall
on deaf ears unless someone can prove that it really means the
investment will be worth it.)

I know you didn't ask about optimization but since the answer to the
question quite possibly would send you on a wild goose chase I thought
it would be prudent to raise it now.

PrematureOptimization = sqrt( AllEvil )

Just write template code so that it works properly iff you come across a
code bloat issue, then fix the top code bloat issue, rinse, repeat until
you get to an acceptable size. In the least 3 years I have never needed
to worry about this.

The reason it makes sense to put that extra effort into the stl is that
it is used everywhere, from embedded systems to everything else so code
bloat more than likely is an issue.

G
 
E

Earl Purple

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/

A common technique when writing a template class is to separate
"dependent" from "non-dependent" code. Dependent code is that which
depends on the template parameter.

Much of a linked-list structure is code relating to the manipulation of
nodes and can be done the same way regardless of the underlying type
stored.

By having only one set of this code, it reduces the code size as well
as compile time. It is unlikely to have a major impact on performance.
 
M

mlimber

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

[..]

And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?

The specific reason is that all identifiers starting with an underscore
and a capital letter are reserved for the implementation, which
includes the STL. This rule helps prevent name clashes with the user's
names.

Cheers! --M
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top