don't understand the STL source code!

Peyman · Oct 31, 2006

Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/

Kai-Uwe Bux · Oct 31, 2006

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster!

Have you measured?

and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?

I bet that the list is implemented as a ring with one node as the past-end
node. That node is realized as a _List_node_base since it does not carry
data. All other nodes carry data. The reason to make this particular node
special is so that construction of a std::list<T> does not require T to be
default constructible. Similar considerations apply to the past-end nodes
for all tree-based structures.

Best

Kai-Uwe Bux

Gianni Mariani · Oct 31, 2006

Peyman wrote:
....

I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/

I have no idea what the author did so but I would do it to reduce the
number of specializations the compiler has to do - resulting in less
code. If you know that the objects are always going to have a
_List_node_base then the methods to insert/erase/splice/iterate along
the notes are the same no matter what derived class you have. If you
can use non-template functions to do most of the hard work, it means
that the compiler may be able to use the same subset methods for all
instantiations of the std::list template.

I however, don't worry about doing that, it's an optimization. If it
turns out that I have a severe case of code bloat on a particular
template class I might look at it but in the last 10 years I have never
had to worry about code bloat even when I had a code size budget. The
were a few times code bloat almost became an issue but the resolution
was to get rid of some of the bloated libraries we were linking in.
Other times, the marketers were on something they didn't share with me
(comments like "The install binary must be less than 100k" usually fall
on deaf ears unless someone can prove that it really means the
investment will be worth it.)

I know you didn't ask about optimization but since the answer to the
question quite possibly would send you on a wild goose chase I thought
it would be prudent to raise it now.

PrematureOptimization = sqrt( AllEvil )

Just write template code so that it works properly iff you come across a
code bloat issue, then fix the top code bloat issue, rinse, repeat until
you get to an acceptable size. In the least 3 years I have never needed
to worry about this.

The reason it makes sense to put that extra effort into the stl is that
it is used everywhere, from embedded systems to everything else so code
bloat more than likely is an issue.

G

Earl Purple · Oct 31, 2006

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

and then everywhere in the rest of the code, for example for the
iterator class, they only use the node_base, while for the list class
itself, they use node, not the node_base, and then they keep casting
these two structs to retrieve the data stored in node and so on! I
found this technique in almost all of the other data structures they've
implemented there, like the "RD_Tree" for "map". I can't understand why
they're dividing a node into two structs, while it could be only a
simple node containing both pointers and data, and then never need to
cast them, so of course faster! and easier! so I thought there must be
a good reason! Anyone knows why!?
And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?
BTW, I know that the implementation is old fashioned but I still can
learn from it! and for more info. here is the link to this
implementation:
http://www.sgi.com/tech/stl/

A common technique when writing a template class is to separate
"dependent" from "non-dependent" code. Dependent code is that which
depends on the template parameter.

Much of a linked-list structure is code relating to the manipulation of
nodes and can be done the same way regardless of the underlying type
stored.

By having only one set of this code, it reduces the code size as well
as compile time. It is unlikely to have a major impact on performance.

mlimber · Oct 31, 2006

Peyman said:
Hi, I was reading the source code of an implementation of STL by HP
company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
Which I found that for implementing a Linked List, they have used the
following two structures:

struct _List_node_base {
_List_node_base* _M_next;
_List_node_base* _M_prev;
};

template <class _Tp>
struct _List_node : public _List_node_base {
_Tp _M_data;
};

[..]

And one more thing, what's the reason for using this much
"underscore"!!! I know it can be only a style! but is there any
specific reason!?

The specific reason is that all identifiers starting with an underscore
and a capital letter are reserved for the implementation, which
includes the STL. This rule helps prevent name clashes with the user's
names.

Cheers! --M

The way to read STL source code	85	Feb 17, 2012
Sorting an STL map	1	Oct 19, 2013
I don't understand the following..	4	Aug 4, 2011
I need help in understanding these files on my phone, Could someone help me understand these files? Urgent help needed. Please help.	1	Jun 4, 2023
Traits in STL	6	Jul 17, 2011
Newbies don't learn C++	66	Jan 11, 2011
Alternative STL Structure?	10	Nov 25, 2010
Problem with a login script, SESSION user rights and put this together so it works with the other pages and MySQL. Code examples.	2	May 5, 2023

don't understand the STL source code!

Peyman

Kai-Uwe Bux

Gianni Mariani

Earl Purple

mlimber

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads