don't understand the STL source code!

Discussion in 'C++' started by Peyman, Oct 31, 2006.

  1. Peyman

    Peyman Guest

    Hi, I was reading the source code of an implementation of STL by HP
    company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
    Which I found that for implementing a Linked List, they have used the
    following two structures:

    struct _List_node_base {
    _List_node_base* _M_next;
    _List_node_base* _M_prev;
    };

    template <class _Tp>
    struct _List_node : public _List_node_base {
    _Tp _M_data;
    };

    and then everywhere in the rest of the code, for example for the
    iterator class, they only use the node_base, while for the list class
    itself, they use node, not the node_base, and then they keep casting
    these two structs to retrieve the data stored in node and so on! I
    found this technique in almost all of the other data structures they've
    implemented there, like the "RD_Tree" for "map". I can't understand why
    they're dividing a node into two structs, while it could be only a
    simple node containing both pointers and data, and then never need to
    cast them, so of course faster! and easier! so I thought there must be
    a good reason! Anyone knows why!?
    And one more thing, what's the reason for using this much
    "underscore"!!! I know it can be only a style! but is there any
    specific reason!?
    BTW, I know that the implementation is old fashioned but I still can
    learn from it! and for more info. here is the link to this
    implementation:
    http://www.sgi.com/tech/stl/
    Peyman, Oct 31, 2006
    #1
    1. Advertising

  2. Peyman

    Kai-Uwe Bux Guest

    Peyman wrote:

    > Hi, I was reading the source code of an implementation of STL by HP
    > company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
    > Which I found that for implementing a Linked List, they have used the
    > following two structures:
    >
    > struct _List_node_base {
    > _List_node_base* _M_next;
    > _List_node_base* _M_prev;
    > };
    >
    > template <class _Tp>
    > struct _List_node : public _List_node_base {
    > _Tp _M_data;
    > };
    >
    > and then everywhere in the rest of the code, for example for the
    > iterator class, they only use the node_base, while for the list class
    > itself, they use node, not the node_base, and then they keep casting
    > these two structs to retrieve the data stored in node and so on! I
    > found this technique in almost all of the other data structures they've
    > implemented there, like the "RD_Tree" for "map". I can't understand why
    > they're dividing a node into two structs, while it could be only a
    > simple node containing both pointers and data, and then never need to
    > cast them, so of course faster!


    Have you measured?

    > and easier! so I thought there must be
    > a good reason! Anyone knows why!?
    > And one more thing, what's the reason for using this much
    > "underscore"!!! I know it can be only a style! but is there any
    > specific reason!?


    I bet that the list is implemented as a ring with one node as the past-end
    node. That node is realized as a _List_node_base since it does not carry
    data. All other nodes carry data. The reason to make this particular node
    special is so that construction of a std::list<T> does not require T to be
    default constructible. Similar considerations apply to the past-end nodes
    for all tree-based structures.


    Best

    Kai-Uwe Bux
    Kai-Uwe Bux, Oct 31, 2006
    #2
    1. Advertising

  3. Peyman wrote:
    ....
    > I can't understand why
    > they're dividing a node into two structs, while it could be only a
    > simple node containing both pointers and data, and then never need to
    > cast them, so of course faster! and easier! so I thought there must be
    > a good reason! Anyone knows why!?
    > And one more thing, what's the reason for using this much
    > "underscore"!!! I know it can be only a style! but is there any
    > specific reason!?
    > BTW, I know that the implementation is old fashioned but I still can
    > learn from it! and for more info. here is the link to this
    > implementation:
    > http://www.sgi.com/tech/stl/
    >


    I have no idea what the author did so but I would do it to reduce the
    number of specializations the compiler has to do - resulting in less
    code. If you know that the objects are always going to have a
    _List_node_base then the methods to insert/erase/splice/iterate along
    the notes are the same no matter what derived class you have. If you
    can use non-template functions to do most of the hard work, it means
    that the compiler may be able to use the same subset methods for all
    instantiations of the std::list template.

    I however, don't worry about doing that, it's an optimization. If it
    turns out that I have a severe case of code bloat on a particular
    template class I might look at it but in the last 10 years I have never
    had to worry about code bloat even when I had a code size budget. The
    were a few times code bloat almost became an issue but the resolution
    was to get rid of some of the bloated libraries we were linking in.
    Other times, the marketers were on something they didn't share with me
    (comments like "The install binary must be less than 100k" usually fall
    on deaf ears unless someone can prove that it really means the
    investment will be worth it.)

    I know you didn't ask about optimization but since the answer to the
    question quite possibly would send you on a wild goose chase I thought
    it would be prudent to raise it now.

    PrematureOptimization = sqrt( AllEvil )

    Just write template code so that it works properly iff you come across a
    code bloat issue, then fix the top code bloat issue, rinse, repeat until
    you get to an acceptable size. In the least 3 years I have never needed
    to worry about this.

    The reason it makes sense to put that extra effort into the stl is that
    it is used everywhere, from embedded systems to everything else so code
    bloat more than likely is an issue.

    G
    Gianni Mariani, Oct 31, 2006
    #3
  4. Peyman

    Earl Purple Guest

    Peyman wrote:
    > Hi, I was reading the source code of an implementation of STL by HP
    > company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
    > Which I found that for implementing a Linked List, they have used the
    > following two structures:
    >
    > struct _List_node_base {
    > _List_node_base* _M_next;
    > _List_node_base* _M_prev;
    > };
    >
    > template <class _Tp>
    > struct _List_node : public _List_node_base {
    > _Tp _M_data;
    > };
    >
    > and then everywhere in the rest of the code, for example for the
    > iterator class, they only use the node_base, while for the list class
    > itself, they use node, not the node_base, and then they keep casting
    > these two structs to retrieve the data stored in node and so on! I
    > found this technique in almost all of the other data structures they've
    > implemented there, like the "RD_Tree" for "map". I can't understand why
    > they're dividing a node into two structs, while it could be only a
    > simple node containing both pointers and data, and then never need to
    > cast them, so of course faster! and easier! so I thought there must be
    > a good reason! Anyone knows why!?
    > And one more thing, what's the reason for using this much
    > "underscore"!!! I know it can be only a style! but is there any
    > specific reason!?
    > BTW, I know that the implementation is old fashioned but I still can
    > learn from it! and for more info. here is the link to this
    > implementation:
    > http://www.sgi.com/tech/stl/


    A common technique when writing a template class is to separate
    "dependent" from "non-dependent" code. Dependent code is that which
    depends on the template parameter.

    Much of a linked-list structure is code relating to the manipulation of
    nodes and can be done the same way regardless of the underlying type
    stored.

    By having only one set of this code, it reduces the code size as well
    as compile time. It is unlikely to have a major impact on performance.
    Earl Purple, Oct 31, 2006
    #4
  5. Peyman

    mlimber Guest

    Peyman wrote:
    > Hi, I was reading the source code of an implementation of STL by HP
    > company 1994, and Silicon Graphics Computer Systems, Inc. 1996, 1997.
    > Which I found that for implementing a Linked List, they have used the
    > following two structures:
    >
    > struct _List_node_base {
    > _List_node_base* _M_next;
    > _List_node_base* _M_prev;
    > };
    >
    > template <class _Tp>
    > struct _List_node : public _List_node_base {
    > _Tp _M_data;
    > };
    >
    > [..]
    >
    > And one more thing, what's the reason for using this much
    > "underscore"!!! I know it can be only a style! but is there any
    > specific reason!?


    The specific reason is that all identifiers starting with an underscore
    and a capital letter are reserved for the implementation, which
    includes the STL. This rule helps prevent name clashes with the user's
    names.

    Cheers! --M
    mlimber, Oct 31, 2006
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. JS
    Replies:
    8
    Views:
    421
    Sven Heyll
    Mar 17, 2005
  2. Simon
    Replies:
    9
    Views:
    308
    Default User
    Jul 18, 2006
  3. xianwei
    Replies:
    2
    Views:
    346
    xianwei
    Sep 20, 2007
  4. Matthew Wilson
    Replies:
    4
    Views:
    265
    Tim Roberts
    Jan 22, 2008
  5. None

    Don't understand these code.

    None, Apr 25, 2008, in forum: Javascript
    Replies:
    2
    Views:
    103
    Lasse Reichstein Nielsen
    Apr 27, 2008
Loading...

Share This Page