Engineering a list container. Part 1.

Discussion in 'C Programming' started by jacob navia, Dec 7, 2013.

  1. jacob navia

    jacob navia Guest

    Lists
    -----
    Single linked lists use a single pointer to the next element. The data
    for the element comes right behind that pointer to avoid the overhead
    that yet another pointer would represent.

    So, we use this structure:

    typedef struct _list_element {
    struct _list_element *Next;
    char Data[MINIMUM_ARRAY_INDEX]; // See below
    } list_element;

    The list header uses this structure to store the elements. As you can
    see, there is no space wasted in a pointer to the element stored. The
    element stored is placed just behind the Next pointer. The downside of
    this decision is that we can’t recycle this object to store other
    different objects of different size.

    The constant MINIMUM ARRAY INDEX is defined as 1 if we are compiling in
    C90 mode or as nothing if we are compiling in C99 mode. In C99 mode we
    have a flexible structure, that consists of a fixed and a variable part.
    The fixed part is the pointer to the next element. The variable part is
    the object we are storing in the list.

    Alignment
    ---------
    Some machines require that data be stored at particular addresses,
    always a multiple of two. For instance SPARC machines require that
    doubles be aligned at a multiple of 8. The structure for our list
    element above would provoke a crash when used to store doubles 4.
    In those machines the list element structure is defined as follows:

    typedef struct _ListElement {
    struct _ListElement *Next;
    #ifdef SPARC32
    void *alignment;
    #endif
    char Data[MINIMUM_ARRAY_INDEX];
    } ListElement;

    This assumes that sizeof(void *) is 4.
    In machines that handle unaligned data gracefully without crashing
    alignment re- quirements aren’t useless, since in most cases they
    provoke a performance loss.

    The list header
    ---------------

    struct _List {
    ListInterface *VTable;
    size_t count;
    unsigned Flags;
    unsigned timestamp;
    size_t ElementSize;
    list_element *Last;
    list_element *First;
    CompareFunction Compare;
    ErrorFunction RaiseError;
    ContainerHeap *Heap;
    ContainerAllocator *Allocator;
    };
    In the public containers.h header file we refer always to an abstract
    structure List. We define it here. This schema allows other
    implementation to use the same header with maybe radically different
    implementations of their data structure.

    The individual fields are as follows:

    1.
    1.1: Vtable. Pointer to a table of methods for this container
    1.2: count. Number of elements in the list
    1.3: Flags. Bit fields for flags about this list.
    1.4: ElementSize. Number of bytes the data occupies.

    2. timestamp. This field is incremented at each modification of the
    list, and allows the iterators to detect if the container changes during
    an iteration: they store the value of this field at the start of the
    iteration, and before each iteration they compare it with its current
    value. If there are any changes, they return NULL .

    3. Last. Stores a pointer to the last element of the list. This allows
    the addition of an element at the end of the list to be fast, avoiding a
    complete rescan of the list. This field is an optimization, all
    algorithms of a single linked list would work without this field.

    4. First. The start of the linked list.

    5. Compare. A comparison function for the type of elements stored in the
    list.

    6. RaiseError. A function that will be called when an error occurs. This
    field is necessary only if you want to keep the flexibility of having a
    different error function for each list that the client software builds.
    An alternative implementation would store a pointer to an error function
    in the interface.

    7. Allocator. A set of functions that allocates memory for this list. In
    an implementation that needs less flexibility and is more interested in
    saving space it could be replaced by the default allocator.

    The sample implementation has certainly a quite voluminous header
    because of a design decision to keep things very flexible. Other
    implementations could trim most of the fields, and an absolute minimal
    implementation would trim Last, Compare, RaiseError, Heap,
    and Allocator. If the implementation assumes that only one iterator per
    container is allowed, the timestamp field could be replace by a single
    bit (’changed’) in the Flags field.

    The list interface
    ------------------
    The list interface is a table of functions that holds all the methods
    of the container. For the list container we have:

    typedef struct tagListInterface {
    int (*Add)(List *L,const void *newval);
    int (*AddRange)(List *L, size_t n,const void *data);
    void *(*Advance)(ListElement **pListElement);
    int (*Append)(List *l1,List *l2);
    int (*Apply)(List *L,int(Applyfn)(void *,void *),void *arg);
    void *(*Back)(const List *l);
    int (*Clear)(List *L);
    int (*Contains)(const List *L,const void *element);
    List *(*Copy)(const List *L);
    int (*CopyElement)(const List *list,size_t idx,void *OutBuffer);
    List *(*Create)(size_t element_size);
    List *(*CreateWithAllocator)(size_t elementsize,
    const ContainerAllocator *mm);
    void *(*ElementData)(ListElement *le);
    int (*Equal)(const List *l1,const List *l2);
    int (*Erase)(List *L,const void *);
    int (*EraseAll)(List *l,const void *);
    int (*EraseAt)(List *L,size_t idx);
    int (*EraseRange)(List *L,size_t start,size_t end);
    int (*Finalize)(List *L);
    ListElement *(*FirstElement)(List *l);
    void *(*Front)(const List *l);
    const ContainerAllocator *(*GetAllocator)(const List *list);
    void *(*GetElement)(const List *L,size_t idx);
    size_t (*GetElementSize)(const List *l);
    unsigned (*GetFlags)(const List *L);
    ContainerHeap *(*GetHeap)(const List *l);
    List *(*GetRange)(const List *l,size_t start,size_t end);
    int (*IndexOf)(const List *L,const void *SearchedElement,
    void *ExtraArgs,size_t *result);
    List *(*Init)(List *aList,size_t element_size);
    int (*InitIterator)(List *L,void *buf);
    List *(*InitWithAllocator)(List *aList,size_t element_size,
    const ContainerAllocator *mm);
    List *(*InitializeWith)(size_t elementSize,size_t n,
    const void *data);
    int (*InsertAt)(List *L,size_t idx,const void *newval);
    int (*InsertIn)(List *l, size_t idx,List *newData);
    ListElement *(*LastElement)(List *l);
    List *(*Load)(FILE *stream, ReadFunction loadFn,void *arg);
    Iterator *(*NewIterator)(List *L);
    ListElement *(*NextElement)(ListElement *le);
    int (*PopFront)(List *L,void *result);
    int (*PushFront)(List *L,const void *str);
    int (*RemoveRange)(List *l,size_t start, size_t end);
    int (*ReplaceAt)(List *L,size_t idx,const void *newval);
    int (*Reverse)(List *l);
    int (*RotateLeft)(List *l, size_t n);
    int (*RotateRight)(List *l,size_t n);
    int (*Save)(const List *L,FILE *stream, SaveFunction saveFn,
    void *arg);
    int (*Select)(List *src,const Mask *m);
    List *(*SelectCopy)(const List *src,const Mask *m);
    CompareFunction (*SetCompareFunction)(List *l,CompareFunction fn);
    DestructorFunction (*SetDestructor)(List *v,DestructorFunction fn);
    int (*SetElementData)(List *l, ListElement *le,void *data);
    ErrorFunction (*SetErrorFunction)(List *L,ErrorFunction);
    unsigned (*SetFlags)(List *L,unsigned flags);
    size_t (*Size)(const List *L);
    size_t (*Sizeof)(const List *l);
    size_t (*SizeofIterator)(const List *);
    ListElement *(*Skip)(ListElement *l,size_t n);
    int (*Sort)(List *l);
    List *(*SplitAfter)(List *l, ListElement *pt);
    int (*UseHeap)(List *L, const ContainerAllocator *m);
    int (*deleteIterator)(Iterator *);
    } ListInterface;

    Note that ther is a single Vtable for all lists of the same type.
    All list just a pointer to a common method table.

    In a future post I will explain this functions in detail and propose
    code for them.

    Feel free to comment on this.

    jacob
    jacob navia, Dec 7, 2013
    #1
    1. Advertising

  2. On Saturday, December 7, 2013 11:42:28 PM UTC, jacob navia wrote:
    >
    > typedef struct _list_element {
    >
    > struct _list_element *Next;
    >
    > char Data[MINIMUM_ARRAY_INDEX]; // See below
    >
    > } list_element;
    >


    C handles this perfectly well as

    typedef struct node
    {
    struct node *next;
    int fielda;
    char fieldb[100];
    }

    etc, but it's by no means easy to make it generic. As you say, if fielda is
    a double, some systems will insert 4 bytes of padding between it and the next
    pointer. No problem, unless you want a generic system.
    >
    > The sample implementation has certainly a quite voluminous header
    > because of a design decision to keep things very flexible. Other
    > implementations could trim most of the fields, and an absolute minimal
    > implementation would trim Last, Compare, RaiseError, Heap,
    > and Allocator.
    > Feel free to comment on this.
    >

    OK, so keeping with our non-generic implementation, probably we'll be calling
    malloc() to create new nodes, we'll have head pointer somewhere, we'll
    frequently be iterating over the list with

    for(ptr = head; ptr != 0; ptr = ptr->next)

    How can we beat this for the generic? We can use a fast fixed block allocator.
    We can implement bug-free insert / deletes at any position, whilst with the
    simple non-generic implementation we can only insert at the head, and we've
    got to remember not to let anyone hang on to the head pointer because it
    becomes invalidated by an insert. We can automatically destroy the list.

    What happens if the list itself changes during an iteration? Is it important
    for the programmer to be able to toggle relatively easily between a list and
    an array? Are out of memory conditions essentially impossible, or do we need
    an elaborate system to handle them? With a flexible generic solution, you've
    got to provide answers to all of those, and it has to be the more complex
    answer. That's one snag. The generic can't be totally simple to use, because
    there aren't simple ways of addressing these issues.

    The other problem is the perennial one, dependency. You don't want to create
    a dependency on another file, much less a library/compiler, just because you're
    using a linked list to store your objects.
    Malcolm McLean, Dec 8, 2013
    #2
    1. Advertising

  3. jacob navia

    jacob navia Guest

    Le 08/12/2013 17:42, Malcolm McLean a écrit :
    > On Saturday, December 7, 2013 11:42:28 PM UTC, jacob navia wrote:
    >>
    >> typedef struct _list_element {
    >>
    >> struct _list_element *Next;
    >>
    >> char Data[MINIMUM_ARRAY_INDEX]; // See below
    >>
    >> } list_element;
    >>

    >
    > C handles this perfectly well as
    >
    > typedef struct node
    > {
    > struct node *next;
    > int fielda;
    > char fieldb[100];
    > }
    >
    > etc, but it's by no means easy to make it generic. As you say, if fielda is
    > a double, some systems will insert 4 bytes of padding between it and the next
    > pointer. No problem, unless you want a generic system.


    You have no problems with the generic system either. Just do

    typedef struct myData {
    int fielda;
    char fieldb[100];
    } Data;

    You pass sizeof(Data) to the generic creation functions. No overhead at all.


    >>
    >> The sample implementation has certainly a quite voluminous header
    >> because of a design decision to keep things very flexible. Other
    >> implementations could trim most of the fields, and an absolute minimal
    >> implementation would trim Last, Compare, RaiseError, Heap,
    >> and Allocator.
    >> Feel free to comment on this.
    >>

    > OK, so keeping with our non-generic implementation, probably we'll be calling
    > malloc() to create new nodes, we'll have head pointer somewhere, we'll
    > frequently be iterating over the list with
    >
    > for(ptr = head; ptr != 0; ptr = ptr->next)
    >
    > How can we beat this for the generic? We can use a fast fixed block allocator.


    Yes, a fast fixed block allocator is provided by the ccl. No need to use
    a slow malloc function. But of course the generic implementation
    needs to do the same loop and will have the same speed as the
    non-generic.

    > We can implement bug-free insert / deletes at any position, whilst with the
    > simple non-generic implementation we can only insert at the head, and we've
    > got to remember not to let anyone hang on to the head pointer because it
    > becomes invalidated by an insert.


    No. The ccl provides with "FirstElement" and "NextElement" functions
    that return the whole (including the "next" pointer), for you to hack
    away with the same liberty as you would using your personal list
    implementation.


    We can automatically destroy the list.

    Sorry I do not understand that. You mean your personal non-generic
    implementation uses garbage collection? Or what?

    >
    > What happens if the list itself changes during an iteration?


    If you use the provided iterators, the iteration stops.


    > Is it important
    > for the programmer to be able to toggle relatively easily between a list and
    > an array?


    Maybe, who knows. In any case it is very easy to do with the ccl.


    > Are out of memory conditions essentially impossible, or do we need
    > an elaborate system to handle them? With a flexible generic solution, you've
    > got to provide answers to all of those, and it has to be the more complex
    > answer. That's one snag.


    Yes, in your own list implementation you just crash if there is no more
    memory left because you assumed it will always be available. The ccl
    doesn't do that but this provokes NO EXTRA COMPLEXITY in your code since
    this only be an issue if an allocation fails!

    This is no "snag" but a feature that allows you to program AS IF there
    was always memory available but in case you do not crash but a user
    supplied recovery function is called!

    > The generic can't be totally simple to use, because
    > there aren't simple ways of addressing these issues.
    >


    The error handling design is quite complex but at the same time very
    easy to use. By default there is an error function that is called when
    an error occurs. The ccl supplies a default function that prints the
    error in stderr and exits the program. You can override that with your
    own function of course.


    > The other problem is the perennial one, dependency. You don't want to create
    > a dependency on another file, much less a library/compiler, just because you're
    > using a linked list to store your objects.
    >


    Yes, you program (and debug) the linked list package (that is always an
    extra file excuse me) at each application that you program.

    The idea of the ccl is that you use a debugged container library instead
    of reinventng the wheel at each time.

    In the case of a linked list it is possible to understand your view
    point, but for other containers like a flexible (resizable) array or
    a tree, a hashtable, etc the situation is different. You do not want
    to program those, and then, instead of using a resizable array you
    use a list even if it is slower!

    I started with the linked list because is the simplest!

    Thanks for your input Malcom.

    jacob
    jacob navia, Dec 8, 2013
    #3
  4. jacob navia

    Ian Collins Guest

    Malcolm McLean wrote:
    >
    > The other problem is the perennial one, dependency. You don't want to create
    > a dependency on another file, much less a library/compiler, just because you're
    > using a linked list to store your objects.


    Where do you draw the line?

    Somewhere between "You don't want to create a dependency just to output
    a character" and "You don't want to create a dependency just to use a list"?

    --
    Ian Collins
    Ian Collins, Dec 8, 2013
    #4
  5. jacob navia

    jacob navia Guest

    Le 08/12/2013 19:57, Ian Collins a écrit :
    > Malcolm McLean wrote:
    >>
    >> The other problem is the perennial one, dependency. You don't want to
    >> create
    >> a dependency on another file, much less a library/compiler, just
    >> because you're
    >> using a linked list to store your objects.

    >
    > Where do you draw the line?
    >
    > Somewhere between "You don't want to create a dependency just to output
    > a character" and "You don't want to create a dependency just to use a
    > list"?
    >


    In the case of a simple container like a single linked list it is easy
    to just hack away a nth implementation of a list but for a hashtable
    or a tree, or even an extensible array the effort is quite considerable
    to get it right...


    Then, you get stuck with using a list instead of a hashtable or
    a flexible array!

    I started with the linked list because is the simplest to explain.

    jacob
    jacob navia, Dec 8, 2013
    #5
  6. On Sunday, December 8, 2013 6:57:52 PM UTC, Ian Collins wrote:
    > Malcolm McLean wrote:
    >
    >
    > > The other problem is the perennial one, dependency. You don't want to
    > > create a dependency on another file, much less a library/compiler, just
    > > because you're using a linked list to store your objects.

    >
    >
    > Where do you draw the line?
    >
    > Somewhere between "You don't want to create a dependency just to output
    > a character" and "You don't want to create a dependency just to use a list"?
    >

    Programming has to be done by reasonably intelligent human beings. There's
    no simple or algorithmic answer.

    Is it important to be able to take a single file, and associated header, and
    drop it into another program? Does it matter if you have three or four
    generic linked lists kicking about in your program source (because Malc used
    Jacob's library whilst Ian used a version based on C++ stl and Fred rolled
    his own)?

    There aren't easy answers. Sometimes it's easy, if we know that the code will
    be used only in one program, it's not going to be of any interest to anyone
    else developing similar programs, and we've got a small, controlled group
    working on that one program, then once we've taken the decision to go the
    Jacob container library route, there's no further issue, we should use it
    for every linked list.

    With outputting a character, that's not a "bit shuffling operation", so
    inherently it has some dependency. So I try to move my IO to separate modules,
    so the bulk of the program can be written as portable, independent, standalone
    functions.
    Malcolm McLean, Dec 8, 2013
    #6
  7. jacob navia wrote:
    > Lists
    > -----
    > Single linked lists use a single pointer to the next element. The data
    > for the element comes right behind that pointer to avoid the overhead
    > that yet another pointer would represent.
    >
    > So, we use this structure:
    >
    > typedef struct _list_element {
    > struct _list_element *Next;
    > char Data[MINIMUM_ARRAY_INDEX]; // See below
    > } list_element;
    >
    > The list header uses this structure to store the elements. As you can
    > see, there is no space wasted in a pointer to the element stored. The
    > element stored is placed just behind the Next pointer. The downside of
    > this decision is that we can’t recycle this object to store other
    > different objects of different size.
    >
    > The constant MINIMUM ARRAY INDEX is defined as 1 if we are compiling in
    > C90 mode or as nothing if we are compiling in C99 mode. In C99 mode we
    > have a flexible structure, that consists of a fixed and a variable part.
    > The fixed part is the pointer to the next element. The variable part is
    > the object we are storing in the list.
    >


    Did you look at the linked lists inside linux(include/linux/list.h...
    types.h... struct list_head)? AFAIK they use a header structure, that
    gets included in structures that want to use it. Then there's usually
    some pointer arithmetics(usually things involving offsetof) to get from
    the header pointer to the containing struct and vice-versa. It has the
    advantage to be able have some struct in multiple lists (perhaps a rare
    use-case.. but sometimes..), by using several header vars..
    Any particular reason not to do it that way? I always thought, it was
    rather clever...
    Johann Klammer, Dec 8, 2013
    #7
  8. jacob navia

    jacob navia Guest

    Le 08/12/2013 22:02, Johann Klammer a écrit :
    > Did you look at the linked lists inside linux(include/linux/list.h...
    > types.h... struct list_head)? AFAIK they use a header structure, that
    > gets included in structures that want to use it. Then there's usually
    > some pointer arithmetics(usually things involving offsetof) to get from
    > the header pointer to the containing struct and vice-versa. It has the
    > advantage to be able have some struct in multiple lists (perhaps a rare
    > use-case.. but sometimes..), by using several header vars..
    > Any particular reason not to do it that way? I always thought, it was
    > rather clever...


    Well, the idea of having a header in each element is too heavy
    for the containers since it is of the order of:

    sizeof(elementHeader) * N

    In my case this is just

    sizeof(void *) * N // the "next item" pointer

    The library doesn't know anything about any offsets into your data.
    It treats the data as a container (a box for instance) into which
    you store stuff and retrieve it. What is stored is invisible for the
    container. You do not expect a box to handle the objects you store into
    it any differently from each other!

    You can have generic lists specialized for any conceivable type
    if you use the generic list container, to be explained later.

    Yes, in C. It uses the C preprocessor. Of course I have always read
    that it can't be done. I believed that also, until... I wrote it.

    It is actually very simple.

    http://code.google.com/p/ccl/

    All the code can be downloaded from there and hacked at will.

    Enjoy!

    jacob
    jacob navia, Dec 9, 2013
    #8
  9. jacob navia <> writes:

    > Le 08/12/2013 22:02, Johann Klammer a écrit :
    >> Did you look at the linked lists inside linux(include/linux/list.h...
    >> types.h... struct list_head)? AFAIK they use a header structure, that
    >> gets included in structures that want to use it. Then there's usually
    >> some pointer arithmetics(usually things involving offsetof) to get from
    >> the header pointer to the containing struct and vice-versa. It has the
    >> advantage to be able have some struct in multiple lists (perhaps a rare
    >> use-case.. but sometimes..), by using several header vars..
    >> Any particular reason not to do it that way? I always thought, it was
    >> rather clever...

    >
    > Well, the idea of having a header in each element is too heavy
    > for the containers since it is of the order of:
    >
    > sizeof(elementHeader) * N
    >
    > In my case this is just
    >
    > sizeof(void *) * N // the "next item" pointer


    The Linux lists are doubly linked, so two pointers are needed. The
    overhead, though, is just that: two pointers plus any required padding.
    Using the same technique for singly linked lists would presumably have
    the same overhead as yours: one pointer plus any required padding.

    <snip>
    --
    Ben.
    Ben Bacarisse, Dec 9, 2013
    #9
  10. jacob navia

    BartC Guest

    "jacob navia" <> wrote in message
    news:l832pm$63l$...

    > It is actually very simple.
    >
    > http://code.google.com/p/ccl/


    I've looked at the PDF, and one word I wouldn't use to describe it is
    simple! (Comprehensive, perhaps.)

    Its 370 dense pages are 100 more than K&R 2 which describes the entire
    language.

    --
    Bartc
    BartC, Dec 9, 2013
    #10
  11. jacob navia

    jacob navia Guest

    Le 09/12/2013 12:17, BartC a écrit :
    > "jacob navia" <> wrote in message
    > news:l832pm$63l$...
    >
    >> It is actually very simple.
    >>
    >> http://code.google.com/p/ccl/

    >
    > I've looked at the PDF, and one word I wouldn't use to describe it is
    > simple! (Comprehensive, perhaps.)
    >
    > Its 370 dense pages are 100 more than K&R 2 which describes the entire
    > language.
    >


    If you look at any book about the STL it is much thicker. Obviously
    writing detailed documentation for each function is not a strength
    of the package in your opinion.

    In those 370 "dense" pages there is a lot of repetitions to facilitate
    reading, a lot of diagrams, a rationale, and comparisons with other
    languages like Java, C# and C++.

    And if K&R 2 is 270 pages, the language standard is 683 pages.
    jacob navia, Dec 9, 2013
    #11
  12. On Monday, December 9, 2013 12:58:12 PM UTC, jacob navia wrote:
    > Le 09/12/2013 12:17, BartC a �crit :
    >
    > > I've looked at the PDF, and one word I wouldn't use to describe it is
    > > simple! (Comprehensive, perhaps.)

    >
    > If you look at any book about the STL it is much thicker. Obviously
    > writing detailed documentation for each function is not a strength
    > of the package in your opinion.
    >

    It's obviously not acceptable for someone to need to read 370 pages before
    they can use the package.
    You need two layers of documentation. In layer one, an overview of the scope
    of the package, what it can do, and it's also convenient though sometimes a
    bit embarrassing to point out what it cannot do. (Nothing wastes time so
    much as looking for the "serialise" method when there isn't one). You also
    need to tell the user how to achieve common tasks, eg set up a list, iterate
    through it, add and delete items, get its length, destroy it, usually with
    source code. He should be able to glance at the docs, and within a minute or
    so have a functioning linked list.
    In layer two, you need the 370 pages of detail. For instance what's meant to
    happen if you add or delete elements from a list whilst iterating through it?
    There's no good, obvious, or simple answer to that. In fact you increment an
    edit counter field. So is the field reset to zero if we take a copy of the list? It's unlikely that many people will want to know the answer to that
    question, but for someone it's going to be vital. So that needs to be buried
    away somewhere.
    Malcolm McLean, Dec 9, 2013
    #12
  13. On 09/12/2013 01:42, Ben Bacarisse wrote:
    > jacob navia <> writes:
    >
    >> Le 08/12/2013 22:02, Johann Klammer a écrit :
    >>> Did you look at the linked lists inside linux(include/linux/list.h...
    >>> types.h... struct list_head)? AFAIK they use a header structure, that
    >>> gets included in structures that want to use it. Then there's usually
    >>> some pointer arithmetics(usually things involving offsetof) to get from
    >>> the header pointer to the containing struct and vice-versa. It has the
    >>> advantage to be able have some struct in multiple lists (perhaps a rare
    >>> use-case.. but sometimes..), by using several header vars..
    >>> Any particular reason not to do it that way? I always thought, it was
    >>> rather clever...

    >>
    >> Well, the idea of having a header in each element is too heavy
    >> for the containers since it is of the order of:
    >>
    >> sizeof(elementHeader) * N
    >>
    >> In my case this is just
    >>
    >> sizeof(void *) * N // the "next item" pointer

    >
    > The Linux lists are doubly linked, so two pointers are needed. The
    > overhead, though, is just that: two pointers plus any required padding.
    > Using the same technique for singly linked lists would presumably have
    > the same overhead as yours: one pointer plus any required padding.
    >
    > <snip>
    >


    A linux list_head is two pointers (plus whatever padding the compiler
    deems is necessary - 0 for the common architectures)

    In addition, you have an ability to turn an arbitrary structure into
    linked list by embedding a list_head. You can put an arbitrary number
    of list_heads into a structure, and have the structure in an arbitrary
    number of independent lists.

    The standard iteration tools will issue hardware preload instructions
    for the subsequent entries as you walk the list, meaning fewer cache misses.

    Judging from the presented code, the entire Linux implementation is
    substantially smaller, and involves no function pointers.


    The proposal presented here is desperately trying to be C++, and is
    completely over engineered for the problem it is trying to solve; It is
    not like lists are hard.

    It seems silly to be concerned about an extra pointer in the structure,
    with a vtable that large for each type of list.

    ~Andrew
    Andrew Cooper, Dec 9, 2013
    #13
  14. jacob navia

    BartC Guest

    "Andrew Cooper" <root@127.0.0.1> wrote in message
    news:vgspu.10410$4...

    > The proposal presented here is desperately trying to be C++, and is
    > completely over engineered for the problem it is trying to solve; It is
    > not like lists are hard.


    The idea of using a common set of functions to work with linked lists is
    good - if you do just want a simple linked list.

    However, I use linked lists a lot, and they are rarely simple! For example,
    one struct I had I think was linked seven different ways to other structs
    (mostly sideways, some up and down to form a tree).

    The point is, my data structures, as many as I needed, were superimposed on
    the struct, rather than the struct being shoe-horned into a (single) data
    structure, which is less flexible.

    I'd imagine it would also be awkward to point into the middle of a list, or
    somehow share the same data with several list headers. And if I wanted to
    insert a pointer to a list into a struct, it's no longer a pointer to the
    first element, but a pointer to a formal header, which then points to the
    data. It's just extra stuff to be aware of and which can complicate matters
    rather than simplify them.

    But, maybe it's a mistake to try and adapt existing code to use these
    containers...

    > It seems silly to be concerned about an extra pointer in the structure,
    > with a vtable that large for each type of list.


    There is just one instance of the vtable (for all singly-linked lists). And
    one list-header for each entire list. But any extra fields per-element can
    have a much bigger impact.

    --
    Bartc
    BartC, Dec 10, 2013
    #14
  15. jacob navia

    jacob navia Guest

    Le 10/12/2013 00:22, Andrew Cooper a écrit :
    >
    > A linux list_head is two pointers (plus whatever padding the compiler
    > deems is necessary - 0 for the common architectures)
    >


    The same in the ccl.


    > In addition, you have an ability to turn an arbitrary structure into
    > linked list by embedding a list_head. You can put an arbitrary number
    > of list_heads into a structure, and have the structure in an arbitrary
    > number of independent lists.
    >


    You can store any data type into an arbitrary number of lists with the
    ccl too.


    > The standard iteration tools will issue hardware preload instructions
    > for the subsequent entries as you walk the list, meaning fewer cache misses.
    >


    That is an implementation detail easily done in an architecture where
    that is relevant. The code presented in the ccl is generic for *ANY*
    architecture.

    > Judging from the presented code, the entire Linux implementation is
    > substantially smaller, and involves no function pointers.
    >


    The linux implementation is not the ccl, it is a kernel list
    implementation that has only lists.


    If you write code with the ccl, you can easily change your container
    from a list to a flexible array just by changing a few lines. You
    can do it if you measure that lists are taking too much CPU time.

    I do not see how you would do that in a container that is isolated from
    the others!

    >
    > The proposal presented here is desperately trying to be C++, and is
    > completely over engineered for the problem it is trying to solve; It is
    > not like lists are hard.
    >


    No, it is not C++ even if I use some of the concepts of object oriented
    programming. And mocking about it by saying "desperatly" doesn't really
    convey any argument. By the way I am not "against" C++. I consider the
    language as a whole a huge mistake, but there are good ideas within it.


    > It seems silly to be concerned about an extra pointer in the structure,
    > with a vtable that large for each type of list.
    >


    The vtable is shared by ALL lists. The elements do not have any pointer
    to the vtable, only the list head.

    And yes, lists are easy, that is why I presented them FIRST.

    Next are the flexible arrays.
    jacob navia, Dec 10, 2013
    #15
  16. On Tuesday, December 10, 2013 10:38:55 AM UTC, Bart wrote:
    > "Andrew Cooper" <root@127.0.0.1> wrote in message
    >
    > However, I use linked lists a lot, and they are rarely simple! For example,
    > one struct I had I think was linked seven different ways to other structs
    > (mostly sideways, some up and down to form a tree).
    >

    That's often a problem with a library that "sits over you" rather than "sits
    under you". By "sits over you", I mean provides structures that you plug
    the subroutines and data items into, rather than calling with your data
    items. It works fine, until you need two libraries that "sit over you".

    Then they start interfering. Both want to control your main loop, both
    want to deallocate your structures when they die, both want to insert a
    special field at the head of the structure.
    Malcolm McLean, Dec 10, 2013
    #16
  17. jacob navia

    Ian Collins Guest

    Andrew Cooper wrote:
    >
    > Judging from the presented code, the entire Linux implementation is
    > substantially smaller, and involves no function pointers.


    But it is just a list.

    > The proposal presented here is desperately trying to be C++, and is
    > completely over engineered for the problem it is trying to solve; It is
    > not like lists are hard.


    It isn't trying to be C++, it is striving to provide a subset of the C++
    standard library in C. Making a small list is trivial, making a
    collection of interchangeable containers isn't.

    Lists are widely used in C because they are so simple. In C++, vector
    (flexible array) is more common. It is more complex to implement, but
    in most use cases it offers better performance. Having a standardised
    collection of containers saves an awful lot of wheel reinventing. Being
    able to quickly swap between them when requirements or the environment
    changes is a big bonus.

    --
    Ian Collins
    Ian Collins, Dec 10, 2013
    #17
  18. jacob navia

    jacob navia Guest

    Le 11/12/2013 20:09, a écrit :
    > Anything particularly wrong with CoreFoundation?
    >
    > http://en.wikipedia.org/wiki/Core_Foundation
    >


    Yes.

    1) The Apple license. The CCL is in the public domain, it has
    in fact no restrictions of any kind, and no, it is free, not GNU
    so the FSF has nothing to say either.

    2)
    <quote>
    Core Foundation is a library with a set of programming interfaces
    conceptually derived from the Objective-C-based Foundation framework but
    implemented in the C language. To do this, Core Foundation implements a
    limited object model in C. Core Foundation defines opaque types that
    encapsulate data and functions, hereafter referred to as “objects.”
    <end quote>

    This makes very strong assumptions about the objects being stored.
    Basically the core foundation is just objective C "compiled" to C.

    The CCL should be able to run in very limited memory setups, and be
    portable to any machine. As far as I know the Core Foundation has been
    only ported to windows, not to linux.

    That said, many specifications of the ccl (for instance the observer
    interface) have been influenced by the work of Apple's engineers.
    jacob navia, Dec 11, 2013
    #18
  19. jacob navia

    jacob navia Guest

    Le 19/12/2013 07:36, Gareth Owen a écrit :
    > jacob navia <> writes:
    >
    >> If you look at any book about the STL it is much thicker. Obviously
    >> writing detailed documentation for each function is not a strength
    >> of the package in your opinion.

    >
    > The STL container section of the C++ standard is around 100 pages.
    > Granted, its terse standardese, and almost entirely free of diagrams,
    > comparisons, examples etc.
    >


    Yes, I will write a shorter documentation, in 100 pages or so.

    > But the real saving is that the error handling (i.e. the interaction
    > between destructors and exceptions) is part of the language, not part of
    > the library.
    >


    It can be "savings" if you want the bloated code that this produces.
    Granted, it is the compiler that writes that, but it does take
    space.
    jacob navia, Dec 19, 2013
    #19
  20. jacob navia

    jacob navia Guest

    Le 19/12/2013 18:14, Gareth Owen a écrit :
    > jacob navia <> writes:
    >
    >>> But the real saving is that the error handling (i.e. the interaction
    >>> between destructors and exceptions) is part of the language, not part of
    >>> the library.

    >>
    >> It can be "savings" if you want the bloated code that this produces.

    >
    > [citation needed]
    >


    Exception handling has a long history in C++.

    The first solutions were just a glorified longjmp that the compiler
    inlined at each entry of a protected block.

    This solutions is now obsolete but still used in some systems like
    gcc under windows. Since each protected block needs to establish a
    context (the setjmp call) you pay even if there is no exception.

    The solution that evolved later is a table approach.

    You build tables relating code points to stack descriptions, so that
    the run time can call a bye code machine that interprets the tables
    to unwind the stack searching for an exception. Lcc-win generates
    this tables to be compatible with C++, and I can tell you those
    tables are HUGE since each function entry/exit needs a lot of
    descriptions for each instruction being emitted.

    There isn't a big literature on this subject since only Microsoft
    has really documented this stuff in a meaningful way. Gcc refers
    always to the DWARF specifications where a documentation exists
    but gcc doesn't follow their own specs for unknown reasons. So,
    you have just to fgure out what gcc generates and then do the same.
    jacob navia, Dec 19, 2013
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Replies:
    0
    Views:
    850
  2. Replies:
    0
    Views:
    612
  3. jacob navia
    Replies:
    5
    Views:
    481
    jacob navia
    Sep 28, 2009
  4. jacob navia
    Replies:
    10
    Views:
    637
    jacob navia
    Oct 1, 2009
  5. jacob navia
    Replies:
    20
    Views:
    257
    James Kuyper
    Dec 23, 2013
Loading...

Share This Page