Re: Idioms for iterators in C

Discussion in 'C Programming' started by Ersek, Laszlo, Jan 28, 2010.

  1. In article <>, (Richard Harter) writes:

    > Here is the outline of an example. Let foo be the type of
    > container. For notation I usually append the operation name to
    > the container type - you may prefer a different notation. Thus
    >
    > foo_open // create a foo container, returns handle
    > foo_close // deletes a foo container
    > foo_add // adds an item to the container
    > foo_delete // removes an item from the container


    Sometimes the client programmer is enabled to allocate the necessary
    storage him- or herself, and (un)initialization is made accessible
    separately from (de)allocation. Sort of a "placement new". (Of course
    the nodes of the container are allocated dynamically later.)


    > The first argument is always the handle; the remaining arguments
    > are specific to the operation. An alternative is to have a
    > master function that looks like foo(handle,opname,...) and use a
    > variable argument list. Pros and cons of that choice might be
    > interesting.


    Variable argument lists are more suitable for wildly varying operations
    in my opinion, or for sets of operations that are expected to be
    extended later, in ways yet unclear. I believe containers are
    traditionally categorized into deeply researched, fixed "behavioral
    profiles" or interfaces. For example, quoting from the ToC of the Second
    Edition of the ISO C++ standard:

    23 Containers library
    23.1 Container requirements
    23.1.1 Sequences
    23.1.2 Associative containers
    23.2 Sequences
    23.2.1 Class template deque
    23.2.2 Class template list
    23.2.3 Container adaptors
    23.2.3.1 Class template queue
    23.2.3.2 Class template priority_queue
    23.2.3.3 Class template stack
    23.2.4 Class template vector
    [...]
    23.3 Associative containers
    23.3.1 Class template map
    23.3.2 Class template multimap
    23.3.3 Class template set
    23.3.4 Class template multiset
    [...]
    24 Iterators library
    24.1 Iterator requirements
    24.1.1 Input iterators
    24.1.2 Output iterators
    24.1.3 Forward iterators
    24.1.4 Bidirectional iterators
    24.1.5 Random access iterators


    > When we step through the container we keep "getting" instances
    > until there are no more. This means there are two questions per
    > iteration - is there another instance, and, if so, what is it.
    > Sometimes we can collapse these into one operation, e.g., the
    > "get" returns a pointer. So that gives us
    >
    > for (foo_begin(handle); ptr = foo_next(handle);) {...}
    >
    > Returning pointers can be a security issue;


    What do you mean? Do you mean "pointers in general can be dangerous", or
    that an iterator can become stale after some other operation on the
    container?


    > an alternative is to
    > return a copy of the instance. One way to handle this is:
    >
    > for(foo_begin(handle); foo_more(handle);){
    > x=foo_get(handle)
    > ...
    > }
    >
    > This moves the get into the loop body; we can keep it in the
    > condition with:
    >
    > foo_more(handle) && (x = foo_get(handle))
    >
    > So, is this a good way to encapsulate? If not, what would you
    > suggest?


    I think returning a deep copy of the contained object is overkill. With
    the exception of foo_close() and foo_delete(), all foo_*() operations
    should try not to invalidate any existing iterator. foo_delete() should
    take an iterator and try not to invalidate any iterator pointing
    elsewhere.

    Sorry for parroting banalities.

    Cheers,
    lacos
    Ersek, Laszlo, Jan 28, 2010
    #1
    1. Advertising

  2. Ersek, Laszlo

    Eric Sosman Guest

    On 1/28/2010 5:00 PM, Richard Harter wrote:
    > On 28 Jan 2010 20:57:50 +0100, (Ersek,
    > Laszlo) wrote:
    >> [...]
    >> I think returning a deep copy of the contained object is overkill.

    >
    > That depends on the application and the data. As a general
    > observation, people tend to overestimate the cost of deep copies.
    > One deep copy of data that is going to be operated on is
    > effectively free - the operation cost dominates the copy cost.
    > The cost arises when copying is daisy chained.


    There's also the implementation difficulty: Unless the
    container is purpose-built for the data type, it doesn't know
    how to make a deep copy! The objects in a general-purpose
    container are just opaque bags of bytes, whose inner nature
    is known only to the functions that hash them, compare them,
    and so on.

    Of course, you could always endow the container with a
    copy function that knows how to copy deeply -- but what does
    that copy function receive? (What, for that matter, do hash
    and comparison functions receive?) Pointers, right, give a
    cigar to that man, that nice man who doesn't like to pass
    pointers around because he thinks they threaten security ;-)

    I suppose your container could make a temporary copy of
    each bag of bytes before calling one of the user-supplied type-
    aware functions -- but this is starting to get expensive, and
    it still doesn't solve the problem of pointers (or other links)
    embedded inside the contained objects.

    IMHO, a container should not copy objects at all: It should
    just keep track of pointers to the objects the caller supplies,
    and leave the management of their storage to the caller. This
    avoids a lot of copying, makes it possible for one object to
    reside in several different containers simultaneously, and allows
    me to update a contained object without the subterfuge of deleting
    it and re-inserting the modified version.

    >> With
    >> the exception of foo_close() and foo_delete(), all foo_*() operations
    >> should try not to invalidate any existing iterator. foo_delete() should
    >> take an iterator and try not to invalidate any iterator pointing
    >> elsewhere.

    >
    > Sossman's iterator objects seem appropriate.


    Three points: First, concrete iterator objects are by no
    means original with me. Second, I think you'll find it difficult
    to make an iterator behave sensibly if the container is modified
    while an iteration is in progress (think of a hash table that
    gets re-hashed when an insertion makes it too crowded). Third
    (and most important), there are only two S'es in "Sosman."

    --
    Eric Sosman
    lid
    Eric Sosman, Jan 28, 2010
    #2
    1. Advertising

  3. In article <>, (Richard Harter) writes:
    > On 28 Jan 2010 20:57:50 +0100, (Ersek,
    > Laszlo) wrote:
    >
    >>In article <>, (Richard Harter) writes:


    >>> foo_open // create a foo container, returns handle
    >>> foo_close // deletes a foo container
    >>> foo_add // adds an item to the container
    >>> foo_delete // removes an item from the container

    >>
    >>Sometimes the client programmer is enabled to allocate the necessary
    >>storage him- or herself, and (un)initialization is made accessible
    >>separately from (de)allocation. Sort of a "placement new". (Of course
    >>the nodes of the container are allocated dynamically later.)

    >
    > These is unclear to me. Which necessary storage are you talking
    > about?


    ----v----
    struct stack
    {
    size_t capacity,
    occupied;
    void **slots;
    };


    /*
    Initialize a stack object.

    Parameters:
    stack
    points to the object to be initialized.
    capacity
    the stack will have this many slots.

    Return value:
    0
    if successful
    -1
    if failed. The contents of *stack may be indeterminate.
    */

    int
    stack_init(struct stack *stack, size_t capacity)
    {
    if ((size_t)-1 / sizeof *stack->slots >= capacity) {
    stack->slots = malloc(sizeof *stack->slots * capacity);

    if (0 != stack->slots) {
    stack->capacity = capacity;
    stack->occupied = 0u;
    return 0;
    }
    }

    return -1;
    }


    /*
    Allocate and initialize a stack object.

    Parameters:
    capacity
    the stack will have this many slots.

    Return value:
    null pointer
    if an error occurred.
    otherwise
    address of the newly allocated and initialized stack object.
    */
    struct stack *
    stack_open(size_t capacity)
    {
    struct stack *stack;

    stack = malloc(sizeof *stack);
    if (0 != stack) {
    if (0 == stack_init(stack, capacity)) {
    return stack;
    }

    free(stack);
    }

    return 0;
    }
    ----^----


    The former allows the programmer to write


    ----v----
    static struct stack stk_a;

    int
    main(void)
    {
    int ret;

    ret = EXIT_FAILURE;
    if (0 == stack_init(&stk_a, 10u)) {
    /* auto */ struct stack stk_b;

    if (0 == stack_init(&stk_b, 20u)) {
    /* do stuff */

    ret = EXIT_SUCCESS;

    /* uninit stk_b */
    }

    /* uninit stk_a */
    }

    return ret;
    }
    ----^----


    ("struct stack" must be a complete type in this translation unit.)


    Cheers,
    lacos
    Ersek, Laszlo, Jan 28, 2010
    #3
  4. Ersek, Laszlo

    Eric Sosman Guest

    On 1/28/2010 7:22 PM, Richard Harter wrote:
    > On Thu, 28 Jan 2010 18:06:09 -0500, Eric Sosman
    > <> wrote:
    >
    >> On 1/28/2010 5:00 PM, Richard Harter wrote:
    >>> On 28 Jan 2010 20:57:50 +0100, (Ersek,
    >>> Laszlo) wrote:
    >>>> [...]
    >>>> I think returning a deep copy of the contained object is overkill.
    >>>
    >>> That depends on the application and the data. As a general
    >>> observation, people tend to overestimate the cost of deep copies.
    >>> One deep copy of data that is going to be operated on is
    >>> effectively free - the operation cost dominates the copy cost.
    >>> The cost arises when copying is daisy chained.

    >>
    >> There's also the implementation difficulty: Unless the
    >> container is purpose-built for the data type, it doesn't know
    >> how to make a deep copy! The objects in a general-purpose
    >> container are just opaque bags of bytes, whose inner nature
    >> is known only to the functions that hash them, compare them,
    >> and so on.
    >>
    >> Of course, you could always endow the container with a
    >> copy function that knows how to copy deeply -- but what does
    >> that copy function receive? (What, for that matter, do hash
    >> and comparison functions receive?) Pointers, right, give a
    >> cigar to that man, that nice man who doesn't like to pass
    >> pointers around because he thinks they threaten security ;-)
    >>
    >> I suppose your container could make a temporary copy of
    >> each bag of bytes before calling one of the user-supplied type-
    >> aware functions -- but this is starting to get expensive, and
    >> it still doesn't solve the problem of pointers (or other links)
    >> embedded inside the contained objects.

    >
    > Now that's just silly, three paragraphs of strawman. Pointer and
    > length suffice.


    What we have here is a failure to communicate. Did you
    not say you wished to avoid handing pointers around? (Checks:
    "Returning pointers can be a security issue" -- OP in the OP.)
    And how does returning a pointer accomplish a "deep copy?"

    > It is more efficient, timewise, to have pointers in containers.
    > The flipside is that it is more dangerous. If an object is
    > contained in the form of pointers any piece of code that has
    > access to any container can modify an object. This may well be
    > just what you want, but it creates the potential for really
    > obscure bugs.
    >
    > If you want a object update capability, include it in the API.


    Let's nail something down: Are you talking about a purpose-
    built container, specialized to contain one data type it can
    have knowledge of? Or are you considering a type-blind container,
    something that contains "opaque" objects that it considers only
    as bags of bytes? If the latter, how does the API know how to
    modify a contained object? If you think pointers are dangerous,
    I suggest that

    foo_find_and_modify(handle, key,
    offset1, length1, &data1,
    offset2, length2, &data2,
    ...
    0, 0, (void*)0
    );

    .... is *vastly* more likely to cause trouble. So, what sort
    of container suite are you considering, and what do you expect
    to use it for?

    --
    Eric Sosman
    lid
    Eric Sosman, Jan 29, 2010
    #4
  5. On 28 Jan 2010 at 23:06, Eric Sosman wrote:
    > Third (and most important), there are only two S'es in "Sosman."


    Have you changed your handle recently, Eric?

    I'm sure I remember you styling yourself Sossman in days gone by.
    Antoninus Twink, Jan 29, 2010
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Marcin Kaliciñski

    Iterators and reverse iterators

    Marcin Kaliciñski, May 8, 2005, in forum: C++
    Replies:
    1
    Views:
    475
    Kai-Uwe Bux
    May 8, 2005
  2. Ben Charrow

    Idioms and Anti-Idioms Question

    Ben Charrow, Jun 22, 2009, in forum: Python
    Replies:
    11
    Views:
    500
    Lawrence D'Oliveiro
    Jul 4, 2009
  3. Eric Sosman

    Re: Idioms for iterators in C

    Eric Sosman, Jan 28, 2010, in forum: C Programming
    Replies:
    2
    Views:
    413
  4. ImpalerCore

    Re: Idioms for iterators in C

    ImpalerCore, Jan 29, 2010, in forum: C Programming
    Replies:
    5
    Views:
    341
    Nobody
    Feb 2, 2010
  5. Moi

    Re: Idioms for iterators in C

    Moi, Jan 31, 2010, in forum: C Programming
    Replies:
    0
    Views:
    504
Loading...

Share This Page