Container library (continued)

I

ImpalerCore

C++ doesn't use inheritance in its container classes. Having seen
inheritance based conatiner classes I tend to think its the wrong
mechanism.

That's absolutely right imo. If you follow C++'s methodology, the
linked list, hash table, auto-resizing array have their own specific
interfaces (I don't remember seeing any public/private inheritance in
them last time I looked). I don't think that there is much that they
can commonly inherit. The best abstraction for containers is
iterators and algorithms built on using those iterators. And while it
will require calling a container specific function to get a container
specific iterator, the iterators hopefully can share the same
interface regardless of what container it was created from.

In C++, you can define a container specific iterator, but use a common
interface between those iterators.

vector::iterator --> would have to become 'struct array_iterator'
list::iterator --> would have to become 'struct list_iterator'

In C, these iterators would need to have a reference to the container
they use, and a table of function pointers to traverse the container,
access the element, and comparisons (to check whether you're at the
beginning or end of a container), maybe more. The function pointers
can share the same name so that access can be abstracted to an array
or list container.

---Random code segment---

my_list_t* list;
/* Fill up the list with stuff */

my_list_iterator_t l = my_list_iterator_create( list,
my_list_front( list ) );
or maybe
my_list_iterator_t l = list->begin();

/* list->end() is a function pointer that creates a iterator pointing
to the end of the list. */
while ( !my_list_iterator_equal( l, list->end() ) )
{
my_list_iterator_assign( l, object );
l = l->next();
}

Without operator overloading, it gets quite clunky to use iterators in
C compared to C++ containers, perhaps with a bad enough taste that
people will decide to just use C++. This is where C++ really has C
beat in the container arena imo.

At the moment I have no idea how to approach it, as it just seems too
messy to be worth the effort; plus integrating the generic part into a
list or array C interface has to come first. I believe that generic
container specific interfaces do have value in C, but I'm not
convinced about generalizing the interface to them via iterators since
I'm too spoiled from my C++ experience with them. If I saw a C
interface that made sense, sure, but I don't have one that I like.

Just some more thoughts to toss in the salad bowl.
 
A

Andrew Poelstra

Without operator overloading, it gets quite clunky to use iterators in
C compared to C++ containers, perhaps with a bad enough taste that
people will decide to just use C++. This is where C++ really has C
beat in the container arena imo.

At the moment I have no idea how to approach it, as it just seems too
messy to be worth the effort; plus integrating the generic part into a
list or array C interface has to come first. I believe that generic
container specific interfaces do have value in C, but I'm not
convinced about generalizing the interface to them via iterators since
I'm too spoiled from my C++ experience with them. If I saw a C
interface that made sense, sure, but I don't have one that I like.

Just some more thoughts to toss in the salad bowl.

I think you could use next / prev pointers as iterators. It
wouldn't be super clean, but it could certainly be consistent.

for(it = (Iter *) get_first(MyList); it; it = it->next) {
T *data = it->data;
data->foo = grommit();
data->bar = 5;
}

Actually, that looks a fair bit /cleaner/ than a lot of C++
iterator code. And now that I think about it, that cast could
probably be made unnecessary by simply having get_first
return an Iter* to begin with.

it->data would have to be a void pointer, which loses some
type safety, but such is life in C, I suppose.
 
J

jacob navia

Andrew Poelstra a écrit :
I think you could use next / prev pointers as iterators. It
wouldn't be super clean, but it could certainly be consistent.

for(it = (Iter *) get_first(MyList); it; it = it->next) {
T *data = it->data;
data->foo = grommit();
data->bar = 5;
}

Well, that is exactly what I have in mind. The only difference is that you
iterate with a list element (in single linked list) or with a "dlist_element"
in a double linked list. A list_element struture differs from a dlist_element
only in that the second has a "previous" pointer of course.

In any case, all containers support an "Apply" method that will
iterate through all elements. You give it a function and it will call
this function for each element of the container. Obviously this is
quite heavy for simple iterations since it forces you to write a function
for the iteration.

A simpler solution is:
for (i=0; i<container->length; i++) {
void *data = container->lpVtbl->GetElement(container,i);
// work with your data here.
// If you need to update the data you do
container->lpVtbl->SetElement(container,i,data);
}
 
A

Andrew Poelstra

Andrew Poelstra a écrit :

Well, that is exactly what I have in mind. The only difference is that you
iterate with a list element (in single linked list) or with a "dlist_element"
in a double linked list. A list_element struture differs from a dlist_element
only in that the second has a "previous" pointer of course.

In any case, all containers support an "Apply" method that will
iterate through all elements. You give it a function and it will call
this function for each element of the container. Obviously this is
quite heavy for simple iterations since it forces you to write a function
for the iteration.

A simpler solution is:
for (i=0; i<container->length; i++) {
void *data = container->lpVtbl->GetElement(container,i);
// work with your data here.
// If you need to update the data you do
container->lpVtbl->SetElement(container,i,data);
}

I like this, but I definitely think you should write some
macros around that lpVtbl stuff. :)

And personally, I don't think it's necessary to have separate
iterators for single/double linked lists. If there's some
extra performance cost I don't see, then fine, but if it's
just for the space savings, I think anyone who cares would
be writing their own domain-specific containers anyway.
 
S

Seebs

I like this, but I definitely think you should write some
macros around that lpVtbl stuff. :)

And if you want it accepted as a standard, pick names consistent with
the C standard nomenclature -- no capital letters.

I still can't figure out what "lpVtbl" is. I mean, "vtable", sure, but
what's the "lp"? Why is the V capitalized?

I'd probably do:
#define get_element(x, y) ((container *) x)->funcs->getele(x, y)

or something comparable. In particular, if "container" has to be there
twice, that should be hidden.

-s
 
J

jacob navia

Seebs a écrit :
And if you want it accepted as a standard, pick names consistent with
the C standard nomenclature -- no capital letters.
Well, maybe, _Bool is standard C now
I still can't figure out what "lpVtbl" is. I mean, "vtable", sure, but
what's the "lp"? Why is the V capitalized?

This is a COM usage. COM used that name and I got used to it.
Vtbl means virtual table, lp means "long pointer", and that comes from
windows 16 bit days where you had long pointers (beyond 64 K) and
short pointers (16 bit pointers). Obviously I need to change that name
before publication. Sorry for this.

I'd probably do:
#define get_element(x, y) ((container *) x)->funcs->getele(x, y)

or something comparable. In particular, if "container" has to be there
twice, that should be hidden.

PROBLEM:

This library doesn't use very much the user's name space since all
names are within the container structure. If we do that macro (what is
OK) it can only be optional because name clashes are bound to happen.
 
S

Seebs

Seebs a écrit :
Well, maybe, _Bool is standard C now

_Bool exists for a very specific reason, and is hidden from the user.
This is a COM usage.
Ahh.

COM used that name and I got used to it.
Vtbl means virtual table, lp means "long pointer", and that comes from
windows 16 bit days where you had long pointers (beyond 64 K) and
short pointers (16 bit pointers). Obviously I need to change that name
before publication. Sorry for this.

Ahh, yes.

Definitely do NOT do systems-hungarian names in anything you want other
people to use. In fact, you shouldn't do them in ANYTHING. EVER.
UNDER ANY CIRCUMSTANCES. That usage is 100% guaranteed to screw you up
without ever helping you.
This library doesn't use very much the user's name space since all
names are within the container structure. If we do that macro (what is
OK) it can only be optional because name clashes are bound to happen.

It might make more sense to do it as an inline function, thinking about
it more.

But!

That's okay. Because it'll only exist *if the user includes the header*.
So that's fine, same as it's okay for "bool" to be defined if you include
<stdbool.h>.

The container library doesn't need to be part of the fundamental type
system, so it doesn't need its types to be in scope unconditionally, so
it doesn't need to do something crazy like the _Bool trick.

But thinking about it more... Hmm. The problem is really the lack of
a good unambiguous shorthand for "container". Let's imagine that it's
"cont" for now.

static inline cont_ele *
cont_next(void *c, int i) {
return ((container *)c->funcs->cont_next(c, i));
}

Something like that, even though it doesn't look that much
different, will be a lot easier to sell people on. If I see "lp"
prefixes, my first assumption is that it's more godawful Windows
shit based on a 640k barrier, and I lose interest rapidly. :)

-s
 
B

Ben Bacarisse

jacob navia said:
Andrew Poelstra a écrit :

Well, that is exactly what I have in mind. The only difference is that you
iterate with a list element (in single linked list) or with a "dlist_element"
in a double linked list. A list_element struture differs from a dlist_element
only in that the second has a "previous" pointer of course.

It's possible you've missed Andrew's point. I think he is suggesting
that you start to get some payoff if you can iterate any container
with the same code. I.e. that Mylist could be changed to be a tree and
the code would still work.

If that is not the point he was making, I'll make it! I doubt it can
be done with plain pointers, but I would want to be able to iterate
over any container so that I could write generic algorithms. This
would be the kind of benefit that might make a relatively heavyweight
framework like yours pay off.
In any case, all containers support an "Apply" method that will
iterate through all elements. You give it a function and it will call
this function for each element of the container. Obviously this is
quite heavy for simple iterations since it forces you to write a function
for the iteration.

A simpler solution is:
for (i=0; i<container->length; i++) {
void *data = container->lpVtbl->GetElement(container,i);

BCPL had/has a syntax for this:

GetElement#(container, i);

The # caused an indirect call to be made through a vector of function
pointers associated with 'container'.
// work with your data here.
// If you need to update the data you do
container->lpVtbl->SetElement(container,i,data);
}

That's universal but likely to impractical for contains that don't
have random access.

<snip>
 
I

ImpalerCore

I think you could use next / prev pointers as iterators. It
wouldn't be super clean, but it could certainly be consistent.

for(it = (Iter *) get_first(MyList); it; it = it->next) {
  T *data = it->data;
  data->foo = grommit();
  data->bar = 5;

}

Actually, that looks a fair bit /cleaner/ than a lot of C++
iterator code. And now that I think about it, that cast could
probably be made unnecessary by simply having get_first
return an Iter* to begin with.

That just looks like the regular iterative method to access a linked
list data structure. Let's throw in MyArray instead and keep the same
nomenclature. Do you expect this to work?

for(it = (Iter *) get_first(MyArray); it; it = it->next) {
T *data = it->data;
data->foo = grommit();
data->bar = 5;
}

A list iterator would have to keep track of the head of the list and
its current list node pointer. An array iterator would need to keep
track of the array buffer and the current index. Moving to the next
position are different operations. In the list, you follow the next
pointer, for arrays, you increment the index by 1. This currently
requires me to use two different loops styles to iterate through
either a list or an array, rather than one loop that uses iterators
that will work independently whether a list or an array is the data
structure.

It may be possible to create an infrastructure to abstract iteration
of a container, but it only provides a tangible benefit when you have
an algorithms library on top of it. C++ iterators have several things
going for it. Operator overloading to sweeten the syntax, and member
functions so that you can avoid having to pass the iterator back to
its own function pointers, which imo is very annoying and doesn't make
the syntax to accessing a container via a generic iterator look easy
to do in C, at least not the ways I've explored trying to implement
it.

Or maybe I'm missing your point.
it->data would have to be a void pointer, which loses some
type safety, but such is life in C, I suppose.

I don't see a way around keeping the data as a void*. If you're
willing to ignore the problem of generic iterators, you can use macros
to wrap the type into the interface, i.e.

int n = 42;
my_array_t* array = my_array_new( 100, int );
my_array_push_back( array, &n, int );
int* p = my_array_front( array, int );
if ( p && *p == 42 ) {
printf( "Yay\n" );
}

This kind of array can work as long as the size of the type is fixed.
All that needs to be done is to translate the type 'int' into the
amount of memory via sizeof and use that to access and change values
of that type. It does require passing the type around to the
interface though, which is clunky, but not terribly clunky imo.

Best regards,
John D.
 
A

Andrew Poelstra

A list iterator would have to keep track of the head of the list and
its current list node pointer. An array iterator would need to keep
track of the array buffer and the current index. Moving to the next
position are different operations. In the list, you follow the next
pointer, for arrays, you increment the index by 1. This currently
requires me to use two different loops styles to iterate through
either a list or an array, rather than one loop that uses iterators
that will work independently whether a list or an array is the data
structure.

You're right. Using next pointers the way I described is not
as generic as I thought it could be. But still there is hope:

Iter *it;
for(it = get_first(MyArray); it->valid; iter_next(it)) {
T *data = it->data;
data->foo = grommit();
}

This is essentially how C++ does it, without the operator
overloading syntax. iter_next() would be a macro that added
a cast to a generic iterator and called the "real" next
function, which would go through its vtable and interate
itself properly.

OR, you could use a single iterator type with a type field
and associated container, and put all the logic right into
iter_next(). It would be a little faster, probably, but
would restrict extensibility.
 
A

Andrew Poelstra

You're right. Using next pointers the way I described is not
as generic as I thought it could be. But still there is hope:

Iter *it;
for(it = get_first(MyArray); it->valid; iter_next(it)) {
T *data = it->data;
data->foo = grommit();
}

This is essentially how C++ does it, without the operator
overloading syntax. iter_next() would be a macro that added
a cast to a generic iterator and called the "real" next
function, which would go through its vtable and interate
itself properly.

Silly me, there's no need for the cast. get_first() would
setup the vtable properly and a single Iter type could be
used.
OR, you could use a single iterator type with a type field
and associated container, and put all the logic right into
iter_next(). It would be a little faster, probably, but
would restrict extensibility.

This is also silly, as per above.
 
S

Seebs

What's the thinking behind the cast? It looks like it would just hide
errors from me.

The objects passed in would not be "container *", they'd be something
which...

Ohhh. Wait. Nevermind, I'm just being dumb. Even if x may not be
a container...

x->container is always a container.

So x->container->funcs.
Without it, you'd need MCALL and MCALL0:
#define MCALL0(f, obj) ((obj)->funcs->f((obj)))
#define MCALL(f, obj, ...) ((obj)->funcs->f((obj), __VA_ARGS__))
I don't think there is any way to avoid needing two macros in the
situation in standard C, but this is one case where I /want/ to be
wrong. I wonder if there are any plans to provide a standard
solution.

Hmm. I don't know of one. I vaguely recall the question being discussed,
and some conclusion being reached that it was viable to ignore that case.
In this case, though, it doesn't seem to be. On the other hand... I
don't know that we ever need to have one that doesn't take at least one
argument. Because we pretty much always want an aux() type thing.

-s
 
N

Nick

Seebs said:
But thinking about it more... Hmm. The problem is really the lack of
a good unambiguous shorthand for "container".

Invent a new technical term - like "box" or "sack". Why ever not?
 
B

Ben Bacarisse

Seebs said:
On 2010-03-18, Ben Bacarisse <[email protected]> wrote:

The "it" is gcc's use of ## to suppress a preceding , before __VA_ARGS__
Hmm. I don't know of one. I vaguely recall the question being discussed,
and some conclusion being reached that it was viable to ignore that
case.

Seems a shame to me. I did find this:

http://www.open-std.org/Jtc1/sc22/wg14/www/docs/n976.htm

which discusses the issue. I talks about "this DR" but I could not
find an actual Defect Report so I don't think it ever became formal.
In this case, though, it doesn't seem to be. On the other hand... I
don't know that we ever need to have one that doesn't take at least one
argument. Because we pretty much always want an aux() type thing.

I don't follow. Can you say a bit more...
 
S

Seebs

I don't follow. Can you say a bit more...

Basically, any API like this for callbacks should almost always have
an extra parameter to every function, even functions which "take no
arguments", for passing sideband/auxiliary data.

So instead of
list_count(container);
it should be
list_count(container, aux);
where "aux" is a void * interpreted however a given kind of container
wants to interpret it.

-s
 
B

Ben Bacarisse

Seebs said:
Basically, any API like this for callbacks should almost always have
an extra parameter to every function, even functions which "take no
arguments", for passing sideband/auxiliary data.

So instead of
list_count(container);
it should be
list_count(container, aux);
where "aux" is a void * interpreted however a given kind of container
wants to interpret it.

Sure. I think that is widely know. I had no idea you were talking
about callbacks. I thought you were talking about there being no need
for a fix for the __VA_ARGS__ problem in the macro I proposed. At
least I though you comment had something to do with the macro.

You can't be talking about the container operations themselves (which are
what my suggested macro calls) -- your example call has no auxiliary
data argument (quote correctly so in my opinion since it does not need
one).

[Can I suggest you to keep a little more context when you reply? No
one reading your reply could know what was being discussed when you
claimed that "we pretty much always want an aux() type thing".]
 
S

Seebs

Sure. I think that is widely know. I had no idea you were talking
about callbacks. I thought you were talking about there being no need
for a fix for the __VA_ARGS__ problem in the macro I proposed. At
least I though you comment had something to do with the macro.

I think it does. The fact that there must always be at least one more
argument means that we can use the , __VA_ARGS__ thing and expect it to
work, because there will always be at least one argument.
You can't be talking about the container operations themselves (which are
what my suggested macro calls) -- your example call has no auxiliary
data argument (quote correctly so in my opinion since it does not need
one).

Sure it does. (container, aux).
[Can I suggest you to keep a little more context when you reply? No
one reading your reply could know what was being discussed when you
claimed that "we pretty much always want an aux() type thing".]

Not a bad idea. :)

-s
 
B

Ben Bacarisse

Seebs said:
I think it does. The fact that there must always be at least one more
argument means that we can use the , __VA_ARGS__ thing and expect it to
work, because there will always be at least one argument.

But the macro was not for a callback. Zero "extra" arguments are
perfectly normal in the context it would be used.
Sure it does. (container, aux).

Not this example, the one you originally gave that I proposed with the
macro in question. There's been too much snippage. Lets put some
contact back:

[you:]
| I'd probably do:
| #define get_element(x, y) ((container *) x)->funcs->getele(x, y)

I suggested replacing this with MCALL and described the problem when
there are no __VA_ARGS__:

[me:]
| Using gcc I'd use:
| #define MCALL(f, obj, ...) ((obj)->funcs->f((obj), ##__VA_ARGS__))

Your original example does not have (and does not need in my
opinion) auxiliary data. x is the container and y is the item
(index?) to 'get'.

In this context you suggested that the problem of empty __VAR_ARGS__
is not serious because of the need for auxiliary data. So I looked
back to your example (already snipped by this time) and saw none.

The need for auxiliary data /in callbacks/ hardly matters for the
MCALL macro because any 'apply callback' type operation already needs
a function to apply, and MCALL will work as well with one extra
argument as it will with two:

MCALL(apply_function, list, my_function);

works as well as

MCALL(apply_function, list, my_function, &my_data);

If, on the other hand, you are saying that /all/ operations on /all/
containers should have an auxiliary data argument then (a) I disagree
(but won't argue the point) and (b) I would have preferred to see it
in your example

#define get_element(x, y, aux) ((container *)x)->funcs->getele(x, y, aux)

so that this whole sub thread could have been avoided!

If that was not what you were saying, I still can't see the connection
to the __VA_ARG__ problem.

<snip>
 
S

Seebs

But the macro was not for a callback. Zero "extra" arguments are
perfectly normal in the context it would be used.

Hmm. Could be.
Not this example, the one you originally gave that I proposed with the
macro in question. There's been too much snippage. Lets put some
contact back:
[you:]
| I'd probably do:
| #define get_element(x, y) ((container *) x)->funcs->getele(x, y)
I suggested replacing this with MCALL and described the problem when
there are no __VA_ARGS__:
[me:]
| Using gcc I'd use:
| #define MCALL(f, obj, ...) ((obj)->funcs->f((obj), ##__VA_ARGS__))
Your original example does not have (and does not need in my
opinion) auxiliary data. x is the container and y is the item
(index?) to 'get'.

Sure, but you could indeed invoke it with the MCALL macro, because it
has at least one additional argument. Furthermore, I'm not sure it
wouldn't sometimes be useful to have an aux data element for that.
In this context you suggested that the problem of empty __VAR_ARGS__
is not serious because of the need for auxiliary data. So I looked
back to your example (already snipped by this time) and saw none.

Well, arguably, "y" is. Even if we didn't have an "aux" argument,
we'd be doing MCALL(getele, x, y) -- so __VA_ARGS__ would have at
least one member.
If, on the other hand, you are saying that /all/ operations on /all/
containers should have an auxiliary data argument then (a) I disagree
(but won't argue the point) and (b) I would have preferred to see it
in your example
#define get_element(x, y, aux) ((container *)x)->funcs->getele(x, y, aux)
so that this whole sub thread could have been avoided!

I hadn't remembered it by then. I nearly always forget that there's
a recurring theme of discovering, after building an interface, that you
need a way to pass additional data or context in.

-s
 
N

ng2010

jacob navia said:
Seebs a écrit :


I do not know of any other attempt in C.

Inheritance is not necessary at all. The code works without it.

Jacob: Do you even know C++? Because it sounds like you are burying your
head in the sand and doing wishful thinking and hoping that others would
believe and follow you, but the fact of the matter is that you are the
only one who wants to put lipstick on a pig! Eventually, all the old iron
of detroit will be gone, rusted away. So will C rust away. "A few" still
drive around the old cars, but they are still deathtraps and old,
obsolete technology. Restoring them with new underpinnings like
independent suspensions, doesn't change them much (still deathtraps).
With all your zeal, you should study C++ (if you haven't) and maybe a few
or all other languages and then embark on creating a new language (though
I can generalize and say that those who are good at building compilers
make for crappy language designers, but you know what they say about
generalizations). But you aren't a compiler builder, now are you. You
sound like a young man so I wonder what the fork you are doing with C
other than as a potential intermediate representation. You have time. Use
it wisely.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,774
Messages
2,569,599
Members
45,165
Latest member
JavierBrak
Top