Change in C

REH · Sep 17, 2009

In other words, after some *years* working in the problem, it was
impossible to introduce that change in the given time frame.

No, an issue came up and it was felt that it could not be properly
address by next month's deadline. It happens. They wanted to error on
the side of caution then introduce a broken feature. That's a bad
thing? Besides, "impossible in a given time frame" is not
"impossible." Leaving off the qualifier is disingenuous.

Well, the future is very long. Many things may happen.

No, we can't predicate the future, but the next TR is scheduled for
next year. It is already being worked on.

I brought that into this discussion because is an example of
where C++ has gone.

Yes, a popular and powerful language used in everything from embedded
systems to games.

It was an example of the
utter complexity of the language, a complexity of such proportions
that people that should be *the* experts in the matter are just unable
to introduce any further change because of the implications can't be
known, the complexity has grown beyond what mere humans are able
to handle.

unable? really? Have you compared the new draft standard to the '03
version? I think there are a lot of great changes/additions/fixes in C+
+0x.

Complexity is not necessarily bad. You don't need to--and shouldn't--
use very language feature in every program. I think one of the first
mistakes a learner of C++ (and probably other languages) makes is to
learn a "neat" new feature, and think "hmm, how can I use this in my
program."

Ada and Lisp are also complex, but are still great languages. I think
the changes going into Ada 200y are going to be it even better.

What specifically don't you like about C++? Is it just its size?

If the creators of the language are now in such a situation, what
happens with the normal users?

C++ is controlled by its standards committee, the same as C. It is not
controlled by its creator (although he is on the committee).

REH

jacob navia · Sep 17, 2009

Chris M. Thomasson a écrit :

jacob navia said:
jacob navia said:

Seebs a écrit : [...]

Is the table actually stored
in each object?

Click to expand...

No, of course!

Then it's a huge waste of space.

Do we store only a
pointer to the table?
Yes.

Then we have allocation issues with the risk of
dangling pointers or leaked memory, and we have to establish some kind
of pattern for determining whose job it is to allocate the storage.

Click to expand...

There is no allocated memory for the pointer! What are you speaking
about? The pointer points to a static global table for all containers
of the same type.

Click to expand...

Are you doing something like this:

http://clc.pastebin.com/f52a443b1

Well it looks very similar indeed. The software is completely different
apparently (device driver !!!) but the method looks very similar.

I hope I did not iunfringe some copyright!

jacob navia · Sep 17, 2009

Richard Heathfield a écrit :

Clearly we disagree.

Nope. I lose nothing by it, whether by your interpretation of his
article or by mine.

OK Mr heathfield could we stop this and come back to the technical
points?

I propose to use a table of function pointers for all containers, a
static table for each type of container.

In each container object we store a pointer to that table. Access
to function goes like this

MyType Data; // some user data

List *list = newList(sizeof(data)); // Create a container
// Empty at the start

list->lpVtable->Insert(list,&data); // add one element

int count = list->lpVtable->GetCount(list); Should be 1

etc.

Now, since I haven't mentioned anything else please let's GO ON!
What do you think about it?
How would you improve this design?
What are its problems?

Thanks.

Please read the other messages in this thread. I think we
have a good discussion here. Let's abstract for a while from our
differences.

Thanks again.

cognacc · Sep 17, 2009

cognacc a écrit :

There are two problems with that library:

(1) The more serious one:http://library.gnome.org/devel/glib/2.20/glib-Memory-Allocation.html
<quote>
Memory allocation functions.
---------------------------
Description
These functions provide support for allocating and freeing memory.
Note
If any call to allocate memory fails, the application is terminated.
This also means that there is no need to check if the call succeeded.
<end quote>

This is completely unacceptable and actually it is a horrible idea.
This precludes any use of the whole library since all functions
that *potentially* allocate memory are tainted.

I see(and agree), but maybe the library have both a behaviours
(looking in to it)
any way, in a program you could use gmalloc or malloc to get "normal"
behaviour.

(2) Another problem (or maybe not) is:
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Library General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.

I do not know if linking with gtk forces you to disclose your
source code...

Linking with glib can be done without disclosing source. because it
uses the LGPL.
It could still be a problem
for some i guess.

from the glist.c file
---
* GLIB - Library of useful routines for C programming
* Copyright (C) 1995-1997 Peter Mattis, Spencer Kimball and Josh
MacDonald
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
---

In any case the glist functions interface are very similar to
what I propose or to many other lists packagesa around.

Ok.

I look forward to the technical details.

mic

Seebs · Sep 17, 2009

In each container object we store a pointer to that table. Access
to function goes like this

MyType Data; // some user data

List *list = newList(sizeof(data)); // Create a container
// Empty at the start

list->lpVtable->Insert(list,&data); // add one element

int count = list->lpVtable->GetCount(list); Should be 1

Now, since I haven't mentioned anything else please let's GO ON!
What do you think about it?
How would you improve this design?
What are its problems?

From a design standpoint: What if anything goes in "data" to make it
part of the list? Does the list container maintain a separate linked list
of items with pointers to its data elements? If so, consider the malloc
overhead; you're now adding an additional malloc overhead to each item
in the list, by having a list element in the container. The indirections
are expensive. How do I get from the data element to the container, or
can the same data element be in several containers?

From a practical standpoint: There's no way I'd use this. I am familiar
with code using things like the list operations in the BSD kernel, which
are carefully structured to allow extremely efficient list operations.
Compared to them, for small data elements (the normal case, as it happens,
for most of what they're working on), your implementation looks like it
has a minimum cost of 20-30% or so for data, and probably closer to 50%
for execution time.

-s

Seebs · Sep 17, 2009

Obviously if you program in a microcontroller with 16K of RAM you can't
use printf. But nobody is complaining here that printf should be
banned.

The point is, list operations are very often the inner loop of a
program.

Yes. Lists, however do have a "Next" pointer. Flexible arrays not.

Where's the "Next" pointer? Is it in the data element? If it is,
then how does the user declare that pointer?

For lists, it is tored in each element since the elemnts of a list are
ordered by their "next" pointers, that's the essence of a list.

Okay.

So what's the container get you, given that you don't need it to find the
next item? It seems as though this isn't doing much good for lists.

In my implementation the container manages its own memory and copies the
received data. If you do not do this, the container wont be able to
reorganize the data as needed. Obviously it *could* be done, but it
would be complicated to free the elements when some element is deleted.

.... So if I have an object which I need to create, I have to allocate one,
then copy it into the container, which allocates space for it and copies
it.

If the address of the object is significant, I can't put it in a container
at all.

A function call and 2 indirections? That is a "pretty significant
cost"?
Yeah.

That's nothing in most machines. In any case, if you use your
OWN functions, the only thing you gain are just 2 indirections
since you have to make a function call anyway.

Not if they're inline or macros, which list ops often are.

Suppose you have a list with 100 elemnts of 10 bytes each. (1000
bytes of storage). The overhead per list is 8 bytes for a pointer
(in a 64 bits machine) or 4 bytes for a pointer in a 32 bit machine.

Okay. So you're not keeping a pointer to the container, and the list
pointers are in the objects.

But you are requiring me to pass around a container AND a list element
any time I need to be able to do operations on the list element...

Sure. But sometimes you need to change from a list to a table because
lists are expensive. Now you have to rewrite everything. Using this
approach you can keep MOST of your code intact since

foo->lpVtable->Add(foo,(void *)&data);

will stay the same for the new container!

If I want that kind of flexibility, I write in Ruby.

It would be interesting to see the ratio of my library
to ruby...

Always speaking about performance and then... a ruby interpreter!

Exactly. If I am writing in C, I probably care about the cost of a couple
of indirections (you're not thinking about the cache miss cost of that,
which is probably going to dominate the execution cycles). If I'm not that
concerned, I'll write in a much higher level language.

-s

jacob navia · Sep 17, 2009

Seebs a écrit :

From a design standpoint: What if anything goes in "data" to make it
part of the list?

Nothing. This is the "payload" of the container. This will be stored
in each element of the list. Since the list knows how big each element
is, each element is copied into the container linked list.

Another schema is possible with each element having a DIFFERENT size. In
that case data should NOT be copied, just a void * is stored and the
user is responsible for all data he/she stores

> Does the list container maintain a separate linked list
of items with pointers to its data elements?
Yes

If so, consider the malloc
overhead; you're now adding an additional malloc overhead to each item
in the list, by having a list element in the container.

If you do not want data to be copied, just use a void pointer and pass
the address.

The indirections
are expensive. How do I get from the data element to the container, or
can the same data element be in several containers?

As you like, but you can't get from an element to its container since
there are no backpointers. That would be much too expensive in space

Of course nothing is hindering you from storing YOURSELF a backpointer
in the data if you wish.

From a practical standpoint: There's no way I'd use this. I am familiar
with code using things like the list operations in the BSD kernel, which
are carefully structured to allow extremely efficient list operations.
Compared to them, for small data elements (the normal case, as it happens,
for most of what they're working on), your implementation looks like it
has a minimum cost of 20-30% or so for data, and probably closer to 50%
for execution time.

This is completely bogus. You have no measurements to justify such an
assertion. Suppose worst case that you have a list of chars. Each of
size 1.

And you want to store 100 of them. You have 100 chars, and in a 32
bit machine an overhead of 4 bytes for the lpvtbl pointer, and maybe
3 more pointers for cursor, root and last. 16 bytes would be an
overhead of just 16%!

And, who is going to build a list of chars of only 100 chars?

You just allocate a buffer of 512 and you have all your needs covered.
You do not need a list for this case.

The same for integers or even doubles. For very small lists just
using a buffer is much more efficient.

Another container is the flexible array. This is a growable array,
of elements of the same size, or an array of void pointers if
you want different size.

There, the overhead is less since there is no "next" pointer in each
element.

jacob navia · Sep 17, 2009

Seebs a écrit :

The point is, list operations are very often the inner loop of a
program.

Mmmm I wouldn't do that in inner loops. Much better to use an array
then. Lists are more expensive than arrays.

Where's the "Next" pointer? Is it in the data element? If it is,
then how does the user declare that pointer?

The container manages a list of stuff. Each list element contains
two fields:
(1) a Next pointer
(2) Data. If the list is homogeneous (all elements of the same size)
the data is right after the next pointer. If not, just a void * is
stored.

Okay.

So what's the container get you, given that you don't need it to find the
next item? It seems as though this isn't doing much good for lists.

If you want you can manage the list directly and use the "next" pointer.
In that case the container is unnecessary. You do your own list. But
if you want to use a cursor, a pointer to the last element (for
fast appending) etc, you will want the container.

Obviously this containers aren't good for ALL situations in all
circumstances. Like printf. It is MUCH slower that puts()
isn't it? Avoid it in inner loops.

... So if I have an object which I need to create, I have to allocate one,
then copy it into the container, which allocates space for it and copies
it.

As before, if you do not want that, you create a container that manages
void pointers to your data.

If the address of the object is significant, I can't put it in a container
at all.

Yeah.

In all modern CPUs that's nothing. Maybe in a slow microcontroller but
then, do not use containers in those slow machines. Build your own
lists. You can't use printf, you do your own simple formatting.
You can't use sqrt because there isn't any floating point, you write
your own approximation for integers. ETC.

But you are not really proposing to eliminate printf and sqrt from the
language isn't it?

Not if they're inline or macros, which list ops often are.

I thought you were memory constrained. Inline functions and long macros
for handling lists are memory intensive...

Okay. So you're not keeping a pointer to the container, and the list
pointers are in the objects.

But you are requiring me to pass around a container AND a list element
any time I need to be able to do operations on the list element...

You use the indexing functions. GetNth(list,int);

If I want that kind of flexibility, I write in Ruby.

Ahh OK. That flexibility is OK when in Ruby. In C we are
doomed to write:

ptr = ListHead;
n = 23;
while (ptr && n > 0) {
ptr = ptr->Next;
n--;
}
assert(n == 0);
ptr->Data = "This is element 23";

Why can't we write

List->lpVtbl->Assign(List, 23, "This is element 23");

Ben Bacarisse · Sep 17, 2009

jacob navia said:
Seebs a Ã©crit :

Mmmm I wouldn't do that in inner loops. Much better to use an array
then. Lists are more expensive than arrays.

No, sometimes arrays are more expensive. If a list is the right
structure, switching to array will often make matters worse.

Take a merge sort for example. A linked list is the right structure
and the list operations will be in an inner loop. Switching to an
array does not fix anything in this situation.

<snip>

jacob navia · Sep 17, 2009

Richard Heathfield a écrit :

> For the list to own the data (which, personally, I think is a good
idea), it is going to have to make its own copy if it is to persist
past the point where the feed data goes out of scope. That means a
malloc, obviously (or some dark equivalent thereof), and some people
will object purely on those grounds. That's your first major problem
- some people will want the list to own the data, and others will
very much want it *not* to own the data.

This is easily solved. If you do not want the list to own the data
you do not give any data to the list. You make a list of void pointers
to your data. Then, you own the data and no copies are done.

Assuming it does own the data, you then have the problem of pointers
within that data.

Shallow copy since the list container can't know what are pointers
and what are just integers.

Do you do a deep copy, or a shallow copy? If you do
a shallow copy, you risk ownership confusion and of course dangling
pointers. If you do a deep copy, you will probably need the user to
provide copying semantics. This complicates the interface.

Exactly. What is important here is that we keep it simple. Not
ALL possible uses MUST be covered. Just 90% of all uses. Why
should the list container care about what it has to copy?

Instead of passing a pointer to a copy-constructor the user
of the list is responsible for copying and THEN passing that to the
list. Good design goes through keeping tasks separated and
responsibilities clear. The container cares about its data elements,
but it doesn't know ANYTHING about them.

PERIOD. End of story.

One way to obviate this problem is to have a sort of registration
process when setting up a new list, so that all the complication is
up front.

Not necessarily. Comparison defaults to memcmp. Destructors are not
provided but you can use an iterator to iterate all elements of the list
when it is destroyed. This is not C++, it is C. The compiler does
much less for you, but the language is easier to learn and use.

You could also use the opportunity to register destructors,
comparators, and iterators. For example:

List *list = newList(sizeof Data,
MyTypeCopyConstructor,
MyTypeDestructor,
NULL, /* we happen not to need a comparator */
MyTypeIterator);

Copy constructors defaults to memcpy. If you want a fancy one
you can have it but it would add overhead.

A copy constructor would be of type int(const void *, void *);

I assume this means
int (*copyconstructor)(const void *,void *);

Note that you use copy(source , destination) what is the contrary of
copy functions in C. (memcpy strcpy). I would rather have it the
other way. I do not think that this is needed though. The user
copies the data because the user knows what the data is storing,
where are the pointers what do they point to, etc.

Division of reponbsabilities. Again, this is VERY important in
software design.

A destructor would be of type int(void *, void *);

I do not see why you need a destructor *in the list*. If you want to
destroy an item to the list, you call GetElement, you do whatever you
need to do when the item is destroyed, and then you call Delete
with the index of the item.

Yes, it is not as "nice" but it is just as good.

A comparator would be of type int(const void *, const void *, void *);

OK. You need that. It defaults to memcmp but you should be able to
change it. By the way, what is the third argument?

An iterator would be of type int(void *, void *);

Wait. It is much simpler to pass the callback at each call to
the iterator.

In each case, the last (optional - NULL is okay) pointer is to an
arbitrary structure that can carry or even accept side channel
information.

Ahh OK. That's clever. I will add it to my implementation.

These function pointers (the non-NULL ones, anyway) would then be
called whenever the standard data structure is called upon to do
whatever. For example, list->iterate(&myvars) would call
*MyTypeIterator for each item in the list, handing it a pointer to
the current item and the pointer we passed to it.

Correct. I do that, but I receive the function pointer at each
call to the iteration function to avoid storing it in the
list and increasing the fixed overhead. You are NOT going to
iterate all lists, only some of them. Then, it is more
space efficient to pass the iterator to the iterate list
function.

You would also need to have functions to allow these defaults to be
updated - e.g. list->SetNewIterator(MyTypeAnotherIterator).

Not if you pass it at each call.

(This is pretty much how my own library works, by the way.)

The problem is that it's all getting very complicated, and that in
itself is very likely going to make it unacceptable to a standards
committee.

Complicated???

Too complicated???

This is NOTHING. I am sure you have designed software that is MUCH MORE
complicated than this stuff!

And most of us have. Including committee members.

But every way of simplifying it will lead either to complications in
other areas or insufficient power and flexibility.

What is important to have clear at the start is that it is not
possible to achieve PERFECTION. Each user will have some
special needs. Since all those function pointers are writable,
users can add or eliminate features as they wish, always MAINTAINIG
the same interface!

Bottom line: there are a great many ways to do this, and every way has
its advocates and detractors. Getting a consensus within ISO for
adopting any single proposal is a Sisyphean task.

I would not aim so high right now. I would aim to develop here in this
group a consensus implementation that is widely distributed and
with no strings attached at all.

After it has spread around, we have experience with it, it becomes

CURRENT PRACTICE!

and ISO will gladly adopt it.

Thanks for paricipating

Seebs · Sep 17, 2009

Nothing. This is the "payload" of the container. This will be stored
in each element of the list. Since the list knows how big each element
is, each element is copied into the container linked list.

Memory copies are expensive. Furthermore, now the container is allocating
things.

So:

I allocate an object, populate it, add it to the list...
... which allocates an object, copies mine into it
....and then I free my allocated object.

That's only doubling the number of allocations.

Another schema is possible with each element having a DIFFERENT size. In
that case data should NOT be copied, just a void * is stored and the
user is responsible for all data he/she stores

And that still means there's two allocated objects per list item.

If you do not want data to be copied, just use a void pointer and pass
the address.

Doesn't matter, you're still going to have an extra allocation; one for
the hidden list-object item the container keeps track of, and one for the
user's object.

As you like, but you can't get from an element to its container since
there are no backpointers. That would be much too expensive in space

Right.

Which means that the elements are less useful than traditional list members
in a naive C implementation.

This is completely bogus. You have no measurements to justify such an
assertion. Suppose worst case that you have a list of chars. Each of
size 1.

And you want to store 100 of them. You have 100 chars, and in a 32
bit machine an overhead of 4 bytes for the lpvtbl pointer, and maybe
3 more pointers for cursor, root and last. 16 bytes would be an
overhead of just 16%!

And where's the list? You need the information about the ordering, too.

And, who is going to build a list of chars of only 100 chars?

I'm not.

But.

I have 100 objects. Each of them is a separate malloc buffer.

Now I use your library, and as suggested, just store the (void *) in the
container.

Your library has to maintain a list of 100 objects, in which case, it's
storing ANOTHER 100 separate malloc buffers. Which isn't helping me any.

There's also extra copies going on, etcetera.

You just allocate a buffer of 512 and you have all your needs covered.
You do not need a list for this case.

You might want to look up mbufs.

Another container is the flexible array. This is a growable array,
of elements of the same size, or an array of void pointers if
you want different size.

There, the overhead is less since there is no "next" pointer in each
element.

True. But if I'm using the pointers, I'm still dealing with malloc overhead.

Don't forget that there's at least some extra space used by each allocated
object for the malloc internals.

-s

Seebs · Sep 17, 2009

Seebs a écrit :

Mmmm I wouldn't do that in inner loops. Much better to use an array
then. Lists are more expensive than arrays.
mbuf

The container manages a list of stuff. Each list element contains
two fields:
(1) a Next pointer
(2) Data. If the list is homogeneous (all elements of the same size)
the data is right after the next pointer. If not, just a void * is
stored.

Okay.

So each list element is an allocated object.

Which means that, to use the container, I have to double the number of
allocations I perform.

As before, if you do not want that, you create a container that manages
void pointers to your data.

.... So if I have an object which I need to create, I have to allocate one,
then copy its address into the containner, which allocates space for the
list element object plus a void pointer, and copies the address.

It's still an extra allocation.

In all modern CPUs that's nothing.

"Nothing" is a pretty big deal if you're trying to saturate gigabit ethernet
links while doing substantial packet processing.

Maybe in a slow microcontroller but
then, do not use containers in those slow machines. Build your own
lists. You can't use printf, you do your own simple formatting.
You can't use sqrt because there isn't any floating point, you write
your own approximation for integers. ETC.

printf is actually a pretty lightweight formatter -- it's very well built
to do the simple cases quickly.

But you are not really proposing to eliminate printf and sqrt from the
language isn't it?

No.

But I am suggesting that your "container" design is going to be useless for a
BIG chunk of the use cases I've seen for lists in the real world.

I thought you were memory constrained. Inline functions and long macros
for handling lists are memory intensive...

I never said I was particularly memory constrained. Also, I never said
the macros were long...

On x86, anyway, the most likely implementations of the most common list
operations are not noticably longer than the calling sequence for a function
to implement them. Keep in mind that trivial functions SAVE space when
you inline them!

You use the indexing functions. GetNth(list,int);

Yup. That's two arguments, now. Worse, that's going to be a pretty expensive
operation on a list of a thousand items; on average, it'll be 500 loops (and
loops are usually expensive, because branches are expensive).

Ahh OK. That flexibility is OK when in Ruby.

Design-tradeoffs are a big part of effective engineering. If I am writing
in C, I am doing it because I have a real reason to work at the machine level,
and I'm probably counting cycles. If I want convenient high-level
abstractions to save programmer time, I'm usually in the area where my
work is utterly dominated by I/O waits, and I will use shell or ruby.

ptr = ListHead;
n = 23;
while (ptr && n > 0) {
ptr = ptr->Next;
n--;
}
assert(n == 0);
ptr->Data = "This is element 23";

This example is unreasonable, for a very simple reason:

If you care about "element 23", you shouldn't be using a list.

Why can't we write

List->lpVtbl->Assign(List, 23, "This is element 23");

Honestly, the capitalization here makes me feel like you don't quite
"get" the Spirit of C. There's a reason that all the library calls
are lowercase...

Anyway, your example highlights a key problem in both cases:

Who determines whether or not to free the pointer associated with a
list element? You're obviously assigning pointers, not arrays, so
there's implicit allocation somewhere. Who owns that allocation?
If it's the container library, what happens if I allocate some space
and want to assign the pointer? The library has to duplicate the thing
I copied. If I own it, how do I track which list elements have allocated
space and which have string literals?

In short, I don't think this is a good use case.

Go read the BSD mbuf code. If you understand that code, I think you'll
get a lot of insight into the requirements for an effective list
implementation in C.

-s

jacob navia · Sep 17, 2009

Richard Heathfield a écrit :

But now you've got all these void pointers adding overhead to the
story. Not necessarily a bad thing to some people, but others will
see it as a waste of resources.

No. There is no overhead. A list has AT LEAST two elements:
(1) Next element pointer
(2) data. In this case if you do it yourself you will have
a list of void pointers ANYWAY, the container doesn't change
ANYTHING to that fact.

The extra overhead in this case is only one container object per list.

Then you expose the programmer to the risk of dangling pointers.

Yes, better avoid them

.
(1) You do NOT maintain copies of the data you passed to the container.
(2) When needing to copy an element you do the copy yourself, THEN you
pass it through.

Again, not everyone will see this as a showstopper - but many will.
(In this case, I'm one of them - I'd prefer to provide deep copy
facilities.)

This is a valid point. Happily the design is flexible enough to
accomodate your wishes. You subclass the functions you need. To do
that you:
(1) Save the current pointer to the function you need
(2) Set your own function that does the copying using some private
data and then
(3) calls the pointer that was in the table to perform the rest of the
functionality you have just subclassed.

This works since a lot of time and has been used a lot of times in many
different contexts. You can do it under windows for instance, by
subclassing the windows procedure for a window. And I am sure
other window systems allow this too. It is very common concept.

Fair point. Here's another fair point. Why should the user go to the
inconvenience of typing the code for copying a structure instead of
just passing the darn thing to the list interface - fire and forget.

Exactly, you are right. Happily the design is flexible so you
can do that if you wish. Just subclass the functionality. See above.

Yes, but we have that /now/. Adding containers is not going to make
the language any simpler.

Of course it will make the language simpler!!!
That's the main reson of this prtoposal. All that code
handling lists that is EVERYWHERE IN ALL C PROGRAMS of some
complexity will disappear, replaced by a uniform interface
that is standard and will be present in all run time environments.

Right. So some people will want to add one, and others will want not
to add one, and will resent the space taken for the pointer.

No, they can subclass, as I said above.

Well, I was talking about the function itself, but yes, the list
structure would contain a pointer of that type, except...

Just a screwup - I missed a rather important parameter!

int (*copyconstructor)(void *this, /* dest */
const void *that, /* src */
void *theother); /* optional side channel */

Sure. I can see that. But /I/ think it's needed. So we disagree.
That's fine, we should be able to disagree without coming to blows,
but we can't have both support for a constructor /and/ absence of
support (and absence of overhead) for a constructor.

YES WE CAN!

(pun intended)

We can subclass each of those functions if we want to. We can use
our subclassing at the individual list level (this list will need
a copy constructor but that one won't) or we can subclass at the
global level (ALL my lists will use a copy constructor)

The most important part of this design is that IT IS VERY FLEXIBLE!

It's one or the
other, really.

No. I want the cake and the money of the cake too.

[snip since you repeat the same arguments]

Seebs · Sep 17, 2009

I am assuming that you are proposing this as a change to the standard
language. So my first observation is that your proposed function
names are trampling on user namespace. But that's a minor point,
easily fixed by choosing implementation-reserved names.

Jacob's already said that he intends these to be in a struct, so it
doesn't trample user namespace.

Richard, though, is right -- because that namespace is still available
for user #defines.

For the list to own the data (which, personally, I think is a good
idea), it is going to have to make its own copy if it is to persist
past the point where the feed data goes out of scope. That means a
malloc, obviously (or some dark equivalent thereof), and some people
will object purely on those grounds. That's your first major problem
- some people will want the list to own the data, and others will
very much want it *not* to own the data.

More succinct that my own explanation, and thus better.

There are good reasons for both. I think you'd need to support both,
but...

Assuming it does own the data, you then have the problem of pointers
within that data. Do you do a deep copy, or a shallow copy? If you do
a shallow copy, you risk ownership confusion and of course dangling
pointers. If you do a deep copy, you will probably need the user to
provide copying semantics. This complicates the interface.

Good catch, hadn't thought about deep copies yet.

The problem is that it's all getting very complicated, and that in
itself is very likely going to make it unacceptable to a standards
committee.

Pretty much -- especially because it's not at all clear that there's
enough benefit from standardizing something like that to justify it.

Bottom line: there are a great many ways to do this, and every way has
its advocates and detractors. Getting a consensus within ISO for
adopting any single proposal is a Sisyphean task.

Not necessarily. The snprintf proposal, so far as I can tell, had instant
consensus that everyone wanted it and agreed it was necessary, and all
that remained was making sure that it would work.

(IMHO, we botched part of that spec, and I think "we" is at least
partially my fault; the solution we came up with to "how do you avoid
doing multiple allocations" is not, IMHO, very clean.)

-s

jacob navia · Sep 17, 2009

Seebs a écrit :

Memory copies are expensive. Furthermore, now the container is allocating
things.

So:

I allocate an object, populate it, add it to the list...
... which allocates an object, copies mine into it
...and then I free my allocated object.

That's only doubling the number of allocations.

You build an object in a buffer. You populate it and when ready,
you give it to the list. Then, for the next object you reuse
the same static buffer. Only one allocation.

Elementary my dear Watson...

And that still means there's two allocated objects per list item.

ANY list needs two object for each item.

(1) The first one has at least two fields: the "next" pointer, and the
pointer to the data.
(2) The actual data that the list holds.

This stays the same. There is NO extra overhead.

Doesn't matter, you're still going to have an extra allocation; one for
the hidden list-object item the container keeps track of, and one for the
user's object.

Well, THAT'S ALL LISTS IN THE WORLD!

Right.

Which means that the elements are less useful than traditional list members
in a naive C implementation.

Mmm you stoire backpointers to the root of the list at each element?

WoW That is really overhead.

And where's the list? You need the information about the ordering, too.

But that is not different as lists now! ANY list needs two elements
see above.

I'm not.

But.

I have 100 objects. Each of them is a separate malloc buffer.

Now I use your library, and as suggested, just store the (void *) in the
container.

Your library has to maintain a list of 100 objects, in which case, it's
storing ANOTHER 100 separate malloc buffers. Which isn't helping me any.

No, because the list software has a pool of list elements that decreases
the likehood of using malloc...

There's also extra copies going on, etcetera.

No, as I told you before, there are no copies when you just use void
pointers.

Don't forget that there's at least some extra space used by each allocated
object for the malloc internals.

No, since the list software allocates many lists elements at once in
blocks.

Seebs · Sep 17, 2009

Not necessarily. Comparison defaults to memcmp.

That won't work for structs, in general.

Ahh OK. That's clever. I will add it to my implementation.

It is a very good idea to have a sideband thing for iterators.

I didn't do it in my list library.

For reference:

ls_list *ls_append(ls_list *dest, ls_list *src);
void ls_free(ls_list *list, void (*deinit)(ls_list *));
ls_list *ls_iter(ls_list *list, int (*fn)(ls_list *));
ls_list *ls_new(size_t size, void (*init)(ls_list *));
ls_list *ls_push(ls_list *dest, ls_list *src);
ls_list *ls_rev(ls_list *list);
ls_list *ls_sort(ls_list *list, int nelem, int (*cmp)(const ls_list *, const ls_list *));

I wrote this some time back, I've used it for ages on and off, it's never
given
me any trouble.

Correct. I do that, but I receive the function pointer at each
call to the iteration function to avoid storing it in the
list and increasing the fixed overhead. You are NOT going to
iterate all lists, only some of them. Then, it is more
space efficient to pass the iterator to the iterate list
function.

In my implementation, iteration was a quality of ls_list objects, but
you could provide your own function to apply to the list. (It stops iterating
when the function returns nonzero.)

Complicated???

Too complicated???
Yes.

This is NOTHING. I am sure you have designed software that is MUCH MORE
complicated than this stuff!

And most of us have. Including committee members.

Yup. Frequently.

But I never proposed my list library, or my string library, as standard
features, because I don't think they're suitable for standardization.

(My string library does self-space-managing strings which can contain
null bytes and preserve string semantics.)

I would not aim so high right now. I would aim to develop here in this
group a consensus implementation that is widely distributed and
with no strings attached at all.

After it has spread around, we have experience with it, it becomes

CURRENT PRACTICE!

and ISO will gladly adopt it.

This is actually a very good approach to the issue.

My own impressions are that:

1. This design is too top-heavy and full of things that sound general but
are not quite general enough to be worth extra cost.
2. In general, the simple cases are simple enough that a "standard library"
doesn't buy you much, and the complex cases are complex enough that the
"standard library" wouldn't be able to handle them.

FWIW, my list library has been used maybe twice. It dramatically beat qsort
for at least one real-world use case, and that plus not needing to convert
to an array and back made it, apparently, useful to one researcher somewhere
around 1992 or so. I don't even bother to use it if I'm doing trivial lists
now, because it's usually more than I need and I don't want to deal with
including it.

-s

Seebs · Sep 17, 2009

You build an object in a buffer. You populate it and when ready,
you give it to the list. Then, for the next object you reuse
the same static buffer. Only one allocation.

I might not know what size it will be in advance -- in which case, you're
allocating a list object to hold the pointer to my allocated object.

ANY list needs two object for each item.

No, it doesn't.

(1) The first one has at least two fields: the "next" pointer, and the
pointer to the data.
(2) The actual data that the list holds.

This stays the same. There is NO extra overhead.

Uh.

struct list {
struct list *next;
size_t len;
int data[];
};

We fixed that a LONG time ago, and even before C99 formally blessed a syntax
for this, everyone knew it worked with 'int data[1];' or 'int data[999999];'
on real compilers.

Well, THAT'S ALL LISTS IN THE WORLD!

No, it isn't.

An awful lot of lists written in C store the 'next' pointer in the data
object.

That's the point.

Mmm you stoire backpointers to the root of the list at each element?

Nope.

I don't usually find myself needing any hypothetical "root" of the list.
The list isn't a separate object. Each allocated item capable of
being in a list has a next pointer. That's it.

But that is not different as lists now! ANY list needs two elements
see above.
Nope.

No, because the list software has a pool of list elements that decreases
the likehood of using malloc...

This changes very little; it's still allocation, even if it's a streamlined
allocator.

No, since the list software allocates many lists elements at once in
blocks.

And then tracks which ones it's used...

Yeah, not seeing an advantage here.

-s

Bart · Sep 17, 2009

Richard Heathfield a écrit :

YES WE CAN!

(pun intended)

We can subclass each of those functions if we want to. We can use
our subclassing at the individual list level (this list will need
a copy constructor but that one won't) or we can subclass at the
global level (ALL my lists will use a copy constructor)

This is starting to sound a bit like C++ with all this new jargon. To
get people to use this they have to solidly understand it.

Explain again what these containers do, are they just flexible arrays?
Or linked lists?

jacob navia · Sep 17, 2009

Bart a écrit :

This is starting to sound a bit like C++ with all this new jargon. To
get people to use this they have to solidly understand it.

The library comes with a pointer to a predefined function.

If you need to do something before the call to the library or
later you read the function pointer in your list.

Then, you replace it with a pointer to YOUR function.

For instance, you want to make a deep copy of an object
stored in the list, as Mr heathfield wanted.

You read the pointer, and you store a pointer to your function
MyAppend. That function does a deep copy of the
object it receives then calls the stored function pointer
to make a npormal call to the library function.

If you program under window you should remember how window
subclassing works. Clearer now?

Explain again what these containers do, are they just flexible arrays?
Or linked lists?

Lists aren't flexible arrays since storage is not necessarily
contiguous. That's why they need a "next" pointer.

Flexible arrays look very much like lists since you can insert
items but storage is contiguous

OK?

Sorry about the confusion.

Bart · Sep 17, 2009

Bart a écrit :

Lists aren't flexible arrays since storage is not necessarily
contiguous. That's why they need a "next" pointer.

Flexible arrays look very much like lists since you can insert
items but storage is contiguous

OK?

OK, so these are ways of dealing with linked lists in a generic
manner.

Although, I've looked at some of my linked lists, and I usually start
with a struct of some sort then add a .next field for a simply-linked
list. The trouble is I usually have several of these .next fields
(with different names) as a list node can exist simultaneously in one
than one linked list. (And often there are up and down pointers so
technically making these trees too.)

Getting back to a single, simply-linked list, I think what you are
suggesting is to start with any list node struct S, without /
any/ .next fields, and supply this struct to your library which
inserts it (as a pointer to S) in a generic linked list framework.

Yes, that sort of seems like a good idea, although nothing that needs
language changes. But I suspect in practice individual requirements
mean your method cannot always be used.

Homework in C - Help Needed	1	Oct 16, 2024
Beginner at c	0	Oct 5, 2023
Change Calculator	3	Jan 25, 2023
Mini Web Server in C++ (Part One)	4	Oct 2, 2025
Looking to change programming direction	1	Aug 10, 2022
Help figuring out a directory permission change problem	1	May 12, 2023
C exercise	1	Feb 3, 2022
C Programming functions	2	Dec 3, 2021

Change in C

REH

jacob navia

jacob navia

cognacc

Seebs

Seebs

jacob navia

jacob navia

Ben Bacarisse

jacob navia

Seebs

Seebs

jacob navia

Seebs

jacob navia

Seebs

Seebs

Bart

jacob navia

Bart

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads