Link list problem

Shwetabh · Jul 18, 2005

Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

Walter Roberson · Jul 18, 2005

This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Is it possible that struct node is defined slightly differently, as

struct node
{
struct data *next;
} *temp;

If not then it can't be done without some kind of cheating, as
clearly anything accesses by temp will be a pointer.

There is, of course, ready cheating, such as creating a context in
which temp is a pointer to a struct data instead of a struct node.

Jonathan Bartlett · Jul 18, 2005

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

So you are saying that "temp" is a pointer into data at member "s"?

There is no portable solution, as there is no way to determine the
amount of padding used for structure members. You could guess at it by
doing:

void *v = temp;
v -= sizeof(char) + sizeof(float) + sizeof(int);
struct data *d = v;

But that would only work if the structs had no padding to them, which is
an invalid assumption.

Perhaps another possibility is to do:

struct data d;
void *dp = &d;
void *s = &d.s;
int diff = s - dp;

void *v = tmp;
v -= diff;
struct data *answer = v;

Even then you probably have some pretty platform-specific problems.

Jon

Richard Bos · Jul 18, 2005

Jonathan Bartlett said:
So you are saying that "temp" is a pointer into data at member "s"?

No, he isn't saying that. He probably _meant_ that, since without some
hint of that ilk the puzzle is unsolvable, but he didn't say it.

There is no portable solution, as there is no way to determine the
amount of padding used for structure members.

Yes, there is, for a known, declared structure: offsetof.

(No, I'm not giving any more hints.)

Richard

John Cochran · Jul 18, 2005

Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

Ick.
The first thing I'd say is "I hope the code at this company doesn't do
silly things like this."
The second thing is that I'd use the macro "offsetof" along with some creative
casting and pointer arithmetic to transform a "struct node *" into
a "stuct data *". The rest is trivial.

But once again, I'd want an assurance that the above silliness isn't standard
practice at the company you're applying for a job at.

pete · Jul 19, 2005

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
struct node {
struct node *next;
} *temp;
struct data {
int a;
float b;
char c;
struct node s;
};

struct data ds;
ds.a = 12345;
ds.b = 3.14f;
ds.c = 'X';
temp = (struct node *)&ds;
printf("%d\n%f\n%c\n",
((struct data *)temp) -> a,
((struct data *)temp) -> b,
((struct data *)temp) -> c
);
return 0;
}

/* END new.c */

N869
6.2.5 Types
[#27] All pointers to structure types
shall have the same representation and alignment
requirements as each other.

6.3.2.3 Pointers
[#7] A pointer to an object or incomplete type may be
converted to a pointer to a different object or incomplete
type. If the resulting pointer is not correctly aligned
for the pointed-to type, the behavior is undefined.
Otherwise, when converted back again, the result shall
compare equal to the original pointer.

Cong Wang · Jul 19, 2005

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

Hmm,it is easy.ANSI C defines a macro 'offsetof' in stddef.h.

/*GCC 3.2.3*/
#ifndef __cplusplus
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#else /* C++ */
/* The reference cast is necessary to thwart an operator& that might
î‚¾ be applicable to MEMBER's type.î‚¾See DR 273 for details.î‚¾*/
#define offsetof(TYPE, MEMBER) (reinterpret_cast <size_t> \
î‚¾î‚¾(&reinterpret_cast <char &>(static_cast <TYPE *> (0)->MEMBER)))
#endif /* C++ */

Netocrat · Jul 19, 2005

Ick.
The first thing I'd say is "I hope the code at this company doesn't do
silly things like this."

Why not? This is the way linked lists are implemented in the Linux
kernel. It's a good approach - embed the node in the data rather than
the other way around and hence do away with the need for a data
pointer. And it's portable.

Richard Bos · Jul 19, 2005

Netocrat said:
Why not? This is the way linked lists are implemented in the Linux
kernel. It's a good approach - embed the node in the data rather than
the other way around and hence do away with the need for a data
pointer. And it's portable.

Myeah - but it would be clearer to have the node first, rather than
last. That way, when you've traversed the list (or, for that matter,
tree, or whatever structure; this trick works with more than lists), you
can simply do dataptr=(struct data *)nodeptr, and avoid fiddling with
offsetof or worse. That cast must work correctly, since a struct must
have the same address as its first member.

Richard

Netocrat · Jul 19, 2005

Myeah - but it would be clearer to have the node first, rather than
last. That way, when you've traversed the list (or, for that matter,
tree, or whatever structure; this trick works with more than lists), you
can simply do dataptr=(struct data *)nodeptr, and avoid fiddling with
offsetof or worse. That cast must work correctly, since a struct must
have the same address as its first member.

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.

Chris Torek · Jul 19, 2005

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.

In this case, one can use something like the BSD <sys/queue.h>
macros. They are type-safe, and handle this problem. Their main
drawback is some syntactic ugliness:

struct vnode {
[snippage]
TAILQ_ENTRY(vnode) v_freelist; /* vnode freelist */
LIST_ENTRY(vnode) v_mntvnodes; /* vnodes for mount point */
[snippage]
LIST_ENTRY(vnode) v_synclist; /* vnodes with dirty buffers */
[snippage]
};

(from a BSD <sys/vnode.h>). A vnode may simultaneously be on up to
three lists, one a "tailq" (which maintains a pointer to the end of
the list) and two regular "list"s.

Netocrat · Jul 19, 2005

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.

Click to expand...

In this case, one can use something like the BSD <sys/queue.h>
macros. They are type-safe, and handle this problem. Their main
drawback is some syntactic ugliness:

struct vnode {
[snippage]
TAILQ_ENTRY(vnode) v_freelist; /* vnode freelist */
LIST_ENTRY(vnode) v_mntvnodes; /* vnodes for mount point */
[snippage]
LIST_ENTRY(vnode) v_synclist; /* vnodes with dirty buffers */
[snippage]
};

(from a BSD <sys/vnode.h>). A vnode may simultaneously be on up to
three lists, one a "tailq" (which maintains a pointer to the end of
the list) and two regular "list"s.

I wasn't aware of that approach.

I don't know if this is OT but I'll assume it's just within bounds...

Without knowing how the macros are (typically) defined I can't
understand why there is a three-list limit. The linux approach allows
for the same semantics with the difference that an unlimited number of
lists may be used. There is also no distinction between list types -
they are all doubly-linked and circular. There is no "list end" but
you are free to maintain a pointer to any element and use it as such.

Linux uses a macro not to define the structure member, but to
initialise it. Your example would equate to:
struct vnode {
...
struct list_head v_freelist; /* vnode freelist */
...
struct list_head v_mntvnodes; /* vnodes for mount point */
...
struct list_head v_synclist; /* vnodes with dirty buffers */
...
} vnode_instance;
...
INIT_LIST_HEAD(vnode_instance.v_freelist);
... and for the others ...

Is this any less ugly to you? I didn't find BSD's approach all that
ugly (but yes, a little).

CBFalconer · Jul 20, 2005

Netocrat said:
.... snip ...

Totally agree when the data can only be a member of one list. But
often (as is the case in the linux kernel), objects are members of
multiple lists.

If you generalize the idea of a list to an organized data base of
some form, you can see an immediate counter example in my id2id
program. That fundamentally works with two data bases, one for
'search objects' and one for 'replacement objects'. However each
replacement object can be specified by more than one search
object. In fact the search and replacement objects can even be the
same, although you may not like the result.

Thus each input 'object' belongs to both tables. The only thing
that keeps memory ownership under control is a convention about
which table controls which storage. This symettry allows the
programs actions to be reversed (usually).

See: <http://cbfalconer.home.att.net/download/id2id-20.zip>

Chris Torek · Jul 20, 2005

I wasn't aware of that approach.

I don't know if this is OT but I'll assume it's just within bounds...

Without knowing how the macros are (typically) defined I can't
understand why there is a three-list limit.

There is no such limit: the number of lists an object can be on
depends only on the number of "list-of" elements within the object.
The example I used (vnodes) merely happens to need to be on three
(or sometimes fewer) lists.

Linux uses a macro not to define the structure member, but to
initialise it. Your example would equate to:
struct vnode {
...
struct list_head v_freelist; /* vnode freelist */
...
struct list_head v_mntvnodes; /* vnodes for mount point */
...
struct list_head v_synclist; /* vnodes with dirty buffers */
...
} vnode_instance;
...
INIT_LIST_HEAD(vnode_instance.v_freelist);
... and for the others ...

Is this any less ugly to you?

Not really

The main problem with the above -- which was also the case with
the CMU list macros from which the BSD ones were initially derived
-- is that they are (obviously) not type-safe. Consider a structure
that heads two *different* kinds of lists. For instance, a struct
describing a chapter in a book might have a list of pages, and also
a list of sections:

struct page;
struct section;

struct chapter {
...
LIST_HEAD(, page) pages;
LIST_HEAD(, section) sections;
...
};
struct page {
...
LIST_ENTRY(page) list;
...
};
struct section {
...
LIST_ENTRY(section) list;
...
};

If you want to iterate through the sections of a chapter, you write:

/* input: ch is the chapter */
struct section *sec;

LIST_FOREACH(sec, ch->sections, list) { ... }

If you accidentally try to iterate "sec" through the "pages":

LIST_FOREACH(sec, ch->pages, list) { ... }

you get a compile-time diagnostic. Of course, if you want to
iterate through the pages, just use:

struct page *pg;
LIST_FOREACH(pg, ch->pages, list) { ... }

If the list head and list entry data structures do not have some
sort of type-name decoration -- the parameters to the BSD sys/queue.h
macros, for instance -- you cannot propagate the element-type into
them (because C is statically typed).

(It is hard to get the various fields mixed up in simple examples
like these, but -- speaking from experience -- it is all too easy
to get them mixed up in real code.)

Netocrat · Jul 20, 2005

Chris said:
There is no such limit:

I misread that. You meant the specific example structure, not the
approach in general.

The main problem with the [Linux approach] -- which was also the case with
the CMU list macros from which the BSD ones were initially derived
-- is that they are (obviously) not type-safe.

In terms of Standard C you are 100% right. On an off-topic note, since
gcc is specified as "the" Linux kernel compiler, its extensions can be
relied upon, in particular the typeof extension... so type safety is
not lost after all.

The use of the offsetof macro has the benefit that you don't need to
define a distinct "list head". All list operations are agnostic to
position - e.g. when you are iterating a list you can use any member as
the starting point. BSD's lists OTOH require that you have access to
the specific item that is the list head.

So in Standard C the loss of type-safety would have to be balanced with
this benefit; I strongly suspect type-safety would win out.

I appreciate you pointing out the BSD approach which is very
interesting to compare.

<snip>

Anyone else wanting to compare the approaches can find source for the
BSD lists at:

http://www.nendai.nagoya-u.ac.jp/global/sys/HTML/S/sys queue.h.html

and for the Linux kernel lists at:

http://lxr.linux.no/source/include/linux/list.h

Netocrat · Jul 20, 2005

CBFalconer said:
If you generalize the idea of a list to an organized data base of
some form, you can see an immediate counter example in my id2id
program. That fundamentally works with two data bases, one for
'search objects' and one for 'replacement objects'.

I don't quite follow how this relates.

However each
replacement object can be specified by more than one search
object. In fact the search and replacement objects can even be the
same, although you may not like the result.

When specifying identical search and replacement tokens the file was
unchanged, as expected - why wouldn't I like that?

Thus each input 'object' belongs to both tables. The only thing
that keeps memory ownership under control is a convention about
which table controls which storage. This symettry allows the
programs actions to be reversed (usually).

I skimmed the code enough to get the idea. Some cases are impossible
to reverse regardless. You point out the case where the replacement
already exists; another situation is where a replacement object is
specified by multiple search objects.

See: <http://cbfalconer.home.att.net/download/id2id-20.zip>

<Off-topic>
Neat utility. I once wrote a similar batch search and replacement tool
in C but mine was quick and non-hashed; had I been thinking I would
have just used sed. I suspect that due to its hashing id2id would be a
lot faster than sed at the task for which it's designed, and it's
certainly easier to use.
</Off-topic>

Netocrat · Jul 21, 2005

[Comparing the BSD and Linux kernel linked lists; and acknowledging
that BSD's approach has standards-compliant type-safety whilst Linux's
relies on the gcc typeof extension]

The use of the offsetof macro has the benefit that you don't need to
define a distinct "list head". All list operations are agnostic to
position - e.g. when you are iterating a list you can use any member as
the starting point. BSD's lists OTOH require that you have access to
the specific item that is the list head.

That was innaccurate. The requirement for a list head when iterating a
BSD list (and at all) is actually due to BSD's lists not being
circular, rather than their lack of offsetof usage.

Given a list node structure, Linux's offsetof macro approach gives you
"direct" access to the node's corresponding data structure, whereas
BSD's node-defined-by-macro approach instead gives you a pointer to the
next/previous node's corresponding data structure. For a double-linked
circular list (which Linux uses), given that under BSD's approach you
can access a node's corresponding data structure indirectly through a
construct like node->prevdata->nextdata anyway, there doesn't seem to
be a compelling advantage to the offset macro approach after all.

Infinite loop problem	1	Nov 4, 2023
Queue in C	25	May 19, 2014
Circular Link List program doesn't run	9	Sep 27, 2011
How does a HEAD pointer end up pointing to the first node in a linked list?	3	Jan 24, 2023
Codeforces problem	0	Apr 25, 2022
A car garage simulation using de-queue (link list implementation)	0	Jul 22, 2011
Queue in C	0	May 15, 2014
Array of structs function pointer	10	Jul 16, 2023

Link list problem

Shwetabh

Walter Roberson

Jonathan Bartlett

Richard Bos

John Cochran

pete

Cong Wang

Netocrat

Richard Bos

Netocrat

Chris Torek

Netocrat

CBFalconer

Chris Torek

Netocrat

Netocrat

Netocrat

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads