Link list problem

S

Shwetabh

Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.
 
W

Walter Roberson

This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it.
/* I have been given two structures */
struct node
{
struct node *next;
} *temp;
struct data
{
int a;
float b;
char c;
struct node s;
};
/* Now the question is to access the values of a,b,c using temp */

Is it possible that struct node is defined slightly differently, as

struct node
{
struct data *next;
} *temp;


If not then it can't be done without some kind of cheating, as
clearly anything accesses by temp will be a pointer.

There is, of course, ready cheating, such as creating a context in
which temp is a pointer to a struct data instead of a struct node.
 
J

Jonathan Bartlett

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

So you are saying that "temp" is a pointer into data at member "s"?

There is no portable solution, as there is no way to determine the
amount of padding used for structure members. You could guess at it by
doing:

void *v = temp;
v -= sizeof(char) + sizeof(float) + sizeof(int);
struct data *d = v;

But that would only work if the structs had no padding to them, which is
an invalid assumption.

Perhaps another possibility is to do:

struct data d;
void *dp = &d;
void *s = &d.s;
int diff = s - dp;

void *v = tmp;
v -= diff;
struct data *answer = v;

Even then you probably have some pretty platform-specific problems.

Jon
 
R

Richard Bos

Jonathan Bartlett said:
So you are saying that "temp" is a pointer into data at member "s"?

No, he isn't saying that. He probably _meant_ that, since without some
hint of that ilk the puzzle is unsolvable, but he didn't say it.
There is no portable solution, as there is no way to determine the
amount of padding used for structure members.

Yes, there is, for a known, declared structure: offsetof.

(No, I'm not giving any more hints.)

Richard
 
J

John Cochran

Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

Ick.
The first thing I'd say is "I hope the code at this company doesn't do
silly things like this."
The second thing is that I'd use the macro "offsetof" along with some creative
casting and pointer arithmetic to transform a "struct node *" into
a "stuct data *". The rest is trivial.

But once again, I'd want an assurance that the above silliness isn't standard
practice at the company you're applying for a job at.
 
P

pete

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

/* BEGIN new.c */

#include <stdio.h>

int main(void)
{
struct node {
struct node *next;
} *temp;
struct data {
int a;
float b;
char c;
struct node s;
};

struct data ds;
ds.a = 12345;
ds.b = 3.14f;
ds.c = 'X';
temp = (struct node *)&ds;
printf("%d\n%f\n%c\n",
((struct data *)temp) -> a,
((struct data *)temp) -> b,
((struct data *)temp) -> c
);
return 0;
}

/* END new.c */

N869
6.2.5 Types
[#27] All pointers to structure types
shall have the same representation and alignment
requirements as each other.

6.3.2.3 Pointers
[#7] A pointer to an object or incomplete type may be
converted to a pointer to a different object or incomplete
type. If the resulting pointer is not correctly aligned
for the pointed-to type, the behavior is undefined.
Otherwise, when converted back again, the result shall
compare equal to the original pointer.
 
C

Cong Wang

Shwetabh said:
Hi,
This is a question asked to me in an interview.
I haven't been able to figure out an answer for
it. Please try to see what can be done about the
following problem.

/* I have been given two structures */
struct node
{
struct node *next;
} *temp;

struct data
{
int a;
float b;
char c;
struct node s;
};

/* Now the question is to access the values of a,b,c using temp */

Thanks in advance.

Hmm,it is easy.ANSI C defines a macro 'offsetof' in stddef.h.

/*GCC 3.2.3*/
#ifndef __cplusplus
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
#else /* C++ */
/* The reference cast is necessary to thwart an operator& that might
 be applicable to MEMBER's type.See DR 273 for details.*/
#define offsetof(TYPE, MEMBER) (reinterpret_cast <size_t> \
(&reinterpret_cast <char &>(static_cast <TYPE *> (0)->MEMBER)))
#endif /* C++ */
 
N

Netocrat

Ick.
The first thing I'd say is "I hope the code at this company doesn't do
silly things like this."

Why not? This is the way linked lists are implemented in the Linux
kernel. It's a good approach - embed the node in the data rather than
the other way around and hence do away with the need for a data
pointer. And it's portable.
 
R

Richard Bos

Netocrat said:
Why not? This is the way linked lists are implemented in the Linux
kernel. It's a good approach - embed the node in the data rather than
the other way around and hence do away with the need for a data
pointer. And it's portable.

Myeah - but it would be clearer to have the node first, rather than
last. That way, when you've traversed the list (or, for that matter,
tree, or whatever structure; this trick works with more than lists), you
can simply do dataptr=(struct data *)nodeptr, and avoid fiddling with
offsetof or worse. That cast must work correctly, since a struct must
have the same address as its first member.

Richard
 
N

Netocrat

Myeah - but it would be clearer to have the node first, rather than
last. That way, when you've traversed the list (or, for that matter,
tree, or whatever structure; this trick works with more than lists), you
can simply do dataptr=(struct data *)nodeptr, and avoid fiddling with
offsetof or worse. That cast must work correctly, since a struct must
have the same address as its first member.

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.
 
C

Chris Torek

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.

In this case, one can use something like the BSD <sys/queue.h>
macros. They are type-safe, and handle this problem. Their main
drawback is some syntactic ugliness:

struct vnode {
[snippage]
TAILQ_ENTRY(vnode) v_freelist; /* vnode freelist */
LIST_ENTRY(vnode) v_mntvnodes; /* vnodes for mount point */
[snippage]
LIST_ENTRY(vnode) v_synclist; /* vnodes with dirty buffers */
[snippage]
};

(from a BSD <sys/vnode.h>). A vnode may simultaneously be on up to
three lists, one a "tailq" (which maintains a pointer to the end of
the list) and two regular "list"s.
 
N

Netocrat

Totally agree when the data can only be a member of one list. But often
(as is the case in the linux kernel), objects are members of multiple
lists.

In this case, one can use something like the BSD <sys/queue.h>
macros. They are type-safe, and handle this problem. Their main
drawback is some syntactic ugliness:

struct vnode {
[snippage]
TAILQ_ENTRY(vnode) v_freelist; /* vnode freelist */
LIST_ENTRY(vnode) v_mntvnodes; /* vnodes for mount point */
[snippage]
LIST_ENTRY(vnode) v_synclist; /* vnodes with dirty buffers */
[snippage]
};

(from a BSD <sys/vnode.h>). A vnode may simultaneously be on up to
three lists, one a "tailq" (which maintains a pointer to the end of
the list) and two regular "list"s.

I wasn't aware of that approach.

I don't know if this is OT but I'll assume it's just within bounds...

Without knowing how the macros are (typically) defined I can't
understand why there is a three-list limit. The linux approach allows
for the same semantics with the difference that an unlimited number of
lists may be used. There is also no distinction between list types -
they are all doubly-linked and circular. There is no "list end" but
you are free to maintain a pointer to any element and use it as such.

Linux uses a macro not to define the structure member, but to
initialise it. Your example would equate to:
struct vnode {
...
struct list_head v_freelist; /* vnode freelist */
...
struct list_head v_mntvnodes; /* vnodes for mount point */
...
struct list_head v_synclist; /* vnodes with dirty buffers */
...
} vnode_instance;
...
INIT_LIST_HEAD(vnode_instance.v_freelist);
... and for the others ...

Is this any less ugly to you? I didn't find BSD's approach all that
ugly (but yes, a little).
 
C

CBFalconer

Netocrat said:
.... snip ...

Totally agree when the data can only be a member of one list. But
often (as is the case in the linux kernel), objects are members of
multiple lists.

If you generalize the idea of a list to an organized data base of
some form, you can see an immediate counter example in my id2id
program. That fundamentally works with two data bases, one for
'search objects' and one for 'replacement objects'. However each
replacement object can be specified by more than one search
object. In fact the search and replacement objects can even be the
same, although you may not like the result.

Thus each input 'object' belongs to both tables. The only thing
that keeps memory ownership under control is a convention about
which table controls which storage. This symettry allows the
programs actions to be reversed (usually).

See: <http://cbfalconer.home.att.net/download/id2id-20.zip>
 
C

Chris Torek

I wasn't aware of that approach.

I don't know if this is OT but I'll assume it's just within bounds...

Without knowing how the macros are (typically) defined I can't
understand why there is a three-list limit.

There is no such limit: the number of lists an object can be on
depends only on the number of "list-of" elements within the object.
The example I used (vnodes) merely happens to need to be on three
(or sometimes fewer) lists.
Linux uses a macro not to define the structure member, but to
initialise it. Your example would equate to:
struct vnode {
...
struct list_head v_freelist; /* vnode freelist */
...
struct list_head v_mntvnodes; /* vnodes for mount point */
...
struct list_head v_synclist; /* vnodes with dirty buffers */
...
} vnode_instance;
...
INIT_LIST_HEAD(vnode_instance.v_freelist);
... and for the others ...

Is this any less ugly to you?

Not really :)

The main problem with the above -- which was also the case with
the CMU list macros from which the BSD ones were initially derived
-- is that they are (obviously) not type-safe. Consider a structure
that heads two *different* kinds of lists. For instance, a struct
describing a chapter in a book might have a list of pages, and also
a list of sections:

struct page;
struct section;

struct chapter {
...
LIST_HEAD(, page) pages;
LIST_HEAD(, section) sections;
...
};
struct page {
...
LIST_ENTRY(page) list;
...
};
struct section {
...
LIST_ENTRY(section) list;
...
};

If you want to iterate through the sections of a chapter, you write:

/* input: ch is the chapter */
struct section *sec;

LIST_FOREACH(sec, ch->sections, list) { ... }

If you accidentally try to iterate "sec" through the "pages":

LIST_FOREACH(sec, ch->pages, list) { ... }

you get a compile-time diagnostic. Of course, if you want to
iterate through the pages, just use:

struct page *pg;
LIST_FOREACH(pg, ch->pages, list) { ... }

If the list head and list entry data structures do not have some
sort of type-name decoration -- the parameters to the BSD sys/queue.h
macros, for instance -- you cannot propagate the element-type into
them (because C is statically typed).

(It is hard to get the various fields mixed up in simple examples
like these, but -- speaking from experience -- it is all too easy
to get them mixed up in real code.)
 
N

Netocrat

Chris said:
There is no such limit:

I misread that. You meant the specific example structure, not the
approach in general.
The main problem with the [Linux approach] -- which was also the case with
the CMU list macros from which the BSD ones were initially derived
-- is that they are (obviously) not type-safe.

In terms of Standard C you are 100% right. On an off-topic note, since
gcc is specified as "the" Linux kernel compiler, its extensions can be
relied upon, in particular the typeof extension... so type safety is
not lost after all.

The use of the offsetof macro has the benefit that you don't need to
define a distinct "list head". All list operations are agnostic to
position - e.g. when you are iterating a list you can use any member as
the starting point. BSD's lists OTOH require that you have access to
the specific item that is the list head.

So in Standard C the loss of type-safety would have to be balanced with
this benefit; I strongly suspect type-safety would win out.

I appreciate you pointing out the BSD approach which is very
interesting to compare.

<snip>

Anyone else wanting to compare the approaches can find source for the
BSD lists at:

http://www.nendai.nagoya-u.ac.jp/global/sys/HTML/S/sys queue.h.html

and for the Linux kernel lists at:

http://lxr.linux.no/source/include/linux/list.h
 
N

Netocrat

CBFalconer said:
If you generalize the idea of a list to an organized data base of
some form, you can see an immediate counter example in my id2id
program. That fundamentally works with two data bases, one for
'search objects' and one for 'replacement objects'.

I don't quite follow how this relates.
However each
replacement object can be specified by more than one search
object. In fact the search and replacement objects can even be the
same, although you may not like the result.

When specifying identical search and replacement tokens the file was
unchanged, as expected - why wouldn't I like that?
Thus each input 'object' belongs to both tables. The only thing
that keeps memory ownership under control is a convention about
which table controls which storage. This symettry allows the
programs actions to be reversed (usually).

I skimmed the code enough to get the idea. Some cases are impossible
to reverse regardless. You point out the case where the replacement
already exists; another situation is where a replacement object is
specified by multiple search objects.

<Off-topic>
Neat utility. I once wrote a similar batch search and replacement tool
in C but mine was quick and non-hashed; had I been thinking I would
have just used sed. I suspect that due to its hashing id2id would be a
lot faster than sed at the task for which it's designed, and it's
certainly easier to use.
</Off-topic>
 
N

Netocrat

[Comparing the BSD and Linux kernel linked lists; and acknowledging
that BSD's approach has standards-compliant type-safety whilst Linux's
relies on the gcc typeof extension]
The use of the offsetof macro has the benefit that you don't need to
define a distinct "list head". All list operations are agnostic to
position - e.g. when you are iterating a list you can use any member as
the starting point. BSD's lists OTOH require that you have access to
the specific item that is the list head.

That was innaccurate. The requirement for a list head when iterating a
BSD list (and at all) is actually due to BSD's lists not being
circular, rather than their lack of offsetof usage.

Given a list node structure, Linux's offsetof macro approach gives you
"direct" access to the node's corresponding data structure, whereas
BSD's node-defined-by-macro approach instead gives you a pointer to the
next/previous node's corresponding data structure. For a double-linked
circular list (which Linux uses), given that under BSD's approach you
can access a node's corresponding data structure indirectly through a
construct like node->prevdata->nextdata anyway, there doesn't seem to
be a compelling advantage to the offset macro approach after all.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top