Question about a struct declaration

heavyz

In C, if I want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;

Usually I use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference. Would somebody tell me which method is
recommended, and why? Thanks in advance.

ZHENG Zhong
 
Richard Bos

heavyz said:
In C, if i want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;
                  ^^^^^^
This is unwise. Identifiers beginning with _ and a lower-case letter are
only allowed for the compiler itself, not for you, except when they're
declared at block scope; but most struct types (a.o.t. struct _objects_)
are not declared at block scope, because they're more useful with file
scope. (And identifiers beginning with _ and an upper-case letter or
with __ are not allowed for us mere users, full stop.)
The different identifiers are unnecessary, anyway. C has separate
namespaces for struct identifiers and for normal ones, so "struct dummy"
and plain "dummy" do not clash, which means that

typedef struct dummy { ... } dummy;

is (when fleshed out, of course) a perfectly valid declaration of both a
struct type and a typedef'ed alias for it; and IYAM, if you want or need
to use both a struct name and a typedef for it, having them both the
same is clearer, as well.
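To make the name-space point concrete, here is a minimal sketch. The member `value` and the helpers `sum_dummies` and `demo_same_name` are made up for illustration and appear nowhere in the posts above.

```c
/* One identifier serves as both struct tag and typedef name. Tags live in
   their own name space, so the two do not clash in C. */
typedef struct dummy {
    int value;                 /* hypothetical member, for illustration */
} dummy;

/* Both spellings name the same type, so they can be mixed freely. */
int sum_dummies(const struct dummy *a, const dummy *b)
{
    return a->value + b->value;
}

int demo_same_name(void)
{
    struct dummy x = { 40 };   /* declared via the tag */
    dummy y = { 2 };           /* declared via the typedef */
    return sum_dummies(&x, &y);
}
```

Either spelling can be passed to the same function because there is only one type involved; the typedef is purely an alias.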
Usually i use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference.. Would somebody tell me which method is
recommended, and why? Thanks in advance.

In most aspects it's a matter of taste, but IMO, unless you have good
reason not to, the first one is best, because it's simplest.
One good reason not to would be if you're creating an abstract data type
library and do not want to tell your users how you're implementing your
ADT; you'd then keep your struct definition in the ADT library code
itself, and only put the typedef in the associated header.
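A rough sketch of that ADT arrangement, squeezed into one listing. The `counter` type and its functions are hypothetical names chosen for illustration; in a real library the first part would sit in a header and the second part in the library's source file.

```c
#include <stdlib.h>

/* ---- what would go in the public header (say, counter.h) ---- */
typedef struct counter counter;     /* incomplete type: layout not shown */

counter *counter_new(void);
void counter_bump(counter *c);
int counter_get(const counter *c);
void counter_free(counter *c);

/* ---- what would stay in the library source (say, counter.c) ---- */
struct counter {
    int n;                          /* users never see this member */
};

counter *counter_new(void)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->n = 0;
    return c;
}

void counter_bump(counter *c) { c->n++; }
int counter_get(const counter *c) { return c->n; }
void counter_free(counter *c) { free(c); }
```

Client code that only sees the header part can hold and pass `counter *` values but cannot touch `n` or take `sizeof(counter)`, which is the hiding effect described above.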

Richard
 
Martin Ambuhl

heavyz said:
In C, if i want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;

Usually i use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference.. Would somebody tell me which method is
recommended, and why? Thanks in advance.

#1 is a structure-type-definition. It is sufficient, and rarely is more
needed. The type throughout the code is 'struct dummy' unless there is a
later typedef.

#2 and #3 are used to provide an alias for the struct.
In #2, there is no struct tag and so the type must be 'dummy' throughout.
This hides the fact that it is a struct and is probably a bad idea.
In #3, you create two names for the type, 'struct _dummy' and 'dummy'.
It is a bad idea to use identifiers that begin with underscores unless you
are thoroughly familiar with the exact restrictions on such names. That
you are asking this question suggests that you are not. So #3 could be
done as either (note the absence of the underscore)
3a. typedef struct dummy { /* ... */ } dummy;
or in two parts:
3b. struct dummy { /* ... */ };   /* structure-type-definition */
    typedef struct dummy dummy;   /* typedef creating alias */

I prefer #1 in almost all situations.

#2 is probably better than your #3 or my #3a or #3b, since there is a
point to not having two ways of identifying the type. Of #3, #3a, and
#3b, #3 is the worst. #3b has real uses: namely to provide a
structure-type-definition in implementation files and providing a
typedef in a header where you want to make the type opaque to the user,
and he will refer to objects only through pointers to them.

Decide how you want the programmer to refer to a struct. If you want
him to use 'struct dummy', use #1. If you want him to use 'dummy', use
#2. If you want opaque datatypes referred to through pointers of the
type 'dummy *', use #3b.
 
Keith Thompson

heavyz said:
In C, if i want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;

Usually i use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference..

Method 2 has the disadvantage that there's no name for the type until
you reach the end of the declaration, so you can't declare a member of
type "dummy*" (e.g., for a node in a tree or linked list).
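A small sketch of the self-reference issue. The `node` type and the `list_length` helper are hypothetical and only serve the illustration.

```c
#include <stddef.h>

/* Method 2 cannot express this: inside an untagged struct the typedef
   name "node" does not exist yet, so "node *next;" would not compile.
   With a tag, the member can be written as "struct node *". */
typedef struct node {
    int data;
    struct node *next;
} node;

/* Count the nodes in a NULL-terminated singly linked list. */
size_t list_length(const node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}
```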

Method 3, as Richard Bos pointed out, infringes on the
implementation's name space. This can be fixed by using the same
identifier for the typedef and the struct tag:

typedef struct dummy { ... } dummy;

or by using some other convention if you don't like re-using the same
identifier for some reason (perhaps due to a limitation in your
development environment):

typedef struct dummy_s { ... } dummy;
Would somebody tell me which method is
recommended, and why? Thanks in advance.

All three are recommended. It just depends on who you ask. This
happens to be a controversial issue (but not one that most of us get
too worked up about).

Personally, I prefer method 1 unless there's a good reason to hide the
fact that the type is a struct. Others will prefer method 2 or 3
because they like to have a single-word name for the type and don't
feel the need to be reminded that it's a struct every time they
mention it.

If you're working on existing code, follow the style used in that
code; consistency is more important than any small benefit one style
might have over another. If you're writing your own original code,
pick a style and stick to it; you're free to follow either my advice,
or the advice of those poor misguided souls who disagree with me for
their own unfathomable reasons.
 
Andrey Tarasevich

heavyz said:
In C, if i want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;

Usually i use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference.. Would somebody tell me which method is
recommended, and why? Thanks in advance.

With method 1 you'll have to refer to your data type as 'struct dummy'
all the time, i.e. use two words. I personally don't consider this a big
problem, but many people might prefer to be able to refer to such a type
with one word instead of two. So, they create a typedef name for the
type, as in method 3. Now they can refer to it as 'dummy'.

The same effect can be achieved by an additional typedef after the
method 1 declaration:

typedef struct dummy dummy;

Note that it is not really necessary to use a different identifier for
the struct tag and for the typedef name (and using a name that begins
with a '_' is not a good idea in any case). This is perfectly valid as well

typedef struct dummy { ... } dummy;

However, if one day you need to do something like this

typedef struct dummy *dummy;

and use it in a dual-language header (C and C++), you'll run into
problems with C++, since in C++ the latter declaration is illegal. If
this is an issue in your case, then it is a good idea to choose
different identifiers for struct tag and the typedef name. Otherwise,
the same identifier can safely be used (unless I'm missing some other
issue).
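For illustration, a minimal sketch of the pointer typedef in question, compiled as C. The member `value` and the function `dummy_value` are invented for the example.

```c
/* Valid C: "dummy" (an ordinary identifier) names a pointer type, while
   "struct dummy" (a tag) names the struct itself. A C++ compiler rejects
   the typedef, because there the tag also acts as a type name. */
struct dummy {
    int value;                 /* hypothetical member */
};
typedef struct dummy *dummy;

int dummy_value(dummy d)       /* d is a pointer, though the name hides it */
{
    return d->value;
}
```

Giving the tag and the typedef different identifiers, for example a tag with a suffix of your own choosing, avoids the clash in a header shared between the two languages.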

Method 2 might be OK, but, as others have already mentioned, it won't
work when you need to refer to the struct type from inside its own
definition.
 
Andrey Tarasevich

Richard said:
Andrey Tarasevich said:

Please don't hide pointers in typedefs.

Er... There are situations when you don't, and there are situations when
that is the correct thing to do.

If you want to use the type as a pointer (i.e. keep its pointer nature
exposed to the client), then hiding the fact that it is a pointer in a
typedef is indeed a pretty useless idea.

However, if you want to declare a generic "handle" type, whose specific
nature is not supposed to be exploited by the user in any way, a pointer
hidden in a typedef is a standard, widely used and accepted idiom. A
classic example would be the interface of 'pthreads' library, where
'pthread_t', 'pthread_mutex_t' etc. might easily be pointers (or might
not be pointers). If they are in fact pointers in a given
implementation, you'll probably see typedefs in the interface header
file that'll look pretty much as the one above. And there's nothing
wrong with it.
I'm surprised that nobody has yet mentioned the very simple solution to
this problem: i.e. forward declaration.

typedef struct dummy_ dummy;

struct dummy_
{
    dummy *prev;
    dummy *next;
    int data;
};


I don't exactly see how it applies to method 2 (since that's what I was
talking about), since method 2 is specifically about the _anonymous_ struct.

As for the other methods, nobody mentioned that "simple solution to this
problem" simply because nobody thinks there's a problem here. Your
solution is in no way simpler than using the struct tag to self-refer to a
tagged struct type. All these methods are in the FAQ anyway.
 
Andrey Tarasevich

Richard said:
...
Er... I disagree.

Well, then, can you come up with a different implementation hiding
technique that would satisfy the following two requirements:

1) Client code can define objects of type 'T'
2) Client code doesn't see the implementation details of the concept
type 'T' represents
?

I actually can do that, but one alternative technique I know is no
better than a typedef-ed pointer.
...it's *still* a bad idea to hide pointeriness from the user - in my
opinion, of course. This is, however, not a matter of C correctness.

Hide the pointeriness? No, it actually _doesn't_ hide the pointeriness.
The pointeriness will be clearly visible in the interface header, in the
typedef. The important part of the concept of a "handle" is that the
inner workings of the attached entity are hidden, but the facts that the
handle itself is small, copyable with '=' (and copyable efficiently),
comparable with '==', etc. remain exposed. This is what is conveyed
through pointeriness in this case.
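A sketch of what such a handle might look like, under stated assumptions: the names `widget`, `widget_impl`, `widget_create`, `widget_id`, `widget_destroy` and `handles_alias` are all hypothetical, and the interface and implementation are shown in one listing only for brevity.

```c
#include <stdlib.h>

/* Interface: the pointer is visible here, in the typedef, and nowhere else. */
typedef struct widget_impl *widget;

widget widget_create(int id);
int widget_id(widget w);
void widget_destroy(widget w);

/* Implementation, normally kept in a separate source file. */
struct widget_impl {
    int id;
};

widget widget_create(int id)
{
    widget w = malloc(sizeof *w);
    if (w != NULL)
        w->id = id;
    return w;
}

int widget_id(widget w) { return w->id; }
void widget_destroy(widget w) { free(w); }

/* Client-style code: it only copies and compares the handle. */
int handles_alias(widget a, widget b)
{
    widget copy = a;           /* cheap copy with '=' */
    return copy == b;          /* identity comparison with '==' */
}
```

Nothing in `handles_alias` depends on the handle being a pointer; the same client code would still compile if the typedef named an integer type instead.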
Yeah, well, I've only got one pair of 'ands, ain't I?

I don't know what to say...
 
Willem

Andrey wrote:
) Richard Heathfield wrote:
)> ...
)> Er... I disagree.
)
) Well, then, can you come up with a different implementation hiding
) technique that would satisfy the following two requirements:

Richard doesn't want you not to use typedef'd pointers like that,
he wants you to typedef the pointer as typedef T <something>
and then have the client code use it as *T

In other words: hide the type, but don't hide the pointeryness.


At least I think that's what he means.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Richard

Andrey Tarasevich said:
Well, then, can you come up with a different implementation hiding
technique that would satisfy the following two requirements:

1) Client code can define objects of type 'T'
2) Client code doesn't see the implementation details of the concept
type 'T' represents
?

I actually can do that, but one alternative technique I know is no
better than a typedef-ed pointer.


Hide the pointeriness? No, it actually _doesn't_ hide the
pointeriness. The pointeriness will be clearly visible in the
interface header, in the typedef.

Exactly. It's not clearly indicated at the line where it's used. It's one
of the few times I would agree with Heathfield on a style issue. In the same
vein I despise C++ for allowing operator overloading - it's nigh on
impossible to "read" the code unless you have a brain the size of a
planet and have memorized all the class subtleties. It's also why I
always "read" code in a debugger by stepping through. Those nice little
"variable changed" indicators are a great help.
 
Willem

Richard Heathfield wrote:
) Um, not quite. I *don't* want to typedef the pointer in any way at all.
) Rather, I want to expose the pointeriness (or, if you prefer,
) pointeryness), by leaving the * out of the typedef altogether, and letting
) the user-programmer stick it in instead.

Er, yes, that was poorly worded of me. I meant to say what you just said.


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
Andrey Tarasevich

Richard said:
...
Yes, by carefully interpreting your words in a way convenient to myself.
> ...
Sigh...


Right, so you gain nothing in terms of information hiding,

Of course, I don't. The point is to actually _lose_ a little bit of that
information hiding, unhide it in a reasonably controlled fashion.
"Information hiding" is not always as black-and-white as "hide
everything" vs. "expose everything", you know.
but lose the
convenience to the user-programmer of being reminded through usage that
he's dealing with a here-be-dragons pointer.

Reminded through usage? How? A pointer to an undefined type cannot be
used in any truly pointer-specific way, i.e. it can't be meaningfully
dereferenced. Whatever usage I mentioned before is not really
pointer-specific, in a sense that it can also be applied to an integer type.
Translation: it's going to take me a long time to persuade everyone /not/
to accept that idiom, working alone.

I'd say you'd have to spend more effort fighting the "prior art",
including but not limited to the aforementioned pthreads library and
standard library implementations that typedef our beloved 'va_list' as a
pointer type.
 
Andrey Tarasevich

Willem said:
Andrey wrote:
) Richard Heathfield wrote:
)> ...
)> Er... I disagree.
)
) Well, then, can you come up with a different implementation hiding
) technique that would satisfy the following two requirements:

Richard doesn't want you not to use typedef'd pointers like that,
he wants you to typedef the pointer as typedef T <something>
and then have the client code use it as *T

In other words: hide the type, but don't hide the pointeryness.
At least I think that's what he means.

I understand that perfectly well. I'm talking about the situations when
a library specification does not want to restrict implementations to using
pointer types and at the same time does not want to prevent them from
doing that. The 'pthreads' library specification is one example of that.
 
Andrey Tarasevich

Richard said:
Exactly. It's not clearly indicated at the line where it's used. It's one
of the few times I would agree with Heathfield on a style issue.

It is not a style issue. It is an issue of implementing a given abstract
specification. Some abstract specification of some library exposes type
'T', objects of type 'T' have to be user-definable, and at the same
time the inner workings of what's really hiding behind 'T' need (or are
preferred) to be hidden from the user (I already provided examples:
'pthread_t' in 'pthreads', or simply 'va_list' in our standard library).
A perfectly viable approach in this case is to use a typedef-ed
pointer to an undefined struct. There are no style issues here, and even
if someone is still inclined to see "style issues" here for some reason,
they are largely secondary.
 
Richard

Andrey Tarasevich said:
It is not a style issue. It is an issue of implementing a given
abstract specification. Some abstract specification of some library
exposes type 'T', objects of type 'T' have to be user-definable, and
at the same time the inner workings of what's really hiding behind 'T'
need (or are preferred) to be hidden from the user (I already provided
examples: 'pthread_t' in 'pthreads', or simply 'va_list' in our
standard library). A perfectly viable approach in this case is to
use a typedef-ed pointer to an undefined struct. There are no style
issues here, and even if someone is still inclined to see "style
issues" here for some reason, they are largely secondary.

Aha. I follow you now. Thanks for the explanation.
 
Keith Thompson

Richard Heathfield said:
Andrey Tarasevich said:


Please don't hide pointers in typedefs.
[...]

Hiding pointers in typedefs is *usually* a bad idea. The only
exception is when you want the type to be opaque, meaning that client
code will never treat it as a pointer. A more common approach is to
export a pointer to an opaque type, such as stdio's FILE*, but making
the pointer type itself opaque isn't necessarily evil.

(I suppose "isn't necessarily evil" might not be the highest praise I
could have offered.)

But the real problem with

typedef struct dummy *dummy;

is that the identifier "dummy" is used both for the struct tag and for
a *pointer to* the struct type. It's common to use the same identifier
for a struct tag and a typedef name for the struct itself, as in:

typedef struct dummy dummy;

But making "struct dummy" a struct and "dummy" a pointer will only
cause confusion, whether you're trying to write dual-language code or
not. Pick two different names.

I think the generic word "dummy" obscured the problem in this case.
In real life, the name(s) would be more descriptive, perhaps something
like:

typedef struct widget_info *handle;
 
Richard Tobin

Keith Thompson said:
Hiding pointers in typedefs is *usually* a bad idea.
Nonsense!

The only
exception is when you want the type to be opaque, meaning that client
code will never treat it as a pointer.

Much more reasonable.

In my experience, creating opaque types is the *usual* reason for
typedefing pointers, so it is *usually* a good idea...
But the real problem with

typedef struct dummy *dummy;

is that the identifier "dummy" is used both for the struct tag and for
a *pointer to* the struct type.

I don't see that as a problem - "dummy" is the pointer type and
"struct dummy" is the struct type. Used consistently, that would be a
perfectly reasonable convention. The usual, opaque, use has the short
name, and the rare, struct, use has "struct" to draw attention to it.
It's common to use the same identifier
for a struct tag and a typedef name for the struct itself, as in:

typedef struct dummy dummy;

Again, I don't find that "common". Increasingly libraries export
mostly-opaque types, and client programs rarely have occasion to
use the struct type itself.

-- Richard
 
Andrey Tarasevich

Keith said:
But the real problem with

typedef struct dummy *dummy;

is that the identifier "dummy" is used both for the struct tag and for
a *pointer to* the struct type. It's common to use the same identifier
for a struct tag and a typedef name for the struct itself, as in:

typedef struct dummy dummy;

But making "struct dummy" a struct and "dummy" a pointer will only
cause confusion, whether you're trying to write dual-language code or
not. Pick two different names.

I'd say that in the implementation hiding situations giving the same
name to the exposed type and to the hidden-struct tag is not a problem,
but (quite the opposite) a way to remind the user, who's thoughtlessly
trying to use 'struct dummy' in his code, that there's something else
named 'dummy' he should really be using instead.
I think the generic word "dummy" obscured the problem in this case.
In real life, the name(s) would be more descriptive, perhaps something
like:

typedef struct widget_info *handle;

No, _that_ would be a bad idea. Giving a descriptive name to something
that is supposed to be hidden is a way to encourage the user to start
making unwarranted assumptions and even attempt to break the implementation
protection. There's no reason for the user to know anything
"descriptive" about what's hiding behind the 'handle' (besides the
information given in the specification).
 
John Bode

In C, if i want to declare a struct, there are 3 methods to do so:

1: struct dummy { ... };
2: typedef struct { ... } dummy;
3: typedef struct _dummy { ... } dummy;

Usually i use the first method to declare my structs, but in many open
source projects (such as libxml2), the third method is used. I am just
curious about the difference.. Would somebody tell me which method is
recommended, and why? Thanks in advance.

ZHENG Zhong

Using the typedef name (methods 2 and 3) allows you to de-emphasize
the "struct-ness" of dummy; you're basically telling the user of that
type to think of dummy as an atomic entity (much
like a scalar), not as a collection of items. See FILE in stdio.h as
an example; we're not expected to know anything about the details of
FILE items, just to pass pointers to them between the various stdio
functions.
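As a small illustration of that usage pattern, the sketch below pushes a string through a temporary stream using only standard stdio calls. The function name `roundtrip_through_file` is made up for the example, and it returns -1 if the environment cannot provide a temporary file.

```c
#include <stdio.h>
#include <string.h>

/* Writes a short string to a temporary stream and reads it back. The FILE
   object is only ever touched through a pointer handed to stdio functions.
   Returns 1 on success, 0 on mismatch, -1 if no temporary file is available. */
int roundtrip_through_file(void)
{
    char buf[16] = "";
    FILE *fp = tmpfile();
    int ok;

    if (fp == NULL)
        return -1;

    ok = fputs("dummy", fp) != EOF;
    rewind(fp);
    ok = ok && fgets(buf, sizeof buf, fp) != NULL
            && strcmp(buf, "dummy") == 0;
    fclose(fp);
    return ok;
}
```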

As others have pointed out, identifiers with leading underscores
(_dummy) are reserved for the implementation, and should not be used
in application code.
 
