struct by value

ImpalerCore

When creating an API for a struct in C, one of the questions that
recently came up is how to pass that struct or return a struct from a
function. Often the answer is obvious, to use pointers, particularly
for large structs for performance reasons.

However, as the struct size shrinks, the choice of passing struct by
value or by pointer becomes less clear to me. Let me use a simple
struct as an example.

struct greg_ymd
{
int16_t year;
int8_t month;
int8_t day;
};

I use this struct to represent a date in the Gregorian calendar
(fields not offset to start from 0). Let's say that I want to have a
function that adds a certain number of days to this ymd struct. There
are a couple of options that come to mind.

1. struct greg_ymd add_days( const struct greg_ymd ymd, int days );
2. <return type> add_days( struct greg_ymd* pymd, int days );

First of all, what are the performance implications of using struct
pass-by-value for smallish structs?
Is there a rule of thumb of struct size for an interface API that you
convert all struct parameter passing to use pointers?
Have you used struct by value parameter passing or return value at all
in your API design experience?

The struct pass-by-value version of the interface avoids the pesky
NULL pointer argument issue, but still I can't get away from it
completely since I use 'int (*compare_function)( const void* p, const
void* q )' to define the sorting property in my generic containers.
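
For reference, a comparison function of that shape for the greg_ymd struct might look like the sketch below. The names greg_ymd_compare and greg_ymd_sort are made up for illustration, and qsort stands in for the generic container, which is an assumption. It only shows why the pointer form cannot be avoided here: the container hands over element addresses as const void*.

```c
#include <stdint.h>
#include <stdlib.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

/* qsort-style comparator: the elements arrive as const void*,
   so the struct is reached through a pointer no matter how the
   rest of the API passes it. */
int greg_ymd_compare( const void* p, const void* q )
{
  const struct greg_ymd* a = p;
  const struct greg_ymd* b = q;

  if ( a->year != b->year )   return a->year  < b->year  ? -1 : 1;
  if ( a->month != b->month ) return a->month < b->month ? -1 : 1;
  if ( a->day != b->day )     return a->day   < b->day   ? -1 : 1;
  return 0;
}

/* Hypothetical helper: sort an array of dates in ascending order. */
void greg_ymd_sort( struct greg_ymd* dates, size_t count )
{
  qsort( dates, count, sizeof dates[0], greg_ymd_compare );
}
```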

Thanks for your time.
 
Keith Thompson

ImpalerCore said:
When creating an API for a struct in C, one of the questions that
recently came up is how to pass that struct or return a struct from a
function. Often the answer is obvious, to use pointers, particularly
for large structs for performance reasons.

However, as the struct size shrinks, the choice of passing struct by
value or by pointer becomes less clear to me. Let me use a simple
struct as an example.

struct greg_ymd
{
int16_t year;
int8_t month;
int8_t day;
};

I'd say that "small" structs can sensibly be passed by value (unless of
course the function needs to modify them), and "large" structs
should be passed by pointer for performance reasons.

I have no particular guidance to offer about the dividing line between
"small" and "large". I'd say that anything no larger than a pointer is
certainly "small", but beyond that ...

I know that's not very helpful.
I use this struct to represent a date in the gregorian calendar
(fields not offset to start from 0). Let's say that I want to have a
function that adds a certain number of days to this ymd struct. There
are a couple of options that come to mind.

1. struct greg_ymd add_days( const struct greg_ymd ymd, int days );

The "const" doesn't really serve any purpose here, any more than
"const int days" would.
2. <return type> add_days( struct greg_ymd* pymd, int days );

Here a "const" would be helpful:
<return type> add_days( const struct greg_ymd *pymd, int days );
since it guarantees to the caller that the object pointed to by
the argument won't be modified.
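
As a small illustration of that guarantee, here is a sketch of a read-only function taking a pointer to const. The function greg_ymd_day_of_year and the helper is_leap are hypothetical and not part of the API under discussion; the sketch assumes month is in 1..12.

```c
#include <stdint.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

static int is_leap( int year )
{
  return ( year % 4 == 0 && year % 100 != 0 ) || year % 400 == 0;
}

/* Hypothetical read-only query: the const in the parameter type
   tells the caller that *pymd is not changed by the call. */
int greg_ymd_day_of_year( const struct greg_ymd* pymd )
{
  /* days elapsed before the first of each month in a non-leap year */
  static const int days_before[12] =
    { 0, 31, 59, 90, 120, 151, 181, 212, 243, 273, 304, 334 };
  int n = days_before[pymd->month - 1] + pymd->day;

  if ( pymd->month > 2 && is_leap( pymd->year ) )
    ++n;

  /* pymd->day = 1;   would be rejected: assignment through pointer to const */
  return n;
}
```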
First of all, what are the performance implications of using struct
pass-by-value for smallish structs?

It depends on the compiler.

[...]
 
ImpalerCore

I'd say that "small" structs can sensibly be passed by value (unless of
course the function needs to modify them), and "large" structs
should be passed by pointer for performance reasons.

I have no particular guidance to offer about the dividing line between
"small" and "large".  I'd say that anything no larger than a pointer is
certainly "small", but beyond that ...

I know that's not very helpful.

Yeah, I didn't really know how to answer the question either. I could
try to perform some experiments, but I don't know how useful they would
be. The two structs I pass by value in their interfaces are the
greg_ymd above and a timeval-like struct. I provide my own simply
because struct timeval doesn't seem to be standard as far as I can
tell.

struct c_timeval
{
long int sec;
long int usec;
};
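
To give an idea of what a by-value interface for such a struct can look like, here is a minimal sketch. The function c_timeval_add is hypothetical, and it assumes both inputs are already normalized with 0 <= usec < 1000000.

```c
struct c_timeval
{
  long int sec;
  long int usec;
};

/* Hypothetical by-value interface: both operands are copies and the
   result is returned as a new struct, so there is no pointer that
   could be NULL. Assumes 0 <= usec < 1000000 in both inputs. */
struct c_timeval c_timeval_add( struct c_timeval a, struct c_timeval b )
{
  struct c_timeval sum;

  sum.sec = a.sec + b.sec;
  sum.usec = a.usec + b.usec;
  if ( sum.usec >= 1000000L )
  {
    /* carry one second out of the microsecond field */
    sum.usec -= 1000000L;
    sum.sec += 1;
  }
  return sum;
}
```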
The "const" doesn't really serve any purpose here, any more than
"const int days" would.

Good point.
Here a "const" would be helpful:
    <return type> add_days( const struct greg_ymd *pymd, int days );
since it guarantees to the caller that the object pointed to by
the argument won't be modified.

Actually, in this case, the pymd value would be modified, and the
return value would be used to indicate some error status or void if I
ignore pymd == NULL error issues.

The difference in usage would look like the following.

\code example
int main(void)
{
struct greg_ymd ymd = { 2000, 1, 1 };

#ifdef PASS_STRUCT_BY_VALUE
ymd = add_days( ymd, 7 ); /* ymd is now { 2000, 1, 8 } */
#else /* PASS_STRUCT_BY_POINTER */
add_days( &ymd, 7 ); /* ymd is now { 2000, 1, 8 } */
#endif
}
\endcode
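
For completeness, the two variants could be implemented along the lines of the sketch below. Since C cannot have both under the one name add_days, the names add_days_value and add_days_ptr are made up here, and the return codes (0 for success, -1 for a NULL argument) are assumptions. The day-at-a-time loop is chosen because it is easy to verify, not because it is fast.

```c
#include <stddef.h>
#include <stdint.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

static int days_in_month( int year, int month )
{
  static const int len[12] =
    { 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31 };
  int leap = ( year % 4 == 0 && year % 100 != 0 ) || year % 400 == 0;

  return ( month == 2 && leap ) ? 29 : len[month - 1];
}

/* 1. By value: works on a copy and returns the new date.
      There is no pointer, so there is no NULL case to handle. */
struct greg_ymd add_days_value( struct greg_ymd ymd, int days )
{
  int year = ymd.year, month = ymd.month, day = ymd.day;

  for ( ; days > 0; --days )          /* step forward one day at a time */
  {
    if ( ++day > days_in_month( year, month ) )
    {
      day = 1;
      if ( ++month > 12 ) { month = 1; ++year; }
    }
  }
  for ( ; days < 0; ++days )          /* step backward one day at a time */
  {
    if ( --day < 1 )
    {
      if ( --month < 1 ) { month = 12; --year; }
      day = days_in_month( year, month );
    }
  }

  ymd.year = (int16_t)year;
  ymd.month = (int8_t)month;
  ymd.day = (int8_t)day;
  return ymd;
}

/* 2. By pointer: modifies the caller's object in place.
      Returns 0 on success, -1 if pymd is NULL (assumed convention). */
int add_days_ptr( struct greg_ymd* pymd, int days )
{
  if ( pymd == NULL )
    return -1;
  *pymd = add_days_value( *pymd, days );
  return 0;
}
```

The pointer version has to decide what a NULL argument means; the value version simply has no such state.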
It depends on the compiler.

And maybe the compiler optimization flags too.

Best regards,
John D.
 
Gene

When creating an API for a struct in C, one of the questions that
recently came up is how to pass that struct or return a struct from a
function.  Often the answer is obvious, to use pointers, particularly
for large structs for performance reasons.

However, as the struct size shrinks, the choice of passing struct by
value or by pointer becomes less clear to me.  Let me use a simple
struct as an example.

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;

};

I use this struct to represent a date in the gregorian calendar
(fields not offset to start from 0).  Let's say that I want to have a
function that adds a certain number of days to this ymd struct.  There
are a couple of options that come to mind.

1.  struct greg_ymd add_days( const struct greg_ymd ymd, int days );
2.  <return type> add_days( struct greg_ymd* pymd, int days );

First of all, what are the performance implications of using struct
pass-by-value for smallish structs?
Is there a rule of thumb of struct size for an interface API that you
convert all struct parameter passing to use pointers?
Have you used struct by value parameter passing or return value at all
in your API design experience?

The struct pass-by-value version of the interface avoids the pesky
NULL pointer argument issue, but still I can't get away from it
completely since I use 'int (*compare_function)( const void* p, const
void* q )'  to define the sorting property in my generic containers.

I have done some tests in the past with 32-bit gcc and Visual C. Both
would move structs to registers for call-by-value params, local
assignment (as in swapping struct values) and return values if the
size was 4 bytes or less.

In practice though the cost difference between allocating a struct in
the caller and passing a pointer to it vice accepting a return value
must be vanishingly small even if the struct does fit in a register.
When the struct doesn't fit in a register, the two return mechanisms
would be all but identical.
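
Whether a particular struct falls under a size threshold like the one above can be checked on each platform instead of guessed. The sketch below uses a made-up macro and function name; it reports sizes only and says nothing about what a given compiler actually keeps in registers.

```c
#include <stdint.h>
#include <stdio.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

/* Nonzero if the type is no larger than an object pointer.
   Hypothetical helper; the choice of threshold is an assumption. */
#define NO_LARGER_THAN_POINTER( type ) ( sizeof( type ) <= sizeof( void* ) )

/* Print the sizes that matter for the comparison on this platform. */
void report_struct_size( void )
{
  printf( "sizeof(struct greg_ymd) = %lu\n",
          (unsigned long)sizeof( struct greg_ymd ) );
  printf( "sizeof(void*)           = %lu\n",
          (unsigned long)sizeof( void* ) );
  printf( "no larger than pointer  = %d\n",
          NO_LARGER_THAN_POINTER( struct greg_ymd ) ? 1 : 0 );
}
```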

I've always used the "caller allocates and passes a pointer" convention
for APIs. This is for at least 5 reasons. (1) It's efficient enough in all cases
I've ever encountered. (2) A single convention means the caller must
remember only a single convention. (3) You get "in", "out", and "in
out" parameters all with the same mechanism. (4) You also get
"optional" parameters with the same mechanism by allowing NULL to mean
"not needed." This works for all 3 kinds. (5) It's clean in the
implementation to uniformly use -> for all (or nearly all) element
access rather than a mix of . and ->.

A convention I find very useful is this

typedef struct foo_s {

... big collection of fields ...

} FOO;

// Prepares the raw memory of the foo for initialization.
// Often just zeros or sets a flag so that later functions can
// see it hasn't been set up yet. May even do nothing. Declare anyway
// as a placeholder for the convention. Don't allocate anything here.
// Can't fail, so a void function.
void init_foo(FOO *foo);

// Make a foo fully ready for use. Allocate storage and other
// resources.
// Return 0 on success else an error code.
int set_up_foo(FOO *foo, ...);

// Release all resources of a foo, returning it to the init state.
// Return 0 on success else an error code. It's okay to set_up
// immediately after a clear.
int clear_foo(FOO *foo);

Then in code:

// A way to eliminate the clutter of &. YMMV.
FOO foo_instance[1];

// Always init after declaration even if the init does nothing.
init_foo(foo_instance);

.... yada yada ..

for (...) {

err = set_up_foo(foo_instance, ...);

... check for error.

... use the foo instance then release its resources

err = clear_foo(foo_instance);

... check for error.

... more processing that doesn't need the foo instance.
}
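
A concrete version of the convention might look like the sketch below. The fields of FOO, the size parameter of set_up_foo (standing in for the elided arguments), and the specific error codes are all assumptions made for illustration.

```c
#include <stdlib.h>

typedef struct foo_s {
  char *buf;    // hypothetical resource owned by the foo
  size_t len;   // size of buf in bytes, 0 while not set up
} FOO;

// Put the raw memory into the "not set up" state. Allocates nothing.
void init_foo(FOO *foo)
{
  foo->buf = NULL;
  foo->len = 0;
}

// Acquire resources. Returns 0 on success, else an error code
// (the particular codes are made up for this sketch).
int set_up_foo(FOO *foo, size_t len)
{
  if (foo->buf != NULL) return -1;   // already set up
  if (len == 0) return -2;           // nothing to allocate
  foo->buf = calloc(len, 1);         // zero-filled storage
  if (foo->buf == NULL) return -3;   // out of memory
  foo->len = len;
  return 0;
}

// Release resources and return to the init state, so that
// set_up_foo may be called again right away.
int clear_foo(FOO *foo)
{
  free(foo->buf);                    // free(NULL) is a no-op
  init_foo(foo);
  return 0;
}
```

With init_foo leaving the struct in a known state, clear_foo is safe to call even if set_up_foo was never called or failed.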

I've used this so often that it's a comfortable old friend. Once I
used setjmp/longjmp as poor-man's exceptions rather than returning
error codes. It's okay if error handling doesn't need any
granularity.
 
Eric Sosman

When creating an API for a struct in C, one of the questions that
recently came up is how to pass that struct or return a struct from a
function. Often the answer is obvious, to use pointers, particularly
for large structs for performance reasons.

However, as the struct size shrinks, the choice of passing struct by
value or by pointer becomes less clear to me. Let me use a simple
struct as an example.

struct greg_ymd
{
int16_t year;
int8_t month;
int8_t day;
};

I use this struct to represent a date in the gregorian calendar
(fields not offset to start from 0). Let's say that I want to have a
function that adds a certain number of days to this ymd struct. There
are a couple of options that come to mind.

1. struct greg_ymd add_days( const struct greg_ymd ymd, int days );

What does `const' buy you? Or, why not `const int days'? Other
than that, this seems plausible.
2.<return type> add_days( struct greg_ymd* pymd, int days );

This seems plausible, too.
First of all, what are the performance implications of using struct
pass-by-value for smallish structs?

Mu.
Is there a rule of thumb of struct size for an interface API that you
convert all struct parameter passing to use pointers?

When writing software that others might blame you for, the basic
rule of thumb is "Leave no fingerprints." ;-)
Have you used struct by value parameter passing or return value at all
in your API design experience?

Yes. For parameters, I'd estimate that I use a struct pointer
more frequently than a struct value, maybe 90%-10% or even more
lopsided. For function values, leaving "lookup-ish" functions aside,
I'd guess my own ratio is closer to 70%-30%. YMMV.
The struct pass-by-value version of the interface avoids the pesky
NULL pointer argument issue, but still I can't get away from it
completely since I use 'int (*compare_function)( const void* p, const
void* q )' to define the sorting property in my generic containers.

Sorry; I can't make sense of this. If "the pesky ... issue" is
that a struct pointer might be NULL, well, that can often be a help
rather than a harm: You can provide a NULL for an "optional" struct
pointer argument, but you cannot do so with a struct value. As for
your comparison function, the relevance escapes me: You've already
chosen to pass pointers, so what are you asking about?
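
A sketch of the "optional" argument idea, for illustration only: the struct format_opts and the function greg_ymd_format are hypothetical, and the defaults chosen for a NULL argument are assumptions.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

/* Hypothetical formatting options. */
struct format_opts
{
  char sep;   /* separator between fields */
  int pad;    /* nonzero: zero-pad month and day to two digits */
};

/* pymd is required; opts is optional. Passing NULL for opts means
   "use the defaults", which a struct passed by value cannot express.
   Returns what snprintf returns. */
int greg_ymd_format( char* buf, size_t size,
                     const struct greg_ymd* pymd,
                     const struct format_opts* opts )
{
  char sep = opts ? opts->sep : '-';
  int width = ( opts == NULL || opts->pad ) ? 2 : 1;

  return snprintf( buf, size, "%d%c%0*d%c%0*d",
                   pymd->year, sep, width, pymd->month,
                   sep, width, pymd->day );
}
```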
 
Jon

ImpalerCore said:
When creating an API for a struct in C, one of the questions that
recently came up is how to pass that struct or return a struct from a
function. Often the answer is obvious, to use pointers, particularly
for large structs for performance reasons.

However, as the struct size shrinks, the choice of passing struct by
value or by pointer becomes less clear to me. Let me use a simple
struct as an example.

struct greg_ymd
{
int16_t year;
int8_t month;
int8_t day;
};

I use this struct to represent a date in the gregorian calendar
(fields not offset to start from 0). Let's say that I want to have a
function that adds a certain number of days to this ymd struct. There
are a couple of options that come to mind.

I really can't see that being a performance bottleneck, but you are
surely asking about theory/practice (read, being ultra-tidy in your
programming).
1. struct greg_ymd add_days( const struct greg_ymd ymd, int days );
2. <return type> add_days( struct greg_ymd* pymd, int days );

First of all, what are the performance implications of using struct
pass-by-value for smallish structs?

Profile it and see.
Is there a rule of thumb of struct size for an interface API that you
convert all struct parameter passing to use pointers?

I pass all structs by reference (I use a C/C++ compiler but limit
severely the amount of C++ things I use) and primitives by value.
Have you used struct by value parameter passing or return value at all
in your API design experience?

The struct pass-by-value version of the interface avoids the pesky
NULL pointer argument issue,

References are nice in that regard, but I have a feeling it is not
guaranteed portable (sorry for the C++ chat).
but still I can't get away from it
completely since I use 'int (*compare_function)( const void* p, const
void* q )' to define the sorting property in my generic containers.

Default arguments (C++ again) can help with that. You can have null to
mean do an object compare or non-null to use the passed-in function.
(Side-stepping the "default arguments are evil" debate).
 
Jon

Keith said:
I'd say that "small" structs can sensibly be passed by value (unless
of course the function needs to modify them), and "large" structs
should be passed by pointer for performance reasons.

I have no particular guidance to offer about the dividing line between
"small" and "large". I'd say that anything no larger than a pointer
is certainly "small", but beyond that ...

I know that's not very helpful.

How about this "rule": if it's a primitive, pass by value, else don't.
There are no stack frame guarantees that allow further portable
("overall") rules of thumb, as far as I know.
 
Jon

Gene said:
I have done some tests in the past with 32-bit gcc and Visual C. Both
would move structs to registers for call-by-value params, local
assignment (as in swapping struct values) and return values if the
size was 4 bytes or less.

OK, then, here is something I don't understand: if those compilers do
that, how am I supposed to write an assembly language prologue/epilogue
to interface such things if arguments are not passed on the stack?
In practice though the cost difference between allocating a struct in
the caller and passing a pointer to it vice accepting a return value
must be vanishingly small even if the struct does fit in a register.
When the struct doesn't fit in a register, the two return mechanisms
would be all but identical.

I read something recently (can't remember where) that passing arguments
in registers these days is more than likely "premature optimization", as
passing on the stack would be just as efficient given modern CPU designs.
I've always used caller allocates and passes pointer for APIs.

That doesn't grok? You mean for structs as the OP asked?
This
is for at least 5 reasons. (1) It's efficient enough in all cases
I've ever encountered. (2) A single convention means the caller must
remember only a single convention. (3) You get "in", "out", and "in
out" parameters all with the same mechanism. (4) You also get
"optional" parameters with the same mechanism by allowing NULL to mean
"not needed." This works for all 3 kinds. (5) It's clean in the
implementation to uniformly use -> for all (or nearly all) element
access rather than a mix of . and ->.

Agreed. I'll add that an uncomplicated call standard at the
implementation level is "tits".
A convention I find very useful is this

typedef struct foo_s {

... big collection of fields ...

} FOO;

// Prepares the raw memory of the foo for initialization.
// Often just zeros or sets a flag so that later functions can
// see it hasn't been set up yet. May even do nothing. Declare anyway
// as a placeholder for the convention. Don't allocate anything here.
// Can't fail, so a void function.
void init_foo(FOO *foo);

Like a C++ constructor (with its steroids taken away).
// Make a foo fully ready for use. Allocate storage and other
// resources.
// Return 0 on success else an error code.
int set_up_foo(FOO *foo, ...);

Like a C++ constructor's steroidal effects.
// Release all resources of a foo, returning it to the init state.
// Return 0 on success else an error code. It's okay to set_up
// immediately after a clear.
int clear_foo(FOO *foo);

Symmetry lacking: you used 2 construction constituents but only one
destruction constituent. I think your "poor man's OO" design is too
complex and probably missing something at the same time.
Then in code:

// A way to eliminate the clutter of &. YMMV.
FOO foo_instance[1];

That is bizarre syntax.
// Always init after declaration even if the init does nothing.
init_foo(foo_instance);

A testament to C's deficiency compared to C++.
... yada yada ..

for (...) {

err = set_up_foo(foo_instance, ...);

... check for error.

... use the foo instance then release its resources

err = clear_foo(foo_instance);

... check for error.

... more processing that doesn't need the foo instance.
};

Or pass an "out" error argument to the functions. Or have an optional set
of functions that take an "out" error handler argument. Or learn, use, and
abuse C++, the last of which you are doing with the C code you
presented.
 
Jon

Eric said:
Sorry; I can't make sense of this. If "the pesky ... issue" is
that a struct pointer might be NULL, well, that can often be a help
rather than a harm: You can provide a NULL for an "optional" struct
pointer argument, but you cannot do so with a struct value.

Much more often than not, "the pesky issue" is the case at hand.
Asserting for null pointers is a PITA and an unnecessary one. Potential
language-level solutions: references that can't be null, a "not null" or
"can be null" keyword.
 
Nobody

First of all, what are the performance implications of using struct
pass-by-value for smallish structs?

It depends upon the platform. On some platforms, a struct return is
implemented by passing in a pointer; the compiler will generate a
temporary in the caller if necessary.
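
In source terms, the transformation such a platform applies looks roughly like the second function in the sketch below. The struct big and both function names are made up for illustration; what a real compiler emits depends on its calling convention.

```c
/* Hypothetical struct, too large for registers on typical platforms. */
struct big
{
  double m[16];
};

/* What the programmer writes: return the struct by value. */
struct big make_identity( void )
{
  struct big r = { { 0 } };
  int i;

  for ( i = 0; i < 4; ++i )
    r.m[i * 5] = 1.0;   /* diagonal of a 4x4 matrix stored row by row */
  return r;
}

/* Roughly what such a platform does behind the scenes: the caller
   provides the storage and passes its address as an extra argument. */
void make_identity_into( struct big* result )
{
  int i;

  for ( i = 0; i < 16; ++i )
    result->m[i] = ( i % 5 == 0 ) ? 1.0 : 0.0;
}
```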
 
Jon

Nobody said:
It depends upon the platform.

That answer is lame. C is defficient and it's defficiencies must not be
glossed over... where is wikipedia when you need her!
 
Jon

Project said:
Any value too large to be returned in registers is returned in
memory;

Oh yeah, confuse all future readers by introducing yet another concept. C
IS a stack-based "language".
you can deal with this yourself or let the compiler do it.

Yes, C is ambiguous.
Concentrate instead on what makes the most sense for your interface.

Oh, you teaching? Someone you don't even know from Joe. Stop that!
Historically only scalars could be returned because the returned
value had to fit in a register. That restriction has been removed,
but old interfaces live forever and with it the reluctance for
returning structs.

Write a f!@n paper! (Or you new-fangled guys call that "Dr. Dobbs
Online"). (It was just a book chapter anyway).
 
Eric Sosman

Much more often than not, "the pesky issue" is the case at hand.
Asserting for null pointers is a PITA and an unnecessary one. Potential
language-level solutions: references that can't be null, a "not null" or
"can be null" keyword.

It seems that you and the O.P. somehow regard null pointer values
as Bad Things, poison pills in your program. That's an attitude I can't
understand, as it seems important to be able to respond to "Get another
Thing" with "No Thing there, boss." The null-valued pointer is a very
convenient device, a way to pass either "Here's a Thing" or "No Thing"
through one channel, without the burden of inventing a separate channel
for an independent "Yes/No" answer (with "No" also having the meaning
of "Just pay no attention to that Thing in the other channel; it's not
really there").

A pointer type that cannot be null seems to me crippled, about as
useful as a numeric type that cannot be zero. A pointer that is known
to be non-null when I'm about to use it (or a number known to be non-
zero when I'm about to divide by it) is useful, but that's not the same
thing, not the same thing at all.
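
The "one channel" point can be sketched with a lookup that returns either a pointer to the Thing or NULL for "No Thing". The function find_in_year is hypothetical and only serves as an illustration.

```c
#include <stddef.h>
#include <stdint.h>

struct greg_ymd
{
  int16_t year;
  int8_t month;
  int8_t day;
};

/* Hypothetical lookup: returns a pointer to the first date in the
   given year, or NULL if there is none. The single return value
   carries both the answer and the "is there an answer" flag. */
const struct greg_ymd* find_in_year( const struct greg_ymd* dates,
                                     size_t count, int year )
{
  size_t i;

  for ( i = 0; i < count; ++i )
    if ( dates[i].year == year )
      return &dates[i];
  return NULL;   /* "No Thing there, boss." */
}
```

A by-value version would need a second channel, such as an extra flag or a reserved "invalid" date, to say the same thing.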
 
Eric Sosman

That answer is lame. C is defficient and it's defficiencies must not be
glossed over... where is wikipedia when you need her!

Oh, come on! Show me a language that *prescribes* the run-time
cost of its operations -- or even the relative cost -- and I'll show
you a language with far more defficiencies [sic] than C.
 
ImpalerCore

     What does `const' buy you?  Or, why not `const int days'?  Other
than that, this seems plausible.

The only thing I can think of is it may prevent some logical error in
a bad implementation of add_days. Provided the implementation of
add_days is sound, there isn't anything 'const' buys you that I can
think of.
     This seems plausible, too.


     Mu.


     When writing software that others might blame you for, the basic
rule of thumb is "Leave no fingerprints."  ;-)


     Yes.  For parameters, I'd estimate that I use a struct pointer
more frequently than a struct value, maybe 90%-10% or even more
lopsided.  For function values, leaving "lookup-ish" functions aside,
I'd guess my own ratio is closer to 70%-30%.  YMMV.


     Sorry; I can't make sense of this.  If "the pesky ... issue" is
that a struct pointer might be NULL, well, that can often be a help
rather than a harm: You can provide a NULL for an "optional" struct
pointer argument, but you cannot do so with a struct value.  As for
your comparison function, the relevance escapes me: You've already
chosen to pass pointers, so what are you asking about?

I personally don't have any problem with using or passing NULL
pointers myself. Unfortunately, that is not how everyone views it.

Let's watch an episode of the NULL pointer blame game.

Me: I got this new little library that does something cool.
I want you guys to try it out.
Develop: Yeah, it works pretty well, except that it crashes when
I pass a NULL pointer to the interface.
Me: So, don't pass NULL pointers, NULL pointers are not
intended to be semantically viable, like strcpy and friends,
so why should I complicate my interface to handle something
that developers should know better anyways. It's in the
documentation.
Manager: But that's the standard library and we don't have control
over that. We do have control over your library API, so why
not make it a little more robust to developer mistakes.
Testing resources can't cover everything, so why should we
make more opportunities to crash the application if we don't
have to.
Me: Invest in more testing then. We deal with strcpy issues
because we have to ... oorrrrr maybe we can just add asserts
to help the developers.
Tester: I don't like the assert idea. I don't want to have to test
two versions of the software, with and without assert.
Me: Okay, so if I make NULL pointer not crash, how do you want
to communicate the error, by return value?
Develop: Sounds fine.
Me: <grumbles>

.... sometime later ...

Tester: We found some funny side effects from that new library you're
using. It looks like there could be a problem.
Develop: Interesting, nothing came up in our tests. We'll ask the
library guy.
Me: Hey Developers, what's up?
Develop: The tester found some strange behavior from using your
library.
Me: Ok. What's the problem?
Develop: We're not sure, but since you're the expert on the way your
library works, we'd like your help to troubleshoot it.
Me: Fine. ... <looks at developer code and notices that there
are no error checks from using library function> ... I
noticed that you're not checking for errors from the API.
Could a NULL pointer be causing an issue?
Develop: Could be.
Me: hmmm ... thinking ... <If it's a NULL pointer again, I
should just let the NULL pointer crash their code, then at
least they know it was their fault, but management says
that they don't have resources for complete and perfect
testing, so make your library robust. I wonder how
passing struct by value would change things.>

Best regards,
John D.
 
Keith Thompson

Jon said:
Keith Thompson wrote: [...]
I'd say that "small" structs can sensibly be passed by value (unless
of course the function needs to modify them), and "large" structs
should be passed by pointer for performance reasons.

I have no particular guidance to offer about the dividing line between
"small" and "large". I'd say that anything no larger than a pointer
is certainly "small", but beyond that ...

I know that's not very helpful.

How about this "rule": if it's a primitive, pass by value, else don't.
There are no stack frame guarantees that allow further portable
("overall") rules of thumb, as far as I know.

What exactly do you mean by "primitive"?

If you mean that scalar types (i.e., numeric and pointer types)
should be passed by value and other types (arrays, structs, unions)
should be passed by pointer, well, that's a consistent rule, but it
fails to take advantage of C's ability to pass and return structs
and unions by value.

If you have a small struct type, I see no reason not to pass it
by value if that satisfies the semantics you need.
 
Keith Thompson

Jon said:
References are nice in that regard, but I have a feeling it is not
guaranteed portable (sorry for the C++ chat).
[...]

References are perfectly portable in C++. They don't exist in C
(which you'll notice is the language we discuss here).
 
Keith Thompson

Jon said:
Gene wrote: [...]
I have done some tests in the past with 32-bit gcc and Visual C. Both
would move structs to registers for call-by-value params, local
assignment (as in swapping struct values) and return values if the
size was 4 bytes or less.

OK, then, here is something I don't understand: if those compilers do
that, how am I supposed to write an assembly language prologue/epilogue
to interface such things if arguments are not passed on the stack?

The same way you would in any circumstances: by understanding the
calling convention used by the compiler, which is probably based
on the ABI for the platform.

But how often do you need to write an assembly language
prologue/epilogue anyway?

[...]
 
Keith Thompson

Jon said:
That answer is lame. C is defficient and it's defficiencies must not be
glossed over... where is wikipedia when you need her!

That answer is correct, and I see nothing lame about it.

Are you suggesting that the performance implications should be
defined by the language? How exactly would that work?

The purpose of a C program is not to generate CPU instructions.
CPU instructions are a tool used to create the run-time behavior
that the C program specifies; that behavior is what the program is
all about. (There are cases where you care which instructions are
generated, but such cases are, and should be, rare.)
 
Keith Thompson

Jon said:
Oh yeah, confuse all future readers by introducing yet another concept. C
IS a stack-based "language".

(Why did you put the word "language" in quotation marks?)

You seem to be new here, so you may not be aware that we've
discussed this at great and tedious length before.

No, C is not a stack-based language. The word "stack" doesn't even
appear in the C language standard. The semantics of C function calls
do imply some kind of stack-like structure, but only in the sense
that storage for function calls is allocated and deallocated in a
last-in first-out manner. There is no implication of a stack laid
out in contiguous memory growing and shrinking in any consistent
direction.

Having said that, most C implementations *do* use a contiguous
stack in memory -- but not all do. There are implementations
where the storage for each function call is allocated on the heap.
The assumption of a contiguous stack is neither universally correct
nor particularly useful. And even in stack-based implementations,
it's very common to pass some arguments in registers.
 
