Returning a struct from a function - strange behavior

CBFalconer

Keith said:
CBFalconer said:
Keith Thompson wrote: [...]
Here's what n1336 6.2.4p7 says (I think it's already been quoted in
this thread):

Yes, it has. And it has also been pointed out that n1336 is a very
preliminary draft for a future C1x standard. My comments have been
to the effect that I don't think this particular proposal will
live, and why not.

And the point of my reply was to refute your arguments. You're
claiming that the proposal "will involve too many ugly
inefficiencies". You snipped the paragraph in which I explained why
this is not the case:

| The proposed change in n1336 applies *only* to a structure or union
| that contains (directly or indirectly) an array member. In the
| majority of such cases, the struct or union is already too big to fit
| in a register. For any other function results, the returned value is
| not stored in an object, either implicitly or explicitly (though of
| course the compiler is still free to do the equivalent of storing it
| in an object internally).
|
| "&strlen(thing)" is still a constraint violation.

Certainly (the violation).

Why do you want to treat some returned values in one manner, and
others in another manner? C has enough silly complexities, and
doesn't need another one. IMO. The simplest thing is to insist on
storing that value somewhere, after which you are free to play with
it. That can be a universal rule.
 
CBFalconer

Martien said:
.... snip ...


The storage of which small values? Which interactions? Could you
be a bit more verbose and maybe try to explain what you mean?

I'm not a good teacher. The point is that making all function
returned values addressable requires storing them somewhere. This
is not normally done, since values are (usually) returned in
registers, and if not used the register is free. The other
possibility is to store it and let the optimizer take it out, which
seems inefficient and error prone to me.

As far as interactions are concerned, what if you want to pass two
or more functional results to another routine? Does the second one
overwrite the first one? If so, how do you make it work? If not,
you have a non-minor problem in allocating and freeing storage.
 
Ian Collins

Keith said:
CBFalconer said:
Keith Thompson wrote: [...]
Here's what n1336 6.2.4p7 says (I think it's already been quoted in
this thread):
Yes, it has. And it has also been pointed out that n1336 is a very
preliminary draft for a future C1x standard. My comments have been
to the effect that I don't think this particular proposal will
live, and why not.

And the point of my reply was to refute your arguments. You're
claiming that the proposal "will involve too many ugly
inefficiencies". You snipped the paragraph in which I explained why
this is not the case:

| The proposed change in n1336 applies *only* to a structure or union
| that contains (directly or indirectly) an array member. In the
| majority of such cases, the struct or union is already too big to fit
| in a register. For any other function results, the returned value is
| not stored in an object, either implicitly or explicitly (though of
| course the compiler is still free to do the equivalent of storing it
| in an object internally).
|
| "&strlen(thing)" is still a constraint violation.

Even if it were broader, there isn't any need for added inefficiencies,
as Jacob has explained. It hasn't been a problem for C++ compilers.
 
lovecreatesbeauty

In C89/C90, the implicit conversion of an expression of array type to
a pointer to the array object's first element occurs only
when the array expression is an lvalue. Quoting the C90 standard:

Except when it is the operand of the sizeof operator or the unary
& operator, or is a character string literal used to initialize an
array of character type, or is a wide string literal used to
initialize an array with element type compatible with wchar_t, an
lvalue that has type "array of _type_" is converted to an
expression that has type "pointer to _type_" that points to the
initial element of the array object and is not an lvalue.

The quote from C89 draft in Peter Nilsson's first post in this thread
is slightly different from this. I prefer the C89 version :) Quoted
again:

        "Except when it is the operand of the sizeof
        operator or the unary & operator, or is a character string
        literal used to initialize an array of character type, or
        is a wide string literal used to initialize an array with
        element type compatible with wchar_t, an lvalue that has
                                              ^^ ^^^^^^ ^^^^ ^^^
        type ``array of type '' is converted to an expression that
        ^^^^   ^^^^^ ^^ ^^^^
        has type ``pointer to type '' that points to the initial
        member of the array object and is not an lvalue."

Your array expression "make_person().name" is not an lvalue, so the
conversion doesn't occur. It's unclear what happens next. I *think*

Thank you, Keith. I read your first post again, and it helps me
understand this language feature a lot.

The expression make_person() is an rvalue of type struct Person.
make_person().name also doesn't reside in programmer-declared memory,
so it's not an lvalue and remains an array.
you're passing the array by value to printf, which normally isn't
possible; since printf is expecting a char* due to the "%s" format,
the behavior is undefined.

Is the UB triggered by the mismatch of %s and make_person().name or by
accessing make_person().name after the sequence point of evaluating
this argument in the printf call? (But does such an access occur in
that single printf call in the original post?)
 
lovecreatesbeauty

The quote from C89 draft in Peter Nilsson's first post in this thread
is slightly different from this. I prefer the C89 version :)  Quoted
again:

        "Except when it is the operand of the sizeof
        operator or the unary & operator, or is a character string
        literal used to initialize an array of character type, or
        is a wide string literal used to initialize an array with
        element type compatible with wchar_t, an lvalue that has
                                              ^^ ^^^^^^ ^^^^ ^^^
        type ``array of type '' is converted to an expression that
        ^^^^   ^^^^^ ^^ ^^^^
        has type ``pointer to type '' that points to the initial
        member of the array object and is not an lvalue.

Sorry, they are the same. My bad.
 
Richard Bos

DiAvOl said:
I think that the last 20-30 posts (or more) have nothing to do with the
subject of this thread, maybe creating a new subject for lcc-win32 and
post there would be a better idea

No, it bloody wouldn't. Thanks to jacob, we have too many of those
already. In fact, I would not be surprised if he himself created a new
thread just to "make his point".

Just a thought

I've always found that the most shallow remark to end a post with.

Richard
 
Richard Bos

jacob navia wrote:

[ about jacob claiming conformance ]
Nonetheless, a paying customer who actually needs C90 conformance
could sue you for misrepresentation if you claimed it, and because of
the absence of these mandatory diagnostics, such a customer would have
a valid case.
I'm thinking in terms of American law; I'm not sure if this would be
the case in France. The complaint would still be valid, but I don't know
how easy it would be to file a lawsuit based upon it.

I don't think such a case would have any luck. Jacob could just claim
it's a bug he wasn't aware of.

Not any more, he couldn't. He is very well aware of it; he just doesn't
give a damn.

Richard
 
Keith Thompson

CBFalconer said:
I'm not a good teacher. The point is that making all function
returned values addressable requires storing them somewhere.
[...]

Yes, it would, if that were what's being proposed. It isn't. The
vast majority of functions are not affected in any way by the proposed
change.

The proposed change in n1336 *only* affects structs and unions with
array members. The language as it's currently defined, by either C90
or C99, has a glitch in this case: it's possible to have a non-lvalue
expression of array type, which causes serious problems for the usual
implicit array-to-pointer conversion. The added paragraph in n1336
corrects this problem, and it does so with, as far as I can tell,
minimal cost.

With the possible exception of the code in the article that started
this discussion, or code written specifically to illustrate this
issue, I don't think I've ever seen any C code that would be affected.

You say it would cause "too many ugly inefficiencies". I believe
you're wrong. If you want to convince anyone that you're right,
you'll need to offer specific examples.

And what's your alternative? If I write a function that returns a
struct that contains an array, and I refer to the array member of the
result of a call to that function, what should happen? Constraint
error? Undefined behavior? Mandatory nasal demons?
 
Keith Thompson

In C89/C90, the implicit conversion of an expression of array type to
a pointer to the array object's first element occurs only
when the array expression is an lvalue.  Quoting the C90 standard:
[snip]
The quote from C89 draft in Peter Nilsson's first post in this thread
is slightly different from this. I prefer the C89 version :)  Quoted
again:
[snip]

Sorry, they are the same. My bad.

In fact, the ANSI C89 and ISO C90 standards are identical, apart from
some (non-normative) introductory material and different numbering of
the sections.
 
Nick Keighley

jacob navia wrote:

[ about jacob claiming conformance ]
Nonetheless, a paying customer who actually needs C90 conformance
could sue you for misrepresentation if you claimed it, and because of
the absence of these mandatory diagnostics, such a customer would have
a valid case.
I'm thinking in terms of American law; I'm not sure if this would be
the case in France. The complaint would still be valid, but I don't know
how easy it would be to file a lawsuit based upon it.

I don't think such a case would have any luck. Jacob could just claim
it's a bug he wasn't aware of. Else everyone would be able to sue any
buggy software and get money.

and hopefully the person suing wouldn't read clc
 
Nick Keighley

Hardly necessary - the courts would probably be a lot less legalistic
than the average clc Heathfield-wannabe. The law has the notion of a
"reasonable person", which is a foreign concept to most clc'ers.

yes and there are terms like "Fit For Purpose" and "Of Merchantable
Quality". If someone could demonstrate they had suffered material
loss because the compiler wasn't fully conforming then I'd say
they had some sort of case. IANAL
 
lovecreatesbeauty

In fact, the ANSI C89 and ISO C90 standards are identical, apart from
some (non-normative) introductory material and different numbering of
the sections.

Thanks. I misread one as C99. I should post less and just read great
articles of yours and other regulars here more often.
 
Flash Gordon

Richard Heathfield wrote, On 09/10/08 03:57:
CBFalconer said:


"Your alleged 'good' experience with lcc shows a bug in lcc. I
don't know if you mean lcc-win32 (which has quite a few known
insects) or lcc (which is less well known here)." - message ID
<[email protected]>

Thanks Richard, saves me digging through the message history to find it.
If only you would check more often.

I was naughty and relied on my highly dodgy memory, but I was right this
time so got away with it :)
 
Flash Gordon

CBFalconer wrote, On 09/10/08 05:51:
Keith said:
CBFalconer said:
jacob navia wrote:
CBFalconer wrote:

I doubt it will always work. For example:

typedef struct foo {char a[20]}; foo;
char ca, cb;
foo getfoo(void);
...
printf("'%c' \"%s\" '%c'\n", ca, &(getfoo().a), cb);

Think about why I picked this test.

When I correct your code and write this:

typedef struct foo {char a[20];} foo;
char ca, cb;
foo getfoo(void) { foo f; f.a[0] = 'a'; f.a[1] = 0; return f; }
int main(void) {
printf("'%c' \"%s\" '%c'\n", ca, &(getfoo().a), cb);
}

I obtain
' ' "a" ' '

Why?

Because lcc-win (as lcc) creates always a temporary object
when a function returns a structure. lcc-win passes a hidden
first argument to the function that contains a pointer to the
temporary variable created in the calling function.

Maybe in your system you can handle it for structures. But what if
you wanted the address of one of those characters? i.e.:

printf("%p \"%s\" %p\n", (void*)&ca, &(getfoo().a),
       (void*)&cb);

That n1336 proposal would require that to work. Ugh.

I don't see the problem. The n1336 proposal simply requires the
creation of a temporary object of type foo, which must survive at
least until the end of the statement. In many implementations,
apparently including lcc-win, that temporary object is already created
as part of the protocol for returning a structure value.

I guess I should have made the ca and cb items functional results.
The point is horrible complexity and time wasted executing it.

As Keith just pointed out above the object is *already* created in most
implementations, so the extra complexity and time wasted is extra
complexity and time that was added way back in 1989 (or earlier) when
the ability to return structures/unions was added to the language.

C is basically simple - keep it that way.

It is. Making one more thing defined instead of undefined makes it
simpler from the perspective of someone using the language.

I see lots of problems. I am almost certainly not making them
plain.

You don't seem to be making yourself clear. You don't seem to be
understanding what others are saying either.
 
Bartc

Nate Eldredge said:
I don't follow.

For concreteness, one implementation I'm familiar with is amd64, where
structs of 8 bytes or less are returned in a register, structs of 8 to
16 bytes are returned in two registers, and larger structs are copied
into a space which is passed (invisibly) by the caller. This
convention is fixed by an ABI, which a C1x compiler would likely
conform to.

The draft standard requires a temporary object to exist, but as far as
I can tell there's no need for it to be stored in memory unless
something actually looks for it there. If we have

struct foo { long a; long b; }; /* long is 8 bytes on amd64 */
struct foo blah(void);
void qux(long);
void argle(long *);

then doing

qux(blah().a)

would not require the return value of blah() to be stored in memory at
any time. The compiler could simply move the value of blah().a from
the register where it was returned into the appropriate register to
pass it to qux(). It's the same as if you have an auto variable of
type long; unless you take its address (or perhaps declare it
volatile), the compiler is free to keep it in a register, or even
optimize it completely out of existence.

I remember writing a little compiler which could pass and return structs
(and in this case arrays).

Small structs were simply returned in registers as you said, and larger ones
on the stack (no heap storage was used).

However, to access fields of a struct returned from a function (or to index
such an array), my compiler required it to be an l-value.

But it considered any sort of value returned from a function to be a
temporary value (or transient, as I called it), even if it happened to be a
struct which happened to exist in stack memory, so it couldn't be an
l-value. There are too many issues otherwise.

So it seems to me quite reasonable for C to allow structs to be returned but
to disallow field accesses (or indexing of embedded arrays), even if some
hairy compilers try to implement this.

And yes in some cases it does sound inefficient, not very C-like at all:
imagine a function constructing a 1000-element array, embedding it in a
struct, and returning the entire struct, only for the ungrateful caller to
merely select one element of it..

If the caller stored the struct locally, one might hope it would then be
able to use several other elements without re-calling the function.
 
Nate Eldredge

Bartc said:
I remember writing a little compiler which could pass and return structs
(and in this case arrays).

Small structs were simply returned in registers as you said, and larger ones
on the stack (no heap storage was used).

However, to access fields of a struct returned from a function (or to index
such an array), my compiler required it to be an l-value.

But it considered any sort of value returned from a function to be a
temporary value (or transient, as I called it), even if it happened to be a
struct which happened to exist in stack memory, so it couldn't be an
l-value. There are too many issues otherwise.

So it seems to me quite reasonable for C to allow structs to be returned but
to disallow field accesses (or indexing of embedded arrays), even if some
hairy compilers try to implement this.

I agree it would have been reasonable 20 years ago. Indeed, K&R I
didn't allow structs to be passed or returned by value at all, and
arguably things would be better if it had been left that way. But we
are now stuck with it as part of the language, field access and all.
The present discussion is over whether the new language in the draft
standard makes matters worse. In my opinion, it doesn't hurt
efficiency for anything presently allowed, and the new things it
allows make it more consistent.

And yes in some cases it does sound inefficient, not very C-like at all:
imagine a function constructing a 1000-element array, embedding it in a
struct, and returning the entire struct, only for the ungrateful
caller to merely select one element of it..

If the caller stored the struct locally, one might hope it would then be
able to use several other elements without re-calling the function.

That's up to the caller, of course. But I know what you mean:
anything to do with structs beyond K&R I (apply the & or . operators)
seems non-C-like to me. I have a subconscious feeling when dealing
with C that each line should compile into a small number of
instructions, approximately proportional to the number of operators.
The notion that a single assignment statement can copy a huge struct
and take thousands of cycles bothers me somehow. You're supposed to
have to write a loop for that! (Or call memcpy, which is morally the
same thing.) And if you want to pass structs to and from functions,
you pass them by reference.
 
James Kuyper

jacob said:
When I said that lcc-win compiles and executes that
code correctly they could not stand that.

Denigrating lcc-win is the only thing they all agree with.

Why?

It is the only compiler that is not a C++ compiler that
happens to compile C.

While compilers that handle both languages have become commonplace,
they're not universal. For instance, the MIPSpro IRIX C compiler that I
use on the SGI machines at work isn't a C++ compiler, either. The C++
compiler is a separate program called "CC" rather than "cc".

I also find it odd that you present this as if it were a significant
advantage. I personally need both a C compiler and a C++ compiler.
However, even if all I really needed was a C compiler, I don't see it as
a significant disadvantage if the compiler also has other capabilities
that I don't happen to use. Not providing C++ support presumably makes
the compiler smaller, but disk space is too cheap nowadays for that to
be a significant issue. If not providing C++ support makes the compiler
cheaper, then that is an advantage - but in that case it's the low price
itself that is the advantage, not the lack of C++ support.
 
jacob navia

Nick said:
yes and there are terms like "Fit For Purpose" and "Of Merchantable
Quality". If someone could demonstrate they had suffered material
loss because the compiler wasn't fully conforming then I'd say
they had some sort of case. IANAL

Sure, sure.

MORON: Your Honor, the compiler doesn't conform to C89!

JUDGE: What was the problem?

MORON: I compiled a program containing comments written in another
standard and the compiler did NOT complain.

JUDGE: But the generated program executed OK?

MORON: Well... yes your Honor.

JUDGE: Well, then there are no damages?

MORON: No, but I feel cheated because I expected those errors
from the compiler.

JUDGE: Contempt of the court. You are guilty of abusive procedure.
Pay US$ 1 to Mr Navia and a fine of 10.000 to the court!
 
James Kuyper

CBFalconer said:
(e-mail address removed) wrote: ....

But that provision would force the storage of all those small
values. That is where the inefficiency comes in. Also consider
the interactions.

It doesn't apply to them, so how could it force them to be stored?
 
James Kuyper

CBFalconer said:
(e-mail address removed) wrote:

[ about jacob claiming conformance ]
Nonetheless, a paying customer who actually needs C90 conformance
could sue you for misrepresentation if you claimed it, and because
of the absence of these mandatory diagnostics, such a customer
would have a valid case.

I'm thinking in terms of American law; I'm not sure if this would
be the case in France. The complaint would still be valid, but I don't
know how easy it would be to file a lawsuit based upon it.

I don't think such a case would have any luck. Jacob could just claim
it's a bug he wasn't aware of. Else everyone would be able to sue
any buggy software and get money.

Can't be a valid case. If it was, Microsoft would have gone broke
about 2 decades ago.

Well, the quality and quantity of Microsoft's lawyers has something to
do with that, as does the product disclaimer that is industry standard,
essentially absolving them of all responsibility for actually delivering
a usable product. If jacob follows that same industry practice, he would
probably be safe from a customer lawsuit.

However, I think he could still be charged with false advertising if he
claimed C90 conformance, depending upon how he advertises it. I'm pretty
sure that simply saying so in this newsgroup, as he has actually done,
doesn't qualify as advertising for this purpose.

The key difference between Microsoft and jacob is that it is a matter of
public record that jacob has been informed of this non-conformance, and
that it is deliberate on his part. Microsoft normally keeps tighter
control over its public statements than he does. I would be surprised
if there's comparable publicly available evidence that any particular
Microsoft product deliberately fails to conform to a standard that it was
officially claimed to conform to. Not very surprised; just surprised.
 
