memcpy as macro legit?

Michael B Allen

I just noticed that doing something like the following may fail because
it can overwrite u->size before it's evaluated.

memcpy(u, buf, u->size);

Is it legit that this is a macro as opposed to a function that would
not clobber the parameter?

Just surprised me a little is all.

Mike
 
Lawrence Kirby

I just noticed that doing something like the following may fail because
it can overwrite u->size before it's evaluated.

memcpy(u, buf, u->size);

Is it legit that this is a macro as opposed to a function that would
not clobber the parameter?

Any standard library function can have a macro definition in the relevant
standard header. However the macro must still preserve the semantic of the
function, and as a function it cannot modify its arguments.
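As an illustration: if a header does define memcpy as a macro, the caller can still reach the underlying function, either by parenthesizing the name or by removing the macro with #undef. A minimal sketch; the helper names copy_via_function and copy_after_undef are made up for this example:

```c
#include <assert.h>
#include <string.h>

/* Parenthesizing the name suppresses any function-like macro, so this
   always calls the real memcpy function. */
void *copy_via_function(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return (memcpy)(dst, src, n);
}

/* Alternatively, remove the macro definition (harmless if none exists).
   The function declaration from <string.h> stays visible. */
#undef memcpy

void *copy_after_undef(void *dst, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    return memcpy(dst, src, n);
}
```

Either form guarantees function-call semantics for the call, whatever the header does.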

Lawrence
 
Jens.Toerring

Michael B Allen said:
I just noticed that doing something like the following may fail because
it can overwrite u->size before it's evaluated.

How do you know that u->size gets overwritten before evaluation?

memcpy(u, buf, u->size);

I guess 'u' and 'buf' are pointers to structures - but why would you
then use u->size instead of "sizeof *u" or "sizeof *buf"? Usually
I would assume that there's no need to store the size of a structure
within the structure. Or are you using some dirty tricks like
zero-length arrays in the structure?

Is it legit that this is a macro as opposed to a function that would
not clobber the parameter?

As far as I can see in the standard memcpy() is supposed to be a
function. Are you sure that what 'u' and 'buf' point to never
overlaps?
Regards, Jens
 
Ivan Vecerina

Michael B Allen said:
I just noticed that doing something like the following may fail because
it can overwrite u->size before it's evaluated.

memcpy(u, buf, u->size);

Is it legit that this is a macro as opposed to a function that would
not clobber the parameter?

Just surprised me a little is all.

This behavior would surprise me a lot indeed.

How could a single byte be overwritten within 'u' before
having evaluated u->size to at least check that it is non-zero?

Given that the last parameter could be an ordinal constant
or an expensive function call, wouldn't even a macro need
to store the size value in a temporary before using it?

The behavior you are describing definitely isn't legit.
And as far as I understand, the C99 standard requires
memcpy to behave like a function (not like a macro) --
even though platforms obviously are allowed to treat it as an
intrinsic function, and to implement specific optimizations.


hth, Ivan
 
pete

Lawrence said:
Any standard library function can have a macro definition in
the relevant standard header. However the macro must still
preserve the semantic of the function,
and as a function it cannot modify its arguments.

The putc() and getc() macros,
are exceptions to the "cannot modify its arguments" rule.
 
Lawrence Kirby

The putc() and getc() macros,
are exceptions to the "cannot modify its arguments" rule.

Even putc() and getc() are not permitted to modify their arguments when
implemented as macros.

It is however true that they have a special license to violate function
call semantics. A true function call will evaluate each argument
expression exactly once, putc() and getc() are permitted to evaluate their
FILE * argument expression multiple times when implemented as a macro. So
if this expression contained side-effects there would be problems. E.g.
getc(*openstreams++) invokes undefined behaviour because openstreams may
be modified more than once between sequence points in the macro expansion.
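A sketch of why multiple evaluation matters. TOY_GETC and struct toy_stream below are hypothetical stand-ins, not any real library's getc; the macro mentions its argument several times, the way a getc macro is permitted to. To keep the demonstration well defined, the side effect lives in a function call (next_stream, also made up) rather than in a ++ applied to the argument itself:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical in-memory stream, for illustration only. */
struct toy_stream {
    const unsigned char *buf;
    size_t pos;
    size_t len;
};

/* Mentions (s) more than once, as a getc() macro is allowed to. */
#define TOY_GETC(s) \
    ((s)->pos < (s)->len ? (int)(s)->buf[(s)->pos++] : EOF)

static int eval_count = 0;

/* Argument expression with a side effect: counts its own evaluations. */
struct toy_stream *next_stream(struct toy_stream *s)
{
    eval_count++;
    return s;
}

/* Stores the character read in *ch and returns how many times the
   argument expression was evaluated during one TOY_GETC "call". */
int evaluations_per_call(struct toy_stream *s, int *ch)
{
    assert(s != NULL && ch != NULL);
    eval_count = 0;
    *ch = TOY_GETC(next_stream(s));
    return eval_count;
}
```

A true function call would evaluate next_stream(s) exactly once; with this macro the count depends on which branch of the conditional is taken.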

Lawrence
 
Michael B Allen

How do you know that u->size gets overwritten before evaluation?


I guess 'u' and 'buf' are pointers to structures - but why would you
then use u->size instead of "sizeof *u" or "sizeof *buf"? Usually I
would assume that there's no need to store the size of a structure
within the structure. Or are you using some dirty tricks like zero-
length arrays in the structure?


As far as I can see in the standard memcpy() is supposed to be a
function. Are you sure that what 'u' and 'buf' point to never overlaps?

Whoops, u (struct url *) and buf (unsigned char *) do overlap. Actually
they point to the same address.

But still, u->size (which is the first member of the url structure)
was getting clobbered to 0 resulting in nothing being copied. That would
suggest to me that memcpy is not a function. This is glibc-2.2.5.

Mike
 
Jens.Toerring

Whoops, u (struct url *) and buf (unsigned char *) do overlap. Actually
they point to the same address.
But still, u->size (which is the first member of the url structure)
was getting clobbered to 0 resulting in nothing being copied. That would
suggest to me that memcpy is not a function. This is glibc-2.2.5.

If the memory regions overlap all bets are off and I guess it
is better not to make any assumptions about what might be the
results since you're deep in undefined-behaviour-land. Perhaps
your implementation can use some aggressive optimizations in
case the buffers don't overlap but that lead to strange results
otherwise.
Regards, Jens
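A minimal sketch of one way to avoid the undefined behaviour, assuming a struct url whose first member is size as described in the thread. The host member, the clamping policy and the function name load_url are invented for illustration. The size is read into a local before anything is written, and memmove is used because it is defined for overlapping (even identical) regions:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout: the thread only says size is the first member. */
struct url {
    size_t size;        /* total number of bytes to copy */
    char   host[32];
};

/* Copies a serialized url from buf into *u; buf may overlap *u. */
void load_url(struct url *u, const unsigned char *buf)
{
    size_t n;

    assert(u != NULL && buf != NULL);

    /* Read the size into a local first: n is a separate object, so
       this memcpy never has overlapping source and destination. */
    memcpy(&n, buf, sizeof n);
    if (n > sizeof *u)
        n = sizeof *u;  /* never write past the destination */

    /* memmove is well defined even when the regions overlap. */
    memmove(u, buf, n);
}
```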
 
Keith Thompson

Lawrence Kirby said:
Any standard library function can have a macro definition in the relevant
standard header. However the macro must still preserve the semantic of the
function, and as a function it cannot modify its arguments.

A function can modify one of its arguments if another argument points
to it (though you have to be careful about undefined behavior).

There could also be issues with sequence points. C99 7.1.4p1 says:

Any invocation of a library function that is implemented as a
macro shall expand to code that evaluates each of its arguments
exactly once, fully protected by parentheses where necessary, so
it is generally safe to use arbitrary expressions as arguments.

with a footnote:

Such macros might not contain the sequence points that the
corresponding function calls do.

For example, the expression sin(x) + cos(x) can invoke undefined
behavior because both calls can set errno, and if sin() and cos() are
implemented as macros there may not be a sequence point where it's
needed. There was a discussion of this in comp.std.c a few years
ago; see
<http://groups.google.com/groups?as_umsgid=<[email protected]>>
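
As a sketch of what the 7.1.4p1 requirement quoted above looks like in practice, here is a ctype-style table lookup. The names toy_isdigit and toy_digit_table are hypothetical and not taken from any real library. The macro mentions its argument exactly once and parenthesizes it, so an argument such as i++ is safe, but the expansion contains no function call and therefore none of the sequence points a call would supply:

```c
#include <assert.h>
#include <limits.h>

/* Function version: 1 if c is a decimal digit, else 0. */
int toy_isdigit(int c)
{
    return c >= '0' && c <= '9';
}

/* Table indexed by c + 1, so that -1 (EOF-like) maps to slot 0.
   Valid arguments are -1 through UCHAR_MAX, as for <ctype.h>. */
static const unsigned char toy_digit_table[UCHAR_MAX + 2] = {
    ['0' + 1] = 1, ['1' + 1] = 1, ['2' + 1] = 1, ['3' + 1] = 1,
    ['4' + 1] = 1, ['5' + 1] = 1, ['6' + 1] = 1, ['7' + 1] = 1,
    ['8' + 1] = 1, ['9' + 1] = 1
};

/* Macro version masking the function: (c) appears exactly once,
   fully parenthesized. */
#define toy_isdigit(c) ((int)toy_digit_table[(c) + 1])
```

Writing (toy_isdigit)(c) still calls the function rather than the macro.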
 
Peter Nilsson

Lawrence Kirby said:
Any standard library function can have a macro definition in the relevant
standard header. However the macro must still preserve the semantic of the
function, and as a function it cannot modify its arguments.

Does this 'preservation of semantics' extend to the sequence point
that applies to function calls in C90?

Normative text 7.1.4p1, and certainly non-normative footnote 156,
suggests it does not apply in C99.
 
Mark F. Haigh

Peter said:
Does this 'preservation of semantics' extend to the sequence point
that applies to function calls in C90?

Normative text 7.1.4p1, and certainly non-normative footnote 156,
suggests it does not apply in C99.

I think it's pretty clear that Lawrence was using the term "preserve the
semantic" informally, and did not mean "with exact semantics of".

I think 7.1.4p1 is clear enough about the semantics of library functions
implemented as macros.


Mark F. Haigh
 
Dan Pop

Lawrence Kirby said:
Even putc() and getc() are not permitted to modify their arguments when
implemented as macros.

If their stream argument is an expression with side effects, they are
effectively modifying their stream arguments and there is nothing in the
standard prohibiting them from doing that.

It is however true that they have a special license to violate function
call semantics. A true function call will evaluate each argument
expression exactly once, putc() and getc() are permitted to evaluate their
FILE * argument expression multiple times when implemented as a macro. So
if this expression contained side-effects there would be problems. E.g.
getc(*openstreams++) invokes undefined behaviour because openstreams may
be modified more than once between sequence points in the macro expansion.

But this is a user code issue, not a getc() implementation issue.

Dan
 
Lawrence Kirby

If their stream argument is an expression with side effects, they are
effectively modifying their stream arguments and there is nothing in the
standard prohibiting them from doing that.

The argument in question to putc() and getc() is a pointer of type FILE
*, this must not be modified. Indeed it doesn't have to be an lvalue so
getc() and putc() macros wouldn't work in general if they tried to modify
it. Of course what the argument points to can be modified (subject to
the implementation details of FILE).

Maybe the problem here is that the concept of "stream argument" is
incorrect or at least imprecise. The stream isn't the argument to the
function, the FILE * pointer is. That is the level we have to think at
when interpreting the statement "a function cannot modify its arguments".

But this is a user code issue, not a getc() implementation issue.

It is a discussion of what getc() and putc() are allowed to do by the
standard. As part of that discussion it is reasonable to give examples
of code that demonstrates characteristics of that behaviour, including
when it is undefined.

Lawrence
 
Lawrence Kirby

A function can modify one of its arguments if another argument points
to it (though you have to be careful about undefined behavior).

Not really, which is obvious when you stop to consider that the argument
to a function is a value, not an lvalue. Consider

int x = 42;

then

f(x, &x);

and

f(x+0, &x);

are equivalent in C (for fun don't forget f(+x, &x)). Is x the argument
in all of these? No, but its value is. From this perspective talking about
modifying a function's argument makes no sense - there is nothing to
modify; a value has no persistence, only the contents of an object and
things hidden behind standard library functions (such as streams) do.
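
A small sketch of the point, with a hypothetical body for f (the thread never defines one). The first parameter is a copy of a value, so assigning to it cannot reach x; only the pointer gives f a way to modify the caller's object:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical f: receives a copy of a value and a pointer to an object. */
int f(int v, int *p)
{
    assert(p != NULL);
    *p = 0;     /* modifies the object the caller designated with &x */
    v += 1;     /* modifies only f's own parameter, a local copy */
    return v;
}
```

Called as f(x, &x), f(x+0, &x) or f(+x, &x) with x equal to 42, each call receives the value 42 and behaves identically.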

However "a function cannot modify its arguments" is a common enough
assertion, so what does it mean? Maybe "IF you supply an lvalue as a
function argument the function cannot use that lvalue to modify what it
designates". Naturally modifying the designated object through an lvalue
derived by some other means is possible.

There could also be issues with sequence points. C99 7.1.4p1 says:

Any invocation of a library function that is implemented as a
macro shall expand to code that evaluates each of its arguments
exactly once, fully protected by parentheses where necessary, so
it is generally safe to use arbitrary expressions as arguments.

with a footnote:

Such macros might not contain the sequence points that the
corresponding function calls do.

For example, the expression sin(x) + cos(x) can invoke undefined
behavior because both calls can set errno, and if sin() and cos() are
implemented as macros there may not be a sequence point where it's
needed. There was a discussion of this in comp.std.c a few years
ago; see
<http://groups.google.com/groups?as_umsgid=<[email protected]>>

True that is another area where macros for standard library functions can
behave differently.

Lawrence
 
Dan Pop

Lawrence Kirby said:
The argument in question to putc() and getc() is a pointer of type FILE
*, this must not be modified. Indeed it doesn't have to be an lvalue so
getc() and putc() macros wouldn't work in general if they tried to modify
it. Of course what the argument points to can be modified (subject to
the implementation details of FILE).

Maybe the problem here is that the concept of "stream argument" is
incorrect or at least imprecise. The stream isn't the argument to the
function, the FILE * pointer is. That is the level we have to think at
when interpreting the statement "a function cannot modify its arguments".

There is no confusion if you read the standard:

7.19.7.5 The getc function

Synopsis

1 #include <stdio.h>
int getc(FILE *stream);

Description

2 The getc function is equivalent to fgetc, except that if it is
implemented as a macro, it may evaluate stream more than once,
so the argument should never be an expression with side effects.

So, "stream" is the name of the FILE pointer argument.

It is a discussion of what getc() and putc() are allowed to do by the
standard.

And they *are* allowed to change their stream argument, by multiple
evaluation, if it's something like fp++. The restriction is on the user
code not to call them with such an argument.
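
A sketch of how user code can honour that restriction: fetch the FILE pointer into a plain variable, perform the side effect in its own statement, and only then call getc(). The function name read_from_next and the array-plus-index arrangement are assumptions made for illustration:

```c
#include <assert.h>
#include <stdio.h>

/* Reads one character from streams[*index], then advances *index.
   The increment is a separate statement, so it happens exactly once
   no matter how often a getc macro evaluates its argument. */
int read_from_next(FILE **streams, size_t *index)
{
    FILE *fp;

    assert(streams != NULL && index != NULL);

    fp = streams[*index];   /* fetch the pointer into a plain variable */
    (*index)++;             /* the side effect happens here, once */
    return getc(fp);        /* the argument has no side effects */
}
```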

Dan
 
