Extent of the "as-if" rule

Sidney Cadot

Hi all,


In a discussion with Tak-Shing Chan the question came up whether the
as-if rule can cover I/O functions. Basically, he maintains it can, and
I think it doesn't.

Consider two programs:

/*** a.c ***/
#include <stdio.h>
int main(void)
{
fopen("somefile","rb");
return 0;
}

/*** b.c ***/
int main(void)
{
return 0;
}

Would it be legal for a compiler (through optimization), to emit the
same code for program a.c and b.c ?

I'd welcome a reference from the standard.

Best regards,

Sidney
 
Johan Lindh

Sidney said:
Hi all,


In a discussion with Tak-Shing Chan the question came up whether the
as-if rule can cover I/O functions. Basically, he maintains it can, and
I think it doesn't.

Consider two programs:

/*** a.c ***/
#include <stdio.h>
int main(void)
{
fopen("somefile","rb");
return 0;
}

/*** b.c ***/
int main(void)
{
return 0;
}

Would it be legal for a compiler (through optimization), to emit the
same code for program a.c and b.c ?

I'd welcome a reference from the standard.

Best regards,

Sidney

I'll take a stab at this...

Since fopen() is a function (15.2), it may therefore have side effects.
Optimizing it away would therefore be an error.

Perhaps the OS it's running on starts up a pot of coffee if someone
opens the 'somefile' file for reading, and that's the intended effect.

/J
 
Tristan Miller

Greetings.

Consider two programs:

/*** a.c ***/
#include <stdio.h>
int main(void)
{
fopen("somefile","rb");
return 0;
}

/*** b.c ***/
int main(void)
{
return 0;
}

Would it be legal for a compiler (through optimization), to emit the
same code for program a.c and b.c ?

I'd welcome a reference from the standard.

I don't know what the standard says, but from an implementation point of
view it might make sense not to optimize the fopen() away. Many file
systems maintain a "last accessed" timestamp for files; therefore, the
seemingly useless fopen() does indeed modify the environment.

Regards,
Tristan
 
Dan Pop

The as-if rule basically says that any optimisation/pessimisation is
allowed, as long as the *specified* output of the program is not affected.

Yes, because both programs generate the same output, according to the
description of the abstract machine. Opening a file in read mode and
closing it (if the opening succeeded) does not generate any output.

Johan Lindh said:
I'll take a stab at this...

Since fopen() is a function (15.2), it may therefore have side effects.
Optimizing it away would therefore be an error.

Perhaps the OS it's running on starts up a pot of coffee if someone
opens the 'somefile' file for reading, and that's the intended effect.

The only *relevant* side effects of fopen are those specified in the C
standard.

Here is the standard reference:

5.1.2.3 Program execution

1 The semantic descriptions in this International Standard describe
the behavior of an abstract machine in which issues of optimization
are irrelevant.

2 Accessing a volatile object, modifying an object, modifying a
file, or calling a function that does any of those operations
are all side effects,11) which are changes in the state of the
execution environment. Evaluation of an expression may produce
side effects. At certain specified points in the execution
sequence called sequence points, all side effects of previous
evaluations shall be complete and no side effects of subsequent
evaluations shall have taken place. (A summary of the sequence
points is given in annex C.)

3 In the abstract machine, all expressions are evaluated as
specified by the semantics. An actual implementation need not
evaluate part of an expression if it can deduce that its value is
not used and that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).

Opening a file in read mode doesn't modify the file, therefore it doesn't
count as a side effect, in the context of the C standard.
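
As a minimal sketch of what paragraph 3 permits (the function names here are made up for illustration, not taken from the standard): the first function computes a value that is never used and has no side effects, so an implementation may skip the computation; the second reads a volatile object, which paragraph 2 lists as a side effect, so the read has to happen.

```c
/* Hypothetical illustration of 5.1.2.3 paragraph 3. */

/* The quotient is never used and computing it has no side effects,
   so an implementation may omit the division entirely. */
int unused_value(int x)
{
    (void)(x / 2);
    return x;
}

/* Accessing a volatile object is a side effect, so the read of *reg
   must be performed even though the value is then discarded. */
int volatile_read(volatile int *reg)
{
    int discarded = *reg;
    (void)discarded;
    return 0;
}
```

Whether a particular compiler actually drops the division is a quality-of-implementation matter; the standard only says it may.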

Dan
 
Sidney Cadot

Cross-posted to comp.std.c from a discussion in comp.lang.c; feedback
welcomed.


Dan said:
The as-if rule basically says that any optimisation/pessimisation is
allowed, as long as the *specified* output of the program is not affected.




Yes, because both programs generate the same output, according to the
description of the abstract machine. Opening a file in read mode and
closing it (if the opening succeeded) does not generate any output.




The only *relevant* side effects of fopen are those specified in the C
standard.

Here is the standard reference:

5.1.2.3 Program execution

1 The semantic descriptions in this International Standard describe
the behavior of an abstract machine in which issues of optimization
are irrelevant.

2 Accessing a volatile object, modifying an object, modifying a
file, or calling a function that does any of those operations
are all side effects,11) which are changes in the state of the
execution environment. Evaluation of an expression may produce
side effects. At certain specified points in the execution
sequence called sequence points, all side effects of previous
evaluations shall be complete and no side effects of subsequent
evaluations shall have taken place. (A summary of the sequence
points is given in annex C.)

3 In the abstract machine, all expressions are evaluated as
specified by the semantics. An actual implementation need not
evaluate part of an expression if it can deduce that its value is
not used and that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).

Opening a file in read mode doesn't modify the file, therefore it doesn't
count as a side effect, in the context of the C standard.

... But that just dismisses one of the possible 'side effects' admitted
by the standard.

Does a fopen(name, "rb") count as 'calling a function that does any of
those operations' ? I think it does; it /has/, at some point, to
interact with the abstract machine's environment, which can only be done
via volatile objects or modifying an object, eventually, somewhere
down the line.

Anyway, we can pick on words (and that's valuable) but if anything, this
shows that the enumeration of 'side effects' as given in the standard
is not exhaustive. Clearly, opening a file (even for reading) interacts
with the outside world, e.g. on unix systems it updates the 'last
accessed' date as pointed out. This is not unambiguously covered by the
standard; if upon literal reading we were to conclude that it isn't
properly covered - well, then the standard needs to be mended on the
next occasion.

Best regards,

Sidney
 
Jack Klein

The as-if rule basically says that any optimisation/pessimisation is
allowed, as long as the *specified* output of the program is not affected.


Yes, because both programs generate the same output, according to the
description of the abstract machine. Opening a file in read mode and
closing it (if the opening succeeded) does not generate any output.


The only *relevant* side effects of fopen are those specified in the C
standard.

Here is the standard reference:

5.1.2.3 Program execution

1 The semantic descriptions in this International Standard describe
the behavior of an abstract machine in which issues of optimization
are irrelevant.

2 Accessing a volatile object, modifying an object, modifying a
file, or calling a function that does any of those operations
are all side effects,11) which are changes in the state of the
execution environment. Evaluation of an expression may produce
side effects. At certain specified points in the execution
sequence called sequence points, all side effects of previous
evaluations shall be complete and no side effects of subsequent
evaluations shall have taken place. (A summary of the sequence
points is given in annex C.)

3 In the abstract machine, all expressions are evaluated as
specified by the semantics. An actual implementation need not
evaluate part of an expression if it can deduce that its value is
not used and that no needed side effects are produced (including
any caused by calling a function or accessing a volatile object).

Opening a file in read mode doesn't modify the file, therefore it doesn't
count as a side effect, in the context of the C standard.

Given your reasoning, and I see nothing to argue with, optimizing away
the fopen() is a perfectly acceptable application of the as-if rule on
the typical *NIX system, where opening a file leaves no trace in the
system.

It would not be acceptable on a Win32 system, and quite possibly other
systems. Win32 keeps multiple time values for files in the file
system, one of which is the last accessed time. An open of an
existing file followed by a close without reading or writing will
modify the file system's last access time, which does modify the file
even though it does not change the contents.

[also cross-posted to comp.std.c]
 
Ben Pfaff

Jack Klein said:
Given your reasoning, and I see nothing to argue with, optimizing away
the fopen() is a perfectly acceptable application of the as-if rule on
the typical *NIX system, where opening a file leaves no trace in the
system.

Most Unix-like kernels, including Linux, also maintain "last
accessed" times. I don't know why you think they don't.
 
Jack Klein

Most Unix-like kernels, including Linux, also maintain "last
accessed" times. I don't know why you think they don't.

Doh! I probably knew that, although I haven't had anything running
Linux in quite a while. It's time of creation that they don't keep
separately, right?

In any case the real question is:

Given:

-Compilers that generate executables for platforms A and B

-Platform A does not change the externally visible state of its file
system in any way when an existing file is opened and closed with no
actual access in between.

-Platform B does change the externally visible state of its file
system in some way (e.g., updating a last accessed time stamp field
associated with the file) under the same circumstances.

-The following program:

#include <stdio.h>
int main(void)
{
fopen("somefile","rb");
return 0;
}

...where the parameters passed to fopen() are such that the call
succeeds.

Then does it follow:

-A compiler for platform A may omit the fopen() call under the as-if
rule

-A compiler for platform B may not omit the fopen() call

...even though the output of the program itself is the same in either
case.
 
Douglas A. Gwyn

Jack said:
Given your reasoning, and I see nothing to argue with, optimizing away
the fopen() is a perfectly acceptable application of the as-if rule on
the typical *NIX system, where opening a file leaves no trace in the
system.

But that is *not* the way that Unix (POSIX) works!
Opening a file *does* have side effects in the environment.
 
Douglas A. Gwyn

Jack said:
Then does it follow:
-A compiler for platform A may omit the fopen() call under the as-if
rule
-A compiler for platform B may not omit the fopen() call
...even though the output of the program itself is the same in either
case.

The question is whether there is any way, other than timing,
code size, and other aspects deemed "inessential", to detect
whether the code actually performs the call. In a way, this
is all a waste of time, because no compiler that I know of
would optimize away a call to fopen(). Some *would* optimize
away pointless calls to strcmp() etc., in contexts where they
know that a standard library function is involved (so that
its complete semantics are known) and known not to have any
observable effect in the particular case.
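
A small sketch of the kind of call meant here, with a made-up function name: the result of strcmp() is discarded, and strcmp() only reads its arguments, so a compiler that knows the library semantics is free to drop the call. Nothing in this example claims that any specific compiler does so.

```c
#include <string.h>

/* Hypothetical example: the comparison result is thrown away, and
   strcmp() modifies no object and no file, so eliding the call
   cannot change the program's observable behaviour. */
int length_with_pointless_compare(const char *s)
{
    (void)strcmp(s, "reference");   /* candidate for removal */
    return (int)strlen(s);          /* value is used, so this stays */
}
```

The contrast with fopen() is that its complete effect on the environment is not something the compiler can know from the standard alone.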

Is there some *real* issue lying behind this topic, or is it
just a matter of pedanticism?
 
Chris Torek

Most Unix-like kernels, including Linux, also maintain "last
accessed" times. I don't know why you think they don't.

Moreover, if the file name corresponds to a fifo or named-pipe,
simply opening the file for reading can have an effect. In
particular, opening a fifo for reading will unblock a process
that is suspended in an attempt to write to that fifo:

$ mkfifo foo
$ echo hello > foo &
$ jobs
[1] 18389 echo hello >foo
$ echo zog < foo &
$ zog
jobs
[1] 18389 Exit 0 echo hello >foo
[2] 18390 Exit 0 echo zog <foo
$

If you do this in the opposite order, the attempt to read from the
fifo before there are any writers causes the reading program to
hang until someone opens the fifo for writing. Either way it is
obvious that something has happened, even though it is outside the
limited domain of portable C programming.
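
The same effect can be sketched in C, assuming a POSIX system (mkfifo, fork and waitpid are POSIX, not ISO C, and the function name is invented for this example). The child blocks in open() for writing until the parent's otherwise "useless" read-mode fopen() arrives.

```c
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if the child's blocked open-for-write completed once the
   parent opened the FIFO for reading, -1 on any failure. */
int fifo_open_unblocks_writer(const char *path)
{
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {
        /* Child: blocks here until some process opens the FIFO for reading. */
        int fd = open(path, O_WRONLY);
        _exit(fd >= 0 ? 0 : 1);
    }

    /* Parent: nothing is read or written, yet this call releases the child. */
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        kill(pid, SIGKILL);         /* do not leave the child blocked forever */

    int status = -1;
    waitpid(pid, &status, 0);
    if (fp != NULL)
        fclose(fp);
    unlink(path);

    return (fp != NULL && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

If an implementation removed the fopen() call, the child would stay blocked, which is observable to other processes even though this program produces no output.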
 
Barry Margolin

Ben Pfaff said:
Most Unix-like kernels, including Linux, also maintain "last
accessed" times. I don't know why you think they don't.

But it may only be updated if you actually read something from the file;
the act of opening the file in read mode might not update it (consider a
file on an NFS server -- there's nothing in NFS that corresponds to
open() or close(), the server only sees the directory lookup and the
read/write operations).
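
Whether a bare open updates the timestamp on a given system can be checked rather than argued. The following is a rough probe, assuming POSIX stat(); the function name is made up, and the result depends on the file system and mount options (NFS, noatime and similar), so it shows what one system does, not what the standard requires.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Returns 1 if st_atime differs after an fopen/fclose pair with no read
   in between, 0 if it is unchanged, -1 on error. st_atime has one-second
   resolution, so an access within the same second goes unnoticed. */
int open_changes_atime(const char *path)
{
    struct stat before, after;

    if (stat(path, &before) != 0)
        return -1;

    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;
    fclose(fp);

    if (stat(path, &after) != 0)
        return -1;

    return before.st_atime != after.st_atime;
}
```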
 
those who know me have no need of my name

in comp.std.c i read:
On 19 Jan 2004 16:13:33 GMT, Dan Pop wrote in
comp.lang.c:

Given your reasoning, and I see nothing to argue with, optimizing away
the fopen() is a perfectly acceptable application of the as-if rule on
the typical *NIX system, where opening a file leaves no trace in the
system.

actually unices typically do record the time a file was opened. dan might
argue that by itself that isn't sufficient, and in some ways i agree, but
there remains a flaw: there's no way to know whether the FILE object
contains a volatile member or would call the environment since it's
implementation defined.
 
CBFalconer

Douglas A. Gwyn said:
The question is whether there is any way, other than timing,
code size, and other aspects deemed "inessential", to detect
whether the code actually performs the call. In a way, this
is all a waste of time, because no compiler that I know of
would optimize away a call to fopen(). Some *would* optimize
away pointless calls to strcmp() etc., in contexts where they
know that a standard library function is involved (so that
its complete semantics are known) and known not to have any
observable effect in the particular case.

Is there some *real* issue lying behind this topic, or is it
just a matter of pedanticism?

I think it is purely pedantic. At any rate there is no telling
what the act of opening a file involves, regardless of system. I
have built systems where the act of opening a particular named
file did some major re-arrangement of the entire i/o system, and
the act of closing that file put it back. This kept the
application proper coding standard and restricted the magic to
file system drivers.
 
Sidney Cadot

Douglas said:
The question is whether there is any way, other than timing,
code size, and other aspects deemed "inessential", to detect
whether the code actually performs the call. In a way, this
is all a waste of time, because no compiler that I know of
would optimize away a call to fopen(). Some *would* optimize
away pointless calls to strcmp() etc., in contexts where they
know that a standard library function is involved (so that
its complete semantics are known) and known not to have any
observable effect in the particular case.

Is there some *real* issue lying behind this topic, or is it
just a matter of pedanticism?

As for me, it's mostly curiosity about the standard's wording:

"Accessing a volatile object, modifying an object, modifying a
file, or calling a function that does any of those operations
are all side effects"

... If this would just say "accessing a file" instead of
"modifying a file" the issue would not exist. I wonder if there is a
good reason this particular wording was chosen, and - if not - would
like to see it changed. There's no reason not to strive for perfection
if it is essentially free.

Another thing is that I think the standard's way of defining a "side
effect" (by enumeration of cases) is flawed. This is a bit like defining
mammals as "primates, whales, furry animals, ... (and so on)", which
works fine until you find a platypus.

Surely, there has to be a more generic way of defining a side effect.

Best regards,

Sidney
 
Robert Wessel

Sidney Cadot said:
... But that just dismisses one of the possible 'side effects' admitted
by the standard.

Does a fopen(name, "rb") count as 'calling a function that does any of
those operations' ? I think it does; it /has/, at some point, to
interact with the abstract machine's environment, which can only be done
via volatile objects or modifying an object, eventually, somewhere
down the line.

Anyway, we can pick on words (and that's valuable) but if anything, this
shows that the enumeration of 'side effects' as given in the standard
is not exhaustive. Clearly, opening a file (even for reading) interacts
with the outside world, e.g. on unix systems it updates the 'last
accessed' date as pointed out. This is not unambiguously covered by the
standard; if upon literal reading we were to conclude that it isn't
properly covered - well, then the standard needs to be mended on the
next occasion.


The point is that the change you're describing (the last accessed date)
is *not* visible to a standard C program, and an implementation could
therefore claim to be conforming even if it removed the open. I think
we'd all agree that such an implementation would have serious QoI
issues.
 
Christian Bau

CBFalconer said:
I think it is purely pedantic.

It is purely pedantic because no existing compiler will remove the call
to fopen ().

On the other hand: If you are a compiler writer and you want to remove
this kind of call, then you have to _prove_ that the C Standard allows
it. If you are an application programmer and you want to make sure that
the call is not removed, then you could write

volatile FILE* p = fopen ("my file", "options");

and that will make it damned hard for the compiler writer to optimise
the call away.
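
Another way to pin the call down, sketched here with a hypothetical function name, is to make the program's result depend on what fopen() returns. Paragraph 3 only lets the implementation skip an evaluation whose value is not used, and here the value is used.

```c
#include <stdio.h>

/* Returns 1 if the named file could be opened for reading, 0 otherwise.
   Because the caller sees the outcome, the open cannot be elided unless
   the implementation can predict that outcome. */
int file_is_readable(const char *name)
{
    FILE *fp = fopen(name, "rb");
    if (fp == NULL)
        return 0;
    fclose(fp);
    return 1;
}
```

A side note on the declaration above: `volatile FILE *p` qualifies the pointed-to FILE object, whereas `FILE *volatile p` makes the pointer itself volatile, so that storing the result of fopen() into it counts as a volatile access.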
 
Dan Pop

Douglas A. Gwyn said:
But that is *not* the way that Unix (POSIX) works!
Opening a file *does* have side effects in the environment.

But not according to the C standard (unless you can provide a chapter and
verse). Therefore, removing the fopen() call does not affect the
implementation's conformance to the C standard (which provides a complete
list of what it considers side effects).

Dan
 
James Kuyper

Jack Klein wrote:
....
Doh! I probably knew that, although I haven't had anything running
Linux in quite a while. It's time of creation that they don't keep
separately, right?

No - separate times are kept for the creation date, last access, and the
last modification.
 
Keith Thompson

But not according to the C standard (unless you can provide a
chapter and verse). Therefore, removing the fopen() call does not
affect the implementation's conformance to the C standard (which
provides a complete list of what it considers side effects).

Side effects include "modifying a file". In a Unix filesystem, a
directory can be treated as a file; so can the physical device
containing the filesystem.

This is admittedly stretching the point.
 
