Compiler trick

R

Razvan

Hi






What do you think about the following code ?

public class CDummy
{
public static void main(String args[])
{
System.out.println("CDummy.");

int xx = 7, yy;

if (xx < 10) yy = 10;
else yy = 20;

System.out.println(yy);
}
}



Just try to guess what the output will be without compiling/running.

How about the next code:

public class CDummy
{
public static void main(String args[])
{
System.out.println("CDummy.");

int xx = 7, yy;

if (xx < 10) yy = 10;
if (xx >= 10) yy = 20;

System.out.println(yy);
}
}

Have you guessed right in both cases ? (on Sun's jdk1.4 the above
mentionated classes do not yield the same result)




Regards,
Razvan
 
A

Andrew Thompson


Hi Razvan, could you not put so many blank
lines in your posts, and do not alos not
post 'tab' characters?
What do you think about the following code ?

Not much.

I replaced your code (the second of which
does not compile), with this version that
is self contained and does compile.

<sscce>
public class CDummy1 {

CDummy1() {
System.out.print("CDummy 1\t ");

int xx = 7, yy;

if (xx < 10) yy = 10;
else yy = 20;

// I expect 10
System.out.println(yy);
}

public static void main(String args[]) {
new CDummy1();
new CDummy2();
}
}

/** Your second CDummy class did not compile. */
class CDummy2 {

CDummy2() {
System.out.print("CDummy 2\t ");

// initinalise yy to -1
int xx = 7, yy=-1;

if (xx < 10) yy = 10;
if (xx >= 10) yy = 20;

// I expect 10
System.out.println(yy);
}
}
Just try to guess what the output will be without compiling/running.

...see comments in code
Have you guessed right in both cases ?

Not so much a guess, but yes, I got it
right in the case of my code, ..
..(on Sun's jdk1.4 the above
mentionated classes do not yield the same result)

...whereas your second example would not
compile, so it is hard to say what result
you are talking about.
 
C

Chris Uppal

Andrew said:
/** Your second CDummy class did not compile. */

I think Razvan's point is that it doesn't compile.

I wouldn't expect it to, anyway. I haven't tested it (so I'm setting myself up
to look stupid here ;-) but I'd expect the compiler to reject the code because
'yy' is not initialised in every path.

We are cleverer than the compiler, so we can see that it /is/ initialised on
every path. But the important point is that the compiler is using specific
/rules/, which are part of the Java language definition, to determine what is
legal and what is not. According to those rules (which are deliberately
simpler than they have to be) 'yy' is not "definitely assigned" before its use
in the println().

Chapter 16 of the Java Language Specification (second edition) defines the
rules for definite assignment,

-- chris
 
J

Joona I Palaste

I think Razvan's point is that it doesn't compile.
I wouldn't expect it to, anyway. I haven't tested it (so I'm setting myself up
to look stupid here ;-) but I'd expect the compiler to reject the code because
'yy' is not initialised in every path.
We are cleverer than the compiler, so we can see that it /is/ initialised on
every path. But the important point is that the compiler is using specific
/rules/, which are part of the Java language definition, to determine what is
legal and what is not. According to those rules (which are deliberately
simpler than they have to be) 'yy' is not "definitely assigned" before its use
in the println().
Chapter 16 of the Java Language Specification (second edition) defines the
rules for definite assignment,

The fact that the compiler is not sentient and does not possess a human
sense of deduction causes frequent amazement among newbies. Just a few
weeks earlier we had a question why this kind of code does not compile:

public class MyClass {
public void doIt() {
/* ... */
}
}

public class Test {
public static void main(String[] args) {
List list = new ArrayList();
/* ... */
Object obj = list.get(0);
if (obj instanceof MyClass) {
obj.doIt();
}
}
}

I think the poster even expected this form of main() to compile:

public static void main(String[] args) {
List list = new ArrayList();
/* ... */
Object obj = list.get(0);
int i=500;
if (obj instanceof MyClass) {
i=401;
}
if (i%2 == 0) {
return;
}
obj.doIt();
}

If the deduction involves knowledge of the environment around the Java
program, then the question becomes even more interesting. =)
 
C

Chris Uppal

Joona said:
The fact that the compiler is not sentient and does not possess a human
sense of deduction causes frequent amazement among newbies.

<grin/>

This is one of the reasons why I think it important to emphasise not just that
the compiler /doesn't/ deduce these things, but that /it is not allowed to/.
That there are very definite Rules, and that those Rules are part of the
language definition.

Otherwise it becomes hard to understand why the compiler doesn't deduce the
legallity of constructions like:

Objet o = ...;
if (o instanceof Dog) { o.bark(); }

which is, after all, perfectly sensible. It just does not happen to be legal
Java....

-- chris
 
R

Razvan

Andrew said:
I think Razvan's point is that it doesn't compile.

Yes, indeed that was my point. My innitial prediction was
that the second code would also compile and print the same result, but
it did not. The Java compiler should be more consistent: it should
detect both cases ('if else' and 'if if') or it should complain on
both. I saw this code on a Java test; for such questions your only
solution is to remember how the compiler behaves. I don't like things
that you just have to remember.



Regards,
Razvan
 
R

Razvan

Hi Andrew,


I replaced your code (the second of which does not compile), with this version that is self contained and does compile.

I have mentioned that you should not try to run/compile.
Indeed, the second class is not compilable. It seems that you did not
anticipated that.




Razvan
 
J

Joona I Palaste

Yes, indeed that was my point. My innitial prediction was
that the second code would also compile and print the same result, but
it did not. The Java compiler should be more consistent: it should
detect both cases ('if else' and 'if if') or it should complain on
both. I saw this code on a Java test; for such questions your only
solution is to remember how the compiler behaves. I don't like things
that you just have to remember.

I disagree. The rules for "if else" are far easier, more deterministic,
and allow for less loopholes than your artificially constructed "if if"
scenario. When you have an "if else" construct, it is a fact that
*always* *exactly one* of the "if" and "else" branches shall be
followed.
However, in your "if if" case, you have to separate "if" statements,
which happen to coincide by having exactly opposite criteria. A human
will be able to look at both at the same time, understand that the
value of an int variable is *always* *either* more than 10 *or* less
than or equal to 10 (paraphrasing your original code), but
automatically deducing this from a set of program statements requires
quite an advanced algorithm, when done by an automaton that can only
ever do what it is told to do, with no intuition or bright ideas of
its own. Here's a rough outline of what it would have to do:
- Note the first "if" statement and store both operands of its
criterion to memory for late comparison,
- Note the execution branch of the "if" statement as a currently open
execution path,
- Note that there is no code between the two "if" statements that might
diverge program flow away from the second "if" statement,
- Note the second "if" statement and store both operands of its
criterion to memory,
- Note the execution branch of the "if" statement as a currently open
execution path,
- Compare the operands of both "if" statements, seeing that the first
operand is the same variable, the second operand is the same constant,
and the operators are directly opposite,
- Deduce that the criteria of the two "if" statements are directly
opposite,
- Use this information to close both open execution paths,
And only then proceed as it would have done *straight away* in the
"if else" case.
What if the variable in the second "if" statement was not i, but j, and
j was assigned a value from i? What if that assignment was done not in
the normal execution path, but an "if" statement whose criterion just
happened to be always true? What if there was a reassignment to j later,
but in a "for" loop that iterates exactly 0 times? What if j was first
decremented by two, then incremented by one, then again incremented by
one?
Go on, *you* try to write a formal, deterministic algorithm to take
care of all this. I suppose there are highly-paid research groups in a
couple of universities working on it. If you produce successful reports,
you could write an article about it in a computing magazine.
The issue gets even more interesting when the criteria for the "if"
statements come from methods in different objects. Sooner or later the
Java compiler will essentially have to run the program in order to find
out if it can compile it.
But for the "if else" case, there is only one simple rule: In every
"if else", no matter where it appears or what its criterion or
background state are, *exactly one* of its branches will *always*
execute. Not both, not neither, *exactly one*. All the compiler has to
do is to spot an "if" with an "else" and instantly arrive at this
conclusion.
 
P

P.Hill

Jacob said:
couldn't some external factor (i.e. a thread) cause
_none_ of the two tests to be true; If the premises
of "clause" is changed just after the first test?

The particular example only included local variables, but as
pointed out by Joona P. such analysis requires complete understanding of all
expressions and paths which is not a very practicle approach for a compiler.

-Paul
 
J

Jacob

Chris said:
We are cleverer than the compiler, so we can see that it /is/ initialised on
every path.

I'm not a thread expert, but in general:

if (clause)
statement1;
if (!clause)
statement2;

couldn't some external factor (i.e. a thread) cause
_none_ of the two tests to be true; If the premises
of "clause" is changed just after the first test?

This could never be the case for

if (clause)
statement1;
else
statement2;
 
R

Raza S. Ali

So the problem here, is the way java handle's Control Flow. When it sees
an if statement, it will ignore all code before that.
This stems from the fact that java is *mostly* a structural programming
language, and that it compiles the entire program before
evaluating one line at a time.

It looks at the first if, and says, ok here is something that clause can
see.

At this point the first if statement is over, so it looks at the second
one. Even though every possibillity appears to
accounted for, it turns out that instead, the comiler doesn't hold enough
state to know this. So you must either explicitly
initilize, or use if - else instead. Just a caveat of java.

-Raza
 
J

Joona I Palaste

Raza S. Ali said:
So the problem here, is the way java handle's Control Flow. When it sees
an if statement, it will ignore all code before that.
This stems from the fact that java is *mostly* a structural programming
language, and that it compiles the entire program before
evaluating one line at a time.

This part is true enough.
It looks at the first if, and says, ok here is something that clause can
see.
At this point the first if statement is over, so it looks at the second
one. Even though every possibillity appears to
accounted for, it turns out that instead, the comiler doesn't hold enough
state to know this. So you must either explicitly
initilize, or use if - else instead. Just a caveat of java.

"Just a caveat of java"? I challenge you to implement *any* compiled
programming language that can do this automatic verification of every
condition of every branch of execution code without actually running
the problem. I wrote some theoretical aspects about it another reply,
you might want to read it.
It's not that *JAVA* explicitly refuses to do a full control flow
analysis at compile time. It's that such an analysis is inherently
extremely difficult to design, let alone implement.
 
C

Chris Uppal

Razvan said:
Yes, indeed that was my point. My innitial prediction was
that the second code would also compile and print the same result, but
it did not. The Java compiler should be more consistent: it should
detect both cases ('if else' and 'if if') or it should complain on
both. I saw this code on a Java test; for such questions your only
solution is to remember how the compiler behaves. I don't like things
that you just have to remember.

I agree. Whenever any design has arbitrary elements then you "just have to
remember them", and that is /not/ good.

But in this case, the Java designers had to make a somewhat arbitrary
decision -- they had to decide what the Java compiler would and would not be
/guaranteed/ to figure out. They also took the sensible (IMO) decision that
anything that it wasn't /required/ to be able to work out it /must/ reject.
Since they had to make an arbitrary decision, they choose to keep the rules
simple. The rules are still arbitrary, but I think that you'll agree that if
you "just had to remember" anything, then it's better if that thing is simple
than if it is complicated.

So the rules for Java are simple, but that means that the compiler can't --
isn't allowed to be -- as clever as you sometimes would want. The advantage is
that, once you have learned the rules, you /know/ what the compiler will do in
every case -- you don't have to experiment to find out (or even worse, "just
remember")what this /particular/ compiler will do in this /particular/ case.

In this example, the "odd" result is because the Java compiler sees the
code as two unconnected conditionals, both of which are of the form
if (...)
...code...
(i.e. with no "else"). The Rules state that a variable is only initialised if
it is assigned on /both/ forks of a condition (in both the "if" and the
"else"), otherwise the compiler is forced to reject the code. Since the
compiler isn't allowed to see your two conditionals as a rather strange way of
writing a single if-then-else (it isn't allowed to "realise" that the two
conditions are opposites) it has to report an error.

I urge you to read the Java language specification. Not because I think it's
something that you /ought/ to do (though it is if you want to be a top class
Java programmer, IMO), but because I think it is something you would probably
find interesting and valuable. Judging by your posts here, you are interested
in what's "really" going on and want to know more than just the superficial
details. If that's the case, then sooner or later you are going to have to
start reading /real/ specifications, like the JLS, because otherwise your
curiosity will /never/ be satisfied. (BTW, don't expect to be able to take it
all in at one read -- I doubt if anyone could do that -- but you can get to
know what's in it, and read (and re-read) bits of it as and when they are
relevant to you.)

-- chris
 
D

dar7yl

Chris Uppal said:
Otherwise it becomes hard to understand why the compiler doesn't deduce
the
legallity of constructions like:

Objet o = ...;
if (o instanceof Dog) { o.bark(); }

The corrected code is:

<code>
Object o = ...;
if (o instanceof Dog)
{
( (Dog) o ).bark();
}
</code>

The rational is that the base Object o does not contain the method 'bark',
so you have to cast it into the required class 'Dog'.

Also, note that the cast of 'o' into 'Dog' is valid at this point, because
it is protected by the 'instanceof' condition enclosing it's use. If you
didn't have the if statement, and tried to cast any object, chances it would
fail with a conversion error exception, assuming a random object 'o'.

regards,
Dar7yl ([email protected])
 
P

P.Hill

dar7yl said:
Object o = ...;
if (o instanceof Dog)
{
( (Dog) o ).bark();
}

Casting is another place where the compiler
is not going to help you, even if what is going
on to you is obvious.

-Paul
 
J

Jim Janney

Joona I Palaste said:
I disagree. The rules for "if else" are far easier, more deterministic,
and allow for less loopholes than your artificially constructed "if if"
scenario. When you have an "if else" construct, it is a fact that
*always* *exactly one* of the "if" and "else" branches shall be
followed.
However, in your "if if" case, you have to separate "if" statements,
which happen to coincide by having exactly opposite criteria. A human
will be able to look at both at the same time, understand that the
value of an int variable is *always* *either* more than 10 *or* less
than or equal to 10 (paraphrasing your original code), but
automatically deducing this from a set of program statements requires
quite an advanced algorithm, when done by an automaton that can only
ever do what it is told to do, with no intuition or bright ideas of
its own. Here's a rough outline of what it would have to do:
- Note the first "if" statement and store both operands of its
criterion to memory for late comparison,
- Note the execution branch of the "if" statement as a currently open
execution path,
- Note that there is no code between the two "if" statements that might
diverge program flow away from the second "if" statement,
- Note the second "if" statement and store both operands of its
criterion to memory,
- Note the execution branch of the "if" statement as a currently open
execution path,
- Compare the operands of both "if" statements, seeing that the first
operand is the same variable, the second operand is the same constant,
and the operators are directly opposite,
- Deduce that the criteria of the two "if" statements are directly
opposite,
- Use this information to close both open execution paths,
And only then proceed as it would have done *straight away* in the
"if else" case.
What if the variable in the second "if" statement was not i, but j, and
j was assigned a value from i? What if that assignment was done not in
the normal execution path, but an "if" statement whose criterion just
happened to be always true? What if there was a reassignment to j later,
but in a "for" loop that iterates exactly 0 times? What if j was first
decremented by two, then incremented by one, then again incremented by
one?
Go on, *you* try to write a formal, deterministic algorithm to take
care of all this. I suppose there are highly-paid research groups in a
couple of universities working on it. If you produce successful reports,
you could write an article about it in a computing magazine.
The issue gets even more interesting when the criteria for the "if"
statements come from methods in different objects. Sooner or later the
Java compiler will essentially have to run the program in order to find
out if it can compile it.
But for the "if else" case, there is only one simple rule: In every
"if else", no matter where it appears or what its criterion or
background state are, *exactly one* of its branches will *always*
execute. Not both, not neither, *exactly one*. All the compiler has to
do is to spot an "if" with an "else" and instantly arrive at this
conclusion.

In the general case, deciding whether or not a variable will have been
initialized at a certain point in a program's execution is equivalent
to solving the halting problem: in other words, it's known to be
computationally undecidable. For more information, google on Rice's
theorem. However, there are useful special cases for which the
question *is* decidable, and Java takes advantage of some of those.
As has been said elsewhere in this thread, we want to able to write
code that will compile on every implementation of Java, so compilers
are not allowed to go beyond the rules specified in the language
definition.

I'm not sure that having the compiler check for this is useful in
practice, but once that decision has been made this is a practical way
to implement it.
 
J

Joona I Palaste

The corrected code is:
<code>
Object o = ...;
if (o instanceof Dog)
{
( (Dog) o ).bark();
}
</code>
The rational is that the base Object o does not contain the method 'bark',
so you have to cast it into the required class 'Dog'.
Also, note that the cast of 'o' into 'Dog' is valid at this point, because
it is protected by the 'instanceof' condition enclosing it's use. If you
didn't have the if statement, and tried to cast any object, chances it would
fail with a conversion error exception, assuming a random object 'o'.

Your explanation is correct, however I think that Chris already knew
that.
 
F

FISH

[snipped...]
- Note that there is no code between the two "if" statements that might
diverge program flow away from the second "if" statement,

In a multi-threaded environment this is even more of a headache, as
even two consecutive 'if' statements that have inverse logic can be
either both or neither run. Okay - this is less of a problem for
variables of local scope - but even if the OP does comes up with such
a deterministic algorithm, multi-threading would mean it could only
be applied to data members which the compiler could prove would be
unique only to that thread of execution. Surely?


-FISH- ><>
 
J

Joona I Palaste

In a multi-threaded environment this is even more of a headache, as
even two consecutive 'if' statements that have inverse logic can be
either both or neither run. Okay - this is less of a problem for
variables of local scope - but even if the OP does comes up with such
a deterministic algorithm, multi-threading would mean it could only
be applied to data members which the compiler could prove would be
unique only to that thread of execution. Surely?

You are right about this. However, if the variables in question are at
method scope rather than class scope or instance scope, you don't need
to worry about multi-threading. Each thread gets its own private copy
of method-scope variables but has to share class-scope and instance-
scope variables with other threads. And ever since the deprecation of
Thread.stop(), one thread cannot directly force another thread to change
its course of execution against that thread's will. If we have two
threads, one of which does A and B, the other does C and D, then the
actual physical sequence may be A-B-C-D, A-C-B-D, A-C-D-B, C-A-B-D,
C-A-D-B or C-D-A-B. It is always guaranteed that all four operations
are done, A is done before B, and C is done before D. Nothing else
about the ordering is guaranteed. (This is assuming a strictly
sequential intra-thread execution with no control structures.)
 
C

Chris Smith

FISH said:
In a multi-threaded environment this is even more of a headache, as
even two consecutive 'if' statements that have inverse logic can be
either both or neither run. Okay - this is less of a problem for
variables of local scope - but even if the OP does comes up with such
a deterministic algorithm, multi-threading would mean it could only
be applied to data members which the compiler could prove would be
unique only to that thread of execution. Surely?

None of that is wrong, but you make it sound harder than it is. A local
variable is *always* unique to a thread of execution, so the "proof" on
the part of the compiler is trivial. This isn't like C where you could
pass around pointers to your local variables; a local variable is only
accessed by one thread, within the context of a single method
invocation. No exceptions.

For non-local variables, this might be a concern between threads... but
if it were, it would mean that you've written broken code. To read a
variable twice, when it may have been changed in the interim, without
crossing a synchronization barrier in the interim (i.e., entering or
leaving a synchronized method or block) would indicate that your code
has undefined behavior. Fix that, and you wouldn't have to worry any
more. That said, for non-local variables it *might* be a concern that
the first 'if' block could modify the state of the condition; and that
may very well be impossible for the compiler to prove doesn't occur.

Note that I'm not disagreeing that the currently defined rules are quite
sensible; they are. Just pointing out that multithreading isn't as
mysterious as it's being made out.

--
www.designacourse.com
The Easiest Way to Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top