Eliminating "with"...

  • Thread starter Rasmus Kromann-Larsen

Rasmus Kromann-Larsen

The With Conundrum

I'm currently writing a master's thesis on (preparations for) static
analysis of JavaScript, and after investigating the with statement, it
has become even more evident to me that the with statement is indeed bad.

My initial thoughts on with were based on:
http://yuiblog.com/blog/2006/04/11/with-statement-considered-harmful/

I built some examples to investigate - and later try to "eliminate"
the with statement from the code I'm analysing. All examples besides
the first one were evaluated in Rhino.

-----------------------------------------------------

Example 1 (pseudo). Readability.

with(x) {
a = b
}

Could be evaluated as:

x.a = x.b
x.a = b
a = x.b
a = b

(And this only gets worse if you add more variables - exponentially
worse actually :)
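
The four readings can be reproduced at run time. The following is only a sketch: runWith and tryShape are hypothetical helpers, the Function constructor is used so that the body is non-strict code (where with is permitted), and env merely stands in for the enclosing scope.

```javascript
// Hypothetical helper: the Function constructor yields non-strict code,
// so `with` is allowed here even if the surrounding file is strict.
var runWith = new Function("x", "env",
  "with (env) { with (x) { a = b; } }");

// Runs `a = b` inside with(x) and returns both objects for inspection.
// `env` is only a stand-in for the enclosing scope.
function tryShape(x) {
  var env = { a: "outer a", b: "outer b" };
  runWith(x, env);
  return { x: x, env: env };
}

var both    = tryShape({ a: 1, b: 2 }); // behaves like x.a = x.b
var onlyA   = tryShape({ a: 1 });       // behaves like x.a = b
var onlyB   = tryShape({ b: 2 });       // behaves like a = x.b
var neither = tryShape({});             // behaves like a = b

console.log(both.x.a, onlyA.x.a, onlyB.env.a, neither.env.a);
```

Which of the four happens is decided purely by the properties x has when the statement runs, which is what makes the block hard to read statically.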

-----------------------------------------------------

Example 2 (rhino): Assignments.

with(x) {
var z = 42; // Or just z = 42 (without var)
}

Where does z end up? If x.z already exists, it is overwritten; if
not, z is created in the enclosing scope (the global scope here).
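
A small sketch of both outcomes. declareInWith is a hypothetical helper built with the Function constructor so that with is permitted, and its function scope stands in for the global scope of the example above.

```javascript
// Non-strict function body via the Function constructor, so `with` is allowed.
// The hoisted `var z` belongs to this function, which stands in for the
// global scope of the original example.
var declareInWith = new Function("x",
  "with (x) { var z = 42; } return z;");

var hasZ = { z: 1 };
var outerWhenPresent = declareInWith(hasZ); // x.z is overwritten, outer z untouched

var noZ = {};
var outerWhenAbsent = declareInWith(noZ);   // outer z is assigned, x is unchanged

console.log(hasZ.z, outerWhenPresent, "z" in noZ, outerWhenAbsent);
// 42 undefined false 42
```

The declaration is hoisted to the enclosing scope in both cases; only the assignment of 42 is routed through x.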

-----------------------------------------------------

Example 3 (rhino): Assignments continued (functions).

--- Example 3.1 ---

var obj = { x:42 };

with(obj) {
function x() { print(x); }
}

(x || obj.x)(); // Hack to make it call existing function.

--- Example 3.2 ---

var obj = { x:42 };

with(obj) {
var x = function() { print(x); }
}

(x || obj.x)(); // Hack to make it call existing function.

-------------------

Example 3.1 will create a global function x with obj first in its
scope chain, thus printing: "42"

Example 3.2 will create the method obj.x (since it overwrites 42),
with obj first in its scope chain, thus printing: "function x()
{ print(x); }"

-----------------------------------------------------

Since I'm interested in doing static analysis on JavaScript code --
and I don't want to deal with all these abnormalities, I decided to
try and normalize my way out of it. That is, take any given code
containing a with statement and automatically output valid JavaScript
with the exact same semantics, but without the with statement.


My first wild attempt was to try and introduce temporary variables, so
that:

-----------------------------------------------------

with(x) {
a = b;
}

-----------------------------------------------------

would become 3 blocks:

-----------------------------------------------------

// With start. (save all variables)
var evaled_x = x;
var temp_a = (evaled_x.a || a);
var temp_b = (evaled_x.b || b);


// With block. (do replaced evaluation)
temp_a = temp_b;


// With end. (restore variables to newly calculated)
if(evaled_x.a)
evaled_x.a = temp_a;
else
a = temp_a;

if(evaled_x.b)
evaled_x.b = temp_b;
else
b = temp_b;

-----------------------------------------------------

But as far as I can figure, this won't do. I reached the conclusion
that you'd have to introduce temporary variables for each statement,
since each statement could potentially either directly or by side-
effects change the presence of properties on the evaluated object
(delete or assignments).
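
To illustrate why one snapshot per with block is not enough, here is a sketch in which the same identifier resolves to different objects in consecutive statements. runStatements is a hypothetical helper (Function constructor, so non-strict code), and env stands in for the enclosing scope.

```javascript
// Three statements inside one with block; the middle one removes the
// property that the first one wrote to.
var runStatements = new Function("x", "env",
  "with (env) { with (x) { a = 1; delete x.a; a = 2; } }");

var x = { a: 0 };
var env = { a: "outer a" };
runStatements(x, env);

// The first assignment went to x.a, the second one to env.a.
console.log("a" in x, env.a); // false 2
```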

Also, there may be more factors that I haven't even begun to
consider yet.


So, I was wondering, does anyone have a better solution (or more
insights) in the glorious quest of trying to eliminate a with
statement?


- Rasmus.
 
Evertjan.

Rasmus Kromann-Larsen wrote on 29 Mar 2007 in comp.lang.javascript:
Example 1 (pseudo). Readability.

with(x) {
a = b
}

Could be evaluated as:

x.a = x.b
x.a = b
a = x.b
a = b

(And this only gets worse if you add more variables - exponentially
worse actually :)

<script type='text/javascript'>

var a = 1;
var b = 2;

var x = {};

x.b = 7;
// x.a = 9;

with(x) {
a = b;
};

alert(a); // 7
alert(b); // 2

</script>

It works just as expected.

It is all a question of scope, just like in a function.
If you make your scope too large you lose your clear view.
So be it.

Perhaps you do not like a weak-typed language in general?
I do, as it meets my expectations.
I take the art of writing nice code and the art of debugging as
challenges.

Reconstructing badly written code is not my hobby,
yet it could be yours.

Code written for others to work on
should be completed with an equal number of remark lines,
we used to say in the old assembler days.
Even now, that is not bad advice.
 
Rasmus Kromann-Larsen

Rasmus Kromann-Larsen wrote on 29 Mar 2007 in comp.lang.javascript:

<script type='text/javascript'>

var a = 1;
var b = 2;

var x = {};

x.b = 7;
// x.a = 9;

with(x) {
a = b;

};

alert(a); // 7
alert(b); // 2

</script>

It works just as expected.

Perhaps it does work as expected, but I still had to read your code
several times to be sure what was and wasn't being set. I'm just
saying that reading with blocks can be counter-intuitive, especially
if they're badly written.

Perhaps you do not like a weak-typed language in general?
I do, as it meets my expectations.
I take the art of writing nice code and the art of debugging as
challenges.

I have no quarrel with weak-typed languages; they have their place, as
so many other types of languages have theirs. I fail to see how
anything I wrote could be interpreted in that way, but never mind
that :)

Thanks for your reply, but I think you misunderstood my post. I was
trying to illuminate my problem with 'with' with regard to static
analysis, and asking if anyone had an idea for ways of replacing a
"generic" with statement with semantically identical JavaScript code.

- Rasmus.
 
Evertjan.

Rasmus Kromann-Larsen wrote on 29 Mar 2007 in comp.lang.javascript:
Perhaps it does work as expected, but I still had to read your code
several times to be sure what was and wasn't being set. I'm just
saying that reading with blocks can be counter-intuitive, especially
if they're badly written.


I have no quarrel with weak-typed languages; they have their place, as
so many other types of languages have theirs. I fail to see how
anything I wrote could be interpreted in that way, but never mind
that :)

Forget it, just a prank.

Thanks for your reply, but I think you misunderstood my post. I was
trying to illuminate my problem with 'with' with regard to static
analysis, and asking if anyone had an idea for ways of replacing a
"generic" with statement with semantically identical JavaScript code.

Rasmus, sorry, I have no ready solution for you there.

For a generic scripting language, which in the client-side javascript
case is "intended" to be interpreted by many different and more or less
standards-noncompliant interpreter engines, isn't your quest for exact
analysis doomed to be futile?

replacing a "generic" with statement with semantically identical
JavaScript code.

The above being said, a string parser could easily be built, even in
javascript, to change a with coded javascript script into one without
with [did I count my with'es right?], as long as the scope laws are
incorporated in the algorithm.

The same surely must be true for the scope laws in functions,
where the absence of a var definition only implies a global scope if the
variable was var-defined globally elsewhere?

Even the dual definition of + as adder and concatenator will give you
the same, if not more, static analysis problems? That is what I, perhaps
too boldly, meant by "weak-typed in general".
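
The point about + can be seen in a few lines. This is only an illustration; add is a hypothetical function, not something from the code under discussion.

```javascript
// The operand types, known only at run time, decide whether + adds
// or concatenates.
function add(a, b) {
  return a + b;
}

var sum = add(1, 2);      // numeric addition
var left = add("1", 2);   // string concatenation
var right = add(1, "2");  // string concatenation

console.log(typeof sum, typeof left, typeof right);
```

A static analysis that cannot bound the types of a and b cannot say which of the two operations add performs.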
 
Richard Cornford

Rasmus said:
The With Conundrum

I'm currently writing a master's thesis on (preparations for)
static analysis of JavaScript, and after investigating the
with statement, it has become even more evident to me that the
with statement is indeed bad.

Bad idea or not, the - with - statement exists in javascript and,
even if officially deprecated now, it will be around for a very
long time to come, for back-compatibility with existing code. It
is also (very) occasionally useful to add a specific object to the
scope chain of a function.

My initial thoughts on with were based on:
http://yuiblog.com/blog/2006/04/11/with-statement-considered-harmful/

I built some examples to investigate - and later try to
"eliminate" the with statement from the code I'm analysing.
All examples besides the first one were evaluated in Rhino.

-----------------------------------------------------

Example 1 (pseudo). Readability.

with(x) {
a = b
}

Could be evaluated as:

x.a = x.b
x.a = b
a = x.b
a = b

(And this only gets worse if you add more variables -
exponentially worse actually :)

With the possibility of deleting a or b doing nothing to make
the situation simpler.
-----------------------------------------------------

Example 2 (rhino): Assignments.

with(x) {
var z = 42; // Or just z = 42 (without var)
}

Where does z end up? If x.z already exists, it will be
overwritten, if not, it will be added to the global scope.

-----------------------------------------------------

Example 3 (rhino): Assignments continued (functions).

--- Example 3.1 ---

var obj = { x:42 };

with(obj) {
function x() { print(x); }

By strict ECMAScript rules that is a syntax error: no statement
may commence with the - function - keyword and only statements
may appear inside a Block Statement (FunctionDeclarations and
Statements being the two syntactic units from which javascript
programs are constructed). JavaScript(tm), and so Rhino, has a
syntax extension that provides a Function Statement, which is
what you have here in Rhino. Other ECMAScript implementations
(including JScript(tm) and that in the Opera browser) error-correct
this syntax error and (more or less) interpret it as a Function
Declaration regardless of its incorrect context.

This introduces the question of what language you are planning
as the subject of your static analysis. If it is JavaScript(tm)
then the above is fine, but the end result will have limited
applicability in the real world (where cross-browser, or at
least multi-browser scripts would be the useful subject). If
the subject was ECMAScript (which is the common sub-set of
implementations without any extensions) then it would be a
good idea to recognise its syntax errors.

On the other hand, if you are taking general source code that
may or may not include - with - statements and then converting
it into a form that does not use them then that form can be
JavaScript(tm) only, and so itself employ as many JavaScript(tm)
extensions as it likes.

Since I'm interested in doing static analysis on JavaScript code --
and I don't want to deal with all these abnormalities, I decided to
try and normalize my way out of it. That is, take any given code
containing a with statement and automatically output valid JavaScript
with the exact same semantics, but without the with statement.


My first wild attempt was to try and introduce temporary variables, so
that:

-----------------------------------------------------

with(x) {
a = b;
}

-----------------------------------------------------

would become 3 blocks:

-----------------------------------------------------

// With start. (save all variables)
var evaled_x = x;
var temp_a = (evaled_x.a || a);
var temp_b = (evaled_x.b || b);

That won't work. Here you are using the value of the - b -
property of - evaled_x - to make the decision. If - evaled_x -
(and its prototype, and their respective prototypes) do not
have a - b - property then the value of the - evaled_x.b -
expression will be the undefined value, which will type-convert
to boolean false, and the - (evaled_x.b || b) - expression will
work. But if - evaled_x - has a - b - property but the value of
that property is boolean false, an empty string, the null value,
numeric zero or the undefined value (as may be explicitly
assigned to a property) then the right hand side of the logical
OR expression will still be used, and that would be wrong, and
potentially a runtime error.
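
A short sketch of that failure mode, using the names from the quoted code and assuming, for illustration only, that evaled_x.b holds numeric zero:

```javascript
var evaled_x = { b: 0 };  // the property exists but holds a falsy value
var b = "outer b";

// Decides on the VALUE of evaled_x.b, so the outer b is picked,
// which is not what with (evaled_x) would do.
var viaOr = evaled_x.b || b;

// Decides on the PRESENCE of the property (own properties only here).
var viaOwn = evaled_x.hasOwnProperty("b") ? evaled_x.b : b;

console.log(viaOr, viaOwn); // outer b 0
```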

What you need to do in order to determine which of - evaled_x.b
- or - b - to use is find out if - evaled_x -, or any object on
its prototype chain, actually has a - b - property. For the
object itself that is easy, as ECMAScript defines a -
hasOwnProperty - method, which is inherited by all objects. The
problem is you also need to call it on every object on the -
evaled_x - object's prototype chain, and ECMAScript keeps the
object's prototype chain internal.

However, if your subject really is JavaScript(tm), or you are
converting ECMAScript source into JavaScript(tm) code for analysis,
then you can use its (JavaScript(tm)'s) - __proto__ - extension,
which is a property of objects that refers to the next object on
the object's prototype chain (so you can work along the
whole prototype chain).

In that case a better test may be a function like:-

function hasProperty(obj, propertyName){
    return (
        (obj.hasOwnProperty(propertyName))||
        (
            Boolean(obj.__proto__)&&
            (hasProperty(obj.__proto__, propertyName))
        )
    );
}

- with:-

var temp_b = hasProperty(evaled_x, 'b')?evaled_x.b:b;
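
For comparison, a sketch of the same test without __proto__. It assumes an engine that provides Object.getPrototypeOf and Object.create (both standardised in ECMAScript 5, after this thread); hasPropertyStd is a hypothetical name. The standard - in - operator also consults the prototype chain, so it is shown alongside.

```javascript
// Iterative variant of hasProperty; Object.getPrototypeOf is assumed to be
// available (it was standardised in ECMAScript 5).
function hasPropertyStd(obj, propertyName) {
  for (var o = obj; o !== null; o = Object.getPrototypeOf(o)) {
    if (Object.prototype.hasOwnProperty.call(o, propertyName)) {
      return true;
    }
  }
  return false;
}

var base = { b: 0 };
var evaled_x = Object.create(base); // inherits b, has no own properties
var b = "outer b";

var temp_b = hasPropertyStd(evaled_x, "b") ? evaled_x.b : b;

console.log(temp_b, "b" in evaled_x,
            hasPropertyStd(evaled_x, "a"), "a" in evaled_x);
// 0 true false false
```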
// With block. (do replaced evaluation)
temp_a = temp_b;


// With end. (restore variables to newly calculated)
if(evaled_x.a)
evaled_x.a = temp_a;
else
a = temp_a;

if(evaled_x.b)
evaled_x.b = temp_b;
else
b = temp_b;

-----------------------------------------------------

But as far as I can figure, this won't do. I reached the conclusion
that you'd have to introduce temporary variables for each statement,
since each statement could potentially either directly or by side-
effects change the presence of properties on the evaluated object
(delete or assignments).

Also, there may be more factors that I haven't even begun to
consider yet.

The - eval - function and the - Function - constructor turning string
data into executable code?

So, I was wondering, does anyone have a better solution (or more
insights) in the glorious quest of trying to eliminate a with
statement?

It is possible to implement ECMAScript in ECMAScript, and do so
without using the - with - statement. The implication of this is
that it must be possible to re-code any ECMAScript that uses the
- with - statement without it. However, the implications of doing
so are massive. The - with - statement manipulates the scope chain,
so the resulting code would have to dispense with the implicit scope
chains and explicitly implement them and Identifier resolution, and
so also implement its own objects with their prototype inheritance,
prototype chain and property name resolution, and substitute that
alternative mechanism for _all_ of the subject code.
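
A very reduced sketch of what "explicitly implementing the scope chain" could look like. The names resolve, getValue, putValue and globals are hypothetical, the chain is a plain array searched front to back, and ordinary objects with the - in - operator are used rather than the re-implemented objects described above.

```javascript
// Returns the first object on the chain that has `name` (own or inherited),
// or null if none does.
function resolve(scopeChain, name) {
  for (var i = 0; i < scopeChain.length; i++) {
    if (name in scopeChain[i]) return scopeChain[i];
  }
  return null;
}

function getValue(scopeChain, name) {
  var holder = resolve(scopeChain, name);
  if (holder === null) throw new ReferenceError(name + " is not defined");
  return holder[name];
}

// Unresolvable names are created on the last object, mimicking how
// non-strict code creates a property of the global object.
function putValue(scopeChain, name, value) {
  var holder = resolve(scopeChain, name) || scopeChain[scopeChain.length - 1];
  holder[name] = value;
}

// with (x) { a = b; } rewritten against an explicit chain
var globals = { a: "global a", b: "global b" };
var x = { b: 7 };
var chain = [x, globals];
putValue(chain, "a", getValue(chain, "b"));

console.log(globals.a, "a" in x); // 7 false
```

Note that this sketch resolves the left-hand name after evaluating the right-hand side, whereas the language resolves the left-hand reference first; a faithful rewrite would have to preserve that order whenever the right-hand side has side effects.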

It can be done, but you would then not be analysing the original
source code but instead the equivalent of the executable structure
'compiled' from the original source code. In which case it would
be as valid to use an ECMAScript implementation that compiled a
discrete 'bytecode' from its source code and make that 'bytecode'
the subject of analysis.

Richard.
 
Rasmus Kromann-Larsen

Thanks for your insightful answers, Richard.

What I'm analysing is ECMAScript, so that will make some of the situations
simpler, as you described. What spurred this whole thing was
an article about type systems in JavaScript, which incidentally said that
the - with - statement could easily be implemented using other
language constructs in JavaScript. But when I began to think about it,
it wasn't so simple in my mind, and I wanted to ask here, since I
noticed there are some pretty nice JavaScript (ECMAScript) people here.

Also, I didn't want to outlaw the - with - statement; I was simply
looking for ways of eliminating it from the source in a normalization
phase before the actual static analysis. The goal of the thesis is not to
provide a static analysis solution for ECMAScript, but to investigate
the possibilities (and pitfalls) of doing so.

I will most certainly look into making very clear that I'm dealing
with ECMAScript and describe which extensions I "use" in my research,
do you know any resources describing the differences between
JavaScript and ECMAScript, what extensions either browsers implemented
etc.?


Thanks again,
- Rasmus.
 
Dr J R Stockton

In comp.lang.javascript, Mon, 2 Apr 2007 02:08:36, Rasmus Kromann-Larsen wrote:
What I'm analysing is ECMAScript,

Rather than relying on ECMA 263 3rd Edn (and Errata), you should use
also ISO 16262 which should have fewer errors.

I will most certainly look into making very clear that I'm dealing
with ECMAScript and describe which extensions I "use" in my research,
do you know any resources describing the differences between
JavaScript and ECMAScript, what extensions either browsers implemented
etc.?

"either" browser(s)? I've used 3 different ones.

Versions of javascript/jscript were written, and then the standard(s).
One can assert, more or less soundly, that there is at any one moment
only one standard, the latest of ECMA & ISO; but there are multiple
parallel javascripts.

I'm finding (and noting near the top of js-datex.htm) differences in
date behaviour between MSIE and Firefox. IMHO, at least one of them is
where the standard allows a choice; and at least one of them seems to
involve a definite bug.
 
Rasmus Kromann-Larsen

In comp.lang.javascript, Mon, 2 Apr 2007 02:08:36, Rasmus Kromann-Larsen wrote:

Rather than relying on ECMA 263 3rd Edn (and Errata), you should use
also ISO 16262 which should have fewer errors.

Thanks for your reply, Dr J R Stockton. I've looked at ISO 16262
and compared it to ECMA 262 (I expect you didn't actually mean 263,
as you wrote), but haven't been able to find any notable differences.
Have the errata from ECMA 262 3rd Edn been incorporated into ISO 16262
or not?

- Rasmus.
 
Dr J R Stockton

In comp.lang.javascript, Tue, 3 Apr 2007 02:54:57, Rasmus Kromann-Larsen wrote:
Thanks for your reply, Dr J R Stockton. I've looked at ISO 16262
and compared it to ECMA 262 (I expect you didn't actually mean 263,
as you wrote), but haven't been able to find any notable differences.

They did fix the problem with ECMA-262 15.9.1.7 to 15.9.1.9, which is
notable but does not affect the language described.

Have the errata from ECMA 262 3rd Edn been incorporated into ISO 16262
or not?

I've not needed to make a full check - but someone should. The date on
the errata page is later than that of the ISO standard, but that proves
little.
 
