local variable assertion

dflanagan

I've started studying Ruby, and while I like it, one thing that bothers
me is that there is not a way to explicitly declare a variable in order
to say "I want a variable local to this scope: I don't want to reuse
some variable by the same name in a containing scope." In particular,
the fact that giving a block parameter the same name as an existing
variable can overwrite that variable troubles me.

Now I haven't written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

Nevertheless, I thought that the problem could be addressed if there
was some way to "declare" your local variables before you use them. I
put "declare" in quotes because that isn't the right word in Ruby.
What I wanted was a facility to assert that a variable is not yet in
use.

I came up with the code that follows. The introductory comment
explains. I imagine that someone has already done this, but I'd be
interested to hear what folks think.

Thanks,

David Flanagan
------------
module Kernel
  # Assert that the named variables do not exist yet,
  # so that they can be used as local variables in the block without
  # clobbering an existing variable.
  #
  # This method expects any number of variable names as arguments.
  # The names may be specified as symbols or strings.
  # The method must be invoked with an associated block, although the
  # block may be empty. It uses the binding of the block with eval to check
  # whether the variable names are in use yet, and raises a NameError if
  # any of them are currently used.
  #
  # If the block associated with local expects no arguments, then this method
  # invokes it. The code within the block can safely use the symbols
  # passed to local. If the block expects arguments, then local assumes
  # that the block is intended for the caller and just returns it.
  #
  # Here are some typical uses of this method:
  #
  #   local :x, :y do       # Execute a block in which x and y are local vars
  #     data.each do |x|
  #       y = x*x
  #       puts y
  #     end
  #   end
  #
  # Here's a way to use local where nested blocks are not needed:
  #
  #   data.each &local(:x) {|x| puts x*x }
  #
  # Here's a way to use it as an assertion with an empty block:
  #
  #   local(:x, :y) {}      # Assert that x and y aren't in use yet.
  #   data.each do |x|      # Now go use those variables
  #     y = x*x
  #     puts y
  #   end
  #
  def local(*syms, &block)
    syms.each do |sym|
      # First, see if the symbol itself is defined as a variable or method
      # XXX: do I also need to check for methods like x=?
      # XXX: Would it be simpler or faster to do eval local_variables instead?
      value = eval("defined? #{sym.to_s}", block.binding)
      # If it is not defined, then go on to the next symbol
      next if !value
      # Otherwise, the symbol is in use, so raise an exception
      raise NameError.new("#{sym} is already a #{value}")
    end

    # If none of the symbols are in use, then we can proceed.
    # What we do next depends on the arity of the block, however.
    # If the block expects no arguments, then we just call it.
    # If the block was declared with arguments, then it is not intended
    # for this method. Instead, we return it so our caller can invoke it.
    if block.arity == 0 or block.arity == -1
      block.call
    else
      block
    end
  end
end
 
Rob Sanheim

Funny wordwrapping of the code and comments in that post...

You can also see the code at my blog:
http://www.davidflanagan.com/blog/2007_01.html#000120

David

Without diving too much into the implementation of this, I would say
if you really find yourself needing it a lot you should refactor to
smaller methods that do less. I find that Composed Method and
Extract Method are some of the most critical refactorings in Ruby.

Oh yeah, and welcome to Ruby =). I'm sure you'll find a lot to love
coming from the JavaScript world. There are some libraries that let
you write prototype-style Ruby similar to idiomatic JavaScript.

- Rob
 
gga

(e-mail address removed) wrote:
I've started studying Ruby, and while I like it, one thing that bothers
me is that there is not a way to explicitly declare a variable in order
to say "I want a variable local to this scope: I don't want to reuse
some variable by the same name in a containing scope." In particular,
the fact that giving a block parameter the same name as an existing
variable can overwrite that variable troubles me.

Now I haven't written enough Ruby code to know whether this is really a
problem in practice or if it is just a theoretical concern.

I think it is more theoretical, as hitting it would indicate you are
writing VERY long functions.
Remember that in Ruby, global variables are prefixed with $, instance
variables with @ and class variables with @@, so only local variables
can collide and the chance of conflict is small.

That being said, your code can be written more simply, like:

def let(*syms, &block)
  raise "No block provided for undefined?" unless block_given?

  syms.each do |sym|
    value = eval("defined? #{sym.to_s}", block.binding)
    next if !value
    raise NameError.new("#{sym} is already a #{value}")
  end

  yield
end


# block check
let( :x ) { x = 20 }

begin
  let( :x ) { p 'never run' }
rescue
end

# Forgot block...
let( :x )
 
dflanagan

Rob,

Yes, breaking long methods up is usually good. If the smaller methods
that one is refactoring into are not of general utility, however, then
I would argue (perhaps in my JavaScript mindset) that they should not
be methods, but lambdas instead. But re-factoring into lambdas
doesn't help with the local variable issue since you can never be
confident about the scope of your lambda parameters.

Isn't refactoring, in fact, one of the scenarios where you run into
problems with variable overlap? If you cut-and-paste a block from one
method into another, and the new method uses a variable that has the
same name as one of the block parameters, you've just set yourself up
for trouble.

I don't like Perl, but I do think that Perl's "my" variables solve
this problem elegantly.

David
 
dflanagan

writing VERY long functions.
Remember that in Ruby, global variables are $, instance variables are @
and class variables are @@, so there's a very rare chance of conflict.

I disagree. Local variables are used most often. I see a good chance
of getting bitten by this problem, especially when using generic
variable names like i or x as block parameters to loop iterators.

Suppose you've got a simple loop to iterate through an array

data.each { |x| x*x}

Now you refactor some code and end up cutting-and-pasting that loop
into a method that happens to use x as a parameter. Suddenly your loop
behaves differently. x is no longer local to the block and it
overwrites the local variable in your method.
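
For reference, the block-parameter overwrite described here is Ruby 1.8 behavior; from Ruby 1.9 on, a block parameter is always local to its block. A plain assignment inside a block still writes to an enclosing variable of the same name. The sketch below illustrates both cases, assuming Ruby 1.9 or later; the method names are made up for the example.

```ruby
# Hypothetical example; assumes Ruby 1.9 or later.
def param_is_block_local(data, x)
  data.each { |x| x * x }      # this x is a fresh block parameter (1.9+)
  x                            # the method's own x is unchanged
end

def assignment_clobbers(data, y)
  data.each { |v| y = v * v }  # y is not block-local, so the outer y is reassigned
  y
end

p param_is_block_local([1, 2, 3], 10)  # => 10
p assignment_clobbers([1, 2, 3], 10)   # => 9
```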

That being said, your code can be written more simply, like:

def let(*syms, &block)
  raise "No block provided for undefined?" unless block_given?

  syms.each do |sym|
    value = eval("defined? #{sym.to_s}", block.binding)
    next if !value
    raise NameError.new("#{sym} is already a #{value}")
  end

  yield
end

This actually breaks the second-use case for my method. If the block
expects parameters, then I want my method to return the block so that
it can be passed on to the calling method. This allows me to use
local() without having to nest blocks. Consider this invocation:

data.each &local(:x) {|x| puts x*x }

The block is passed to local, which checks that it is safe to use x as
a variable in the block. Then local() returns the block, which gets
passed, in turn, to the each() iterator. The & and the required
parentheses make this syntax a little messy but it allows one block
instead of two.
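
The hand-off described above can be sketched as follows. This is a condensed restatement of the posted method, not a replacement for it, and it assumes Ruby 1.9 or later (where eval needs an explicit Binding, hence block.binding). The wrapper method `squares` and the variable name `n` are invented for the illustration.

```ruby
# Sketch only; condensed from the method posted earlier in the thread.
def local(*syms, &block)
  syms.each do |sym|
    # Proc#binding is the scope in which the block was written.
    value = eval("defined? #{sym}", block.binding)
    raise NameError, "#{sym} is already a #{value}" if value
  end
  # Zero-argument blocks are run here; any other block is handed back.
  if block.arity == 0 || block.arity == -1
    block.call
  else
    block
  end
end

def squares(data)
  result = []
  # local checks that no `n` exists in this method and returns the block;
  # the & then passes that returned Proc on to each.
  data.each(&local(:n) { |n| result << n * n })
  result
end

p squares([1, 2, 3])  # => [1, 4, 9]
```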

# block check
let( :x ) { x = 20 }

begin
  let( :x ) { p 'never run' }
rescue
end

I believe this would actually print 'never run'. Since x is not a
local variable, its use in the first block remains local to that block.

# Forgot block...
let( :x )

This won't work: the block is needed to pass to eval() for checking for
the existence of the local variables. Otherwise, I'm just checking for
local variables inside the local() method itself.
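
A small sketch of that difference, assuming Ruby 2.1 or later for Binding#local_variable_defined?. The helper names are hypothetical and exist only for this illustration.

```ruby
# Illustrative helpers (not from the thread); assumes Ruby 2.1+.
def visible_in_method?(name)
  # Kernel#binding here is this method's own scope, which only contains `name`.
  binding.local_variable_defined?(name)
end

def visible_in_callers_scope?(name, &block)
  # Proc#binding is the scope where the block was written, i.e. the caller's.
  block.binding.local_variable_defined?(name)
end

def binding_demo
  total = 42
  [visible_in_method?(:total), visible_in_callers_scope?(:total) {}]
end

p binding_demo  # => [false, true]
```

Without a block there is no way for the helper to reach the caller's variables, which is why even an empty block is required.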

David
 
ara.t.howard

I disagree. Local variables are used most often. I see a good chance of
getting bitten by this problem, especially when using generic variable names
like i or x as block parameters to loop iterators.

i've been writing ruby in production 90% of my coding time for nearly 6 years
and have hit this only one or two times. i'd say it's a valid concern but
nearly always a massive sign of code smell: one simply should never have too
many variables in scope to keep in one's head at any given moment. if one
does, it's time to refactor.

Suppose you've got a simple loop to iterate through an array

data.each { |x| x*x}

Now you refactor some code and end up cutting-and-pasting that loop into a
method that happens to use x as a parameter. Suddenly your loop behaves
differently. x is no longer local to the block and it overwrites the local
variable in your method.

you're quite right. what i fail to see is how a 'local' method changes that
one bit. consider, say you paste the above snippet into some code that has an
'x' defined 20 lines up, you don't notice and introduce a bug. in order to
prevent this you are advocating this

local :x do
data.each { |x| x*x}
end

so a re-def of x will raise an error. at first glance that seems ok.
consider this however: one must __know_in_advance__ which vars to declare
local and which not. in your example it's the only obvious one but, in fact,
there are two candidates: 'x' and 'data'. now, in this case we know that we
do, in fact, require the 'data' var __not__ to be local, but to be picked up
from the current scope. note that it's __precisely__ this ability to do mixed
scoping which makes blocks useful at all - otherwise we'd all just pass
stuff around.

perhaps you see where i'm going? in order to use local effectively with even
a moderately complex piece of code one needs to look at the code and decide
which vars should be local, which should be block-local (ruby 1.9), and which
should be scoped normally. all this has to be known __up_front__!

the thing is, if i have to know, as a programmer, up front which vars to
declare local then i don't have a problem any more! ;-)

so, imho, you are correct in pointing out a source of errors but blocks, like
all coding constructs, must be weighed by comparing advantages vs.
disadvantages and the mixing of scopes certainly resides on both lists.

it would be nice if a way existed to solve the problem you have underscored,
but if that solution requires me to do the same amount of work that i had to do
before to solve it 'manually' then it simply becomes line noise and, as we all
know, any code you write that you don't have to is simply adding bugs.

my 2cts.

kind regards.

-a
 
dflanagan

you're quite right. what i fail to see is how a 'local' method changes that
one bit. consider, say you paste the above snippet into some code that has an

It changes it because you fail-fast with a NameError rather than
possibly introducing a bug that may not be near to the source of the
error.

'x' defined 20 lines up, you don't notice and introduce a bug. in order to
prevent this you are advocating this

local :x do
data.each { |x| x*x}
end

so a re-def of x will raise an error. at first glance that seems ok.
consider this however: one must __know_in_advance__ which vars to declare
local and which not.

The local vars are the ones that you want to be local in the block. I
think this is always easy. And, when you have to cut-and-paste, you
copy the entire local block, so that the protection it gives you
travels with the code.
 
ara.t.howard

It changes it because you fail-fast with a NameError rather than possibly
introducing a bug that may not be near to the source of the error.

but you could fail faster? by the time you decide the names you will use,
you no longer need 'local'?

The local vars are the ones that you want to be local in the block. I
think this is always easy. And, when you have to cut-and-paste, you
copy the entire local block, so that the protection it gives you
travels with the code.

i think this is misleading. take your example: if i cut and paste this

local :x do
data.map!{|x| x ** 2}
end

somewhere else i'm safe not only if x hasn't been used in the new scope, but
also if data hasn't, or it has, but it's the correct value. the thing is you
are not going to paste that code without knowing where 'data' is coming from:
you haven't eliminated the problem or even really reduced it since you
__still__ must ensure your current scope isn't too big to wrap your brain
around: you've got to know where data is coming from and it's going to come
from exactly the same place 'x' is - the current scope, which you must
understand in its entirety in order to use 'data' properly in the new
cut-and-pasted context...

the entire concept that a programming construct can make it safe to cut and
paste code is really quite a stretch...

your local impl is __still__ a mixed scope like any other block in ruby and
therefore suffers exactly the same issues: in the above you could easily
clobber a local version of 'data', especially if you cut and pasted it into a
scope where its origin was unknown.

in summary, i don't think one can possibly solve the issues of mixed scoping
of blocks with a method that takes a mixedly scoped block! ;-)

in addition, the local impl requires __twice__ as many definitions of local
variables and we all know where that goes: more lines almost never equals
fewer bugs - that's the d.r.y principle that's so big in the ruby community.

in any case i think matz's block-local vars, due for ruby 1.9, address the
largest issues with block scoping already.
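
For readers who have not seen the syntax, here is a minimal sketch of block-local variables as they shipped in Ruby 1.9: names listed after a semicolon in the block's parameter list are local to the block. The method name and values are invented for the example.

```ruby
# Assumes Ruby 1.9 or later.
def block_local_demo(data)
  x = 10
  y = 20
  data.each do |x; y|   # x is a block parameter, y is declared block-local
    y = x * x           # writes to the block's own y, not the outer one
  end
  [x, y]
end

p block_local_demo([1, 2, 3])  # => [10, 20]
```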

cheers.

-a
 
gga

This actually breaks the second-use case for my method. If the block
expects parameters, then I want my method to return the block so that
it can be passed on to the calling method.

Ah, I see. Sorry, I did not catch that from your docs. It is indeed an
ugly construct.
I'm not sure trying to save a block is a smart move. You end up with a
method that behaves and returns something very differently just based
on a block's arity. That's just a huge headache waiting to happen.
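
To make the arity concern concrete, here is a small sketch of the values Proc#arity reports, assuming Ruby 1.9 or later. The method name is made up for the illustration.

```ruby
# Assumes Ruby 1.9 or later.
def block_arities
  no_params   = proc { }
  one_param   = proc { |x| x }
  splat_param = proc { |*args| args }
  [no_params.arity, one_param.arity, splat_param.arity]
end

p block_arities  # => [0, 1, -1]
```

Because the posted method treats both 0 and -1 as "call the block now", a block written with a splat parameter would be invoked rather than returned.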

I mean.... if you are really concerned about the efficiency of this:

local(:x) { data.each { |x| x*x } }

I'd say you are definitely guilty of premature optimization.

I believe this would actually print 'never run'. Since x is not a
local variable, its use in the first block remains local to that block.

Correct, actually. I seem to have forgotten the x = 0 before the
begin block. Sorry.

PS. Welcome to the Ruby community. Looking forward to seeing what you'll
do with Ruby.
 
