Analyzer for errors in code?

David Unric

Hello,

I would like to check source code for common errors such as unused
variables. "ruby -c" only checks for syntax errors.

Is there a tool for Ruby like pylint or pyflakes for Python?
It must _not_ run the code to do its analysis. Catching typos in method
and variable names would be enough.

Thanks in advance.
 
Robert Klemme

Detecting unused local variables would be doable, but detecting typos in
variable and method names is almost impossible without running the
code. The reason is that the set of valid method names may change at
will during the course of a program's execution. Plus, with
#method_missing you can properly handle methods that are never
defined. The usefulness of such a tool might be very limited.
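
To make the point about method names changing at runtime concrete, here is a minimal sketch. The Report class, add_columns and the column names are made up for illustration; the only real API used is Module#define_method.

```ruby
# Hypothetical class used only for illustration.
class Report
  # Define one reader per column name. The names come from data,
  # so no "def title" or "def author" ever appears in the source.
  def self.add_columns(names)
    names.each do |name|
      define_method(name) { "value of #{name}" }
    end
  end
end

report = Report.new
before = report.respond_to?(:title)   # the method does not exist yet

Report.add_columns(%w[title author])  # e.g. names read from a file or database
after = report.respond_to?(:title)    # now it does

puts "before: #{before}, after: #{after}, title: #{report.title}"
```

A checker that only reads the source would have to flag report.title as unknown, even though the call works once add_columns has run.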

Having said that you may want to try ruby's command line option -w.

See also
http://stackoverflow.com/questions/1805146/where-can-i-find-an-actively-developed-lint-tool-for-ruby

Kind regards

robert
 
David Unric

I understand it is difficult because of the dynamic nature of Ruby, but
Python is in the same league and the PyLint tool does "wonders" at detecting
possible errors without executing the code
(http://www.logilab.org/card/pylintfeatures). It uses an abstract syntax
tree parser, and Ruby also has its own AST, so it should be possible.

But I asked for something simpler, like revealing trivial errors
such as the variable name typo I mentioned.

See the following example:

Running with 'ruby -c' returns 'Syntax OK'.
Running with 'ruby -w' also doesn't warn about undefined 'mymsg'
variable.

Only when the condition becomes true, execution aborts with `<main>':
undefined local variable or method `mymsg' for main:Object (NameError)

if __FILE__ == $0
  my_msg = 'Hello'
  if ARGV[0] == 'doit'
    puts mymsg
  else
    puts 'Nope'
  end
end


How can I avoid such pitfalls when writing more complex Ruby code? I can't
believe there is no way to verify the code other than 'waiting
for an exception under the right conditions' and then fixing the bug.
 
Robert Klemme

Well, actually you would catch the bug during testing which you have to
do anyway. If you don't catch it during testing you have bad test
coverage. :)

The problem with this is that "mymsg" might be a method call or a local
variable read. Ruby recognizes local variables (via assignment) so it
knows that "my_msg" is a local variable. But there is no easy way I am
aware of to reliably detect that "mymsg" is a typo of "my_msg". Maybe
one could do some fuzzy pattern matching.
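
A rough sketch of what such fuzzy matching could look like, using the Ripper parser from Ruby's standard library so that nothing is executed. The helper names (each_node, edit_distance, suspicious_barewords) and the distance threshold of 2 are my own assumptions, not an existing tool. It is a heuristic only: it ignores scopes, and it will both miss real typos and flag legitimate method calls.

```ruby
require 'ripper'

# Yield every nested array node of a Ripper s-expression.
def each_node(sexp, &block)
  return unless sexp.is_a?(Array)
  yield sexp
  sexp.each { |child| each_node(child, &block) }
end

# Classic dynamic-programming edit distance between two strings.
def edit_distance(a, b)
  row = (0..b.size).to_a
  a.each_char.with_index(1) do |ca, i|
    prev_diag = row[0]
    row[0] = i
    b.each_char.with_index(1) do |cb, j|
      cost = ca == cb ? 0 : 1
      current = [row[j] + 1, row[j - 1] + 1, prev_diag + cost].min
      prev_diag = row[j]
      row[j] = current
    end
  end
  row[b.size]
end

# Report barewords that the parser treats as method calls (:vcall) but
# whose names are close to a local variable assigned somewhere in the file.
# Returns [bareword, line, similar_variable] triples.
def suspicious_barewords(source, max_distance: 2)
  sexp = Ripper.sexp(source) or raise ArgumentError, 'source does not parse'
  assigned = []
  barewords = []
  each_node(sexp) do |node|
    ident = node[1]
    next unless ident.is_a?(Array) && ident[0] == :@ident
    case node[0]
    when :var_field then assigned << ident[1]
    when :vcall     then barewords << [ident[1], ident[2][0]]
    end
  end
  barewords.flat_map do |name, line|
    assigned.uniq
            .select { |var| edit_distance(name, var) <= max_distance }
            .map { |var| [name, line, var] }
  end
end

sample = <<~'RUBY'
  if __FILE__ == $0
    my_msg = 'Hello'
    if ARGV[0] == 'doit'
      puts mymsg
    else
      puts 'Nope'
    end
  end
RUBY

suspicious_barewords(sample).each do |name, line, var|
  puts "line #{line}: '#{name}' looks like local variable '#{var}'"
end
```

The design choice here is to warn only when a method-call-like bareword is within a small edit distance of an assigned local, which keeps the noise down at the cost of missing typos that do not resemble any local.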

Kind regards

robert
 
David Unric

I see, but such a lint tool would parse the whole source and the included modules
and just warn if it didn't find an assignment in the scope where the name is used. In
the above case it wouldn't find any mymsg variable assignment or
function/method definition, and it would warn me about it.
 
Kirk Haines

[Note: parts of this message were removed to make it a legal post.]

 
Ryan Davis

But I asked for something more simplistic, like revealing trivial errors
like mentioned variable name typo.

What should a ruby lint tool do with the below code?

def method_missing(*) # at _any_ scope: instance, class, module, global...
  # ...
end

def x
  a = 20
  b = 30
  puts c
end

my vote: nothing...

It is impossible to determine in a language like ruby w/o executing it.
This is why we write tests. I know you're looking for something to catch
"simple" errors... but you also want it to be correct. You can't have
both. For every correctness checker attempt, I can make it wrong.

From: http://www.cl.cam.ac.uk/teaching/0910/CompTheory/scooping.pdf
 
Ryan Davis


(damnit, sorry... I'm post workout and fairly braindead... here is the
text)

Scooping the Loop Snooper
an elementary proof of the undecidability of the halting problem

No program can say what another will do.
Now, I won't just assert that, I'll prove it to you:
I will prove that although you might work till you drop,
you can't predict whether a program will stop.

Imagine we have a procedure called P
that will snoop in the source code of programs to see
there aren't infinite loops that go round and around;
and P prints the word "Fine!" if no looping is found.

You feed in your code, and the input it needs,
and then P takes them both and it studies and reads
and computes whether things will all end as they should
(as opposed to going loopy the way that they could).

Well, the truth is that P cannot possibly be,
because if you wrote it and gave it to me,
I could use it to set up a logical bind
that would shatter your reason and scramble your mind.

Here's the trick I would use--and it's simple to do.
I'd define a procedure--we'll name the thing Q--
that would take any program and call P (of course!)
to tell if it looped, by reading the source;

And if so, Q would simply print "Loop!" and then stop;
but if no, Q would go right back up to the top,
and start off again, looping endlessly back,
till the universe dies and is frozen and black.

And this program called Q wouldn't stay on the shelf;
I would run it, and (fiendishly) feed it itself.
What behavior results when I do this with Q?
When it reads its own source code, just what will it do?

If P warns of loops, Q will print "Loop!" and quit;
yet P is supposed to speak truly of it.
So if Q's going to quit, then P should say, "Fine!"--
which will make Q go back to its very first line!

No matter what P would have done, Q will scoop it:
Q uses P's output to make P look stupid.
If P gets things right then it lies in its tooth;
and if it speaks falsely, it's telling the truth!

I've created a paradox, neat as can be--
and simply by using your putative P.
When you assumed P you stepped into a snare;
Your assumptions have led you right into my lair.

So, how to escape from this logical mess?
I don't have to tell you; I'm sure you can guess.
By reductio, there cannot possibly be
a procedure that acts like the mythical P.

You can never discover mechanical means
for predicting the acts of computing machines.
It's something that cannot be done. So we users
must find our own bugs; our computers are losers!

GEOFFREY K. PULLUM
STEVENSON COLLEGE
UNIVERSITY OF CALIFORNIA SANTA CRUZ
MATHEMATICS MAGAZINE, 10/2000
 
David Unric

Ryan Davis> Yep, I really understand it's impossible to do static
(lexical) code analysis you can rely on 100% for a programming language
with a dynamic type system and the ability to modify itself at runtime.
I would just appreciate a helper tool that warns me about _possible_
dumb typing errors I make while writing my code, and I would decide whether
a warning is a false positive. Most of the time I'm fixing typos rather
than mistakes in an algorithm. Better than nothing.

Unfortunately none of the tools mentioned or linked above detected the use
of an unassigned variable. As an illustration, here is what pylint does for the
equivalent code in Python:

_____ snip ________________________________________
import sys

if __name__ == "__main__":
    my_msg = 'Hello'
    if sys.argv[-1] == 'doit':
        print mymsg
    else:
        print 'Nope'

~$ pylint -E test.py
No config file found, using default configuration
************* Module test
E: 6: Undefined variable 'mymsg'
_____ snip ________________________________________

I'm using the new generation of Ruby, specifically version 1.9.2. It
produces bytecode for the YARV interpreter. I'm sure that catching an
intermediate output between the parsed source and the bytecode would allow
the static analysis I'm calling for. Is this possible with current versions
of Ruby?
 
Ryan Davis

In python, 'mymsg' _has_ to be a variable. This isn't true in ruby. It
can be a variable OR a method call. In the case of the latter, we don't
know if it is valid or not without evaluating.

in ruby, 'def x; mymsg; end' (the most boiled down version of your
example) looks like this internally:

% echo 'def x; mymsg; end' | parse_tree_show
s(:defn,
 :x,
 s(:args),
 s(:scope, s(:block, s(:call, nil, :mymsg, s(:arglist)))))

Ruby parsed the code and decided that mymsg must be a method call. It is
determined at runtime (since everything is late bound) who (if anyone)
implemented the method and if not, it goes to method_missing.

Python simply doesn't have this flexibility. 'x' is a variable and 'x()'
is a call. AFAIK, there is no "__" hook equivalent to method_missing in
python... But it has been a while for me.
 
Ryan Davis

Most of the time I'm fixing typos rather
than mistakes in an algorithm. Better than nothing.

I also suggest that you deal with the problem instead of the symptom. I
_don't_ typo that much. Much of that is attributed to my use of "M-/"
(dabbrev-expand -- Expand previous word "dynamically") in emacs. Vim has
C-n (keyword completion). I'm sure whatever text editor you're using can
help... If not, then I have another suggestion for you. :p
 
David Unric

Great info Ryan, thank you.

My apologies, I did not realize sooner the deeper conceptual
differences between these languages.
That said, if Ruby can't decide in advance whether mymsg is a variable or a
method call, how does that rule out the possibility of a
static parser reporting that the 'mymsg' symbol has no assignment or method
definition in the source code? I.e. if 'x' in your example is a method
call, how can it arise without some form of assignment?
 
Ryan Davis

Despite that, if Ruby can't decide if mymsg is a variable or a
method call in advance, how does it rule out the possibility for a
static parser to notify the 'mymsg' symbol has no assignment or method
definition in the source code? I.e. if 'x' in your example is a method
call, how can it arise without some form of an assignment?

I don't entirely understand your question, as "x" in my example is the
wrapping method and "mymsg" was the message send from your example.

So let's try again, with the code sample reformatted better:

% echo 'def x; mymsg; end' | parse_tree_show

begets:

s(:defn, :x, s(:args),
 s(:scope,
  s(:block,
   s(:call, nil, :mymsg, s(:arglist)))))

read in English: "method definition :x with no args. calls mymsg w/ no
args"

So, ruby parsed "def x;" and realized it was a 0 arg method. The body
refers to "mymsg;" and since there was no previous variable assignment
of the same name, it must be a method call (with no args). That's it. At
that point, the method is "compiled" down to a single method invocation
for the message "mymsg". Notice I said "message"... that's an important
distinction between static and dynamic languages. In static languages
(or in dynamic/hybrid languages where the method is pre-calculated),
methods/functions are called by having a pointer to the function body
and simply jumping to that address.

In a dynamic invocation, a receiver is sent "a message" at runtime. This
roughly translates to asking the hierarchy of classes if they implement
that message, and if so, to please execute it. If not, then the runtime
does the same thing, but this time around the message is
"method_missing" with the previous message pushed onto the front of the
args.
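
The dispatch described above can be observed directly. This is only an illustrative sketch: the Echo class is invented, and its catch-all simply returns what it was given so the "message pushed onto the front of the args" part becomes visible.

```ruby
# Hypothetical class used only for illustration.
class Echo
  # Called by the runtime when no class in the hierarchy implements
  # the message. The original message name arrives as the first argument.
  def method_missing(name, *args)
    [name, args]
  end

  # Keep respond_to? consistent with the catch-all above.
  def respond_to_missing?(name, include_private = false)
    true
  end
end

echo = Echo.new
reply = echo.mymsg(1, 2)          # there is no "def mymsg" anywhere
same  = echo.send(:mymsg, 1, 2)   # the same message, sent explicitly

puts reply.inspect
puts same.inspect
```

Both calls end up in method_missing, which is why a source-only checker cannot tell whether mymsg is a typo or a message the object is prepared to handle.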
 
David Unric

First, sorry for the confusion, I had 'mymsg' in your x method example in
mind, of course.

Second, thank you for the more detailed explanation. It closely
describes what is done during code execution from the interpreter's point of
view. From a static/naive human code reader's point of view, I would
search for the symbol 'mymsg' in the current/parent/global scope, including
imported modules, to see whether anything defines this symbol: an lvalue,
a method definition, a method argument etc. If nothing is found, the
'mymsg' symbol is suspicious and I would flag it. I would expect a similar
procedure from a possibly naive but still handy code
checker.

Btw. I'm using the (g)Vim editor and know about the C-n shortcut, but I use it
sparingly because it offers too many symbols at times and it only slows
me down. Code completion with C-x C-o makes much more sense but it doesn't
work in all cases. One of the reasons for my typing errors is
switching between various keyboards (an MS Natural keyboard at my PC and a
significantly different one on my corporate notebook).
 
Robert Klemme

As we tried to explain, you cannot do any static lookups - not even
for constants:

09:35:26 Temp$ ruby19 -e 'class X;end;X.const_set("Foo"+"Bar",123);p X::FooBar'
123

Any tool trying to statically find X::FooBar would be lost, yet the
code is perfectly legal and error free.

It's worse with methods since you additionally get the inheritance
chain as an additional, orthogonal dimension for lookups.

Kind regards

robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
 
David Unric

I've been saying from the start that I don't expect the static checker to
give correct warnings in 100% of cases. You demonstrated one example.
I know it depends on the programming style, but I dare to claim that this kind
of dynamic constant/variable creation is less frequent, so a static
checker would cover the larger subset of cases. After a brief look at a
sample of the standard Ruby libraries (written in Ruby, not in C) I didn't
find a class where a constant/variable/method name is generated "on the
fly" as you've pointed out. So there needn't be too many false warnings.

Take care

David
 
Brian Candler

David Unric wrote in post #960444:
My apologies, I did not realize sooner the deeper conceptual
differences between these languages.
Despite that, if Ruby can't decide if mymsg is a variable or a
method call in advance, how does it rule out the possibility for a
static parser to notify the 'mymsg' symbol has no assignment or method
definition in the source code?

On the contrary: ruby *does* always decide (statically, at parse time,
not at run-time) whether a bareword is a local variable or a method
call, based on whether an assignment statement exists earlier
in the same scope.

e.g.

def foo
  if false
    x = 123
  end
  puts x # prints nil - x is a local variable
  puts y # NameError
  puts z() # NoMethodError
end
foo

Although the assignment to x was never executed, it exists in the parse
tree before the point where it is referenced. Therefore at that point
bareword x is known to be a local variable. At runtime the value is nil
because no assignment was actually executed.

y is decided statically to be a method call, as the parsetree shows,
because no assignment was seen. Note however that the error message says
"undefined local variable or method `y'". The programmer might have
meant to assign to a local variable called y, or might have meant to
define a method called y. Neither was done, but Ruby has no way to tell
which you intended.

z() is known from the syntax definitely to be a method call, and so the
error is "undefined method `z'". The same would be true for self.z. It
doesn't matter whether a statement z=... exists earlier.
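
The parse-time decision can be inspected without running the code, using Ripper from the standard library. This is a sketch: bareword_kind is a made-up helper, and it relies on Ripper tagging a known local as :var_ref and an unknown bareword as :vcall.

```ruby
require 'ripper'

# Return :var_ref if the parser treats the bareword `name` as a local
# variable read, :vcall if it treats it as a method call, nil if absent.
def bareword_kind(sexp, name)
  return nil unless sexp.is_a?(Array)
  if %i[var_ref vcall].include?(sexp[0]) &&
     sexp[1].is_a?(Array) && sexp[1][0] == :@ident && sexp[1][1] == name
    return sexp[0]
  end
  sexp.each do |child|
    kind = bareword_kind(child, name)
    return kind if kind
  end
  nil
end

source = <<~'RUBY'
  def foo
    if false
      x = 123
    end
    puts x
    puts y
  end
RUBY

tree = Ripper.sexp(source)   # parses only, nothing is executed
puts "x is parsed as #{bareword_kind(tree, 'x').inspect}"
puts "y is parsed as #{bareword_kind(tree, 'y').inspect}"
```

Note that the parser only classifies the bareword; whether a method named y will exist when the line runs is still unknown at this point.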

Unfortunately, because Ruby doesn't require you to annotate your code,
there is no additional information which could be used to highlight an
anomaly.

For example, in perl, scalar variables are flagged by $. If you do

$myarg = 123;
...
print $my_arg;

it can see that $my_arg is used only once, and is therefore likely an
error. With 'use strict' you are forced to declare your variables, so
every variable must have a matching declaration.

The non-verbose nature of Ruby, combined with the dynamic definition of
methods at runtime means that these tests simply can't be done. Dynamic
method definitions are very widely used - look at code which uses
ActiveRecord for instance. The methods in an object are based on the
columns in the database which the program connects to.

And what this means is, you need tests. Your test suite needs to execute
each line of code in your source at least once. A code coverage tool
like 'rcov' can help you achieve that, or if you are paranoid look at
'heckle'.

Is this harder or more expensive than static code checking? Only if you
take the point of view that you don't need unit tests.
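
As a small illustration of the testing argument, here is the example from earlier in the thread wrapped in a method (report is a name invented for this sketch) and exercised on both branches. It uses plain Ruby instead of a test framework to stay self-contained.

```ruby
# The thread's example as a method; `report` is a hypothetical name.
def report(arg)
  my_msg = 'Hello'
  if arg == 'doit'
    mymsg   # typo for my_msg, only evaluated on this branch
  else
    'Nope'
  end
end

# Exercise every branch at least once, as a test suite with full
# line coverage would, and record what happens on each.
outcomes = %w[other doit].map do |arg|
  begin
    report(arg)
  rescue NameError => e
    "#{e.class} raised for #{arg.inspect}"
  end
end

puts outcomes
```

Only the run that takes the 'doit' branch surfaces the typo, which is why coverage of every line matters here.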
 
