using regular expressions to analyze lisp code

Kelie

hello,

i've spent a couple of hours trying to figure out the correct regular
expression to catch a VisualLisp (it is for AutoCAD and has a syntax
that's similar to common lisp) function body. VisualLisp is
case-insensitive. Any line beginning with ";" is a comment (there can
be spaces before the ";").

here is an example of a VisualLisp function:

(defun get_obj_app_names (obj / rv)
  (foreach app (get_registered_apps (vla-get-document obj))
    (if (get_xdata obj app)
      (setq rv (cons app rv))
    )
  )
  (if rv
    ;;"This line is comment (comment)"
    ;;) This line is also comment
    (acad_strlsort rv)
    nil
  )
)

for a function named foo, it is easy to find the beginning of the
function, "(defun foo", but it is hard to find the ")" at the end of
the code block. if i can't come up with a solution using regular
expressions only, my fallback plan is this: after finding the
beginning, which is "(defun foo" in this case, i can count the
parentheses, ignoring anything inside "" and any comment line, until i
find the closing ")".

not sure if i've made myself understood. thanks for reading.

kelie
 
Dan

So, matching nested parens is the canonical example of a context-free
language, which means it is not a regular one. Now, many regex
libraries have *some* not-purely-regular features, but I doubt you're
going to find anything to match parens in a single regex. If you want
to go all out you can use a parser generator (for python parser
generators, see http://python.fyxm.net/topics/parsing.html).
Otherwise, you can go about it the quick-and-dirty way you describe:
scan for matching open and close parens, and ignore things in quotes
and comments.

-Dan
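
For what it's worth, the quick-and-dirty approach might look roughly
like the Python sketch below. The function name extract_defun is made
up for illustration, and the sketch rests on a few assumptions: a ";"
outside a string starts a comment that runs to the end of the line,
strings are double-quoted with backslash escapes, and block comments
are not handled.

```python
import re


def extract_defun(source, name):
    """Return the text of (defun NAME ...) through its matching ')', or None."""
    # VisualLisp is case-insensitive, so the header is matched ignoring case.
    # The lookahead keeps "foo" from matching a longer name such as "foobar".
    header = re.compile(
        r"\(\s*defun\s+" + re.escape(name) + r"(?=[\s()])", re.IGNORECASE
    )
    m = header.search(source)
    if m is None:
        return None

    depth = 0
    i = m.start()
    n = len(source)
    while i < n:
        ch = source[i]
        if ch == ";":
            # Comment (assumed to run to end of line): jump to the newline.
            nl = source.find("\n", i)
            i = n if nl == -1 else nl
            continue
        elif ch == '"':
            # String literal: skip to the closing quote, honoring backslashes.
            i += 1
            while i < n and source[i] != '"':
                i += 2 if source[i] == "\\" else 1
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth == 0:
                return source[m.start():i + 1]
        i += 1
    return None  # ran out of input before the parens balanced
```

Note that the header search itself is not comment-aware, so a
"(defun foo" sitting inside a comment or a string would be picked up
as the start. Only the paren counting skips comments and strings.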
 
Tim Chase

"""
Some people, when confronted with a problem, think
"I know, I'll use regular expressions!"
Now they have two problems.
"""


Regular expressions are a wonderful tool when the domain is
correct. However, when your domain involves processing
arbitrarily nested syntax, regexps are not your friend. It is
sometimes feasible to mung them into a fixed-depth-nesting
parser, but it's always fairly painful, and the fixed-depth is an
annoying limitation.
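
To make the fixed-depth point concrete, here is a small Python sketch
using only the standard re module, which has no recursive patterns.
The helper name nested_paren_pattern is hypothetical. It builds the
regex by wrapping the previous level inside one more pair of parens,
so the depth limit is baked into the pattern. It also knows nothing
about strings or comments, which is part of the pain.

```python
import re


def nested_paren_pattern(max_depth):
    """Build a regex for a balanced (...) group nested at most max_depth deep."""
    # Depth 1: a pair of parens with no parens inside.
    pattern = r"\([^()]*\)"
    for _ in range(max_depth - 1):
        # One more level: non-paren characters or groups of the previous depth.
        pattern = r"\((?:[^()]|" + pattern + r")*\)"
    return re.compile(pattern)
```

With max_depth=2 the pattern fully matches "(a (b) c)" but not
"(a (b (c)))". Raising max_depth to 3 fixes that input and fails on
the next deeper one.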

Use a parsing lib. I've tinkered a bit with PyParsing[1] which
is fairly easy to pick up, but powerful enough that you're not
banging your head against limitations. There are a number of
other parsing libraries[2] with various domain-specific features
and audiences, but I'd go browsing through them only if PyParsing
doesn't fill the bill.

You don't detail what you want to do with the content or how
pathological the input can be, but you might be able to get away
with just skimming through the input and counting open-parens and
close-parens, stopping when they've been balanced, skipping lines
with comments.

-tkc

[1] http://pyparsing.wikispaces.com/
[2] http://nedbatchelder.com/text/python-parsers.html
 
Kelie

thanks Tim. following your and Dan's advice i visited
http://python.fyxm.net/topics/parsing.html and i picked up pyparsing
after a brief reading of the descriptions of a couple of packages. now
that you recommended it, it seems that i made a good choice.

btw, the content found will be copied to a new text file.
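
If a parsing library turns out to be more than the copy job needs, a
regex can still serve as a tokenizer while plain counting does the
nesting. The Python sketch below is one possible way to do it, not a
tested tool: top_level_forms and copy_defun are made-up names, and it
assumes line comments starting with ";", double-quoted strings with
backslash escapes, and no block comments.

```python
import re
from pathlib import Path

# One token per match: a string literal, a line comment, or a single paren.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|;[^\n]*|[()]')


def top_level_forms(source):
    """Yield (start, end) offsets of each balanced top-level form."""
    depth = 0
    start = None
    for m in TOKEN.finditer(source):
        tok = m.group()
        if tok == "(":
            if depth == 0:
                start = m.start()
            depth += 1
        elif tok == ")" and depth > 0:
            depth -= 1
            if depth == 0:
                yield start, m.end()
        # String and comment tokens are consumed here and otherwise ignored.


def copy_defun(src_path, dst_path, name):
    """Copy the top-level (defun NAME ...) form to dst_path; True if found."""
    source = Path(src_path).read_text()
    # Case-insensitive header; the lookahead rejects longer names.
    header = re.compile(
        r"\(\s*defun\s+" + re.escape(name) + r"(?=[\s()])", re.IGNORECASE
    )
    for start, end in top_level_forms(source):
        if header.match(source, start):
            Path(dst_path).write_text(source[start:end] + "\n")
            return True
    return False
```

Because the header is only tested at the start of a real top-level
form, a "(defun foo" that appears inside a comment or a string is not
mistaken for the function.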
 
Kelie

Dan, thanks for suggesting parser generators.
 
