using regular expressions to analyze lisp code

Kelie · Oct 4, 2007

hello,

i've spent a couple of hours trying to figure out the correct regular
expression to catch a VisualLisp (it is for AutoCAD and has a syntax
that's similar to common lisp) function body. VisualLisp is
case-insensitive. Any line beginning with ";" is a comment (the ";"
can have space(s) before it).

here is an example of VisualLisp function:

(defun get_obj_app_names (obj / rv)
  (foreach app (get_registered_apps (vla-get-document obj))
    (if (get_xdata obj app)
      (setq rv (cons app rv))
    )
  )
  (if rv
    ;;"This line is comment (comment)"
    ; This line is also comment
    (acad_strlsort rv)
    nil
  )
)

for a function named foo, it is easy to find the beginning part of the
function, "(defun foo", but it is hard to find the ")" at the end of the
code block. if eventually i can't come up with a solution using regular
expressions only, what i was thinking is that after finding the beginning
part, which is "(defun foo" in this case, i can count the parentheses,
ignoring anything inside "" and any comment line, until i find the
closing ")".

not sure if i've made myself understood. thanks for reading.

kelie

Dan · Oct 4, 2007

So, paren matching is the canonical example of a language that is
context-free but not regular. Now, many regex libraries have *some*
not-purely-regular features, but I doubt you're going to find anything
to match parens in a single regex. If you want to go all out you can
use a parser generator (for python parser generators, see
http://python.fyxm.net/topics/parsing.html). Otherwise, you can go
about it the quick-and-dirty way you describe: scan for matching open
and close parens, and ignore things in quotes and comments.

-Dan

Tim Chase · Oct 4, 2007

> i've spent couple of hours trying to figure out the correct regular
> expression to catch a VisualLisp [snipped]
> "(defun foo", but it is hard to find the ")" at the end of code block.
> if eventually i can't come up with the solution using regular
> expression only, what i was thinking is after finding the beginning
> part, which is "(defun foo" in this case, i can count the parenthesis,
> ignoring anything inside "" and any line for comment, until i find the
> closing ")".

"""
Some people, when confronted with a problem, think
"I know, I'll use regular expressions!"
Now they have two problems
"""

Regular expressions are a wonderful tool when the domain is
correct. However, when your domain involves processing
arbitrarily nested syntax, regexps are not your friend. It is
sometimes feasible to mung them into a fixed-depth-nesting
parser, but it's always fairly painful, and the fixed-depth is an
annoying limitation.
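
To make the fixed-depth limitation concrete, here is a small sketch that builds such a regex with Python's re module. The helper name balanced_regex is made up for this illustration, and the pattern deliberately ignores strings and comments, so it only shows the nesting problem, not a usable Lisp matcher.

```python
import re


def balanced_regex(max_depth):
    """Build a pattern for one (...) group nested at most max_depth levels.

    Values of max_depth below 1 are treated as 1.
    """
    # Innermost level: parentheses with no parentheses inside.
    pattern = r"\([^()]*\)"
    for _ in range(max_depth - 1):
        # One more level: non-paren characters or groups of the previous level.
        pattern = r"\((?:[^()]|" + pattern + r")*\)"
    return re.compile(pattern)


two_levels = balanced_regex(2)
print(bool(two_levels.fullmatch("(a (b) c)")))
print(bool(two_levels.fullmatch("(a (b (c)))")))
```

The pattern text grows with every extra level, and any input nested one level deeper than the chosen maximum simply fails to match.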

Use a parsing lib. I've tinkered a bit with PyParsing[1] which
is fairly easy to pick up, but powerful enough that you're not
banging your head against limitations. There are a number of
other parsing libraries[2] with various domain-specific features
and audiences, but I'd go browsing through them only if PyParsing
doesn't fill the bill.

You don't detail what you want to do with the content or how
pathological the input can be, but you might be able to get away
with just skimming through the input and counting open-parens and
close-parens, stopping when they've been balanced, skipping
comment lines.

-tkc

[1] http://pyparsing.wikispaces.com/
[2] http://nedbatchelder.com/text/python-parsers.html

Kelie · Oct 4, 2007

thanks Tim. following your and Dan's advice i visited
http://python.fyxm.net/topics/parsing.html and i picked pyparsing
after a brief reading of the descriptions of a couple of packages. now
that you've recommended it, it seems that i made a good choice.

btw, the content found will be copied to a new text file.

Kelie · Oct 4, 2007

Dan, thanks for suggesting parser generators.
