match nested parenthesis

S

s99999999s2003

hi
i wish to find an reg exp for matching nested parenthesis of varying
level like
string =
"somewords1(words(somewords2)-(some(some)words3)somestuff)somestuff"
and be able to evaluate the pair starting from the inner most(the
deepest level) , ie (some)
up till the outer most. What is a good reg exp to do this? or is simple
string manipulations enough?
thanks
 
T

Terry Reedy

| hi
| i wish to find an reg exp for matching nested parenthesis of varying
| level like
| string =
| "somewords1(words(somewords2)-(some(some)words3)somestuff)somestuff"
| and be able to evaluate the pair starting from the inner most(the
| deepest level) , ie (some)
| up till the outer most. What is a good reg exp to do this? or is simple
| string manipulations enough?
| thanks

The presence of indefinitely nested fence markers, like parens,
differentiates a context-free language, generated from a context-free
grammar, from a regular language/grammar. You cannot parse such sentences
with a regex engine or even with an extended regex engine such as Python's.
Use a more general parser.

tjr
 
N

Neil Cerutti

hi
i wish to find an reg exp for matching nested parenthesis of
varying level like
string =

It is not possible, because the set of strings containing
balanced parenthesis is not regular. Python re's aren't *really*
regular expressions, so you might be able to hack something up
using extensions (I'm not sure if it's been proven that you
can'). I believe regexes are the wrong tool for the job.
"somewords1(words(somewords2)-(some(some)words3)somestuff)somestuff"
and be able to evaluate the pair starting from the inner
most(the deepest level) , ie (some) up till the outer most.
What is a good reg exp to do this? or is simple string
manipulations enough?

Write a context free grammar and a recognizer for that grammar,
instead. Python developers haven't been able to agree on any such
module to include in the standard distribution yet,
unfortunately. But fortunately (consequently), there are tons of
nice parser and scanner generators available on the net. Try
pyparsing for a start, as it's a recent, maintained package with
excellent support on this group at the moment.

Or just compose your own little function by hand. A stack-based
solution (implemented using a list as a stack) should be easy enough.
 
A

arn.zart

hi
i wish to find an reg exp for matching nested parenthesis of varying
level like
string =
"somewords1(words(somewords2)-(some(some)words3)somestuff)somestuff"
and be able to evaluate the pair starting from the inner most(the
deepest level) , ie (some)
up till the outer most. What is a good reg exp to do this? or is simple
string manipulations enough?
thanks

Evaluation using re.sub() and recursion (3 lines of code):

def eval_sub(expr):
r"""Evaluate subexpressions in nested paired delimiters.

For example,
‹ single left-pointing angle quotation ‹
› single right-pointing angle quotation ›

>>> eval_sub('‹3 * ‹3 + ‹‹1 + 1› * 2›››')
'21'
>>> eval_sub('‹3 * 3 + ‹‹1 + 1› * 2›››') # test
mismatched delimiters
'13\x9b'
>>> '\x9b' == '›' # encoding ISO-8859-1
True
>>> eval_sub('3 * ‹‹1 + 1› * 2 + 3›') # test absence of
outer delimiters
'3 * 7'
"""
val, n = re.subn("‹([^›‹]+)›", lambda m:
str(eval(m.group(1))), expr)
if n == 0:
return val

return eval_sub(val)
#end

This is just a proof of concept.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,572
Members
45,046
Latest member
Gavizuho

Latest Threads

Top