help with regexp

Marc Girod

Hello,

I intend to start fixing an issue I have with a regexp of mine, but I
thought I might ask for comments even before I start myself.

I wanted to catch the text between '%[' ... ']N?l' brackets in a
'format specification' [*].
My first attempt worked well at first, with format strings such as
'%Vn %[^O13]Nl\n':

$fmt =~ s/\%\[(.*?)\](N?)l/$ph/

The text I catch is itself a regexp, but which I process in isolation
[I extend the format specification, so that '^O13' will be used as a
filter.]

Unfortunately, later, I started to use bolder format strings, such as
e.g.:
'%Vn %[Foo]NSa %[^O13]Nl\n'

My first regexp obviously bled over the two sets of brackets...
A first naive fix was:

$fmt =~ s/\%\[([^\]]*?)\](N?)l/$ph/

However, I can foresee that this rules out other valid specs, such as
e.g.:
'%Vn %[^[OE]]Nl\n'

I can also see that my strategy works only with *one* such field, but
I am willing to accept that, if I can support complex regexps inside
it.
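
The behaviour of the two substitutions above can be checked side by side. The following is only a minimal sketch: '<PH>' is an assumed stand-in for whatever $ph holds, and capture() is a hypothetical helper that returns the captured filter text instead of performing the substitution.

```perl
use strict;
use warnings;

# Assumed stand-in for the placeholder the substitution would write.
my $ph = '<PH>';

# The two patterns from the post, compiled once.
my %pattern = (
    lazy    => qr/\%\[(.*?)\](N?)l/,        # first attempt
    negated => qr/\%\[([^\]]*?)\](N?)l/,    # "naive fix"
);

# The three format strings from the post (single quotes keep \n literal).
my @formats = (
    '%Vn %[^O13]Nl\n',
    '%Vn %[Foo]NSa %[^O13]Nl\n',
    '%Vn %[^[OE]]Nl\n',
);

# Hypothetical helper: return the captured filter text, or undef if no match.
sub capture {
    my ($re, $fmt) = @_;
    return $fmt =~ $re ? $1 : undef;
}

for my $fmt (@formats) {
    for my $name (sort keys %pattern) {
        my $got = capture($pattern{$name}, $fmt);
        printf "%-8s %-28s => %s\n", $name, $fmt,
            defined $got ? "'$got'" : 'no match';
    }
}
```

With the second format string the lazy pattern captures 'Foo]NSa %[^O13', because it keeps extending until it finds a ']' followed by 'Nl'. With the third, the negated class cannot step over the inner ']' and the pattern does not match at all.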

The question I have is: am I doomed to implement a parser?
Or can I find a reasonable way out e.g. with look ahead?

Of course, I'll post whatever I come up with myself, if I get anywhere
(I won't jump to it right away...)

Thanks!
Marc

*: I give the link to the man page for this, but I don't expect you to
need to read it:
<http://publib.boulder.ibm.com/infocenter/cchelp/v7r0m1/topic/com.ibm.rational.clearcase.cc_ref.doc/topics/fmt_ccase.htm>
 
Rainer Weikusat

Ben Morrow said:
> Quoth Marc Girod:
>> I wanted to catch the text between '%[' ... ']N?l' brackets in a
>> 'format specification' [*].
>> My first attempt worked well at first, with format strings such as
>> '%Vn %[^O13]Nl\n':
>>
>> $fmt =~ s/\%\[(.*?)\](N?)l/$ph/
>> [...]
>> The question I have is: am I doomed to implement a parser?
>> Or can I find a reasonable way out e.g. with look ahead?
>
> You are doomed to implement a parser, but you can do so using the regex
> engine :).

Not really. A 'parser' would be something which does a grammatical analysis
of a sequence of tokens. This here is a lexical analyzer.
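
To make the distinction concrete, a lexical analyzer for such format strings could be sketched with \G and the /gc modifiers. This is an illustration under stated assumptions, not anything posted in the thread: tokenize() is a hypothetical name, the field suffix is assumed to be optional uppercase modifiers followed by one lowercase letter, and the bracket contents are assumed to contain no ']'.

```perl
use strict;
use warnings;

# Hypothetical tokenizer: splits a format string into
#   [ field => $contents, $suffix ]   for %[...]Xx fields
#   [ text  => $literal ]             for everything else
sub tokenize {
    my ($fmt) = @_;
    my @tokens;
    pos($fmt) = 0;
    while (pos($fmt) < length $fmt) {
        # Assumed field shape: %[contents] + uppercase modifiers + one lowercase letter.
        if ($fmt =~ /\G\%\[([^\]]*)\]([A-Z]*[a-z])/gc) {
            push @tokens, [ field => $1, $2 ];
        }
        # Otherwise consume a run of non-% characters, or a single '%'.
        elsif ($fmt =~ /\G([^%]+|\%)/gc) {
            push @tokens, [ text => $1 ];
        }
    }
    return @tokens;
}

for my $tok (tokenize('%Vn %[Foo]NSa %[^O13]Nl\n')) {
    print join(' | ', @$tok), "\n";
}
```

Because the scan resumes where the previous token ended, any number of bracketed fields can be picked out of one string, and the caller decides which suffixes (here 'Nl' versus 'NSa') it cares about.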
 
Marc Girod

Thanks Ben (and Rainer),

I didn't have any chance to touch it myself today...

> I am assuming the spec here requires matching brackets inside a %[]Nl?
> Can non-matching brackets be escaped?

I cannot see how non-matching brackets could make any sense there.
So, this would likely be an error, and I'd have to report it.
Now, maybe not in this scope, although...

I'd rather not force escaping inner brackets.
But that's my choice.

> If you don't allow escaping of unbalanced brackets, the simple answer is
> to use Regexp::Common::balanced. If you do, you will need to use 5.10,
> and write out the recursion yourself: ....
> The trick is the (?-1) group, which says 'start again at the top of the
> nearest enclosing () group'.

I'll have to play with both of these suggestions!
Thanks!
Marc
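
The code quoted above is elided, so the following is not that code, only a minimal sketch of the technique it names: a balanced-bracket pattern that recurses with (?-1), for Perl 5.10 or later. The names $field and extract_filter are hypothetical, and treating a backslash as escaping the next character is an assumption.

```perl
use strict;
use warnings;

# m{...}x-style delimiters are used so the square brackets in the pattern
# cannot be confused with the regex delimiters.
my $field = qr{
    \%
    (                       # group 1: balanced brackets, outer pair included
        \[
        (?:
            [^\[\]\\]++     # a run of ordinary characters
          | \\.             # assumed escape: backslash plus any one character
          | (?-1)           # recurse into group 1 for a nested bracket pair
        )*
        \]
    )
    (N?) l
}x;

# Hypothetical helper: return the text between the outer brackets, or undef.
sub extract_filter {
    my ($fmt) = @_;
    return undef unless $fmt =~ $field;
    return substr($1, 1, -1);    # strip the outer '[' and ']'
}

for my $fmt ('%Vn %[Foo]NSa %[^O13]Nl\n', '%Vn %[^[OE]]Nl\n') {
    my $got = extract_filter($fmt);
    print "$fmt => ", defined $got ? $got : 'no match', "\n";
}
```

Since the bracket group must be balanced and must be followed by 'Nl' or 'l', the '%[Foo]NSa' field is skipped without any extra logic.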
 
Marc Girod

> I'll have to play with both of these suggestions!

I am very impressed.
Regexp::Common qw /balanced/ gives me a starting point (I have to use
{-keep}, work out how to discriminate the 'wrong' brackets (e.g.
%[...]NSa) from the right ones, and strip the brackets);
but yours works fully as such (er... I had to switch from m[...] to
e.g. m{...}: my Perl (5.14.2 on Cygwin) got confused and told me
'Invalid [] range "?-1" in regex').

I wasn't aware of this recursive option.
Only ashamed that I didn't even try...
Thanks!
Marc
 
Marc Girod

>     %[ac]Nl          # simple brackets

Yes

>     %[a[b[c]d]e]Nl      # nested brackets
>     %[a\Nl           # an escaped bracket
>     %[a[[c]d]Nl         # a Perl character class containing [c
>     %[a[]c]d]Nl         # a Perl character class containing ]c
>     %[a[^]c]d]Nl        # a Perl character class not containing ]c


Honestly, I believe only the first is relevant...
I.e. I'll take the contents and use it as a regexp to filter 'label
types'.
So, one level of character class may be useful, but brackets are not
themselves legal characters for 'label types', so all the rest is
moot, isn't it?

Thanks again anyway!
Marc
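
As an illustration of using the captured text as a filter, here is a minimal sketch. The label type names are made up for the example, and compiling the filter inside eval is just one possible way to report a malformed filter instead of dying inside the match.

```perl
use strict;
use warnings;

# Hypothetical label type names, for illustration only.
my @label_types = qw(OS_1.3 O13_REL FOO_BASE E2_FIX);

# Filter text as it would be captured from '%[^[OE]]Nl'.
my $filter = '^[OE]';

# Compile inside eval so that a malformed filter can be reported cleanly.
my $re = eval { qr/$filter/ };
die "Invalid label filter '$filter': $@" unless defined $re;

# Keep only the label types the filter matches.
my @selected = grep { $_ =~ $re } @label_types;
print "@selected\n";
```

One level of character class, as in '^[OE]', is all this needs; deeper nesting only matters if brackets can occur in the names being filtered.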
 
