\$:xml-ms\{((([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))+){1,1} | (\$environment\{"([a-zA-Z_]+\w*)+(\.([a-zA

  • Thread starter Vijayaraghavan Kalyanapasupathy

Vijayaraghavan Kalyanapasupathy

Hi,

I am trying to write a regular expression to match the following
patterns:

\$:xml-ms\{((([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))+){1,1} | (\$environment
\{"([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))*"\}){1,1})\}

Here is what it is supposed to do:

In general match all patterns of the form:

$:xml-ms{ <something> }

where something is one of:

Variable:
--------

X.y.z
_x._y.z
m

but not

..Y.
..Y.Z
09.abs.d

Essentially each component of the "dotted" expression is like an
identifier matching:

[A-Za-z_]+\w*

I handle this by the regex:

(([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))*)

which matches the above types. This does work on the examples I tested.

The test cases are:

Should match: Hi $:xml-ms{_._.x}
Should match: Hi $:xml-ms{_09._87.x}
Should match: Hi $:xml-ms{_09.abc.y}

Should not match: Hi $:xml-ms{_09.87.x}
Should not match: Hi $:xml-ms{_09._87.}
Should not match: Hi $:xml-ms{abs.87.x}
Should not match: Hi $:xml-ms{._a8d7c.x}
Should not match: Hi $:xml-ms{a.._}

The above works fine and as expected.
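
A minimal sketch of the variable check described above, for anyone who wants to run the test cases. The names `$ident`, `$dotted` and `matches_variable` are made up for illustration and are not from the original post. Each component is written as `[a-zA-Z_]\w*`, which accepts the same strings as `[a-zA-Z_]+\w*` but avoids the nested quantifier.

```perl
use strict;
use warnings;

# One identifier component: a letter or underscore, then any word characters.
my $ident  = qr/[a-zA-Z_]\w*/;
# One or more components separated by single dots.
my $dotted = qr/$ident(?:\.$ident)*/;

# Returns 1 if the text contains $:xml-ms{...} wrapping a dotted name, else 0.
sub matches_variable {
    my ($text) = @_;
    return $text =~ /\$:xml-ms\{$dotted\}/ ? 1 : 0;
}

for my $case ('Hi $:xml-ms{_09._87.x}', 'Hi $:xml-ms{_09.87.x}') {
    printf "%-26s %s\n", $case, matches_variable($case) ? 'match' : 'no match';
}
```

Building the pattern from `qr//` pieces keeps the identifier rule in one place, so it can be reused for the environment-variable form.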

Environment variable:
--------------------

$environment{"<variable>"}

This can be expressed as:

\$environment\{"([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))*"\}

Which should match

$environment{"x.y.z"}

and so on just as above but with the surrounding extras!

When I combine the two regular expressions with the choice |
it doesn't work even though the input I tried is the same as above.

Any suggestions,

thanx,

-vijai.
 
Vijayaraghavan Kalyanapasupathy

In general match all patterns of the form:

$:xml-ms{ <something> }

where something is one of:

Well, I meant <something>

Apologies,

-vijai.
 
Arndt Jonasson

Vijayaraghavan Kalyanapasupathy said:
I am trying to write a regular expression to match the following
patterns:

\$:xml-ms\{((([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))+){1,1} | (\$environment
\{"([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))*"\}){1,1})\}

("Regular expression to match a pattern" sounds a little wrong to me.
I would usually use either "regular expression to match a string", or
"pattern to match a string". But I might also use "pattern" more
generically in simply describing the string.)

From your description, it appears to me that you want the expression
to match one of two things, and you use '|' to separate the two
things in the expression.

In the above expression, you have spaces around the '|' character.
That means that spaces have to be present in the string in order for
the expression to match.

Simple example: /abc | def/ matches "abc " and it matches " def", but
not "abc" or "def". /abc|def/ does match both "abc" and "def".
So maybe the solution is to remove those spaces.
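
A small sketch of the point about spaces, using the same `abc`/`def` example. The variable names are illustrative. The third pattern uses the `/x` modifier, which is not mentioned in the reply above; it tells Perl to ignore unescaped whitespace in the pattern, so the spaces can stay for readability.

```perl
use strict;
use warnings;

# Without /x, the spaces around | are part of each alternative.
my $with_spaces = qr/abc | def/;
# Removing the spaces gives the intended alternation.
my $no_spaces   = qr/abc|def/;
# With /x, unescaped whitespace in the pattern is ignored.
my $extended    = qr/abc | def/x;

for my $s ('abc', 'abc ', ' def', 'def') {
    printf "%-7s with_spaces=%d no_spaces=%d extended=%d\n",
        "'$s'",
        ($s =~ $with_spaces ? 1 : 0),
        ($s =~ $no_spaces   ? 1 : 0),
        ($s =~ $extended    ? 1 : 0);
}
```

Under `/x`, a space that really must be matched has to be written as `\ ` or `[ ]`.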
 
Anno Siegel

Vijayaraghavan Kalyanapasupathy said:
Hi,

I am trying to write a regular expression to match the following
patterns:

\$:xml-ms\{((([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))+){1,1} | (\$environment
\{"([a-zA-Z_]+\w*)+(\.([a-zA-Z_]+\w*))*"\}){1,1})\}

Ugh. That regex is much too big to be comprehensible, not to mention
maintainable. That also makes it a poor choice for a subject line.

You ought to split up the problem some more and not try to do everything
in a single regex.

So you have two cases that wrap the same structure (called <something>
above) in slightly different ways. Treat them separately:

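my $something;
# A list assignment in scalar context yields the number of captures,
# so $ok is true exactly when one of the two patterns matched.
my $ok = ( ($something) = /\$:xml-ms\{\s*(\S+)\s*\}/ ) ||
         ( ($something) = /\$environment\{"\s*(\S+)\s*"\}/ );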

If $ok is false at this point, that is the answer and there is nothing
more to do. Otherwise, we had a match, and the $something part is filled.
To check if it is a sequence of identifiers with dots between them,
split it on dots and check each component:

$ok &&= /^[a-zA-Z_]\w*$/ for split /\./, $something, -1;

Now $ok is your answer.

It takes four lines instead of two (and two auxiliary variables), but
it is far more readable and adaptable. It also makes use of the fact
that $something has the same structure in both cases, instead of
repeating the code as the regex does.
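
A sketch of the two-step approach above wrapped in a subroutine so it can be run against the test cases. The name `is_valid` is hypothetical, the braces are escaped so they are always taken literally, and `\z` is used in place of `$` to anchor at the very end of each component. Like the code above, it treats the two wrappers as separate cases; it does not handle `$environment{...}` nested inside `$:xml-ms{...}`.

```perl
use strict;
use warnings;

# Two-step check: extract the wrapped text, then validate each component.
sub is_valid {
    my ($text) = @_;
    my $something;
    # Step 1: try each wrapper in turn and capture what it wraps.
    my $ok = ( ($something) = $text =~ /\$:xml-ms\{\s*(\S+)\s*\}/ )
          || ( ($something) = $text =~ /\$environment\{"\s*(\S+)\s*"\}/ );
    return 0 unless $ok;
    # Step 2: split on dots. The -1 limit keeps trailing empty fields,
    # so "a.b." yields an empty last component and is rejected.
    for my $part ( split /\./, $something, -1 ) {
        return 0 unless $part =~ /^[a-zA-Z_]\w*\z/;
    }
    return 1;
}

for my $case ('Hi $:xml-ms{_09.abc.y}', 'Hi $:xml-ms{a.._}', '$environment{"x.y.z"}') {
    printf "%-26s %s\n", $case, is_valid($case) ? 'ok' : 'not ok';
}
```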

Anno
 
Vijayaraghavan Kalyanapasupathy

In the above expression, you have spaces around the '|' character.
That means that spaces have to be present in the string in order for
the expression to match.

Simple example: /abc | def/ matches "abc " and it matches " def", but
not "abc" or "def". /abc|def/ does match both "abc" and "def".
So maybe the solution is to remove those spaces.

Aha! I see. I should have read the manual more closely perhaps. Thanks
for pointing it out. It does work now!

-vijai.
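
For reference, a sketch of what the combined pattern might look like with the spaces problem out of the way. This is not the poster's final regex, which is not shown in the thread; the names `$ident`, `$dotted` and `$combined` are illustrative. It assumes a single-component name such as `m` should be accepted, so the dotted part uses `*` rather than the `+` that appears in the combined regex in the first post.

```perl
use strict;
use warnings;

# A letter or underscore followed by any word characters.
my $ident  = qr/[a-zA-Z_]\w*/;
# One or more identifiers separated by single dots.
my $dotted = qr/$ident(?:\.$ident)*/;

# The x modifier lets the alternation be laid out with whitespace and comments.
my $combined = qr/
    \$:xml-ms\{
    (?:
        $dotted                       # plain dotted variable
      | \$environment\{"$dotted"\}    # or an environment variable lookup
    )
    \}
/x;

for my $case ('Hi $:xml-ms{x.y.z}', 'Hi $:xml-ms{$environment{"x.y.z"}}', 'Hi $:xml-ms{_09.87.x}') {
    printf "%-38s %s\n", $case, $case =~ $combined ? 'match' : 'no match';
}
```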
 
Vijayaraghavan Kalyanapasupathy

Yes, you are correct. I should split up the matching!

thank you,

-vijai.
 