regular expression

hugo

I would like a regular expression that can match a sequence of
variable length patterns. Each pattern begins with a known TAG and is
followed by a variable number of characters. In the following, TAG is
literal, and the x's indicate arbitrary characters.

TAGxxx
TAGxxxxxx
TAGxxTAGxxxx
TAGxxxTAGxxxxTAGxxxxx

You get the idea. Don't read anything into the number of x's; their
count is variable. Of course, I'll need the results grouped so I can
extract each of the segments. The following doesn't work:

(TAG)(.*)

This expression finds just the first TAG and then gobbles the rest.
What I want is (.*) with the restriction that it not contain the
sequence TAG.

Thanks
 
Matt Humphrey

And I would like a double mocha latte with extra whipped cream. And don't
forget the cinnamon.

Oh, you don't work here either!
 
Daniel Sjöblom

Try:

((TAG)(.)*?)*

*? is the reluctant quantifier.
 
Daniel Sjöblom

Daniel said:
Try:

((TAG)(.)*?)*

*? is the reluctant quantifier.

Actually, that won't work. But there is a far simpler solution to your
problem: search for TAG once, record the position, and then search for
TAG again from just past that position. The text between two
consecutive hits is one segment.
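
A minimal sketch of that scan, written in Python only for illustration (the thread doesn't name a language, and the helper name `split_on_tag` is hypothetical):

```python
def split_on_tag(text, tag="TAG"):
    """Return the text that follows each occurrence of `tag`, up to the next one."""
    segments = []
    start = text.find(tag)                # position of the first TAG, or -1
    while start != -1:
        body_start = start + len(tag)     # skip past the TAG itself
        nxt = text.find(tag, body_start)  # next TAG, searching after the current one
        end = nxt if nxt != -1 else len(text)
        segments.append(text[body_start:end])
        start = nxt
    return segments

print(split_on_tag("TAGxxxTAGxxxxTAGxxxxx"))  # ['xxx', 'xxxx', 'xxxxx']
```

Searching from `body_start` rather than from `start` matters: restarting at the recorded position itself would find the same TAG again.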

If you insist on using a regex, something like this will work, although
it is not perfect:

(TAG)(.)*?(TAG)*
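
A quick check, assuming Python's `re` module as a stand-in (the thread doesn't say which regex engine is in use), suggests why both attempts fall short when used as a search: the reluctant `.*?` is satisfied with zero characters and the trailing `(TAG)*` is optional, so the match ends right after the first TAG.

```python
import re

sample = "TAGxxTAGxxxx"

# Both patterns are copied from the posts above; only the sample string is made up.
first_try = re.search(r"((TAG)(.)*?)*", sample)
second_try = re.search(r"(TAG)(.)*?(TAG)*", sample)

print(first_try.group(0))   # TAG
print(second_try.group(0))  # TAG
```

Something after the reluctant part has to force it to keep going, such as a lookahead for the next TAG or the end of the input.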
 
david m-

TAG(.+?)(?=(TAG)|$)
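
A sketch of how this pattern might be applied, assuming Python's `re` module (the thread doesn't name a language; the helper name `segments` is hypothetical). The reluctant `.+?` grows one character at a time until the lookahead `(?=(TAG)|$)` sees either the next TAG or the end of the input, so group 1 of each match holds one segment.

```python
import re

# Pattern taken verbatim from the post above.
pattern = re.compile(r"TAG(.+?)(?=(TAG)|$)")

def segments(text):
    # group(1) is the text between one TAG and the next TAG (or the end of input)
    return [m.group(1) for m in pattern.finditer(text)]

print(segments("TAGxxxTAGxxxxTAGxxxxx"))  # ['xxx', 'xxxx', 'xxxxx']
```

Two caveats: `.+?` requires at least one character after each TAG, and `.` does not cross line breaks unless the engine's dot-all flag is set (`re.DOTALL` in Python).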
 
