regular expression

Discussion in 'Java' started by hugo, Jan 7, 2004.

  1. hugo

    hugo Guest

    I would like a regular expression that can match a sequence of
    variable length patterns. Each pattern begins with a known TAG and is
    followed by a variable number of characters. In the following, TAG is
    literal, and the x's indicate arbitrary characters.

    TAGxxx
    TAGxxxxxx
    TAGxxTAGxxxx
    TAGxxxTAGxxxxTAGxxxxx

    You get the idea. Don't read anything into the number of x's; their
    count is variable. Of course, I'll need the results grouped so I can
    extract each of the segments. The following doesn't work:

    (TAG)(.*)

    This expression finds just the first TAG and then gobbles the rest.
    What I want is (.*) with the restriction that it not contain the
    sequence TAG.
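
    For reference, one way to spell "(.*) that must not contain TAG" is a
    negative lookahead tested before every character. This is only a
    minimal sketch using java.util.regex; the class and method names are
    made up for illustration, and note that . does not match line
    terminators unless Pattern.DOTALL is set.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TemperedTagDemo {
    // TAG, then any run of characters in which no position starts another "TAG".
    private static final Pattern SEGMENT = Pattern.compile("TAG((?:(?!TAG).)*)");

    /** Returns the text following each TAG, in order of appearance. */
    static List<String> segments(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = SEGMENT.matcher(input);
        while (m.find()) {
            result.add(m.group(1)); // group 1 is the part after TAG
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(segments("TAGxxxTAGxxxxTAGxxxxx")); // [xxx, xxxx, xxxxx]
    }
}
```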

    Thanks
    hugo, Jan 7, 2004
    #1

  2. "hugo" <> wrote in message
    news:...

    And I would like a double mocha latte with extra whipped cream. And don't
    forget the cinnamon.

    Oh, you don't work here either!
    Matt Humphrey, Jan 7, 2004
    #2

  3. hugo wrote:

    Try:

    ((TAG)(.)*?)*

    *? is the reluctant quantifier.
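
    To see what the reluctant form changes, here is a small comparison.
    It is only a sketch: firstGroup is a hypothetical helper, not part of
    java.util.regex. A reluctant .*? with nothing after it is satisfied by
    zero characters, so it needs something following it to stop against.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuantifierDemo {
    /** Returns group 1 of the first match of regex in input, or null if there is no match. */
    static String firstGroup(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        String input = "TAGxxTAGxxxx";
        // Greedy: .* runs to the end of the input, swallowing the second TAG.
        System.out.println(firstGroup("TAG(.*)", input));     // xxTAGxxxx
        // Reluctant with nothing after it: .*? settles for zero characters.
        System.out.println(firstGroup("TAG(.*?)", input));    // prints an empty line
        // Reluctant with a following TAG: stops at the first place TAG can match.
        System.out.println(firstGroup("TAG(.*?)TAG", input)); // xx
    }
}
```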

    --
    Daniel Sjöblom
    Daniel Sjöblom, Jan 7, 2004
    #3
  4. Daniel Sjöblom wrote:

    >
    >
    > Try:
    >
    > ((TAG)(.)*?)*
    >
    > *? is the reluctant quantifier.
    >


    Actually, that won't work. But there is a far simpler solution to your
    problem. Simply search for TAG once, record the position and search for
    TAG once again from that position onward.
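
    In Java that search-and-record approach could look roughly like the
    following. It is a sketch only: the class name, the method name and
    the choice to return just the text after each TAG are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class IndexOfTagSplit {
    /** Returns the text after each occurrence of tag, up to the next occurrence or the end. */
    static List<String> segments(String input, String tag) {
        if (tag.isEmpty()) {
            throw new IllegalArgumentException("tag must not be empty");
        }
        List<String> result = new ArrayList<>();
        int start = input.indexOf(tag);               // position of the first TAG, or -1
        while (start >= 0) {
            int bodyStart = start + tag.length();     // first character after this TAG
            int next = input.indexOf(tag, bodyStart); // search again from that position
            int bodyEnd = (next >= 0) ? next : input.length();
            result.add(input.substring(bodyStart, bodyEnd));
            start = next;                             // -1 ends the loop
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(segments("TAGxxTAGxxxx", "TAG")); // [xx, xxxx]
    }
}
```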

    If you insist on using a regex, something like this will work, although
    it is not perfect:

    (TAG)(.)*?(TAG)*

    --
    Daniel Sjöblom
    Daniel Sjöblom, Jan 7, 2004
    #4
  5. david m-

    david m- Guest

    TAG(.+?)(?=(TAG)|$)
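
    A possible way to use this pattern from Java is to loop with
    Matcher.find() and read group 1 on each hit. The wrapper below is a
    sketch; the class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadTagDemo {
    // Reluctant body, ended by a lookahead for the next TAG or the end of input.
    private static final Pattern SEGMENT = Pattern.compile("TAG(.+?)(?=(TAG)|$)");

    /** Returns group 1 (the text after TAG) for every match found in input. */
    static List<String> segments(String input) {
        List<String> result = new ArrayList<>();
        Matcher m = SEGMENT.matcher(input);
        while (m.find()) {
            // The lookahead consumes nothing, so the next find() starts at the next TAG.
            result.add(m.group(1));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(segments("TAGxxxTAGxxxxTAGxxxxx")); // [xxx, xxxx, xxxxx]
    }
}
```

    One caveat: because .+? needs at least one character, a TAG followed
    immediately by another TAG is not reported as an empty segment. For
    "TAGTAGxx" the single match has group 1 equal to "TAGxx".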
    david m-, Jan 7, 2004
    #5
  6. Daniel Sjöblom wrote:
    >
    > (TAG)(.)*?(TAG)*
    >


    Which is incorrect of course. I'm a dumbass :)

    --
    Daniel Sjöblom
    Daniel Sjöblom, Jan 7, 2004
    #6