Question on regular expression.

Discussion in 'Java' started by Paul, Mar 15, 2007.

  1. Paul

    Paul Guest

    Here is a pattern and the code snippet I have:

    String pat = "#foo#([+-][0-9]+)*";
    String input = "#foo#+1-2";
    Pattern pattern = Pattern.compile(pat);
    Matcher matcher = pattern.matcher(input);

    while (matcher.find())
    {
        String matched = matcher.group(1);
        ...
    }

    The result I'm hoping for is to get "+1" and "-2" returned for
    "matched" string. However, I'm only getting "-2" and then matcher is
    done.

    Can anyone shed some light on this?

    Thanks,
    Paul.
     
    Paul, Mar 15, 2007
    #1

  2. Paul wrote:
    > Here is a pattern and the code snippet I have:
    >
    > String pat = "#foo#([+-][0-9]+)*";
    > String input = "#foo#+1-2";
    > Pattern pattern = Pattern.compile(pat);
    > Matcher matcher = pattern.matcher(input);
    >
    > while (matcher.find())
    > {
    > String matched = matcher.group(1);
    > ...
    > }
    >
    > The result I'm hoping for is to get "+1" and "-2" returned for
    > "matched" string. However, I'm only getting "-2" and then matcher is
    > done.
    >
    > Can anyone shed some light on this?
    >
    > Thanks,
    > Paul.
    >


    There is only one "#foo#" to match the start of the expression, so only
    the first find() call can succeed.

    What does matcher.groupCount() return after the successful find()? Just
    looking at it, I would have expected 2, with group(1) equal to "+1" and
    group(2) equal to "-2", but you only show a call for group(1).
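A quick way to answer that question is a small test program. The following is a minimal sketch, assuming the pattern and input from the original post; the class name GroupCountCheck and the helper describe are made up for illustration. It reports groupCount(), group(1), and whether a second find() succeeds. Note that groupCount() reflects the number of capturing parentheses in the pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class GroupCountCheck {

    // Returns "<groupCount>|<group(1)>|<result of second find()>" for the input,
    // or "no match" if the first find() fails.
    static String describe(String input) {
        Pattern pattern = Pattern.compile("#foo#([+-][0-9]+)*");
        Matcher matcher = pattern.matcher(input);
        if (!matcher.find()) {
            return "no match";
        }
        // groupCount() counts the capturing parentheses in the pattern,
        // not how many times '*' repeated a group.
        int count = matcher.groupCount();
        String last = matcher.group(1);
        boolean again = matcher.find();
        return count + "|" + last + "|" + again;
    }

    public static void main(String[] args) {
        System.out.println(describe("#foo#+1-2")); // prints 1|-2|false
    }
}
```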

    Patricia
     
    Patricia Shanahan, Mar 15, 2007
    #2

  3. Lars Enderin

    Lars Enderin Guest

    Paul wrote:
    > Here is a pattern and the code snippet I have:
    >
    > String pat = "#foo#([+-][0-9]+)*";
    > String input = "#foo#+1-2";
    > Pattern pattern = Pattern.compile(pat);
    > Matcher matcher = pattern.matcher(input);
    >
    > while (matcher.find())
    > {
    > String matched = matcher.group(1);
    > ...
    > }
    >
    > The result I'm hoping for is to get "+1" and "-2" returned for
    > "matched" string. However, I'm only getting "-2" and then matcher is
    > done.
    >
    > Can anyone shed some light on this?
    >

    Since #foo# is part of pat, you cannot get more than one match. Remove
    #foo# from pat and also remove the *.
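That suggestion might look something like the sketch below, which assumes the input from the original post. The class name SignedNumberScan and the helper scan are made up for illustration. One caveat: without the #foo# prefix in the pattern, signed numbers are matched anywhere in the input, so the prefix would have to be checked separately if it matters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class SignedNumberScan {

    // Collects every signed number in the input, one per find() call.
    static List<String> scan(String input) {
        Matcher matcher = Pattern.compile("[+-][0-9]+").matcher(input);
        List<String> found = new ArrayList<String>();
        while (matcher.find()) {
            // No capturing group is needed; group() is the whole match.
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(scan("#foo#+1-2")); // prints [+1, -2]
    }
}
```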
     
    Lars Enderin, Mar 15, 2007
    #3
  4. Paul

    Paul Guest

    On Mar 15, 9:55 am, Lars Enderin wrote:
    > Paul wrote:
    >
    >
    >
    > > Here is a pattern and the code snippet I have:

    >
    > > String pat = "#foo#([+-][0-9]+)*";
    > > String input = "#foo#+1-2";
    > > Pattern pattern = Pattern.compile(pat);
    > > Matcher matcher = pattern.matcher(input);

    >
    > > while (matcher.find())
    > > {
    > > String matched = matcher.group(1);
    > > ...
    > > }

    >
    > > The result I'm hoping for is to get "+1" and "-2" returned for
    > > "matched" string. However, I'm only getting "-2" and then matcher is
    > > done.

    >
    > > Can anyone shed some light on this?

    >
    > Since #foo# is part of pat, you cannot get more than one match. Remove
    > #foo# from pat and also remove the *.


    The input can be "#foo#", "#foo#+1", "#foo#+1-2", etc. My current
    approach is to modify the input and reset the matcher each time I get
    a match. I don't like this approach. It's similar to what you
    suggested here. Basically, you have to use two patterns during the
    process, right?

    Patricia,

    It returns only 1 group. I was hoping it would behave the way you
    described.

    Thanks,
    Paul.
     
    Paul, Mar 15, 2007
    #4
  5. Paul wrote:
    > Here is a pattern and the code snippet I have:
    >
    > String pat = "#foo#([+-][0-9]+)*";
    > String input = "#foo#+1-2";
    > Pattern pattern = Pattern.compile(pat);
    > Matcher matcher = pattern.matcher(input);
    >
    > while (matcher.find())
    > {
    > String matched = matcher.group(1);
    > ...
    > }
    >
    > The result I'm hoping for is to get "+1" and "-2" returned for
    > "matched" string. However, I'm only getting "-2" and then matcher is
    > done.
    >
    > Can anyone shed some light on this?


    I believe that only one group is defined in the above pattern, and since
    '*' is greedy, the group is repeated as often as possible and ends up
    holding the last successful match of the contained subpattern. The
    following variant shows this better:
    ---------------------------------------------------
    import java.util.regex.*;

    public class Foo {

        public static void main(String[] args) {
            // Two capturing groups: the first signed number, then a repeated group.
            String pat = "#foo#([+-][0-9]+)([+-][0-9]+)*";
            String input = "#foo#+1-2+3-4";
            Pattern pattern = Pattern.compile(pat);
            Matcher matcher = pattern.matcher(input);

            if (matcher.matches()) {
                System.out.println("Found "+matcher.groupCount()+" groups");
                // The repeated group keeps only its last iteration.
                for (int i = 1; i <= matcher.groupCount(); ++i) {
                    System.out.println(""+i+": "+matcher.group(i));
                }
            }

            System.exit(0);
        }

    }
    --------------------------------------------------
    When run, the output is:
    --------------------------------------------------
    ->java Foo
    Found 2 groups
    1: +1
    2: -4
    ->
    --------------------------------------------------

    Unfortunately, I can't see how to specify an arbitrary number of groups...

    --
    Steve Wampler --
    The gods that smiled on your birth are now laughing out loud.
     
    Steve Wampler, Mar 15, 2007
    #5
  6. Steve Wampler wrote:
    > Paul wrote:
    >> Here is a pattern and the code snippet I have:
    >>
    >> String pat = "#foo#([+-][0-9]+)*";
    >> String input = "#foo#+1-2";
    >> Pattern pattern = Pattern.compile(pat);
    >> Matcher matcher = pattern.matcher(input);
    >>
    >> while (matcher.find())
    >> {
    >> String matched = matcher.group(1);
    >> ...
    >> }
    >>
    >> The result I'm hoping for is to get "+1" and "-2" returned for
    >> "matched" string. However, I'm only getting "-2" and then matcher is
    >> done.
    >>
    >> Can anyone shed some light on this?

    >

    ....
    > Unfortunately, I can't see how to specify an arbitrary number of groups...
    >


    I think it is going to have to be done with two patterns.
    As I understand the situation, the current pattern does a good job of
    representing the substring that should be matched, so how about keeping
    something like it, but adding another pattern for splitting up the
    repeated groups?

    // Matches the whole token; group(1) is the full run of signed numbers.
    Pattern outer = Pattern.compile("#foo#(([+-][0-9]+)*)");
    // Matches a single signed number inside that run.
    Pattern inner = Pattern.compile("[+-][0-9]+");

    void test(String input) {
        Matcher outerMatcher = outer.matcher(input);
        if (outerMatcher.find()) {
            System.out.printf("Matched %s in %s%n", outerMatcher.group(0), input);
            // Scan only the captured run, one find() per signed number.
            Matcher innerMatcher = inner.matcher(outerMatcher.group(1));
            while (innerMatcher.find()) {
                System.out.printf("Group %s%n", innerMatcher.group(0));
            }
        } else {
            System.out.println("No match: " + input);
        }
    }
     
    Patricia Shanahan, Mar 15, 2007
    #6
  7. Patricia Shanahan wrote:
    > Steve Wampler wrote:
    >> Unfortunately, I can't see how to specify an arbitrary number of
    >> groups...
    >>

    >
    > I think it is going to have to be done with two patterns.


    I agree. I *should* have said "...arbitrary number of groups
    in a single pattern".
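For what it's worth, a single match still cannot yield an arbitrary number of groups, but repeated find() calls can be driven by one pattern if it is anchored with \G (the end of the previous match). The following is only a sketch of that alternative, not something proposed in this thread; the class name ContinuedMatch and the helper collect are made up for illustration. The (?!^) lookahead is there because \G also matches at the start of the input on the first find().

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class name, for illustration only.
class ContinuedMatch {

    // Each match starts either at "#foo#" or exactly where the previous
    // match ended (\G), so numbers are only accepted directly after the prefix.
    static final Pattern CHAIN =
            Pattern.compile("(?:#foo#|\\G(?!^))([+-][0-9]+)");

    static List<String> collect(String input) {
        Matcher matcher = CHAIN.matcher(input);
        List<String> found = new ArrayList<String>();
        while (matcher.find()) {
            found.add(matcher.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(collect("#foo#+1-2")); // prints [+1, -2]
    }
}
```

Unlike the two-pattern version, this does not report a match for a bare "#foo#" with no numbers after it, so it is only a fit when that case can be handled separately.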

    --
    Steve Wampler --
    The gods that smiled on your birth are now laughing out loud.
     
    Steve Wampler, Mar 15, 2007
    #7
