Regular expression for n consecutive characters

Discussion in 'Java' started by Kannan S, Jun 1, 2005.

  1. Kannan S

    Kannan S Guest

    How do I replace consecutive occurrences of any character in a string
    with just one? E.g. "aaabbbcc" to "abc".

    Pattern repeatedChars = Pattern.compile("([a-z]){2,}+");
    String s1 = repeatedChars.matcher("aaabbbcc").replaceAll("\\1");

    The regex part seems to be correct, but all consecutive occurrences are
    replaced by a single "." and s1 is now "...".

    Any comments?

    Thanks
    --kannan
     
    Kannan S, Jun 1, 2005
    #1

  2. shakah

    shakah Guest

    '([a-z])\1+' worked for me:

    jc@sarah:~/tmp$ cat regextest.java
    public class regextest {
        public static void main(String[] asArgs) {
            // asArgs[0] is the regex; every later argument is an input string.
            java.util.regex.Pattern repeatedChars =
                java.util.regex.Pattern.compile(asArgs[0]);
            for (int nArg = 1; nArg < asArgs.length; ++nArg) {
                // "$1" replaces each match with the text captured by group 1.
                System.out.println("replacement on '" + asArgs[nArg]
                    + "' => '"
                    + repeatedChars.matcher(asArgs[nArg]).replaceAll("$1")
                    + "'");
            }
        }
    }

    jc@sarah:~/tmp$ /usr/java/jdk1.5.0_01/bin/java regextest '([a-z])\1+' aaaabbcc aabbbbbcddde
    replacement on 'aaaabbcc' => 'abc'
    replacement on 'aabbbbbcddde' => 'abcde'
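
    For anyone who wants to try this without command-line arguments, here
    is a minimal self-contained sketch of the same idea. The class and
    method names (CollapseRuns, collapse) are made up for illustration.
    The pattern matches one lowercase letter followed by one or more
    copies of itself via the \1 backreference, and "$1" in the replacement
    puts back a single copy. In Java replacement strings, group references
    are written with a dollar sign; a backslash only escapes the next
    character.

```java
import java.util.regex.Pattern;

class CollapseRuns {
    // One captured lowercase letter followed by one or more copies of it.
    // In a Java string literal the backreference \1 is written "\\1".
    private static final Pattern REPEATED = Pattern.compile("([a-z])\\1+");

    // Hypothetical helper: collapses each run of a repeated letter to one letter.
    static String collapse(String input) {
        // "$1" inserts the text captured by group 1 for each match.
        return REPEATED.matcher(input).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(collapse("aaabbbcc"));      // abc
        System.out.println(collapse("aabbbbbcddde"));  // abcde
    }
}
```

    Like the pattern above, this only collapses runs of a-z. A pattern
    such as (.)\1+ should cover other characters as well, line terminators
    aside.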
     
    shakah, Jun 2, 2005
    #2

  3. hiwa

    hiwa Guest

    Don't use a regex for such a simple problem. Use one or two plain
    String methods.
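
    Which String methods are meant is not spelled out, so the following is
    only one possible reading: a loop over charAt() and length() that
    keeps a character when it differs from the one before it. The names
    CollapseRunsNoRegex and collapse are invented for this sketch.

```java
class CollapseRunsNoRegex {
    // Hypothetical helper: keeps the first character of every run of equal chars.
    static String collapse(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // Append only when this char differs from the previous one.
            if (i == 0 || c != input.charAt(i - 1)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(collapse("aaabbbcc"));  // abc
        System.out.println(collapse("aa  bb!!"));  // a b!
    }
}
```

    Unlike a pattern restricted to [a-z], this treats every char the same
    way, which is closer to the "any character" wording of the question.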

    From java.util.regex.Pattern javadoc:
    <quote>
    The captured input associated with a group is always the subsequence
    that the group most recently matched. If a group is evaluated a second
    time because of quantification then its previously-captured value, if
    any, will be retained if the second evaluation fails. Matching the
    string "aba" against the expression (a(b)?)+, for example, leaves
    group two set to "b". All captured input is discarded at the beginning
    of each match.
    </quote>
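
    The quoted rule can be checked directly. The sketch below (class and
    method names are made up) prints the group contents for the Javadoc
    example and for the pattern from the question. Because ([a-z]){2,}+
    accepts any lowercase letter on every repetition, it matches the whole
    of "aaabbbcc" as a single match, and group one is left holding only
    the last letter it matched.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCaptureDemo {
    // Hypothetical helper: text of the given group after the first find().
    static String groupAfterFind(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            throw new IllegalArgumentException("no match for " + regex);
        }
        return m.group(group);
    }

    public static void main(String[] args) {
        // Javadoc example: group two keeps "b" from the earlier iteration.
        System.out.println(groupAfterFind("(a(b)?)+", "aba", 2));            // b
        // The question's pattern consumes the entire input as one match...
        System.out.println(groupAfterFind("([a-z]){2,}+", "aaabbbcc", 0));  // aaabbbcc
        // ...and group one holds only the most recently matched letter.
        System.out.println(groupAfterFind("([a-z]){2,}+", "aaabbbcc", 1));  // c
    }
}
```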
     
    hiwa, Jun 2, 2005
    #3