Getting substring by regex

Discussion in 'Java' started by Christine Mayer, Sep 6, 2007.

  1. Hi, I have a String that is composed of digits, white space, letters
    and other characters.
    Example: 03... London (first two digits of post code, plus 3 dots for
    the remaining digits).

    I want to go through the String and search for the first occurrence of
    a letter (A-Za-z).
    Then I want the String from this point on, excluding the "post code
    String" containing only numbers, whitespace and dots.
    The class String seems to have a "split(regex)" function, but this
    didn't work for me.

    Any idea how this could be done?

    Thanks in advance,

    Christine Mayer, Sep 6, 2007
  2. Look at matching for regex:
    Joshua Cranmer, Sep 6, 2007
  3. Well, I know the Pattern class, but I don't think it can help here.
    You were probably thinking of its split function (which seems to do
    just the same as the String.split function does).

    In the API, it gives the following example:

    The input "boo:and:foo", for example, yields the following results
    with these parameters:

    Regex  Limit  Result
    :      2      { "boo", "and:foo" }
    :      5      { "boo", "and", "foo" }
    :      -2     { "boo", "and", "foo" }
    o      5      { "b", "", ":and:f", "", "" }
    o      -2     { "b", "", ":and:f", "", "" }
    o      0      { "b", "", ":and:f" }

    However, in all these examples there is only one character as "regex".
    In my case I need a whole String as the regex and, if it is found, I
    need to chop off this part from the String...
    Christine Mayer, Sep 6, 2007
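The API examples happen to use one-character patterns, but the regex argument can be any pattern. A minimal sketch, assuming the "post code" prefix consists only of digits, dots and whitespace; the class name StripPrefixDemo and the helper stripPrefix are hypothetical and not from the thread:

```java
import java.util.Arrays;

class StripPrefixDemo {

    // Hypothetical helper: removes a leading run of non-letter characters.
    // If the string already starts with a letter, it is returned unchanged.
    static String stripPrefix(String s) {
        return s.replaceFirst("^[^A-Za-z]+", "");
    }

    public static void main(String[] args) {
        String input = "03... London";

        // split accepts a multi-character pattern; a limit of 2 yields at most two pieces.
        // Because the pattern matches at the very start, parts[0] is the empty string.
        String[] parts = input.split("[0-9.\\s]+", 2);
        System.out.println(Arrays.toString(parts));

        // replaceFirst avoids the empty leading element altogether.
        System.out.println(stripPrefix(input));
    }
}
```

With split, the wanted text ends up in the second array element, so replaceFirst is probably the more direct choice here.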
  4. You obviously did not read the link I gave you. On that page, under the
    heading "Groups and capturing":
    Capturing groups are so named because, during a match, each
    subsequence of the input sequence that matches such a group is saved.
    The captured subsequence may be used later in the expression, via a back
    reference, and may also *be retrieved from the matcher once the match
    operation is complete.* [ My emphasis. ]
    Joshua Cranmer, Sep 6, 2007
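The capturing-group approach quoted above could look roughly like this. It is a sketch under the assumption that the prefix contains only digits, dots and whitespace; the pattern, the class name CaptureGroupDemo and the method afterPostCode are illustrative, not taken from the thread:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CaptureGroupDemo {

    // Group 1 captures everything from the first letter to the end of the input.
    private static final Pattern POST_CODE_THEN_TEXT =
            Pattern.compile("^[0-9.\\s]*([A-Za-z].*)$");

    // Hypothetical helper: returns the captured text, or empty if the input
    // does not have the assumed "prefix, then letter" shape.
    static Optional<String> afterPostCode(String s) {
        Matcher m = POST_CODE_THEN_TEXT.matcher(s);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(afterPostCode("03... London").orElse("(no match)"));
        System.out.println(afterPostCode("12345").orElse("(no match)"));
    }
}
```

Note that matches() requires the whole input to fit the pattern, so a string with no letter at all produces no match rather than an empty result.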
  5. You don't need capturing groups for this simple task.
    import java.util.regex.*;

    public class ChristineMayer {

        public static void main(String[] args) {

            String[] texts = {"03... London",
                              "18... Christine",
                              "35... Mayer",
                              "77... Bagdad"};

            String regx = "[A-Za-z]+"; // substring composed of English letters

            Pattern pat = Pattern.compile(regx);
            for (String s : texts) {
                Matcher mat = pat.matcher(s);
                while (mat.find()) {
                    // The original post is cut off at this point; printing
                    // each match is an assumed completion.
                    System.out.println(mat.group());
                }
            }
        }
    }
    SadRed, Sep 7, 2007
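A pattern such as "[A-Za-z]+" returns each run of letters separately, so a name like "New York" would come back as two matches. If the goal is the whole remainder of the String from the first letter on, as the original question asks, one possible variant is to find a single letter and take the substring from its position. This is a sketch; the class and method names are hypothetical:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindFirstLetterDemo {

    private static final Pattern LETTER = Pattern.compile("[A-Za-z]");

    // Hypothetical helper: returns the input from its first ASCII letter onward,
    // or an empty string if it contains no letter (an assumed convention).
    static String fromFirstLetter(String s) {
        Matcher m = LETTER.matcher(s);
        return m.find() ? s.substring(m.start()) : "";
    }

    public static void main(String[] args) {
        System.out.println(fromFirstLetter("03... London"));
        System.out.println(fromFirstLetter("10... New York"));
    }
}
```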
  6. See the section on matching vs finding.

    You might find this easier to do by stepping through the string
    char by char. Write yourself a method that categorizes a char and
    returns an enum, e.g. ALPHA, NUM, DOT, OTHER, to use in the loop.

    Roedy Green, Sep 7, 2007
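The char-by-char suggestion might be sketched as follows. The enum constants follow the names given in the post; the class name CharScanDemo, the method names and the choice to treat only ASCII letters as ALPHA are assumptions made for illustration:

```java
class CharScanDemo {

    enum CharKind { ALPHA, NUM, DOT, OTHER }

    // Classifies a single char; whitespace and everything else falls under OTHER.
    static CharKind categorize(char c) {
        if ((c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z')) {
            return CharKind.ALPHA;
        }
        if (c >= '0' && c <= '9') {
            return CharKind.NUM;
        }
        if (c == '.') {
            return CharKind.DOT;
        }
        return CharKind.OTHER;
    }

    // Returns the input from its first letter onward, or an empty string
    // if no letter is found (an assumed convention).
    static String fromFirstLetter(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (categorize(s.charAt(i)) == CharKind.ALPHA) {
                return s.substring(i);
            }
        }
        return "";
    }

    public static void main(String[] args) {
        System.out.println(fromFirstLetter("03... London"));
    }
}
```

For this particular task the enum is more than strictly needed, but it leaves room to validate the prefix as well, for example by rejecting an OTHER char that is not whitespace.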
