How to write Regular Expression for recursive matching?

Discussion in 'Python' started by lisong, Nov 26, 2007.

  1. lisong

    lisong Guest

    Hi All,

    I have a problem splitting a string like this:

    'abc.defg.hij.klmnop'

    I want to get all substrings with exactly one '.' in the middle, so the
    output I expect is:

    'abc.defg', 'defg.hij', 'hij.klmnop'

    A simple regular expression, '\w+\.\w+', will only return:
    'abc.defg', 'hij.klmnop'

    Is there a way to get 'defg.hij' using a regular expression?

    Thanks,
    lisong, Nov 26, 2007
    #1
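The behaviour described in the question can be reproduced with a short sketch. It assumes the intended pattern was `\w+\.\w+` (escaped dot, `+` on both sides), since that is the pattern that produces the two results quoted above; the variable names are illustrative only.

```python
import re

s = 'abc.defg.hij.klmnop'

# findall scans left to right and never reuses characters that an
# earlier match has already consumed, so 'defg' is used up by the
# first match and cannot begin a second one.
matches = re.findall(r'\w+\.\w+', s)
print(matches)
```

The first match ends after 'defg', so the next search resumes at the following '.', which is why 'defg.hij' never appears.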

  2. lisong

    Paul McGuire Guest


    Why are you using regular expressions? Use the split method defined
    for strings:

    >>> 'abc.defg.hij.klmnop'.split('.')
    ['abc', 'defg', 'hij', 'klmnop']

    -- Paul
    Paul McGuire, Nov 26, 2007
    #2

  3. lisong

    Boris Borcic Guest


    Do you need it to be a regular expression ?

    >>> def f(s):
    ...     ws = s.split('.')
    ...     # pair each word with its right-hand neighbour, then rejoin with '.'
    ...     return list(map('.'.join, zip(ws, ws[1:])))
    ...
    >>> f('abc.defg.hij.klmnop')
    ['abc.defg', 'defg.hij', 'hij.klmnop']
    Boris Borcic, Nov 26, 2007
    #3
  4. lisong wrote:

    > is there a way to get 'defg.hij' using regular expression?


    Nope. Regular expressions can't go back in their input stream, at least not
    for this kind of thing.

    The problem at hand is easily solved using

    s = 'abc.defg.hij.klmnop'

    pairs = [".".join(v) for v in zip(s.split(".")[:-1], s.split(".")[1:])]

    Diez
    Diez B. Roggisch, Nov 26, 2007
    #4
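A side note on the regular-expression route: ordinary matches cannot overlap, but Python's `re` module supports zero-width lookahead assertions, and a capture group inside a lookahead does not consume input. The sketch below is one possible way to use that; the pattern is an illustration, not something proposed in the thread, and the split-based answers remain simpler.

```python
import re

s = 'abc.defg.hij.klmnop'

# \b only allows a match to start at a word boundary; the lookahead
# (?=...) captures a 'word.word' pair without consuming it, so the
# next search can begin inside the text that was just captured.
pairs = re.findall(r'\b(?=(\w+\.\w+))', s)
print(pairs)
```

Without the leading `\b`, the lookahead would also succeed in the middle of a word and return fragments such as 'bc.defg'.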
  5. lisong

    Paul McGuire Guest


    Sorry, misread your post - Diez Roggisch has the right answer.

    -- Paul
    Paul McGuire, Nov 26, 2007
    #5
  6. On Mon, Nov 26, 2007 at 06:04:54PM +0100, Diez B. Roggisch wrote regarding Re: How to write Regular Expression for recursive matching?:
    > Nope. Regular expressions can't go back in their input stream, at least not
    > for this kind of thing.
    >
    > The problem at hand is easily solved using
    >
    > s = 'abc.defg.hij.klmnop'
    >
    > pairs = [".".join(v) for v in zip(s.split(".")[:-1], s.split(".")[1:])]
    >


    which is veritably perlesque in its elegance and simplicity!

    A slightly more verbose version:

    l = s.split('.')
    pairs = []
    # join each element with the one that follows it
    for x in range(len(l) - 1):
        pairs.append('.'.join(l[x:x+2]))

    Cheers,
    Cliff
    J. Clifford Dyer, Nov 26, 2007
    #6
  7. lisong

    lisong Guest


    Thank you all for your kind replies. I agree, RE is not necessary here.

    Song
    lisong, Nov 26, 2007
    #7
