Regexp parser and generator

Discussion in 'Python' started by George Sakkis, Nov 4, 2008.

  1. Is there any package that parses regular expressions and returns an
    AST? Something like:

    >>> parse_rx(r'i (love|hate) h(is|er) (cat|dog)s?\s*!+')
    Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ', Or('cat',
    'dog'), Optional('s'), ZeroOrMore(r'\s'), OneOrMore('!'))

    Given such a structure, I want to create a generator that can generate
    all strings matched by this regexp. Obviously if the regexp contains a
    '*' or '+' the generator is infinite, and although it can be
    artificially constrained by, say, a maxdepth parameter, for now I'm
    interested in finite regexps only. It shouldn't be too hard to write
    one from scratch but just in case someone has already done it, so much
    the better.

    George
    George Sakkis, Nov 4, 2008
    #1
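For the finite case described above, the generation step can be sketched in a few lines once an AST exists. This is only an illustration: `Regex`, `Or` and `Optional` borrow their names from the hypothetical `parse_rx` output in the post, while `Lit`, `as_node` and the `strings()` method are assumptions introduced here and do not belong to any existing package. It needs Python 3.3+ for `yield from`.

```python
from itertools import product


def as_node(item):
    """Wrap plain strings in Lit so every child exposes strings() (hypothetical helper)."""
    return Lit(item) if isinstance(item, str) else item


class Lit:
    """A literal piece of text; matches exactly one string."""
    def __init__(self, text):
        self.text = text

    def strings(self):
        yield self.text


class Or:
    """Alternation: the union of the strings of each alternative."""
    def __init__(self, *alts):
        self.alts = [as_node(a) for a in alts]

    def strings(self):
        for alt in self.alts:
            yield from alt.strings()


class Optional:
    """Zero or one occurrence: the empty string plus the item's strings."""
    def __init__(self, item):
        self.item = as_node(item)

    def strings(self):
        yield ""
        yield from self.item.strings()


class Regex:
    """Concatenation: the cartesian product of the parts' strings."""
    def __init__(self, *parts):
        self.parts = [as_node(p) for p in parts]

    def strings(self):
        pools = [list(p.strings()) for p in self.parts]
        for combo in product(*pools):
            yield "".join(combo)


# The finite part of the example pattern: 2 * 2 * 2 * 2 = 16 strings.
rx = Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ',
           Or('cat', 'dog'), Optional('s'))
for s in rx.strings():
    print(s)
```

Unbounded nodes such as `ZeroOrMore` and `OneOrMore` are deliberately left out; supporting them would need a cap such as the `maxdepth` parameter mentioned above.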

  2. Guest

    George> Is there any package that parses regular expressions and returns
    George> an AST ?

    Maybe not directly, but this might provide a starting point for building
    such a beast:

    >>> import re
    >>> re.compile("[ab]", 128)  # 128 is the value of re.DEBUG
    in
      literal 97
      literal 98
    <_sre.SRE_Pattern object at 0x47b7a0>
    >>> re.compile("ab*c[xyz]", 128)
    literal 97
    max_repeat 0 65535
      literal 98
    literal 99
    in
      literal 120
      literal 121
      literal 122
    <_sre.SRE_Pattern object at 0x371f90>

    Skip
    , Nov 4, 2008
    #2
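The debug flag only prints the parse tree; it does not return it. If the dump is wanted as text, say for a quick look from a script, one option is to capture standard output while compiling. The helper name `debug_dump` is made up for this sketch, it assumes Python 3.4+ for `contextlib.redirect_stdout`, and the exact text printed differs between Python versions.

```python
import contextlib
import io
import re


def debug_dump(pattern):
    """Return whatever re.compile(pattern, re.DEBUG) prints to stdout."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        # Patterns compiled with re.DEBUG are printed as they are parsed.
        re.compile(pattern, re.DEBUG)
    return buf.getvalue()


dump = debug_dump("ab*c[xyz]")
print(dump)
```

The result is plain text meant for humans, so it is a starting point for inspection rather than a structure to build a generator on.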

  3. Peter Otten Guest

    George Sakkis wrote:

    > Is there any package that parses regular expressions and returns an
    > AST? Something like:
    >
    > >>> parse_rx(r'i (love|hate) h(is|er) (cat|dog)s?\s*!+')
    > Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ', Or('cat',
    > 'dog'), Optional('s'), ZeroOrMore(r'\s'), OneOrMore('!'))


    Seen today, on planet python:

    >>> import sre_parse
    >>> sre_parse.parse("a|b")
    [('in', [('literal', 97), ('literal', 98)])]


    Peter
    Peter Otten, Nov 4, 2008
    #3
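A rough sketch of how the `sre_parse` output could drive the generator for finite patterns follows. Several things here are assumptions: `sre_parse` is an undocumented internal module, on Python 3.11+ it lives at `re._parser` (with `sre_parse` kept as a deprecated alias), and the `(opcode, argument)` shapes handled below are CPython implementation details that may change. The function names `expand` and `expand_node` are invented for this example.

```python
from itertools import product

try:  # Python 3.11+: the parser is a private submodule of re
    from re import _parser as sre_parse, _constants as C
except ImportError:  # older Python 3 releases
    import sre_parse
    import sre_constants as C


def expand(tree):
    """Yield every string matched by a parsed pattern (finite constructs only)."""
    pools = [list(expand_node(op, av)) for op, av in tree]
    for combo in product(*pools):
        yield "".join(combo)


def expand_node(op, av):
    """Yield the strings matched by a single (opcode, argument) item."""
    if op == C.LITERAL:
        yield chr(av)
    elif op == C.IN:  # character set such as [ab] or [a-c]
        for kind, val in av:
            if kind == C.LITERAL:
                yield chr(val)
            elif kind == C.RANGE:
                lo, hi = val
                for code in range(lo, hi + 1):
                    yield chr(code)
            else:  # negation, \s, \w, ...: too large or not handled here
                raise ValueError("unsupported set item: %r" % (kind,))
    elif op == C.BRANCH:  # alternation; av is (None, [alternative, ...])
        for alt in av[1]:
            yield from expand(alt)
    elif op == C.SUBPATTERN:  # group; the sub-pattern is the last element
        yield from expand(av[-1])
    elif op in (C.MAX_REPEAT, C.MIN_REPEAT):
        lo, hi, sub = av
        if hi == C.MAXREPEAT:  # '*' and '+' would make the result infinite
            raise ValueError("unbounded repeat")
        items = list(expand(sub))
        for n in range(lo, hi + 1):
            for combo in product(items, repeat=n):
                yield "".join(combo)
    else:
        raise ValueError("unsupported construct: %r" % (op,))


for s in expand(sre_parse.parse(r'i (love|hate) h(is|er) (cat|dog)s?')):
    print(s)
```

Opcodes are compared against the constants module rather than against strings such as `'literal'`, since the printed form of the opcodes differs between Python 2 and Python 3.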
  4. Paul McGuire Guest

    On Nov 4, 1:34 pm, George Sakkis wrote:
    > Is there any package that parses regular expressions and returns an
    > AST? Something like:
    >
    > >>> parse_rx(r'i (love|hate) h(is|er) (cat|dog)s?\s*!+')
    > Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ', Or('cat',
    > 'dog'), Optional('s'), ZeroOrMore(r'\s'), OneOrMore('!'))
    >
    > Given such a structure, I want to create a generator that can generate
    > all strings matched by this regexp. Obviously if the regexp contains a
    > '*' or '+' the generator is infinite, and although it can be
    > artificially constrained by, say, a maxdepth parameter, for now I'm
    > interested in finite regexps only. It shouldn't be too hard to write
    > one from scratch but just in case someone has already done it, so much
    > the better.
    >
    > George


    Check out this pyparsing regex inverter: http://pyparsing.wikispaces.com/file/view/invRegex.py

    Here is what your example generates:
    i (love|hate) h(is|er) (cat|dog)s?
    Parse time: 0.17 seconds
    16
    i love his cat
    i love his cats
    i love his dog
    i love his dogs
    i love her cat
    i love her cats
    i love her dog
    i love her dogs
    i hate his cat
    i hate his cats
    i hate his dog
    i hate his dogs
    i hate her cat
    i hate her cats
    i hate her dog
    i hate her dogs

    -- Paul
    Paul McGuire, Nov 5, 2008
    #4
  5. On Nov 4, 9:56 pm, Paul McGuire wrote:
    > On Nov 4, 1:34 pm, George Sakkis wrote:
    >
    > > Is there any package that parses regular expressions and returns an
    > > AST? Something like:
    > >
    > > >>> parse_rx(r'i (love|hate) h(is|er) (cat|dog)s?\s*!+')
    > > Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ', Or('cat',
    > > 'dog'), Optional('s'), ZeroOrMore(r'\s'), OneOrMore('!'))
    > >
    > > Given such a structure, I want to create a generator that can generate
    > > all strings matched by this regexp. Obviously if the regexp contains a
    > > '*' or '+' the generator is infinite, and although it can be
    > > artificially constrained by, say, a maxdepth parameter, for now I'm
    > > interested in finite regexps only. It shouldn't be too hard to write
    > > one from scratch but just in case someone has already done it, so much
    > > the better.
    > >
    > > George
    >
    > Check out this pyparsing regex inverter: http://pyparsing.wikispaces.com/file/view/invRegex.py

    Neat, seems like a good excuse to look into pyparsing :)

    Best,
    George
    George Sakkis, Nov 6, 2008
    #5
  6. On Nov 4, 3:30 pm, Peter Otten wrote:

    > George Sakkis wrote:
    > > Is there any package that parses regular expressions and returns an
    > > AST? Something like:
    > >
    > > >>> parse_rx(r'i (love|hate) h(is|er) (cat|dog)s?\s*!+')
    > > Regex('i ', Or('love', 'hate'), ' h', Or('is', 'er'), ' ', Or('cat',
    > > 'dog'), Optional('s'), ZeroOrMore(r'\s'), OneOrMore('!'))
    >
    > Seen today, on planet python:
    >
    > >>> import sre_parse
    > >>> sre_parse.parse("a|b")
    > [('in', [('literal', 97), ('literal', 98)])]
    >
    > Peter

    Thanks, that's rather low-level and undocumented, but it does the job.

    Best,
    George
    George Sakkis, Nov 6, 2008
    #6