regex compression?

Discussion in 'Perl Misc' started by Bill, Apr 19, 2004.

  1. Bill

    Bill Guest

    Hello,

    I wonder if anyone is aware of a package that will rewrite a set of
    regex patterns to reduce their number. For example, given the four
    regex patterns

    /all\d+ of the/
    /crab.*apple tree/
    /all 3 of them/
    /crabapple tree/

    reduce to something like

    /crab.*apple tree/
    /all\d+ of them?/

    that will match the same set of strings.

    I don't need something that just does an 'or' of all the supplied regexes
    into one though.
    Bill, Apr 19, 2004
    #1

  2. (Bill) writes:

    > I wonder if anyone is aware of a package that will rewrite a set of
    > regex patterns to reduce their number.


    This question is oft asked here in various guises and I think the
    answer is "no". I think the problem is quite possibly "hard" (in the
    CS sense of the word).

    > For example, given the four
    > regex patterns
    >
    > /all\d+ of the/
    > /crab.*apple tree/
    > /all 3 of them/
    > /crabapple tree/
    >
    > reduce to something like
    >
    > /crab.*apple tree/
    > /all\d+ of them?/
    >
    > that will match the same set of strings.


    Actually it won't.

    Even if I discount the missing space between /all/ and /\d+/ as a typo
    it still won't.

    > I don't need something that just does an 'or' of all the supplied regexes
    > into one though.


    Why?

    --
    \\ ( )
    . _\\__[oo
    .__/ \\ /\@
    . l___\\
    # ll l\\
    ###LL LL\\
    Brian McCauley, Apr 19, 2004
    #2
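The claim that the reduced set does not match the same strings can be checked directly. The following is a minimal sketch using the patterns exactly as posted (typos included); the helper name `matches_any` and the sample strings are illustrative choices, not something from the thread.

```perl
use strict;
use warnings;

# The four patterns from the original post, exactly as written.
my @original = (
    qr/all\d+ of the/,
    qr/crab.*apple tree/,
    qr/all 3 of them/,
    qr/crabapple tree/,
);

# The proposed reduced set, exactly as written.
my @reduced = (
    qr/crab.*apple tree/,
    qr/all\d+ of them?/,
);

# Hypothetical helper: true if any pattern in the list matches the string.
sub matches_any {
    my ($string, @patterns) = @_;
    for my $re (@patterns) {
        return 1 if $string =~ $re;
    }
    return 0;
}

# Sample strings chosen for illustration only.
for my $s ('all 3 of them', 'all3 of the', 'crabapple tree') {
    printf "%-16s original=%d reduced=%d\n",
        $s, matches_any($s, @original), matches_any($s, @reduced);
}
```

As posted, the string 'all 3 of them' is accepted by the original set (through the third pattern) but rejected by the reduced set, because `/all\d+ of them?/` requires a digit immediately after "all".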

  3. Bill

    Bill Guest

    > Even if I discount the missing space between /all/ and /\d+/ as a typo
    > it still won't.
    >
    > > I don't need something that just does an 'or' of all the supplied regexes
    > > into one though.
    >
    > Why?


    I was trying to optimize Mail::Milter::Module::ConnectRegex to make it run
    faster. Sorry about the typos above.
    Bill, Apr 20, 2004
    #3
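For comparison, the plain 'or' approach that the original post set aside can be sketched as below: the patterns are joined into one alternation so each input string is tested once instead of once per pattern. This is only an illustration with the patterns as posted; the variable names are made up here, and whether it actually runs faster depends on the patterns and the Perl version, so it would need benchmarking.

```perl
use strict;
use warnings;

# Pattern sources as posted in the thread (kept as plain strings here).
my @sources = (
    'all\d+ of the',
    'crab.*apple tree',
    'all 3 of them',
    'crabapple tree',
);

# Wrap each pattern in a non-capturing group so the '|' cannot
# bind to only part of a neighbouring pattern, then compile once.
my $alternation = join '|', map { "(?:$_)" } @sources;
my $combined    = qr/$alternation/;

# Sample strings chosen for illustration only.
for my $s ('crabapple tree', 'all 3 of them', 'pear tree') {
    printf "%-16s %s\n", $s, ($s =~ $combined ? 'match' : 'no match');
}
```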
  4. Bill

    Bill Guest

    Brian McCauley wrote:
    > (Bill) writes:
    >
    > > I wonder if anyone is aware of a package that will rewrite a set of
    > > regex patterns to reduce their number.
    >
    > This question is oft asked here in various guises and I think the
    > answer is "no". I think the problem is quite possibly "hard" (in the
    > CS sense of the word).
    >


    Aha, just found Regexp::Optimizer. Silly, I'd been searching CPAN
    using 'regex' instead of 'regexp'. Never mind :).
    Bill, Apr 20, 2004
    #4
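A rough sketch of how Regexp::Optimizer might be used follows. It assumes the module's documented `new` and `optimize` methods; the module comes from CPAN rather than core Perl, so the sketch falls back to the plain alternation when it is not installed. The sample words are illustrative. Note that the module rewrites alternations inside a single regex, so the patterns still have to be joined with '|' first.

```perl
use strict;
use warnings;

# Illustrative literal alternatives with a shared prefix.
my @words = qw(foobar fooxar foozap);

# Start from a plain alternation of the literals.
my $alternation = join '|', map { quotemeta } @words;
my $re = qr/$alternation/;

# Regexp::Optimizer is a CPAN module, so load it only if available.
if (eval { require Regexp::Optimizer; 1 }) {
    # Assumed usage: optimize() takes a regex and returns a rewritten one.
    my $optimized = eval { Regexp::Optimizer->new->optimize($re) };
    $re = $optimized if defined $optimized;
}

print "pattern: $re\n";
print "fooxar: ", ('fooxar' =~ $re ? 'match' : 'no match'), "\n";
```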
