regex with specific list of string

Discussion in 'Python' started by james_027, Sep 26, 2007.

  1. james_027

    james_027 Guest

    Hi,

    How do I write a regex that checks whether a value matches any one
    of these: 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    'sep', 'oct', 'nov', 'dec'?

    Thanks
    james
     
    james_027, Sep 26, 2007
    #1

  2. On Wed, 2007-09-26 at 15:42 +0000, james_027 wrote:
    > hi,
    >
    > how do I regex that could check on any of the value that match any one
    > of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    > 'sep', 'oct', 'nov', 'dec'


    Why regex? You can simply check if the given value is contained in the
    set of allowed values:

    >>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    ...          'sep', 'oct', 'nov', 'dec'])
    >>> 'jan' in s
    True
    >>> 'spam' in s
    False

    HTH,

    --
    Carsten Haese
    http://informixdb.sourceforge.net
     
    Carsten Haese, Sep 26, 2007
    #2

  3. Steve Holden

    Steve Holden Guest

    james_027 wrote:
    > hi,
    >
    > how do I regex that could check on any of the value that match any one
    > of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    > 'sep', 'oct', 'nov', 'dec'
    >
    > Thanks


    >>> import re
    >>> patr = re.compile('jan|feb|mar|apr|may|jun|jul|aug|sep|nov|oct|dec')
    >>> patr.match("jul")
    <_sre.SRE_Match object at 0x7ff28ad8>
    >>> patr.match("nosuch")
    >>>


    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    Holden Web LLC/Ltd http://www.holdenweb.com
    Skype: holdenweb http://del.icio.us/steve.holden

    Sorry, the dog ate my .sigline
     
    Steve Holden, Sep 26, 2007
    #3
  4. Carsten Haese wrote:
    > On Wed, 2007-09-26 at 15:42 +0000, james_027 wrote:
    >> hi,
    >>
    >> how do I regex that could check on any of the value that match any one
    >> of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    >> 'sep', 'oct', 'nov', 'dec'
    >
    > Why regex? You can simply check if the given value is contained in the
    > set of allowed values:
    >
    > >>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    > ...          'sep', 'oct', 'nov', 'dec'])
    > >>> 'jan' in s


    Also, check the calendar module for a locale-aware (vs. hardcoded) version:

    >>> import calendar
    >>> [calendar.month_abbr[i].lower() for i in range(1, 13)]
    ['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug', 'sep', 'oct', 'nov', 'dec']

    If you still want to use regexes, you can do something like:

    >>> import re
    >>> pattern = '(?:%s)' % '|'.join(calendar.month_abbr[1:13])
    >>> pattern
    '(?:Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)'
    >>> re.search(pattern, "we are in september", re.IGNORECASE)
    <_sre.SRE_Match object at 0xb7ced640>
    >>> re.search(pattern, "we are in september", re.IGNORECASE).group()
    'sep'
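
    The same join approach can be sketched for an arbitrary list of strings. This is only an illustration, not something from the thread: the helper name build_alternation and the sample words are made up. It assumes the strings may contain regex metacharacters (hence re.escape) and that some may be prefixes of others (hence longest-first ordering).

```python
import re

def build_alternation(words):
    """Return a non-capturing group matching any one of `words` literally.

    Hypothetical helper, for illustration only.
    """
    # Try longer alternatives first so that a short word which is a prefix
    # of a longer one does not win the match prematurely.
    ordered = sorted(set(words), key=len, reverse=True)
    # re.escape() makes characters such as '+' or '.' match literally.
    return '(?:%s)' % '|'.join(re.escape(w) for w in ordered)

pattern = build_alternation(['jan', 'feb', 'c++', 'c'])
match = re.search(pattern, 'written in c++')
print(match.group())
```

    Month abbreviations contain no special characters, so for this thread's list the escaping is a no-op; it only matters once the list comes from somewhere less predictable.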

    If you want to make sure that the month name begins a word, use the following pattern instead:

    >>> pattern = r'(?:\b%s)' % r'|\b'.join(calendar.month_abbr[1:13])
    >>> pattern
    '(?:\\bJan|\\bFeb|\\bMar|\\bApr|\\bMay|\\bJun|\\bJul|\\bAug|\\bSep|\\bOct|\\bNov|\\bDec)'

    If in doubt, Google for "regular expressions in python" or go to http://docs.python.org/lib/module-re.html


    Regards,
    Pablo
     
    Pablo Ziliani, Sep 26, 2007
    #4
  5. On Wed, 2007-09-26 at 12:49 -0400, Steve Holden wrote:
    > james_027 wrote:
    > > hi,
    > >
    > > how do I regex that could check on any of the value that match any one
    > > of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    > > 'sep', 'oct', 'nov', 'dec'
    > >
    > > Thanks
    >
    > >>> patr = re.compile('jan|feb|mar|apr|may|jun|jul|aug|sep|nov|oct|dec')
    > >>> patr.match("jul")
    > <_sre.SRE_Match object at 0x7ff28ad8>
    > >>> patr.match("nosuch")

    Unfortunately, that also matches margarine, mayonnaise, and octopus,
    just to name a few ;-)
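
    A minimal sketch of the difference, assuming exact matches are what is wanted: match() only anchors at the start of the string, so anything that merely begins with a month abbreviation is accepted. Grouping the alternation and appending \Z is one way to require the whole string. The names loose and strict are illustrative only.

```python
import re

MONTHS = ['jan', 'feb', 'mar', 'apr', 'may', 'jun',
          'jul', 'aug', 'sep', 'oct', 'nov', 'dec']

# Unanchored alternation, as in the quoted pattern: match() only checks
# that the string *starts* with one of the alternatives.
loose = re.compile('|'.join(MONTHS))

# Grouped alternation followed by \Z: the whole string must be a month.
strict = re.compile(r'(?:%s)\Z' % '|'.join(MONTHS))

for word in ['jul', 'margarine', 'mayonnaise', 'octopus', 'nosuch']:
    print(word, bool(loose.match(word)), bool(strict.match(word)))
```

    Here loose accepts 'jul' as well as the three food words, while strict accepts only 'jul'. The group matters: without it, \Z would bind to the last alternative only.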

     
    Carsten Haese, Sep 26, 2007
    #5
  6. Steve Holden

    Steve Holden Guest

    Carsten Haese wrote:
    > On Wed, 2007-09-26 at 12:49 -0400, Steve Holden wrote:
    >> james_027 wrote:
    >>> hi,
    >>>
    >>> how do I regex that could check on any of the value that match any one
    >>> of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    >>> 'sep', 'oct', 'nov', 'dec'
    >>>
    >>> Thanks
    >> >>> patr = re.compile('jan|feb|mar|apr|may|jun|jul|aug|sep|nov|oct|dec')
    >> >>> patr.match("jul")
    >> <_sre.SRE_Match object at 0x7ff28ad8>
    >> >>> patr.match("nosuch")
    >
    > Unfortunately, that also matches margarine, mayonnaise, and octopus,
    > just to name a few ;-)
    >

    Indeed, but I think the essential point was served. Unlike the
    mayonnaise and octopus.

    regards
    Steve
     
    Steve Holden, Sep 26, 2007
    #6
  7. Carsten Haese wrote:
    > On Wed, 2007-09-26 at 12:49 -0400, Steve Holden wrote:
    >> james_027 wrote:
    >>> hi,
    >>>
    >>> how do I regex that could check on any of the value that match any one
    >>> of these ... 'jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    >>> 'sep', 'oct', 'nov', 'dec'
    >>>
    >>> Thanks
    >> >>> patr = re.compile('jan|feb|mar|apr|may|jun|jul|aug|sep|nov|oct|dec')
    >> >>> patr.match("jul")
    >> <_sre.SRE_Match object at 0x7ff28ad8>
    >> >>> patr.match("nosuch")
    >
    > Unfortunately, that also matches margarine, mayonnaise, and octopus,
    > just to name a few ;-)


    (and so does the solution you sent before :)

    This is fine IMO since the OP didn't specify the opposite.

    BTW, in my previous post I included an example that ensures that the
    month matches at the beginning of a word. That assumed he might want
    to match e.g. "dec" against "December" (BTW, the pattern should have
    been r'\b(?:Jan|Feb|...)' instead). To always match a whole word, a
    trailing \b can be added to the pattern OR (much better) if the month
    can appear in both its abbreviated and full form, he can use the
    extended set as follows (I hope this is clear, excuse my Thunderbird...):

    >>> pattern = r"\b(?:%s)\b" % '|'.join(calendar.month_name[1:13] +
    ...                                    calendar.month_abbr[1:13])
    >>> pattern
    '\\b(?:January|February|March|April|May|June|July|August|September|October|November|December|Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)\\b'
    >>> target = "Unlike Julia, I like apricots with mayo in august or sep"
    >>> target
    'Unlike Julia, I like apricots with mayo in august or sep'
    >>> re.findall(pattern, target, re.IGNORECASE)
    ['august', 'sep']
    >>> re.search(pattern, target, re.IGNORECASE)
    <_sre.SRE_Match object at 0xb7ced640>


    Regards,
    Pablo
     
    Pablo Ziliani, Sep 26, 2007
    #7
  8. On Wed, 2007-09-26 at 15:13 -0300, Pablo Ziliani wrote:
    > Carsten Haese wrote:
    > > Unfortunately, that also matches margarine, mayonnaise, and octopus,
    > > just to name a few ;-)

    >
    > (and so does the solution you sent before :)


    No, it doesn't.

    >>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    ...          'sep', 'oct', 'nov', 'dec'])
    >>> 'margarine' in s
    False
    >>> 'mayonnaise' in s
    False
    >>> 'octopus' in s
    False

    > This is fine IMO since the OP didn't specify the opposite.


    True, but my crystal ball tells me that the OP wants exact matches.
    (Extrapolating from another post made by the OP earlier today, I'm
    guessing he has a list of column names to include in an SQL "order by"
    clause, and he wants to check them for validity first.)

     
    Carsten Haese, Sep 26, 2007
    #8
  9. Steve Holden

    Steve Holden Guest

    Carsten Haese wrote:
    > On Wed, 2007-09-26 at 15:13 -0300, Pablo Ziliani wrote:
    >> Carsten Haese wrote:
    >>> Unfortunately, that also matches margarine, mayonnaise, and octopus,
    >>> just to name a few ;-)
    >>
    >> (and so does the solution you sent before :)
    >
    > No, it doesn't.
    >
    > >>> s = set(['jan', 'feb', 'mar', 'apr', 'may', 'jun', 'jul', 'aug',
    > ...          'sep', 'oct', 'nov', 'dec'])
    > >>> 'margarine' in s
    > False
    > >>> 'mayonnaise' in s
    > False
    > >>> 'octopus' in s
    > False
    >
    >> This is fine IMO since the OP didn't specify the opposite.

    >
    > True, but my crystal ball tells me that the OP wants exact matches.
    > (Extrapolating from another post made by the OP earlier today, I'm
    > guessing he has a list of column names to include in an SQL "order by"
    > clause, and he wants to check them for validity first.)
    >

    Well, as somebody else already pointed out, the OP's query was
    completely misconceived in the first place, and he would have been
    better recasting it in a more natural way.

    However, I am not going to claim that my psychic powers are clearly
    superior to yours ...

    regards
    Steve
     
    Steve Holden, Sep 26, 2007
    #9
  10. james_027

    james_027 Guest

    Hi all,

    > This is fine IMO since the OP didn't specify the opposite.
    >


    Thanks for all your replies, though I don't quite understand the
    ongoing argument. What do you mean when you say "the OP didn't specify
    the opposite"?

    The reason for using a regex is that I am going to use it in a
    Django URL pattern.
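
    For that use case, a rough sketch of how the alternation might sit inside a larger URL regex. The archive/ prefix, the year part, and the group names year and month are all made up for illustration, and the pattern is exercised with the plain re module here rather than with Django itself.

```python
import re

MONTHS = 'jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec'

# Hypothetical URL layout: archive/<4-digit year>/<month abbreviation>/
# The alternation sits inside a named group, and the surrounding '/'
# characters plus ^ and $ stop longer words such as 'september' from matching.
ARCHIVE_URL = r'^archive/(?P<year>\d{4})/(?P<month>%s)/$' % MONTHS

url_re = re.compile(ARCHIVE_URL)
print(url_re.match('archive/2007/sep/').groupdict())
print(url_re.match('archive/2007/september/'))
```

    In a regex-based Django URL pattern, named groups like these are passed to the view as keyword arguments, so the view would only ever see one of the twelve abbreviations.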

    Thanks
    james
     
    james_027, Sep 26, 2007
    #10
  11. Steve Holden

    Steve Holden Guest

    james_027 wrote:
    > Hi all,
    >
    >> This is fine IMO since the OP didn't specify the opposite.
    >>

    >
    > Thanks for all your replies, though I don't understand quite well the
    > going argument? What do you mean when you say "the OP didn't specify
    > the opposite"?
    >
    > There reason for using regex is because I am going to use it in
    > Django's URL pattern
    >

    Carsten was pointing out that the pattern I gave you would match any
    string that *began* with one of the month names, as I didn't include an
    element to force a match of the end of the string.

    I did this because I assumed you were most interested in finding out how
    to match one of a number of alternate strings, and this would likely
    only be a part of your final pattern.

    If you already have what you need you really don't need to pay much
    attention to the rest: it's just geeks picking nits!

    regards
    Steve
     
    Steve Holden, Sep 27, 2007
    #11