How to write this regular expression?

Discussion in 'Python' started by could ildg, May 4, 2005.

  1. could ildg

    could ildg Guest

    I need a regular expression to check if a string matches it.
    The string consists of one to three parts; each part is an underscore
    followed by a number.
    The number in the first part should be 0~31, and the numbers in the other
    parts should be
    larger than 31.

    The requested re should match the following strings:
    _3
    _5_33
    _21
    _29_50
    _29_700_700000

    And the re shouldn't match the following strings:
    _3_5 (the number of part 2 is less than 31)
    _43 (the number of part 1 shouldn't be larger than 31)
    _5_43_69_98 (there shouldn't be more than 3 parts)

    How to write the re, please?
    Thanks in advance.
    could ildg, May 4, 2005
    #1

  2. This newsgroup is in general very helpful, but there are some
    exceptions; one of them is when the problem appears blatantly to be a
    homework. Perhaps if you showed that you worked on it and made some
    progress, but it's not quite right, someone may help you.

    George
    George Sakkis, May 4, 2005
    #2

  3. could ildg

    could ildg Guest

    Does it matter whether it is a homework?
    Why do you look down upon homework?
    Everyone can do his homework well without any problems in your logic?
    It's a problem I met. I tried a lot and I can't work it out,
    so I came here for help.
    I saw someone complained that a question is too lengthy,
    and I saw some questions were complained to be unclear,
    but I never saw someone waste his time to judge if a question is a
    homework. If this is natural, I'll pay attention from now on.

    On 4 May 2005 01:31:48 -0700, George Sakkis <> wrote:
    > This newsgroup is in general very helpful, but there are some
    > exceptions; one of them is when the problem appears blatantly to be a
    > homework. Perhaps if you showed that you worked on it and made some
    > progress, but it's not quite right, someone may help you.
    >
    > George
    >
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >
    could ildg, May 4, 2005
    #3
  4. On Wednesday 04 May 2005 11:34, could ildg wrote:
    > Does it matter whether it is a homework?


    Yes, it does matter. We're not your CS-class homework monkeys... :) We're a
    forum of Python programmers who help each other think about solutions; we
    don't (normally) hand out finished ones. For a beautiful example of this, see
    the thread about finding similarities between two wave files...

    But, anyway, as an additional hint: the stuff you need to do _can_ be solved
    by an RE (the language you're matching is actually regular if you impose
    several restrictions), but I'd rather not do it that way. Programming a small
    function which splits the string and then does the appropriate checks (by
    using int) should be much easier and faster.

    And in case you really need an RE, watch this monster (to match a single term
    having numerical value >= 40)...

    0*(([1-9][0-9]{2,})|([4-9][0-9]))

    Matching numbers >= 31 isn't hard either; I leave this as an exercise to the
    reader... :) But beware, I'd guess this regex performs rather poorly with
    respect to backtracking on erroneous input such as
    "00000000000000000000000030"...

    --- Heiko.
    Heiko Wundram, May 4, 2005
    #4
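The split-and-check approach suggested in the post above could look roughly like the following. This is only a minimal sketch: the function name `is_valid` is made up for illustration, and it assumes the rules from the original question (one to three parts, first number 0 to 31, later numbers larger than 31).

```python
def is_valid(s):
    """Return True for '_N', '_N_M' or '_N_M_K' with N in 0..31 and M, K > 31."""
    if not s.startswith("_"):
        return False
    # Drop the leading underscore, then split on the remaining ones.
    parts = s[1:].split("_")
    if not 1 <= len(parts) <= 3:
        return False
    # Every part must be a non-empty run of ASCII digits before int() is applied.
    if not all(p.isascii() and p.isdigit() for p in parts):
        return False
    numbers = [int(p) for p in parts]
    return numbers[0] <= 31 and all(n > 31 for n in numbers[1:])
```

Because the range checks are ordinary integer comparisons, changing the limits later means editing two numbers rather than reworking a pattern.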
  5. Op 2005-05-04, could ildg schreef <>:
    > Does it matter whether it is a homework?


    Yes, because if others do your homework for you, you won't
    have learned anything from it.

    > Why do you look down upon homework?


    Who says he does. That he is not willing to do your homework
    for you, doesn't imply he looks down on it.

    > Everyone can do his homework well without any problems in your logic?


    There is a difference between asking for help on how to solve a
    problem yourself and asking for the solution.

    --
    Antoon Pardon
    Antoon Pardon, May 4, 2005
    #5
  6. Peter Hansen

    Peter Hansen Guest

    could ildg wrote:
    > I need a regular expression to check if a string matches it.


    Why do you think you need a regular expression?

    If another approach that involved no regular expressions worked much
    better, would you reject it for some reason?

    -Peter
    Peter Hansen, May 4, 2005
    #6
  7. could ildg

    could ildg Guest

    I can tell you that this is not any homework at all,
    I think it by myself.

    I like this mailing list; it has helped me a lot. But some people like you
    seem weird.
    On 4 May 2005 10:25:20 GMT, Antoon Pardon <> wrote:
    > Op 2005-05-04, could ildg schreef <>:
    > > Does it matter whether it is a homework?

    >
    > Yes, because if other do your homework for you, you wont
    > have learned anything from it.
    >
    > > Why do you look down upon homework?

    >
    > Who says he does. That he is not willing to do your homework
    > for you, doesn't imply he looks down on it.
    >
    > > Everyone can do his homework well without any problems in your logic?

    >
    > There is difference in asking for help on how to solve a
    > problem yourself and asking for the solution.

    I read the documentation for Python's re module yesterday, and when I tried to
    use it to solve a problem, I found it was not so easy, so I raised the question
    here. I didn't describe what I had tried because I didn't want the question
    to be too long. But a short question doesn't mean that I am lazy
    or that I didn't think about it. If you think I'm the kind of person
    you hate to help, you needn't.
    >
    > --
    > Antoon Pardon
    could ildg, May 4, 2005
    #7
  8. could ildg

    could ildg Guest

    Thank you.

    I just learned how to use re, so I want to find a way to settle it by
    using re. I know that split it into pieces will do it quickly.

    On 5/4/05, Peter Hansen <> wrote:
    > could ildg wrote:
    > > I need a regular expression to check if a string matches it.

    >
    > Why do you think you need a regular expression?
    >
    > If another approach that involved no regular expressions worked much
    > better, would you reject it for some reason?
    >
    > -Peter
    could ildg, May 4, 2005
    #8
  9. vkeyboard

    vkeyboard Guest

    Personally I'd use groups.
    vkeyboard, May 4, 2005
    #9
  10. James Stroud

    James Stroud Guest

    On Wednesday 04 May 2005 02:34 am, so sayeth could ildg:
    > I saw someone complained that a question is too lengthy,
    > and I saw some questions were complained to be unclear,
    > but I never saw someone waste his time to judge if a question is a
    > homework. If this is natural, I'll pay attention from now on.
    >
    > On 4 May 2005 01:31:48 -0700, George Sakkis <> wrote:
    > > This newsgroup is in general very helpful, but there are some
    > > exceptions; one of them is when the problem appears blatantly to be a
    > > homework. Perhaps if you showed that you worked on it and made some
    > > progress, but it's not quite right, someone may help you.


    I think by participating in this list, most of the members have felt that they
    have agreed to the following unofficial terms and conditions of use:

    http://www.catb.org/~esr/faqs/smart-questions.html

    The interesting thing is that those who follow the letter most strictly are
    usually the best ones to ask. Moreover, most members of this list are usually
    looking for any excuse to compose a regular expression. In fact, they
    probably come up with an answer before they make any assessments about
    homework.

    > I can tell you that this is not any homework at all,
    > I think it by myself.


    In that case, your question is free game:


    >>> import re
    >>> r = re.compile(r"_[0-3]\d?(_\d\d?){0,2}")
    >>> r.search('_29_700_700000')
    <_sre.SRE_Match object at 0x402ccba0>
    >>> r.search('_29_50')
    <_sre.SRE_Match object at 0x402f8820>
    >>> r.search('_5_33')
    <_sre.SRE_Match object at 0x402ccba0>
    >>> r.search('_500')
    >>>


    James

    --
    James Stroud
    UCLA-DOE Institute for Genomics and Proteomics
    Box 951570
    Los Angeles, CA 90095

    http://www.jamesstroud.com/
    James Stroud, May 4, 2005
    #10
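One caveat about the pattern in the post above: used with `search`, it appears to accept `_3_5` (which the original question wants rejected) and to reject a bare `_5`, because the first number is limited to a leading digit of 0 to 3 and nothing anchors the end of the string. A stricter pure-regex version might look like the sketch below. The names `FIRST`, `OTHER`, `PATTERN` and `matches` are illustrative, and the sketch assumes numbers are written without leading zeros.

```python
import re

# First part: 0..31. Later parts: any integer above 31.
FIRST = r"(?:[12]?\d|3[01])"                  # 0-29, or 30-31
OTHER = r"(?:3[2-9]|[4-9]\d|[1-9]\d{2,})"     # 32-39, 40-99, or 100 and up
PATTERN = re.compile(r"_%s(?:_%s){0,2}" % (FIRST, OTHER))


def matches(s):
    # fullmatch anchors both ends, so extra trailing parts are not ignored.
    return PATTERN.fullmatch(s) is not None
```

The alternation has to be rewritten by hand whenever a limit changes, which is the main argument made elsewhere in the thread for checking the values with `int` instead.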
  11. On Wed, 04 May 2005 20:24:51 +0800, could ildg wrote:

    > Thank you.
    >
    > I just learned how to use re, so I want to find a way to settle it by
    > using re. I know that split it into pieces will do it quickly.


    I'll say this; you have two problems, splitting out the numbers and
    verifying their conformance to some validity rule.

    I strongly recommend treating those two problems separately. While I'm not
    willing to guarantee that an RE can't be written for something like ("[A
    number A]_[A number B]" such that A < B) in the general case, it won't be
    anywhere near as clean or as easy to follow as just writing an RE to
    extract the numbers and then verifying the constraints in conventional Python.

    In that case, if you know in advance that the numbers are guaranteed to be
    in that format, I'd just use the regular expression "\d+", and the
    "findall" method of the compiled expression:

    Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
    [GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> m = re.compile(r"\d+")
    >>> m.findall("344mmm555m1111")
    ['344', '555', '1111']
    >>>


    If you're checking general matching of the parameters you've given, I'd
    feel no shame in checking the string against r"^(_\d+){1,3}$" with .match
    and then using the above to get the numbers, if you prefer that. (Note
    that I believe .match implies the initial ^, but I tend to write it
    anyways as a good habit. Explicit better than implicit and all that.)

    (I just tried to capture the three numbers by adding a parentheses set
    around the \d+ but it only gives me the first. I've never tried that
    before; is there a way to get it to give me all of them? I don't think so,
    so two REs may be required after all.)
    Jeremy Bowers, May 4, 2005
    #11
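The two-step approach described in the post above (check the overall shape with one RE, pull out the numbers with another, then verify the constraints in plain Python) might be sketched as follows. The names `SHAPE`, `NUMBER` and `parse_parts` are hypothetical, and the limits are taken from the original question.

```python
import re

SHAPE = re.compile(r"(_\d+){1,3}", re.ASCII)   # one to three "_<digits>" parts
NUMBER = re.compile(r"\d+", re.ASCII)


def parse_parts(s):
    """Return the numbers as a list of ints, or None if the string is invalid."""
    # Step 1: check the overall shape; fullmatch anchors both ends.
    if SHAPE.fullmatch(s) is None:
        return None
    # Step 2: extract the numbers and verify the ranges in ordinary Python.
    numbers = [int(n) for n in NUMBER.findall(s)]
    if numbers[0] > 31 or any(n <= 31 for n in numbers[1:]):
        return None
    return numbers
```

Returning the parsed list rather than a bare boolean lets the caller reuse the numbers without matching a second time.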
  12. could ildg

    could ildg Guest

    On 5/5/05, Jeremy Bowers <> wrote:
    > On Wed, 04 May 2005 20:24:51 +0800, could ildg wrote:
    >
    > > Thank you.
    > >
    > > I just learned how to use re, so I want to find a way to settle it by
    > > using re. I know that split it into pieces will do it quickly.

    >
    > I'll say this; you have two problems, splitting out the numbers and
    > verifying their conformance to some validity rule.
    >
    > I strongly recommend treating those two problems separately. While I'm not
    > willing to guarantee that an RE can't be written for something like ("[A
    > number A]_[A number B]" such that A < B) in the general case, it won't be
    > anywhere near as clean or as easy to follow if you just write an RE to
    > extract the numbers, then verify the constraints in conventional Python.
    >
    > In that case, if you know in advance that the numbers are guaranteed to be
    > in that format, I'd just use the regular expression "\d+", and the
    > "findall" method of the compile expression:
    >
    > Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
    > [GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
    > Type "help", "copyright", "credits" or "license" for more information.
    > >>> import re
    > >>> m = re.compile("\d+")
    > >>> m.findall("344mmm555m1111")

    > ['344', '555', '1111']
    > >>>

    >
    > If you're checking general matching of the parameters you've given, I'd
    > feel no shame in checking the string against r"^(_\d+){1,3}$" with .match
    > and then using the above to get the numbers, if you prefer that. (Note
    > that I believe .match implies the initial ^, but I tend to write it
    > anyways as a good habit. Explicit better than implicit and all that.)
    >
    > (I just tried to capture the three numbers by adding a parentheses set
    > around the \d+ but it only gives me the first. I've never tried that
    > before; is there a way to get it to give me all of them? I don't think so,
    > so two REs may be required after all.)

    You can capture each number by using groups; each group can have a name.

    could ildg, May 5, 2005
    #12
  13. On Thu, 05 May 2005 09:30:21 +0800, could ildg wrote:
    > Jeremy Bowers wrote:
    >> Python 2.3.5 (#1, Mar 3 2005, 17:32:12) [GCC 3.4.3 (Gentoo Linux
    >> 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2 Type "help", "copyright",
    >> "credits" or "license" for more information.
    >> >>> import re
    >> >>> m = re.compile("\d+")
    >> >>> m.findall("344mmm555m1111")

    >> ['344', '555', '1111']
    >>
    >> (I just tried to capture the three numbers by adding a parentheses set
    >> around the \d+ but it only gives me the first. I've never tried that
    >> before; is there a way to get it to give me all of them? I don't think
    >> so, so two REs may be required after all.)


    > You can capture each number by using group, each group can have a name.


    I think you missed out on what I meant:

    Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
    [GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> import re
    >>> m = re.compile(r"((?P<name>\d+)_){1,3}")
    >>> match = m.match("12_34_56_")
    >>> match.groups("name")
    ('56_', '56')
    >>>


    Can you also get 12 & 34 out of it? (Interesting, as the non-named groups
    give you the *first* match....)

    I guess I've never wanted this because I usually end up using "findall"
    instead, but I could still see this being useful... parsing a function
    call, for instance, and getting a tuple of the arguments instead of all of
    them at once to be broken up later could be useful.
    Jeremy Bowers, May 5, 2005
    #13
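On the question raised in the post above: in Python's `re` module, a group that sits inside a repeated part of the pattern keeps only its last repetition, so `12` and `34` cannot be read back from that single match object. Note also that `groups("name")` takes a default value for unmatched groups rather than a group name. A small sketch of the behaviour, using `findall` to collect every repetition (the variable names are illustrative):

```python
import re

text = "12_34_56_"

# A capturing group inside a repetition only remembers its final iteration.
m = re.match(r"(?:(\d+)_){1,3}", text)
last = m.group(1)                          # '56'

# To collect every number, match one repetition at a time instead.
all_numbers = re.findall(r"(\d+)_", text)  # ['12', '34', '56']
```

`re.finditer` works the same way when the match positions are needed as well as the text.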
  14. could ildg

    could ildg Guest

    Sorry, Jeremy, I sent my email directly to your mailbox just now.

    Groups are very useful.
    On 5/5/05, Jeremy Bowers <> wrote:
    > On Thu, 05 May 2005 09:30:21 +0800, could ildg wrote:
    > > Jeremy Bowers wrote:
    > >> Python 2.3.5 (#1, Mar 3 2005, 17:32:12) [GCC 3.4.3 (Gentoo Linux
    > >> 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2 Type "help", "copyright",
    > >> "credits" or "license" for more information.
    > >> >>> import re
    > >> >>> m = re.compile("\d+")
    > >> >>> m.findall("344mmm555m1111")
    > >> ['344', '555', '1111']
    > >>
    > >> (I just tried to capture the three numbers by adding a parentheses set
    > >> around the \d+ but it only gives me the first. I've never tried that
    > >> before; is there a way to get it to give me all of them? I don't think
    > >> so, so two REs may be required after all.)

    >
    > > You can capture each number by using group, each group can have a name.

    >
    > I think you missed out on what I meant:
    >
    > Python 2.3.5 (#1, Mar 3 2005, 17:32:12)
    > [GCC 3.4.3 (Gentoo Linux 3.4.3, ssp-3.4.3-0, pie-8.7.6.6)] on linux2
    > Type "help", "copyright", "credits" or "license" for more information.
    > >>> import re
    > >>> m = re.compile(r"((?P<name>\d+)_){1,3}")
    > >>> match = m.match("12_34_56_")
    > >>> match.groups("name")

    > ('56_', '56')
    > >>>

    >
    > Can you also get 12 & 34 out of it? (Interesting, as the non-named groups


    Yes, you can extract **anything** you want. Getting each number is easy:
    the only thing you need to do is give each number its own named group.

    import re
    s = "_2_544_44000000"  # renamed from str to avoid shadowing the builtin
    # number1 is limited to 0..31; number2 and number3 must be larger than 31.
    r = re.compile(r'^(?P<slice1>_(?P<number1>[12]?\d|3[01]))'
                   r'(?P<slice2>_(?P<number2>3[2-9]|[4-9]\d|\d{3,}))?'
                   r'(?P<slice3>_(?P<number3>3[2-9]|[4-9]\d|\d{3,}))?$', re.VERBOSE)
    mo = r.match(s)
    if mo:
        print(mo.groupdict())
    else:
        print("doesn't match")

    The code above will produce the following result:
    {'slice1': '_2', 'slice2': '_544', 'slice3': '_44000000', 'number2':
    '544', 'number3': '44000000', 'number1': '2'}

    > give you the *first* match....)
    >
    > I guess I've never wanted this because I usually end up using "findall"
    > instead, but I could still see this being useful... parsing a function
    > call, for instance, and getting a tuple of the arguments instead of all of
    > them at once to be broken up later could be useful.
    could ildg, May 5, 2005
    #14
  15. D H

    D H Guest

    Peter Hansen wrote:
    > could ildg wrote:
    >
    >> I need a regular expression to check if a string matches it.

    >
    >
    > Why do you think you need a regular expression?
    >
    > If another approach that involved no regular expressions worked much
    > better, would you reject it for some reason?
    >
    > -Peter


    A regular expression will work fine for his problem.
    Just match the digits separated by underscores using a regular
    expression, then afterward check if the values are valid.
    D H, May 5, 2005
    #15
  16. "D H" <> wrote:

    > > Why do you think you need a regular expression?
    > >
    > > If another approach that involved no regular expressions worked much
    > > better, would you reject it for some reason?

    >
    > A regular expression will work fine for his problem.
    > Just match the digits separated by underscores using a regular
    > expression, then afterward check if the values are valid.


    you forgot to mention Boo here, Doug. nice IronPython announcement,
    btw. the Boo developers must be so proud of you.

    </F>
    Fredrik Lundh, May 5, 2005
    #16
  17. D H

    D H Guest

    Fredrik Lundh

    Fredrik Lundh wrote:
    > "D H" <> wrote:
    >
    >
    >>>Why do you think you need a regular expression?
    >>>
    >>>If another approach that involved no regular expressions worked much
    >>>better, would you reject it for some reason?

    >>
    >>A regular expression will work fine for his problem.
    >>Just match the digits separated by underscores using a regular
    >>expression, then afterward check if the values are valid.

    >
    >
    > you forgot to mention Boo here, Doug. nice IronPython announcement,
    > btw. the Boo developers must be so proud of you.
    >
    > </F>


    You never learn, do you Fredrik. I guess that explains why Boo will
    never be mentioned on the python daily site your pythonware business
    controls.

    Here are some of Fredrik's funnier crazy rants right here:
    http://www.oreillynet.com/pub/wlg/6291

    Any that you perceive as competition and threatening to your consulting
    business really draws out your true nature.
    D H, May 9, 2005
    #17
  18. Robert Kern

    Robert Kern Guest

    Re: Fredrik Lundh

    D H wrote:
    > Fredrik Lundh wrote:


    >>you forgot to mention Boo here, Doug. nice IronPython announcement,
    >>btw. the Boo developers must be so proud of you.
    >>
    >></F>

    >
    > You never learn, do you Fredrik. I guess that explains why Boo will
    > never be mentioned on the python daily site your pythonware business
    > controls.


    It's called Daily Python-URL not Daily Python-Like-Languages-URL. *That*
    explains it. It's not like Pythonware is hiding its relationship.

    > Here are some of Fredrik's funnier crazy rants right here:
    > http://www.oreillynet.com/pub/wlg/6291


    Funny you should mention that article since I showed that Fredrik's
    benchmarks were correctly done (if not diligently-reported) while Uche's
    were wrong on both marks.

    http://www.oreillynet.com/cs/user/view/cs_msg/51158

    > Any that you perceive as competition and threatening to your consulting
    > business really draws out your true nature.


    Oy, my head hurts. Take it off-list, both of you. The rest of us don't
    care about your bickering.

    --
    Robert Kern


    "In the fields of hell where the grass grows high
    Are the graves of dreams allowed to die."
    -- Richard Harter
    Robert Kern, May 9, 2005
    #18
  19. D H

    D H Guest

    Re: Fredrik Lundh

    Robert Kern wrote:
    > It's called Daily Python-URL not Daily Python-Like-Languages-URL. *That*
    > explains it.


    Google for logix site:pythonware.com. He's announced plenty of non-Python
    stuff that is of interest to Python users, including plenty of marketing
    for his own software.

    > It's not like Pythonware is hiding its relationship.


    It hides any mention that Fredrik Lundh is behind it, which is deceitful
    when he posts any smidgeon of praise his software gets, not admitting he
    makes his income off support fees for that same software.

    He can try to smear me all he wants if he really thinks that will help
    his business.


    > Funny you should mention that article since I showed that Fredrik's
    > benchmarks were correctly done (if not diligently-reported) while Uche's
    > were wrong on both marks.
    >
    > http://www.oreillynet.com/cs/user/view/cs_msg/51158


    Funny how you link to your own post out of context. You must have not
    listened to any of the other comments.

    > Oy, my head hurts. Take it off-list, both of you. The rest of us don't
    > care about your bickering.


    Yet again someone bitches about a thread right after they hypocritically
    throw their own little darts into the mix.
    D H, May 9, 2005
    #19
  20. Re: Fredrik Lundh

    D H wrote:

    > Yet again someone bitches about a thread right after they hypocritically
    > throw their own little darts into the mix.


    No one cares. Please take it elsewhere.


    --
    Erik Max Francis && && http://www.alcyone.com/max/
    San Jose, CA, USA && 37 20 N 121 53 W && AIM erikmaxfrancis
    There's this perfect girl / Living inside the shell
    -- Lamya
    Erik Max Francis, May 9, 2005
    #20