Problem with REs....

Discussion in 'Python' started by Alessandro, Jan 9, 2006.

  1. Alessandro

    Alessandro Guest

    Problem with REs....
    I have this string "3 HOURS, 22 MINUTES, and 28 SECONDS" and I need to
    split it into its three parts "3 HOURS", "22 MINUTES", "28 SECONDS".
    REs seem like the natural tool for this... (needless to say, I am very
    much a beginner with REs and Python)
    My question: why, when I run this code

    >>> import re
    >>> regex = re.compile("[0-9]+ (HOUR|MINUTE|SECOND)")
    >>> print(regex.findall("22 MINUTE, 3 HOUR, AND 28 SECOND"))

    do I get this output:

    ['MINUTE', 'HOUR', 'SECOND']

    and not what I expected:

    ['22 MINUTE', '3 HOUR', '28 SECOND']

    Regards and many thanks...
    Alessandro
     
    Alessandro, Jan 9, 2006
    #1

  2. Alessandro

    Xavier Morel Guest

    Alessandro wrote:
    > Problem with REs....
    > I have this string "3 HOURS, 22 MINUTES, and 28 SECONDS" and I need to
    > split it into its three parts "3 HOURS", "22 MINUTES", "28 SECONDS".
    > REs seem like the natural tool for this... (needless to say, I am very
    > much a beginner with REs and Python)
    > My question: why, when I run this code
    >
    > >>> regex = re.compile("[0-9]+ (HOUR|MINUTE|SECOND)")
    > >>> print(regex.findall("22 MINUTE, 3 HOUR, AND 28 SECOND"))
    >
    > do I get this output:
    >
    > ['MINUTE', 'HOUR', 'SECOND']
    >
    > and not what I expected:
    >
    > ['22 MINUTE', '3 HOUR', '28 SECOND']
    >
    > Regards and many thanks...
    > Alessandro
    >

    This would probably have been slightly easier had you written it in
    English, but basically the issue is the capturing group.

    A capturing group is defined by the parentheses in the regular
    expression; yours is "(HOUR|MINUTE|SECOND)". When a pattern contains
    exactly one capturing group, findall returns only the text matched by
    that group, not the whole match.

    You need to include the number as well, and you can use a
    non-capturing group for the unit (with (?: ) instead of () ) so that it
    does not clutter your captured groups.

    >>> pattern = re.compile(r"([0-9]+ (?:HOUR|MINUTE|SECOND))")
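To make the grouping behaviour concrete, here is a minimal sketch comparing three variants of the pattern on the input from the original post. The variable names are only illustrative; the third variant (two capturing groups) is an addition not discussed above, included to show that findall then returns one tuple per match.

```python
import re

text = "22 MINUTE, 3 HOUR, AND 28 SECOND"

# One capturing group: findall returns only the text that group matched.
capturing = re.findall(r"[0-9]+ (HOUR|MINUTE|SECOND)", text)

# Non-capturing group only: the pattern has no capturing groups,
# so findall returns the whole match.
non_capturing = re.findall(r"[0-9]+ (?:HOUR|MINUTE|SECOND)", text)

# Two capturing groups: findall returns one tuple per match.
two_groups = re.findall(r"([0-9]+) (HOUR|MINUTE|SECOND)", text)

print(capturing)      # ['MINUTE', 'HOUR', 'SECOND']
print(non_capturing)  # ['22 MINUTE', '3 HOUR', '28 SECOND']
print(two_groups)     # [('22', 'MINUTE'), ('3', 'HOUR'), ('28', 'SECOND')]
```

Note that once the inner group is non-capturing, the outer parentheses are optional: with no capturing group at all, findall already returns the full match.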


    Other improvements:
    * \d is a shortcut for "any digit"; for plain ASCII input it is
    equivalent to [0-9], yet slightly clearer.
    * You may use the re.I (or re.IGNORECASE) flag to match both lowercase
    and uppercase units.
    * You can easily handle an optional "s" with s?.

    Improved regex:

    >>> pattern = re.compile(r"(\d+ (?:hour|minute|second)s?)", re.I)
    >>> pattern.findall("3 HOURS 22 MINUTES 28 SECONDS")
    ['3 HOURS', '22 MINUTES', '28 SECONDS']
    >>> pattern.findall("1 HOUR 22 MINUTES 28 SECONDS")
    ['1 HOUR', '22 MINUTES', '28 SECONDS']
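If the pieces are needed as numbers rather than strings, the same ideas (\d, re.I, the optional "s") can be combined with two capturing groups. The following is only a sketch: the UNIT_SECONDS table and the total_seconds helper are hypothetical names introduced for illustration, not part of the re module or of the original question.

```python
import re

# Hypothetical lookup table: seconds per unit, keyed by lowercase unit name.
UNIT_SECONDS = {"hour": 3600, "minute": 60, "second": 1}

# Two capturing groups (number, unit); the optional "s" stays outside the
# unit group so that "HOURS" and "HOUR" both capture the singular form.
pattern = re.compile(r"(\d+) (hour|minute|second)s?", re.I)

def total_seconds(text):
    """Sum every '<number> <unit>' pair found in text, in seconds."""
    return sum(int(number) * UNIT_SECONDS[unit.lower()]
               for number, unit in pattern.findall(text))

print(total_seconds("3 HOURS, 22 MINUTES, and 28 SECONDS"))  # 12148
```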

    If you want to learn more about regular expressions, I suggest you
    read through http://regular-expressions.info/, which is a good source of
    information, and try Kodos, which is quite a good Python regex
    debugger.
     
    Xavier Morel, Jan 9, 2006
    #2

  3. Alessandro

    Alessandro Guest

    Thanks for the reply, it's OK!!!
    As for the language: I selected the wrong newsgroup in my
    newsreader!!!...sorry...

    Thanks...

    Alessandro...
     
    Alessandro, Jan 9, 2006
    #3
