regular expressions use

Discussion in 'Python' started by max(01)*, Aug 22, 2005.

  1. max(01)*

    max(01)* Guest

    hi everyone.

    i would like to do some uri-decoding, which means to translate patterns
    like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
    "%2b" is found, it should be translated into one character whose hex
    ascii code is 2b.

    i did this:

    ....
    import re
    import sys

    modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)

    def funzione(corrispondenza):
        return chr(eval('0x' + corrispondenza.group(1)))

    for riga in sys.stdin:
        riga = modello.sub(funzione, riga)
        sys.stdout.write(riga)
    ....

    please comment on it. can it be made simpler or more compact? i am a
    python regexp novice.

    bye

    max

    ps: i was trying to pythonate this kind of perl code:

    $riga =~ s/%([A-Fa-f0-9][A-Fa-f0-9])/chr(hex($1))/ge;
     
    max(01)*, Aug 22, 2005
    #1

  2. Peter Otten

    Peter Otten Guest

    max(01)* wrote:

    > i would like to do some uri-decoding, which means to translate patterns
    > like "%2b/dhg-%3b %7E" into "+/dhg-; ~": in practice, if a sequence like
    > "%2b" is found, it should be translated into one character whose hex
    > ascii code is 2b.
    >
    > i did this:
    >
    > ...
    > import re
    > import sys
    >
    > modello = re.compile("%([0-9a-f][0-9a-f])", re.IGNORECASE)
    >
    > def funzione(corrispondenza):
    >     return chr(eval('0x' + corrispondenza.group(1)))


    You can specify the base for the str-to-int conversion, e.g.:

        return chr(int(corrispondenza.group(1), 16))

    And then there is also urllib.unquote() in the standard library.

    Peter
     
    Peter Otten, Aug 22, 2005
    #2
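
The suggestion above (int with an explicit base instead of eval) can be folded back into the original script. Here is a minimal sketch, assuming Python 3. The names PERCENT_ESCAPE and decode_escapes are made up for illustration and do not come from the thread:

```python
import re

# Same pattern as in the original post: "%" followed by two hex digits.
PERCENT_ESCAPE = re.compile(r"%([0-9a-f]{2})", re.IGNORECASE)


def decode_escapes(text):
    """Replace each %XX escape with the character whose code is 0xXX."""
    # int(..., 16) parses the two hex digits without resorting to eval.
    return PERCENT_ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)


print(decode_escapes("%2b/dhg-%3b %7E"))  # +/dhg-; ~
```

Like the original, this decodes one escape at a time, so a multi-byte UTF-8 sequence such as %C3%A9 comes out as two separate characters. The library function mentioned above is probably the safer choice for real URIs.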

  3. Paul McGuire

    Paul McGuire Guest

    Here is a decoder using pyparsing, though it is perhaps a bit more
    verbose than your Perl regexp.

    -- Paul

    # download pyparsing at http://pyparsing.sourceforge.net
    from pyparsing import Word,Combine

    # define grammar for matching encoded characters
    hexnums = "0123456789ABCDEFabcdef"
    encodedChar = Combine( "%" + Word(hexnums,exact=2) )

    # define and attach conversion action
    def unencode(s,l,toks):
        # toks[0] is the matched text, e.g. "%2b"; skip the "%" and parse the hex digits
        return chr(int(toks[0][1:],16))
    encodedChar.setParseAction( unencode )

    # transform test string
    data = "%2b/dhg-%3b %7E"
    print(encodedChar.transformString( data ))
    # prints "+/dhg-; ~"
     
    Paul McGuire, Aug 22, 2005
    #3
  4. Fredrik Lundh

    "max(01)*" wrote:

    > i would like to do some uri-decoding, which means to translate patterns
    > like "%2b/dhg-%3b %7E" into "+/dhg-; ~"


    >>> import urllib
    >>> urllib.unquote("%2b/dhg-%3b %7E")
    '+/dhg-; ~'

    </F>
     
    Fredrik Lundh, Aug 22, 2005
    #4
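
The session above is Python 2. In Python 3 the function lives in urllib.parse. A minimal sketch of the same call, assuming Python 3:

```python
from urllib.parse import unquote

# Decode the percent-escapes from the original question.
decoded = unquote("%2b/dhg-%3b %7E")
print(decoded)  # +/dhg-; ~
```

By default unquote treats the decoded bytes as UTF-8, so multi-byte sequences are combined into a single character rather than decoded byte by byte.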
