Re: regex walkthrough

Discussion in 'Python' started by MRAB, Dec 8, 2012.

  1. MRAB

    MRAB Guest

    On 2012-12-08 17:48, rh wrote:
    > Looking through some code, I found this and wondered what it does:
    > ^(?P<salsipuedes>[0-9A-Za-z-_.//]+)$
    >
    > Here's my walk through:
    >
    > 1) ^ match at start of string
    > 2) ?P<salsipuedes> if a match is found it will be accessible in a variable
    > salsipuedes
    > 3) [0-9A-Za-z-_.//] this is the one that looks wrong to me, see below
    > 4) + one or more from the preceding char class
    > 5) () the grouping we want returned (see #2)
    > 6) $ end of the string to match against but before any newline
    >
    >
    > more on #3
    > the z-_ part looks wrong and seems that the - should be at the start
    > of the char set otherwise we get another range z-_ or does the a-z
    > preceding the z-_ negate the z-_ from becoming a range? The "."
    > might be ok inside a char set. The two slashes look wrong but maybe
    > it has some special meaning in some case? I think only one slash is
    > needed.
    >
    > I've looked at pydoc re, but it's cursory.
    >
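
    On point 6 of the walk-through, a quick check suggests that "$" does
    match just before a single trailing newline. The sketch below compares
    it with "\Z", which matches only at the very end of the string. The
    names DOLLAR and STRICT and the test string are made up for
    illustration.

```python
import re

# The pattern from the question, anchored with "$".
DOLLAR = re.compile(r"^(?P<salsipuedes>[0-9A-Za-z-_.//]+)$")

# Hypothetical variant anchored with "\Z" (end of string only).
STRICT = re.compile(r"^(?P<salsipuedes>[0-9A-Za-z-_.//]+)\Z")

with_newline = "abc/def\n"

loose = DOLLAR.match(with_newline)   # "$" tolerates one trailing newline
strict = STRICT.match(with_newline)  # "\Z" does not

print(loose.group("salsipuedes"))
print(strict)
```

    If a trailing newline should be rejected, "\Z" is probably the anchor
    to use instead of "$".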

    Python itself will help you:

    >>> import re
    >>> re.compile(r"^(?P<salsipuedes>[0-9A-Za-z-_.//]+)$", flags=re.DEBUG)

    at at_beginning
    subpattern 1
    max_repeat 1 65535
    in
    range (48, 57)
    range (65, 90)
    range (97, 122)
    literal 45
    literal 95
    literal 46
    literal 47
    literal 47
    at at_end

    Inside the character set, "0-9", "A-Z" and "a-z" are ranges; "-", "_",
    "." and "/" are literals. Doubling the "/" is unnecessary (it has no
    special meaning). The "-" is a literal because it immediately follows a
    range, so it can't be defining another range. It would define a range
    only if it immediately followed a literal and wasn't immediately
    followed by an unescaped "]"; that is why r"[a-]" is the same as
    r"[a\-]".
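
    One way to confirm this reading is to ask the pattern which ASCII
    characters it accepts on their own. This is only a sketch: the names
    ORIGINAL, SIMPLIFIED and accepted() are made up for illustration, and
    SIMPLIFIED is just one possible tidier spelling of the same class
    (with "-" moved to the front and the duplicate "/" dropped).

```python
import re
import string

# The pattern from the question, unchanged.
ORIGINAL = re.compile(r"^(?P<salsipuedes>[0-9A-Za-z-_.//]+)$")

# Hypothetical rewrite: "-" first, single "/".
SIMPLIFIED = re.compile(r"^(?P<salsipuedes>[-0-9A-Za-z_./]+)$")


def accepted(pattern):
    """Return the set of single ASCII characters the pattern matches."""
    return {chr(c) for c in range(128) if pattern.match(chr(c))}


# Everything accepted beyond plain letters and digits.
alnum = set(string.ascii_letters + string.digits)
extras = sorted(accepted(ORIGINAL) - alnum)

print(extras)                                      # ['-', '.', '/', '_']
print(accepted(ORIGINAL) == accepted(SIMPLIFIED))  # True
```

    If "z-_" had been parsed as a range, compiling would have failed
    (a range whose end sorts before its start is an error), so the four
    extra characters are consistent with "-" being a literal.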

    As for "(?P<salsipuedes>...)", it won't be accessible in a variable
    "salsipuedes", but will be accessible as a named group in the match
    object:

    >>> m = re.match(r"(?P<foo>[a-z]+)", "xyz")
    >>> m.group("foo")
    'xyz'
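
    The same idea applied to the pattern from the question, as a rough
    sketch. The sample string "some/path-1.0_rc" is invented for the
    example; the group can be fetched by name, by index, or through
    groupdict().

```python
import re

m = re.match(r"^(?P<salsipuedes>[0-9A-Za-z-_.//]+)$", "some/path-1.0_rc")

by_name = m.group("salsipuedes")  # lookup by group name
by_index = m.group(1)             # the named group is also group 1
as_dict = m.groupdict()           # all named groups as a dict

print(by_name)
print(by_index)
print(as_dict)
```

    No variable called salsipuedes is created anywhere; the name exists
    only inside the match object.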
     
    MRAB, Dec 8, 2012