regular expression back references

Discussion in 'Python' started by Matthew, Aug 8, 2003.

  1. Matthew

    Matthew Guest

    Greetings, I am having a problem using back references in my regex and
    I am having a difficult time figuring out what I am doing wrong. My
    regex works fine without the back refs, but when I try to use them it
    won't match my sample. It looks to me like I am using them no
    differently than my examples and documentation do, but to no avail.

    Here is my pattern:

    macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$:

    And this is how I am using it:

    matched = re.match(macExpression, macAddress)

    I am trying to match MAC addresses in the following formats:
    0:a0:c9:ee:b2:c0, 0-a0-c9-ee-b2-c0 & 0.a0.c9.ee.b2.c0 etc.

    I wasn't sure how to do it, but then I read about back references and I
    thought that all was well... Alas, no. If anyone could lend a hand I would
    appreciate it very much.

    -matthew
    Matthew, Aug 8, 2003
    #1

  2. Matthew

    John Machin Guest

    (Matthew) wrote in message news:<>...
    >
    > Here is my pattern:
    >
    > macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$:
    >
    > And this is how I am using it:
    >
    > matched = re.match(macExpression, macAddress)
    >
    > I am trying to match mac addresses in the following formats
    > 0:a0:c9:ee:b2:c0, 0-a0-c9-ee-b2-c0 & 0.a0.c9.ee.b2.c0 etc.
    >


    Four problems: (1) your pattern has 5 occurrences of [0-9A-F] but your
    data has 6; (2) your pattern has uppercase hex digits but your data has
    lowercase; (3) you need to double some backslashes or (preferably) use
    the r"..." notation; (4) your pattern is missing the trailing " -- it
    helps if you cut and paste when posting rather than re-typing stuff.

    And one superfluity: the "^" at the start is redundant, because
    re.match() only ever matches at the start of the string.
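    Point (3) is easy to check at the prompt. The sketch below is written for
    Python 3 (an assumption; the thread itself predates it) and uses a
    deliberately tiny pattern, not the MAC pattern, to show what happens to
    "\1" with and without the r prefix. The names plain, raw, plain_hit and
    raw_hit are made up for this illustration.

```python
import re

# In an ordinary string literal "\1" is an octal escape: the single
# character chr(1). The regex engine never sees a back reference.
plain = "(a)\1"

# A raw string keeps the backslash, so the engine sees "\1" and treats it
# as "whatever group 1 matched".
raw = r"(a)\1"

print(len(plain), repr(plain))  # 4 characters, the last one is chr(1)
print(len(raw), repr(raw))      # 5 characters, backslash and "1" survive

plain_hit = re.match(plain, "aa")  # wants "a" followed by chr(1)
raw_hit = re.match(raw, "aa")      # wants "a" followed by the same "a"

print(plain_hit)
print(raw_hit)
```

    Doubling the backslash ("(a)\\1") builds the same five-character pattern
    as the raw string; the raw form is just easier to read.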

    The following appears to work:

    >>> macExpression = r"[0-9A-F]{1,2}(\:|\.|\-)([0-9A-F]{1,2}\1){4,4}[0-9A-F]{1,2}$"
    >>> for macAddr in ["0:a0:c9:ee:b2:c0", "0-a0-c9-ee-b2-c0",
    ...                 "0.a0.c9.ee.b2.c0", "0:a0-c9:ee:b2:c0"]:
    ...     print re.match(macExpression, macAddr, re.I)
    ...
    <_sre.SRE_Match object at 0x007C8818>
    <_sre.SRE_Match object at 0x007C8818>
    <_sre.SRE_Match object at 0x007C8818>
    None
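    For reuse, the same idea could be wrapped in a small function. This is
    only a sketch under a few assumptions: Python 3, a character class for
    the separator instead of the alternation, a non-capturing group around
    the repeated part, and {4} in place of {4,4}. The names HEX, MAC_RE and
    is_mac are invented here and are not from the thread.

```python
import re

HEX = r"[0-9A-F]{1,2}"

# Group 1 captures the first separator; \1 then insists that every later
# separator is the same character. (?:...) groups without capturing, so
# the separator stays the only numbered group.
MAC_RE = re.compile(
    HEX + r"([:.-])(?:" + HEX + r"\1){4}" + HEX + r"$",
    re.IGNORECASE,
)


def is_mac(text):
    """Return True if text looks like six hex octets with one consistent separator."""
    return MAC_RE.match(text) is not None


for addr in ["0:a0:c9:ee:b2:c0", "0-a0-c9-ee-b2-c0",
             "0.a0.c9.ee.b2.c0", "0:a0-c9:ee:b2:c0"]:
    print(addr, is_mac(addr))
```

    The last address mixes ":" and "-", so the back reference rejects it,
    just as in the session above.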
    John Machin, Aug 9, 2003
    #2

  3. Matthew

    Clay Shirky Guest

    (Matthew) wrote in message news:<>...

    > Here is my pattern:
    >
    > macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$:


    good lord, that looks like perl.

    that sort of thing is miserable to write and miserable to maintain. it
    makes more sense to treat MAC addresses as numbers than strings (and
    saves you the horror of upper/lower case and "is it 0 or 00?" issues
    as well)

    use the re module to figure out what to split on, then convert
    everything to numeric comparisons. here's an example, more readable
    than the macExpression above:

    import re

    orig_list = [ 0, 160, 201, 238, 178, 192 ] # test MAC as numbers

    new_addresses = [ "00:30:65:01:dc:9f", # various formats...
                      "00-03-93-52-0c-c6",
                      "00.A0.C9.EE.B2.C0" ]

    for new_address in new_addresses:

        test_list = []

        # use regexes to see what to split on
        if re.search(":", new_address):
            new_list = new_address.split(":")
        elif re.search("-", new_address):
            new_list = new_address.split("-")
        elif re.search(r"\.", new_address): # escaped: a bare "." matches any character
            new_list = new_address.split(".")

        # convert alphanumeric hex strings to numbers
        # via a long() cast, in base 16
        for two_byte in new_list:
            test_list.append(long(two_byte, 16)) # make a test list

        if test_list == orig_list: # check for numeric matches
            print new_address, "matches..."
        else:
            print new_address, "doesn't match..."
    Clay Shirky, Aug 9, 2003
    #3
  4. Matthew

    Andrew Dalke Guest

    Clay Shirky
    > # use regexes to see what to split on
    > if re.search(":", new_address):


    or use
    if ":" in new_address:

    > elif re.search("-", new_address):
    > new_list = new_address.split("-")
    > elif re.search(".", new_address):
    > new_list = new_address.split(".")


    and include a
        else:
            raise Exception("I have no idea what you're asking for")

    and maybe some ValueError catching in the int call.

    Andrew
    Andrew Dalke, Aug 9, 2003
    #4
  5. Matthew

    John Machin Guest

    On Fri, 8 Aug 2003 20:40:27 -0600, "Andrew Dalke"
    <> wrote:

    >Clay Shirky
    >> # use regexes to see what to split on
    >> if re.search(":", new_address):

    >
    >or use
    > if ":" in new_address:
    >
    >> elif re.search("-", new_address):
    >> new_list = new_address.split("-")
    >> elif re.search(".", new_address):
    >> new_list = new_address.split(".")

    >
    >and include a
    > else:
    > raise Exception("I have no idea what you're asking for")
    >
    >and maybe some ValueError catching in the int call.
    >


    instead maybe something like
        new_list = []
        for sep in '-.:':
            if sep in new_address:
                new_list = new_address.split(sep)
                break
        if len(new_list) != 6:
            raise .......

    plus also a test that each octet is in range(256) ....
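    Put together, the suggestions in this sub-thread (separator loop, length
    check, range check, ValueError handling) might look roughly like the
    following. It is a sketch only, written for Python 3; the function name
    parse_mac, the SEPARATORS constant and the error messages are
    assumptions made for illustration, and the choice of ValueError for
    every failure is mine rather than anything settled above.

```python
SEPARATORS = "-.:"


def parse_mac(address):
    """Return the six octets of address as a list of ints, or raise ValueError."""
    # Pick the first known separator that occurs in the address.
    for sep in SEPARATORS:
        if sep in address:
            parts = address.split(sep)
            break
    else:
        raise ValueError("no known separator in %r" % (address,))

    # Mixed separators leave fewer than six pieces, so they fail here too.
    if len(parts) != 6:
        raise ValueError("expected 6 octets, got %d in %r" % (len(parts), address))

    # int() itself raises ValueError for text that is not valid base 16.
    octets = [int(part, 16) for part in parts]

    for octet in octets:
        if octet not in range(256):
            raise ValueError("octet %r out of range in %r" % (octet, address))
    return octets


print(parse_mac("0:a0:c9:ee:b2:c0"))
```

    Comparing two addresses is then a plain list comparison, which sidesteps
    the upper/lower case and "0 or 00" questions raised earlier.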
    John Machin, Aug 9, 2003
    #5
  6. Matthew

    John Machin Guest

    (John Machin) wrote in message news:<>...
    > (Matthew) wrote in message news:<>...


    > > macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$:


    > Four problems (1) Your pattern has 5 occurrences of [0-9-A-F] but your
    > data has 6


    Make that 3 problems. I can't count -- that's why I use computers!
    John Machin, Aug 9, 2003
    #6
  7. Matthew

    Andrew Dalke Guest

    John Machin
    > instead maybe something like
    > new_list = []
    > for sep in '-.:':


    Yup. Much nicer. Clean, defensive programming is such effort. :)

    Andrew
    Andrew Dalke, Aug 9, 2003
    #7
  8. Matthew

    Peter Abel Guest

    (Matthew) wrote in message news:<>...
    > Greetings, I am having a problem using back references in my regex and
    > I am having a difficult time figuring out what I am doing wrong. My
    > regex works fine without the back refs, but when I try to use them it
    > won't match my sample. It looks to me like I am using them no
    > differently than my examples and documentation do, but to no avail.
    >
    > Here is my pattern:
    >
    > macExpression = "^[0-9A-F]{1,2}(\:|\.|\-)[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}\1[0-9A-F]{1,2}$:
    >
    > And this is how I am using it:
    >
    > matched = re.match(macExpression, macAddress)
    >
    > I am trying to match MAC addresses in the following formats:
    > 0:a0:c9:ee:b2:c0, 0-a0-c9-ee-b2-c0 & 0.a0.c9.ee.b2.c0 etc.
    >
    > I wasn't sure how to do it, but then I read about back references and I
    > thought that all was well... Alas, no. If anyone could lend a hand I would
    > appreciate it very much.
    >
    > -matthew


    To match even the silliest format of MAC address, the
    following works for me:
    >>> silly_adrs=['00-30:65:01.dC:9f', '00:03-93.52.0C-c6', '00-A0:C9.eE:b2.C0']
    >>> def mac_adr(any_adr):
    ...     return '.'.join(map(lambda x:str(int(x,16)),
    ...                         re.split('[\:\-\.]',any_adr)))
    ...
    >>> for adress in silly_adrs:
    ...     print mac_adr(adress)
    ...
    0.48.101.1.220.159
    0.3.147.82.12.198
    0.160.201.238.178.192
    >>>
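    The same split-on-any-separator trick can also produce one canonical
    spelling per address instead of dotted decimals. A minimal sketch,
    assuming Python 3 and a lowercase, colon-separated, zero-padded target
    format; normalize_mac is a made-up name, and like mac_adr above it does
    not check that there are exactly six octets or that each fits in a byte.

```python
import re


def normalize_mac(any_adr):
    """Rewrite any_adr as lowercase, zero-padded hex octets joined by ':'."""
    # A raw string with a character class splits on ':', '.' or '-'.
    parts = re.split(r"[:.-]", any_adr)
    # int(p, 16) accepts either case and one or two digits; %02x pads to two.
    return ":".join("%02x" % int(p, 16) for p in parts)


for adr in ["00-30:65:01.dC:9f", "00:03-93.52.0C-c6", "0.a0.c9.ee.b2.c0"]:
    print(normalize_mac(adr))
```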


    Regards
    Peter
    Peter Abel, Aug 9, 2003
    #8
  9. Matthew

    Matthew Guest

    > >>> macExpression = r"[0-9A-F]{1,2}(\:|\.|\-)([0-9A-F]{1,2}\1){4,4}[0-9A-F]{1,2}$"


    Thanks very much, that will work a lot better. I am going to save the
    other code you (and the others) offered; it may be useful later on in
    the program.

    Thanks
    -matthew
    Matthew, Aug 11, 2003
    #9