Regex not matching a string

Discussion in 'Python' started by python.prog29, Jan 9, 2013.

  1. Hi All -


    In the following code, I am trying to remove a multi-line comment that contains "This is a test comment". For some reason the regex is not matching. Can anyone explain why?

    import os
    import sys
    import re
    import fnmatch

    def find_and_remove(haystack, needle):
        pattern = re.compile(r'/\*.*?' + needle + r'.*?\*/', re.DOTALL)
        return re.sub(pattern, "", haystack)

    for path, dirs, files in os.walk(sys.argv[1]):
        for fname in files:
            for pat in ['*.cpp', '*.c', '*.h', '*.txt']:
                if fnmatch.fnmatch(fname, pat):
                    fullname = os.path.join(path, fname)
                    # read the whole file into f, then remove matching comments
                    f = open(fullname).read()
                    result = find_and_remove(f, r"This is a test comment")
                    print(result)
     
    python.prog29, Jan 9, 2013
    #1

  2. It works for me.

    Some observations:

    Perhaps you should consider using the glob module rather than manually
    using fnmatch. That's what glob does.

    Also, you never actually write to the files, is that deliberate?
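
    To illustrate both observations, here is a minimal sketch, not the original poster's code. It assumes Python 3.5 or later, where glob.glob() accepts recursive=True, and the helper name rewrite_files and its transform parameter are made up for this example. It finds the files with glob and writes the result back only when the text actually changed.

```python
import glob
import os


def rewrite_files(root, patterns, transform):
    """Apply transform(text) to every matching file under root.

    Hypothetical helper: files are rewritten in place, and only
    when the transformed text differs from the original.
    Returns the sorted list of paths that were changed.
    """
    changed = []
    for pat in patterns:
        # '**' with recursive=True also descends into subdirectories.
        for fullname in glob.glob(os.path.join(root, '**', pat), recursive=True):
            with open(fullname) as f:
                text = f.read()
            result = transform(text)
            if result != text:
                with open(fullname, 'w') as f:
                    f.write(result)
                changed.append(fullname)
    return sorted(changed)
```

    A call such as rewrite_files(sys.argv[1], ['*.cpp', '*.c', '*.h', '*.txt'], some_function) would then replace the os.walk loop. Since it overwrites files, try it on a copy first.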

    Finally, perhaps your regex simply doesn't match what you think it
    matches. Do you actually have any files containing the needle

    "/* ... This is a test comment ... */"

    (where the ... are any characters) exactly as shown?

    Instead of giving us all the irrelevant code that has nothing to do with
    matching a regex, you should come up with a simple piece of example code
    that demonstrates your problem. Or, in this case, *fails* to demonstrate
    the problem.

    import re
    haystack = "aaa\naaa /*xxxThis is a test comment \nxxx*/aaa\naaa\n"
    needle = "This is a test comment"
    pattern = re.compile(r'/\*.*?' + needle + r'.*?\*/', re.DOTALL)
    print(haystack)
    print(re.sub(pattern, "", haystack))
     
    Steven D'Aprano, Jan 9, 2013
    #2

  3. If a comment does not contain the needle, "/\*.*?" extends past the end of
    that comment and keeps going until it finds the needle in a later one:
    '/* yyy */ /* xxx'
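
    The overrun is easy to reproduce with the pattern from the question. This is only an illustrative sketch: the input string is made up, and "xxx" stands in for the needle.

```python
import re

# Same pattern shape as in the question, with "xxx" standing in for the needle.
pattern = re.compile(r'/\*.*?' + 'xxx' + r'.*?\*/', re.DOTALL)

# Made-up input: one comment without the needle, some code, one comment with it.
text = "/* yyy */ keep_me(); /* xxx */ tail"

match = pattern.search(text)
# The match starts at the first "/*", not at the comment that holds the needle.
overrun = match.group()
# Substituting therefore also deletes the first comment and the code between.
remaining = pattern.sub("", text)
```

    Here overrun spans both comments and everything between them, so the substitution removes more than intended.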


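    One solution may be a substitution function, passed to re.sub() in place
    of the replacement string, that checks each matched comment for the needle.
    Its body is:

        s = match.group()
        if needle in s:
            return ""
        else:
            return s

    Only the comment containing the needle is then removed:

    '/* yyy */ '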
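
    Putting the pieces together, a self-contained version might look like the sketch below. The names COMMENT, remove_comments_containing and replace are made up for this example, and it assumes the pattern should match one whole comment at a time so that the function can decide per comment.

```python
import re

# Matches one whole comment at a time; the lazy .*? stops at the first "*/".
COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)


def remove_comments_containing(haystack, needle):
    """Delete only those /* ... */ comments whose text contains needle."""
    def replace(match):
        s = match.group()
        # Returning "" deletes the comment; returning s keeps it unchanged.
        return "" if needle in s else s
    return COMMENT.sub(replace, haystack)


cleaned = remove_comments_containing("/* yyy */ /* xxx */", "xxx")
```

    Because the needle is tested with a plain "in" check rather than spliced into the regex, it needs no escaping. The sketch does not try to handle "/*" appearing inside string literals.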
     
    Peter Otten, Jan 9, 2013
    #3