Regex not matching a string

Discussion in 'Python' started by python.prog29@gmail.com, Jan 9, 2013.

  1. Guest

    Hi All -


    In the following code, I am trying to remove a multi-line comment that contains "This is a test comment". For some reason the regex is not matching. Can anyone explain why?

    import os
    import sys
    import re
    import fnmatch

    def find_and_remove(haystack, needle):
        pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
        return re.sub(pattern, "", haystack)

    for path,dirs,files in os.walk(sys.argv[1]):
        for fname in files:
            for pat in ['*.cpp','*.c','*.h','*.txt']:
                if fnmatch.fnmatch(fname,pat):
                    fullname = os.path.join(path,fname)
                    # put all the text into f and read and replace...
                    f = open(fullname).read()
                    result = find_and_remove(f, r"This is a test comment")
                    print result
    , Jan 9, 2013
    #1

  2. On Wed, 09 Jan 2013 02:08:23 -0800, python.prog29 wrote:

    > Hi All -
    >
    >
    > In the following code, I am trying to remove a multi-line comment that
    > contains "This is a test comment". For some reason the regex is not
    > matching. Can anyone explain why?


    It works for me.

    Some observations:

    Perhaps you should consider using the glob module rather than manually
    using fnmatch. That's what glob does.

    Also, you never actually write to the files, is that deliberate?

    Finally, perhaps your regex simply doesn't match what you think it
    matches. Do you actually have any files containing the needle

    "/* ... This is a test comment ... */"

    (where the ... are any characters) exactly as shown?

    Instead of giving us all the irrelevant code that has nothing to do with
    matching a regex, you should come up with a simple piece of example code
    that demonstrates your problem. Or, in this case, *fails* to demonstrate
    the problem.

    import re
    haystack = "aaa\naaa /*xxxThis is a test comment \nxxx*/aaa\naaa\n"
    needle = "This is a test comment"
    pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
    print haystack
    print re.sub(pattern, "", haystack)


    --
    Steven
    Steven D'Aprano, Jan 9, 2013
    #2
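Picking up the point above that the original loop never writes anything back: a minimal sketch of a walk that does save its result could look like this. It assumes Python 3. The names `strip_comments_in_tree` and `remove_comment` are hypothetical and do not come from the original code; the removal logic is passed in as a callable so the sketch does not depend on which regex is eventually used.

```python
import fnmatch
import os

PATTERNS = ('*.cpp', '*.c', '*.h', '*.txt')  # same globs as the original post


def strip_comments_in_tree(root, remove_comment):
    """Rewrite every matching file under root; return the paths that changed.

    remove_comment is a hypothetical hook: any callable that takes the file
    text and returns the cleaned text.
    """
    changed = []
    for path, dirs, files in os.walk(root):
        for fname in files:
            # any() stops at the first matching glob, so each file is handled once
            if not any(fnmatch.fnmatch(fname, pat) for pat in PATTERNS):
                continue
            fullname = os.path.join(path, fname)
            with open(fullname) as f:
                text = f.read()
            result = remove_comment(text)
            if result != text:
                # Only rewrite files whose content actually changed
                with open(fullname, 'w') as f:
                    f.write(result)
                changed.append(fullname)
    return changed
```

Since this overwrites files in place, it is probably worth trying it on a copy of the source tree first.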

  3. Peter Otten Guest

    wrote:

    > In the following code, I am trying to remove a multi-line comment that
    > contains "This is a test comment". For some reason the regex is not
    > matching. Can anyone explain why?


    > def find_and_remove(haystack, needle):
    >     pattern = re.compile(r'/\*.*?'+ needle + '.*?\*/', re.DOTALL)
    >     return re.sub(pattern, "", haystack)


    If a comment does not contain the needle, the "/\*.*?" part of the
    pattern runs past the end of that comment and keeps going until it
    finds the needle in a later one:

    >>> re.compile(r"/\*.*?xxx").search("/* xxx */").group()
    '/* xxx'
    >>> re.compile(r"/\*.*?xxx").search("/* yyy */ /* xxx */").group()
    '/* yyy */ /* xxx'


    One solution may be a substitution function:

    >>> def sub(match, needle="xxx"):
    ...     s = match.group()
    ...     if needle in s:
    ...         return ""
    ...     else:
    ...         return s
    ...
    >>> re.compile(r"/\*.*?\*/").sub(sub, "/* yyy */ /* xxx */")
    '/* yyy */ '
    Peter Otten, Jan 9, 2013
    #3
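Combining the explanation and the substitution function from the post above, a complete `find_and_remove` might look like the sketch below. Two variants are shown: one uses a callback, as in that post, and one uses a single pattern with a negative lookahead so that `.` can never step over a `*/`. The lookahead variant and the names `find_and_remove_callback` and `find_and_remove_lookahead` are assumptions made for illustration and were not proposed in the thread. `re.escape` is added so that a needle containing regex metacharacters is treated literally.

```python
import re

# Matches exactly one C-style comment; the lazy .*? stops at the first "*/".
COMMENT = re.compile(r'/\*.*?\*/', re.DOTALL)


def find_and_remove_callback(haystack, needle):
    """Remove every /* ... */ comment whose text contains needle."""
    def replace(match):
        comment = match.group()
        # Drop the comment only if it contains the needle; otherwise keep it
        return '' if needle in comment else comment
    return COMMENT.sub(replace, haystack)


def find_and_remove_lookahead(haystack, needle):
    """Single-pattern variant of the same idea."""
    # Consume one character at a time, but never one that starts "*/",
    # so a match cannot run from one comment into the next.
    inside = r'(?:(?!\*/).)*?'
    pattern = re.compile(
        r'/\*' + inside + re.escape(needle) + inside + r'\*/', re.DOTALL)
    return pattern.sub('', haystack)


if __name__ == '__main__':
    text = "/* yyy */ /* xxx */"
    print(repr(find_and_remove_callback(text, "xxx")))
    print(repr(find_and_remove_lookahead(text, "xxx")))
```

On the example string from the post above, both variants leave the first comment alone and return `'/* yyy */ '`. Neither handles comment markers that appear inside string literals; that would need a real tokenizer rather than a regex.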
