Q: How can I make re.sub() replace patterns across newlines

Discussion in 'Python' started by Viktor Rosenfeld, Feb 2, 2004.

  1. Hi,

    I want to strip a Java file of /* */-style comments. Unfortunately, the
    simple regexp "\/\*.*\*\/" only works on comments that are on one line.
    Is there a simple way to remove comments that go across several lines with
    Python regexps? I tried re.M to no avail.

    Thanks,
    Viktor
     
    Viktor Rosenfeld, Feb 2, 2004
    #1

  2. Viktor Rosenfeld wrote:

    > I want to strip a Java file of /* */-style comments. Unfortunately, the
    > simple regexp "\/\*.*\*\/" only works on comments that are on one line.
    > Is there a simple way to remove comments that go across several lines with
    > Python regexps? I tried re.M to no avail.


    You must use re.S

    ,----[ Python lib reference ]
    | `S'
    |
    | `DOTALL'
    | Make the `.' special character match any character at all,
    | including a newline; without this flag, `.' will match anything
    | _except_ a newline.
    `----


    KP

    --
    Men of science! Many are attributed to her,
    but most of them wrongly.
    Karl Kraus 'Aphorismen'
     
    Karl Pflästerer, Feb 2, 2004
    #2
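
To illustrate the difference between re.M and re.S described in the reply above, here is a minimal sketch. The sample string and variable names are assumptions made up for this illustration; only the flags and the pattern come from the thread.

```python
import re

# Hypothetical sample input: one comment spanning two lines.
source = "int a; /* first line\nsecond line */ int b;"

pattern = r"/\*.*\*/"

# re.M only changes where ^ and $ match; '.' still stops at '\n'.
with_multiline = re.sub(pattern, "", source, flags=re.M)

# re.S (re.DOTALL) lets '.' match '\n', so the comment is matched whole.
with_dotall = re.sub(pattern, "", source, flags=re.S)

print(with_multiline == source)  # True: nothing was removed
print(repr(with_dotall))         # 'int a;  int b;'
```

With re.M the pattern never matches, because the opening /* and the closing */ are on different lines and "." cannot cross the newline between them.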

  3. Viktor Rosenfeld wrote:

    > Hi,
    >
    > I want to strip a Java file of /* */-style comments. Unfortunately, the
    > simple regexp "\/\*.*\*\/" only works on comments that are on one line.
    > Is there a simple way to remove comments that go across several lines with
    > Python regexps? I tried re.M to no avail.
    >
    > Thanks,
    > Viktor


    Viktor,

    Supply the DOTALL flag when compiling the regular expression, as
    described here: http://www.python.org/doc/current/lib/re-syntax.html

    You will also want to make the regular expression non-greedy. A greedy
    .* runs from the first /* to the last */ in the text and swallows
    everything in between, as the greedy example below shows.

    >>> import re
    >>> import pprint
    >>>
    >>> st = """
    ... /* this is a
    ... multi-line comment */
    ...
    ... /* this is a single-line comment */
    ...
    ... /* this /* has multiple
    ... starts */
    ... """
    >>> # non-greedy matching (raw string, so the backslashes reach the regex engine unchanged)
    >>> NonGreedy = re.compile(r"/\*.*?\*/", re.DOTALL)
    >>>
    >>> pprint.pprint(NonGreedy.findall(st))

    ['/* this is a\nmulti-line comment */',
    '/* this is a single-line comment */',
    '/* this /* has multiple\nstarts */']

    >>> # greedy matching
    >>> Greedy = re.compile(r"/\*.*\*/", re.DOTALL)
    >>> pprint.pprint(Greedy.findall(st))

    ['/* this is a\nmulti-line comment */\n\n/* this is a single-line comment */\n\n/* this /* has multiple\nstarts */']

    - Josiah
     
    Josiah Carlson, Feb 2, 2004
    #3
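
The findall() session above shows what each pattern matches; the sketch below shows what that means for re.sub() when real code sits between two comments. The sample source string is a hypothetical input invented for this illustration.

```python
import re

# Hypothetical Java-like input with real code between two comments.
source = "/* header */\nint x = 1;\n/* multi\nline */\nint y = 2;\n"

# Greedy: one match from the first /* to the last */.
greedy = re.sub(r"/\*.*\*/", "", source, flags=re.DOTALL)

# Non-greedy: each comment is matched separately.
non_greedy = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)

print(repr(greedy))      # '\nint y = 2;\n'  ('int x = 1;' was lost)
print(repr(non_greedy))  # '\nint x = 1;\n\nint y = 2;\n'
```

The greedy version deletes the statement between the two comments along with the comments themselves, which is why the non-greedy form is the one to use for stripping.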
  4. Viktor Rosenfeld wrote:

    > Hi,
    >
    > I want to strip a Java file of /* */-style comments. Unfortunately, the
    > simple regexp "\/\*.*\*\/" only works on comments that are on one line.
    > Is there a simple way to remove comments that go across several lines with
    > Python regexps? I tried re.M to no avail.


    Something like:

    import re
    # re.DOTALL lets "." match newlines; re.MULTILINE only affects ^ and $
    pattern = re.compile(r"/\*.*?\*/", re.MULTILINE|re.DOTALL)
    stripped_data = pattern.sub("", data)

    Note that I added a ? to the regex, so it won't be "greedy".

    HTH,

    --
    Hans
    http://zephyrfalcon.org/
     
    Hans Nowak, Feb 2, 2004
    #4
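
Since the thread title asks about re.sub() specifically, here is a self-contained sketch of the same idea as a function. The name strip_block_comments and the sample data are assumptions made up for this illustration, not something from the posts above. It uses the inline (?s) flag, which is equivalent to passing re.DOTALL.

```python
import re

def strip_block_comments(java_source):
    """Remove /* ... */ comments, including ones spanning several lines."""
    # (?s) is the inline spelling of re.DOTALL; .*? keeps each match minimal.
    return re.sub(r"(?s)/\*.*?\*/", "", java_source)

# Hypothetical sample input.
data = "class A {\n  /* field\n     docs */\n  int n; /* inline */\n}\n"
stripped = strip_block_comments(data)
print(repr(stripped))  # 'class A {\n  \n  int n; \n}\n'

# Pitfall: the fourth positional argument of re.sub() is count, not flags,
# so re.sub(pattern, "", data, re.DOTALL) does not turn DOTALL on.
# Pass the flag by keyword instead:
same = re.sub(r"/\*.*?\*/", "", data, flags=re.DOTALL)
print(same == stripped)  # True
```

One caveat with any purely regex-based approach: it knows nothing about Java syntax, so a /* inside a string literal or a // line comment would be treated as the start of a block comment.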
  5. Re: [SOLVED] Q: How can I make re.sub() replace patterns across newlines

    Thanks to all who were quick to answer; using re.DOTALL indeed solves the
    problem. I was too tired to read the documentation correctly.

    Ciao,
    Viktor
     
    Viktor Rosenfeld, Feb 2, 2004
    #5
