How to remove empty lines with re?

Discussion in 'Python' started by Tim Haynes, Oct 10, 2003.

  1. Tim Haynes

    Tim Haynes Guest

    If you set a variable to an empty string and then print it, you will
    get an empty line printed ;)

    Product Development Consultant
    OpenLink Software
    Tel: +44 (0) 20 8681 7701
    Web: <>
    Universal Data Access & Data Integration Technology Providers
    Tim Haynes, Oct 10, 2003

  2. ted

    ted Guest

    I'm having trouble using the re module to remove empty lines in a file.

    Here's what I thought would work, but it doesn't:

    import re
    f = open("old_site/index.html")
    for line in f:
        line = re.sub(r'^\s+$|\n', '', line)
        print line

    Also, when I try to remove some HTML tags, I get even more empty lines:

    import re
    f = open("old_site/index.html")
    for line in f:
        line = re.sub('<.*?>', '', line)
        line = re.sub(r'^\s+$|\n', '', line)
        print line

    I don't know what I'm doing. Any help appreciated.

    ted, Oct 10, 2003
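A minimal sketch of why the loop in the question still prints blank lines, written in Python 3 syntax (the thread itself uses Python 2). The sample lines are made up for illustration. The substitution turns a blank line into an empty string, but the loop still prints that empty string, and print adds its own newline.

```python
import re

# Hypothetical sample input standing in for the lines of index.html.
lines = ["<p>one</p>\n", "\n", "   \n", "two\n"]

# Same pattern as in the question: blank lines become '' but are not dropped.
cleaned = [re.sub(r'^\s+$|\n', '', line) for line in lines]

for line in cleaned:
    print(line)  # the two '' entries still come out as empty lines
```

So the substitution alone cannot remove a line; the line has to be skipped before it is printed.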

  3. Peter Otten

    Peter Otten Guest


    import sys
    for line in f:
        if line.strip():
            # Assumed completion: the loop body is missing from the post.
            sys.stdout.write(line)

    Background: lines read from the file keep their trailing "\n", a second
    newline is inserted by the print statement.
    The strip() method creates a copy of the string with all leading/trailing
    whitespace chars removed. All but the empty string evaluate to True in the
    if statement.

    Peter Otten, Oct 10, 2003
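The strip() test described above can be sketched as a small generator, here in Python 3 syntax. The function name nonblank_lines, the strip_tags flag, and the sample data are assumptions made for illustration; the tag pattern is the one from the question. Removing tags before the strip() check is what gets rid of lines that held nothing but markup.

```python
import re
import sys

TAG = re.compile(r'<.*?>')  # same non-greedy tag pattern as in the question

def nonblank_lines(lines, strip_tags=False):
    """Yield only the lines that still contain non-whitespace text."""
    for line in lines:
        if strip_tags:
            line = TAG.sub('', line)
        if line.strip():  # '' is false, any other string is true
            yield line

# Hypothetical sample input.
sample = ["<h1>Title</h1>\n", "<br>\n", "\n", "  text\n"]

for line in nonblank_lines(sample, strip_tags=True):
    sys.stdout.write(line)  # the line already ends with "\n"
```

The yielded lines keep their original newline, so they should be written with sys.stdout.write() rather than a plain print.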
  4. nonempty = [x for x in f if x.strip()]

    Bror Johansson, Oct 10, 2003
  5. Anand Pillai

    Anand Pillai Guest

    To do this, you need to modify your re to just

    ^$

    This of course looks for a pattern where there is beginning just
    after end, i.e. the line is empty :)

    Here is the complete code.

    import re

    empty = re.compile(r'^$')
    for line in open('test.txt').readlines():
        if not empty.match(line):
            print line,

    The comma at the end of the print is to avoid printing another newline,
    since the 'readlines()' method gives you the line with a '\n' at the end.

    Also, don't forget to compile your regexps for efficiency's sake.


    -Anand Pillai
    Anand Pillai, Oct 10, 2003
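One caveat worth illustrating: a pattern that only matches a truly empty line lets whitespace-only lines through. The sketch below, in Python 3 syntax with made-up sample lines, compares '^$' with the looser '^\s*$'. The names empty, blank, kept_empty and kept_blank are chosen here for illustration.

```python
import re

empty = re.compile(r'^$')      # matches "" and a bare "\n"
blank = re.compile(r'^\s*$')   # also matches whitespace-only lines

# Hypothetical sample input.
lines = ["a\n", "\n", "   \n", "b\n"]

kept_empty = [line for line in lines if not empty.match(line)]
kept_blank = [line for line in lines if not blank.match(line)]

print(kept_empty)  # the whitespace-only line survives
print(kept_blank)  # only lines with real content remain
```

Which pattern is right depends on whether a line of spaces or tabs should count as empty for the file at hand.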
  6. Anand Pillai

    Anand Pillai Guest


    I meant "there is end just after the beginning" of course.

    Anand Pillai, Oct 10, 2003
  7. The .readlines() method retains any line terminators, and using the
    builtin print will suffix an extra line terminator to every line,
    thus effectively producing an empty line for every non-empty line.
    You'd want to use e.g. sys.stdout.write() instead of print.

    // Klaus

    Klaus Alexander Seistrup, Oct 10, 2003
  8. ted

    ted Guest

    Thanks Anand, works great.

    ted, Oct 11, 2003
  9. Anand Pillai

    Anand Pillai Guest

    You probably did not read my posting completely.

    I added a comma after the print statement and commented on it
    specifically.

    The 'print line,' statement with a comma after it does not print
    a newline (what you call a line terminator), whereas
    the 'print' without a comma at the end does just that.

    No wonder Python sometimes feels like high-level pseudocode ;-)
    It has that ultra-intuitive feel for most of its tricks.

    In this case, the comma is usually put when you have more than
    one item to print, and python puts a newline after all items.
    So it very intuitively follows that just putting a comma will not
    print a newline! It is better than telling the programmer to use
    another print function to avoid newlines, which you find in many
    other 'un-pythonic' languages.

    Anand Pillai, Oct 12, 2003
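For readers on Python 3, where print is a function and the trailing-comma form is no longer available, the same effect comes from the end argument. This is a minimal sketch with made-up sample lines; output goes to io.StringIO buffers (the names buf_plain and buf_comma are chosen here) so the difference is visible as strings.

```python
import io

# Hypothetical lines as read from a file: each keeps its trailing "\n".
lines = ["one\n", "two\n"]

buf_plain = io.StringIO()
for line in lines:
    print(line, file=buf_plain)          # print appends a second newline

buf_comma = io.StringIO()
for line in lines:
    print(line, end="", file=buf_comma)  # Python 3 counterpart of 'print line,'

print(repr(buf_plain.getvalue()))
print(repr(buf_comma.getvalue()))
```

Calling sys.stdout.write(line), as suggested earlier in the thread, gives the same result as the end="" form in both Python 2 and Python 3.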
  10. You are completely right, I missed an important part of your posting.
    I didn't know about the comma feature, so thanks for teaching me!


    // Klaus

    Klaus Alexander Seistrup, Oct 12, 2003