Method needed for skipping lines

Discussion in 'Python' started by Gustaf, Oct 31, 2007.

  1. Gustaf

    Gustaf Guest

    Hi all,

    Just for fun, I'm working on a script to count the number of lines in source files. Some lines are auto-generated (by the IDE) and shouldn't be counted. The auto-generated part of a file starts with "Begin VB.Form" and ends with "End" (first thing on the line). The "End" keyword may appear inside the auto-generated part, but not at the beginning of the line.

    I imagine having a flag variable to tell whether you're inside the auto-generated part, but I wasn't able to figure out exactly how. Here's the function, without the ability to skip auto-generated code:

    # Count the lines of source code in the file
    def count_lines(f):
        file = open(f, 'r')
        rows = 0
        for line in file:
            rows = rows + 1
        return rows

    How would you modify this to exclude lines between "Begin VB.Form" and "End" as described above?

    Gustaf
     
    Gustaf, Oct 31, 2007
    #1

  2. On Wed, 31 Oct 2007 18:02:26 +0100, Gustaf wrote:

    > Just for fun, I'm working on a script to count the number of lines in
    > source files. Some lines are auto-generated (by the IDE) and shouldn't be
    > counted. The auto-generated part of files start with "Begin VB.Form" and
    > end with "End" (first thing on the line). The "End" keyword may appear
    > inside the auto-generated part, but not at the beginning of the line.
    >
    > I imagine having a flag variable to tell whether you're inside the
    > auto-generated part, but I wasn't able to figure out exactly how. Here's
    > the function, without the ability to skip auto-generated code:
    >
    > # Count the lines of source code in the file
    > def count_lines(f):
    >     file = open(f, 'r')
    >     rows = 0
    >     for line in file:
    >         rows = rows + 1
    >     return rows
    >
    > How would you modify this to exclude lines between "Begin VB.Form" and
    > "End" as described above?


    Introduce the flag and look up the docs for the `startswith()` method on
    strings.
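A minimal sketch of that hint, assuming the markers sit at the very start of the line as the question describes. The name `in_generated` is only illustrative, and the `with` statement is a convenience that is not part of the original function.

```python
def count_lines(path):
    """Count lines, skipping everything from 'Begin VB.Form' to a line starting with 'End'."""
    rows = 0
    in_generated = False  # the flag: True while inside the auto-generated part
    with open(path, 'r') as source:
        for line in source:
            if in_generated:
                # Test the unstripped line, so an indented "End" does not end the block
                if line.startswith("End"):
                    in_generated = False
                continue
            if line.startswith("Begin VB.Form"):
                in_generated = True
                continue
            rows += 1
    return rows
```

Both marker lines are skipped along with everything between them; blank lines and comments are still counted in this version.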

    Ciao,
    Marc 'BlackJack' Rintsch
     
    Marc 'BlackJack' Rintsch, Oct 31, 2007
    #2

  3. Yu-Xi Lim

    Yu-Xi Lim Guest

    Gustaf wrote:
    > Hi all,
    >
    > Just for fun, I'm working on a script to count the number of lines in
    > source files. Some lines are auto-generated (by the IDE) and shouldn't
    > be counted. The auto-generated part of files start with "Begin VB.Form"
    > and end with "End" (first thing on the line). The "End" keyword may
    > appear inside the auto-generated part, but not at the beginning of the
    > line.
    >
    > I imagine having a flag variable to tell whether you're inside the
    > auto-generated part, but I wasn't able to figure out exactly how. Here's
    > the function, without the ability to skip auto-generated code:
    >
    > # Count the lines of source code in the file
    > def count_lines(f):
    >     file = open(f, 'r')
    >     rows = 0
    >     for line in file:
    >         rows = rows + 1
    >     return rows
    >
    > How would you modify this to exclude lines between "Begin VB.Form" and
    > "End" as described above?
    > Gustaf


    David Mertz's Text Processing in Python might give you some more
    efficient (and interesting) ways of approaching the problem.

    http://gnosis.cx/TPiP/
     
    Yu-Xi Lim, Oct 31, 2007
    #3
  4. Gustaf wrote:
    > Hi all,
    >
    > Just for fun, I'm working on a script to count the number of lines in
    > source files. Some lines are auto-generated (by the IDE) and shouldn't
    > be counted. The auto-generated part of files start with "Begin VB.Form"
    > and end with "End" (first thing on the line). The "End" keyword may
    > appear inside the auto-generated part, but not at the beginning of the
    > line.
    >
    > I imagine having a flag variable to tell whether you're inside the
    > auto-generated part, but I wasn't able to figure out exactly how. Here's
    > the function, without the ability to skip auto-generated code:
    >
    > # Count the lines of source code in the file
    > def count_lines(f):
    >     file = open(f, 'r')


    1/ The param name is not very explicit.
    2/ You're shadowing the builtin file type.
    3/ It might be better to pass an opened file object instead - this would
    make your function more generic (ok, perhaps a bit overkill here, but
    still a better practice IMHO).

    > rows = 0


    Shouldn't that be something like 'line_count' ?

    > for line in file:
    >     rows = rows + 1


    Use augmented assignment instead:
    rows += 1

    > return rows


    You forgot to close the file.
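Taken together, the remarks above might look like the sketch below in current Python. The names `count_lines_in_path` and `source_file` are hypothetical, and the `with` statement is one possible way to make sure the file gets closed.

```python
import io

def count_lines(lines):
    """Count the lines of any iterable of lines, such as an open file object."""
    line_count = 0
    for _ in lines:
        line_count += 1  # augmented assignment
    return line_count

def count_lines_in_path(path):
    # The with statement closes the file even if an exception is raised
    with open(path, 'r') as source_file:
        return count_lines(source_file)

# The counting function also works on in-memory text, which helps when testing
print(count_lines(io.StringIO("first\nsecond\n")))  # 2
```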

    > How would you modify this to exclude lines between "Begin VB.Form" and
    > "End" as described above?


    Here's a straightforward solution:

    def count_loc(path):
        loc_count = 0
        in_form = False
        opened_file = open(path)
        try:
            for raw_line in opened_file:
                # stripping lines
                line = raw_line.strip()
                # skipping blank lines
                if not line:
                    continue
                # skipping VB comments (they start with an apostrophe)
                # XXX: comment mark should not be hardcoded
                if line.startswith("'"):
                    continue
                # skipping autogenerated code
                if line.startswith("Begin VB.Form"):
                    in_form = True
                    continue
                elif in_form:
                    # test the unstripped line: only an "End" at the very
                    # start of the line closes the autogenerated part
                    if raw_line.startswith("End"):
                        in_form = False
                    continue
                # Still here ? ok, we count this one
                loc_count += 1
        finally:
            opened_file.close()
        return loc_count

    HTH

    PS : If you prefer a more functional approach
    (warning: the following code may permanently damage innocent minds):

    def chain(*predicates):
        def _chained(arg):
            for p in predicates:
                if not p(arg):
                    return False
            return True
        return _chained

    def not_(predicate):
        def _not_(arg):
            return not predicate(arg)
        return _not_

    class InGroupPredicate(object):
        def __init__(self, begin_group, end_group):
            self.in_group = False
            self.begin_group = begin_group
            self.end_group = end_group

        def __call__(self, line):
            if self.begin_group(line):
                self.in_group = True
                return True
            elif self.in_group and self.end_group(line):
                self.in_group = False
                return True # this one too is part of the group
            return self.in_group

    def count_locs(lines, count_line):
        # list() is needed on Python 3, where filter() returns an iterator.
        # rstrip keeps the leading whitespace, so an indented "End" is not
        # mistaken for the end of the group.
        return len(list(filter(
            chain(lambda line: bool(line), count_line),
            map(str.rstrip, lines)
        )))

    def count_vb_locs(lines):
        return count_locs(lines, chain(
            not_(InGroupPredicate(
                lambda line: line.startswith('Begin VB.Form'),
                lambda line: line.startswith('End')
            )),
            # VB comments start with an apostrophe
            lambda line: not line.lstrip().startswith("'")
        ))

    # and finally our count_lines function, greatly simplified !-)
    def count_lines(path):
        f = open(path)
        try:
            return count_vb_locs(f)
        finally:
            f.close()
    (anyone on doing it with itertools ?-)
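One possible answer to that, as a sketch only: `itertools.takewhile` can alternate between passing lines through and discarding a block. The function name `hand_written_lines` is hypothetical, the markers are assumed to start at column 0, and blank lines and comments are not filtered here.

```python
from collections import deque
from itertools import chain, takewhile

def hand_written_lines(lines):
    """Yield the lines outside every 'Begin VB.Form' ... 'End' block."""
    remaining = iter(lines)
    for first in remaining:
        # Pass lines through up to the next block start;
        # takewhile also consumes the "Begin VB.Form" line itself.
        yield from takewhile(
            lambda line: not line.startswith("Begin VB.Form"),
            chain([first], remaining),
        )
        # Discard the block body; takewhile also consumes the closing "End" line.
        deque(takewhile(lambda line: not line.startswith("End"), remaining), maxlen=0)

def count_lines(lines):
    return sum(1 for _ in hand_written_lines(lines))
```

Because everything shares the single `remaining` iterator, each pass of the outer loop picks up exactly where the previous `takewhile` stopped.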
     
    Bruno Desthuilliers, Oct 31, 2007
    #4
  5. Paul Hankin

    Paul Hankin Guest

    On Oct 31, 5:02 pm, Gustaf wrote:
    > Hi all,
    >
    > Just for fun, I'm working on a script to count the number of lines in source files. Some lines are auto-generated (by the IDE) and shouldn't be counted. The auto-generated part of files start with "Begin VB.Form" and end with "End" (first thing on the line). The "End" keyword may appear inside the auto-generated part, but not at the beginning of the line.
    >
    > I imagine having a flag variable to tell whether you're inside the auto-generated part, but I wasn't able to figure out exactly how. Here's the function, without the ability to skip auto-generated code:
    >
    > # Count the lines of source code in the file
    > def count_lines(f):
    >     file = open(f, 'r')
    >     rows = 0
    >     for line in file:
    >         rows = rows + 1
    >     return rows
    >
    > How would you modify this to exclude lines between "Begin VB.Form" and "End" as described above?


    First, your function can be written much more compactly:

    def count_lines(f):
        # file objects have no len(), so count by iterating
        return sum(1 for line in open(f, 'r'))


    Anyway, to answer your question, write a function that omits the lines
    you want excluded:

    def omit_generated_lines(lines):
        in_generated = False
        for line in lines:
            # test the unstripped line, so an indented 'End' is ignored
            in_generated = in_generated or line.startswith('Begin VB.Form')
            if not in_generated:
                yield line.strip()
            in_generated = in_generated and not line.startswith('End')

    And count the remaining ones...

    def count_lines(filename):
        # a generator has no len(), so count by iterating
        return sum(1 for line in omit_generated_lines(open(filename, 'r')))

    --
    Paul Hankin
     
    Paul Hankin, Nov 1, 2007
    #5
  6. Anand

    Anand Guest

    On Nov 1, 5:04 am, Paul Hankin wrote:
    > On Oct 31, 5:02 pm, Gustaf wrote:
    >
    > > Hi all,
    > >
    > > Just for fun, I'm working on a script to count the number of lines in source files. Some lines are auto-generated (by the IDE) and shouldn't be counted. The auto-generated part of files start with "Begin VB.Form" and end with "End" (first thing on the line). The "End" keyword may appear inside the auto-generated part, but not at the beginning of the line.


    I think regular expressions can help here.

    import re

    # non-greedy, so the match stops at the first line that is exactly "End"
    rx = re.compile(r'^Begin VB\.Form.*?^End\n', re.DOTALL|re.MULTILINE)

    def count(filename):
        text = open(filename).read()
        return rx.sub('', text).count('\n')
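To see what such a pattern removes, here is a small sketch run on an in-memory sample rather than a file. The sample text and the names `GENERATED_BLOCK` and `count_text_lines` are made up for illustration, and the non-greedy `.*?` is an assumption about the intended behaviour.

```python
import re

# Assumed pattern: from a "Begin VB.Form" line to the first line that is exactly "End"
GENERATED_BLOCK = re.compile(r'^Begin VB\.Form.*?^End\n', re.DOTALL | re.MULTILINE)

def count_text_lines(text):
    """Count newline-terminated lines left after removing generated blocks."""
    return GENERATED_BLOCK.sub('', text).count('\n')

sample = (
    'VERSION 5.00\n'
    'Begin VB.Form Form1\n'
    '   Caption = "Demo"\n'
    '   End\n'
    'End\n'
    'Private Sub Form_Load()\n'
    'End Sub\n'
)
print(count_text_lines(sample))  # 3
```

The indented `End` and the later `End Sub` do not terminate the match, because `^End\n` only matches a line consisting of `End` alone. A final `End` without a trailing newline would not match either.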
     
    Anand, Nov 1, 2007
    #6
  7. Gustaf

    Gustaf Guest

    Yu-Xi Lim wrote:

    > David Mertz's Text Processing in Python might give you some more
    > efficient (and interesting) ways of approaching the problem.
    >
    > http://gnosis.cx/TPiP/


    Thank you for the link. Looks like a great resource.

    Gustaf
     
    Gustaf, Nov 1, 2007
    #7
  8. Gustaf

    Gustaf Guest

    Bruno Desthuilliers wrote:

    > Here's a straightforward solution:


    <snip/>

    Thank you. I learned several things from that. :)

    Gustaf
     
    Gustaf, Nov 1, 2007
    #8