counting lines of code

Michele Simionato

I need a small utility to count the lines of Python code in a
directory, traversing subdirectories and ignoring comments and
docstrings. I am sure there is already something doing that, what do
you suggest?

TIA,

Michele Simionato
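
For reference, the kind of utility asked for above could be sketched in pure Python roughly as follows. This is only a minimal sketch, not an existing tool: the names count_code_lines and count_tree are made up for illustration, and it assumes that any statement consisting of a single string literal should be treated as a docstring and ignored.

```python
import io
import os
import tokenize

# Token types that never count as code on their own.
_SKIP = {tokenize.COMMENT, tokenize.NL, tokenize.INDENT, tokenize.DEDENT}


def count_code_lines(source):
    """Return the number of physical lines in `source` that hold code."""
    code_lines = set()
    statement = []  # significant tokens of the current logical line
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NEWLINE, tokenize.ENDMARKER):
            # Assumption: a statement that is one bare string is a docstring.
            is_docstring = (len(statement) == 1
                            and statement[0].type == tokenize.STRING)
            if not is_docstring:
                for t in statement:
                    code_lines.update(range(t.start[0], t.end[0] + 1))
            statement = []
        elif tok.type not in _SKIP:
            statement.append(tok)
    return len(code_lines)


def count_tree(root):
    """Sum the code lines of every .py file below `root`."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            try:
                # tokenize.open honours the file's encoding declaration.
                with tokenize.open(path) as f:
                    total += count_code_lines(f.read())
            except (SyntaxError, tokenize.TokenError, UnicodeDecodeError):
                pass  # skip files that cannot be tokenized
    return total
```

Calling count_tree(".") would then give the total for the current directory. Files that fail to tokenize are silently skipped here, which may or may not be what is wanted.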
 
Mark Hammond

I need a small utility to count the lines of Python code in a
directory, traversing subdirectories and ignoring comments and
docstrings. I am sure there is already something doing that, what do
you suggest?

I suggest typing your subject line into google and hitting the "I feel
lucky" button :)

HTH,

Mark
 
Michele Simionato

I did not know about cloc; it does more than I need, but it looks
cool (it is Perl and not Python, but who cares? ;)

Thanks,

Michele
 
Michele Simionato

Any of the static code checkers (‘pylint’, ‘pyflakes’, etc.) would
already be doing this.

pylint does too many things, I want something fast that just counts
the lines and can be run on thousands of files at once.
cloc seems fine, I have just tried on 2,000 files and it gives me a
report in just a few seconds.
 
Phlip

pylint does too many things, I want something fast that just counts
the lines and can be run on thousands of files at once.
cloc seems fine, I have just tried on 2,000 files and it gives me a
report in just a few seconds.

In my experience with Python codebases that big...

...how many of those lines are duplicated, and might merge together
into a better design?

The LOC would go down, too.
 
Aahz

In my experience with Python codebases that big...

...how many of those lines are duplicated, and might merge together
into a better design?

Um... do you have any clue who you followed up to? If you don't, Google
is your friend.
 
Robert Kern

Oh, sorry, did I have the wrong opinion?

You had a condescending attitude.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco
 
Michele Simionato

In my experience with Python codebases that big...

...how many of those lines are duplicated, and might merge together
into a better design?

The LOC would go down, too.

Actually 2,000 files is a very small portion of our code base, the one
I am working on now. I have spent the last couple of months on a big
refactoring project (which is still only at the beginning) and I
wanted to count the difference between the lines of code before the
refactoring and after the refactoring. I guess the new code is less
than half the size of the old one. There was no cut and paste in the
old code but a lot of subtle duplication, i.e. code that could be
unified in common libraries, but only after a lot of grunt work. The
core parts were written 10 years ago, with the wrong architecture from
the beginning, and then things started growing and growing on that
monster. Just for fun I have run cloc on our trunk:

Language             files     blank   comment      code    scale   3rd gen. equiv
--------------------------------------------------------------------------------
C++                   1528     67150     48251    304365 x   1.51 =    459591.15
XML                    560      2769      2517    223223 x   1.90 =    424123.70
ASP                    731     40136      4630    216713 x   1.29 =    279559.77
Python                2027     38825     47261    179532 x   4.20 =    754034.40
C/C++ Header          2150     51352     72619    141356 x   1.00 =    141356.00
Javascript             153     26196      9819    115311 x   1.48 =    170660.28
C                      332     14147     12871     97918 x   0.77 =     75396.86
SQL                    426     16432      4214     93598 x   2.29 =    214339.42
CSS                    110      1493      1013     23087 x   1.00 =     23087.00
C#                      83      3301      1990     19827 x   1.36 =     26964.72
Visual Basic            35      4363      5927     14633 x   2.76 =     40387.08
make                   259      1617       650      8339 x   2.50 =     20847.50
Bourne Shell            52       598      1282      6557 x   3.81 =     24982.17
m4                      28       611       627      5612 x   1.00 =      5612.00
IDL                     23       560         0      3895 x   3.80 =     14801.00
HTML                    33       354        76      3834 x   1.90 =      7284.60
MSBuild scripts          3         2         7      3419 x   1.90 =      6496.10
Lisp                    33       562       648      2695 x   1.25 =      3368.75
Ruby                    13       272        97      1141 x   4.20 =      4792.20
DOS Batch               77       790       410      1034 x   0.63 =       651.42
Java                     4       148       181       972 x   1.36 =      1321.92
Perl                     6       104       131       922 x   4.00 =      3688.00
XSD                      6         0         0       506 x   1.90 =       961.40
awk                      5        65        17       366 x   3.81 =      1394.46
DTD                      4       117        50       351 x   1.90 =       666.90
ASP.Net                 36       153       561       280 x   1.29 =       361.20
Bourne Again Shell      12        63         8       245 x   3.81 =       933.45
XSLT                     1        15        14       196 x   1.90 =       372.40
NAnt scripts             3        27         0       119 x   1.90 =       226.10
Teamcenter def          10        16         0        93 x   1.00 =        93.00
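
For anyone reading the report: going by the "x ... =" layout of each row, the last column is just the code count multiplied by the scale factor. A small sketch that checks a few rows; the function name third_gen_equiv is made up for illustration, and Decimal is used only to avoid float rounding noise:

```python
from decimal import Decimal


def third_gen_equiv(code_lines, scale):
    """Scaled line count: code lines times the per-language scale factor."""
    return Decimal(code_lines) * Decimal(scale)


# (code, scale) pairs copied from the report above.
rows = {
    "Python": (179532, "4.20"),
    "C++": (304365, "1.51"),
    "C": (97918, "0.77"),
}

for language, (code, scale) in rows.items():
    print(language, third_gen_equiv(code, scale))
```

The scale factors themselves come from cloc; nothing here says how they were chosen.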
 
Phlip

Just for fun I have run cloc on our trunk:

SUM:                  8743    272238    215871   1470139 x   1.84 =   2708354.95

Nice!

My favorite version of a cloc system can distinguish test from
production code. That's why I always use executable cloc to measure
the ratio of test to production code (where 1.2:1 is almost
comfortable and 2:1 is sacred).
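
As a rough sketch of how such a ratio could be computed from per-file line counts (this is not any particular tool; the function names and the naming convention used to spot test files, test_*.py, *_test.py or anything under a tests/ directory, are assumptions made up for illustration):

```python
def is_test_file(path):
    """Hypothetical convention: test_*.py, *_test.py, or under tests/."""
    parts = path.replace("\\", "/").split("/")
    name = parts[-1]
    return (name.startswith("test_")
            or name.endswith("_test.py")
            or "tests" in parts[:-1])


def ratio_test_to_production(line_counts):
    """`line_counts` maps file path -> lines of code.

    Returns test LOC divided by production LOC.
    """
    test = sum(n for p, n in line_counts.items() if is_test_file(p))
    prod = sum(n for p, n in line_counts.items() if not is_test_file(p))
    return test / prod if prod else float("inf")


counts = {"pkg/core.py": 100, "pkg/tests/test_core.py": 120}
print(ratio_test_to_production(counts))
```

The per-file counts would come from whatever line counter is in use.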

Just so long as nobody confuses "more lines of code!" with progress...
 
Steve Holden

Robert said:
You had a condescending attitude.
Towards someone who is fairly obviously not a Python neophyte.

Please don't think we are telling you you can't have any opinion you
like. Just don't expect to get away with it when you are wrong ;-)

Welcome to c.l.py.

regards
Steve
 
Michele Simionato

Nice!

My favorite version of a cloc system can distinguish test from
production code. That's why I always use executable cloc to measure
the ratio of test to production code (where 1.2:1 is almost
comfortable and 2:1 is sacred).

Most of this code base is old, before we started using automatic
tests, so tests are not a significant fraction of the code. And in any
case I consider tests as code, since you have to maintain them,
refactor them, etc.
 
Michele Simionato

Towards someone who is fairly obviously not a Python neophyte.

Please don't think we are telling you you can't have any opinion you
like. Just don't expect to get away with it when you are wrong ;-)

Come on, we are all friends here! The guy just said
In my experience with Python codebases that big...

...how many of those lines are duplicated, and might merge together
into a better design?

The LOC would go down, too.

and it was even partially right. It is certainly right for the part I
was working on. If it had been good code from the beginning, we would
not have started this refactoring project, right? Peace,

Michele
 
