Opinions please -- how big should a single module grow?

  • Thread starter Steven D'Aprano

Steven D'Aprano

This is a style question rather than a programming question.

How large (how many KB, lines, classes, whatever unit of code you like to
measure in) should a module grow before I should break it up into a
package? I see that, for example, decimal.py is > 3000 lines of code, so
I can assume that 3 KLOC is acceptable. Presumably 3000 KLOC is not.
Where do you draw the line?

For the purposes of the discussion, you should consider that the code in
the module really does belong together, and that splitting it into sub-
modules would mean arbitrarily separating code into separate files.
 

Stefan Behnel

Steven D'Aprano, 09.07.2010 06:37:
This is a style question rather than a programming question.

How large (how many KB, lines, classes, whatever unit of code you like to
measure in) should a module grow before I should break it up into a
package? I see that, for example, decimal.py is > 3000 lines of code, so
I can assume that 3 KLOC is acceptable. Presumably 3000 KLOC is not.
Where do you draw the line?

For the purposes of the discussion, you should consider that the code in
the module really does belong together, and that splitting it into sub-
modules would mean arbitrarily separating code into separate files.

Well, if that's the case, then there's no reason to split it up. However,
for any module of substantial growth, I'd expect there to be some point
where you start adding section separation comments (Meta-80-#) to make the
structure clearer. That might be the time when you should also consider
breaking it up into separate source files.

Stefan
 

Stephen Hansen

If that is really true, then it belongs as one module, regardless of
size -- I don't believe there's any certain count where you should draw
this line. Now, once you get up to, say, 5-10 KLOC, I find it hard to
believe that everything there really does belong together and there's
not a substantial subset of code which really belongs "more" together
than it does with the surrounding code. That would then lend itself to
being a new file in a package.

But that can happen at 1 KLOC too. And then again, it can still be 5-10
KLOC which, to me, still fits perfectly together. It's just more likely,
the higher the count, that you have subsets that would benefit from
logical separation.
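
As a rough illustration of a subset becoming "a new file in a package": the sketch below writes a tiny package to a temporary directory and imports it. Everything named here (`demo_geometry`, `_circles`, `_squares`, `circle_area`, `square_area`, and the two helper functions) is hypothetical and only stands in for real code. The point it tries to show is that `__init__.py` can re-export the public names, so callers need not care whether the code lives in one file or several.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root):
    """Write a tiny hypothetical package, 'demo_geometry', under root."""
    pkg = Path(root) / "demo_geometry"
    pkg.mkdir()
    # Each submodule holds one subset of code that belongs "more" together.
    (pkg / "_circles.py").write_text(
        "import math\n\n"
        "def circle_area(radius):\n"
        "    return math.pi * radius ** 2\n"
    )
    (pkg / "_squares.py").write_text(
        "def square_area(side):\n"
        "    return side ** 2\n"
    )
    # __init__.py re-exports the public names, so callers can still write
    # "from demo_geometry import circle_area" as if it were a single module.
    (pkg / "__init__.py").write_text(
        "from ._circles import circle_area\n"
        "from ._squares import square_area\n\n"
        "__all__ = ['circle_area', 'square_area']\n"
    )
    return pkg


def import_demo_package():
    """Build the package in a temporary directory, import it, return the module."""
    with tempfile.TemporaryDirectory() as tmp:
        build_demo_package(tmp)
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            return importlib.import_module("demo_geometry")
        finally:
            sys.path.remove(tmp)


if __name__ == "__main__":
    geometry = import_demo_package()
    print(geometry.square_area(3))
    print(sorted(geometry.__all__))
```

Whether the split is worth it is still the judgment call discussed above; the sketch only shows that the split itself need not change how the code is imported.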

--

Stephen Hansen
... Also: Ixokai
... Mail: me+list/python (AT) ixokai (DOT) io
... Blog: http://meh.ixokai.io/


 

Roy Smith

To paraphrase Einstein, a module should be as large as it needs to be,
but no larger.

There's no hard and fast rule. More important than counting lines of
code is that all the things in a module should represent some coherent
set of related functionality. Beyond that, I would say the only hard
limit on line count is when common editing tools start to balk at
opening the file.
 

Dave Angel

I don't have a number for you, but the measure I'd suggest is the size
of the documentation.

DaveA
 

Tomasz Rola

Myself, I would draw "the line" somewhere between 20-50 KLOC&C (code with
comments) - if something takes a lot of space to comment, then maybe it
should go into a separate unit.

As for the other end of the scale, this 3000 KLOC thing, I believe you
would encounter a performance hit. And probably a big one - I would be
afraid of some O(n^2) dragon or something bigger, waiting deep in the
CPython code to eat your performance. But I am just fantasizing - I have
no idea of CPython internals or how big the biggest Python code written
so far is. Python code tends to be very concise, so 3 MLOC sounds
rather extreme to me. On the other hand, it is possible nobody has tested
CPython for such an extreme case.

Speaking of performance, there is also the target user and her needs (CGI?
desktop? etc.). So I would take this into consideration, too - how many
times per second (minute, day) will this code be executed?

Regards,
Tomasz Rola

--
** A C programmer asked whether computer had Buddha's nature. **
** As the answer, master did "rm -rif" on the programmer's home **
** directory. And then the C programmer became enlightened... **
** **
** Tomasz Rola mailto:[email protected] **
 

Tomasz Rola

Myself, I would draw "the line" somewhere between 20-50 KLOC&C (code with
comments) - if something takes a lot of space to comment, then maybe it
should go into a separate unit.

I meant 2-5 KLOC&C. Oops...

Regards,
Tomasz Rola

 

Martin P. Hellwig

Personally, I like to split up my modules quite early, and the main
criterion is whether this improves (IMO) maintainability.
So if I can refactor stuff so that, with some adaptation, it can be used
multiple times, I put it in its own module. I rarely exceed 1 KLOC per
module and on average am at about half of that.
 

Tomasz Rola

And just in case... A real life example (my computer, more or less typical
Linux setup):

find / -type f -name '*.py' -exec wc {} \; |
  gawk '{ l+=$1; } END {print l / FNR; } BEGIN { l=0; }'

This gives a mean:

269.069

So, if I did not screw something up, the typical Python file size is far
below 1 KLOC (wc counts all lines, so my result includes all comments and
blanks too).
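
For anyone without gawk at hand, roughly the same measurement can be sketched in Python. This is only an approximation of the pipeline above, under a few assumptions: the helper names `count_lines` and `mean_py_lines` are made up for illustration, lines are counted as newline characters (as wc does), symlinks are skipped to mirror `-type f`, and unreadable files are silently ignored.

```python
import os
import tempfile


def count_lines(path):
    """Count newline characters in a file, the way wc counts lines."""
    with open(path, "rb") as f:
        return sum(chunk.count(b"\n") for chunk in iter(lambda: f.read(65536), b""))


def mean_py_lines(root):
    """Return the mean line count of the *.py files under root (0.0 if none)."""
    total_lines = 0
    file_count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Mirror "find -type f -name '*.py'": regular files only, no symlinks.
            if not name.endswith(".py") or os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                total_lines += count_lines(path)
            except OSError:
                continue  # unreadable file: leave it out of the mean
            file_count += 1
    return total_lines / file_count if file_count else 0.0


if __name__ == "__main__":
    # Tiny self-contained demo: one 2-line file and one 4-line file.
    with tempfile.TemporaryDirectory() as tmp:
        with open(os.path.join(tmp, "a.py"), "w") as f:
            f.write("x = 1\ny = 2\n")
        with open(os.path.join(tmp, "b.py"), "w") as f:
            f.write("a = 1\nb = 2\nc = 3\nd = 4\n")
        print(mean_py_lines(tmp))  # prints 3.0
```

Calling `mean_py_lines("/")` would cover the same ground as the find command, though the number will of course differ from machine to machine.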

Regards,
Tomasz Rola

 

Paul Rubin

I'd tend to say 2-4 KLOC, basically the max size in which it's
reasonable to scroll through the file and read what it's doing, navigate
around it with an editor, etc. Some exceptions are possible.
 

Terry Reedy

3000 lines is more than I would prefer. I pulled decimal.py into an IDLE
edit window; loading was sluggish; the scroll bar is tiny; and any
movement moves the text faster and farther than I would like.
 
