How big do your programs get before you modularise most of it?

Justin C · Dec 10, 2009

I keep writing programs that (for me at least) start getting large. I
know they're small, even by the standard of many of the modules that
I've seen, but how do you decide when to break down your code and
modularise?

Do you only create modules if you think the code is likely to be
re-used, and if it's not then if the program has 10k lines then so
be it?

I find myself bashing away at the keyboard, and the next thing I know is
that I've got 1000 lines and keeping track of what's where starts
getting complicated (I do love an editor that can fold!). I was just
wondering how others program.

Justin.

ccc31807 · Dec 10, 2009

Justin C said:
I keep writing programs that (for me at least) start getting large. I
know they're small, even by the standard of many of the modules that
I've seen, but how do you decide when to break down your code and
modularise?

Do you only create modules if you think the code is likely to be
re-used, and if it's not then if the program has 10k lines then so
be it?

I find myself bashing away at the keyboard, and the next thing I know is
that I've got 1000 lines and keeping track of what's where starts
getting complicated (I do love an editor that can fold!). I was just
wondering how others program.

From my experience, this is an incremental process.

First, you write a small script sequentially. It consists mostly of a
series of statements with maybe a couple of blocks.

After a while as your script grows, you see certain patterns and
repetitions, so you move the similar statements to user defined
functions. Then, your program consists mostly of a series of calls to
user defined functions.

Later on, the number of user defined functions increases so you decide
to place functions with similar functionalities in files, which become
modules. Then, your program 'uses' your modules but still consists of
calls to functions.

The largest monolithic program I've written probably contains several
hundred LOC. By the time you hit KLOCs, you have invariably moved to a
modular program. It's actually pretty easy to move from sequential
scripts to user defined functions to user defined modules - mostly a
matter of cut and paste.
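
As a minimal sketch of the middle step described above (repeated statements moved into a user-defined function), assuming a hypothetical normalise() helper that is not from the original post:

```perl
use strict;
use warnings;

# Before: the same trim-and-uppercase statements were repeated inline
# for every value. After: they live in one user-defined function.
# normalise() is a hypothetical name chosen for illustration.
sub normalise {
    my ($text) = @_;
    $text =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace
    return uc $text;
}

# The main program is now mostly a series of calls to that function.
my @names = map { normalise($_) } ('  alice ', 'bob  ');
print join(',', @names), "\n";
```

The later step, moving such functions into a module, would amount to cutting the sub into its own .pm file under a package name and loading it with use.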

I haven't written any OO Perl, but I have written Java, and I find the
Java development style completely different. With Java, I will
decompose the requirements into classes, write the test cases for each
of the classes, and then write the classes. When the classes behave as
they ought, I'll write the main application. In essence, this turns
the development process I use with Perl on its head.

Interestingly, I've been playing with Lisp for several years, and find
myself writing Lisp a couple of hours a day now. The typical
development style in Lisp is writing and testing the functions that
you use to compose larger structures. The OO development in Lisp also
follows a similar pattern. Since Lisp developers use an interactive
development environment, this style becomes natural and the programs
'grow' more or less in an ad hoc manner (which causes neurotic
reactions in software engineers). Perl 6 supposedly has an interactive
development environment as well, and I'll bet that the Perl community
as a whole will gravitate toward this and away from the typical C
style of designing and developing a program.

CC.

John Bokma · Dec 10, 2009

Sir Robert Burbridge said:
That makes packaging it pretty trivial when it grows past
that. Sometimes putting a couple minutes of thought into it first
shows me that it should *start out* modularized. In that case I still
usually do it all in the same file, until it's time to break it out.

Same here

Jürgen Exner · Dec 10, 2009

Justin C said:
I keep writing programs that (for me at least) start getting large. I
know they're small, even by the standard of many of the modules that
I've seen, but how do you decide when to break down your code and
modularise?

Do you only create modules if you think the code is likely to be
re-used,

No.

and if it's not then if the program has 10k lines then so
be it?

And no.

I find myself bashing away at the keyboard, and the next thing I know is
that I've got 1000 lines and keeping track of what's where starts
getting complicated (I do love an editor that can fold!). I was just
wondering how others program.

That's called spaghetti programming and it is very poor style.

The size of a program has pretty much nothing to do with modules.
The purpose of a module is not to reduce the physical size of a program.
Having a module "Lines5000to10000" obviously doesn't make any sense.

Instead a module encapsulates and abstracts something, e.g. a data
structure or a set of similar functions or something along that line.

Example: I know that my program will use balanced trees. So I create a
module to manage balanced trees and it will contain the typical
functions on such data structures like create item, add element, delete
element, traverse data structure (map or apply-all), and so on.

Same for, let's say, geometry. If I write a program that heavily relies on
geometric functions then I create a module to contain all those
functions for surface area of a cylinder and sphere, volume of a cone
and cube, and perimeter of a triangle.

Neither has anything at all to do with the size of the program.
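
The geometry example above might look roughly like the sketch below. The package name Geometry and all function names are assumptions made for illustration; in practice the package would live in its own Geometry.pm file rather than inline.

```perl
use strict;
use warnings;

package Geometry;    # hypothetical module name

use constant PI => 4 * atan2(1, 1);

# Surface area of a sphere with radius $r.
sub sphere_surface_area {
    my ($r) = @_;
    return 4 * PI * $r**2;
}

# Surface area of a closed cylinder with radius $r and height $h.
sub cylinder_surface_area {
    my ($r, $h) = @_;
    return 2 * PI * $r * ($r + $h);
}

# Volume of a cone with base radius $r and height $h.
sub cone_volume {
    my ($r, $h) = @_;
    return PI * $r**2 * $h / 3;
}

# Volume of a cube with the given side length.
sub cube_volume {
    my ($side) = @_;
    return $side**3;
}

# Perimeter of a triangle given its three side lengths.
sub triangle_perimeter {
    my ($x, $y, $z) = @_;
    return $x + $y + $z;
}

package main;

printf "%d\n", Geometry::cube_volume(2);
printf "%d\n", Geometry::triangle_perimeter(3, 4, 5);
```

The module groups functions by what they are about, not by how many lines they take up, which is the point being made here.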

Now, there is one exception: if the main program grows beyond a certain
size, then I go back and seriously question the basic design. Usually it
turns out that I made a mistake in the very early design stages and
forgot an abstraction layer or two and therefore the program grew beyond
being manageable.
Cure: go back to square two and add those abstraction layers, of course
each encapsulated in a module, and the program will shrink back to a
more sane size.

jue

ccc31807 · Dec 10, 2009

I don't know what you mean by 'Perl 6 has an interactive development
environment'.

http://perl6advent.wordpress.com/

Day 1,

CC.

Jürgen Exner · Dec 11, 2009

ccc31807 said:
From my experience, this is an incremental process.

I strongly disagree.

First, you write a small script sequentially. It consists mostly of a
series of statements with maybe a couple of blocks.

After a while as your script grows, you see certain patterns and
repetitions, so you move the similar statements to user defined
functions. Then, your program consists mostly of a series of calls to
user defined functions.

Later on, the number of user defined functions increases so you decide
to place functions with similar functionalities in files, which become
modules. Then, your program 'uses' your modules but still consists of
calls to functions.

That is a very poor way to write code and shows a major lack of proper
upfront design. Very often programs that have grown like that are
impossible to modularize because of random dependencies between
otherwise unrelated program parts. And for the same reason they are
extremely hard to maintain.

The largest monolithic program I've written probably contains several
hundred LOC. By the time you hit KLOCs, you have invariably moved to a
modular program. It's actually pretty easy to move from sequential
scripts to user defined functions to user defined modules - mostly a
matter of cut and paste.

Absolutely not! Yes, there are always reasons and excuses why
refactoring code may become necessary. But refactoring is expensive in
time and cost, not to mention error prone. And recommending it as a
standard way for program development is, well, shall we just say quite a
stretch.

When you get a task, maybe just sit down and think about it for a few
hours. There is nothing gained by starting to run without knowing in
which direction to run.

Example: If I have to implement a customer management tool, then
immediately I know I will need a data type customer with at least the
usual functions like create, modify, ....
And because there will be quite a few customers I will need some sort of
customer store, a collection of customers, again with at least the usual
functions like create store, add customer, remove customer, probably
something like subset of customers filtered by some condition, ...
Furthermore I know from the requirements that search on similarities is
important. So I will want to implement something to
classify/find/evaluate/judge similarities. And obviously this has to be
done on the level of strings as well as on the level of customers.

And thus before even writing the very first line of code I already
identified 4 modules that I will need. Maybe after evaluation I will
decide to merge customer and customer store into a single physical
module, but those are minor tweaks which don't change anything about the
big picture.
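
A rough sketch of two of those four modules, the customer type and the customer store, could look like this. The package names, method names and fields are assumptions for illustration only; the similarity modules are left out.

```perl
use strict;
use warnings;

package Customer;    # hypothetical data type

sub new {
    my ($class, %args) = @_;
    return bless { name => $args{name}, city => $args{city} }, $class;
}

sub name { return $_[0]{name} }
sub city { return $_[0]{city} }

package CustomerStore;    # hypothetical collection of customers

sub new {
    my ($class) = @_;
    return bless { customers => [] }, $class;
}

sub add {
    my ($self, $customer) = @_;
    push @{ $self->{customers} }, $customer;
    return $self;
}

# Remove every customer with the given name.
sub remove {
    my ($self, $name) = @_;
    @{ $self->{customers} } =
        grep { $_->name() ne $name } @{ $self->{customers} };
    return $self;
}

# Subset of customers for which the caller-supplied code reference is true.
sub filter {
    my ($self, $condition) = @_;
    return grep { $condition->($_) } @{ $self->{customers} };
}

package main;

my $store = CustomerStore->new;
$store->add(Customer->new(name => 'Ann', city => 'Bonn'));
$store->add(Customer->new(name => 'Bob', city => 'Kiel'));

my @in_bonn = $store->filter(sub { $_[0]->city() eq 'Bonn' });
print scalar(@in_bonn), "\n";
```

Whether the two packages end up in one file or two is the kind of minor tweak mentioned above; the interfaces stay the same either way.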

You don't create modules to reduce the size of your code, you do create
modules to encapsulate and manage layers of abstraction.

jue

ccc31807 · Dec 11, 2009

Right. That's a REPL. How do you do any meaningful development using
that? Once you've typed in a function there's no way to get it out of
the interpreter and into a file.

This may or may not make sense with Perl, but it makes perfect sense
with Lisp, Erlang, and similar languages. You commonly pass functions
as arguments to other functions, and return functions from functions,
and it's critical to ensure that the functions you pass as arguments
or return behave as they ought. Much of this can be trial and error,
or a process of optimization.

I'm not proficient in Lisp by any stretch, but I currently write Lisp
on a daily basis. When writing a function, I don't start in the
editor, but on the REPL. I play with it until I get it the way that I
want, then (in emacs) copy the sexp, switch to the file buffer, and
yank it.

For example, in chapter 3 of Practical Common Lisp, we have this
example:

(defun prompt-for-cd ()
  (make-cd
   (prompt-read "Title")
   (prompt-read "Artist")
   (or (parse-integer (prompt-read "Rating") :junk-allowed t) 0)
   (y-or-n-p "Ripped [y/n]: ")))

The function prompt-for-cd contains four other functions, make-cd,
prompt-read, parse-integer, and y-or-n-p. prompt-read is defined as

(defun prompt-read (prompt)
  (format *query-io* "~a: " prompt)
  (force-output *query-io*)
  (read-line *query-io*))

The REPL allows you to play with the functions until you get them
right. If you could imagine developing a Perl program in an
environment that allows you to write and test individual user defined
functions BEFORE INCLUDING THEM IN THE PROGRAM you would be close to
the idea. Also, this encourages your functions to be self-contained,
so you avoid many of the errors that can develop from dependencies
between functions.

I'm not advocating one language or style of development. There are
reasons that Perl is a lot more popular than Common Lisp, and Java a
lot more popular than Perl. I'm just saying, based on my personal
experience, that developing a program piece by piece, function by
function, by interactively testing them in the REPL, is efficient and
productive in certain ways.

CC.

Jürgen Exner · Dec 11, 2009

ccc31807 said:
This may or may not make sense with Perl, but it makes perfect sense
with Lisp, Erlang, and similar languages. You commonly pass functions
as arguments to other functions, and return functions from functions,

Sure, those are just HOFs, aka higher-order functions, aka functions as
first-class citizens. Nothing special about them.

and it's critical to ensure that the functions you pass as arguments
or return behave as they ought. Much of this can be trial and error,
or a process of optimization.

Aeeeehmmmm, no. There is nothing special about a function that is passed
as an argument, and you can design and develop and write them just like
any other function in your program. Perl supports HOFs in a somewhat
simplified form, e.g. as the comparison routine of
sort { my_cmp($a, $b) } @somearray or as the first parameter of
File::Find::find(\&wanted, @directories).
And there is absolutely nothing special about writing or testing
my_cmp() or wanted() and you certainly don't need any special tools to
do so.
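
A small sketch of both cases, assuming a hypothetical comparator by_length() and a wanted() callback that collects .pm files. The temporary directory and the file names exist only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Find ();
use File::Temp qw(tempdir);

# An ordinary named sub used as a sort comparator: shorter strings
# first, ties broken alphabetically. sort() sets $a and $b for it.
sub by_length {
    return length($a) <=> length($b) || $a cmp $b;
}

my @fruit  = qw(pear fig banana kiwi);
my @sorted = sort by_length @fruit;
print "@sorted\n";

# An ordinary named sub passed to File::Find::find() as a callback.
# find() sets $_ to the current file name before each call.
my @found;
sub wanted {
    push @found, $File::Find::name if /\.pm\z/;
}

# Create a scratch directory holding one .pm file and one other file.
my $dir = tempdir(CLEANUP => 1);
for my $file (qw(a.pm b.txt)) {
    open my $fh, '>', "$dir/$file" or die "Cannot create $file: $!";
    close $fh;
}

File::Find::find(\&wanted, $dir);
print scalar(@found), "\n";
```

Both by_length() and wanted() are plain subs, so they can be written and tested like any other function.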

jue

ccc31807 · Dec 11, 2009

I strongly disagree.

I'm a database guy, and my main responsibility is to munge data,
typically by running database queries and then processing the data. My
typical 'program' is between 50 and 100 lines. Most of the time, these
scripts are throw-aways. Occasionally, I'll develop a script that
grows and grows, but I usually can't tell in advance which is which.

That is a very poor way to write code and shows a major lack of proper
upfront design. Very often programs that have grown like that are
impossible to modularize because of random dependencies between
otherwise unrelated program parts. And for the same reason they are
extremely hard to maintain.

There really isn't a call for upfront design when you write a script
that inputs a CSV file 12 columns across, sorts it by three fields, and
outputs a PDF file with the sorted data. It's as easy to write through
it as to think through it, and a lot quicker.
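
For a sense of scale, the sorting part of such a script might be sketched as follows. The three-column sample data is made up, the split on commas assumes no quoted fields, and the PDF output step is omitted.

```perl
use strict;
use warnings;

# Hypothetical rows: program, adviser, student.
my @lines = (
    'Nursing,Smith,Zoe',
    'Biology,Jones,Max',
    'Nursing,Jones,Amy',
    'Biology,Jones,Al',
);

# Split each line into an array reference of fields.
my @rows = map { [ split /,/, $_ ] } @lines;

# Sort by the first field, then the second, then the third.
my @sorted = sort {
       $a->[0] cmp $b->[0]
    || $a->[1] cmp $b->[1]
    || $a->[2] cmp $b->[2]
} @rows;

print join(',', @$_), "\n" for @sorted;
```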

Absolutely not! Yes, there are always reasons and excuses why
refactoring code may become necessary. But refactoring is expensive in
time and cost, not to mention error prone. And recommending it as a
standard way for program development is, well, shall we just say quite a
stretch.

I didn't say that I refactored the code. What I said was 'copy and
paste' - by which I mean that I slap a sub <name> { ... } block around
the code and move it out of the sequence. By far most of my code is
for my own use, and while I will refactor code that I use over a
period of time, this isn't very common for me.

When you get a task, maybe just sit down and think about it for a few
hours. There is nothing gained by starting to run without knowing in
which direction to run.

When I get a task, my user expects the results quickly, and I'm
generally successfully completing the task in less than an hour. By
the time I write the database query, I have already thought through
the entire process -- you really have to in order to have the data in
a form most conducive for creating the output.

Example: If I have to implement a customer management tool, then
immediately I know I will need a data type customer with at least the
usual functions like create, modify, ....
And because there will be quite a few customers I will need some sort of
customer store, a collection of customers, again with at least the usual
functions like create store, add customer, remove customer, probably
something like subset of customers filtered by some condition, ...
Furthermore I know from the requirements that search on similarities is
important. So I will want to implement something to
classify/find/evaluate/judge similarities. And obviously this has to be
done on the level of strings as well as on the level of customers.

And thus before even writing the very first line of code I already
identified 4 modules that I will need. Maybe after evaluation I will
decide to merge customer and customer store into a single physical
module, but those are minor tweaks which don't change anything about the
big picture.

Example: give me a report of active students who have completed 30
hours without taking the required MATH and ENG classes or have not
completed their placement tests, and sort them by program and adviser.
Zero modules, and probably zero functions -- just one block of
monolithic code, probably about 50 lines, which would take maybe 15
minutes to write, and I'll most likely not use it again.

Perhaps twice a year I'll get an assignment like your example, and
typically I'll use Java for it, so yes, I ABSOLUTELY AGREE (!) that
thought and planning is necessary for a more complex project. However,
for me, Perl development is an incremental process that mostly doesn't
increment.

CC.

Xho Jingleheimerschmidt · Dec 11, 2009

Sir Robert Burbridge said:
That makes packaging it pretty trivial when it grows past that.
Sometimes putting a couple minutes of thought into it first shows me
that it should *start out* modularized.

I actually find this rather rare. If I can come up with a modular
design within a couple minutes, then almost certainly those modules have
already been thought of and already exist on CPAN, and I should use
those rather than writing my own. When a module doesn't already exist,
I've found that dashing out to write one based on a few minutes of
hubris and navel gazing is not the most propitious start.

Xho

Jürgen Exner · Dec 11, 2009

ccc31807 said:
Example: give me a report of active students who have completed 30
hours without taking the required MATH and ENG classes or have not
completed their placement tests, and sort them by program and adviser.

You don't need a program for that! That's just one SQL statement (which
may take a few minutes to think about) and then an output to PDF if that
is required. The only stumbling block might be to find a converter that
converts the SQL response into PDF. Otherwise it's at most a one-liner
in any shell, no need for Perl.

If you find yourself doing those queries often it would probably pay off
to create a generic tool (with a nice GUI for the not-so-gifted) such
that your users can easily create their own reports.

jue

ccc31807 · Dec 11, 2009

You don't need a program for that! That's just one SQL statement (which
may take a few minutes to think about) and then an output to PDF if that
is required.

JUE:

I think you missed my point. Yes, in many cases I can open the dataset
in Excel directly, save it as an Excel file, and be done with it.

My point was that you don't need to spend several hours thinking about
a script that only takes a few minutes to write. An electrician
doesn't need to draw a schematic to change a light fixture, and a
programmer doesn't need to generate UML diagrams for a simple script.

The other point which we haven't addressed is requirements. In my
case, I write a script to produce a report WITHOUT any requirements
specification. Over a period of time, we have requirements creep, in
dribs and drabs. The 'easy' approach is to use the script that you
used last time and tack on a few lines to satisfy the new
requirements. Many of the largest (in LOCs) scripts I maintain started
off with a very simple request that got added to over the years. How
do you think and plan about specifications that you don't know exist?

I've had several graduate level courses in OO programming, SwE, QA, and
the like, and have used my share of tools (Rational Rose, Eclipse,
Visual Studio, etc.) to generate very nice UML, and generate skeletal
code based on UML diagrams. If I were a professional programmer, I
would certainly use these tools to automate development. This isn't
what the OP asked.

I fully agree with you that planning is extremely important. As the
old saw goes, fifty percent of SW development is requirements
engineering, and the other fifty percent is testing and debugging. ;-)
However, the OP focused on the zero percent of SW development that
lies between specification and testing. Just because my experience is
different from yours doesn't mean that the lessons I have learned by
experience are wrong.

CC.

Peter J. Holzer · Dec 13, 2009

You can do this with Perl as well, of course, and with function-based
modules just as well as with OO modules. The key is to separate out
pieces of your problem that can be solved (and tested) as individual,
well-defined problems on their own.

And perl even comes with a testing framework and everybody who has ever
installed a module from CPAN has seen it in action - so I'd argue that
Perl is one of the languages which very much encourage this type of
development.
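
As a minimal sketch of that style using Test::More, which ships with Perl: the function initials() is a hypothetical stand-in for a small, well-defined piece split out of a larger script.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: the upper-cased first letter of
# each whitespace-separated word.
sub initials {
    my ($name) = @_;
    return join '', map { uc substr($_, 0, 1) } split ' ', $name;
}

is(initials('peter j holzer'), 'PJH', 'three words');
is(initials('xho'),            'X',   'single word');
is(initials(''),               '',    'empty string gives empty result');

done_testing();
```

In a real distribution the tests would normally sit in a t/ directory and be run with prove, the same machinery seen when installing a CPAN module.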

hp

Martijn Lievaart · Dec 14, 2009

very interesting, it's certainly made me think differently about what
I'm working on at the moment, and that I really should have created a
function (or three) for it - I can see how that would have helped, but I
think I've gone too far with this to go back... but for next time.

Refactoring your code if you get it wrong is a basic part of writing
code. Don't put it off, just do it. Most of the time it actually pays back
before you are finished with the first release; if not, it definitely
pays off when writing release two!

(I'm currently on a v2 where I wish I had taken my own advice...)

M4

Jochen Lehmeier · Dec 14, 2009

Refactoring your code if you get it wrong is a basic part of writing
code. Don't put it off, just do it.

.... while running your unit tests frequently.

Peter J. Holzer · Dec 14, 2009

... while running your unit tests frequently.

Refactoring is a good opportunity to add unit tests if you don't have
them already. Before you start rewriting code ask yourself what that
code should do. Then write tests which test your assumptions. Then start
rewriting the code.

Devel::Cover helps you to find out whether the code you are rewriting is
already covered by tests.

hp
