Is there a "Large Scale Python Software Design" ?

Andrea Griffini

I did it.

I proposed python as the main language for our next CAD/CAM
software because I think that it has all the potential needed
for it. I'm not sure yet if the decision will get through, but
something I'll need in this case is some experience-based set
of rules about how to use python in this context.

For example... is defining readonly attributes in classes
worth the hassle ? Does duck-typing scale well in complex
software or should I go for a classic inheritance hierarchy ?

In other words... is there something like the classic "Large
Scale C++ Software Design" (Lakos) for python ? I'm not
looking for a bible, but lessons learned from someone that
already went down this path could be quite interesting.

Any suggestions/pointers are welcome.

Andrea
 
Jonathan Ellis

Andrea said:
I did it.

I proposed python as the main language for our next CAD/CAM
software because I think that it has all the potential needed
for it. I'm not sure yet if the decision will get through, but
something I'll need in this case is some experience-based set
of rules about how to use python in this context.

For example... is defining readonly attributes in classes
worth the hassle ? Does duck-typing scale well in complex
software or should I go for a classic inheritance hierarchy ?

In other words... is there something like the classic "Large
Scale C++ Software Design" (Lakos) for python ? I'm not
looking for a bible, but lessons learned from someone that
already went down this path could be quite interesting.

Wouldn't it have been better to ask these questions BEFORE proposing
python as (presumably) a Great Solution? IMO, as great as python is,
it isn't appropriate for projects that are large and include many
developers.

The benefits of static typing, not least among which is the vastly
superior ease of creating tools that "understand" the language,
outweigh python's advantages in an environment when many people are
writing a lot of code. This can be mitigated by reducing the
connectedness of your code, e.g. with a plugin architecture, but that
isn't always an option either...

Good luck.

-Jonathan
 
Carlos Ribeiro

Jonathan said:
The benefits of static typing, not least among which is the vastly
superior ease of creating tools that "understand" the language,
outweigh python's advantages in an environment when many people are
writing a lot of code. This can be mitigated by reducing the
connectedness of your code, e.g. with a plugin architecture, but that
isn't always an option either...

On principle, I disagree with this statement. Doing large scale
development using Python certainly isn't the same thing as doing it
with another language - C, C++ or Java, for instance. It will require
a different approach to the problem, and perhaps a particular set of
tools and disciplines to help with the process. But I don't think that
static typing represents such a great advantage in itself as to make
Python badly suited to the problem, because there are many aspects to
it, and Python has its own advantages too. Give it a solid design,
leveraging Python's particular strengths, and the end result has the
potential to be a positive surprise. But again, that's just my opinion,
and I'm not the best person around to make a definitive claim on it
:)


--
Carlos Ribeiro
Consultoria em Projetos
blog: http://rascunhosrotos.blogspot.com
blog: http://pythonnotes.blogspot.com
 
Peter L Hansen

Jonathan said:
Wouldn't it have been better to ask these questions BEFORE proposing
python as (presumably) a Great Solution? IMO, as great as python is,
it isn't appropriate for projects that are large and include many
developers.

I don't know what Jonathan's experience with using Python in
large teams and projects is, but mine includes four years
as Director of Software Engineering at a wireless tech company
and a team that ran between ten and fifteen people, and a very
large amount of code. We found Python to be *very* appropriate
for this and of course anything smaller.

Jonathan said:
The benefits of static typing, not least among which is the vastly
superior ease of creating tools that "understand" the language,
outweigh python's advantages in an environment when many people are
writing a lot of code.

While it appears true that it is easier to develop certain
tools for statically typed languages, it's not at all apparent
that this small benefit outweighs the very significant advantages
that Python brings to large-scale development, and to large-team
development. I'll add "especially when using test-driven
development and any agile process", and to be perfectly honest
I'm not sure I would recommend Python nearly as strongly if one
was forced to use a traditional, non-agile approach to the work.

My past posts on the subject have covered this a number of times.
I have to admit I haven't seen anything from Jonathan on this
topic, so I can't say how his experience compares with mine, nor
why he would feel the way he does.

-Peter
 
Josiah Carlson

Andrea said:
I proposed python as the main language for our next CAD/CAM
software because I think that it has all the potential needed
for it. I'm not sure yet if the decision will get through, but
something I'll need in this case is some experience-based set
of rules about how to use python in this context.

I know of 2 startups who have decided to construct similar software in
Python, due to the fact that they can build entire packages in a year
with a small, but experienced, development team. At least one
of them is funded in the tens-of-millions of dollars range by a
half-dozen automotive and aerospace companies.

Jonathan said:
Wouldn't it have been better to ask these questions BEFORE proposing
python as (presumably) a Great Solution? IMO, as great as python is,
it isn't appropriate for projects that are large and include many
developers.

Having recently released a piece of software with 10k lines of Python
running in its backend as a core technology, and being paid for it, I
will say that Python was and is the best tool for the job. A C version
would have been at least 4-10 times as many lines, and we wouldn't be
releasing ~3 months after starting with nearly the confidence we are now.


In terms of developers, some projects require more than one developer,
and in that sense, Python works as well as other languages: planning is
key.

- Josiah
 
Dave Brueck

Jonathan said:
Wouldn't it have been better to ask these questions BEFORE proposing
python as (presumably) a Great Solution? IMO, as great as python is,
it isn't appropriate for projects that are large and include many
developers.

The OP would be well-advised to search the Google archives of c.l.py as many
(myself included) take the contrarian view - as the project grows in size it is
harder to justify going with "classic" languages like C++, or even Java - the
associated costs at each stage of the project are relatively larger to begin
with, and grow more quickly as well.

Jonathan said:
The benefits of static typing, not least among which is the vastly
superior ease of creating tools that "understand" the language,
outweigh python's advantages in an environment when many people are
writing a lot of code.

I'm not so sure - how much of the benefit those "smart" tools provide goes to
helping the developer manage complexity caused by the language itself? It seems
that often (not always, of course) a lot of what they do is help the programmer
manage oodles of little details that the programmer ought not be burdened with
in the first place, _especially_ on large projects.

What specifically do you see breaking down if Python is used in a project with
lots of people? From working on large projects with lots of people, I've noticed
that projects naturally get divided into components as different teams work on
them, regardless of the language (so for any given piece of code, the percentage
of the total programmers touching that piece of code drops, not rises, as the
total size of the development staff goes up). Again, regardless of language,
large projects & teams almost force well-defined interface points between
various components - I don't see how Python would be any hindrance at all.

On the plus side, projects implemented in higher level languages grow more
slowly (and thus become unmanageable more slowly) than would projects
implemented in lower-level languages. The list goes on and on - I've found
Python components generally easier to test than, say, C++ components. It's also
easier for more people to comprehend more of the code (and, in turn, more of the
implications of decisions), etc., etc.

-Dave
 
Stephen Waterbury

Andrea said:
I proposed python as the main language for our next CAD/CAM
software because I think that it has all the potential needed
for it.

I agree, even without knowing the intended scope. ;)
Speaking of scope, if you are allowed to divulge it, that
would be interesting to know. Will it be 2D or 3D (3D I
would assume), and what kind of geometry engine?
Probably one of the open-source ones that already have
a Python API, no?

If it is 3D, a very desirable feature would be STEP
(ISO 10303) geometry import/export, so that you will be able
to exchange CAD data with virtually any commercial CAD
tool, and some open source ones (such as OpenCascade).
That will greatly increase its chance of adoption by
experienced CAD users, who typically have existing
libraries of CAD designs created using a COTS CAD tool.
(This is even more useful if you are planning to support
assemblies of components -- which might even be the most
logical initial feature for a new Python-based CAD/CAM,
since assemblies could be manipulated even without having
native geometric-form-creation capabilities: all you
would need is rendering, orientation, and interfacing
of existing solids -- a.k.a., "parts".)

If you have access to a license for ABAQUS, I recently
discovered that they have implemented a Python API for their
FEA engine, and have implemented STEP geometry as well.
See: http://www.abaqus.com/PAPortal

Andrea said:
... I'm not sure yet if the decision will get through, but
something I'll need in this case is some experience-based set
of rules about how to use python in this context.

For example... is defining readonly attributes in classes
worth the hassle ? Does duck-typing scale well in complex
software or should I go for a classic inheritance hierarchy ?

For something as complex as CAD/CAM, you will probably want to
make maximum use of interfaces and adaptors, with minimal and
very judicious application of classic inheritance hierarchies.
I am *not* an expert on interfaces and adapters, but several
of the gurus on this list are.

Since you will probably want to do lots of prototyping, you
can probably delay decisions about matters such as read-only
attributes until your API has stabilized somewhat.
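
To illustrate why that decision can wait: in Python a plain attribute can later be turned into a read-only one without changing the code that reads it. A minimal sketch, assuming a hypothetical `Point` class (the name and its fields are invented for illustration):

```python
class Point:
    """Hypothetical CAD point whose coordinates are read-only after creation."""

    def __init__(self, x, y):
        # Values live in "private" attributes; callers only see the properties.
        self._x = x
        self._y = y

    @property
    def x(self):
        return self._x

    @property
    def y(self):
        return self._y


p = Point(1.0, 2.0)

# Reading works exactly as it would with a plain attribute.
x_value = p.x

# Assigning raises AttributeError because the property defines no setter.
try:
    p.x = 5.0
except AttributeError:
    blocked = True
else:
    blocked = False
```

Callers that only read `p.x` are unaffected by the switch, which is why the choice can be postponed until the API settles.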

Keep us posted on your progress.

Cheers,
Steve
 
Jonathan Ellis

Josiah said:
Having recently released a piece of software with 10k lines of Python
running in its backend as a core technology, and being paid for it, I
will say that Python was and is the best tool for the job. A C version
would have been at least 4-10 times as many lines, and we wouldn't be
releasing ~3 months after starting with nearly the confidence we are
now.

Heh. "Large" depends on a lot of things, particularly connectedness,
but I really can't picture 10k being large under any circumstances.
-Jonathan
 
Jonathan Ellis

Peter said:
While it appears true that it is easier to develop certain
tools for statically typed languages, it's not at all apparent
that this small benefit outweighs the very significant advantages
that Python brings to large-scale development, and to large-team
development.

Almost four years ago I started working at a company with about 500
kloc of Java code. Thanks largely to tool support I was able to get in
and start fixing bugs my first day (this is without significant prior
Java experience). A more-experienced co-worker pointed me in the right
direction, and the IDE did the rest. ("Find definition," "Find
references.") Grep can do much the same thing, but painfully slowly --
and inaccurately, when you have a bunch of interfaces implementing the
same method names. Even after years in the codebase, I still used
these heavily; the codebase grew to about 800 kloc during the 3 years I
worked there. Developers came and went; even if my memory were good
enough to remember all the code _I_ ever wrote, I'd still have to
periodically repeat the familiarization process with code written by
others.

I haven't jumped into a project of similar size with python, but the
tool support for this approach to working with a large codebase just
isn't there, and I haven't seen any convincing arguments that
alternative methodologies are enough better to make up for this.

Peter said:
I'll add "especially when using test-driven
development and any agile process", and to be perfectly honest
I'm not sure I would recommend Python nearly as strongly if one
was forced to use a traditional, non-agile approach to the work.

Testing is good; preventing entire classes of errors from ever
happening at all is better, particularly when you get large. Avoiding
connectedness helps, but that's not always possible.

-Jonathan
 
Brad Tilley

Jonathan said:
Heh. "Large" depends on a lot of things, particularly connectedness,
but I really can't picture 10k being large under any circumstances.
-Jonathan

It's large to me. Most sys-admin scripts/programs never cross 1K or 2K
at the most. And Python works well for sys-admin tasks.
 
Alex Martelli

Dave Brueck said:
The OP would be well-advised to search the Google archives of c.l.py as
many (myself included) take the contrarian view - as the project grows in
size it is harder to justify going with "classic" languages like C++, or
even Java - the associated costs at each stage of the project are
relatively larger to begin with, and grow more quickly as well.

I entirely agree with you, Dave. Moreover, I do have a mass of growing
but as-yet-unorganized notes, based mostly on experiences on large
projects I have consulted for or even been very intimately connected
with, showing why Python is superior to various plausible alternatives
(for various and different reasons in each case) for large-scale
software development, and what principles, practices and patterns best
enable teams in various conditions to actualize those advantages.

That is the book I want to write, the one I have always wanted to write;
the Nutshell and the Cookbook (and now their second editions) keep
delaying that plan, but, in a sense, that's good, because I keep
accumulating useful experiences to enrich those notes, and Python keeps
growing (particularly but not exclusively in terms of third-party
extensions and tools) in ways that refine and sometimes indeed redefine
some key aspects. To give a simple technical example: I used to have
substantial caveats in those notes cautioning readers to use multiple
inheritance in an extremely sparing, cautious way, due to traps and
pitfalls that made it fragile. Nowadays, with the advent of 2.3, most
of those traps and pitfalls have gone away (in the newstyle object
model), to the point that the whole issue can be reconsidered.
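
As a small illustration of the behaviour referred to here, the sketch below shows the method resolution order that new-style classes use (the C3 linearization adopted in 2.3) on a "diamond" hierarchy. It is written in present-day Python 3 syntax, where every class is new-style; the class names are invented for the example.

```python
class Base:
    def describe(self):
        return ["Base"]


class Left(Base):
    def describe(self):
        # super() delegates to the next class in the MRO, not simply to Base.
        return ["Left"] + super().describe()


class Right(Base):
    def describe(self):
        return ["Right"] + super().describe()


class Both(Left, Right):
    def describe(self):
        return ["Both"] + super().describe()


# The linearized order in which attribute lookup visits the classes.
mro_names = [cls.__name__ for cls in Both.__mro__]

# Each describe() runs exactly once, in MRO order.
calls = Both().describe()
```

Here `Base.describe` is reached once, after both `Left` and `Right`, rather than once per inheritance path.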

Anybody who has written serious technical books can gauge the amount of
work it takes to turn "a mass of yet-unorganized notes" into a real
book: it's _staggering_. I can't seriously undertake the task of making
my copious notes into a book until I can consider devoting at least half
of my time to it for a year -- this means no other books in the making,
_and_ a reduction in the amount of consulting, teaching, mentoring, etc,
that I do. The biggest general issue is that a book cannot be
_interactive_, _customized_ to the specific skills and interests of a
reader, in the way in which I can customize interactively the kind of
hands-on teaching, mentoring and consulting which I do for a specific
customer.

For a given customer, I can and do find out what kinds of areas they
believe their large projects will cover, what skills their people start
with (and what skills can they expect other people to start with in the
future, depending on expected turnover), the "political" and "social"
dynamics of the team -- is the kind of "customer involvement" that's the
crux of Extreme Programming wrt other kinds of Agile Development
feasible at all, at what cost, etc, for example -- and so on. I can
avoid spending substantial time and energy on issues which don't matter
to project A even though they may be crucial to most large projects --
believe it or not, SOME projects need no networking, others will never
directly interface to a relational database, etc, etc, even though these
days 9 large projects out of 10 will need to deal with both kinds of
issues; and GUI issues, especially for large projects which mostly deal
with web interfacing vs others which will need traditional GUIs, can be
even more divergent. And this amount of variety is just for the
_technical_ issues; the political/social/business-plan ones, people's
skills and backgrounds, etc, are even more diverse...

To make a book, I will have to find an organization that works for busy
readers who don't have the time or patience to read through long parts
connected to database issues if they're on one of the few projects that
don't care about databases, and so forth -- structure sections,
chapters, appendices, footnotes, sidebars, ... so that skimming or
skipping the "don't care about it right now" parts can work; find a way
to reach that part of the audience that has never really undertaken a
large project before, or has played in such projects the role of a "cog"
without a clear picture of the whole structure, _as well as_ all the lead
architects and tech-savvy project managers.

Lakos did manage, and I admire him immensely for that. Robert Martin
has also done great work, though his books, while good, are (IMHO) never
_quite_ as excellent as his superb essays (don't get me wrong: I wish I
was half as good as Uncle Bob!-). Eric Raymond's "Art of Unix
Programming" is one of the most useful books for would-be architects of
software systems that I've ever laid my paws on -- I rate it as close to
the Mythical Man-Month, Design Patterns, Programming Pearls, and a few
of the many recent books on Extreme and other Agile methods (my personal
favorites of the crop are Scott Ambler's and Kent Beck's books).

However, none of these excellent books really addresses the questions
specific to the architecture, design, and development practices that
work best for dynamic VHLLs, and specifically for Python; so, I do
believe the book I dream to write is still needed (even though I might
be a grandfather by the time I'm done with it;-).

Meanwhile, to people and firms which aren't interested in retaining my
professional services, the best advice I can give -- after that of
studying the various books I have mentioned above (as well as good
Python books -- I like my own, but then, of course, I'm biased; I'd also
suggest others, such as Holden's, Pilgrim's, Hetland's, ...) -- is to
try something like:
<http://groups.google.com/groups?safe=images&as_ugroup=*python*&as_uauthors=alex%20martelli&lr=&hl=en>
as well as similar searches for the many other authors that contribute
so validly to the Python discussions, of course.

Somewhere or other, in my 8190 posts found by the above Google Groups
search, I have expressed (often more than once, and with different
nuances depending on the exact subject, apparent skills and interests of
other discussants, etc; as well as sometimes based on my changing ideas
on some sub-issue, or changes in Python and other tools and
technologies) a majority of the issues that I touch upon in that "mass
of notes". Of course, the stuff is yet more disorganized than said
notes; however, it _is_ written to be read and hopefully understood by
others, while most of said notes are written essentially "to myself", to
remind me of the huge variety of things that may need to be covered
regarding the huge variety of facets that make up the subject "Large
Scale System Architecture, Design, and Development Practices with
Python". Moreover, a majority of the 8190 posts are undoubtedly dealing
with subjects that aren't really related to LSSADDPP. Hey, there's
_got_ to be some advantage in retaining me, or reading my hopefully
future book, rather than combing through all my posts, no?-)

Seriously: one day I do hope to start putting up some parts of those
notes, mutated into intelligible text and organized into kind of
almost-essays, on my website -- fragments of said future book, but more
accessible and usable than the sheer morass of posts above-mentioned.
But don't hold your breath for _that_, any more than for the book; I've
been meaning to redo my site for _years_, and it just hasn't happened...
there's always something else that looks more interesting, either
intrinsically, and/or because of the little issue of money;-). Some
stuff (mostly presentations) you can find at www.strakt.com, which also
has important stuff written by Jacob Hallén and others.

People with a lot of important and interesting things to teach, who have
managed to do a much better job than me at organizing their stuff on the
web, include for example Fredrik Lundh and Marc-Andre Lemburg. The
latter gave an hour-long talk this summer at Europython on the subject.
Unfortunately I can't easily find his presentation on
www.europython.org, nor Fredrik's, but I'm sure that an abler searcher
than me will manage, and they do have their own websites as well. In
any case, I'm sure that either of them could be (and often is, in their
respective professional practices) at least as effective as a teacher,
consultant or mentor, on large-scale software projects in Python, as me;
and the same applies no doubt to many others. In fact, the Python world
is blessed, in my opinion, with quite a number of excellent people who
might fill such roles -- one more reason to consider Python for
large-scale, mission-critical development, in fact!!!-)


Alex
 
Alex Martelli

Stephen Waterbury said:
For something as complex as CAD/CAM, you will probably want to
make maximum use of interfaces and adaptors, with minimal and
very judicious application of classic inheritance hierarchies.
I am *not* an expert on interfaces and adapters, but several
of the gurus on this list are.

Heh -- funny enough, I did develop my ideas on protocol adaptation
mostly while working in the CAD area (as Senior Software Consultant to
what used to be Cad.Lab, and is now Think3, for over 10 years).

Our main implementation language, over time, moved from Fortran to C,
then to C++ -- but we did have our own proprietary scripting language,
and a growing amount of applications' functionality was coded in that
higher-level language. Interfaces (formalized or not) were of course a
given -- in the last few years I was there (and later when I worked as a
consultant for them), as the firm had moved to Windows as the only
platform for its products, mostly COM interfaces among components
(earlier, we had tried Corba, Java, and less formalized ones). The
Gof4's Design Patterns, and Lakos' Large Scale C++ Software Design,
helped us crystallize our ideas and practices when they came out (I
devoured both avidly as soon as I could get my hands on them;-), but we
_had_ mostly gone that way already. But something was missing, and
Robert Martin's excellent essays (the Dependency Inversion Principle
first and foremost) helped BUT didn't quite solve that something...

Protocol Adaptation can, at least potentially. Try Eby's PyProtocols
for a taste (I may not agree with every one of Eby's design and
architectural choices, but nevertheless it seems to me that PyProtocols
is, today, the best implementation of Protocol Adaptation ideas).
Unfortunately, _that_ is when our choice of programming languages bit --
none of them, including our proprietary scripting language, had
introspection and dynamism enough to get anywhere near. Java perhaps
might, with much huffing and puffing, but we had put it aside after
extensive trials: too hard to interface our huge existing base of C++,
and rewriting stuff from C++ to Java would have been a nightmare without
templates (generic programming) in Java at the time -- even quite apart
from performance issues, the productivity gains with Java were not worth
the migration costs (for a single-platform software company, at least;
had we still been striving on multiple platforms, I guess it might have
been different:).

Python (as Eby's work shows, for example) is fully adequate for Protocol
Adaptation (as, no doubt, would other modern VHLLs!)...
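
To give a flavour of the idea, here is a deliberately tiny sketch of protocol adaptation. It is NOT PyProtocols' API: `register_adapter`, `adapt`, the `"bounding_box"` protocol name and the two classes are all hypothetical names made up for this illustration. It only shows the core move of asking for "this object, seen as protocol P" instead of checking its class.

```python
_adapters = {}  # maps (source type, protocol name) -> adapter factory


def register_adapter(source_type, protocol, factory):
    """Declare that `factory(obj)` makes a `source_type` satisfy `protocol`."""
    _adapters[(source_type, protocol)] = factory


def adapt(obj, protocol):
    """Return obj wrapped so that it satisfies `protocol`, or raise TypeError."""
    # Walk the MRO so adapters registered for a base class also apply.
    for cls in type(obj).__mro__:
        factory = _adapters.get((cls, protocol))
        if factory is not None:
            return factory(obj)
    raise TypeError(f"cannot adapt {type(obj).__name__} to {protocol}")


class Circle:
    def __init__(self, radius):
        self.radius = radius


class CircleAsBoundingBox:
    """Adapter exposing a Circle through the hypothetical 'bounding_box' protocol."""

    def __init__(self, circle):
        self._circle = circle

    def extents(self):
        r = self._circle.radius
        return (-r, -r, r, r)


register_adapter(Circle, "bounding_box", CircleAsBoundingBox)

# Client code asks for the protocol it needs and never inspects the class.
box = adapt(Circle(2.0), "bounding_box")
```

Note that `Circle` itself knows nothing about bounding boxes; the adapter is registered from the outside, which is where the introspection and dynamism come in.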


Alex
 
Stephen Waterbury

Jonathan said:
... A more-experienced co-worker pointed me in the right
direction, and the IDE did the rest. ("Find definition," "Find
references.") Grep can do much the same thing, but painfully slowly --
and inaccurately, when you have a bunch of interfaces implementing the
same method names. ...

Try "glimpse" (http://webglimpse.net) -- it uses a superset of
grep's arguments and can search large collections of files at
a single bound! Re-indexing takes a few seconds, but doesn't
need to be done unless there are major changes. The indexing
makes it considerably faster than grep (you can even read the
index into memory using glimpseserver, and then searches of
~100MB of files take a fraction of a second). The first thing
I do when using any large Python library is put a glimpse
index on it.
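
In the same spirit, a rough "find definition" that looks only at real `def` and `class` statements, and so skips hits in comments and strings that a text search would return, can be sketched with the standard `ast` module. This is a minimal sketch under stated assumptions: `find_definitions` and the sample source are invented for the example, and it does not resolve which of several same-named methods a given call refers to.

```python
import ast


def find_definitions(source, name, filename="<string>"):
    """Return (filename, line, kind) for each def/class named `name` in source."""
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            if node.name == name:
                kind = "class" if isinstance(node, ast.ClassDef) else "def"
                hits.append((filename, node.lineno, kind))
    # ast.walk gives no ordering guarantee, so sort by position.
    return sorted(hits)


sample = '''
class Mesh:
    def render(self):
        pass

def render(scene):
    # a mention of render() in a comment or string is not a definition
    return "render"
'''

results = find_definitions(sample, "render", "sample.py")
```

Both the method and the module-level function are reported, while the comment and the string literal are ignored.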

Steve
 
GerritM

Jonathan Ellis said:
Almost four years ago I started working at a company with about 500
kloc of Java code. Thanks largely to tool support I was able to get in
and start fixing bugs my first day (this is without significant prior
Java experience). A more-experienced co-worker pointed me in the right
direction, and the IDE did the rest. ("Find definition," "Find
references.") Grep can do much the same thing, but painfully slowly --
and inaccurately, when you have a bunch of interfaces implementing the
same method names. Even after years in the codebase, I still used
these heavily; the codebase grew to about 800 kloc during the 3 years I
worked there. Developers came and went; even if my memory were good
enough to remember all the code _I_ ever wrote, I'd still have to
periodically repeat the familiarization process with code written by
others.

The point you make is that good tooling is important. I worked 12 years ago
in a large Objective-C environment. The same static versus dynamic wars were
raging at that time (Objective-C vs C++). I fully agree that good tools make
quite a difference. Most often very simple tools can do wonders. The dynamic
nature of Objective-C also made dynamic tools feasible, with an amazingly
small extension. The run-time instrumentation proved at least as powerful
as the compile-time tools. Nowadays the same code is ported to Java, but
unfortunately the same powerful instrumentation is lost.

Contrary to your belief, I would jump into larger scale Python development
without hesitation. However, I would introduce a few naming conventions to
support the static tool part.

kind regards, Gerrit
<www.extra.research.philips.com/natlab/sysarch/>
 
Josiah Carlson

Jonathan said:
Heh. "Large" depends on a lot of things, particularly connectedness,
but I really can't picture 10k being large under any circumstances.

Ok, so what is large? How many orders of magnitude larger than 10k
lines does it take for a piece of software to be large? And why should
you be the judge?

I'd let it slip to medium, but I wouldn't say that the project was small.
Small is something you can do in a weekend because you've been putting
it off. Small is something a newb to the language can do in a week
while they are learning the language.

- Josiah
 
Dave Brueck

Josiah said:
Ok, so what is large? How many orders of magnitude larger than 10k
lines does it take for a piece of software to be large? And why should
you be the judge?

I think the only way to compare projects is from a user's or customer's
perspective - what functionality the application provides & its scope. Any
comparison involving lines of code or number of developers won't be reliable
unless other factors (especially implementation language & libraries) are held
semi-constant. For example, at one company I think the total was 1.1 or 1.2
million lines of code (all C++ & about 60-70 developers), and yet I have trouble
imagining how, if I could go back and do it again in Python, it'd take even 200k
lines of code (and the riskier side of me feels it'd come in at under 100k - it
just didn't _do_ a lot despite all that code!)

In that sense, a 10k Python app can be fairly large in terms of end-user
functionality. For example, our main product where I work consists of *many*
different custom servers, a full web-based administrative interface, an end-user
web interface, a client application that does all sorts of interaction with the
servers, and lots of database interaction. Add to this many internal tools,
integration tools we provide to our customers, etc., and I would rate it overall
as on the upper end of medium-sized projects, functionality-wise - not the
largest I've worked on but well beyond any definition of small, and our plans
for the next few quarters will definitely push it into the range of what I'd
normally consider a large system. IIRC we're only in the 10k-20k for lines of
Python code, plus a few modules here and there being C++.

Having said all that, I've found that competitors in our same space tend to have
20-30 developers on the low end to over 100 on the high-end, while we have but a
handful. We don't have quite the same breadth of functionality - at least not
yet - but we generally make up for it by accounting for it architecturally but
not adding it until a customer actually needs it (a sort of JIT approach to
development). As such we've been able to compete head-to-head with others in the
same sector. On more than one occasion I've wondered aloud how so many
developers working for Competitor X can stay busy, and I can only imagine how
many lines of code they're churning out - and yet, from a functionality
perspective we're keeping pace. I also wonder how many hours a day they spend in
meetings trying to coordinate everything. Ugh.

Back to the point at hand: a project using a higher-level language gets out of
hand more slowly; if there were no other advantage it'd still be a "win" IMO
because you encounter "big project" problems a lot later - and that's a huge
benefit in and of itself.
 
Dave Brueck

Andreas said:
What classes of errors are completely avoided by "static typing" as
implemented by C++ (Java)?

I'm curious as well, because from what I've seen, the classes of errors "caught"
are (1) a subset of the higher-level (e.g. algorithmic and corner-case) errors
caught by good testing anyway, (2) much more common in code written by
lazy/underexperienced developers who are already considered a liability, and (3)
caused in part by complexities introduced by the language itself*.

More modern/advanced static type systems that let you actually get into the
semantics of the program (as opposed to just deciding which predefined type
bucket your data fits in) may help, but IMO the jury's still out on them (partly
due to complexity, and partly due to _when_ in the development process they must
be defined - perhaps that's the root problem of some static type systems - they
make you declare intent and semantics when you know the _least_ about them!
Consider the parallels to available knowledge in compile-time versus run-time
optimizations).

-Dave

* A trivial example: When programmers need to count something, rarely do they
care about unsigned vs signed or short vs normal vs long vs longlong, and yet in
something like C++ they are _constantly_ making this decision.
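
For contrast, a minimal Python sketch of the same counting task. The helper name count_items is made up for illustration; the only point is that the counter has a single integer type with no width or signedness to choose, since Python integers grow as needed.

```python
# Hypothetical helper: count items without picking an integer width.
def count_items(items):
    total = 0
    for _ in items:
        total += 1
    return total


# Python ints are arbitrary precision, so a counter can pass 2**64
# without overflowing or wrapping around.
big = 2**64
big += 1
```

The same code works whether the count ends up at 3 or well beyond what a 64-bit unsigned integer can hold.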

Another: in Java, every exception that can be thrown must be mentioned in the
code every step of the way - a maintenance nightmare, not to mention the utter
distraction during development.
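
A rough Python sketch of the exception point, for comparison. All names here (ConfigError, parse_setting, load_settings, main) are invented for the example. The intermediate function never mentions the exception, yet it still reaches the caller that chooses to handle it.

```python
class ConfigError(Exception):
    """Hypothetical error type used only for this illustration."""


def parse_setting(text):
    # Lowest level: the only place that raises.
    if "=" not in text:
        raise ConfigError(f"missing '=' in {text!r}")
    key, value = text.split("=", 1)
    return key.strip(), value.strip()


def load_settings(lines):
    # Middle level: no mention of ConfigError; it simply passes through.
    return dict(parse_setting(line) for line in lines)


def main(lines):
    # Top level: the one place that decides what to do about the error.
    try:
        return load_settings(lines)
    except ConfigError as exc:
        return {"error": str(exc)}
```

If parse_setting later starts raising a second kind of error, load_settings does not need to change.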
 
A

Andreas Kostyrka

Testing is good; preventing entire classes of errors from ever
happening at all is better, particularly when you get large. Avoiding
connectedness helps, but that's not always possible.

What classes of errors are completely avoided by "static typing" as
implemented by C++ (Java)? Just out of curiosity, because this is
usually stated as "true by axiomatic definition" in this kind of
discussion.

Andreas
 
A

Alex Martelli

C++'s casting power makes this a bit moot -- I have seen generally-good
developers (not quite comfy with C++, from a mostly-Fortran then a
little C background) mangle poor innocent rvalues (and even lvalues,
BION, with ample supplies of & and * to help) with such overpowering
hits of reinterpret_cast<> that I'm still queasy to think of it years
later. Java is mercifully a bit less powerful, but of course _its_
casts are generally runtime-checked. So, when one sees:

WhatAWonderfulWord w = (WhatAWonderfulWord) v;

one _IS_ admittedly inclined to think that the "class of error being
completely avoided" is "erroneous omission of a cast that plays no
useful role at all and is going to be checked only at runtime anyway".

However, there _are_ tiny but undeniable advantages to static typing:

1. some typos are caught at compiletime, rather than 2 seconds later by
unit tests -- 2 seconds ain't much, but it ain't 0 either;

2. simple-minded tools have an easier time offering such editing
services as "auto-completion", which may save a little typing;

3. simple-minded compilers have an easier time producing halfway
decent code;

and the like. None deal with "classes of errors completely avoided"
unless one thinks of unittests as an optional add-on and of compilers as
a mandatory must-have, which is wrong -- the point Robert Martin makes
excellently in his artima article about the wonders of dynamic typing of
a bit more than a year ago (dynamic typing is wonderful _with_ unit
testing, but then unit testing is an absolute must anyway, to
summarize).
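
Point 1 can be sketched in Python as follows. This is an illustration under assumptions, not code from the article: total_price, check_total_price and run are made-up names, and run is a bare-bones stand-in for a real test runner such as unittest. The misspelled name is not reported when the function is defined, only when a test actually executes that line.

```python
def total_price(prices, tax_rate):
    subtotal = sum(prices)
    # Deliberate typo: "subtotl". Nothing complains until this line runs.
    return subtotl * (1 + tax_rate)


def check_total_price():
    # Stand-in for a unit test that exercises the function.
    assert total_price([10, 20], 0.5) == 45


def run(check):
    # Tiny stand-in for a test runner: report the first failure as text.
    try:
        check()
    except Exception as exc:
        return f"FAIL: {type(exc).__name__}: {exc}"
    return "ok"
```

Calling run(check_total_price) reports a NameError for the misspelled variable, which is the "2 seconds later" feedback described above.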

I'm curious as well, because from what I've seen, the classes of errors
"caught" are (1) a subset of the higher-level (e.g. algorithmic and
corner-case) errors caught by good testing anyway,

Yes, undeniable.

(2) much more common in code written by
lazy/underexperienced developers who are already considered a liability,

No, I think you're wrong here. Typos are just as frequent for just
about all classes of coders, lazy or eager, experienced or not -- the
eager experienced ones often use faster typing (nothing to do with
static typing;-).

and (3)
caused in part by complexities introduced by the language itself*.

Yes, a fair cop. E.g., a typo in one of those redundant mentions of a
type or interface, seen above, is an error introduced only because I'm
required to type the GD thing twice over (though autocompletion may save
me some keystrokes;-).

More modern/advanced static type systems that let you actually get into
the semantics of the program (as opposed to just deciding which predefined
type bucket your data fits in) may help, but IMO the jury's still out on
them (partly due to complexity, and partly due to _when_ in the
development process they must be defined - perhaps that's the root problem
of some static type systems - they make you declare intent and semantics
when you know the _least_ about them! Consider the parallels to available
knowledge in compile-time versus run-time optimizations).

If you mean typesystems such as Haskell's or ML's, allowing extended
inference (and, in Haskell's case, the wonder of typeclasses), I think
you're being a bit unfair here. You can refactor your types and
typeclasses just as much as any other part of your code, so the "when
they must be defined" seems a bit of a red herring to me (unless you
have in mind other more advanced typesystems yet, in which case I'd like
some URL to read up on them -- TIA).

I think we agree at 95% to 99%, btw, I admit I'm just picking nits...


Alex
 
A

Alex Martelli

Josiah Carlson said:
Ok, so what is large? How many orders of magnitude larger than 10k
lines does it take for a piece of software to be large? And why should
you be the judge?

My definition of a large software system is: a system that cannot
sensibly be developed and maintained by just one developer, but requires
a team of developers. Among the factors defining where the boundaries
lie are such things as deployment issues (how many platforms, how
diverse), function points, analysis/requirements, etc, etc, but SLOC
(properly counted/normalized lines of code) are the main determinant.

For a reasonably experienced programmer, with decent tools, and without
hair-raising problems of deployment, optimization, continuous fast
changes to specs, etc, etc, 10k SLOC should be within the threshold of
"can be sensibly developed and maintained by one person"; 100k SLOC
won't be; the threshold is somewhere in-between. Of course, if you're
talking freshman programming trainees, or special problems of the
various sorts mentioned, the thresholds do shift downwards.

I'd let it slip to medium, but I wouldn't say that the project was small.
Small is something you can do in a weekend because you've been putting
it off. Small is something a newb to the language can do in a week
while they are learning the language.

OK, that's your definition of "small", I guess. I don't know that
there's a commonly accepted one. On the other hand, moving from a
project that can all fit in your head, one you can fully develop and
actively maintain by yourself, to a team situation, _is_ a crucial
threshold, as teams have such different strengths and problems from
individuals on their own; and the "Large Scale" moniker is typically
tagged onto projects requiring a team.

We can quibble about special cases (is a two-person team, with one of them
developing half-time and the rest of the time out selling the system,
comparable to a more typical case of 6-10 people working full-time on
development and maintenance of a system?), but that's always so for
taxonomies, and doesn't add much to the discussion IMHO.


Alex
 
