Fragile Fences

Roedy Green

I have been working with Eclipse and some large enum classes. It has
pounded home an issue of language design I have been raising ever
since the 1960s.


The symptom is this. I have a heavily nested structure of enum, enum
constants, methods, loops etc.

The beginning and end of every last one of these coding elements is
marked with the same vanilla { .. }.

When I am typing I can watch a million error messages appear or
disappear just by inserting or removing one } in some trivial if. The
compiler reinterprets the program in a totally different and
completely bizarre way from the most trivial change. Classes unnest
and renest for example.

From an engineering point of view, this structure is far too delicate.

If engineers designed buildings the way computer programmers design
computer languages, the first woodpecker that came along would destroy
civilisation.


I spend more time sorting out commas, () and {} than everything else put
together. This is especially true when constructing code by cut and
paste Frankenstein style from remotely similar code. My snippets may
or may not be balanced.



You want some way of permanently tying the begin and end marker
together so that the compiler knows ever after they are married
together. Any imbalance is then isolated to whether it is inside
or outside that pair. The compiler can then be much cleverer in
telling you exactly where the missing or extra { } is.

Even the formatter should understand this anchor pairing and hence can
deal more rationally with imperfect programs.
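
A rough Java sketch of that isolation idea follows. Everything in it is an assumption made for illustration: the Anchor record of "married" offsets is hypothetical, the names are invented, and braces inside strings or comments are not handled. It is not how Eclipse or any compiler works today.

```java
import java.util.List;

/** Sketch: narrow a brace imbalance down to the smallest anchored {..} pair. */
final class AnchoredFences {

    /** Hypothetical editor record: offsets of a { and } known to belong together. */
    static final class Anchor {
        final String label;
        final int open;
        final int close;

        Anchor(String label, int open, int close) {
            this.label = label;
            this.open = open;
            this.close = close;
        }
    }

    /** True if the braces in text[from, to) balance on their own. */
    static boolean balanced(String text, int from, int to) {
        int depth = 0;
        for (int i = from; i < to; i++) {
            char c = text.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth < 0) {
                return false; // a closer with no opener inside this region
            }
        }
        return depth == 0;
    }

    /** Label of the smallest anchored pair whose interior is unbalanced, or null if none is. */
    static String locate(String text, List<Anchor> anchors) {
        Anchor best = null;
        for (Anchor a : anchors) {
            boolean broken = !balanced(text, a.open + 1, a.close);
            boolean smaller = best == null || (a.close - a.open) < (best.close - best.open);
            if (broken && smaller) {
                best = a;
            }
        }
        return best == null ? null : best.label;
    }
}
```

Given anchors for the class and for each method, deleting the } of an if inside one method gets reported against that method, rather than as a vague complaint at the end of the file.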

Eclipse is a SCID so it could handle the problem in many ways:

1. Using variable size {}, or variable colour so you can tell
begin/end class, method and loop pairs apart. It is thus not enforcing anything,
just letting you know what sort of animal you are dealing with. You
won't casually delete a huge begin/end class marker.

2. Hover help that tells you what sort of "fence" ( { [ character you
are dealing with, e.g. "begin for", "start xxx method", "end xxx class". It
could also remind you where you are: in class X, in method y, nested n
deep.

This information could also appear as pseudo comments that are not
really in the code but just optionally generated that look like
comments when editing. They constantly automatically update to
describe the current structure.

3. Name your large loops and blocks, and have pseudo comments
generated that show you where the matching end is.

You might even have these comments optionally included in exported
source to help the unfortunate scidless maintain the code.

4. Allow you to freeze certain fence characters. They become read
only. This prevents an imbalance error from propagating past the fence
in either direction, and makes sure you never accidentally delete major
punctuation like class boundaries. By default method and class {} are
not deletable. Only the whole method or class is, or some of its
contents, but not { without } or vice versa.

Your program becomes like a set of rigid containers into which you can
pour code.

5. Refuse a paste that would unbalance class/method begin/end.
Refuse a delete or insert that would unbalance class/method begin/end,
maybe even block/loop too. Obviously you would have to make this
configurable for those who would consider this aid Nazi.

6. Perhaps {} must be considered non-characters. You must insert them
by highlighting a block and hitting a key to insert the {} at once
with no unbalanced {} permitted. Ditto when you remove one end it
removes the other at the same time, and optionally the block contents.
We need to start thinking of IDEs as data entry devices, not text
editors. Data entry would never let a data structure get globally
messy the way IDEs do. You are not typing an ASCII text. You are
entering a data structure. You really should be logically cutting and
pasting parsed trees, not text. Think about how a tree node editor
works.
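
Suggestion 5 above, refusing a paste that would unbalance the code, might look roughly like this in Java. This is only a sketch under simplifying assumptions: PasteGuard and its methods are invented names, the document is a plain String, and fence characters inside string literals or comments are not recognised.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Sketch of a paste guard: a snippet is accepted only if its fences balance. */
final class PasteGuard {

    /** True if every ( [ { in the snippet is closed by the matching ) ] } in the right order. */
    static boolean isBalanced(String snippet) {
        Deque<Character> open = new ArrayDeque<>();
        for (char c : snippet.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                open.push(c);
            } else if (c == ')' || c == ']' || c == '}') {
                char expected = c == ')' ? '(' : c == ']' ? '[' : '{';
                if (open.isEmpty() || open.pop() != expected) {
                    return false; // stray or mismatched closing fence
                }
            }
        }
        return open.isEmpty(); // anything left on the stack was never closed
    }

    /** Inserts the snippet at the offset, or refuses if the snippet is unbalanced. */
    static String paste(String document, int offset, String snippet) {
        if (!isBalanced(snippet)) {
            throw new IllegalArgumentException("snippet has unbalanced fences");
        }
        return document.substring(0, offset) + snippet + document.substring(offset);
    }
}
```

Because a balanced snippet inserted into a balanced document leaves the whole thing balanced, checking only the clipboard contents is enough. A configurable editor could downgrade the exception to a warning.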


--
Bush crime family lost/embezzled $3 trillion from Pentagon.
Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

Canadian Mind Products, Roedy Green.
See http://mindprod.com/iraq.html photos of Bush's war crimes
 
Lasse Reichstein Nielsen

Roedy Green said:
The symptom is this. I have a heavily nested structure of enum, enum
constants, methods, loops etc.

The beginning and end of every last one of these coding elements is
marked with the same vanilla { .. }.

When I am typing I can watch a million error messages appear or
disappear just by inserting or removing one } in some trivial if. The
compiler reinterprets the program in a totally different and
completely bizarre way from the most trivial change. Classes unnest
and renest for example.

However, if it was painfully obvious where an "}" was missing, then
you would not be surprised to see the errors. It is only because
there is no feasible way for a human to keep track of the pairing
that it becomes a problem.

In fact, the use of begin- and end-block markers is usually a
violation of the DRY (Don't Repeat Yourself) principle - the same
information (where blocks start and end) is present in two forms: The
markers for the compiler and the indentation for humans. If these get
out of sync, the compiler spews errors that are not easily visible in
the source.

(That is actually a reason for languages that use only indentation to
delimit blocks ... something I still have a hard time accepting :)

From an engineering point of view, this structure is far too delicate.

Absolutely. It can fail invisibly to the user ... until he tries to
compile.

If engineers designed buildings the way computer programmers design
computer languages, the first woodpecker that came along would destroy
civilisation.

(Ah, Weinberg's law :)

....
You want some way of permanently tying the begin and end marker
together so that the compiler knows ever after they are married
together. Any imbalance is then isolated to whether it is inside
or outside that pair. The compiler can then be much cleverer in
telling you exactly where the missing or extra { } is.

What I hear you say is that the concept of a block should be
abstracted out, so that the markers become simply that. There is no
such thing as half a block, so there should never be only an opening
marker. It's not so much the pairing of the { and }, but that they
are secondary to the block they delimit, artifacts of the real
concern.

1. Using variable size {}, or variable colour so you can tell
begin/end class, method and loop pairs apart. It is thus not enforcing anything,
just letting you know what sort of animal you are dealing with. You
won't casually delete a huge begin/end class marker.

That would be merely guiding, not enforcing, the pairing.

2. Hover help that tells you what sort of "fence" ( { [ character you
are dealing with, e.g. "begin for", "start xxx method", "end xxx class". It
could also remind you where you are: in class X, in method y, nested n
deep.

That could be implemented as a marker line in the margin, pointing out
the block that you are currently inside, as well as highlighting the
delimiting markers.

...
6. Perhaps {} must be considered non-characters. You must insert them
by highlighting a block and hitting a key to insert the {} at once
with no unbalanced {} permitted.

Hear, hear! There is no case where you will need a program with
unbalanced brackets, parentheses, square brackets, etc. They should
always be introduced and removed together.

Ditto when you remove one end it removes the other at the same time,
and optionally the block contents. We need to start thinking of
IDEs as data entry devices, not text editors. Data entry would never
let a data structure get globally messy the way IDEs do. You are not
typing an ASCII text. You are entering a data structure. You really
should be logically cutting and pasting parsed trees, not text.
Think about how a tree node editor works.

The only problem with this is the desire to copy half-written code
around, before filling in the blanks. Programmers do think in text,
not syntax trees (most of the time). But maybe that's exactly the
problem :)

/L
 
Stefan Schulz

If engineers designed buildings the way computer programmers design
computer languages, the first woodpecker that came along would destroy
civilisation.

Well, the better comparison would be an architect who just happens to
leave out a decimal point in some calculations... with "interesting"
results. You are after all not actually "building" your program so much as
specifying it.
 
Roedy Green

What I hear you say is that the concept of a block should be
abstracted out, so that the markers become simply that. There is no
such thing as half a block, so there should never be only an opening
marker. It's not so much the pairing of the { and }, but that they
are secondary to the block they delimit, artifacts of the real
concern.

Exactly. Internally to the SCID, blocks are a single node object
with dependent children for the stuff inside one level deep.

You need an editor that recognizes that.
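
To make the "block is a single node" idea concrete, here is a small Java sketch. It is not how Eclipse represents code internally; BlockTree, Block and Statement are made-up names, and the only point is that the braces are generated when the tree is rendered, so they cannot get out of step.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: blocks are nodes with children; braces exist only in the rendered text. */
final class BlockTree {

    interface Node {
        void render(StringBuilder out, int depth);
    }

    private static void indent(StringBuilder out, int depth) {
        for (int i = 0; i < depth; i++) {
            out.append("    ");
        }
    }

    /** A leaf: one line of code with no structure of its own. */
    static final class Statement implements Node {
        private final String text;

        Statement(String text) {
            this.text = text;
        }

        public void render(StringBuilder out, int depth) {
            indent(out, depth);
            out.append(text).append('\n');
        }
    }

    /** A class, method or loop: a header plus the nodes one level inside it. */
    static final class Block implements Node {
        private final String header;
        private final List<Node> children = new ArrayList<>();

        Block(String header) {
            this.header = header;
        }

        Block add(Node child) {
            children.add(child);
            return this;
        }

        public void render(StringBuilder out, int depth) {
            indent(out, depth);
            out.append(header).append(" {\n"); // opening fence is emitted here...
            for (Node child : children) {
                child.render(out, depth + 1);
            }
            indent(out, depth);
            out.append("}\n"); // ...and its partner always follows
        }
    }

    static String render(Node root) {
        StringBuilder out = new StringBuilder();
        root.render(out, 0);
        return out.toString();
    }
}
```

Editing then means adding, removing or moving whole nodes. There is no operation that could delete one brace of a pair.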

OSs make you treat files as atomic units. You can't have a file missing
a begin or end "marker". Blocks should be the same sort of beast. We
should think of the block as the thing; the {} are just decorations on
it to help humans see it. Indenting, coloured backgrounds, nesting
stripes in the margin etc. could all also be used to help make these
boundaries more obvious.

My brain HATES nesting of any kind. I am nesting retarded. I have
terrible trouble keeping track of even 3 levels. I need all the
visual cues and disability aids possible to compensate.

At the bare minimum, if I am working at level 4 editing, I should not
be able to inadvertently meddle with level 2 block boundaries.

I go further. You should not be able to shift boundaries, just move
code in or out of a block. I think the key is editors that refuse to
let your code ever be unbalanced. They would take getting used to,
but in the long run they would make it much easier to manage complex
structure.





Roedy Green

That would be merely guiding, not enforcing, the pairing.

There is so much resistance to this sort of thing it will have to be
introduced gradually. At first people think of it as the editor being
mean. Once they learn to work within the guidelines, they will see
that restriction as benign.

You can certainly automatically MARK begin ends better than you do
now. Few will complain about that, except the "I code in binary" boys
who see coding as a sort of test of manliness.


The Eclipse people have been very clever in the way they mask the fact
that the code exists internally as a parse tree. The ASCII files are
generated from it partly to mollify. Eclipse thereby managed to
bypass much of the Luddite reactions I got when I first brought up the
idea of SCIDs.

On the other hand, that tying to Java source has prevented them from
really taking off on SCID possibilities.

see http://mindprod.com/projects/scid.html


Roedy Green

That could be implemented as a marker line in the margin, pointing out
the block that you are currently inside, as well as highlighting the
delimiting markers.

You need a lot of visual cues to remind you where you are. One of the
most infuriating errors I make using Eclipse is when I have an
abstract class and fairly similar implementations of it. I think I am
editing class A but really I am working on class B.

I handle this on my website with a header on each page that tells you
where you are at all times in the file hierarchy, with different sections
having different colour schemes.

I need something like that when coding, something I can watch out of
the corner of my eye.

Parking lots have symbols -- animals for example -- for each
section that help you remember where you are or where you were.
You could do the same with sections of code, assign them an icon and
get back to it just by clicking on a bookmark bar.

Roedy Green

The only problem with this is the desire to copy half-written code
around, before filling in the blanks. Programmers do think in text,
not syntax trees (most of the time). But maybe that's exactly the
problem :)

I think it would be awkward at first, but you can still do a lot of
pasting. It is just that the copy side would only work in balanced
chunks, or would insist you balance the clipboard before pasting.

I have found that a badly unbalanced piece of code can take a long time
to fix. The compiler is useless. It is totally befuddled.

Perhaps then you need to introduce the concept that you are allowed to
unbalance a maximum of one deep, so the compiler knows the possible range
of where the missing pair is, AND INSERTS it provisionally and blinks
it until you confirm.

There need not be one size fits all. This is a configurable editor,
not a language feature. Because some people want the total freedom of
Notepad does not mean others should be blocked from getting more
automated code cranking.

Look at this from management's point of view. Which technique lets you
crank out more working code per hour?

ChrisWSU

Eclipse kind of (loosely, but it is a similar idea) already implements two
of your suggestions (2 and 4). If you have the cursor on a {} type thing, it
puts a box around the opposing one. Bodies can be shrunk on the left-hand
side by clicking the arrow; the body then disappears and you can't edit
it... can't see it either, though. I think as IDEs continue to evolve
we will maybe just start seeing these (probably in Vim before Eclipse ;).
Great ideas, I could see myself using all of that.
 
Roedy Green

I think as IDEs continue to evolve
we will maybe just start seeing these (probably in Vim before Eclipse ;).
Great ideas, I could see myself using all of that.

I have been pushing the SCID idea for decades. Most of the reaction
has been extreme hostility. See my essay
http://mindprod.com/projects/scid.html

“All truth passes through three stages. First, it is ridiculed.
Second, it is violently opposed. Third, it is accepted as being
self-evident.”
~ Arthur Schopenhauer

I think SCIDs are moving into stage 3.

I had a smile wrapping around my head as I was exploring Eclipse. My
partner was puzzled why I was so elated. I explained that I had been
envisioning SCID editing tools for many years, now here many of them
were, working even more smoothly than I had envisioned them, in
Eclipse. It was like one of those dreams where money keeps appearing
everywhere.

The handling of block delimiter balancing though is still almost the
nightmare it was in the old days of text editors.

Using Eclipse is a bit like having a Korean tutor (Remo Williams's
Chiun) over your shoulder constantly pointing at your typos and other
errors as you make them, but not saying much, or when he does speak,
in strange English.

You don't have a compile-edit cycle any more. You normally discover
and fix problems the instant you make them BUT YOU DON'T HAVE TO
REPAIR THEM right away. I am learning to pay more attention to its
hints as I type and fix problems right away. It makes for ever so much
smoother operation.


I am quite impressed. Normally switching IDEs makes you unproductive
for weeks till you learn the new one. I am already more productive
with Eclipse after just a few days.

If I had the money I'd send a gift certificate to everyone who had
anything to do with creating this marvel so they could buy a huge
bouquet of flowers for someone they love.

Gordon Beaton

I have found that a badly unbalanced piece of code can take a long
time to fix. The compiler is useless. It is totally befuddled.

On the other hand, it isn't the compiler's job to fix your source
code.

Perhaps then you need to introduce the concept that you are allowed to
unbalance a maximum of one deep, so the compiler knows the possible
range of where the missing pair is, AND INSERTS it provisionally
and blinks it until you confirm.

Why should the *compiler* be doing these things? Doesn't your *editor*
show you how the parentheses match?

In my experience, the automatic code formatting provided by my editor
lets me spot this kind of error almost immediately, since things don't
align as they should below the missing (or additional) parenthesis.

/gordon
 
Roedy Green

On the other hand, it isn't the compiler's job to fix your source
code.


Why should the *compiler* be doing these things? Doesn't your *editor*
show you how the parentheses match?

I am talking about the compiler in the Eclipse incremental compiler sense,
not the javac sense. It is parsing and analysing your syntax on every
keystroke.

Tim Tyler

Roedy Green said:
The Eclipse people have been very clever in the way they mask the fact
that the code exists internally as a parse tree. The ASCII files are
generated from it partly to mollify. Eclipse thereby managed to
bypass much of the Luddite reactions I got when I first brought up the
idea of SCIDs.

IBM /did/ experiment with putting the source code in what was effectively
a database. That's how Visual Age for Java operated. However, they
ditched the approach with Eclipse - which stores its Java files
straight into the OS-level filing system as flat ASCII text files.

I'm not sure most of the world is ready for the idea of storing code
in databases. In fact, most of the database world still seems mired
in a bizarre 1970s SQL timewarp - a nightmare world of SQL, fixed-size
fields and unreadable BLOBs. The tools used are typically only
/slightly/ more advanced than OS filing systems use.
 
Larry Barowski

Lasse Reichstein Nielsen said:
Roedy Green said:
2. Hover help that tells you what sort of "fence" ( { [ character you
are dealing with, e.g. "begin for", "start xxx method", "end xxx class". It
could also remind you where you are: in class X, in method y, nested n
deep.

That could be implemented as a marker line in the margin, pointing out
the block that you are currently inside, as well as highlighting the
delimiting markers.

Or marker lines embedded in the source text, as in the CSD.
http://www.jgrasp.org/images/csd_large.html
 
Roedy Green

IBM /did/ experiment with putting the source code in what was effectively
a database. That's how Visual Age for Java operated. However, they
ditched the approach with Eclipse - which stores its Java files
straight into the OS-level filing system as flat ASCII text files.

I don't think Eclipse relies on the flat files. It EXPORTS them, but
it keeps a private pre-parsed copy in a hidden database somewhere. If
you make changes to the flat files, then start up Eclipse, it will not
notice the changes unless you say REFRESH. So it works the same as
Visual Age, with just more frequent automatic exporting and easier
importing.

In any case it certainly has a database-like representation while you
are editing between saves.

Roedy Green

Or marker lines embedded in the source text, as in the CSD.
http://www.jgrasp.org/images/csd_large.html

Before you know it, we will be back to flow charts.

A loop is really a single thing. It should thus be represented by a
single visual effect, not two disconnected tiny blips marking the
start and end.

We have millions of pixels going to waste that could give you more
than enough visual cues of nesting.

Tim Tyler

Roedy Green said:
I don't think Eclipse relies on the flat files. It EXPORTS them, but
it keeps a private pre-parsed copy in a hidden database somewhere. If
you make changes to the flat files, then start up Eclipse, it will not
notice the changes unless you say REFRESH.

That is not accurate. It reads the source files in from disc when you
start up Eclipse - including any modifications made to them.

It may not recompile them and attempt to convert them into a consistent
project until you say "Refresh" - but that's a bit different.
 
Tim Tyler

Roedy Green said:
When I am typing I can watch a million error messages appear or
disappear just by inserting or removing one } in some trivial if. The
compiler reinterprets the program in a totally different and
completely bizarre way from the most trivial change. Classes unnest
and renest for example.

From an engineering point of view, this structure is far too delicate.

I *never* use functions inside functions - so that -
in theory - tools can check for functions
defined within functions - and warn me that I
have mismatched braces at that point if it
encounters either pattern.

However, I don't know of any tools that do
this at the moment. Instead they tell
me at the end of the file, and probably
in a zillion other places about missing
functions. It takes detective work to
track down the mis-matched brace.

Tools /ought/ to be able to remember what your
file looked like a few minutes ago - and
notice if the brace level has gone seriously
out of kilter during the last editing session
and give a helpful analysis of the most likely
problem area.

An obvious clue about brace level problems
is indentation - for a mismatch identifier,
that could be a pretty useful clue.
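
A minimal Java sketch of such a mismatch identifier is below. It assumes the file was consistently indented with a fixed number of spaces per level before the bad edit, counts only leading spaces, and ignores braces in strings and comments. IndentClue and firstSuspectLine are names invented for this example, not an existing tool.

```java
/** Sketch: use indentation as a clue to where the brace level went wrong. */
final class IndentClue {

    /**
     * Returns the 1-based number of the first non-blank line whose indentation
     * disagrees with the brace depth counted so far, or -1 if every line agrees.
     */
    static int firstSuspectLine(String source, int indentUnit) {
        int depth = 0;
        String[] lines = source.split("\n", -1);
        for (int i = 0; i < lines.length; i++) {
            String line = lines[i];
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // blank lines carry no indentation information
            }
            int indent = 0;
            while (indent < line.length() && line.charAt(indent) == ' ') {
                indent++;
            }
            // A line that starts with } belongs to the enclosing level.
            int expectedDepth = trimmed.startsWith("}") ? depth - 1 : depth;
            if (indent != expectedDepth * indentUnit) {
                return i + 1;
            }
            for (char c : trimmed.toCharArray()) {
                if (c == '{') {
                    depth++;
                } else if (c == '}') {
                    depth--;
                }
            }
        }
        return -1;
    }
}
```

If the } that closes an if is deleted, the next } sits one level shallower than the brace count expects, so the sketch points at that line instead of at the end of the file.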
 
Tim Tyler

Roedy Green said:
I am quite impressed. Normally switching IDEs makes you unproductive
for weeks till you learn the new one. I am already more productive
with Eclipse after just a few days.

One of the things I noticed after adopting it was that the scale of
refactorings I could successfully attempt increased dramatically.

In the past I've found myself having to back out of over-ambitious
refactorings - when they turn out to be more trouble than I had
anticipated in my preliminary reconnaissance mission.

With Eclipse, practically every refactoring I attempt seems to work.

If I had the money I'd send a gift certificate to everyone who had
anything to do with creating this marvel so they could buy a huge
bouquet of flowers for someone they love.

IMO, IBM have quite a history of deserving bouquets from Java programmers.
 
Chris Smith

Roedy Green said:
I don't think Eclipse relies on the flat files. It EXPORTS them, but
it keeps a private pre-parsed copy in a hidden database somewhere. If
you make changes to the flat files, then start up Eclipse, it will not
notice the changes unless you say REFRESH. So it works the same as
Visual Age, with just more frequent automatic exporting and easier
importing.

I would hesitate to use the word "database". I don't think it's
appropriate for a representation that lacks the ability to provide
widespread access to data or manage concurrent access from multiple
unrelated clients. Nevertheless, yes Eclipse does use a rich
representation of the contents of a source file, at any number of points
in its functionality.

That's not at issue, though. The question is this: what is the
*authoritative* representation? That question is easily answered by
noting that Eclipse frequently (all too frequently, it seems) throws
away its entire collection of rich representation and builds everything
from scratch again. The parse trees and other rich forms of information
maintained by Eclipse are a *cache*, not an authoritative source of
information. They are entirely dependent on a lot of details like
compiler settings and preferences, and there's not a lot of energy
invested in ensuring that they remain meaningful beyond certain changes
to somewhat trivial details.

It EXPORTS them [flat files]

That's not at all accurate. You would need to demonstrate even one
situation in which Eclipse deletes a source file and replaces it with
something generated from the internal parse tree. It never does so.

This is how it should be. Visual Age taught IBM that many programmers
aren't ready to have their IDE completely own their source code, and the
universal form of source code, at least in Java, is text. Programmers
can include all sorts of useful information in source code by modifying
indentation, spacing, and alignment; and they expect that information to
stay there. An IDE that modifies your source files all willy-nilly just
isn't going to last too awfully long; there will be a quick mutiny, and
most developers will switch to Notepad before returning to that IDE.

A "format" command is one thing, but there's a really good reason that
"format" doesn't run from Eclipse on every save. In fact, I rarely run
it on production code, though I frequently use it with code copied from
USENET, web pages, etc.

What might be interesting, as a research project, would be to categorize
ways that programmers rely on the plain-text format to include
information in code. I suspect there'd be a surprising number of cases
here, ranging from the obvious but sometimes intricate conventions of
indentation, through grouping related pieces of code within a method by
whitespace lines, all the way to choosing line break locations to
emphasize the structure of a long or intricate statement. It might then
be interesting to find a mechanism for encoding all these
"communication" uses of whitespace into the parse tree itself, and then
you might be closer to being able to abandon the plain-text format for
code. Until then, though, it's a very shaky idea.

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
Roedy Green

That's not at all accurate. You would need to demonstrate even one
situation in which Eclipse deletes a source file and replaces it with
something generated from the internal parse tree. It never does so.

Try this experiment. Write some code in Eclipse. Shut down Eclipse.
Modify a source file with a text editor. Start up Eclipse. Do not
"refresh", and notice that your changes are missing. It is showing you
something other than the flat files. It must have some other copy.
That might be a CVS-like repository or an internal representation, but
it is not the flat files.

