how to reduce bugs due to incorrect indentation

msustik

I had a bug in a Python script recently. The code in question was something along the lines of:

    if a == 1:
        x = y
    else:
        x = z
    y = z + y
    z = z + 1

While editing this file I accidentally pushed TAB on the line with 'y = z + y'.

My changes were elsewhere and I did not notice the above one line change when I looked at the diffs before commit. I should have noticed it...

It was rare that a was 1 and therefore the problem did not show up for a while. (I know I should have had tests exercising all cases...)

When the bug showed up, it was kind of difficult to remember what the original intent was. Fortunately, looking at old versions allowed me to find the problem commit and the bug.

Any suggestion on how to avoid this type of error in the future?

Thanks!
 
Chris Angelico

I think you just mentioned the best suggestions: (a) have a good
collection of unit tests (where "good" means data and branch coverage),
and (b) look carefully, or have someone else look carefully, at every
commit.

Yep, I'd agree with that. Additionally, a habit of small and
conceptual commits helps hugely, even if you don't look at them too
carefully as they go through; get familiar with 'git gui blame' or
whatever the equivalent is for your system (or just browsing through
'gitk filename.py' or equiv) and you can go back in time to pin down
bugs. I had to fight heavily to get my boss to understand this,
because he said he _never_ went back through the repo's history;
meanwhile, I was doing so frequently, and knowing exactly what code
was changed, when, and why. His attitude to bugs was "don't make any,
and you're a bad programmer if you do". My attitude to bugs was, and
is, "be honest with yourself, you will make them, so design your
systems to handle that".

When your commits are small and tidy, you can often find bugs really
REALLY easily just by looking at the commit that changed some
particular line. With gitk, I'll often make a dummy edit to a line,
then highlight the red "this line deleted" line in the uncommitted
changes display, right-click, "Show origin of this line".
Alternatively, 'git blame' or other annotation can work, but I
generally find that annotating a whole file is overkill and trying to
ask for annotations of one small section is tedious. But whichever way
you do it, you should be able to VERY quickly go from "Hmm, I wonder
why this is indented like this?" to "Here's the commit that made it
like that", and then you can easily decide whether it was right or
not.

Of course, your commit messages have to be useful. I can't honestly
say my own commit messages are perfect, but at least they're not like
these:

https://github.com/douglascrockford/JSLint/commits/master

Note to future readers: That link will take you to the most recent
commits on that repo. It may be (it may be!) that, by the time you
look at it, he'll have changed his practice and started writing
exemplary messages. If so, just page back through the history and find
how it looked in 2014. :)

ChrisA
 
Asaf Las

I see.

The only solution that comes to my mind is to mimic C-style curly brackets
by using a hash to mark the start and end of each block:

    if a == 1:#
        x = y
    #
    else:#
        x = z
    #
    y = z + y
    z = z + 1

and PEP ... Nulla poena sine lege :)

/Asaf
 
Terry Reedy

    if a == 1:
        x = y
    else:
        x = z
    y = z + y
    z = z + 1

While editing this file I accidentally pushed TAB on the line with 'y = z + y'.

In this particular case, remove the indentation altogether with

    x = y if a == 1 else z

and then indenting the next line is a syntax error.
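
A minimal sketch of this point, using the built-in compile() to show that a stray indent after the one-line form is rejected before the code ever runs, instead of silently changing behavior. The helper name compiles is made up for this illustration.

```python
def compiles(source: str) -> bool:
    """Return True if the source text compiles, False on a syntax error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


# The one-line form suggested above, followed by the two unconditional lines.
intended = (
    "x = y if a == 1 else z\n"
    "y = z + y\n"
    "z = z + 1\n"
)

# Same code, but with the second line accidentally indented.
accidental = (
    "x = y if a == 1 else z\n"
    "    y = z + y\n"
    "z = z + 1\n"
)

print(compiles(intended))    # True
print(compiles(accidental))  # False
```

This only helps when the whole if/else collapses into a single expression; with longer branches there is still a block for a stray Tab to land in.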
 
Grant Edwards

Any suggestion on how to avoid this type of error in the future?

The best advice is to pay closer attention to what you're doing. Look
at the code while you're editing, not your fingers. Before you commit
the change, spend some time looking carefully at it to verify that
changes were intentional.
 
Roel Schroeven

Any suggestion on how to avoid this type of error in the future?

My suggestion: configure your editor to insert the appropriate amount of
spaces instead of a tab when you press the tab key.


Best regards,
Roel
 
Larry Martell

(e-mail address removed) wrote:

My suggestion: configure your editor to insert the appropriate amount of
spaces instead of a tab when you press the tab key.

+1 - tabs are evil.
 
Ethan Furman

+1 - tabs are evil.

Tabs are not evil, and an argument can be made that tabs are better (a decent editor can be configured to show x many
spaces per tab, so users could decide how much indentation they prefer to see... but I digress).

Using spaces instead of tabs would also not have prevented the error that msustik encountered, and for that matter we
don't know whether he was using tabs or spaces in his source file, only that he hit the Tab key -- surely you are not
suggesting everyone rip out their Tab key and just hit the space bar four times for each level of indentation? ;)
 
Larry Martell

The Tab key is not evil; it's the tab character (Ctrl-I) that is. I have been
bitten by this many times when I had to work on a program written by
someone else. They had their tab stops set at 5 or 6 and mine are set at 4, or
they did not have expandtab set but I did. So you get either a script
that looks misaligned but works, or one that does not look misaligned
but doesn't work. When I have to pick up someone else's script, the
first thing I do is replace the tabs with spaces.
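
One way to do that replacement, sketched in Python rather than in an editor. The function name expand_leading_tabs and the default tab size of 4 are assumptions for illustration; only tabs in the leading indentation are touched, so a tab inside a string literal on the line is left alone.

```python
def expand_leading_tabs(source: str, tabsize: int = 4) -> str:
    """Replace tabs in each line's leading whitespace with spaces."""
    result = []
    for line in source.splitlines(keepends=True):
        body = line.lstrip(" \t")
        indent = line[: len(line) - len(body)]
        # str.expandtabs pads each tab out to the next multiple of tabsize.
        result.append(indent.expandtabs(tabsize) + body)
    return "".join(result)


original = "if a == 1:\n\tx = 'tab:\t'\n"
print(repr(expand_leading_tabs(original)))
```

The tab size has to match what the original author used; if they assumed 8 and mixed in spaces, expanding with 4 can change which block a line belongs to.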
 
Chris Angelico

All you've proven is that *mixing* spaces and tabs is evil. It's like
arguing that oil is evil because, when you mix it with water, weird
stuff happens. But that doesn't mean I want to fry my bacon in water.

Mmm, bacon.

Sorry. I'm back now. Ahem. Arguably, a better fix is to replace spaces
with tabs, because they're more obvious. But mainly, just be
consistent. Whatever one file uses, it uses exclusively. It'd be
pretty easy to create a git commit hook that checks files for leading
indentation and rejects the commit if it's mismatched; I would guess
the same is true in Mercurial.
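
A rough sketch of the check such a hook might run, assuming the hook passes in each file's text; the function names are invented here and the wiring into git is deliberately left out. It is purely textual, so it will also count indentation inside multi-line strings and spaces used to align continuation lines after tabs.

```python
def indentation_styles(text: str) -> set:
    """Return the set of characters (tab, space) used in leading indentation."""
    styles = set()
    for line in text.splitlines():
        body = line.lstrip(" \t")
        if not body:
            continue  # skip blank and whitespace-only lines
        indent = line[: len(line) - len(body)]
        styles.update(indent)
    return styles


def is_consistent(text: str) -> bool:
    """True if the file indents with tabs only, spaces only, or not at all."""
    return len(indentation_styles(text)) <= 1


spaces_only = "if a == 1:\n    x = y\nelse:\n    x = z\n"
mixed = "if a == 1:\n\tx = y\nelse:\n    x = z\n"

print(is_consistent(spaces_only))  # True
print(is_consistent(mixed))        # False
```

A hook built on this would reject the commit when is_consistent returns False for any changed file.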

But none of this would solve the OP's original issue. Whether it's a
tab or spaces, unexpectedly indenting a line of code is a problem.
It's no different from accidentally hitting Ctrl-T in SciTE and
reordering two lines, when one line depends on the other. It's a bug.
So you look at your commits before you make them (to give yourself a
chance to catch it quickly), and you make sure you can always look
back over your commits (in case you didn't catch it quickly). Much
better than blaming the characters involved. Poor innocent U+0009.

ChrisA
 
Asaf Las

PEP 8 pushed \t to the dark side in Python.

Though it is sometimes better than spaces. Let's say someone has indented the
code with 2 spaces and another user is comfortable with 4. With \t, that is
handled by the editor's configuration without touching the code.

/Asaf
 
Chris Angelico

pep8 pushed \t to dark side in Python.

Only for the Python stdlib, and only because a decision has to be made
one way or the other. I believe Guido stated at one point that there
was only a very weak push toward "spaces only" rather than "tabs
only", just that it had to be one of those.

ChrisA
 
msustik

My suggestion: configure your editor to insert the appropriate amount of spaces instead of a tab when you press the tab key.

You misunderstood the problem, but managed to start a Tab war! :)

My emacs inserts 4 spaces in Python mode when I press the Tab key. Python uses the indentation to decide whether the lines following the else are executed as part of the else branch or after it (that is, always).

By inadvertently pressing the Tab key I changed the semantics of the code while it remained syntactically correct and PEP 8 compliant. This was indeed a user error, and my question was about practices that reduce the chance of this happening again.

Based on the responses I arrived at the conclusion that there is no better solution than trying to be careful and having good test suites.
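
As an illustration of what such a test suite could look like, here is a small sketch. Wrapping the snippet in a function that takes a, y, z and returns x, y, z is an assumption made for testability, as are the names intended, accidental and BranchTests; the original code was only described as "something along the lines of" the snippet.

```python
import unittest


def intended(a, y, z):
    """The code as originally written: only 'x = z' is in the else branch."""
    if a == 1:
        x = y
    else:
        x = z
    y = z + y
    z = z + 1
    return x, y, z


def accidental(a, y, z):
    """The same code after the stray Tab: 'y = z + y' moved into the else."""
    if a == 1:
        x = y
    else:
        x = z
        y = z + y
    z = z + 1
    return x, y, z


class BranchTests(unittest.TestCase):
    # Function under test. 'intended' passes both tests; switching this to
    # 'accidental' makes only the rare a == 1 case fail.
    func = staticmethod(intended)

    def test_a_is_one(self):
        self.assertEqual(self.func(1, 10, 5), (10, 15, 6))

    def test_a_is_not_one(self):
        self.assertEqual(self.func(2, 10, 5), (5, 15, 6))
```

Saved in a file, this can be run with python -m unittest. The two versions agree whenever a is not 1, which is why the bug stayed hidden until the rare branch was taken.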

It would be possible to disable the Tab key completely and type in the spaces all the time. (It is much less likely that one would press the space bar accidentally four times or hold it down to get 4 spaces by mistake.)

Unfortunately this means giving up the indentation help of the editor, and that will slow down coding. It will also lead to many indentation mistakes during development (most of which will be caught right away, however). Maybe coloring the background based on tab position could assist with this.

I also considered adding an extra blank line after the if-else block (similarly for loops) in the hope that it would reduce the chance of missing an inadvertent indentation after the block.

However, this somewhat defeats the Python paradigm that got rid of the closing braces, endifs, etc. used in other languages to allow more compact code.

-Matyas
 
Chris Angelico

It would be possible to disable the Tab key completely and type in the spaces all the time. (It is much less likely that one would press the space bar accidentally four times or hold it down to get 4 spaces by mistake.)

Unfortunately this means giving up the indentation help of the editor and that will slow down coding. It will also lead to many indentation mistakes during development (most of which will be caught right away however). Maybe a coloring of the background based on tab position could assist in this.

I don't know that it'd really help much anyway. You might reduce one
chance of making errors by hitting a single key, but at the cost of
stupid syntactic salt (indentation requires hitting a key four times?
No thanks), and your fingers would just get used to
whack-whack-whack-whack. No change.

You can spend all your time trying to warp your coding style around
preventing this bug or that bug from happening, or you can just
acknowledge that bugs WILL happen and handle them after the event.
(Hence, source control.) Suppose you come up with a solution to the
accidental-indentation problem. What are you going to do about this
one?

    def foo(bar):
        if not bar: bat = [0]
        for x in bar:
            print(len(bar),x)

Now, why is your empty-list handling not working? Oh, there was a
typo. How are you going to deal with that? Well, you could bring in
C-style variable declarations; then you'd get an immediate error
('bat' is undeclared), but somehow I don't think most Python
programmers would prefer this :) Now personally, I do quite like
declared variables, because they allow infinitely-nested scoping, and
I find that feature worth the effort of declaring all my locals; but
it's a tradeoff, and I wouldn't go to that level of effort *just* to
catch typos in variable names. What if there had been a 'bat' at a
higher scope? Then the typo just means the code does something else
wrong. No fundamental difference.
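
For this particular typo, a static check can sometimes flag the name without any declarations. The sketch below uses the standard ast module to list names that are assigned but never read; the function name assigned_but_never_read is invented for illustration. It deliberately ignores scopes, so, exactly as described above, a 'bat' that is read anywhere else in the source would hide the typo.

```python
import ast

SOURCE = '''
def foo(bar):
    if not bar: bat = [0]
    for x in bar:
        print(len(bar),x)
'''


def assigned_but_never_read(source: str) -> set:
    """Names that are assigned somewhere in the source but never read."""
    stored, loaded = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                stored.add(node.id)
            elif isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
    return stored - loaded


print(assigned_but_never_read(SOURCE))  # {'bat'}
```

It catches this one case and nothing more general, which is rather the point being made here.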

There was a scheme posted to this list a little while ago to have
variable names shown in different colors, which might have helped. (I
disagree with the author's idea that similar names should be in
similar colors - I think that similar names should be in DISsimilar
colors, specifically to catch this sort of error. But anyway.) That's
a theory that might help... but it still might not. And what if your
error is in a literal string that later gets parsed? No, there's no
way that you can catch everything beforehand.

Bugs happen. Find 'em, fix 'em.

ChrisA
 
Roel Schroeven

Chris Angelico wrote:
But none of this would solve the OP's original issue. Whether it's a
tab or spaces, unexpectedly indenting a line of code is a problem.

I had misread. I thought the problem was that the OP did want to indent,
but accidentally used the tab key instead of the space bar to do so,
introducing the dreaded mixing of tabs and spaces. Remapping the tab key
would solve that problem.

But the real problem was different, so my advice was indeed not a
solution to the problem. I should've read better.

--
The saddest aspect of life right now is that science gathers knowledge
faster than society gathers wisdom.
-- Isaac Asimov

Roel Schroeven
 
Roel Schroeven

(e-mail address removed) wrote:
You misunderstood the problem, but managed to start a Tab war! :)

Indeed I did; sorry for both :)

Roel Schroeven
 
Jurko Gospodnetić

Hi,

Based on the responses I arrived at the conclusion that there
is no better solution than trying to be careful and have good
testing suites.

It would be possible to disable the Tab key completely
...[snipped]...
Maybe a coloring of the background based on tab position
...[snipped]...
I also considered
...[snipped]...

YMMV, but for me, just reading through this fun thread took more
time than I have ever spent debugging issues caused by bad Python code
indentation. :-D

So, my suggestion would be to just ignore the problem and deal
with any resulting issues as they occur.

Clean coding & development practices, some of which have been
mentioned earlier in this thread and are useful for many other
reasons as well, will additionally reduce the chance of such
errors causing any non-trivial issues.

Best regards,
Jurko Gospodnetić
 