Tabs versus Spaces in Source Code

Duncan Booth

Sybren said:
I agree with that.


But not with that, since it is contradicting. "Inserting the
characters" could very well be the same as "performing the expected
operations".

It could be, and for some keys (q, w, e, r, t, y, etc. spring to mind) that
is quite a reasonable implementation. For others 'tab', 'backspace',
'enter', 'delete', etc. it is less reasonable, but it is a quality of
implementation issue. If I had an editor which entered a control character
for each of these I would simply move to a better editor.

Some keys will of course do both (e.g. space bar in some editors does
completion and inserts a space), but I prefer editors which keep things
simple. The tab key is particularly prone to excessively complicated
actions, for example the editor I use has the following (not simple at
all, and in fact not even an accurate description of what it does) binding
for the tab key:
indent-previous command

Indent based on the previous line.

This command makes the current line start at the same column as the
previous non-blank line. Specifically, if you invoke this command
with point in or adjacent to a line's indentation, indent-previous
replaces that indentation with the indentation of the previous
non-blank line. If point's indentation exceeds that of the previous
non-blank line, or if you invoke this command with point outside of
the line's indentation, this command simply inserts a tab character.

If a region is highlighted, Epsilon indents all lines in the region by
one tab stop. With a numeric prefix argument, Epsilon indents by that
amount.
(and it is even more complex when you are editing something like Python
where it takes continuation lines into account working out the
indentation).
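
To make the quoted description easier to follow, here is a minimal Python sketch of the single-line case only. It is not Epsilon's implementation: the function names, the `(lines, row, col)` interface, and the reading of "point's indentation exceeds that of the previous non-blank line" as "the current line is already indented further" are all assumptions made for illustration. The region and numeric-prefix cases are left out.

```python
def leading_whitespace(line):
    """Return the run of spaces and tabs at the start of a line."""
    return line[: len(line) - len(line.lstrip(" \t"))]


def indent_previous(lines, row, col):
    """Return the new text of lines[row] after a simplified 'indent-previous'.

    `col` is the cursor position within the line. Hypothetical model only.
    """
    current = lines[row]
    indent = leading_whitespace(current)

    # Indentation of the nearest non-blank line above, or "" if there is none.
    previous = ""
    for candidate in reversed(lines[:row]):
        if candidate.strip():
            previous = leading_whitespace(candidate)
            break

    # "In or adjacent to" the indentation: the cursor is not past its end.
    point_in_indentation = col <= len(indent)

    # Raw character counts are compared here; a real editor would compare
    # display columns, which depend on the tab width.
    if point_in_indentation and len(indent) <= len(previous):
        return previous + current[len(indent):]

    # Otherwise fall back to inserting a literal tab at the cursor.
    return current[:col] + "\t" + current[col:]


lines = ["if x:", "    a = 1", "", "b = 2"]
print(repr(indent_previous(lines, 3, 0)))  # cursor at start of "b = 2"
print(repr(indent_previous(lines, 3, 3)))  # cursor in the middle of the line
```

Even this cut-down version needs two branches and a backwards scan, which is the point being made about the tab key doing more than inserting a character.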

The problem is that behaviour like this is useful, and mostly even
intuitive, but it's a long way from the definition of a tab or even
the little metal clips you used to stick on the back of a manual
typewriter.
 
Sybren Stuvel

Duncan Booth enlightened us with:
It could be, and for some keys (q, w, e, r, t, y, etc. spring to
mind) that is quite a reasonable implementation. For others 'tab',
'backspace', 'enter', 'delete', etc. it is less reasonable, but it
is a quality of implementation issue. If I had an editor which
entered a control character for each of these I would simply move to
a better editor.

Well, my editor *does* enter a control character when I press Enter,
namely \n. It also enters a \t when I press TAB. That does not mean my
editor is flawed.
The problem is that behaviour like this is useful, and mostly even
intuitive, but it's a long way from the definition of a tab or even
the little metal clips you used to stick on the back of a manual
typewriter.

I understand what you are saying, but saying "an editor that inserts a
control character when pressing a key is flawed" is most incorrect.

Sybren
 
Oliver Bandel

Xah said:
Tabs versus Spaces in Source Code

Xah Lee, 2006-05-13

In coding a computer program, there's often the choices of tabs or
spaces for code indentation. There is a large amount of confusion about
which is better. It has become what's known as a "religious war",
a heated fight over trivia. In this essay, I'd like to explain what is
the situation behind it, and which is proper.

Simply put, tabs is proper, and spaces are improper.
[...]

I wholeheartedly disagree :)

So, no "essay" on this is necessary to read :->


Ciao,
Oliver
 
opalpa

Simply put, tabs is proper, and spaces are improper.
Why? This may seem
ridiculously simple given the de facto ball of confusion: the semantics
of tabs is what indenting is about, while, using spaces to align code
is a hack.

The reality of programming practice trumps original intent of tab
characters. The tab character and space character are pliable in that
if their use changes their semantics change.
... and the solution is to advance
the sciences such that your source code in some way
embed such information.

If/when the time comes where such info is embedded, perhaps then tabs will be
OK.

---------------------------------------------------------------

I use spaces because, of the many sources I've opened, I have many times
sighed on opening tabbed ones and never done so opening spaced ones.

I don't get mad, but sighing is a clear indicator of negativity.
Anyway, the more code I write and read the less indentation matters to
me. My brain can now parse awkward source correctly far better than it
did a few years ago.


All the best,
Opalinski
http://www.geocities.com/opalpaweb/
 
Pascal Bourguignon

opalinski from opalpaweb said:
The reality of programming practice trumps original intent of tab
characters. The tab character and space character are pliable in that
if their use changes their semantics change.


If/when the time comes where such info is embedded, perhaps then tabs will be
OK.

---------------------------------------------------------------

I use spaces because, of the many sources I've opened, I have many times
sighed on opening tabbed ones and never done so opening spaced ones.

I don't get mad, but sighing is a clear indicator of negativity.
Anyway, the more code I write and read the less indentation matters to
me. My brain can now parse awkward source correctly far better than it
did a few years ago.

And anyways, C-x h C-M-\ comes automatically after C-x C-f source RET
Just add this to your ~/.emacs :

(add-hook 'find-file-hook
          (lambda () (indent-region (point-min) (point-max)) (pop-mark)))



--
__Pascal Bourguignon__ http://www.informatimago.com/

IMPORTANT NOTICE TO PURCHASERS: The entire physical universe,
including this product, may one day collapse back into an
infinitesimally small space. Should another universe subsequently
re-emerge, the existence of this product in that universe cannot be
guaranteed.
 
Dale King

Iain said:
Oh God, I agree with Xah Lee. Someone take me out behind the chemical
sheds...

Xah Lee wrote:
<more worthless nonsense>

Please don't feed the troll!

And for the record, spaces are 100% portable, tabs are not. That ends
the argument for me.

Worse than either tabs or spaces however is Sun's mixture of the two.
 
Oliver Bandel

The reality of programming practice trumps original intent of tab
characters. The tab character and space character are pliable in that
if their use changes their semantics change.
[...]


Yes, when I started programming I also preferred tabs.
But with growing experience of handling this in real life
(different editors/systems/languages...) I saw that
converting the "so fine tabs" was annoying.

The only thing that always worked was spaces.
Tab: nice idea but makes programming an annoyance.

Ciao,
Oliver
 
Edward Elliott

achates said:
A tab is not equivalent to a number of spaces. It is a character
signifying an indent, just like the newline character signifies the end
of a line.

This link posted over in comp.lang.perl.misc expands on that:

http://numeromancer.dyndns.org/~timothy/tab-width-independence/description.html

To me, tabs are like gotos. In the wrong hands, they can be abused.
Novices will do the most hideous things with them. So do we just ban gotos
altogether? No - we structure their use to avoid the most obnoxious
mistakes and live with the rest in a power/abuse tradeoff. Before you
object that modern languages don't use gotos, think again. Break and
continue are merely restricted forms of goto, as are exceptions. Don't
throw the baby out with the bathwater. Make better tools that allow the
good uses and prevent the bad.
 
achates

Duncan said:
but I prefer editors which keep things
simple. The tab key is particularly prone to excessively complicated
actions, for example the editor I use has the following (not simple at
all, and in fact not even an accurate description of what it does) binding
for the tab key:

<description of strange tab-key binding behaviour>

I'm not familiar with your editor, but if that's its default behaviour
when you hit tab then it sounds neither simple nor intuitive.

You haven't explained why you think there's a problem with having a
character which, in an unambiguous and non-implementation-specific way,
means 'one level of indentation'. In Python, of all languages, it makes
sense to have such a character because 'one level of indentation' is a
syntactical token processed by the interpreter.

But consider this: like all real men, I normally program in hex. Here's
an example of some recent code:

0x66 0x6f 0x72 0x20 0x69 0x74 0x65 0x6d 0x20 0x69 0x6e 0x20 0x6d 0x65
0x6e 0x75 0x3a 0x0d 0x0a 0x09 0x70 0x72 0x69 0x6e 0x74 0x20 0x27 0x73
0x70 0x61 0x6d 0x27 0x0d 0x0a 0x70 0x72 0x69 0x6e 0x74 0x20 0x27 0x6c
0x6f 0x76 0x65 0x6c 0x79 0x20 0x73 0x70 0x61 0x6d 0x27

If I wanted to be girly about it I could use an editor, and it would
probably look like this:

for item in menu:
    print 'spam'
print 'lovely spam'

But then if I wanted, I could write my own editor, and make it display
tabs as 'negative indents' from column 40, so that it looks like this:

                                        for item in menu:
                                print 'spam'
                                        print 'lovely spam'

Guess what: the python interpreter wouldn't know about my strange
editor habits! It would read my file and run just fine. What's more you
can view it with *your preferred indentation display methodology* and
we can both live in harmony!

With spaces for indentation, this just isn't possible, because I have
to conform to your viewing preferences, and that makes me unhappy. Why
would you want to make me unhappy?
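
The claim can be checked with a short Python 3 sketch. It decodes the exact bytes listed in the post and renders them at several tab widths with `str.expandtabs`. The `render` helper and the chosen widths are illustrative assumptions; only the display changes, while the decoded text (one tab character, CRLF line endings) stays the same.

```python
# The bytes from the post, written as one hex string.
HEX = (
    "66 6f 72 20 69 74 65 6d 20 69 6e 20 6d 65 "
    "6e 75 3a 0d 0a 09 70 72 69 6e 74 20 27 73 "
    "70 61 6d 27 0d 0a 70 72 69 6e 74 20 27 6c "
    "6f 76 65 6c 79 20 73 70 61 6d 27"
)

source = bytes.fromhex(HEX).decode("ascii")


def render(text, tab_width):
    """Return the lines of text as an editor with this tab width might show them."""
    return [line.expandtabs(tab_width) for line in text.splitlines()]


for width in (2, 4, 8):
    print(f"tab width {width}:")
    for line in render(source, width):
        print("  " + line)

# The stored text is untouched by any of the renderings above.
print(repr(source))
```

Note that the last `print` statement in the decoded file is not indented, so it sits outside the loop body.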
 
Kaz Kylheku

Xah said:
Tabs vs Spaces can be thought of as parameters vs hard-coded values, or
HTML vs ascii format, or XML/CSS vs HTML 4, or structural vs visual, or
semantic vs format. In these, it is always easy to convert from the
former to the latter, but near impossible from the latter to the
former.

Bahaha, looks like someone hasn't thought things through very well.

Spaces, under a mono font, offer greater precision and expressivity in
achieving specific alignment. That expressivity cannot be captured by
tabs.

The difficulty in converting spaces to tabs rests not in any bridgeable
semantic gap, but in the lack of having any way whatsoever to express
using tabs what the spaces are expressing.

It's not /near/ impossible, it's /precisely/ impossible.

For instance, tabs cannot express these alignments:

/*
 * C block
 * comment
 * in a common style.
 */

(lisp
 (nested list
         with symbols
         and things))

(call to a function
      with many parameters)
;; how do you align "to" and "with" using tabs?
;; only if "to" lands on a tab stop; but dependence on specific tab stops
;; destroys the whole idea of tabs being parameters.

To do these alignments structurally, you need something more expressive
than spaces or tabs. But spaces do the job under a mono font, /and/
they do it in a completely unobtrusive way.
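
The "to"/"with" example can be tested mechanically. The following Python sketch is an illustration, not part of the original post: `column_of` is a hypothetical helper, and the two continuation lines are assumed variants of the example, one aligned with six spaces and one with a single tab. It reports, for several tab widths, whether "with" still lines up under "to".

```python
def column_of(line, word, tab_width):
    """Display column (0-based) where word starts once tabs are expanded."""
    return line.expandtabs(tab_width).index(word)


first = "(call to a function"
# Continuation aligned with spaces: six spaces put "with" under "to".
spaces_version = "      with many parameters)"
# Continuation aligned with one tab: lines up only if a tab stop is at column 6.
tab_version = "\twith many parameters)"

for width in (2, 4, 6, 8):
    target = column_of(first, "to", width)
    spaces_ok = column_of(spaces_version, "with", width) == target
    tab_ok = column_of(tab_version, "with", width) == target
    print(f"tab width {width}: spaces aligned={spaces_ok}, tab aligned={tab_ok}")
```

The space-aligned line holds its column at every width; the tab-aligned line matches only when the viewer happens to use a tab width of six.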

If you want to do nice typesetting of code, you have to add markup
which has to be stripped away if you actually want to run the code.

Spaces give you decent formatting without markup. Tabs do not. Tabs are
only suitable for aligning the first non-whitespace character of a line
to a stop. Only if that is the full extent of the formatting that you
need to express in your code can you achieve the ideal of being able to
change your tab parameter to change the indentation amount. If you need
to align characters which aren't the first non-whitespace in a line,
tabs are of no use whatsoever, and proportional fonts must be banished.
 
Edward Elliott

achates said:
With spaces for indentation, this just isn't possible, because I have
to conform to your viewing preferences, and that makes me unhappy. Why
would you want to make me unhappy?

+5 QOTW
 
achates

Kaz said:
If you want to do nice typesetting of code, you have to add markup
which has to be stripped away if you actually want to run the code.

Typesetting code is not a helpful activity outside of the publishing
industry. You might like the results of your typesetting; I happen not
to. You probably wouldn't like mine. Does that mean we shouldn't work
together? Only if you insist on forcing me to conform to your way of
displaying code.

You are correct in pointing out that tabs don't allow for 'alignment'
of the sort you mention:
(lisp
 (nested list
         with symbols
         and things))
But then neither does Python. I happen to think that's a feature.

(And of course you can do what you like inside a comment. That's
because tabs are for indentation, and indentation is meaningless in that
context. Spaces are exactly what you should use then. I may or may not
like your layout, but it won't break anything when we merge our code.)
 
Duncan Booth

achates said:
You haven't explained why you think there's a problem with having a
character which, in an unambiguous and non-implementation-specific way,
means 'one level of indentation'. In Python, of all languages, it makes
sense to have such a character because 'one level of indentation' is a
syntactical token processed by the interpreter.
Because it doesn't mean 'one level of indentation', it means 'move to next
tabstop' and a tabstop isn't necessarily the same as a level of
indentation. In particular a common convention is to have indentations at 4
spaces and tabs expanding to 8 spaces.
 
achates

Duncan said:
Because it doesn't mean 'one level of indentation', it means 'move to next
tabstop' and a tabstop isn't necessarily the same as a level of
indentation.

'move to next tabstop' is how your editor interprets a tab character.
'one level of indentation' is how the language parser interprets it.
The two are entirely consistent, in that they are both isomorphic
mappings of the same source file.
In particular a common convention is to have indentations at 4
spaces and tabs expanding to 8 spaces.

Like all space-indenters, you seem to be hung up on the idea of a tab
'expanding' to n spaces. It only does that if you make your editor
delete the tab character and replace it with spaces! Really, that is
the only sense in which your statement makes any sense. If you want
your indentation to have the width of four, eight, or nineteen spaces,
set your tabstops accordingly.

Seriously people, this is about separating the content of a source file
from how it is displayed. It's about letting people work together while
also allowing them to have control over their own environments,
something which is and always has been central to the hacker ethos.
 
Duncan Booth

achates said:
Like all space-indenters, you seem to be hung up on the idea of a tab
'expanding' to n spaces. It only does that if you make your editor
delete the tab character and replace it with spaces! Really, that is
the only sense in which your statement makes any sense. If you want
your indentation to have the width of four, eight, or nineteen spaces,
set your tabstops accordingly.

It is strange. You use many of the same words as me, but they don't make
any sense.

The point is about separating the presentation of the source file from the
semantic content. When displaying the file you can choose to expand tabs to
any suitable positions. These may be evenly spaced every n characters, or
may vary across the page. However the important thing is that a tab does
not map to a single indentation level in Python: it can map to any number
of indents, and unless I know the convention you are using to display the
tabs I cannot know how many indents are equivalent to a tabstop.
Seriously people, this is about separating the content of a source file
from how it is displayed. It's about letting people work together while
also allowing them to have control over their own environments,
something which is and always has been central to the hacker ethos.

Precisely. Using spaces everywhere allows this, using tabs everywhere
allows this, mixing spaces and tabs is a bad thing. You have to agree a
convention for the project and conform to it. My experience is that 'spaces
only' is more common, but your experience may differ.
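
A project that agrees on one convention can check for the mixed case automatically. This is a rough Python sketch under stated assumptions: `classify_indentation` is a hypothetical helper, it looks only at the leading whitespace of each non-blank line, and it does not understand multi-line strings or continuation lines.

```python
def classify_indentation(source):
    """Report which characters a source text uses for leading indentation.

    Returns "tabs", "spaces", "mixed", or "none".
    """
    kinds = set()
    for line in source.splitlines():
        if not line.strip():
            continue  # blank lines carry no indentation information
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            kinds.add("tabs")
        if " " in indent:
            kinds.add("spaces")
    if not kinds:
        return "none"
    if len(kinds) == 2:
        return "mixed"
    return kinds.pop()


print(classify_indentation("if x:\n\ty = 1\n"))
print(classify_indentation("if x:\n    y = 1\n"))
print(classify_indentation("if x:\n\ty = 1\n    z = 2\n"))
```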
 
Kaz Kylheku

achates said:
Typesetting code is not a helpful activity outside of the publishing
industry.

Be that as it may, code writing involves an element of typesetting. If
you are aligning characters, you are typesetting, however crudely.
You might like the results of your typesetting; I happen not
to. You probably wouldn't like mine. Does that mean we shouldn't work
together? Only if you insist on forcing me to conform to your way of
displaying code.

Someone who insists that everyone should separate line indentation into
tabs which achieve the block level, and spaces that achieve additional
alignment, so that code could be displayed in more than one way based
on the tab size without loss of alignment, is probably a "space cadet",
who has a bizarre agenda unrelated to developing the product.

There is close to zero value in maintaining such a scheme, and
consequently, it's hard to justify with a business case.

Yes, in the real world, you have to conform to someone's way of
formatting and displaying code. That's how it is.

You have to learn to read, write and even like more than one style.
You are correct in pointing out that tabs don't allow for 'alignment'
of the sort you mention:

That alignment has a name: hanging indentation.

All forms of aligning the first character of a line to some requirement
inherited from the previous line are called indentation.

Granted, a portion of that indentation is derived from the nesting
level of some logically enclosing programming language construct, and
part of it may be derived from the position of a character of some
parallel constituent within the construct.
(lisp
 (nested list
         with symbols
         and things))
But then neither does Python. I happen to think that's a feature.

Python has logical line continuation which gives rise to the need for
hanging indents to line up with parallel constituents in a folded
expression.

Python also allows for the possibility of statements separated by
semicolons on one line, which may need to be lined up in columns.

var = 42; foo = 53
x   =  2; y   = 10
(And of course you can do what you like inside a comment. That's
because tabs are for indentation, and indentation is meaningless in that
context.

A comment can contain example code, which contains indentation.

What, I can't change the tab size to display that how I want? Waaah!!!
(;_;)
 
achates

Duncan said:
However the important thing is that a tab does
not map to a single indentation level in Python: it can map to any number
of indents, and unless I know the convention you are using to display the
tabs I cannot know how many indents are equivalent to a tabstop.

Sorry but this is just wrong. Python works out the indentation level
for a source file dynamically: see
http://docs.python.org/ref/indentation.html. The particular algorithm
it uses is designed to accommodate people who mix tabs and spaces
(which is unfortunate, and should probably be changed). Nevertheless,
using tabs only, one tab does indeed map to exactly one indentation
level. One tabstop == one indent, on your editor and on mine. You do
not need to know my display convention to run my code.

All I can suggest is that you try it out: create a short source file
indented with tabs only, and play around with your editor's tabstop
setting (and make sure it is writing tab characters, not spaces, to the
source file). I promise you the Python interpreter will neither know
nor care what your editor display settings were when you last wrote the
file.
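
The suggested experiment can also be run without an editor. The sketch below uses the built-in `compile` on two small strings; the sample sources and the `compiles` helper are made up for illustration. One caveat: the thread dates from the Python 2 era, while this sketch assumes Python 3, which rejects indentation whose meaning would depend on the tab width by raising `TabError` rather than accommodating the mix.

```python
# Indented with tab characters only.
TABS_ONLY = "for item in ['a', 'b']:\n\tcount = len(item)\ntotal = 0\n"
# A tab on one line and eight spaces on the next, inside the same block.
MIXED = "if True:\n\tx = 1\n        y = 2\n"


def compiles(source):
    """Return True if Python accepts the source, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError:  # TabError is a subclass of IndentationError
        return False
    return True


print("tabs only:", compiles(TABS_ONLY))
print("tab then spaces:", compiles(MIXED))
```

Nothing in either string records an editor's tab-stop setting, so the tabs-only result cannot depend on how the file was displayed when it was written.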

I realise that a lot of code out there uses spaces only. That's
unfortunate, but it doesn't mean we should stop explaining to people
why tab-indenting is a better standard. This is about freedom:
indenting with spaces gives you control over how other people view your
code; indenting with tabs gives them that control.
 
Aaron Gray

I was once a religious tabber until working on multiple source code sources,
now I am a religious spacer :)

My 2bits worth,

Aaron
 
