For Loops and Variables

Jason Cavett

This was a discussion my co-worker and I had a while back, and I was
curious what the group thought.

for(int i=0; i < someValue; i++) {
// do stuff
}

The discussion was that the use of the variable "i" was not good as it
should be named something that means something (arrayCounter or
whatever) so it is easier to understand the code. The reverse
argument was that using the "i" variable as a counter was an okay
practice and it is alright as long as it is the only variable doing
this (all other variables are well named). Convenience of typing "i"
versus "someLongVariableName" was the argument on this side.

I'm just curious what everybody here thinks about this. Are there any
standards on this topic?

P.S. I purposely didn't state what my position was.
P.P.S. Both parties agreed that meaningful variables were a good thing
overall.
 
Gordon Beaton

The discussion was that the use of the variable "i" was not good as
it should be named something that means something (arrayCounter or
whatever) so it is easier to understand the code.

For 50 years i, j and sometimes k have meant basically "loop index".
It's easy to spell and easy to understand. Using long names for index
variables is common among inexperienced programmers.

Often the loop and index variable(s) are used to traverse an array or
other structure, and the long expressions that result from wordy index
variables are *not* necessarily easier to read, especially when you
are iterating over a 2D (or more) structure.

Something like "cell[i][j]" is clear to any reader who understands
arrays. In some cases short meaningful names like "row" or "col" are
ok.
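
To make the readability point concrete, here is a minimal sketch that sums a 2D array twice, once with conventional short indices and once with wordy ones. The class name GridSum, the method names, and the wordy index names are made up for illustration; they do not come from anyone's real code.

```java
class GridSum {
    // Conventional short indices: the subscript expression stays compact.
    static int sumShort(int[][] cell) {
        int total = 0;
        for (int i = 0; i < cell.length; i++) {
            for (int j = 0; j < cell[i].length; j++) {
                total += cell[i][j];
            }
        }
        return total;
    }

    // Same traversal with wordy (hypothetical) index names; behavior is identical.
    static int sumWordy(int[][] cell) {
        int total = 0;
        for (int outerArrayCounter = 0; outerArrayCounter < cell.length; outerArrayCounter++) {
            for (int innerArrayCounter = 0;
                    innerArrayCounter < cell[outerArrayCounter].length;
                    innerArrayCounter++) {
                total += cell[outerArrayCounter][innerArrayCounter];
            }
        }
        return total;
    }

    public static void main(String[] args) {
        int[][] cell = { {1, 2}, {3, 4} };
        System.out.println(sumShort(cell) + " " + sumWordy(cell));
    }
}
```

Both methods compute the same result; the only difference is how much of each line is taken up by the index names.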

/gordon

 
Oliver Wong

Jason Cavett said:
This was a discussion my co-worker and I had awhile back and I was
curious what the group thought.

for(int i=0; i < someValue; i++) {
// do stuff
}

The discussion was that the use of the variable "i" was not good as it
should be named something that means something (arrayCounter or
whatever) so it is easier to understand the code. The reverse
argument was that using the "i" variable as a counter was an okay
practice and it is alright as long as it is the only variable doing
this (all other variables are well named). Convenience of typing "i"
versus "someLongVariableName" was the argument on this side.

I'm just curious what everybody here thinks about this. Are there any
standards on this topic?

There are lots of standards on this topic. The tricky part is choosing
which standard you and your co-worker are going to adopt.

FWIW, I will use "i" (or "j", or "k", etc.) as my for-loop counter
variable.

- Oliver
 
Eric Sosman

Jason Cavett wrote On 04/05/07 11:47,:
This was a discussion my co-worker and I had awhile back and I was
curious what the group thought.

for(int i=0; i < someValue; i++) {
// do stuff
}

The discussion was that the use of the variable "i" was not good as it
should be named something that means something (arrayCounter or
whatever) so it is easier to understand the code. The reverse
argument was that using the "i" variable as a counter was an okay
practice and it is alright as long as it is the only variable doing
this (all other variables are well named). Convenience of typing "i"
versus "someLongVariableName" was the argument on this side.

I'm just curious what everybody here thinks about this. Are there any
standards on this topic?

P.S. I purposely didn't state what my position was.
P.P.S. Both parties agreed that meaningful variables were a good thing
overall.

There are two kinds of consumers of source code: compilers
and people. Compilers have a good memory for detail, and don't
care what names you use: If all the variables were named l
and l1 and ll and l1l and ll1 they'd be perfectly content. So
the selection of variable names should be motivated solely by
the rather different cognitive needs of flesh-and-blood readers.

When I myself read code, it seems easier to read short names
than long ones. On the other hand, short names can be cryptic;
my reading is not aided if I need to keep interrupting it to go
look for a comment on a variable declaration a hundred lines
distant. I've found that a pretty reasonable balance is struck
if I let the length of a variable name depend on the "size" of
its scope. I'll use short names for variables that are declared,
used, and abandoned in a "small" region of code, and longer names
for variables with more "staying power" whose visibility and
significance extend over wider spans.

(Corollary: method and field names have significance
throughout an entire class and sometimes even beyond it,
hence their names should always be descriptive.)

So, what category fits loop indices? Well, it depends on
the loop -- that is, on how "big" the `// do stuff' is. For
half a dozen lines `i' is fine; for half a hundred you'd want
`fragmentIndex'; for half a thousand consider refactoring.

It works for me. YMMV.
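
One possible reading of the scope rule, as a small sketch rather than anyone's actual code: the field and the methods are visible class-wide and get descriptive names, the throwaway loop index is 'i', and the index that outlives its loop gets a longer name. The class FragmentStats and all of its members are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class FragmentStats {
    // Field: visible across the whole class, so it gets a descriptive name.
    private final List<String> fragments;

    FragmentStats(List<String> fragments) {
        this.fragments = fragments;
    }

    // Method names are visible even beyond the class, so they are descriptive too.
    int totalLength() {
        int total = 0;
        // The index is declared, used, and abandoned within a few lines: 'i' is enough.
        for (int i = 0; i < fragments.size(); i++) {
            total += fragments.get(i).length();
        }
        return total;
    }

    // This index outlives the loop and is handed back to the caller,
    // so a longer name says what it refers to. Returns -1 for an empty list.
    int indexOfLongestFragment() {
        int longestFragmentIndex = -1;
        int longestLength = -1;
        for (int i = 0; i < fragments.size(); i++) {
            if (fragments.get(i).length() > longestLength) {
                longestLength = fragments.get(i).length();
                longestFragmentIndex = i;
            }
        }
        return longestFragmentIndex;
    }

    public static void main(String[] args) {
        FragmentStats stats = new FragmentStats(Arrays.asList("ab", "cdef", "g"));
        System.out.println(stats.totalLength() + " " + stats.indexOfLongestFragment());
    }
}
```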
 
Jason Cavett

Eric Sosman said:
Jason Cavett wrote On 04/05/07 11:47,: [snip]

There are two kinds of consumers of source code: compilers
and people. Compilers have a good memory for detail, and don't
care what names you use: If all the variables were named l
and l1 and ll and l1l and ll1 they'd be perfectly content. So
the selection of variable names should be motivated solely by
the rather different cognitive needs of flesh-and-blood readers.

When I myself read code, it seems easier to read short names
than long ones. On the other hand, short names can be cryptic;
my reading is not aided if I need to keep interrupting it to go
look for a comment on a variable declaration a hundred lines
distant. I've found that a pretty reasonable balance is struck
if I let the length of a variable name depend on the "size" of
its scope. I'll use short names for variables that are declared,
used, and abandoned in a "small" region of code, and longer names
for variables with more "staying power" whose visibility and
significance extend over wider spans.

(Corollary: method and field names have significance
throughout an entire class and sometimes even beyond it,
hence their names should always be descriptive.)

So, what category fits loop indices? Well, it depends on
the loop -- that is, on how "big" the `// do stuff' is. For
half a dozen lines `i' is fine; for half a hundred you'd want
`fragmentIndex'; for half a thousand consider refactoring.

It works for me. YMMV.

Yeah, I can understand the scope thing.

Fortunately, the code being discussed IS being refactored. That's how
the whole discussion came up in the first place. (And, because of
this, the for loops are short and sweet. I don't think there's one yet
that is longer than 50 lines - including brackets and such.)

Heh...brackets on separate lines or on the same line as a method/loop/
whatever - that's a whole other can of beans. :p
 
Lew

As Oliver pointed out, 'i', 'j' and 'k' have been enshrined as loop index
names since the 1960s. Anyone who doesn't feel comfortable with that deserves
a job whose main function is asking, "Would you like fries with that?"

That said, I add the letter 'x' (for "index") to these just because I assess
that one-character variables are a tad too easy to lose track of in the source
text, thus 'ix', 'jx' and 'kx', respectively. I feel this makes them stand
out better while still not straying too far from tradition.

It is also popular in Java to name the index variable 'index'.
 
Gordon Beaton

I've found that a pretty reasonable balance is struck if I let the
length of a variable name depend on the "size" of its scope.

I second this. The same idea is also mentioned in Rob Pike's "Notes on
C Programming".

/gordon

 
Chris Uppal

Jason said:
The discussion was that the use of the variable "i" was not good as it
should be named something that means something (arrayCounter or
whatever) so it is easier to understand the code.

Like the others, I see nothing wrong with using i for a loop index, and good
reasons to prefer that common idiom in many cases (the fact that it /is/ a
common idiom not least amongst them).

It might be worth saying a bit about /why/ it is acceptable, although it
appears to break the general rule about preferring communicative names.

Somewhat less than half the point is that there is no real need for long
identifiers here -- that's to say that a (very) short identifier is just as
communicative as a long one in this context. And -- all things being equal --
short identifiers are to be preferred over long ones. (The qualification, "all
things being equal" is, of course, /vital/) In any sensible code, the array
index's scope is limited to the loop over the array -- which is a very limited
scope, so the amount of information the identifier must convey is similarly
limited. There is nothing for it to say that is not immediately evident from
the context in which it appears, simply because it only /exists/ in that
immediate context.

The bigger part of the point is that there (usually) is nothing much for the
identifier to communicate /anyway/. It is an array index, and once you've said
that, there isn't (usually) anything else to add. The meaning (usually) lies
in the thing found in the array, at the index, not in the index itself.

The repeated "(usually)" in the above are because of occasional exceptions to
the rule. Perhaps the most common (which Gordon has already touched on) is
when you are considering row/column indexes in 2D arrays. In such cases it can
be sensible to keep it clear whether an index varies over rows or columns.
That is especially true if you are also messing with X/Y coordinates in the
same code (such as painting a grid), since the normal English word order for
rows and columns ("row" then "column" ;-) is inconsistent with that for "x" and
"y" -- which can easily turn the code into a nightmare unless you take steps to
make the relationships explicit. There are other examples, but that one is
probably sufficient...
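
A rough sketch of the row/column versus x/y trap, assuming a hypothetical grid with fixed cell sizes (GridLayout, CELL_WIDTH, CELL_HEIGHT, xOf and yOf are all invented for the example). Naming the indices "row" and "col" and routing the conversion through two small helpers keeps the swap in one visible place: the array is indexed row first, while the coordinate pair is written x first.

```java
class GridLayout {
    static final int CELL_WIDTH = 20;   // hypothetical cell width in pixels
    static final int CELL_HEIGHT = 10;  // hypothetical cell height in pixels

    // Columns run horizontally, so the column index determines x.
    static int xOf(int col) {
        return col * CELL_WIDTH;
    }

    // Rows run vertically, so the row index determines y.
    static int yOf(int row) {
        return row * CELL_HEIGHT;
    }

    // Lists each cell as cell[row][col] together with its (x, y) position.
    static String describe(int rows, int cols) {
        StringBuilder out = new StringBuilder();
        for (int row = 0; row < rows; row++) {
            for (int col = 0; col < cols; col++) {
                out.append("cell[").append(row).append("][").append(col).append("] at (")
                   .append(xOf(col)).append(", ").append(yOf(row)).append(")\n");
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(2, 2));
    }
}
```

With 'i' and 'j' in place of 'row' and 'col', a call such as xOf(i) would compile just as happily and be much harder to spot as wrong.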

-- chris
 
Patricia Shanahan

Eric said:
Jason Cavett wrote On 04/05/07 11:47,:

I think the focus should be on convenience of reading, not convenience
of typing, but agree with the conclusion. I also think that if i is a
simple loop index, it is extremely well named.
....
When I myself read code, it seems easier to read short names
than long ones. On the other hand, short names can be cryptic;
my reading is not aided if I need to keep interrupting it to go
look for a comment on a variable declaration a hundred lines
distant. I've found that a pretty reasonable balance is struck
if I let the length of a variable name depend on the "size" of
its scope. I'll use short names for variables that are declared,
used, and abandoned in a "small" region of code, and longer names
for variables with more "staying power" whose visibility and
significance extend over wider spans.

I consider two things, size of scope and whether a longer identifier
would add significantly to the clarity of the code. Even in a small
scope, I prefer a slightly longer identifier if I have something worth
saying about the meaning of the variable.

For example, to find the dot product of two arrays:

double dotProduct = 0;
for (int i = 0; i < input1.length; i++) {
    dotProduct += input1[i] * input2[i];
}
// use dotProduct.

There really isn't anything worth saying about i that is not already
said by how it is declared, and its scope is so short that any reader is
going to have that declaration in front of them when looking at a use.
There are many purposes that could lead to "double xxx = 0;", so even in
a short scope, it is worth using the identifier to indicate the purpose
of this particular initially-zero double.

Patricia
 
Daniel Pitts

Jason Cavett wrote On 04/05/07 11:47,: [snip]
So, what category fits loop indices? Well, it depends on
the loop -- that is, on how "big" the `// do stuff' is. For
half a dozen lines `i' is fine; for half a hundred you'd want
`fragmentIndex'; for half a thousand consider refactoring.

It works for me. YMMV.

Personally, I use 'i' for unnested loops. For iterators, I often use
'iter'. For loops longer than a dozen or so lines, I try to extract a
method and give the index a meaningful name. If, however, the only
meaning that can be attributed to the variable is "index", I stick
with "i".
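
The three habits described above might look roughly like this. It is only a sketch: NamingHabits and its methods are invented for illustration, and "month" is just one example of an index that carries meaning beyond "index".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class NamingHabits {
    // Unnested loop whose variable means nothing more than "index": plain 'i'.
    static int sum(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Iterator loop: 'iter'. Removes blank or whitespace-only lines in place.
    static void removeBlank(List<String> lines) {
        for (Iterator<String> iter = lines.iterator(); iter.hasNext(); ) {
            if (iter.next().trim().isEmpty()) {
                iter.remove();
            }
        }
    }

    // Extracted method where the index does have a meaning, so it is named for it.
    // Returns the first month whose total is below the threshold, or -1 if none is.
    static int firstMonthBelow(int[] monthlyTotals, int threshold) {
        for (int month = 0; month < monthlyTotals.length; month++) {
            if (monthlyTotals[month] < threshold) {
                return month;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>(Arrays.asList("a", " ", "b", ""));
        removeBlank(lines);
        System.out.println(lines + " " + sum(new int[] {1, 2, 3}) + " "
                + firstMonthBelow(new int[] {5, 7, 2, 9}, 3));
    }
}
```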
 
Knute Johnson

Daniel said:
Personally, I use 'i', for unnested loops. For iterators, I often use
iter. For loops longer than a dozen or so lines, I try to extract a
method, and give the index a meaningful name. If, however, the only
meaning that can be attributed to the variable is "index", I stick
with "i"

My first high level language was Fortran. I don't remember much of my
Fortran but I think we used i through m or n because they were integers
by default. I've just used i ever since because of that. It is
interesting how much is just tradition.
 
Patricia Shanahan

Knute said:
My first high level language was Fortran. I don't remember much of my
Fortran but I think we used i through m or n because they were integers
by default. I've just used i ever since because of that. It is
interesting how much is just tradition.

Looking at this the other way round, why were i through m the initial
letters that make a Fortran identifier integer by default?

The oldest Fortran document I was able to find on-line,
http://community.computerhistory.org/scc/projects/FORTRAN/BackusEtAl-Preliminary Report-1954.pdf,
the 1954 Preliminary Report, defines a fixed-point variable as "a
sequence of 1 or 2 alphabetic or numeric characters, the first one of
which is one of the following: i, j, k, l, m, n".

I suspect that those letters were already preferred as subscripts in the
formulas that Fortran was supposed to represent.

Patricia
 
Eric Sosman

Patricia said:
Knute said:
My first high level language was Fortran. I don't remember much of my
Fortran but I think we used i through m or n because they were
integers by default. I've just used i ever since because of that. It
is interesting how much is just tradition.

Looking at this the other way round, why were i through m the initial
letters that make a Fortran identifier integer by default?
> [...]

It was I through N, suggestive of INteger.
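
For anyone who never met it, the default rule can be written as a one-line check. This is a Java sketch of the classic I-through-N convention as commonly described (names starting with I through N default to integer, everything else to real), not a quotation from any Fortran standard; ImplicitTyping and isImplicitInteger are made-up names.

```java
class ImplicitTyping {
    // Returns true when a name would default to integer under the I..N rule.
    // Case is ignored because classic Fortran source was not case-sensitive.
    static boolean isImplicitInteger(String name) {
        if (name.isEmpty()) {
            throw new IllegalArgumentException("empty name");
        }
        char first = Character.toUpperCase(name.charAt(0));
        return first >= 'I' && first <= 'N';
    }

    public static void main(String[] args) {
        for (String name : new String[] {"i", "n", "x", "count"}) {
            System.out.println(name + ": " + (isImplicitInteger(name) ? "integer" : "real"));
        }
    }
}
```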
 
Stefan Ram

Patricia Shanahan said:
Looking at this the other way round, why were i through m the initial
letters that make a Fortran identifier integer by default?

This might be derived from the usage of those letters in
mathematics, where one-letter identifiers are indeed preferred.

They have more letters in mathematics, because they also use
Greek letters in mathematics, but then, with Unicode, in Java,
we have Greek letters, too!

In physics, one prefers to write

F = ma

instead of

force = mass · acceleration

In early FORTRAN, identifiers had to be short because memory
was precious in those days. But for a physicist writing formulas
on paper this was not a restriction. Still, they preferred the
one-letter names.

In mathematics, a two-letter name, like »up« (typeset in
italics), within a term would be read as the product »u·p«.
Only functions might have multi-letter names, like »sin«
(typeset in non-italics). Thus, the usage of single-letter
identifiers is deeply rooted in mathematical notation.
 
Knute Johnson

Patricia said:
I suspect that those letters were already preferred as subscripts in the
formulas that Fortran was supposed to represent.

Patricia

You are probably right about that. 1954 does predate my programming
experience, I started in 1971. I don't remember the fellow's name now
but one of my teachers in college had written one of the original
Fortran compilers. I'm sure that I thought it was ancient history at
the time :).
 
Stefan Ram

This might be derived from the usage of those letters in
mathematics, where one-letter identifiers are indeed preferred.

While looking for evidence to support the claim that, in
mathematics, the letters »i«, »j«, »k«, »l«, »m«, and »n« are
used for integers, I found

http://members.aol.com/jeff570/variables.html
http://members.aol.com/jeff570/mathsym.html
http://members.aol.com/jeff570/

These pages are surely worth reading for anyone interested
in names and notation, although I found no support for my claims there.

In 1918, Dedekind might have used »n« and »m« for natural numbers.

I thought of uses such as

»For 1 <= i <= m and 1 <= j <= n, we use (i, j) to denote
the cell at the intersection of row i and column j, and we
refer to the symbol contained in that cell by A(i, j).«

http://www.cs.berkeley.edu/~etesami/transversal.pdf

But I cannot find a source right now for such usage predating FORTRAN.
 
John W. Kennedy

Patricia said:
the 1954 Preliminary Report, defines a fixed-point variable as "a
sequence of 1 or 2 alphabetic or numeric characters, the first one of
which is one of the following: i, j, k, l, m, n".

I suspect that those letters were already preferred as subscripts in the
formulas that Fortran was supposed to represent.

Yes. Also, in 1965, I was taught to remember I-N(teger).

--
John W. Kennedy
"Those in the seat of power oft forget their failings and seek only the
obeisance of others! Thus is bad government born! Hold in your heart
that you and the people are one, human beings all, and good government
shall arise of its own accord! Such is the path of virtue!"
-- Kazuo Koike. "Lone Wolf and Cub: Thirteen Strings" (tr. Dana Lewis)
* TagZilla 0.066 * http://tagzilla.mozdev.org
 
Martin Gregorie

Stefan said:
While looking for evidence to support the claim that, in
mathematics, the letters »i«, »j«, »k«, »l«, »m«, and »n« are
used for integers, I found

http://members.aol.com/jeff570/variables.html
http://members.aol.com/jeff570/mathsym.html
http://members.aol.com/jeff570/

These pages are surely worth reading for anyone interested
in names and notation, although I found no support for my claims there.

In 1918, Dedekind might have used »n« and »m« for natural numbers.

I thought of uses such as

»For 1 <= i <= m and 1 <= j <= n, we use (i, j) to denote
the cell at the intersection of row i and column j, and we
refer to the symbol contained in that cell by A(i, j).«

http://www.cs.berkeley.edu/~etesami/transversal.pdf

But I cannot find a source right now for such usage predating FORTRAN.

Look at the notation commonly used with the Sigma summation symbol (like
a capital M lying on its left side) and I think you'll see what you're
looking for.

i,j,k are often subscripts in the expression being summed and n is often
used to represent the upper limit.
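
For readers who want to see the notation being described, a typical single and double sum look like this. The symbols a and A are placeholders chosen for the illustration, not taken from any particular source.

```latex
\[
  \sum_{i=1}^{n} a_i
  \qquad\text{and}\qquad
  \sum_{i=1}^{m} \sum_{j=1}^{n} A_{i,j}
\]
```

The subscripts i and j play exactly the role of loop indices, and the upper limits m and n play the role of the loop bounds.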
 
Oliver Wong

Stefan Ram said:
This might be derived from the usage of those letters in
mathematics, where one-letter identifiers are indeed preferred.

They have more letters in mathematics, because they also use
Greek letters in mathematics, but then, with Unicode, in Java,
we have Greek letters, too!

If you find Fortran, Java, Unicode, etc. interesting, you'll probably also be
interested to read about Fortress:
http://en.wikipedia.org/wiki/Fortress_(programming_language)

- Oliver
 
