RubyWart: Iterator naming conventions

rantingrick · Sep 5, 2011

I find the naming conventions for iterators lacking. We use the
exclamation and question marks on our methods to convey more
intuitive meanings; however, we seem to have dropped the ball when it
comes to iterators (which take a simple predicate). For instance the
methods:

* select
* reject
* collect
* inject
* detect

Should have had an _if appended to them.

* select_if
* reject_if
* collect_if
* inject_if
* detect_if

The if demands a predicate must follow. numbers.select_if{|x|
predicate}.

Furthermore: the each, each_index, and each_with_index should have
been compiled into one method depending on the local variables in the
block.

rb> [].each{|item| ...}
rb> [].each{|index| ...}
rb> [].each{|item, index| ...}

I find this far more intuitive than having three methods. The fact
that we are iterating over an index, or an item, or enumerating the
collection should NOT be part of the identifier. We can solve this
issue more cleanly and intuitively by looking in the block.

Robert Klemme · Sep 5, 2011

I find the naming conventions for iterators lacking. We use the
exclamation and question marks on our methods to convey more
intuitive meanings; however, we seem to have dropped the ball when it
comes to iterators (which take a simple predicate). For instance the
methods:

* select
* reject
* collect
* inject
* detect

Should have had an _if appended to them.

* select_if
* reject_if
* collect_if
* inject_if

There is no point in having _if since the two above do not accept a predicate.

* detect_if

The if demands a predicate must follow. numbers.select_if{|x|
predicate}.

Nobody stops you from doing

module Enumerable
  def select_if(&b) select(&b) end
  def reject_if(&b) reject(&b) end
  def detect_if(&b) detect(&b) end
end

and using the new methods. Btw, what do you do with #find? I don't
find "find_if" convincing.

Furthermore: the each, each_index, and each_with_index should have
been compiled into one method depending on the local variables in the
block.

rb> [].each{|item| ...}
rb> [].each{|index| ...}
rb> [].each{|item, index| ...}

I find this far more intuitive than having three methods. The fact
that we are iterating over an index, or an item, or enumerating the
collection should NOT be part of the identifier. We can solve this
issue more cleanly and intuitively by looking in the block.

I beg to differ. First, how do you solve the situation of a Hash
iteration (or any other iteration which needs to yield more than one
value)? In "h.each {|a,b| }", is a now an array with key and value and
b the index? Or is a the key and b the value?
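
To make the ambiguity concrete, here is a small sketch of how current Ruby treats the block parameters of a Hash iteration. The hash h and the collecting arrays are made up for illustration.

```ruby
h = { "a" => 1, "b" => 2 }

# Two block parameters: each [key, value] pair is destructured.
pairs = []
h.each { |key, value| pairs << [key, value] }

# One block parameter: the block receives the pair as a two-element array.
whole = []
h.each { |pair| whole << pair }

# With an index, the pair and the index are separate arguments, so the
# pair has to be destructured explicitly with parentheses.
indexed = []
h.each_with_index { |(key, value), index| indexed << [key, value, index] }
```

Under the proposed single #each, the two-parameter form would have to mean either (key, value) or (pair, index), and the parameter list alone cannot say which.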

You also put the burden of detecting this on everybody who wants to
implement an Enumerable class whereas today #each_with_index has a
default implementation in Enumerable which you can simply use.
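
A minimal sketch of that default implementation at work. The class Countdown is a hypothetical example; the only method written by hand is #each, everything else comes from including Enumerable.

```ruby
# Hypothetical collection class used only for illustration.
class Countdown
  include Enumerable

  def initialize(from)
    @from = from
  end

  # The only iteration method the author has to write.
  def each
    @from.downto(1) { |n| yield n }
  end
end

c = Countdown.new(3)

# each_with_index and select are supplied by Enumerable on top of #each.
indexed = []
c.each_with_index { |n, i| indexed << [n, i] }
evens = c.select { |n| n.even? }
```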

Second, what you are proposing is quite similar to the anti-pattern of
providing flags to a method to control its behavior. Rather, one creates
several methods with defined semantics to keep aspects cleanly separated.

Third, it's too late anyway to make this change (or these changes) to
the standard library because these are frequently used methods and there
is a reasonable chance that existing code will be broken.

It's just not worthwhile to try to change such fundamental things.
People usually get used to the names as they are so why bother and
invest efforts to change this?

Kind regards

robert

rantingrick · Sep 5, 2011

Nobody stops you from doing

module Enumerable
  def select_if(&b) select(&b) end
  def reject_if(&b) reject(&b) end
  def detect_if(&b) detect(&b) end
end

And induce lethargy on my code base... never.

and using the new methods. Btw, what do you do with #find? I don't
find "find_if" convincing.

Hmm, one of the problems with being consistent is that the current
naming conventions for Ruby don't allow consistency. Also Ruby has
this mantra of user defined control structures are cool but then we
try to abstract away the control structure, why? The methods "reject",
"collect", and "select" are interchangeable:

rb> [1,2,3,4,5].reject{|x| x>3}
[1, 2, 3]
rb> [1,2,3,4,5].select{|x| x<=3}
[1, 2, 3]

And collect just takes a wee bit more work:

rb> [1,2,3,4,5].collect{|x| x if x < 4}.compact
[1, 2, 3]

It's like saying "A red horse" compared to saying "A horse that is
red" or "Red is the color of that horse". We have not gained any
comprehensive value in any of the statements. They all compile to the
same result.

rb> horse.color == 'red'
true

rb> [].each{|item| ...}
rb> [].each{|index| ...}
rb> [].each{|item, index| ...}

[...]

I beg to differ. First, how do you solve the situation of a Hash
iteration (or any other iteration which needs to yield more than one
value)? In "h.each {|a,b| }", is a now an array with key and value and
b the index? Or is a the key and b the value?

h.each{|key, value| ...}
h.each{|pair| ...}

You also put the burden of detecting this on everybody who wants to
implement an Enumerable class whereas today #each_with_index has a
default implementation in Enumerable which you can simply use.

So why not make my "intuitive each" a separate implementation from the
generic each. Not too difficult.

Second, what you are proposing is quite similar to the anti-pattern of
providing flags to a method to control its behavior. Rather, one creates
several methods with defined semantics to keep aspects cleanly separated.

I dunno, that sounds like opinion to me. If you look at the syntax for
the existing each and my proposed each you'll see mine is far more
intuitive. Plus it has the benefit of removing multiplicity from the
language. It's about keeping control as close to the problem as
possible. Iterating over a collection is the job of the each method.
However the EXACT control of that iteration must be in the hands of
user defined control structure (the block).

Third, it's too late anyway to make this change (or these changes) to
the standard library because these are frequently used methods and there
is a reasonable chance that existing code will be broken.

I understand that point however my intention is not to change Ruby in
such a drastic way. Ruby is too far along to make these changes now.
What i am doing is planting seeds for the future. A future language
that will combine the greatness of Ruby and Python and whatever else
suits productivity, coherency, and simplicity into one.

rantingrick · Sep 5, 2011

Sorry, i want to add one more important argument for iterables.

a.each{|item| ...}
a.each{|index| ...}
a.each{|item, index| ...}
h.each{|key, value| ...}
h.each{|pair| ...}

When we force people to use proper local variables we create clean
code bases. Anyone who would use anything other than "index" and
"item" for enumerating an array is a fool. The five examples i provide
are perfect.

Sure we must allow SOME freedoms in our language however in edge cases
like these we need to seize the golden opportunity and bring order to
the madness. Local loop variables for built-in collection types (and
derivatives of those types) must be set in stone! There is NO reason
to allow freedoms here because the only freedom you will propagate is
slothfulness, laziness, and all such negative attributes of the
selfish human nature. We are all both individually and collectively
responsible for the ills of humanity.

(sorry again for the back to back posting)

Robert Klemme · Sep 6, 2011

and using the new methods. Btw, what do you do with #find? I don't
find "find_if" convincing.

Hmm, one of the problems with being consistent is that the current
naming conventions for Ruby don't allow consistency. Also Ruby has
this mantra of user defined control structures are cool but then we
try to abstract away the control structure, why? The methods "reject",
"collect", and "select" are interchangeable:

rb> [1,2,3,4,5].reject{|x| x>3}
[1, 2, 3]
rb> [1,2,3,4,5].select{|x| x<=3}
[1, 2, 3]

And collect just takes a wee bit more work:

rb> [1,2,3,4,5].collect{|x| x if x< 4}.compact
[1, 2, 3]

I am starting to wonder whether you have understood what #collect (and #map)
are all about. You are bending reality to fit your needs. But the
truth is simply that the block passed to #collect and #map is not a
predicate but a transformation. The fact that you can also use a
predicate (i.e. transform the argument to a truth value) does not mean
that the block is used as a predicate in #collect and #map. Hence there
is no point in renaming to collect_if.

A method reasonably named #collect_if would be a different beast, e.g.

def collect_if
  a = []
  each {|x| e = yield(x) and a << e}
  a
end
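
A short sketch of the difference, with the method above placed in Enumerable purely for illustration (reopening a core module like this is an assumption of the example, not a recommendation). The block handed to #collect is a transformation, so a predicate yields a list of truth values; the hypothetical #collect_if keeps only the truthy results of the transformation.

```ruby
module Enumerable
  # Hypothetical method from the post: collect only truthy block results.
  def collect_if
    a = []
    each { |x| e = yield(x) and a << e }
    a
  end
end

numbers = [1, 2, 3, 4, 5]

# #collect transforms every element, so a predicate gives booleans back.
mapped = numbers.collect { |x| x.odd? }

# #collect_if transforms and drops nil/false results in one pass.
squares_of_odds = numbers.collect_if { |x| x * x if x.odd? }
```

Ruby 2.7 and later ship Enumerable#filter_map, which behaves much like this sketch.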

Don't get me started on #inject.

It's like saying "A red horse" compared to saying "A horse that is
red" or "Red is the color of that horse". We have not gained any
comprehensive value in any of the statements. They all compile to the
same result.

But they are compiled from different individual modules. Modules can be
combined in several ways and it pays off to create them in a way to have
a single task - and so the naming should reflect that and not one
particular use case.

rb> [].each{|item| ...}
rb> [].each{|index| ...}
rb> [].each{|item, index| ...}

[...]

I beg to differ. First, how do you solve the situation of a Hash
iteration (or any other iteration which needs to yield more than one
value)? In "h.each {|a,b| }", is a now an array with key and value and
b the index? Or is a the key and b the value?

h.each{|key, value| ...}
h.each{|pair| ...}

This is not an answer to my question. Did you actually understand the
point of my question?

So why not make my "intuitive each" a separate implementation from the
generic each. Not too difficult.

Correct.

I dunno, that sounds like opinion to me.

Apparently you have never suffered from maintaining code that is filled
with this anti pattern.

If you look at the syntax for
the existing each and my proposed each you'll see mine is far more
intuitive.

I find that totally unintuitive because I would constantly be wondering
how I must define block arguments to make the "intuitive" magic do what
I need.

Plus it has the benefit of removing multiplicity from the
language.

Reducing the number of methods is not a goal in itself. Apparently most
people working with Ruby these days do not take issue with this one
extra method #each_with_index (other than maybe that it's a lot of
typing compared to #each).

It's about keeping control as close to the problem as
possible. Iterating over a collection is the job of the each method.

Exactly that - and nothing more.

However the EXACT control of that iteration must be in the hands of
user defined control structure (the block).

I am sorry, but this is nonsense: the whole point about iterating with
anonymous blocks (as opposed to external iterators like in Java) is that
the iteration is under _complete control_ of the method implementation
and not spread across client code and method implementation. That way
the burden of boilerplate is taken from the client of the iteration and
he only needs to define what he wants to do with elements. The fact
that Java is moving in exactly this direction indicates that a lot of
other people feel the same way.

The only control that the client exerts over iteration once it's started
is premature termination (via break, throw, raise or return). But
that's it.
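
A small sketch of two of those exits, using made-up data. The method first_negative is a hypothetical helper for illustration.

```ruby
visited = []

# break ends the iteration early; its argument becomes the value of #each.
result = [1, 2, 3, 4, 5].each do |x|
  break x if x > 2
  visited << x
end

# return inside the block leaves the enclosing method, ending the iteration.
def first_negative(numbers)
  numbers.each { |n| return n if n < 0 }
  nil
end

found = first_negative([3, -1, -5])
none  = first_negative([1, 2])
```

In both cases the block only decides when to stop; how elements are fetched and in which order stays inside #each.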

I understand that point however my intention is not to change Ruby in
such a drastic way. Ruby is too far along to make these changes now.
What i am doing is planting seeds for the future. A future language
that will combine the greatness of Ruby and Python and whatever else
suits productivity, coherency, and simplicity into one.

Oh dear. Good luck!

Cheers

robert

rantingrick · Sep 6, 2011

Okay, let's concentrate on just one of my proposals. The idea of
making local variables for loops over internal collection types
concrete. Here were my examples:

a.each{|item| ...}
a.each{|index| ...}
a.each{|item, index| ...}
h.each{|key, value| ...}
h.each{|pair| ...}

Later you said...

Apparently you have never suffered from maintaining code that is filled
with this anti pattern.

I find that totally unintuitive because I would constantly be wondering
how I must define block arguments to make the "intuitive" magic do what
I need.

Okay you say this won't work. And you say you've had to maintain code
like this (i find that hard to believe) so show me an example of how
using PROPER local variables is going to hurt you (or me).

Robert Klemme · Sep 11, 2011

Okay, let's concentrate on just one of my proposals. The idea of
making local variables for loops over internal collection types
concrete. Here were my examples:

a.each{|item| ...}
a.each{|index| ...}
a.each{|item, index| ...}
h.each{|key, value| ...}
h.each{|pair| ...}

Not sure what you mean by "concrete". It seems like you want to have
special names for arguments which automagically have special meaning.
Apart from the fact that this cannot be done without changing Ruby to
provide argument names at runtime you must be aware that this creates a
new dependency: in most languages argument names are chosen by the
author of a function or method and the naming (ideally) reflects the
usage of parameters in the function. Now you want to have names which
also control the behavior of the _caller_ (method #each in your case).
I am not sure this is such a great idea.
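
For illustration, a sketch of what such name-based dispatch could look like, assuming a Ruby version that provides Proc#parameters (1.9.2 or later, as far as I am aware). The module NamedEach and the method named_each are hypothetical names made up for this example and are not part of Ruby.

```ruby
# Hypothetical helper: picks the iteration style from block parameter names.
module NamedEach
  def named_each(&block)
    names = block.parameters.map { |_kind, name| name }
    case names
    when [:item]         then each(&block)
    when [:index]        then each_index(&block)
    when [:item, :index] then each_with_index(&block)
    else raise ArgumentError, "unsupported block parameters: #{names.inspect}"
    end
  end
end

a = [10, 20].extend(NamedEach)

items = []
a.named_each { |item| items << item }

indexes = []
a.named_each { |index| indexes << index }

# Renaming the parameter changes what the caller does, here into an error.
error = nil
begin
  a.named_each { |x| items << x }
rescue ArgumentError => e
  error = e.message
end
```

The last call shows the new dependency: a harmless rename of a block argument alters the behavior of the iterating method.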

Later you said...

Okay you say this won't work.

I said "I find that totally unintuitive".

And you say you've had to maintain code
like this (i find that hard to believe) so show me an example of how
using PROPER local variables is going to hurt you (or me).

Please be careful with quoting. My statement referred to the anti
pattern of using method arguments to control behavior. Example

def calc(sum)
  if sum
    s = 0
    each {|x| s += x}
    s
  else
    p = 1
    each {|x| p *= x}
    p
  end
end

vs.

def sum
  s = 0
  each {|x| s += x}
  s
end

def product
  p = 1
  each {|x| p *= x}
  p
end

In the first case if you want to add another calculation one might be
tempted to introduce a new flag (or change the existing one to an enum)
which has consequences for all code using this.

In the second case you just add a method for the new calculation and are
done. Also note how methods in the second example are shorter. That
effect is not dramatic in this case but you can easily imagine how that
changes for more complex algorithms. I have seen such code but I won't
publish it (legal reasons).

Cheers

robert
