RubyWart: Iterator naming conventions

Discussion in 'Ruby' started by rantingrick, Sep 5, 2011.

  1. rantingrick

    rantingrick Guest

    I find the naming conventions for iterators lacking. We use
    exclamation and question marks on our methods to convey more
    intuitive meanings; however, we seem to have dropped the ball when it
    comes to iterators (which take a simple predicate). For instance, the
    methods:

    * select
    * reject
    * collect
    * inject
    * detect

    Should have had an _if appended to them.

    * select_if
    * reject_if
    * collect_if
    * inject_if
    * detect_if

    The _if signals that a predicate must follow:
    numbers.select_if{|x| predicate}.

    Furthermore: each, each_index, and each_with_index should have
    been combined into one method whose behavior depends on the names of
    the block's parameters.

    rb> [].each{|item| ...}
    rb> [].each{|index| ...}
    rb> [].each{|item, index| ...}

    I find this far more intuitive than having three methods. The fact
    that we are iterating over an index, or an item, or enumerating the
    collection should NOT be part of the identifier. We can solve this
    issue more cleanly and intuitively by looking in the block.
     
    rantingrick, Sep 5, 2011
    #1

  2. There is no point in having _if since the two above do accept a predicate.
    Nobody stops you from doing

    module Enumerable
      def select_if(&b) select(&b) end
      def reject_if(&b) reject(&b) end
      def detect_if(&b) detect(&b) end
    end

    and using the new methods. Btw, what do you do with #find? I don't
    find "find_if" convincing.
    I beg to differ. First, how do you solve the situation of a Hash
    iteration (or any other iteration which needs to yield more than one
    value)? For "h.each {|a,b| }", is a now an array with key and value and
    b the index? Or is a the key and b the value?

    You also put the burden of detecting this on everybody who wants to
    implement an Enumerable class, whereas today #each_with_index has a
    default implementation in Enumerable which you can simply use.
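
    A minimal sketch of that last point, assuming a hypothetical NumberBag
    class invented here for illustration: defining #each and including
    Enumerable is enough to get #each_with_index, #select and the rest
    without writing them.

```ruby
# Hypothetical collection class, used only for illustration.
class NumberBag
  include Enumerable

  def initialize(*items)
    @items = items
  end

  # The only method Enumerable requires from the implementer.
  def each
    @items.each { |item| yield item }
    self
  end
end

bag = NumberBag.new(10, 20, 30)

# #each_with_index comes from Enumerable; NumberBag never defines it.
pairs = []
bag.each_with_index { |item, index| pairs << [item, index] }

# The same holds for #select and the other Enumerable methods.
big = bag.select { |x| x > 15 }
```

    Under the proposed scheme every such class would have to inspect the
    block itself before deciding what to yield.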

    Second, what you are proposing is quite similar to the anti-pattern of
    providing flags to a method to control its behavior. Instead, one creates
    several methods with defined semantics to keep aspects cleanly separated.

    Third, it's too late anyway to make this change (or these changes) to
    the standard library because these are frequently used methods and there
    is a reasonable chance that existing code will be broken.

    It's just not worthwhile to try to change such fundamental things.
    People usually get used to the names as they are so why bother and
    invest efforts to change this?

    Kind regards

    robert
     
    Robert Klemme, Sep 5, 2011
    #2

  3. rantingrick

    rantingrick Guest

    And induce lethargy on my code base... never. :)
    Hmm, one of the problems with being consistent is that the current
    naming conventions for Ruby don't allow consistency. Also, Ruby has
    this mantra that user-defined control structures are cool, but then we
    try to abstract away the control structure. Why? The methods "reject",
    "collect", and "select" are interchangeable:

    rb> [1,2,3,4,5].reject{|x| x>3}
    [1, 2, 3]
    rb> [1,2,3,4,5].select{|x| x<=3}
    [1, 2, 3]

    And collect just takes a wee bit more work:

    rb> [1,2,3,4,5].collect{|x| x if x < 4}.compact
    [1, 2, 3]

    It's like saying "A red horse" compared to saying "A horse that is
    red" or "Red is the color of that horse". We have not gained any
    comprehensive value in any of the statements. They all compile to the
    same result.

    rb> horse.color == 'red'
    true
    h.each{|key, value| ...}
    h.each{|pair| ...}
    So why not make my "intuitive each" a separate implementation from the
    generic each. Not too difficult.
    I dunno, that sounds like opinion to me. If you look at the syntax for
    the existing each and my proposed each you'll see mine is far more
    intuitive. Plus it has the benefit of removing multiplicity from the
    language. It's about keeping control as close to the problem as
    possible. Iterating over a collection is the job of the each method.
    However the EXACT control of that iteration must be in the hands of
    user defined control structure (the block).
    I understand that point; however, my intention is not to change Ruby in
    such a drastic way. Ruby is too far along to make these changes now.
    What i am doing is planting seeds for the future. A future language
    that will combine the greatness of Ruby and Python and whatever else
    suits productivity, coherency, and simplicity into one.
     
    rantingrick, Sep 5, 2011
    #3
  4. rantingrick

    rantingrick Guest

    Sorry, i want to add one more important argument for iterables.

    a.each{|item| ...}
    a.each{|index| ...}
    a.each{|item, index| ...}
    h.each{|key, value| ...}
    h.each{|pair| ...}

    When we force people to use proper local variables we create clean
    code bases. Anyone who would use anything other than "index" and
    "item" for enumerating an array is a fool. The five examples i provide
    are perfect.

    Sure we must allow SOME freedoms in our language however in edge cases
    like these we need to seize the golden opportunity and bring order to
    the madness. Local loop variables for built-in collection types (and
    derivatives of those types) must be set in stone! There is NO reason
    to allow freedoms here because the only freedom you will propagate is
    slothfulness, laziness, and all such negative attributes of the
    selfish human nature. We are all both individually and collectively
    responsible for the ills of humanity.

    (sorry again for the back to back posting)
     
    rantingrick, Sep 5, 2011
    #4
  5. I am starting to wonder whether you have understood what #collect (and
    #map) are all about. You are bending reality to fit your needs. But the
    truth is simply that the block passed to #collect and #map is not a
    predicate but a transformation. The fact that you can also use a
    predicate (i.e. transform the argument to a truth value) does not mean
    that the block is used as a predicate in #collect and #map. Hence there
    is no point in renaming to collect_if.
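
    A small sketch of the difference, using only core Array methods: a
    predicate handed to #collect is treated as a transformation, so the
    result holds truth values rather than the matching elements.

```ruby
numbers = [1, 2, 3, 4, 5]

# #collect stores whatever the block returns, here true or false.
flags = numbers.collect { |x| x > 3 }

# #select uses the block's result as a yes/no decision per element.
picked = numbers.select { |x| x > 3 }

# With a real transformation #collect keeps every element, transformed.
squares = numbers.collect { |x| x * x }
```

    Here flags has five entries (one truth value per element) while picked
    has only the two elements that passed the test.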

    A method reasonably named #collect_if would be a different beast, e.g.

    def collect_if
      a = []
      each {|x| e = yield(x) and a << e}
      a
    end

    Don't get me started on #inject.
    But they are compiled from different individual modules. Modules can be
    combined in several ways and it pays off to create them so that each has
    a single task - and so the naming should reflect that and not one
    particular use case.
    This is not an answer to my question. Did you actually understand the
    point of my question?
    Apparently you have never suffered from maintaining code that is filled
    with this anti-pattern.
    I find that totally unintuitive because I would constantly be wondering
    how I must define block arguments to make the "intuitive" magic do what
    I need.
    Reducing the number of methods is not a goal in itself. Apparently most
    people working with Ruby these days do not take issue with this one
    extra method #each_with_index (other than maybe that it's a lot of
    typing compared to #each).
    Exactly that - and nothing more.
    I am sorry, but this is nonsense: the whole point about iterating with
    anonymous blocks (as opposed to external iterators like in Java) is that
    the iteration is under _complete control_ of the method implementation
    and not spread across client code and method implementation. That way
    the burden of boilerplate is taken from the client of the iteration and
    he only needs to define what he wants to do with elements. The fact
    that Java is moving in exactly this direction indicates that a lot of
    other people feel the same way.

    The only control that the client exerts over iteration once it's started
    is premature termination (via break, throw, raise or return). But
    that's it.
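
    A short sketch of that premature termination, using plain Array#each
    and a hypothetical helper method first_negative:

```ruby
seen = []

# break stops the iteration; its argument becomes the value of the call.
result = [1, 2, 3, 4, 5].each do |x|
  break x if x > 2
  seen << x
end

# Hypothetical helper: return inside the block leaves the enclosing
# method as soon as a match is found.
def first_negative(list)
  list.each { |x| return x if x < 0 }
  nil
end

found = first_negative([3, -1, -5])
missing = first_negative([1, 2])
```

    In both cases the block only decides when to stop; how elements are
    visited stays entirely inside #each.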
    Oh dear. Good luck!

    Cheers

    robert
     
    Robert Klemme, Sep 6, 2011
    #5
  6. rantingrick

    rantingrick Guest

    Okay, let's concentrate on just one of my proposals. The idea of
    making local variables for loops over internal collection types
    concrete. Here were my examples:

    a.each{|item| ...}
    a.each{|index| ...}
    a.each{|item, index| ...}
    h.each{|key, value| ...}
    h.each{|pair| ...}

    Later you said...

    Okay you say this won't work. And you say you've had to maintain code
    like this (i find that hard to believe) so show me an example of how
    using PROPER local variables is going to hurt you (or me).
     
    rantingrick, Sep 6, 2011
    #6
  7. Not sure what you mean by "concrete". It seems like you want to have
    special names for arguments which automagically have special meaning.
    Apart from the fact that this cannot be done without changing Ruby to
    provide argument names at runtime you must be aware that this creates a
    new dependency: in most languages argument names are chosen by the
    author of a function or method and the naming (ideally) reflects the
    usage of parameters in the function. Now you want to have names which
    also control the behavior of the _caller_ (method #each in your case).
    I am not sure this is such a great idea.
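
    To make that dependency concrete, here is a rough sketch. It assumes
    Ruby 1.9.2 or later, where Proc#parameters reports block parameter
    names, and a hypothetical helper name_driven_each that is not part of
    Ruby. It is an illustration of the idea, not a recommendation.

```ruby
# Hypothetical helper: decides what to yield from the *names* of the
# block's parameters, as reported by Proc#parameters.
def name_driven_each(array, &block)
  names = block.parameters.map { |_kind, name| name }
  unless [[:item], [:index], [:item, :index]].include?(names)
    raise ArgumentError, "unsupported block parameters: #{names.inspect}"
  end

  array.each_with_index do |item, index|
    case names
    when [:item]  then block.call(item)
    when [:index] then block.call(index)
    else               block.call(item, index)
    end
  end
end

items = []
name_driven_each(%w[a b]) { |item| items << item }

indexes = []
name_driven_each(%w[a b]) { |index| indexes << index }

pairs = []
name_driven_each(%w[a b]) { |item, index| pairs << [item, index] }
```

    Renaming the parameter to anything else, for example { |x| ... },
    raises ArgumentError in this sketch, so a harmless-looking rename in
    client code changes what the method does.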
    I said "I find that totally unintuitive".
    Please be careful with quoting. My statement referred to the
    anti-pattern of using method arguments to control behavior. Example:

    def calc(sum)
      if sum
        s = 0
        each {|x| s += x}
        s
      else
        p = 1
        each {|x| p *= x}
        p
      end
    end

    vs.

    def sum
      s = 0
      each {|x| s += x}
      s
    end

    def product
      p = 1
      each {|x| p *= x}
      p
    end

    In the first case, if you want to add another calculation, one might be
    tempted to introduce a new flag (or change the existing one to an enum)
    which has consequences for all code using this.

    In the second case you just add a method for the new calculation and are
    done. Also note how methods in the second example are shorter. That
    effect is not dramatic in this case but you can easily imagine how that
    changes for more complex algorithms. I have seen such code but I won't
    publish it (legal reasons).

    Cheers

    robert
     
    Robert Klemme, Sep 11, 2011
    #7