FirstEachLast, an extension to the Enumerable module.

J

John Carter

I know somewhere is a collection of extensions to the enumerable module.

Is the rubygarden page
http://www.rubygarden.org/ruby?StandardClassExtensions/Enumerable
the only one?

Anyway, I have just added FirstEachLast to that.

Use it like so...

require 'FirstEachLast'

def test_first_each_last(any_enumerable_object)
result = nil
count =
any_enumerable_object.first_each_last( proc {|first|
result = "First #{first}"
},
proc {|i|
result += " then #{i},"
},
proc {|last|
result += " and last of all #{last}"
},
proc {|| result = 'Empty!'}

puts count
puts result
end

test_first_each_last( [1,2,3,4])
test_first_each_last( [])

Will result in the following output...
4
First 1 then 2, then 3, and last of all 4
0
Empty!



John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.
 
R

Robert Klemme

John Carter said:
I know somewhere is a collection of extensions to the enumerable module.

Is the rubygarden page
http://www.rubygarden.org/ruby?StandardClassExtensions/Enumerable
the only one?

Anyway, I have just added FirstEachLast to that.

Use it like so...

require 'FirstEachLast'

def test_first_each_last(any_enumerable_object)
result = nil
count =
any_enumerable_object.first_each_last( proc {|first|
result = "First #{first}"
},
proc {|i|
result += " then #{i},"
},
proc {|last|
result += " and last of all #{last}"
},
proc {|| result = 'Empty!'}

puts count
puts result
end

test_first_each_last( [1,2,3,4])
test_first_each_last( [])

Will result in the following output...
4
First 1 then 2, then 3, and last of all 4
0
Empty!

Nice output, but what do you need that for?

robert
 
J

John Carter

Nice output, but what do you need that for?

Mostly output to other tools, like gnuplot, or for reports. Most time
enum._to_a.join(",") does the job perfectly. However, there are the odd
time when that routine is just neat.


John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.
 
R

Robert Klemme

John Carter said:
Mostly output to other tools, like gnuplot, or for reports. Most time
enum._to_a.join(",") does the job perfectly. However, there are the odd
time when that routine is just neat.

Hm, sounds to me like it was not general enough to include it in
Enumerable. Before I see that method I'd prefer to have size and empty?
in Enumerable. Just my 0.02 EUR...

Regards

robert
 
M

Martin DeMello

Robert Klemme said:
Hm, sounds to me like it was not general enough to include it in
Enumerable. Before I see that method I'd prefer to have size and empty?
in Enumerable. Just my 0.02 EUR...

Can empty? be reliably implemented in terms of each for every
enumerable? I can't think of an obvious problem with it, but that
doesn't mean there isn't one.

martin
 
A

Alexander Kellett

Hm, sounds to me like it was not general enough to include it in
Enumerable. Before I see that method I'd prefer to have size and empty?
in Enumerable. Just my 0.02 EUR...

personally the idea of having a method taking 4 procs
in the stdlib is just a bit strange. i'd prefer to see
a generic set of methods that are somehow mixedin to
the values that a Enumerable yields, therefore allowing
things like val.last? or first? or even a generic
val.enum_index to replace the each_with_index rubbish.
i've no idea how to solve this neatly unfortunately as
extend'ing the objects that are yield'ed of course will
have sideeffects... though most of the side effects that
will cause problems shouldn't happen at all, e.g people
calling is_a? a lot will have problems... i do this
and yet i don't have an exception to such an extension...
but this:

blah.each {
|element|
pos = blah.index element
puts blah[pos-1]
puts element
}

is soooooo uggllyyy... not to mention inefficient...
i'd just loooveeee to see a element.previous_in_enum...

cheers,
Alex
 
R

Robert Klemme

Martin DeMello said:
Can empty? be reliably implemented in terms of each for every
enumerable? I can't think of an obvious problem with it, but that
doesn't mean there isn't one.

Yes, that's possible IMHO. My suggestion would be:

module Enumerable
# O(1)
def empty?
each { return false }
true
end

# O(n)
def size
inject(0) {|s,| s+1}
end
end

Do you see any problems? There's only the little disadvantage that size
is O(n) while there are most likely more efficient implementations for
certain collections. But classes like Array and Hash provide their own
implementation anyway, which then overrides Enumerable#size. So there's
no drawback IMHO, only the advantage that you don't have to write it if
you have a collection that can determine its size only via a complete
iteration.

Kind regards

robert
 
M

Michael Neumann

Alexander said:
Hm, sounds to me like it was not general enough to include it in
Enumerable. Before I see that method I'd prefer to have size and empty?
in Enumerable. Just my 0.02 EUR...


personally the idea of having a method taking 4 procs
in the stdlib is just a bit strange. i'd prefer to see
a generic set of methods that are somehow mixedin to
the values that a Enumerable yields, therefore allowing
things like val.last? or first? or even a generic
val.enum_index to replace the each_with_index rubbish.
i've no idea how to solve this neatly unfortunately as
extend'ing the objects that are yield'ed of course will
have sideeffects... though most of the side effects that
will cause problems shouldn't happen at all, e.g people
calling is_a? a lot will have problems... i do this
and yet i don't have an exception to such an extension...
but this:

blah.each {
|element|
pos = blah.index element
puts blah[pos-1]
puts element
}

is soooooo uggllyyy... not to mention inefficient...
i'd just loooveeee to see a element.previous_in_enum...

Why not something like this (each_with_first_last does not exist!):

blah.each_with_first_last {|element, pos|
case pos
when :inside # or middle
...
when :first
...
when :last
...
end
}

Of course this would be somewhat slower than the approach with 4 procs,
as you have to test where you are, inside the loop. If you use the
approach above, make sure that you test for the common-case ("inside")
in the case-statement at the very beginning.

I know, you could also use each_with_index instead (this one exists!):

blah.each_with_index {|element, inx|
if inx == 0
# first
elsif inx < blah.size
# inside
else
# last
end
}

But that might not work for all Enumerable's.

Or maybe, an extended each_with_index2?:

blah.each_with_index2 {|element, inx, pos|
if pos == :first
...
}

Regards,

Michael
 
M

Michael Neumann

Michael said:
I know, you could also use each_with_index instead (this one exists!):

blah.each_with_index {|element, inx|
if inx == 0
# first
elsif inx < blah.size

should be:

elsif inx < blah.size - 1
 
R

Robert Klemme

Alexander Kellett said:
personally the idea of having a method taking 4 procs
in the stdlib is just a bit strange.
Agreed.

i'd prefer to see
a generic set of methods that are somehow mixedin to
the values that a Enumerable yields, therefore allowing
things like val.last? or first? or even a generic
val.enum_index to replace the each_with_index rubbish.
i've no idea how to solve this neatly unfortunately as
extend'ing the objects that are yield'ed of course will
have sideeffects... though most of the side effects that
will cause problems shouldn't happen at all, e.g people
calling is_a? a lot will have problems...

I strongly disagree. All sorts of problems arise from this approach:

- permanent modification of elements' types
- method name collisions
- member state collisions
- thread problems (1 instance used during two parallel iterations)

The index clearly belongs to the *iteration* and not the elements.
Possible solutions:

- yield not the instance but an iteration state that has the current item
as member
- use delegator (this avoids permanent changes but has still method
collision problems)
i do this
and yet i don't have an exception to such an extension...
but this:

blah.each {
|element|
pos = blah.index element
puts blah[pos-1]
puts element
}

is soooooo uggllyyy... not to mention inefficient...
i'd just loooveeee to see a element.previous_in_enum...

First of all you don't cope with the first element properly. Then, there
are simple and efficient solutions available. To name some:

# for Arrays
blah.each_with_index {
|element, pos|
puts blah[pos-1] if pos != 0
puts element
}

NOTHING = Object.new
last = NOTHING
blah.each {
|element|
puts last if NOTHING != last
puts element
last = element
}

NOTHING = Object.new
blah.inject(NOTHING) {
|last, element|
puts last if NOTHING != last
puts element
element
}

NOTHING = Object.new
blah.inject(NOTHING) {
|last, element|
process last if NOTHING != last
element
}
process last if NOTHING != last


etc. depending on what you want to do.

Regards

robert
 
M

Michael Neumann

Robert said:
Yes, that's possible IMHO. My suggestion would be:

module Enumerable
# O(1)
def empty?
each { return false }
true
end

# O(n)
def size
inject(0) {|s,| s+1}
end
end

Do you see any problems? There's only the little disadvantage that size
is O(n) while there are most likely more efficient implementations for
certain collections. But classes like Array and Hash provide their own
implementation anyway, which then overrides Enumerable#size. So there's
no drawback IMHO, only the advantage that you don't have to write it if
you have a collection that can determine its size only via a complete
iteration.

Hehe, and how long does it take to determine the size of infinite or
cyclic Enumerable's? ;-)

require 'enumerator'
to_enum:)loop).each {
...
break if ...
}
to_enum:)loop).size # ???

Regards,

Michael
 
A

Alexander Kellett

I strongly disagree. All sorts of problems arise from this approach:

notice my lack of agreement with myself in any case :)
- permanent modification of elements' types
- method name collisions
- member state collisions
- thread problems (1 instance used during two parallel iterations)

The index clearly belongs to the *iteration* and not the elements.
Possible solutions:

- yield not the instance but an iteration state that has the current item
as member
- use delegator (this avoids permanent changes but has still method
collision problems)

yes. i'd agree that it belongs to the iteration.
otherwise i would have coded up the thing i mentioned,
but... i've never seen an alternate so never bothered.
could you explain what you mean by delegator in this
sense? just a usage case example would do. i'd love
to code this up at some point but never took the time
to come up with a nice api.
First of all you don't cope with the first element properly. Then, there
are simple and efficient solutions available. To name some:

[snip]

yeah. agreed on lack of coping with first element
i just wanted to illustrate the basic idea :)

sorry but i dislike these paradigms instensly :/
the iterator knows, so why should i have to duplicate
code like this. especially in a detect loop this
is incredibly annoying as in many cases a temporary
must be used for the returned value as the last statement
in the loop has to be "old_value = current_value".
also having a temporary outside the loop and polluting
the outer block... is disgusts me... maybe i'm just too
purist but stuff like this *could* be solved so i
see no reason against doing it :)

thanks for the feedback,
Alex
 
R

Robert Klemme

Michael Neumann said:
Hehe, and how long does it take to determine the size of infinite or
cyclic Enumerable's? ;-)

A tad longer, I'd say. :)

There's clearly no point in doing that. But that's the best argument
against Enumerable#size that I've heard so far. Probably there should be
a distinction like BoundEnumerable as sub module of Enumerable.

Thank's for that hint!

Regards

robert
 
R

Robert Klemme

Alexander Kellett said:
notice my lack of agreement with myself in any case :)


yes. i'd agree that it belongs to the iteration.
otherwise i would have coded up the thing i mentioned,
but... i've never seen an alternate so never bothered.
could you explain what you mean by delegator in this
sense? just a usage case example would do. i'd love
to code this up at some point but never took the time
to come up with a nice api.

# NOT RECOMMENDED

class IterationState
attr_accessor :item, :pos

def method_missing(*a,&b)
item.send(*a,&b)
end

undef :to_s, :to_a, :hash # ... and others
end

module Enumerable
def each_with_state
st = IterationState.new

each_with_index do |e, i|
st.item = e
st.pos = i
yield st
end

self
end

end

ar = %w{foo bar baz bubu}

ar.each_with_state do |x|
puts "position #{x.pos} elem #{x.item}"
end
sorry but i dislike these paradigms instensly :/
the iterator knows,

Not necessarily so:

class RandEnum
include Enumerable

def intialize(max=nil) @max=max; end

def each
loop { yield rand @max }
end
end
so why should i have to duplicate
code like this. especially in a detect loop this
is incredibly annoying as in many cases a temporary
must be used for the returned value as the last statement
in the loop has to be "old_value = current_value".

You can use inject.
also having a temporary outside the loop and polluting
the outer block... is disgusts me...

You can't solve it otherwise because you need state that survives each
iteration step. That *must* be kept outside the block - either by using a
local or inject.
maybe i'm just too
purist but stuff like this *could* be solved so i
see no reason against doing it :)

thanks for the feedback,

You're welcome.

robert
 
M

Martin DeMello

Robert Klemme said:
Yes, that's possible IMHO. My suggestion would be:

module Enumerable
# O(1)
def empty?
each { return false }
true
end

My concern was if #each created a temporary data structure in O(n) time
(e.g. an ordered hash) - this would make sense for #each, since its
running time is O(n) anyway, but would be a large hit for empty?

martin
 
A

Ara.T.Howard

My concern was if #each created a temporary data structure in O(n) time
(e.g. an ordered hash) - this would make sense for #each, since its
running time is O(n) anyway, but would be a large hit for empty?

my concern would be for code that created a temporary data structure to
iterate - the designer should re-design! ;-) seriously, the rbtree package
from the raa works exactly like an ordered hash and it does not do this - i
would say it's a serious flaw if an ordered container cannot iterate itself in
order w/o creating temp structures - though your point is well taken and
correct.


-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
| --Dogen
===============================================================================
 
R

Robert Klemme

Martin DeMello said:
My concern was if #each created a temporary data structure in O(n) time
(e.g. an ordered hash) - this would make sense for #each, since its
running time is O(n) anyway, but would be a large hit for empty?

Hm, that's true. Although that argument would not stop me from advocating
#empty? Runtimes of each enumerable class differ anyway. Classes override
this method usually anyway. It's a different case with #size which has been
shown to not terminate for certain non totally esoteric implementations.

Here's maybe another reason: emty? and size both depend on an Enumerable
with a deterministic amount of elements. This need not always be the case:

class RandEnum
include Enumerable

def each
rand(10).times { yield rand 20 }
self
end
end
r=RandEnum.new
=> # said:
r.to_a => [15, 7]
r.to_a => [12, 8, 11, 2, 17, 13, 18, 5, 16]
r.to_a => [9, 15, 19, 12, 3, 2, 0]

I start believing that because of this and the termination problem it was
indeed a wise decision of Matz to not include them in Enumerable. And a sub
module for DeterministicEnumerable that adds these two methods to Enumerable
wouldn't be a big win. So it's probably best the way it is now.

Thanks for the discussion!

Kind regards

robert
 
P

Pit Capitain

Robert said:
The index clearly belongs to the *iteration* and not the elements.

Then you should create a new object for the iteration state. How about this:

module Enumerable

class State

include Enumerable

def initialize enum
@enum = enum
@index = -1
end

def each
@enum.each do |e|
if @last.nil?
@last = false
else
@index += 1
yield @next
end
@next = e
end
unless @last.nil?
@last = true
@index += 1
yield @next
end
end

def first?
@index == 0
end

def index
@index
end

def last?
@last
end

end

def state
State.new self
end

end


Now you can get an EnumState object from a normal enumerable and iterate over
the EnumState object:

state = %w"does it work?".state

state.each do |e|
puts "%-5s %-5s %i %s" % [ e, state.first?, state.index, state.last? ]
end

=>

does true 0 false
it false 1 false
work? false 2 true


Regards,
Pit
 
J

John Carter

Can empty? be reliably implemented in terms of each for every
enumerable? I can't think of an obvious problem with it, but that
doesn't mean there isn't one.

def empty?
each do |element|
return false
end
true
end

def size
inject(0) {|n,e| n+=1}
end


John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.
 
J

John Carter

Why not something like this (each_with_first_last does not exist!):

blah.each_with_first_last {|element, pos|
case pos
when :inside # or middle
...
when :first
...
when :last
...
end
}

I like the cleaner syntax.
I know, you could also use each_with_index instead (this one exists!):

blah.each_with_index {|element, inx|
if inx == 0
# first
elsif inx < blah.size
# inside
else
# last
end
}

But that might not work for all Enumerable's.

Nope, that is order N^2 for enumerables without a O(1) implementation of
size. Simply won't do for me.
Or maybe, an extended each_with_index2?:

Yip, could hack the .c code for each_with_index of enum.c

blah.first_each_last do |e,pos|
case pos
when :first
....
when :inner
....
when :last
......
when :empty
.....
end
end


John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

The universe is absolutely plastered with the dashed lines exactly one
space long.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,754
Messages
2,569,521
Members
44,995
Latest member
PinupduzSap

Latest Threads

Top