# [RCR] More enumerator functionality

Discussion in 'Ruby' started by Kristof Bastiaensen, Jun 17, 2004.

1. ### Kristof Bastiaensen

Kristof Bastiaensen, Jun 17, 2004

2. ### Michael Neumann

On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:
> Hi everybody,
>
> I have finally posted my RCR for more enumerator
> functionality.
>
> Find more about it here:
> http://rcrchive.net/rcr/RCR/RCR262

Can you give some more examples, where and how enum_if could be useful?

Regards,

Michael

Michael Neumann, Jun 17, 2004

3. ### Kristof Bastiaensen

On Fri, 18 Jun 2004 01:54:19 +0900, Michael Neumann wrote:

> On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:
>> Hi everybody,
>>
>> I have finally posted my RCR for more enumerator
>> functionality.
>>
>> Find more about it here:
>> http://rcrchive.net/rcr/RCR/RCR262

>
> Can you give some more examples, where and how enum_if could be useful?
>
> Regards,
>
> Michael

Well, enum_if is useful in at least two cases (as I wrote in
the analysis). It could replace Enumerable#select and be more
efficient in some cases (when dealing with a large amount of data),
since the filtered elements are passed on without first being
collected into an array.
For example:

large_dataset.enum_if { |d| sometest(d) }.collect do |d|
  <data manipulation>
end

It could also be useful to have a specialized enumerator
that will reflect any changes made to the original object:

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])   # mutate in place; += would create a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)

Cheers,
Kristof

Kristof Bastiaensen, Jun 18, 2004
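
The behaviour described in this post can be approximated in current Ruby with a plain Enumerator. The following is only a minimal sketch under that assumption: `enum_if` is the name proposed in the RCR, not a core method, and this is not the implementation attached to the RCR. Note that the enumerator follows the original object only when that object is mutated in place; `+=` builds a new Array and rebinds the variable.

```ruby
# Sketch only: enum_if is the RCR's proposed name, not a core Ruby method.
module Enumerable
  # Returns an Enumerator that walks the receiver again on every iteration
  # and yields only the elements for which the block returns true.
  def enum_if(&pred)
    source = self
    Enumerator.new do |y|
      source.each { |elem| y << elem if pred.call(elem) }
    end
  end
end

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a?(Numeric) }
p ints.to_a   # => [6, 9, -19, -19]

data.concat(["Bear", 20, 3])   # in place: the enumerator sees the new elements
p ints.to_a   # => [6, 9, -19, -19, 20, 3]

data += ["x", 1]   # rebinds data to a new Array; ints still wraps the old one
p ints.to_a   # => [6, 9, -19, -19, 20, 3]
```

Because the filter runs again on each iteration, nothing is cached: every call to `to_a` re-evaluates the block over the whole source.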
4. ### Michael Neumann

On Fri, Jun 18, 2004 at 08:58:33PM +0900, Kristof Bastiaensen wrote:
> On Fri, 18 Jun 2004 01:54:19 +0900, Michael Neumann wrote:
>
> > On Fri, Jun 18, 2004 at 12:03:44AM +0900, Kristof Bastiaensen wrote:
> >> Hi everybody,
> >>
> >> I have finally posted my RCR for more enumerator
> >> functionality.
> >>
> >> Find more about it here:
> >> http://rcrchive.net/rcr/RCR/RCR262

> >
> > Can you give some more examples, where and how enum_if could be useful?
> >
> > Regards,
> >
> > Michael

>
> Well, enum_if is useful in at least two cases (as I wrote in
> the analysis). It could be useful as a replacement for
> Enumerable#select that may be more efficient in some cases,
> (when dealing with a large amount of data).
> For example:
> large_dataset.enum_if { |d| sometest(d) }.collect do |d|
> <data manipulation>
> end
>
> It could also be useful to have a specialized enumerator
> that will reflect any changes made to the original object:

Your first enum_if example is a bit confusing:

(0..4).enum_if { |i| i[0] == 0 }.to_a
=> [0, 2, 4]

It took me a while to recognize that i[0] means the value of the
lowest bit. For clarity, could you either add a comment or
use (i & 0b1) instead? Or (i % 2) == 0.

> data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
> ints = data.enum_if { |i| i.is_a? Numeric }
> ints.to_a
> => [6, 9, -19, -19]
>
> data.concat(["Bear", 20, 3])
> ints.to_a
> => [6, 9, -19, -19, 20, 3]
>
> (I have added these examples to the RCR)

Aha, then it's something like a "lazy" enumerable, right?
Have a look at my code I wrote some weeks ago:

require 'generator'
module Enumerable
  def select_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map_lazy collect_lazy

  def lazy
    ChainingGenerator.new(self)
  end

  def no_lazy
    to_a
  end
end

class ChainingGenerator < Generator
  def select(&block)
    self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect(&block)
    self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map collect

  # TODO: implement others
end

[1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]

["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a

That's a more general form, as after the "lazy", all Enumerable
operations do not create intermediate arrays. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).

How performant is enum_if?

Could I write for example:

[1,2,3].enum_if{ cond }.map {|i| i + 1}

Regards,

Michael

Michael Neumann, Jun 18, 2004
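
For readers on a current Ruby, where the continuation-based `generator` library is no longer part of the standard library, the chaining idea from this post can be sketched on top of Enumerator instead. `LazyChain` is a hypothetical name used only for this illustration, and only `select` and `collect` are covered, as in the original code.

```ruby
# Sketch: a chainable lazy wrapper in the spirit of ChainingGenerator,
# built on Enumerator. LazyChain is a hypothetical name for illustration.
class LazyChain
  include Enumerable

  def initialize(source)
    @source = source
  end

  def each(&block)
    @source.each(&block)
    self
  end

  # Each operation wraps the receiver in a new LazyChain. No block runs
  # until the chain is finally iterated, for example by to_a.
  def select(&pred)
    upstream = self
    LazyChain.new(Enumerator.new { |y| upstream.each { |e| y << e if pred.call(e) } })
  end

  def collect(&fn)
    upstream = self
    LazyChain.new(Enumerator.new { |y| upstream.each { |e| y << fn.call(e) } })
  end

  alias map collect
end

p LazyChain.new([1, 2, 3, 4]).map { |i| i + 1 }.to_a   # => [2, 3, 4, 5]

mixed = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
p LazyChain.new(mixed).select { |i| i.is_a?(Numeric) }.map { |i| i * 2 }.to_a
# => [12, 18, -38, -38]
```

Chaining `select` and `map` this way passes each element through both blocks in turn, so no intermediate array is built between the two steps.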
5. ### Yukihiro Matsumoto

Hi,

In message "[RCR] More enumerator functionality"
on 04/06/18, Kristof Bastiaensen writes:

|I have finally posted my RCR for more enumerator
|functionality.
|
|http://rcrchive.net/rcr/RCR/RCR262

Hmm, how about making Enumerator#select etc. (the methods that return
an array in Enumerable) return a new filtered enumerator? That removes
the need for enum_if, and

huge_data.to_enum.delete_if {|x| ...}

works like a charm?

matz.

Yukihiro Matsumoto, Jun 18, 2004
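
Something close to this suggestion can be tried in Ruby 2.0 and later through Enumerator::Lazy, whose filtering methods return another lazy enumerator rather than an array. The sketch below is only an illustration of that behaviour and makes no claim about how it relates to this RCR. It uses `reject` because Enumerable defines `reject`, not `delete_if`, and the infinite range is just a stand-in for `huge_data`.

```ruby
# Enumerator::Lazy is core Ruby (2.0 and later); no require is needed.
huge_data = 1..Float::INFINITY   # stand-in for a data set too large to copy

# reject returns a new lazy enumerator instead of building an Array.
filtered = huge_data.lazy.reject { |x| x % 3 == 0 }

p filtered.class      # => Enumerator::Lazy
p filtered.first(5)   # => [1, 2, 4, 5, 7]
```

Only as many elements as `first(5)` needs are ever tested, which is why the example terminates on an endless source.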
6. ### Kristof Bastiaensen

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

> Your first enum_if example is a bit confusing:
>
> (0..4).enum_if { |i| i[0] == 0 }.to_a
> => [0, 2, 4]
>
> It took me a while to recognize that i[0] means the value of the
> lowest bit. For clarity, could you either add a comment or
> use (i & 0b1) instead? Or (i % 2) == 0.
>
>

You have a good point. I changed it.

>> data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
>> ints = data.enum_if { |i| i.is_a? Numeric }
>> ints.to_a
>> => [6, 9, -19, -19]
>>
>> data.concat(["Bear", 20, 3])
>> ints.to_a
>> => [6, 9, -19, -19, 20, 3]
>>
>> (I have added these examples to the RCR)

>
> Aha, then it's something like a "lazy" enumerable, right?

Yes, that's right. The given block will only be executed at the time of
yielding the corresponding value.

> Have a look at my code I wrote some weeks ago:
>
> require 'generator'
> module Enumerable
>   def select_lazy(&block)
>     Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
>   end
>
>   def collect_lazy(&block)
>     Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
>   end
>
>   alias map_lazy collect_lazy
>
>   def lazy
>     ChainingGenerator.new(self)
>   end
>
>   def no_lazy
>     to_a
>   end
> end
>
> class ChainingGenerator < Generator
>   def select(&block)
>     self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
>   end
>
>   def collect(&block)
>     self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
>   end
>
>   alias map collect
>
>   # TODO: implement others
> end
>
>
> [1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]
>
> ["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a
>
> That's a more general form, as after the "lazy", all Enumerable
> operations do not create intermediate arrays. Of course, it's very slow
> compared to the non-lazy methods (Generator uses continuations).
>
>

That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select the same as
enum_if.

> How performant is enum_if?

It should be quite fast, since all it does is pass each yielded value
through the block.
Even more so since Nobu Nokada was kind enough to provide an
implementation in C.

> Could I write for example:
>
> [1,2,3].enum_if{ cond }.map {|i| i + 1}
>
>

Yes, exactly. That was also the kind of thing I had in mind.

Cheers,
Kristof

Kristof Bastiaensen, Jun 18, 2004
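
The point that the block runs only when a value is actually yielded can be observed with a small trace. This is a sketch under the assumption that `enum_if` behaves like the plain Enumerator below; `filtered` and `tested` are names made up for the illustration, and nothing here measures the C implementation mentioned in the post.

```ruby
# Sketch: `filtered` plays the role of enum_if (an RCR proposal, not core Ruby).
tested = []   # records which elements the condition has been run on
source = [1, 2, 3, 4, 5, 6]

filtered = Enumerator.new do |y|
  source.each do |i|
    tested << i          # trace every evaluation of the condition
    y << i if i.even?
  end
end

p tested                         # => []  (nothing evaluated yet)
p filtered.map { |i| i + 1 }     # => [3, 5, 7]
p tested                         # => [1, 2, 3, 4, 5, 6]

tested.clear
p filtered.first(1)              # => [2]
p tested                         # => [1, 2]  (iteration stopped early)
```

The second half shows the practical benefit: a consumer that stops early, such as `first`, also stops the filtering early.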
7. ### Guest

Hi,

At Fri, 18 Jun 2004 21:55:21 +0900,
Yukihiro Matsumoto wrote in [ruby-talk:104047]:
> Hmm, how about making Enumerator#select etc. (methods that return
> array in Enumerable) return new filtered enumerator, that makes no
> need for enum_if and,
>
> huge_data.to_enum.delete_if {|x| ...}

Then Enumerator#select actually is equivalent to enum_if?

I'm afraid that it might cause confusion, since it would return an
Enumerator whereas Enumerable#select returns an Array.


, Jun 19, 2004
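
The return-type difference raised here is easy to check in a current Ruby (2.0 or later). In the sketch below, `select` called on an Array or on a plain Enumerator returns an Array, while the same call on a lazy enumerator returns an Enumerator::Lazy that has to be forced explicitly. The variable names are illustrative only.

```ruby
numbers = [1, 2, 3, 4]

eager      = numbers.select(&:even?)           # Enumerable#select on an Array
via_enum   = numbers.to_enum.select(&:even?)   # plain Enumerator: still eager
lazy_evens = numbers.lazy.select(&:even?)      # Enumerator::Lazy: stays lazy

p eager.class        # => Array
p via_enum.class     # => Array
p lazy_evens.class   # => Enumerator::Lazy
p lazy_evens.to_a    # => [2, 4]
```

Keeping the lazy behaviour behind an explicit `lazy` call is one way to avoid the confusion described above, since the same method name never changes its return type on a given receiver.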
8. ### Michael Neumann

On Fri, Jun 18, 2004 at 10:13:25PM +0900, Kristof Bastiaensen wrote:
> On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:
>
> [...]
>
> >
> > That's a more general form, as after the "lazy", all Enumerable
> > operations do not create intermediate arrays. Of course, it's very slow
> > compared to the non-lazy methods (Generator uses continuations).
> >
> >

> That's interesting.
> If I am correct, your collect_lazy and lazy.collect behave the same as
> enum_for with a block. Your select_lazy and lazy.select the same as
> enum_if.

I don't know what enum_for is doing, but select_lazy is the same as
enum_if, only its implementation is very different.

Regards,

Michael

Michael Neumann, Jun 19, 2004
9. ### Kristof Bastiaensen

On Sat, 19 Jun 2004 19:28:28 +0900, Michael Neumann wrote:

> On Fri, Jun 18, 2004 at 10:13:25PM +0900, Kristof Bastiaensen wrote:
>> On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:
>>
>> [...]
>>
>> >
>> > That's a more general form, as after the "lazy", all Enumerable
>> > operations do not create intermediate arrays. Of course, it's very slow
>> > compared to the non-lazy methods (Generator uses continuations).
>> >
>> >

>> That's interesting.
>> If I am correct, your collect_lazy and lazy.collect behave the same as
>> enum_for with a block. Your select_lazy and lazy.select the same as
>> enum_if.

>
> I don't know what enum_for is doing, but select_lazy is the same as
> enum_if, only its implementation is very different.
>

Currently enum_for (and its alias to_enum) creates an enumerator
that can iterate with a method other than each. For example:

require "enumerator"
str = "xyz"

enum = str.enum_for(:each_byte)
a = enum.map {|b| '%02x' % b } #=> ["78", "79", "7a"]

My proposal is that enum_for can take a block, so it can
do a custom transformation on the data:

data = [2, 3, 6]
powers = data.enum_for { |i| i * i } #"each" implied
powers.to_a
=> [4, 9, 36]

data << 7
powers.to_a
=> [4, 9, 36, 49]

If I am not mistaken this is the same as your collect_lazy.

Regards,
Kristof

Kristof Bastiaensen, Jun 19, 2004
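
The proposed transforming form of enum_for can be sketched in current Ruby under a different name. `enum_map` is a hypothetical helper invented for this illustration; it deliberately does not redefine `enum_for`, because in Ruby 2.0 and later a block passed to `enum_for` is used to compute the enumerator's size rather than to transform elements.

```ruby
# enum_map is a hypothetical name; it is not part of Ruby or of the RCR.
module Enumerable
  # Returns an Enumerator that iterates the receiver with `meth` and
  # yields each element after passing it through the given block.
  def enum_map(meth = :each, *args, &fn)
    source = self
    Enumerator.new do |y|
      source.public_send(meth, *args) { |elem| y << fn.call(elem) }
    end
  end
end

data = [2, 3, 6]
powers = data.enum_map { |i| i * i }   # :each implied
p powers.to_a   # => [4, 9, 36]

data << 7       # in-place change is visible on the next iteration
p powers.to_a   # => [4, 9, 36, 49]

# The existing enum_for, iterating with each_byte instead of each:
hex = "xyz".enum_for(:each_byte).map { |b| '%02x' % b }
p hex           # => ["78", "79", "7a"]
```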