[RCR] More enumerator functionality

  • Thread starter Kristof Bastiaensen

Kristof Bastiaensen

Can you give some more examples, where and how enum_if could be useful?

Regards,

Michael

Well, enum_if is useful in at least two cases (as I wrote in
the analysis). It could be useful as a replacement for
Enumerable#select that may be more efficient in some cases
(when dealing with a large amount of data).
For example:
large_dataset.enum_if { |d| sometest(d) }.collect do |d|
<data manipulation>
end

It could also be useful to have a specialized enumerator
that will reflect any changes made to the original object:

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])   # in place; "data += ..." would build a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)
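
To make the second case concrete, here is a minimal sketch of how enum_if could behave. It is not the RCR implementation: enum_if is only a proposal, and this version is a hypothetical stand-in built on Enumerator.new. It shows the predicate being re-run against the live array on every traversal. The array has to be grown in place (concat, <<, push); rebinding the variable with += creates a new array that the enumerator never sees.

```ruby
module Enumerable
  # Hypothetical enum_if, for illustration only (not part of Ruby).
  # Returns an enumerator that walks the receiver on demand and
  # yields only the elements for which the predicate is true.
  def enum_if(&pred)
    source = self
    Enumerator.new do |y|
      source.each { |elem| y << elem if pred.call(elem) }
    end
  end
end

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a?(Numeric) }
before = ints.to_a

# Mutate the same array object; the enumerator still points at it.
data.concat(["Bear", 20, 3])
after = ints.to_a

p before
p after
```

Nothing is filtered until to_a (or each) is called, which is why the second traversal picks up the new elements.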

Cheers,
Kristof
 
Michael Neumann

Well, enum_if is useful in at least two cases (as I wrote in
the analysis). It could be useful as a replacement for
Enumerable#select that may be more efficient in some cases
(when dealing with a large amount of data).
For example:
large_dataset.enum_if { |d| sometest(d) }.collect do |d|
<data manipulation>
end

It could also be useful to have a specialized enumerator
that will reflect any changes made to the original object:

Your first enum_if example is a bit confusing:

(0..4).enum_if { |i| i[0] == 0 }.to_a
=> [0, 2, 4]

It took me a while until I recognized that i[0] means the value of the
lowest bit. Just for clarity, could you either add a comment or
use (i & 0b1) instead? Or (i % 2) == 0.

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])   # in place; "data += ..." would build a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)

Aha, then it's something like a "lazy" enumerable, right?
Have a look at my code I wrote some weeks ago:

require 'generator'
module Enumerable
  def select_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map_lazy collect_lazy

  def lazy
    ChainingGenerator.new(self)
  end

  def no_lazy
    to_a
  end
end

class ChainingGenerator < Generator
  def select(&block)
    self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect(&block)
    self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map collect

  # TODO: implement others
end


[1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]

["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a

That's a more general form, as after the "lazy", all Enumerable
operations do not create intermediate arrays. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).
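
For readers trying this on a current Ruby: the generator library used above is no longer shipped, but Ruby 2.0 and later provide a built-in Enumerable#lazy with the same chaining idea. The sketch below is a comparison, not a port of the code above. The counter `calls` is only there to show that the select block runs just as often as needed.

```ruby
calls = 0

# Chain select and map lazily over an unbounded range.
# No intermediate array is built; elements flow through one at a time.
pipeline = (1..Float::INFINITY).lazy
  .select { |i| calls += 1; i.even? }
  .map { |i| i + 1 }

# Pull only the first three results; iteration stops right after.
result = pipeline.first(3)

p result
p calls
```

An eager select on the same range would never return, so the fact that this finishes is itself the demonstration of laziness.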

How performant is enum_if?

Could I write for example:

[1,2,3].enum_if{ cond }.map {|i| i + 1}


Regards,

Michael
 
Yukihiro Matsumoto

Hi,

In message "[RCR] More enumerator functionality"

|I have finally posted my RCR for more enumerator
|functionality.
|
|Find more about it here:
|http://rcrchive.net/rcr/RCR/RCR262

Hmm, how about making Enumerator#select etc. (the methods that return
an array in Enumerable) return a new filtered enumerator? That removes
the need for enum_if, and

huge_data.to_enum.delete_if {|x| ...}

works like a charm?
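
A rough sketch of what this suggestion could look like, using a small wrapper class rather than changing Enumerator itself. Everything here is an assumption made for illustration: the class name FilteringEnumerator and its reject method (standing in for the delete_if call above) are hypothetical and are not taken from the RCR or from Ruby.

```ruby
# Hypothetical wrapper: select and reject return another filtered
# enumerator instead of an Array, so filters can be stacked without
# copying the underlying data.
class FilteringEnumerator
  include Enumerable

  def initialize(source, &keep)
    @source = source
    @keep = keep || proc { true }   # no block: keep everything
  end

  def each
    return to_enum(:each) unless block_given?
    @source.each { |elem| yield elem if @keep.call(elem) }
  end

  # Overrides Enumerable#select, which would return an Array.
  def select(&block)
    self.class.new(self, &block)
  end

  # Overrides Enumerable#reject in the same way.
  def reject(&block)
    self.class.new(self) { |elem| !block.call(elem) }
  end
end

huge_data = (1..10).to_a
filtered = FilteringEnumerator.new(huge_data)
  .select { |x| x.odd? }
  .reject { |x| x > 6 }

p filtered.class
p filtered.to_a
```

Note that select here no longer returns an Array, which differs from what Enumerable#select does on other objects.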

matz.
 
Kristof Bastiaensen

Your first enum_if example is a bit confusing:

(0..4).enum_if { |i| i[0] == 0 }.to_a
=> [0, 2, 4]

It took me a while until I recognized that i[0] means the value of the
lowest bit. Just for clarity, could you either add a comment or
use (i & 0b1) instead? Or (i % 2) == 0.

You have a good point. I changed it.

data = ["a", 6, 9, "foo", -19, "fuga", -19, "bach"]
ints = data.enum_if { |i| i.is_a? Numeric }
ints.to_a
=> [6, 9, -19, -19]

data.concat(["Bear", 20, 3])   # in place; "data += ..." would build a new array
ints.to_a
=> [6, 9, -19, -19, 20, 3]

(I have added these examples to the RCR)

Aha, then it's something like a "lazy" enumerable, right?

Yes, that's right. The given block will only be executed at the time of
yielding the corresponding value.
Have a look at my code I wrote some weeks ago:

require 'generator'
module Enumerable
  def select_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect_lazy(&block)
    Generator.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map_lazy collect_lazy

  def lazy
    ChainingGenerator.new(self)
  end

  def no_lazy
    to_a
  end
end

class ChainingGenerator < Generator
  def select(&block)
    self.class.new {|c| self.each { |elem| c.yield(elem) if block.call(elem) } }
  end

  def collect(&block)
    self.class.new {|c| self.each { |elem| c.yield(block.call(elem)) } }
  end

  alias map collect

  # TODO: implement others
end


[1,2,3,4].lazy.map{|i| i + 1}.to_a # => [2,3,4,5]

["a", 6, 9, "foo", -19, "fuga", -19, "bach"].lazy.select{|i| i.is_a? Numeric}.to_a

That's a more general form, as after the "lazy", all Enumerable
operations do not create intermediate arrays. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).

That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

How performant is enum_if?

It should be quite fast, since all it does is pass each yield through the
block.
Even more so, since Nobu Nokada was kind enough to provide an
implementation in C. :)

Could I write for example:

[1,2,3].enum_if{ cond }.map {|i| i + 1}
Yes, exactly. That was also the kind of thing I had in mind.

Cheers,
Kristof
 
nobu.nokada

Hi,

At Fri, 18 Jun 2004 21:55:21 +0900,
Yukihiro Matsumoto wrote in [ruby-talk:104047]:
Hmm, how about making Enumerator#select etc. (methods that return
array in Enumerable) return new filtered enumerator, that makes no
need for enum_if and,

huge_data.to_enum.delete_if {|x| ...}

Then Enumerator#select actually is equivalent to enum_if?

I'm afraid that it might cause confusion: it returns an Enumerator,
whereas Enumerable#select returns an Array.
 
Michael Neumann

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

[...]
That's a more general form, as after the "lazy", all Enumerable
operations do not create intermediate arrays. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).
That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

I don't know what enum_for is doing, but select_lazy is the same as
enum_if, only its implementation is very different.

Regards,

Michael
 
Kristof Bastiaensen

On Fri, 18 Jun 2004 21:28:45 +0900, Michael Neumann wrote:

[...]
That's a more general form, as after the "lazy", all Enumerable
operations do not create intermediate arrays. Of course, it's very slow
compared to the non-lazy methods (Generator uses continuations).
That's interesting.
If I am correct, your collect_lazy and lazy.collect behave the same as
enum_for with a block, and your select_lazy and lazy.select behave the
same as enum_if.

I don't know what enum_for is doing, but select_lazy is the same as
enum_if, only its implementation is very different.

Currently enum_for (and its alias to_enum) creates an enumerable
that uses a different method than each. For example:

require "enumerator"
str = "xyz"

enum = str.enum_for(:each_byte)
a = enum.map {|b| '%02x' % b } #=> ["78", "79", "7a"]

My proposal is that enum_for can take a block, so it can
do a custom transformation on the data:

data = [2, 3, 6]
powers = data.enum_for { |i| i * i } #"each" implied
powers.to_a
=> [4, 9, 36]

data << 7
powers.to_a
=> [4, 9, 36, 49]

If I am not mistaken this is the same as your collect_lazy.
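
The proposed behaviour can be approximated with a small helper. This is a sketch under assumptions, not the RCR implementation. It uses a hypothetical name, enum_map, because in current Ruby a block passed to enum_for has a different meaning (it lazily computes the enumerator's size), so the proposed syntax cannot be reproduced as written.

```ruby
module Enumerable
  # Hypothetical enum_map, for illustration only (not part of Ruby).
  # Returns an enumerator that applies the block to each element
  # at traversal time, so later changes to the receiver show up.
  def enum_map(&transform)
    source = self
    Enumerator.new do |y|
      source.each { |elem| y << transform.call(elem) }
    end
  end
end

data = [2, 3, 6]
powers = data.enum_map { |i| i * i }
first_pass = powers.to_a

data << 7                 # in-place append, same array object
second_pass = powers.to_a

p first_pass
p second_pass
```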

Regards,
Kristof
 
