Behaviour of Enumerables reject vs. select mixed into Hash



Daniel DeLorme

Trans said:
module Enumerable
  def select(&blk)
    o = self.class.new
    each{|*e| o << e if blk[*e]}
    o  # return the new collection, not the result of #each
  end
end

class Hash
def <<(e)
self[e[0]] = e[1]
end
end

The downside here is, it is less efficient and breaks backward
compatibility.

Why the compatibility breakage? The same could be implemented to be
fully backwards compatible:

module Enumerable
  def select(&blk)
    o = respond_to?(:<<) ? self.class.new : Array.new
    each{|*e| o << e if blk[*e]}
    o  # return the new collection, not the result of #each
  end
end
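
A minimal runnable sketch of the idea above, offered with some caution. Rather than redefining Enumerable#select globally, it uses a hypothetical module SameKindSelect with a hypothetical method select_same, plus Hash#<< as in Trans's example. It assumes current Ruby, where Hash#each yields a single [key, value] array, so the block takes one plain parameter instead of a splat.

```ruby
# Hypothetical module and method names; not part of Ruby itself.
module SameKindSelect
  def select_same(&blk)
    # Build a collection of the receiver's class when it supports #<<,
    # otherwise fall back to a plain Array.
    o = respond_to?(:<<) ? self.class.new : []
    each { |e| o << e if blk[e] }
    o
  end
end

class Hash
  # Accepts a [key, value] pair, as in the example quoted above.
  def <<(pair)
    self[pair[0]] = pair[1]
    self
  end
end

class Array; include SameKindSelect; end
class Hash;  include SameKindSelect; end
class Range; include SameKindSelect; end

evens = { :a => 1, :b => 2, :c => 4 }.select_same { |k, v| v.even? }
big   = [1, 5, 10].select_same { |x| x > 4 }
odds  = (1..5).select_same { |x| x.odd? }
```

Here evens is a Hash and big is an Array, while the Range (which has no #<<) falls back to an Array, which is the backward-compatible part.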


Daniel
 

Nobuyoshi Nakada

Hi,

At Fri, 22 Jun 2007 06:30:40 +0900,
Trans wrote in [ruby-talk:256489]:
Hmm...well for the first solution, I suppose we need a special
constructor to provide the kind of enumerable result we will be
building. In my example, I used self.class.new, but obviously that's
not always the case, so the class will need to tell us.

I'd prefer this.
 

Alex Young

Todd said:
What no one seems to have (directly) focussed on is that Hash#each
gives two-element [k,v] arrays back as the content.

That's right. And if you wanted things to be "just ducky", it would
have to give only the v. I've actually argued in favor of that before,
b/c it's not unreasonable to see that an Array index is like a Hash
key.

Regardless of what we think is the best, it seems pretty clear to me
that the duplicate of the Hash is broken during a delete_if, reject,
select, etc.
That's one reason I'd like to see more #to_hash methods lying around.
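
For illustration only, and assuming current Ruby: when a filter hands back an Array of [key, value] pairs, the pairs can be turned back into a Hash with Hash[]. The variable names are made up for the example.

```ruby
h = { :a => 1, :b => 2, :c => 3 }

# Filtering the pair list gives an Array of [key, value] pairs.
pairs = h.to_a.select { |k, v| v > 1 }

# Hash[] rebuilds a Hash from such a pair list.
filtered = Hash[pairs]
```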
 

Robert Dober

David said:
No need to create a new collection as an array, no harm to create a
new collection as an array, actually the ongoing discussion about this
point is most interesting and beyond my scope - well I try to learn a
maximum from it.

I should go with David's approach because I fully understand the consequences.
But I love Tom's and Rick's approach going into uncharted waters...

BTW all this does not answer OP's, Tom's and my *original* concern.
We were talking about Hash#reject and Hash#select not
Enumerable#reject and Enumerable#select, but nobody seems to notice,
sigh :(
This topic would be less difficult to discuss I think.
David is smart ... the linguists are dumb. Don't take offense guys :)
Offense, are you kidding? The only thing which bothers me from a set
theory point of view is the symmetry in your statement above.

Honestly Todd if you wanted to make a point I missed it, completely...


Robert
 

Alexander Presber

BTW all this does not answer OP's, Tom's and my *original* concern.
We were talking about Hash#reject and Hash#select not
Enumerable#reject and Enumerable#select, but nobody seems to notice,
sigh :(

As I see it, there should _be_ no Hash#reject (nor Array#reject).

Hash (and in fact any class mixing in Enumerable) should be able to
use Enumerable#reject to filter out elements.
That is what Trans's/Daniel's code aims at (hope I am not
misinterpreting them here).

Other than providing exactly this generality for iteration
(==enumeration) methods I see no need to have introduced the class
Enumerable at all.

Sincerely yours
Alex
 

Robert Dober

As I see it, there should _be_ no Hash#reject (nor Array#reject).

Alexander thank you so much :)
In the future maybe.
Right now Hash#reject is just the poor man's implementation of the
concepts which are discussed in this post.
Right now I feel it would be nice if its sibling Hash#select returned
a Hash too.
Hash (and in fact any class mixing in Enumerable) should be able to
use Enumerable#reject to filter out elements.
That is what Trans's/Daniel's code aims at (hope I am not
misinterpreting them here).
Not at all IMHO

Other than providing exactly this generality for iteration
(==enumeration) methods I see no need to have introduced the class
Enumerable at all.
Sure, that is its purpose, but be careful: it is a module.
I do not see any harm in overriding some of the mixed-in methods for
special cases like Hash; it is a reasonable approach.
Sincerely yours
Alex
Robert
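
A small sketch of what overriding a mixed-in method for a special case can look like. The Scores class is hypothetical and only for illustration: it includes Enumerable, overrides #select to return its own type, and leaves #reject as the mixed-in version, which returns an Array.

```ruby
# Hypothetical class, for illustration only.
class Scores
  include Enumerable

  def initialize(data = {})
    @data = data
  end

  # Enumerable builds on this; each step yields one [name, score] pair.
  def each(&blk)
    @data.each(&blk)
    self
  end

  # Overrides the mixed-in Enumerable#select so the result is a Scores.
  def select
    picked = {}
    each { |name, score| picked[name] = score if yield(name, score) }
    Scores.new(picked)
  end
end

scores = Scores.new("ann" => 90, "bob" => 55)
passed = scores.select { |name, score| score >= 60 }
failed = scores.reject { |name, score| score >= 60 }
```

This mirrors the asymmetry under discussion: passed is a Scores, while failed is a plain Array of pairs because #reject was not overridden.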
 

Todd Benson

BTW all this does not answer OP's, Tom's and my *original* concern.
We were talking about Hash#reject and Hash#select not
Enumerable#reject and Enumerable#select, but nobody seems to notice,
sigh :(
This topic would be less difficult to discuss I think.

My hash isn't your hash. What's the most generic behavior to expect?

See, you've already dismissed me right there, because you've decided
if my Hash doesn't work the way _you_ expect it to, it's my fault.
Offense, are you kidding? The only thing which bothers me from a set
theory point of view is the symmetry in your statement above.

Honestly Todd if you wanted to make a point I missed it, completely...

That wouldn't be the first time I've been accused of saying a whole
lot of nothin' :)

I stopped talking about Ruby, and started talking instead about
programming in general. I introduced too many things at once. My
bad.

Todd
 

Robert Dober

My hash isn't your hash. What's the most generic behavior to expect?

See, you've already dismissed me right there, because you've decided
if my Hash doesn't work the way _you_ expect it to, it's my fault.
I do not recall having dismissed you at all? I have made a judgement
about two levels of complexity. If you do not like the behavior of
Hash#reject or Hash#select it might be a good idea to say so, right?
I do not mind at all if somebody says, I like Hash#reject to return a
Set because (this is just an example) etc. etc.

I guess I got confused about who thinks what in this thread :(
 

dblack

Hi --

What no one seems to have (directly) focussed on is that Hash#each
gives two-element [k,v] arrays back as the content.

That's right. And if you wanted things to be "just ducky", it would
have to give only the v. I've actually argued in favor of that before,
b/c it's not unreasonable to see that an Array index is like a Hash
key. So, for a full parallel we'd need to see something like:

{:x=>'m'}.each { |v| v }            # v => 'm'

['m'].each { |v| v }                # v => 'm'

{'x'=>'m'}.each_assoc { |a| a }     # a => ['x','m']

['m'].each_assoc { |a| a }          # a => [0,'m']

One could easily argue that an Assoc class would be quite useful here,
rather than relying on 2-element Array to fulfill the role. With that
in hand, it would be easy enough to add the #<< method for enumerable
construction.

I don't think there's any reason to expect just the value when
iterating through a hash. In fact if you're using a hash, you
probably have a reason for storing data that way and iterating
pair-wise seems logical to me.

I think it's a question of how one looks at the underlying types or
behaviors. I don't think the language has to converge around the
smallest possible number of interfaces. You can look at it the other
way around. It's useful (extremely) to have hashes, and to iterate
over them in pairs. Enumerable is one way to help introduce that
construct into the language -- not a one-stop-shopping hash
implementation, but helpful.

If we then find fault with hash behavior because it's not in line
precisely with other enumerables, that's a kind of reverse logic; it's
a way to talk the language out of having something useful, which I
don't think is a good idea. Enumerable not only allows but requires
that each class implement #each, and there's no constraint that every
enumerable class has to yield exactly one value at a time. I'd want
to see more concrete evidence that having hashes and arrays behave
differently is really creating problems before wanting to normalize
them around one construct.
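
To make that point concrete, here is a small sketch, assuming current Ruby. The Stock class is hypothetical; its #each yields two values per step, and the Enumerable methods mixed in on top of it still work.

```ruby
# Hypothetical class, for illustration only.
class Stock
  include Enumerable

  def initialize
    @names  = %w[apple pear]
    @counts = [3, 0]
  end

  # Yields two values per step; Enumerable only needs #each to exist.
  def each
    @names.each_index { |i| yield @names[i], @counts[i] }
    self
  end
end

stock    = Stock.new
labels   = stock.map { |name, count| "#{name}: #{count}" }
in_stock = stock.select { |name, count| count > 0 }
```

Note that the mixed-in #select packs each yielded pair into a 2-element Array, so in_stock is an Array of pairs.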


David

--
* Books:
RAILS ROUTING (new! http://www.awprofessional.com/title/0321509242)
RUBY FOR RAILS (http://www.manning.com/black)
* Ruby/Rails training
& consulting: Ruby Power and Light, LLC (http://www.rubypal.com)
 

Trans

Hi --



What no one seems to have (directly) focussed on is that Hash#each
gives two-element [k,v] arrays back as the content.
That's right. And if you wanted things to be "just ducky", it would
have to give only the v. I've actually argued in favor of that before,
b/c it's not unreasonable to see that an Array index is like a Hash
key. So, for a full parallel we'd need to see something like:
{:x=>'m'}.each { |v| v }            # v => 'm'
['m'].each { |v| v }                # v => 'm'
{'x'=>'m'}.each_assoc { |a| a }     # a => ['x','m']
['m'].each_assoc { |a| a }          # a => [0,'m']
One could easily argue that an Assoc class would be quite useful here,
rather than relying on 2-element Array to fulfill the role. With that
in hand, it would be easy enough to add the #<< method for enumerable
construction.

I don't think there's any reason to expect just the value when
iterating through a hash. In fact if you're using a hash, you
probably have a reason for storing data that way and iterating
pair-wise seems logical to me.

I think it's a question of how one looks at the underlying types or
behaviors. I don't think the language has to converge around the
smallest possible number of interfaces. You can look at it the other
way around. It's useful (extremely) to have hashes, and to iterate
over them in pairs. Enumerable is one way to help introduce that
construct into the language -- not a one-stop-shopping hash
implementation, but helpful.

If we then find fault with hash behavior because it's not in line
precisely with other enumerables, that's a kind of reverse logic; it's
a way to talk the language out of having something useful, which I
don't think is a good idea. Enumerable not only allows but requires
that each class implement #each, and there's no constraint that every
enumerable class has to yield exactly one value at a time. I'd want
to see more concrete evidence that having hashes and arrays behave
differently is really creating problems before wanting to normalize
them around one construct.

I don't necessarily disagree with you. I was following the natural
conclusion of one possible perspective, namely, that a hash key
corresponds to the array index. It's one possible way to fix the
issue posited by the thread. And yet, if our hashes were in fact
ordered, as some have asked for, this assumption would fail. So I'm
not actually for it, but it does offer some contrast.

Consider the ordered hash perspective. We have an index, plus a key and
a value. So in this case, what exactly are we enumerating? We say
"pairs" as if it is something, but Ruby doesn't really have such a
thing. The closest we come to is a 2-element array. Perhaps that is
enough, but it's hardly embraced as such. We see it only in the
iteration of #each _if_ we use a single var. There is no
Hash.new([:a,'x'],[:b,'y']) or hash << [:a,'x'], etc. If there were, I
think this issue wouldn't exist. I think Enumerable would be a little
more robust, and we could expect that a Hash be returned from #select.
While the 2-element array covers the need, maybe not so much the want,
and we might even consider a real Pair object:

pair = (:a => 'x')
pair.key #=> :a
pair.value #=> 'x'
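
The pair literal above is not valid Ruby as it stands. As a rough approximation, a hypothetical Pair can be sketched with Struct; the names Pair, pairs, hash and back are only for illustration.

```ruby
# Hypothetical Pair class; Ruby has no built-in pair literal.
Pair = Struct.new(:key, :value)

pair = Pair.new(:a, 'x')
pair.key    # :a
pair.value  # 'x'

# Building a Hash from Pair objects, and Pair objects from a Hash.
pairs = [Pair.new(:a, 'x'), Pair.new(:b, 'y')]
hash  = pairs.each_with_object({}) { |pr, h| h[pr.key] = pr.value }
back  = hash.map { |k, v| Pair.new(k, v) }
```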

If we don't take this perspective (regardless of an actual Pair
class, or not) I don't see any good reason to have Enumerable included
in Hash. Just define the desired "enumerating" methods on Hash itself
--just like #reject. But personally, I'd prefer we get Enumerable
right.

And while we're on the subject --it seems that's exactly what we're
doing with String. I hear that String will no longer be Enumerable in
future Ruby. I really just don't get this. What's so problematic with
a default "view" of strings as ordered-sets of characters? All it
requires is the proper definition of #each. Clearly the way things are
now is broken. But does this really require us to scrap String
enumerability altogether? At the very least, how terribly inefficient
it will be to have to convert a string to an array of characters (e.g.
bunches of little strings), just to iterate over it.
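
On the efficiency worry, a hedged note, assuming a Ruby version that provides String#each_char (current versions do): it walks a string one character at a time without first building an array of all the characters, although each character is still handed over as a small string.

```ruby
vowels = 0
"enumerable".each_char { |ch| vowels += 1 if "aeiou".include?(ch) }

# Without a block, each_char returns an Enumerator, so Enumerable
# methods are still available on demand.
upper = "abc".each_char.map { |ch| ch.upcase }
```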

T.
 