Making Hash a first-class citizen

  • Thread starter Erik Michaels-Ober

Erik Michaels-Ober

[Note: parts of this message were removed to make it a legal post.]

I've noticed a couple of inconsistencies in the way Ruby hashes are treated
(compared to, say, arrays). While I've been able to monkey-patch solutions,
I thought I would bring these inconsistencies to the attention of the group
in the hope that they might be resolved in a future version of the
language, which I love dearly.

For starters, there's no to_hash method on nil. This causes a problem when
I want to call an instance method on an object that may or may not be nil
(for example, a hash of parameters from an HTTP request).

A common way to avoid a NoMethodError is by casting an object to a
particular type before calling a method on it. For example:

string_or_nil.to_s.capitalize

This technique can be used for arrays, floats, integers, and strings. It
would seem to follow that I could do the same for hashes:

hash_or_nil.to_hash.rehash

but instead I must do something like this:

(hash_or_nil || nil).rehash

or patch NilClass like this:

class NilClass
  def to_hash
    {}
  end
end

which seems to me like it shouldn't be necessary for a "primitive" type.
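
As a minimal sketch of the idiom described above, the snippet below shows the conversions nil already supports and one way to get an empty hash without reopening NilClass. The helper name hash_or_empty is hypothetical and not part of Ruby or this thread; Kernel#Hash and NilClass#to_h are assumed to be available, which holds for Ruby 2.0 and later but not for the 1.9 series.

```ruby
# nil already converts to the "empty" value of several core types.
nil.to_s  # => ""
nil.to_a  # => []
nil.to_i  # => 0
nil.to_f  # => 0.0

# Hypothetical helper (not from the thread): normalise a value that may be
# nil into a Hash without monkey-patching NilClass.
def hash_or_empty(value)
  value || {}
end

params = nil
hash_or_empty(params).rehash          # => {}
hash_or_empty({ :a => 1 }).keys       # => [:a]

# Assumption: Ruby 2.0 or later, where these conversions exist.
Hash(nil)                             # => {}
nil.to_h                              # => {}
```

Note that to_h is the explicit conversion; nil still does not respond to to_hash, which Ruby reserves for implicit conversion.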


Second, I would argue that there should be + (plus), - (minus), and &
(ampersand) methods on hashes, that function the same way they do for arrays
(concatenation, difference, and intersection, respectively).

These few changes would go a long way toward making Hash a first-class
citizen in Ruby.
 

Joel VanderWerf

Erik said:
Second, I would argue that there should be + (plus), - (minus), and &
(ampersand) methods on hashes, that function the same way they do for arrays
(concatenation, difference, and intersection, respectively).

Hash is a very flexible citizen, though. It plays many roles. So how
does this come out:

{0=>1, 1=>0} - {1=>1} = ?

{0=>1, 1=>0}
# if you look at a hash as a set of pairs

{0=>0}
# if you look at this hash as another way of expressing
# the same indexed collection as the array [1,0], and use
# Array#-

{0=>1}
# if you look at a hash as a set of keys, with the value, as
# a boolean, representing membership or non-membership in the set

{0=>1, 1=>-1}
# maybe, if you look at a hash as a mathematical function

And then there are "Bags":

{1=>6} - {1=>1} == {1=>5}
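
Each of the readings above can be written out with existing Hash methods, which makes the ambiguity concrete. This is only a sketch of one possible implementation per interpretation; the variable names are made up for illustration and none of this is an existing Hash#- method.

```ruby
a = { 0 => 1, 1 => 0 }
b = { 1 => 1 }

# 1. Set of pairs: remove a pair only when the identical pair is in b.
pairs = a.reject { |k, v| b.key?(k) && b[k] == v }
# => {0=>1, 1=>0}

# 2. Indexed collection: treat a as the array [1, 0], apply Array#-,
#    then re-index the surviving values.
values  = a.sort.map { |_, v| v } - b.values
indexed = Hash[values.each_with_index.map { |v, i| [i, v] }]
# => {0=>0}

# 3. Set of keys with a membership flag (0 means "not a member").
members = a.select { |k, v| v != 0 && b.fetch(k, 0) == 0 }
# => {0=>1}

# 4. Mathematical function: subtract pointwise, missing keys count as 0.
pointwise = Hash[a.map { |k, v| [k, v - b.fetch(k, 0)] }]
# => {0=>1, 1=>-1}

# 5. Bag (multiset): subtract counts and drop anything that reaches zero.
bag_a = { 1 => 6 }
bag_b = { 1 => 1 }
bag = Hash[bag_a.map { |k, n| [k, n - bag_b.fetch(k, 0)] }].reject { |_, n| n <= 0 }
# => {1=>5}
```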
 

Erik Michaels-Ober

Joel VanderWerf said:
Hash is a very flexible citizen, though. It plays many roles. So how does
this come out:

{0=>1, 1=>0} - {1=>1} = ?

{0=>1, 1=>0}
# if you look at a hash as a set of pairs


I do look at Hash as a set of pairs and I would consider the other uses you
cite to be "non-standard". The first line of RDoc for the class states: "A
Hash is a collection of key-value pairs."

I would argue it's better for +, -, and & to be defined for the standard use
case than to remain undefined in the language. Those using Hash in a
non-standard way can simply avoid these methods.

You could just as well argue that someone might want [6] - [1] to return
[5], but that wouldn't make sense given that "Arrays are ordered,
integer-indexed collections of any object."


Any objections to nil.to_hash?

Note: in the example I meant to type (hash_or_nil || {}).rehash instead of
(hash_or_nil || nil).rehash
 

Robert Klemme

Erik said:
I do look at Hash as a set of pairs and I would consider the other uses you
cite to be "non-standard". The first line of RDoc for the class states: "A
Hash is a collection of key-value pairs."

You omit a very important additional property: no key can occur twice in
a Hash. Actually I would consider at least option one and three of
Joel's list as common usage of a Hash.

Erik said:
I would argue it's better for +, -, and & to be defined for the standard use
case than to remain undefined in the language. Those using Hash in a
non-standard way can simply avoid these methods.

The problem is that there is no standard use case. Removing based on
the identical pair seems to me as valid as removing based on the key
only as implementations for Hash#-.

Concatenation does not make sense with an unordered collection. See
also various discussions why Hash does not (or rather did not) implement
#hash and #eql? as many people expected.

Erik said:
You could just as well argue that someone might want [6] - [1] to return
[5], but that wouldn't make sense given that "Arrays are ordered,
integer-indexed collections of any object."

Well, the implementation of Array#- does not fit there well, does it?
Because it works like set subtraction while Array#+ works as array
concatenation - not very consistent either - but apparently useful.
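
A short illustration of that asymmetry, using the standard Array operators on made-up sample data:

```ruby
a = [1, 2, 2, 3]
b = [2, 4]

# Array#+ is plain concatenation: order and duplicates are preserved.
sum = a + b      # => [1, 2, 2, 3, 2, 4]

# Array#- is set-like: every occurrence of an element found in b is removed.
diff = a - b     # => [1, 3]

# Array#& is set-like as well: the result contains no duplicates.
inter = a & b    # => [2]
```
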

Erik said:
Any objections to nil.to_hash?

Note: in the example I meant to type (hash_or_nil || {}).rehash instead of
(hash_or_nil || nil).rehash

Which other classes implement to_hash? In 1.9.1:

irb(main):007:0> ObjectSpace.each_object(Module) {|cl| p cl if cl.instance_methods.include? :to_hash}
Hash
=> 407
irb(main):008:0>

Btw, invoking x.to_hash is not a cast. First, there are no casts in
Ruby because variables do not have types and second, to_hash is an
ordinary method. Casts on the other hand are usually not implemented
via ordinary methods. (In C++ for example casts are operators which can
be overloaded.)

Kind regards

robert
 

lith

Robert Klemme said:
Concatenation does not make sense with an unordered collection.

There are also the merge and delete methods. Should + and - simply be
aliases for those? I personally would favour making those operators
work on the set of keys.
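
A rough sketch of what key-based operators could look like, expressed with methods Hash already has. The sample hashes and variable names are invented for illustration, and the choice that values from the right-hand hash win on "+" is simply the behaviour of Hash#merge, not something settled in this discussion.

```ruby
a = { :x => 1, :y => 2 }
b = { :y => 20, :z => 30 }

# "+" as merge: union of keys; on a shared key the right-hand value wins.
plus = a.merge(b)                        # => {:x=>1, :y=>20, :z=>30}

# "-" by key only: drop every key that also appears in b.
minus = a.reject { |k, _| b.key?(k) }    # => {:x=>1}

# "&" by key only: keep the keys present in both, with a's values.
both = a.select { |k, _| b.key?(k) }     # => {:y=>2}
```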

Nil is nil. Maybe there is a reason why the variable/value in question
is nil and not {}? If so, you probably shouldn't ignore that
distinction.
 
