Why Does Hash Apparently Reorder Its Internal Representation AndOther Associated Ponderings

T

thoran

I was surprised to find the following...

irb(main):001:0> {:a => 'a', :b => 'b'}
=> {:a=>"a", :b=>"b"}
irb(main):002:0> {:a => 'a', :b => 'b', :c => 'c'}
=> {:a=>"a", :b=>"b", :c=>"c"}
irb(main):003:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd'}
=> {:a=>"a", :d=>"d", :b=>"b", :c=>"c"}
irb(main):004:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd', :e =>
'e'}
=> {:a=>"a", :d=>"d", :b=>"b", :e=>"e", :c=>"c"}
irb(main):005:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd', :e =>
'e', :f => 'f'}
=> {:a=>"a", :d=>"d", :b=>"b", :e=>"e", :c=>"c", :f=>"f"}

I expected that the sequence of hashes returned would be identical in
order to the presented sequence. Or, that there would be some
discernable sequence such as if a stack, if not as a queue.

I realise that it is a hash and that access is expected to be by key,
but sometimes retaining the presented order may be desired. And
anyway, why reorder it at all?

Furthermore, the returned order is different from a previous sequence
I did! So, it appears that this is somewhat random, however unlikely
that it is.

I then assumed that there might be some kind of optimisation going on,
but how much optimisation of {:a => 'a', :b => 'b', ... } can there
be?

It doesn't seem very PoLS to have it reordered, although perhaps one
shouldn't be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: "Make no assumptions about the order of hashes!"

And would eval %{ instance_of_hash[instance_of_fixnum] } really be so
evil? Perhaps that was a little obscure... What's wrong with ordered
access, using a numeric element reference as with Array, to Hash?
Excepting that the order can't be relied upon, but assume that it
could. Even simpler might have been to give an example:

h = {:a => 'a', :b => 'b'}
h[0]
=> 'a'

Similarly one might be able to treat an Array like a Hash as in eval
%{ instance_of_array['instance_of_string_representation_of_a_fixnum']
}, such as with:

a = [ 1, 2 ]
a['0']
=> 1

There's no great call for the immediately above I would think, but if
I implemented one then I would implement the other also, simply for
the purposes of symmetry. I'm not even sure there is any need for
either, such that I may be trying to achieve the unnecessary?...

Even so, would some unwritten law be being broken if I did this stuff?
 
P

Phillip Hutchings

I was surprised to find the following...

Really? Take a comsci course at uni
I realise that it is a hash and that access is expected to be by key,
but sometimes retaining the presented order may be desired. And
anyway, why reorder it at all?

It's not 'reordered', it's stored in a hash table. Think of how MD5
represents a unique key for almost every possible value, but the hash
has no real connection to the original value. A hash table is stored
using similar keys, though you are allowed duplicate keys for a hash
table, and if you go in to the mathematical theory of it there is a
lot of speed gained this way especially if you're using large keys,
such as in i18n applications where each string is stored in a hash
table. That's why it's called a hash and not an array.
 
D

Daniel Baird

I was surprised to find the following...

irb(main):001:0> {:a => 'a', :b => 'b'}
=> {:a=>"a", :b=>"b"} [..]
I then assumed that there might be some kind of optimisation going on,
but how much optimisation of {:a => 'a', :b => 'b', ... } can there
be?

Hashes are unordered so that Ruby can use the hash function of the
keys. Read up on hashes at wikipedia:
http://en.wikipedia.org/wiki/Hash_table

There was a thread about making an "ordered hash" in the last few
weeks. Worth searching for.

;Daniel
 
H

Hal Fulton

I was surprised to find the following...

[snip]

This has been discussed many times over the last
six years. Do a search of the archives.
It doesn't seem very PoLS to have it reordered, although perhaps one
shouldn't be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: "Make no assumptions about the order of hashes!"

Don't invoke POLS. It's Matz's surprise that matters, not yours or
mine. And his voice is anything but loud.
Even so, would some unwritten law be being broken if I did this stuff?

Personally I favor introducing an ordered hash of some form into Ruby.
Other people don't. Many want the original Hash kept as it is for speed
(though I am still unconvinced that merely keeping a sequence number
along with each key would impact speed dramatically).

What might be best is a class that inherits from Hash and preserves
order. We'd need a literal notation, however.


Hal
 
Y

Yukihiro Matsumoto

Hi,

In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"

|Personally I favor introducing an ordered hash of some form into Ruby.
|Other people don't. Many want the original Hash kept as it is for speed
|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven't yet read the "A use case
for an ordered hash" thread yet. I am facing a huge mountain of mails
after the vacation.

matz.
 
H

Hal Fulton

Yukihiro said:
Hi,

In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"

|Personally I favor introducing an ordered hash of some form into Ruby.
|Other people don't. Many want the original Hash kept as it is for speed
|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven't yet read the "A use case
for an ordered hash" thread yet. I am facing a huge mountain of mails
after the vacation.

My use case (I started that thread) may not be compelling. Other people,
including Bil Kleb, have said it would be useful to them also.

Bil's example was related to NASA's CEV. So an ordered hash would help
put people back on the moon. ;)


Hal
 
B

Bil Kleb

Hal said:
My use case (I started that thread) may not be compelling. Other people,
including Bil Kleb, have said it would be useful to them also.

So far, I've managed to duplicate the functionality of two
Java codes having 834 and 1084 lines of codes with 24 and 55
lines of Ruby, respectively. The second one is so long[1]
due to the lack of ordered Hashes (and my lack of Ruby mastery).

FWIW, the Ruby versions are more robust and general than the
Java versions due to the mini DSL I wrote for expressing
tolerances in a natural way.

Thanks again for Ruby!

Regards,
 
W

William James

Yukihiro said:
Hi,

In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"

|Personally I favor introducing an ordered hash of some form into Ruby.
|Other people don't. Many want the original Hash kept as it is for speed
|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance.

Ruby is already slow enough. Those who are griping should be
using association lists.
 
W

William James

Bil said:
Hal said:
My use case (I started that thread) may not be compelling. Other people,
including Bil Kleb, have said it would be useful to them also.

So far, I've managed to duplicate the functionality of two
Java codes having 834 and 1084 lines of codes with 24 and 55
lines of Ruby, respectively. The second one is so long[1]
due to the lack of ordered Hashes

Did you try association lists?

class Array
def to_assoc
f=nil ; partition{f=!f}.transpose
end
def set k, v
(pair = assoc(k)) ? self[ index(pair) ] = [k,v] : push( [k,v] )
end
def rm k
(pair = assoc( k )) && slice!( index( pair ) )
end
end

a = [:foo,22, :bar,33, :baz,44].to_assoc
a.set( :bar, 99 )
a.set :yes, -1
a.rm :baz
p a
 
B

Bil Kleb

William said:
Did you try association lists?

Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.

Speed is not an issue (for me). The 5,000 simulations
I am running take days to run. Even if the Ruby I am
use to set them up takes 5 minutes instead of 5 seconds,
I'll take the beauty of an ordered Hash over association
lists any day.

Regards,
 
M

Martin DeMello

Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.

You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

m.
 
H

Hal Fulton

Martin said:
Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.


You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

There are any number of ways to do this sort of thing.
But they all suffer from not having a convenient notation
for literals.


Hal
 
B

Bil Kleb

Martin said:
You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

Hmmm... good idea. I've largely missed out on the
whole method_missing idiom. Sounds like a good use.

I'll try to look into it after I return from JPL next
week. However, if you'd like to throw down an example,
I might be able to work it in now...

Thanks,
 
M

Martin DeMello

Martin said:
You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

Hmmm... good idea. I've largely missed out on the
whole method_missing idiom. Sounds like a good use.

I'll try to look into it after I return from JPL next
week. However, if you'd like to throw down an example,
I might be able to work it in now...

Quick proof of concept:

require 'enumerator'

class OHash
include Enumerable

def initialize
@a = []
@h = {}
end

def []=(k,v)
@a.delete(k) if @h[k]
@h[k] = v
@a << k
end

def delete(k, &blk)
@a.delete(k)
@h.delete(k)
end

def each
p @a, @h
each_key {|k| yield [k, @h[k]]}
end

def each_key
@a.each {|k| yield k}
end

def method_missing(*args)
@h.send(*args)
end
end

def o(*ary)
r = OHash.new
ary.each_slice(2) {|k,v| r[k] = v }
r
end

# testing
a = o("hello", "world", :foo, "bar", 5, 6)
a.each {|k,v| p [k,v]}
puts a["hello"]
a["hello"] = 5
a.each {|k,v| p [k,v]}

martin
 
J

Just Another Victim of the Ambient Morality

I was surprised to find the following...

I realise that it is a hash and that access is expected to be by key, but
sometimes retaining the presented order may be desired. And anyway, why
reorder it at all?

I am always surprised whenever this question comes up. Whenever it
does, it's obvious that the poster does not know what a hash is.

I think what you are thinking of is a red black tree (or just a binary
tree, in general) and not a hash...

It doesn't seem very PoLS to have it reordered, although perhaps one
shouldn't be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: "Make no assumptions about the order of hashes!"

People invoke the PoLS way too often considering it's a claim that Matz
has never made.
Personally, the Principle of Least Surprise is more about consistency
than it is about how often you're literally surprised. Often, you're
surprised because of your own ignorance (as in this instance), so you can
hardly blame Ruby for that! Ruby is very consistent and, in my opinion,
follows the PoLS. Whenever it doesn't, it ususally does so for a good
(pragmatic) reason...


Honestly, complaining that hashes aren't ordered is like complaining
that rand() doesn't return the same number every time. Pray tell, what
made you think it should be ordered? That was an honest question, by the
way! You know enough about programming to come here and decree that this
is very weird yet you didn't already know what a hash is or how one works.
Very strange...
 
P

Pedro Côrte-Real

def method_missing(*args)
@h.send(*args)
end
end

You probably want to define respond_to? as well. Something like (untested code):

def respond_to?(*args)
@h.respond_to?(*args)
end

So that the new hash correctly answers which methods it responds to
(the same as Hash).

Pedro.
 
J

Just Another Victim of the Ambient Morality

Martin DeMello said:
Martin said:
You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

Hmmm... good idea. I've largely missed out on the
whole method_missing idiom. Sounds like a good use.

I'll try to look into it after I return from JPL next
week. However, if you'd like to throw down an example,
I might be able to work it in now...

Quick proof of concept:

require 'enumerator'

class OHash
include Enumerable

...

def method_missing(*args)
@h.send(*args)
end
end

Is there a reason why you don't simply inheret from Hash, or is this
just another one of those There's More Than One Way To Do It scenarios?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,898
Latest member
BlairH7607

Latest Threads

Top