Why Does Hash Apparently Reorder Its Internal Representation AndOther Associated Ponderings

Discussion in 'Ruby' started by thoran@thoran.com, Aug 21, 2006.

1. Guest

I was surprised to find the following...

irb(main):001:0> {:a => 'a', :b => 'b'}
=> {:a=>"a", :b=>"b"}
irb(main):002:0> {:a => 'a', :b => 'b', :c => 'c'}
=> {:a=>"a", :b=>"b", :c=>"c"}
irb(main):003:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd'}
=> {:a=>"a", :d=>"d", :b=>"b", :c=>"c"}
irb(main):004:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd', :e =>
'e'}
=> {:a=>"a", :d=>"d", :b=>"b", :e=>"e", :c=>"c"}
irb(main):005:0> {:a => 'a', :b => 'b', :c => 'c', :d => 'd', :e =>
'e', :f => 'f'}
=> {:a=>"a", :d=>"d", :b=>"b", :e=>"e", :c=>"c", :f=>"f"}

I expected that the sequence of hashes returned would be identical in
order to the presented sequence. Or, that there would be some
discernable sequence such as if a stack, if not as a queue.

I realise that it is a hash and that access is expected to be by key,
but sometimes retaining the presented order may be desired. And
anyway, why reorder it at all?

Furthermore, the returned order is different from a previous sequence
I did! So, it appears that this is somewhat random, however unlikely
that it is.

I then assumed that there might be some kind of optimisation going on,
but how much optimisation of {:a => 'a', :b => 'b', ... } can there
be?

It doesn't seem very PoLS to have it reordered, although perhaps one
shouldn't be surprised that a hash is unordered? Perhaps Matz is
convincing us of this statement? Said Matz unto the flock in a loud,
Godly voice: "Make no assumptions about the order of hashes!"

And would eval %{ instance_of_hash[instance_of_fixnum] } really be so
evil? Perhaps that was a little obscure... What's wrong with ordered
access, using a numeric element reference as with Array, to Hash?
Excepting that the order can't be relied upon, but assume that it
could. Even simpler might have been to give an example:

h = {:a => 'a', :b => 'b'}
h[0]
=> 'a'

Similarly one might be able to treat an Array like a Hash as in eval
%{ instance_of_array['instance_of_string_representation_of_a_fixnum']
}, such as with:

a = [ 1, 2 ]
a['0']
=> 1

There's no great call for the immediately above I would think, but if
I implemented one then I would implement the other also, simply for
the purposes of symmetry. I'm not even sure there is any need for
either, such that I may be trying to achieve the unnecessary?...

Even so, would some unwritten law be being broken if I did this stuff?

, Aug 21, 2006

2. Phillip HutchingsGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On 8/21/06, thoran @ thoran. com <> wrote:
> I was surprised to find the following...

Really? Take a comsci course at uni

> I realise that it is a hash and that access is expected to be by key,
> but sometimes retaining the presented order may be desired. And
> anyway, why reorder it at all?

It's not 'reordered', it's stored in a hash table. Think of how MD5
represents a unique key for almost every possible value, but the hash
has no real connection to the original value. A hash table is stored
using similar keys, though you are allowed duplicate keys for a hash
table, and if you go in to the mathematical theory of it there is a
lot of speed gained this way especially if you're using large keys,
such as in i18n applications where each string is stored in a hash
table. That's why it's called a hash and not an array.

--
Phillip Hutchings
http://www.sitharus.com/

Phillip Hutchings, Aug 21, 2006

3. Daniel BairdGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On 8/21/06, thoran @ thoran. com <> wrote:
> I was surprised to find the following...
>
> irb(main):001:0> {:a => 'a', :b => 'b'}
> => {:a=>"a", :b=>"b"}

[..]
> I then assumed that there might be some kind of optimisation going on,
> but how much optimisation of {:a => 'a', :b => 'b', ... } can there
> be?

Hashes are unordered so that Ruby can use the hash function of the
keys. Read up on hashes at wikipedia:
http://en.wikipedia.org/wiki/Hash_table

There was a thread about making an "ordered hash" in the last few
weeks. Worth searching for.

;Daniel

--
Daniel Baird
http://tiddlyspot.com (free, effortless TiddlyWiki hosting)
http://danielbaird.com (TiddlyW;nks! :: Whiteboard Koala :: Blog ::
Things That Suck)

Daniel Baird, Aug 21, 2006
4. Hal FultonGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

wrote:
> I was surprised to find the following...

[snip]

This has been discussed many times over the last
six years. Do a search of the archives.
>
> It doesn't seem very PoLS to have it reordered, although perhaps one
> shouldn't be surprised that a hash is unordered? Perhaps Matz is
> convincing us of this statement? Said Matz unto the flock in a loud,
> Godly voice: "Make no assumptions about the order of hashes!"

Don't invoke POLS. It's Matz's surprise that matters, not yours or
mine. And his voice is anything but loud.

>
> Even so, would some unwritten law be being broken if I did this stuff?
>

Personally I favor introducing an ordered hash of some form into Ruby.
Other people don't. Many want the original Hash kept as it is for speed
(though I am still unconvinced that merely keeping a sequence number
along with each key would impact speed dramatically).

What might be best is a class that inherits from Hash and preserves
order. We'd need a literal notation, however.

Hal

Hal Fulton, Aug 21, 2006
5. Yukihiro MatsumotoGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

Hi,

In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"
on Mon, 21 Aug 2006 16:37:00 +0900, Hal Fulton <> writes:

|Personally I favor introducing an ordered hash of some form into Ruby.
|Other people don't. Many want the original Hash kept as it is for speed
|(though I am still unconvinced that merely keeping a sequence number
|along with each key would impact speed dramatically).

I am open to introduce order into 1.9 Hash, as long as we can
accomplish reasonable performance. I haven't yet read the "A use case
for an ordered hash" thread yet. I am facing a huge mountain of mails
after the vacation.

matz.

Yukihiro Matsumoto, Aug 21, 2006
6. Hal FultonGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"
> on Mon, 21 Aug 2006 16:37:00 +0900, Hal Fulton <> writes:
>
> |Personally I favor introducing an ordered hash of some form into Ruby.
> |Other people don't. Many want the original Hash kept as it is for speed
> |(though I am still unconvinced that merely keeping a sequence number
> |along with each key would impact speed dramatically).
>
> I am open to introduce order into 1.9 Hash, as long as we can
> accomplish reasonable performance. I haven't yet read the "A use case
> for an ordered hash" thread yet. I am facing a huge mountain of mails
> after the vacation.

My use case (I started that thread) may not be compelling. Other people,
including Bil Kleb, have said it would be useful to them also.

Bil's example was related to NASA's CEV. So an ordered hash would help
put people back on the moon.

Hal

Hal Fulton, Aug 21, 2006
7. Bil KlebGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

Hal Fulton wrote:
>
> My use case (I started that thread) may not be compelling. Other people,
> including Bil Kleb, have said it would be useful to them also.

So far, I've managed to duplicate the functionality of two
Java codes having 834 and 1084 lines of codes with 24 and 55
lines of Ruby, respectively. The second one is so long[1]
due to the lack of ordered Hashes (and my lack of Ruby mastery).

FWIW, the Ruby versions are more robust and general than the
Java versions due to the mini DSL I wrote for expressing
tolerances in a natural way.

Thanks again for Ruby!

Regards,
--
Bil
http://fun3d.larc.nasa.gov

[1] If you can call 55 LOC long!

Bil Kleb, Aug 21, 2006
8. William JamesGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

Yukihiro Matsumoto wrote:
> Hi,
>
> In message "Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings"
> on Mon, 21 Aug 2006 16:37:00 +0900, Hal Fulton <> writes:
>
> |Personally I favor introducing an ordered hash of some form into Ruby.
> |Other people don't. Many want the original Hash kept as it is for speed
> |(though I am still unconvinced that merely keeping a sequence number
> |along with each key would impact speed dramatically).
>
> I am open to introduce order into 1.9 Hash, as long as we can
> accomplish reasonable performance.

Ruby is already slow enough. Those who are griping should be
using association lists.

William James, Aug 21, 2006
9. Hal FultonGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

William James wrote:
>
> Ruby is already slow enough. Those who are griping should be
> using association lists.
>

What's an association list?

Hal

Hal Fulton, Aug 21, 2006
10. William JamesGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

Bil Kleb wrote:
> Hal Fulton wrote:
> >
> > My use case (I started that thread) may not be compelling. Other people,
> > including Bil Kleb, have said it would be useful to them also.

>
> So far, I've managed to duplicate the functionality of two
> Java codes having 834 and 1084 lines of codes with 24 and 55
> lines of Ruby, respectively. The second one is so long[1]
> due to the lack of ordered Hashes

Did you try association lists?

class Array
def to_assoc
f=nil ; partition{f=!f}.transpose
end
def set k, v
(pair = assoc(k)) ? self[ index(pair) ] = [k,v] : push( [k,v] )
end
def rm k
(pair = assoc( k )) && slice!( index( pair ) )
end
end

a = [:foo,22, :bar,33, :baz,44].to_assoc
a.set( :bar, 99 )
a.set :yes, -1
a.rm :baz
p a

William James, Aug 21, 2006
11. Bil KlebGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

William James wrote:
>
> Did you try association lists?

Sort of, but I liked the interface of Hash too much
to abandon it. So far, I am carrying along an array
of keys in order of creation for the one place that
I need it. Otherwise, I have the beauty of the stock
Hash interface at my disposal.

Speed is not an issue (for me). The 5,000 simulations
I am running take days to run. Even if the Ruby I am
use to set them up takes 5 minutes instead of 5 seconds,
I'll take the beauty of an ordered Hash over association
lists any day.

Regards,
--
Bil
http://fun3d.larc.nasa.gov

Bil Kleb, Aug 21, 2006
12. Martin DeMelloGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On 8/21/06, Bil Kleb <> wrote:
>
> Sort of, but I liked the interface of Hash too much
> to abandon it. So far, I am carrying along an array
> of keys in order of creation for the one place that
> I need it. Otherwise, I have the beauty of the stock
> Hash interface at my disposal.

You can do this automatically (if you aren't already) by creating an
object that holds a hash and an array, defines []=, each and delete to
do the right thing, and delegates missing methods to the hash.

m.

Martin DeMello, Aug 21, 2006
13. Hal FultonGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

Martin DeMello wrote:
> On 8/21/06, Bil Kleb <> wrote:
>
>>
>> Sort of, but I liked the interface of Hash too much
>> to abandon it. So far, I am carrying along an array
>> of keys in order of creation for the one place that
>> I need it. Otherwise, I have the beauty of the stock
>> Hash interface at my disposal.

>
>
> You can do this automatically (if you aren't already) by creating an
> object that holds a hash and an array, defines []=, each and delete to
> do the right thing, and delegates missing methods to the hash.

There are any number of ways to do this sort of thing.
But they all suffer from not having a convenient notation
for literals.

Hal

Hal Fulton, Aug 21, 2006
14. Bil KlebGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

Martin DeMello wrote:
> On 8/21/06, Bil Kleb <> wrote:
>
> You can do this automatically (if you aren't already) by creating an
> object that holds a hash and an array, defines []=, each and delete to
> do the right thing, and delegates missing methods to the hash.

Hmmm... good idea. I've largely missed out on the
whole method_missing idiom. Sounds like a good use.

I'll try to look into it after I return from JPL next
week. However, if you'd like to throw down an example,
I might be able to work it in now...

Thanks,
--
Bil
http://fun3d.larc.nasa.gov

Bil Kleb, Aug 21, 2006
15. Martin DeMelloGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On 8/21/06, Bil Kleb <> wrote:
> Martin DeMello wrote:
> > On 8/21/06, Bil Kleb <> wrote:
> >
> > You can do this automatically (if you aren't already) by creating an
> > object that holds a hash and an array, defines []=, each and delete to
> > do the right thing, and delegates missing methods to the hash.

>
> Hmmm... good idea. I've largely missed out on the
> whole method_missing idiom. Sounds like a good use.
>
> I'll try to look into it after I return from JPL next
> week. However, if you'd like to throw down an example,
> I might be able to work it in now...

Quick proof of concept:

require 'enumerator'

class OHash
include Enumerable

def initialize
@a = []
@h = {}
end

def []=(k,v)
@a.delete(k) if @h[k]
@h[k] = v
@a << k
end

def delete(k, &blk)
@a.delete(k)
@h.delete(k)
end

def each
p @a, @h
each_key {|k| yield [k, @h[k]]}
end

def each_key
@a.each {|k| yield k}
end

def method_missing(*args)
@h.send(*args)
end
end

def o(*ary)
r = OHash.new
ary.each_slice(2) {|k,v| r[k] = v }
r
end

# testing
a = o("hello", "world", :foo, "bar", 5, 6)
a.each {|k,v| p [k,v]}
puts a["hello"]
a["hello"] = 5
a.each {|k,v| p [k,v]}

martin

Martin DeMello, Aug 21, 2006
16. Bil KlebGuest

Re: Why Does Hash Apparently Reorder Its Internal RepresentationAnd Other Associated Ponderings

Martin DeMello wrote:
>
> Quick proof of concept:

Thanks!

Later,
--
Bil
http://fun3d.larc.nasa.gov

Bil Kleb, Aug 21, 2006
17. Just Another Victim of the Ambient MoralityGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

<> wrote in message
news:2d4684b07b9f965e5078fdb46754fac3@shiny...
>I was surprised to find the following...
>
> I realise that it is a hash and that access is expected to be by key, but
> sometimes retaining the presented order may be desired. And anyway, why
> reorder it at all?

I am always surprised whenever this question comes up. Whenever it
does, it's obvious that the poster does not know what a hash is.

I think what you are thinking of is a red black tree (or just a binary
tree, in general) and not a hash...

> It doesn't seem very PoLS to have it reordered, although perhaps one
> shouldn't be surprised that a hash is unordered? Perhaps Matz is
> convincing us of this statement? Said Matz unto the flock in a loud,
> Godly voice: "Make no assumptions about the order of hashes!"

People invoke the PoLS way too often considering it's a claim that Matz
Personally, the Principle of Least Surprise is more about consistency
than it is about how often you're literally surprised. Often, you're
surprised because of your own ignorance (as in this instance), so you can
hardly blame Ruby for that! Ruby is very consistent and, in my opinion,
follows the PoLS. Whenever it doesn't, it ususally does so for a good
(pragmatic) reason...

Honestly, complaining that hashes aren't ordered is like complaining
that rand() doesn't return the same number every time. Pray tell, what
made you think it should be ordered? That was an honest question, by the
way! You know enough about programming to come here and decree that this
is very weird yet you didn't already know what a hash is or how one works.
Very strange...

Just Another Victim of the Ambient Morality, Aug 21, 2006
18. Chris GehlkerGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On Aug 21, 2006, at 9:05 AM, Just Another Victim of the Ambient
Morality wrote:

> s like complaining
> that rand() doesn't return the same number every time

OMG! That's why my project is so buggy. ;-)
--
Vegetarians eat Vegetables, Humanitarians frighten me

Chris Gehlker, Aug 21, 2006
19. Pedro Côrte-RealGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

On 8/21/06, Martin DeMello <> wrote:
> def method_missing(*args)
> @h.send(*args)
> end
> end

You probably want to define respond_to? as well. Something like (untested code):

def respond_to?(*args)
@h.respond_to?(*args)
end

So that the new hash correctly answers which methods it responds to
(the same as Hash).

Pedro.

Pedro Côrte-Real, Aug 21, 2006
20. Just Another Victim of the Ambient MoralityGuest

Re: Why Does Hash Apparently Reorder Its Internal Representation And Other Associated Ponderings

"Martin DeMello" <> wrote in message
news:...
> On 8/21/06, Bil Kleb <> wrote:
>> Martin DeMello wrote:
>> > On 8/21/06, Bil Kleb <> wrote:
>> >
>> > You can do this automatically (if you aren't already) by creating an
>> > object that holds a hash and an array, defines []=, each and delete to
>> > do the right thing, and delegates missing methods to the hash.

>>
>> Hmmm... good idea. I've largely missed out on the
>> whole method_missing idiom. Sounds like a good use.
>>
>> I'll try to look into it after I return from JPL next
>> week. However, if you'd like to throw down an example,
>> I might be able to work it in now...

>
> Quick proof of concept:
>
> require 'enumerator'
>
> class OHash
> include Enumerable
>
> ...
>
> def method_missing(*args)
> @h.send(*args)
> end
> end

Is there a reason why you don't simply inheret from Hash, or is this
just another one of those There's More Than One Way To Do It scenarios?

Just Another Victim of the Ambient Morality, Aug 21, 2006