A use case for an ordered hash

Chad Perrin · Aug 13, 2006

I'm in the habit of automagically writing...

hash.keys.sort

. . but if you want them in a *specific* order (other than that
provided by .sort), you'll have to write your own method, of course.

Hal Fulton · Aug 13, 2006

Phillip said:
Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob's statistics.
Sometimes you just don't want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person's benefit would kill the speed and jack up memory usage.

Your point is well taken.

However, it remains to be seen whether it would kill the speed.
Personally I doubt it would be significant, but I've seen no
numbers one way or the other.

Memory I would consider a more serious issue. Every pair in the
hash would occupy an extra N bytes (addmitedly a low value of N,
like 4). In the case of 30,000 pairs, that's an additional (say)
120K of RAM used.

As for "one person's benefit" -- it's true that the people
wanting this functionality are in the minority. But we are
far more than one person.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

That is definitely a better description of the finctionality.

I would even say that "associative array" alone would be a sufficient
term, since "array" implies order. (But I may be wrong there.)

I'd be happy with a separate class for that functionality *IF* there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

Hal

Trans · Aug 13, 2006

Hal said:
Your point is well taken.

However, it remains to be seen whether it would kill the speed.
Personally I doubt it would be significant, but I've seen no
numbers one way or the other.

Memory I would consider a more serious issue. Every pair in the
hash would occupy an extra N bytes (addmitedly a low value of N,
like 4). In the case of 30,000 pairs, that's an additional (say)
120K of RAM used.

As for "one person's benefit" -- it's true that the people
wanting this functionality are in the minority. But we are
far more than one person.

That is definitely a better description of the finctionality.

I would even say that "associative array" alone would be a sufficient
term, since "array" implies order. (But I may be wrong there.)

I'd be happy with a separate class for that functionality *IF* there
were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

+1 for Hal

Also, while not a full fledged class in it's own right, you might find
Facets' association.rb useful. Ex.

c = [ :a >> 1, :b >> 2 ]
c.each { |k,v| puts "#{k} associated with #{v} }

T.

Phillip Hutchings · Aug 13, 2006

I'd be happy with a separate class for that functionality *IF* there

were a convenient notation for literals. (For example, the combination
of square brackets and arrows that I mentioned previously, which Ruby
currently interprets as an array with a single element which is a hash.)

Without the literal notation, it would be to me just another kludge.

I would support this as there are uses for it.

Hal Fulton · Aug 14, 2006

Trans said:
Also, while not a full fledged class in it's own right, you might find
Facets' association.rb useful. Ex.

c = [ :a >> 1, :b >> 2 ]
c.each { |k,v| puts "#{k} associated with #{v} }

I've seen that, and it's kind of cool.

But you can't use an object as a key if it already defines >> for
something else (e.g. Fixnum).

Hal

Logan Capaldo · Aug 14, 2006

Trans said:
Trans said:

Also, while not a full fledged class in it's own right, you might
find
Facets' association.rb useful. Ex.
c = [ :a >> 1, :b >> 2 ]
c.each { |k,v| puts "#{k} associated with #{v} }

Click to expand...

I've seen that, and it's kind of cool.

But you can't use an object as a key if it already defines >> for
something else (e.g. Fixnum).

Hal

This is probably too verbose, even for a stop gap solution but:

class AssocList
def initialize(first, second)
@array = [ [ first.keys.first, first.values.first ],
[ second.keys.first, second.values.first ] ]
end

def <(other)
@array << [ other.keys.first, other.values.first ]
self
end
end

class Hash
def <(other)
AssocList.new(self, other)
end
end

{:a=>1} < {:b=>2} < {:c=>3}
=> #<AssocList:0x328564 @array=[[:a, 1], [:b, 2], [:c, 3]]>

I was going for f

a => 1 < :b => 2) but the operator precedence
isn't right

Austin Ziegler · Aug 14, 2006

I'm in the habit of automagically writing...

hash.keys.sort

or sometimes

hash.keys.sort_by{|k| hash[k]}.each{|k| value=hash[k]; ... }

In the case that Hal is talking about, and in the case where
PDF::Writer uses an ordered hash, this would be inappropriate.
Insertion-ordered associations are generally better. I will probably
be modifying PDF::Writer a bit to make it clear that it's using an
ordered association, not an ordered "hash", but it would be useful to
have an ordered association. Maybe we *could* take advantage of a
literal for this:

ordered_assoc = ( 1 => 2, 2 => 3, 'a' => 5 )

Maybe. For Ruby 2.0.

-austin

Hal Fulton · Aug 14, 2006

Austin said:
In the case that Hal is talking about, and in the case where
PDF::Writer uses an ordered hash, this would be inappropriate.
Insertion-ordered associations are generally better. I will probably
be modifying PDF::Writer a bit to make it clear that it's using an
ordered association, not an ordered "hash", but it would be useful to
have an ordered association. Maybe we *could* take advantage of a
literal for this:

ordered_assoc = ( 1 => 2, 2 => 3, 'a' => 5 )

Maybe. For Ruby 2.0.

Either (=>) or [=>] would be fine with me. I'd slightly prefer
the latter.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

I once suggested Map -- a nice short name -- but there was some
reason this wasn't acceptable.

I'd even suggest (gasp) letting it inherit from Hash.

Hal

Martin DeMello · Aug 14, 2006

Either (=>) or [=>] would be fine with me. I'd slightly prefer
the latter.

Or we could use another %?{ } constructor, which are, after all, being
reserved for future expansion.

But the big question is, what would the class be called? I dislike
long names such as Association or AssociativeArray or even
OrderedHash. (I suppose the last might be best of those.)

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Also, it would probably include http://www.rcrchive.net/rcr/show/306
along the way

m.

Austin Ziegler · Aug 14, 2006

OHash or Assoc or even Dict (though the latter at least implies an O(1) lookup).

Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

-austin

Martin DeMello · Aug 14, 2006

Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

True, I suppose you could do it with a doubly linked list overlaid on
a hash. Was thinking of sorted order.

martin

gwtmp01 · Aug 14, 2006

True, I suppose you could do it with a doubly linked list overlaid on
a hash. Was thinking of sorted order.

Can't you just maintain an internal array representing the key insertion
order? Something like:

class OHash < Hash
# ...
def each
keyseq.each { |k| yield k, self[k] }
end
end

Where keyseq is just an array that is updated whenever keys are
added/deleted?

Gary Wright

Austin Ziegler · Aug 14, 2006

Can't you just maintain an internal array representing the key insertion
order? Something like:

class OHash < Hash
# ...
def each
keyseq.each { |k| yield k, self[k] }
end
end

Where keyseq is just an array that is updated whenever keys are
added/deleted?

It's a little more complex than that, but that's pretty much what all
existing implementations do. We're talking about wanting a way of
having an ordered associative list (that defaults to insert ordering,
but can be changed to a sorted ordering, perhaps) in the Ruby core
language *with a literal constructor*, because there is value for it.

-austin

gwtmp01 · Aug 14, 2006

We're talking about wanting a way of
having an ordered associative list (that defaults to insert ordering,
but can be changed to a sorted ordering, perhaps) in the Ruby core
language *with a literal constructor*, because there is value for it.

Yep, I understand the need for the literal constructor. It is a bit
awkward to do:

OHash.new [ [1,2], [3,4], [4,5] ]

or something similar. Random ideas:

%o{ 1 => 2, 3 => 4 }
%o{ 1,2,3,4 }
{1,2,3,4}! # the ! indicates that order is important!
!{1,2,3,4}
%!{1,2,3,4}

Gary Wright

Bil Kleb · Aug 14, 2006

Hal said:
There have been numerous occasions when I wanted an
ordered hash, but usually I can't remember to write
them down.

Here's my latest one[1]:

I had a string with a bunch of tagged "tolerance fields",
e.g., >>1+/-10% 'tag'<< (a mini DSL?). Based on some sort
of distribution function, e.g., uniform or Gaussian,
I generated an array of samples for each tagged field.
A regular hash was a perfect fit,

samples[tag] = [ sample1, sample2, .. ]

*until* another requirement came up: to present the tags
in order of appearance.

Later,

Francis Cianfrocca · Aug 14, 2006

Here's my latest one[1]:

I had a string with a bunch of tagged "tolerance fields",
e.g., >>1+/-10% 'tag'<< (a mini DSL?). Based on some sort
of distribution function, e.g., uniform or Gaussian,
I generated an array of samples for each tagged field.
A regular hash was a perfect fit,

samples[tag] = [ sample1, sample2, .. ]

*until* another requirement came up: to present the tags
in order of appearance.

I've had to deal with this (interestingly, in something that was
*intended* as a DSL). Just add another field to the structure you
store in the hash and have it increment a global sequence number when
you create it. A little clunky but later you can say:

hash.values.sort {|a,b| a.seq <=> b.seq}.each {|v|
# whatever
}

Hal Fulton · Aug 15, 2006

Austin said:
Wouldn't it be the same as Hash lookup? It's still got every other
characteristic of a Hash; it just guarantees an order that defaults to
insertion order.

OHash lookup should be as fast as Hash lookup.

Insertion would be slightly slower, and iteration (and to_a) would
be slowed down more.

Hal

Hal Fulton · Aug 15, 2006

Bil said:
Here's my latest one[1]:

I had a string with a bunch of tagged "tolerance fields",
e.g., >>1+/-10% 'tag'<< (a mini DSL?). Based on some sort
of distribution function, e.g., uniform or Gaussian,
I generated an array of samples for each tagged field.
A regular hash was a perfect fit,

samples[tag] = [ sample1, sample2, .. ]

*until* another requirement came up: to present the tags
in order of appearance.

Later,
--
Bil
http://fun3d.larc.nasa.gov

[1] Part of a CEV (Crew Exploration Vehicle) task

See, guys? An ordered hash in Ruby would help support
human space exploration!

/me ponders and RCR and a Dyson sphere design

Hal

Hal Fulton · Aug 15, 2006

Christian said:
Fun history fact:

#524<5|>lilith:~/src$ grep Dict ruby-0.49/dict.c
...
C_Dict = rb_define_class("Dict", C_Object);
rb_name_class(C_Dict, rb_intern("Hash")); /* alias */
...

Now, that's funny.

At least that seems to show that the name Dict isn't outside
the realm of possibility.

Although it adds to the bad joke potential. "I Dict around
and made a Hash of everything. Array!!"

I'm trying to remember: What is wrong with the name Map?

Hal

William James · Aug 21, 2006

Phillip said:
I think you mean "wouldn't care"?

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

Click to expand...

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob's statistics.
Sometimes you just don't want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person's benefit would kill the speed and jack up memory usage.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

Something like this would probably sit better, and is still shorter
than the PHP equiv.:

# a = [['key 1','1'],['key 2', '2'],['key 3',3]].toAssoc

# a['key 1']
=> 1

# p a
=> [['key 1','1'],['key 2', '2'],['key 3',3]]

# a = ['key 1','1', 'key 2', '2', 'key 3',3].to_assoc
=> [["key 1", "1"], ["key 2", "2"], ["key 3", 3]]
# a.assoc('key 2')
=> ["key 2", "2"]
# a.set 'key 2', 99
=> ["key 2", 99]
# a
=> [["key 1", "1"], ["key 2", 99], ["key 3", 3]]

class Array
def to_assoc
f=nil ; partition{f=!f}.transpose
end
def set k, v
(pair = assoc(k)) ? self[ index(pair) ] = [k,v] : push( [k,v] )
end
def rm k
(pair = assoc( k )) && slice!( index( pair ) )
end
end

Boids, a use case	6	Mar 22, 2010
Case-insensitive hash?	2	Jan 6, 2006
Question about case statement and lambdas	6	Oct 30, 2008
Extending to an existing case menu	1	Nov 24, 2007
GET NEIL DEGRASSES TYSON, I ripped a hole with this one...	0	Nov 10, 2022
Get an arbitrary hash key, quickly.	43	Jan 24, 2008
Newbie: two cool discoveries	7	Nov 26, 2010
I need a case against the use of Pop ups for critical user functionality	3	Oct 26, 2004

A use case for an ordered hash

Chad Perrin

Hal Fulton

Trans

Phillip Hutchings

Hal Fulton

Logan Capaldo

Austin Ziegler

Hal Fulton

Martin DeMello

Austin Ziegler

Martin DeMello

gwtmp01

Austin Ziegler

gwtmp01

Bil Kleb

Francis Cianfrocca

Hal Fulton

Hal Fulton

Hal Fulton

William James

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads