A use case for an ordered hash

H

Hal Fulton

There have been numerous occasions when I wanted an
ordered hash, but usually I can't remember to write
them down.

Here's just one.

Once I wanted a "dynamic case statement" of sorts.
I wanted to use procs as values in a hash. Something
like:

actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}

Then I could just iterate through the keys looking
for a match, then use that key to find the associated
proc and call it.

However, I quickly noticed that this is no good. The
order of iteration is unpredictable. So I couldn't
guarantee that (for example) /abcd/ would be tested
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.


Hal
 
M

M. Edward (Ed) Borasky

Hal said:
There have been numerous occasions when I wanted an
ordered hash, but usually I can't remember to write
them down.

Here's just one.

Once I wanted a "dynamic case statement" of sorts.
I wanted to use procs as values in a hash. Something
like:

actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}

Then I could just iterate through the keys looking
for a match, then use that key to find the associated
proc and call it.

However, I quickly noticed that this is no good. The
order of iteration is unpredictable. So I couldn't
guarantee that (for example) /abcd/ would be tested
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.


Hal

That sort of code is what Chuck Moore evolved into Forth. :) Seriously,
though, isn't there a way to do this with a single array? Store the
patterns followed by the lambdas in sequence. Find the first index that
matches the pattern and take the lambda at [index+1]? Or two parallel
arrays? I use two parallel arrays for this sort of thing a lot -- it's
easy for me to read, given my FORTRAN background. :)
 
H

Hal Fulton

M. Edward (Ed) Borasky said:
That sort of code is what Chuck Moore evolved into Forth. :) Seriously,
though, isn't there a way to do this with a single array? Store the
patterns followed by the lambdas in sequence. Find the first index that
matches the pattern and take the lambda at [index+1]? Or two parallel
arrays? I use two parallel arrays for this sort of thing a lot -- it's
easy for me to read, given my FORTRAN background. :)

Yes, I could use a single flat array... I could also program
in FORTRAN... ;)


Hal
 
F

Francis Cianfrocca

actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}


actions.keys.sort {|a,b| b.to_s <=> a.to_s}.each {|k|
if k =~ your_string
actions[k].call
break
end
}

A more-greedy regex sorts lexicographically behind a head-matching
less-greedy one.
 
H

Hal Fulton

Francis said:
actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}


actions.keys.sort {|a,b| b.to_s <=> a.to_s}.each {|k|
if k =~ your_string
actions[k].call
break
end
}

A more-greedy regex sorts lexicographically behind a head-matching
less-greedy one.

Clever workaround. Thanks for that.

There are all kinds of workarounds for situations like these.

I don't *need* an ordered hash. I don't even *need* a case
statement, as I could do the same thing with a bunch of ifs.
And so on.


Hal
 
R

Robert Klemme

Hal said:
There have been numerous occasions when I wanted an
ordered hash, but usually I can't remember to write
them down.

Here's just one.

Once I wanted a "dynamic case statement" of sorts.
I wanted to use procs as values in a hash. Something
like:

actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}

Then I could just iterate through the keys looking
for a match, then use that key to find the associated
proc and call it.

However, I quickly noticed that this is no good. The
order of iteration is unpredictable. So I couldn't
guarantee that (for example) /abcd/ would be tested
before /abc/, and so on.

So yeah, I ended up using an array of arrays. But it
just felt wrong.

To me a Hash feels wrong. Why? Because you don't make any use of hash
properties for fast lookup. You just iterate in plain order.

Kind regards

robert
 
M

Matt Todd

Absolutely... an Array is perfect for ordered information.

actions = [
{ /abcd/ => lambda { do_this } },
{ /xyz/ => lambda { do_that } },
{ /abc/ => lambda { other } }
]

Then, all you do is...

actions.select { |action| action[input] }

where input = /xyz/ you will receive the appropriate hash (key and
Proc) in an array. So, to do something with that, you could...

actions.select { |action| action[input] }.first[input].call

which calls the Proc associated with the matching element.

I'm not too thrilled with its convolutedness, and I'm sure there's a
better way, but I'm pretty sleepy, so this is the best I'm coming up
with right now. :) As always, you can write a method to hide the gory
details.

def exec_route input
actions.select { |action| action[input] }.first[input].call
end

and just call:

exec_route /xyz/

and blam! do_that is executed!

Hope this helps.

M.T.
 
M

Mauricio Fernandez

Absolutely... an Array is perfect for ordered information.

actions = [
{ /abcd/ => lambda { do_this } },
{ /xyz/ => lambda { do_that } },
{ /abc/ => lambda { other } }
]

Then, all you do is...

actions.select { |action| action[input] }

where input = /xyz/ you will receive the appropriate hash (key and
Proc) in an array. So, to do something with that, you could...

actions.select { |action| action[input] }.first[input].call

which calls the Proc associated with the matching element.

That's not what he wants though (the point is matching against the Regexp ---
Hash#[] doesn't buy you anything there).

Maybe

require 'enumerator'
def ASSOC(x); x.enum_for:)each_slice, 2).to_a end

actions = ASSOC [
/abcd/, lambda{ 1 },
/xyz/, lambda{ 2 },
/abc/, lambda{ 3 }
]

# Then he's got to

def run(command, actions)
actions.each{|re, action| return action[command] if re =~ command}
nil
end

run("abcdef", actions) # => 1
run("abcf", actions) # => 3


# But this would be nicer
class Array
def cassoc(x) # "case assoc"
each{|arr| return arr if arr[0] === x}
nil
end
end

re, action = actions.cassoc("abcdef")
action and action.call # => 1
re, action = actions.cassoc("abcx")
action and action.call # => 3
 
J

James Edward Gray II

To me a Hash feels wrong. Why? Because you don't make any use of
hash properties for fast lookup. You just iterate in plain order.

He *is* making use of a Hash property. He wants the keys to be
unique. None of the other solutions shown in this thread have
addressed that.

James Edward Gray II
 
M

Mauricio Fernandez

Hal said:
Once I wanted a "dynamic case statement" of sorts.
I wanted to use procs as values in a hash. Something
like:
actions = { /abcd/ => lambda { do_this },
/xyz/ => lambda { do_that },
/abc/ => lambda { other }}
[...]
To me a Hash feels wrong. Why? Because you don't make any use of
hash properties for fast lookup. You just iterate in plain order.

He *is* making use of a Hash property. He wants the keys to be
unique. None of the other solutions shown in this thread have
addressed that.

Hal is making the case for *literal* "ordered hashes". The keys will be
unique because he's writing them himself manually. I don't think he wants to
make use of this:

actions = { /a/ => 1, # <--
/b/ => 2, # \
/a/ => 4 # <--+-- why do this in the first place.
}
actions[/a/] # => 4

The way he uses the structure is reminiscent of Array#assoc. The only thing
a Hash has going for it in this case is the cleaner notation and the arrow.
But
actions = ASSOC [/abcd/, lambda{ ... },
/abc/, lambda{ ... } ]
(using parentheses if you want) looks good enough for me, and something like
the Array#cassoc I showed before would do fine in this case.
 
H

Hal Fulton

Robert said:
To me a Hash feels wrong. Why? Because you don't make any use of hash
properties for fast lookup. You just iterate in plain order.

I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Hal
 
H

Hal Fulton

Mauricio said:
# But this would be nicer
class Array
def cassoc(x) # "case assoc"
each{|arr| return arr if arr[0] === x}
nil
end
end

re, action = actions.cassoc("abcdef")
action and action.call # => 1
re, action = actions.cassoc("abcx")
action and action.call # => 3

Now, that's clever. That's probably the least painful
solution I've seen using arrays.

I still wish we had a so-called ordered hash. ;)


Hal
 
J

James Edward Gray II

# But this would be nicer
class Array
def cassoc(x) # "case assoc"
each{|arr| return arr if arr[0] === x}
nil
end
end

You could use find() there, right?

class Array
def cassoc(match)
find { |arr| arr.first === match }
end
end

James Edward Gray II
 
R

Robert Klemme

Hal said:
I never use a hash for fast lookup. I use it because it
allows me easily to associate an arbitrary key with a
value.

Um... So you're saying that you wouldn't bother if lookups were O(n)? Wow!

Kind regards

robert
 
H

Hal Fulton

Robert said:
Um... So you're saying that you wouldn't bother if lookups were O(n)?
Wow!

I think you mean "wouldn't care"?

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

If I were storing 20 million elements, I wouldn't use a Hash
either. I'd either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

I'm a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I'd have said it was faster than O(n), of course.)


Hal
 
R

Robert Klemme

Hal said:
I think you mean "wouldn't care"?

Probably. I'm not a native speaker so I'm certainly not authoritative
on this matter. :)
Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

That probably also depends on the frequency of lookups. :) But yes, my
hashes are typically larger (although I also use small ones).
If I were storing 20 million elements, I wouldn't use a Hash
either. I'd either use a database or flat file; or if I
needed it in RAM I might customize something (or likely use
the red-black tree implementation).

I'm a late-optimizer. Until you mentioned it now, I had never
thought of speed as a reason to use a hash. (Although, if you
had asked me, I'd have said it was faster than O(n), of course.)

Interesting.

GoOd (n)ight

robert

;-)
 
J

John Carter

There have been numerous occasions when I wanted an
ordered hash, but usually I can't remember to write
them down.

I'm in the habit of automagically writing...

hash.keys.sort

or sometimes

hash.keys.sort_by{|k| hash[k]}.each{|k| value=hash[k]; ... }

It's just such a common idiom in my code I think I should push it into
my little collection of standard extensions to standard objects.

Why is it so common? Nobody wants a script that does very things in a
very different order depending on very small changes.

* You just spend too long explaining to people why this happens.
* Makes sporadic bugs even harder to find (it works for me, but customer is whinging)
* The results tend not to be comparable (can't diff) or repeatable.

John Carter Phone : (64)(3) 358 6639
Tait Electronics Fax : (64)(3) 359 4632
PO Box 1645 Christchurch Email : (e-mail address removed)
New Zealand

Carter's Clarification of Murphy's Law.

"Things only ever go right so that they may go more spectacularly wrong later."

From this principle, all of life and physics may be deduced.
 
P

Phillip Hutchings

I think you mean "wouldn't care"?

Anyway: I'm happy that lookups are fast. But 99% of the time,
my hashes are small enough that the time savings is probably
very small in terms of program execution time.

Unfortunately for you some of us do things that involve searching
through 30,000 sales records to pull out Joe Bob's statistics.
Sometimes you just don't want to do that on the production database,
it just uses far too many resources. Using an ordered hash for one
person's benefit would kill the speed and jack up memory usage.

I know an ordered associative array (which I think is a better
description of the functionality) would make some things very easy,
but replacing the Hash is not the way to go.

Something like this would probably sit better, and is still shorter
than the PHP equiv.:

# a = [['key 1','1'],['key 2', '2'],['key 3',3]].toAssoc

# a['key 1']
=> 1

# p a
=> [['key 1','1'],['key 2', '2'],['key 3',3]]
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top