Help optimizing

L

Luke Ivers

I'm going to cross-post this from the Rails group, because some of the
people here are Ruby ninjas and don't read that forum, and I'd like help
getting this function optimized... its results are useful in a Rails
perspective, but it's functionality has nothing to do with Rails at all.

class Hash
def to_params(parent = '')
ret = ''
self.keys.each do |key|
if self[key].is_a? Hash
if parent == ''
ret += self[key].to_uri(key.to_s)
else
ret += self[key].to_uri(parent + "[#{key.to_s}]")
end
else
if parent == ''
ret += "#{key}=#{self[key]}&"
else
ret += "#{parent}[#{key}]=#{self[key]}&"
end
end
end
return ret.chomp('&')
end
end

Anybody got any optimizations(either quicker speed, or less text and
comparable speed) for that one?

Thanks.
 
L

Luke Ivers

ret += self[key].to_uri(key.to_s)
else
ret += self[key].to_uri(parent + "[#{key.to_s}]")

s/to_uri/to_params

Originally wrote it with a different function name.
 
P

Phrogz

I'm going to cross-post this from the Rails group, because some of the
people here are Ruby ninjas and don't read that forum, and I'd like help
getting this function optimized... its results are useful in a Rails
perspective, but it's functionality has nothing to do with Rails at all.

class Hash
def to_params(parent = '')
ret = ''
self.keys.each do |key|
if self[key].is_a? Hash
if parent == ''
ret += self[key].to_uri(key.to_s)
else
ret += self[key].to_uri(parent + "[#{key.to_s}]")
end
else
if parent == ''
ret += "#{key}=#{self[key]}&"
else
ret += "#{parent}[#{key}]=#{self[key]}&"
end
end
end
return ret.chomp('&')
end
end

I realize the code is there, but I'm having trouble figuring out what
it's used for. Could you describe what 'parent' is, and what it means
to have nested hashes?

One guess I have at improving the speed it to change:
self.keys.each{ |key|
# repeated function calls needed for self[key]
}
to
self.each_pair{ |key,value|
# direct references to value
}
 
L

Luke Ivers

Gavin said:
I realize the code is there, but I'm having trouble figuring out what
it's used for. Could you describe what 'parent' is, and what it means
to have nested hashes?
For {:test => 'testing'} the return should be test=testing
For {:test => {:test => 'testing'}} it should be test[test]=testing
Any nested hashes past that point should just continue to be added on as
an array reference:
{:test => {:test => {:test => 'testing'}}}
=> test[test][test] = testing
One guess I have at improving the speed it to change:
self.keys.each{ |key|
# repeated function calls needed for self[key]
}
to
self.each_pair{ |key,value|
# direct references to value
}

That one helps.
 
P

Phrogz

For {:test => 'testing'} the return should be test=testing
For {:test => {:test => 'testing'}} it should be test[test]=testing
Any nested hashes past that point should just continue to be added on as
an array reference:
{:test => {:test => {:test => 'testing'}}}
=> test[test][test] = testing

So, the parent attribute is really just for the recursive calls?

Is the output of the code correct for these cases:

puts( { :foo=> 'bar', :jim=>'jam time' }.to_params )
#=> jim=jam time&foo=bar

puts( { :foo => { :bar=>1, :jim=>'jam' } }.to_params )
#=> foo[jim]=jam&foo[bar]=1

puts( { :foo =>
{ :bar=>{ :jim=>'jam', :jar=>'jib' }, :jim=>'jam' } }.to_params )
#=> foo[jim]=jam&foo[bar][jim]=jam&foo[bar][jar]=jib

(Note the unencoded space in the first example.)
 
L

Luke Ivers

I wrote another function that basically does the exact same thing, only
tries to do it in as short a space as possible: here's what I got (and
also benchmark times for the two of them)

I also tried splitting the original function into two seperate functions
so it didn't have to do as many comparisons.

require 'benchmark'

class Hash
#to avoid spamming more than necessary, see original email in thread
for definition of to_params

def to_params2(parent = '')
self.keys.inject('') do |k, v|
(self[v].is_a? Hash) ?
(parent == '' ? k += self[v].to_params2(v.to_s) : k +=
self[v].to_params2(parent + "[#{v.to_s}]")) :
(parent == '' ? k += "#{v}=#{self[v]}&" : k +=
"#{parent}[#{v}]=#{self[v]}&")
end
end

def to_params3()
ret = ''
self.each_pair do |key, value|
if value.is_a? Hash
ret += value.to_params3_with_parent(key.to_s)
else
ret += "#{key}=#{value}&"
end
end
return ret.chomp('&')
end

def to_params3_with_parent(parent)
ret = ''
self.each_pair do |key, value|
if value.is_a? Hash
ret += value.to_params3_with_parent(parent + "[#{key.to_s}]")
else
ret += "#{parent}[#{key}]=#{value}&"
end
end
return ret.chomp('&')
end

end

n = 100000
h = {:user => {:subuser => {:name => 'test'}, :name => 'test2'}, :name
=> 'test3'}

Benchmark.bm do |x|
x.report { n.times do; h.to_params; end }
x.report { n.times do; h.to_params2; end }
x.report { n.times do; h.to_params3; end }
end

Here's the results:

user system total real
5.132000 0.000000 5.132000 ( 5.174000) #original
5.803000 0.000000 5.803000 ( 5.844000) #with inject
4.868000 0.000000 4.868000 ( 4.880000) #split into two functions

Can anyone do better than the last one?
 
L

Luke Ivers

Is the output of the code correct for these cases:

Yes.

Noting the unencoded space: not a big deal. I can encode the whole
string later, or throw in an encode in the parameterization process, but
that's not relevant to just optimizing the base functionality.

:)
 
R

Robert Klemme

I'm going to cross-post this from the Rails group, because some of the
people here are Ruby ninjas and don't read that forum, and I'd like help
getting this function optimized... its results are useful in a Rails
perspective, but it's functionality has nothing to do with Rails at all.

class Hash
def to_params(parent = '')
ret = ''
self.keys.each do |key|
if self[key].is_a? Hash
if parent == ''
ret += self[key].to_uri(key.to_s)
else
ret += self[key].to_uri(parent + "[#{key.to_s}]")
end
else
if parent == ''
ret += "#{key}=#{self[key]}&"
else
ret += "#{parent}[#{key}]=#{self[key]}&"
end
end
end
return ret.chomp('&')
end
end

Anybody got any optimizations(either quicker speed, or less text and
comparable speed) for that one?

Use << instead of += in all places.

replace the line

self.keys.each do |key|

with

each do |key, value|

Replace "self[key]" with "value" then.

Maybe change the major if then else with case when end in order to more
easily adjust to special treatment of other types than Hashes.

If you need more efficiency improvements, extract the "if parent=''"
from the loop, make it a top level decision and have two iterations (if
and else branch).

And, make the string / stream to append to a parameter. That way you
don't need to create potentially large strings during recursion before
you append them but you can directly append - you basically just have one.

Typing left as an exercise for the reader. :)

Kind regards

robert
 
R

Robert Klemme

And, make the string / stream to append to a parameter. That way you
don't need to create potentially large strings during recursion before
you append them but you can directly append - you basically just have one.

PS: Forgot to mention that the last one might be one of the improvements
that bring most benefits together with using <<. The pattern is

def meth(out = '')
...
...
# recursion
another.meth(out)
...
out
end

Have fun!

robert
 
L

Luke Ivers

And, make the string / stream to append to a parameter. That way you
don't need to create potentially large strings during recursion before
you append them but you can directly append - you basically just have
one.
This one was great: using two functions as I did in one of the earlier
emails, however, still provides a significant speed bump over a
top-level if-else decision on parent==''.

I changed the two-function thing to pass the returned string as a param,
and re-wrote the original function using exactly your suggestions,
giving these benchmarks:

robert 4.414000 0.000000 4.414000 ( 4.461000)
two-fun 4.088000 0.000000 4.088000 ( 4.097000)
 
L

Luke Ivers

Luke said:
This one was great: using two functions as I did in one of the earlier
emails, however, still provides a significant speed bump over a
top-level if-else decision on parent==''.

I changed the two-function thing to pass the returned string as a param,
and re-wrote the original function using exactly your suggestions,
giving these benchmarks:

robert 4.414000 0.000000 4.414000 ( 4.461000)
two-fun 4.088000 0.000000 4.088000 ( 4.097000)


Oh, and I used str.concat: should I have used something else?
 
S

Simon Kröger

Luke said:
Here's the results:

user system total real
5.132000 0.000000 5.132000 ( 5.174000) #original
5.803000 0.000000 5.803000 ( 5.844000) #with inject
4.868000 0.000000 4.868000 ( 4.880000) #split into two functions

Can anyone do better than the last one?

That wasn't fair at all (you know i couldn't resist, do you?) :)

--------------------------------------------------------------------------
def to_params4()
result = ''
stack = []

each do |key, value|
Hash === value ? stack << [key, value] : result << "#{key}=#{value}&"
end

stack.each do |parent, hash|
hash.each do |key, value|
if Hash === value
stack << ["#{parent}[#{key}]", value]
else
result << "#{parent}[#{key}]=#{value}&"
end
end
end
result.chop
end
--------------------------------------------------------------------------

as you can see i unrolled the recursion, the benefit isn't that
high but it was fun to code.

(and yes, i know i shouldn't modify an array i'm iterating over,
but hey it seems to work (appending seems to be fine))

cheers

Simon
 
R

Robert Klemme

Oh, and I used str.concat: should I have used something else?

I'd use << - in that case you can also use StringIO and IO objects (i.e.
files). I don't think it makes a performance difference.

Kind regards

robert
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,534
Members
45,007
Latest member
obedient dusk

Latest Threads

Top