On Wednesday 1 August 2007, Jean-Sébastien wrote:
Hi,
I'm trying to count distinct ids from an array (array_of_ids =
[1,2,3,1,2]), and want a result array of hashes looking like this:
result = [{:id=>1,:count=>2},{:id=>2,:count=>2},{:id=>3,:count=>1}]
I tried this, but it always returns errors:
result = array_of_ids.inject(Array.new){ |a,x| elt = a.find{|h| h[:id] == x} || {:id => x, :count => 0}; elt[:count] += 1; elt}
I took my information from this post:
http://groups.google.com/group/comp.lang.ruby/browse_thread/thread/3122212f6db8d7f5/04a7b10da8195cec?lnk=gst&q=array++count&rnum=8#04a7b10da8195cec
If someone could help.
regards
There are two problems in your code:
1- in the inject block, you return elt, which is (or should be, if the code
worked) the hash containing the id which is being processed. You should
return a, that is the array which contains the hashes. Correcting this
should give a piece of code which executes without errors, but which
returns an empty array.
2- you never insert the hashes you create inside the inject block into the
array a: you only store them in the local variable elt, which gets
destroyed after each iteration. The inject block should be:
result = array_of_ids.inject(Array.new){ |a,x|
  elt = ( a.find{|h| h[:id] == x} || a[a.size] = {:id => x, :count => 0} )
  elt[:count] += 1
  a
}
As you can see, after the || operator a new hash is created and inserted at
the end of a (corresponding to the index a.size). Since an assignment always
returns the value being assigned (this is why I didn't use <<, which returns
the array, not the inserted element), elt is then set to the new hash. Of
course, all this happens only if find returns nil.
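To make the difference between the two ways of appending concrete, here is a
small sketch. The method names append_by_index and append_by_shovel are made
up for this illustration and are not part of the code above; they only wrap
the two expressions so that their return values can be compared.

```ruby
# Hypothetical helper: append using index assignment.
# An assignment expression evaluates to the assigned value,
# so this method returns item (the hash), not the array.
def append_by_index(array, item)
  array[array.size] = item
end

# Hypothetical helper: append using <<.
# Array#<< returns the receiver, so this method returns the array.
def append_by_shovel(array, item)
  array << item
end

a = []
first  = { :id => 1, :count => 0 }
second = { :id => 2, :count => 0 }

puts append_by_index(a, first).equal?(first)  # the very same hash object
puts append_by_shovel(a, second).equal?(a)    # the array itself
```

This is why using << on the right-hand side of || would leave elt pointing at
the whole array instead of at the newly created hash.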
If you can rely on the ids being positive integers, and don't care whether
the resulting array contains the hashes in the same order in which the ids
appear in the array, here's another approach you can consider:
result = array_of_ids.inject([]) do |res, i|
  res[i-1] ||= {:id => i, :count => 0}
  res[i-1][:count] += 1
  res
end
This code stores the data relative to the id i in the i-1 position in the
array (the -1 is there to avoid a nil element at the beginning). This should
make it faster, since you don't need to iterate over the whole array to check
whether the data corresponding to an id is already there or not: either it is
in the position id - 1 or it isn't there. A quick benchmark, done by creating
an array of ids of 100_000 elements, with values randomly chosen between 1
and 11, gives:
                           user     system      total        real
original approach      5.730000   1.490000   7.220000 (  7.310224)
alternative approach   1.250000   0.150000   1.400000 (  1.416871)
Changing the range of the ids from 11 to 101 gives:
                           user     system      total        real
alternative approach   1.270000   0.190000   1.460000 (  1.472353)
original approach     37.730000  11.360000  49.090000 ( 51.056527)
Increasing it to 1001 gives:
                           user     system      total        real
alternative approach   1.500000   0.220000   1.720000 (  1.733568)
The original approach takes much more time (I didn't have the patience to
wait for it to complete).
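For anyone who wants to repeat the comparison, a benchmark along these lines
could be used. This is only a sketch, not the exact script behind the numbers
above: the method names count_with_find, count_with_index and random_ids are
invented here, and the way the random ids are generated is an assumption.

```ruby
require 'benchmark'

# Corrected version of the original approach: look up the hash for
# each id with find, appending a new hash when none exists yet.
def count_with_find(ids)
  ids.inject([]) do |a, x|
    elt = a.find { |h| h[:id] == x } || (a[a.size] = { :id => x, :count => 0 })
    elt[:count] += 1
    a
  end
end

# Alternative approach: the hash for id i is kept at index i - 1,
# so no search is needed. Assumes ids are positive integers.
def count_with_index(ids)
  ids.inject([]) do |res, i|
    res[i - 1] ||= { :id => i, :count => 0 }
    res[i - 1][:count] += 1
    res
  end
end

# Assumed setup: size ids drawn at random from 1..max_id.
def random_ids(size, max_id)
  Array.new(size) { rand(max_id) + 1 }
end

ids = random_ids(100_000, 11)
Benchmark.bm(22) do |bm|
  bm.report('original approach')    { count_with_find(ids) }
  bm.report('alternative approach') { count_with_index(ids) }
end
```

Note that with the index-based version, any id that never occurs leaves a nil
at its position (for example, ids [1, 3] leave a nil at index 1). If that
matters, calling compact on the result removes those entries.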
I hope this helps
Stefano