Manipulating an Array of Structs

Anthony Wright

I have data that I've extracted from a database, that I wish to pass to
a calling (external) piece of software through Drb.

There are a large number of data set formats, but an employee-based
example might look something like:

ID Name Age
1 Mary Smith 23
3 Frank Zappa 52
19 Mary Jones 41

I'm trying to find a "nice" structure to pass this data around in, so
that it can be easily understood and manipulated.




I'm currently using an Array of Hashes so it looks something like:

[ { :id => 1, :name => "Mary Smith", :age => 23 }, { :id => 3, :name =>
"Frank Zappa", :age => 52 }, { :id =>19, :name => "Mary Jones", :age =>
41 } ]

or more nicely formatted

[
  { :id => 1,  :name => "Mary Smith",  :age => 23 },
  { :id => 3,  :name => "Frank Zappa", :age => 52 },
  { :id => 19, :name => "Mary Jones",  :age => 41 }
]



This is fine if I want to extract single pieces of information, but it's
clumsy to manipulate the whole array: for example, to extract all the
names, sort the array by age, or create a new array containing employees
over 40. It's also inefficient, as I'm repeating the labels on every record.

I realise I can do this by writing block code to process the array, but
I feel there must be a cleaner way to store and manipulate data of this
form in Ruby. Something along the lines of Struct would be great to remove
the excess labels, but Struct doesn't help with the array manipulation.


Suggestions would be really welcome, preferably using standard
classes, but I will look at extensions.

thanks,

Anthony Wright.
 
Matthew K. Williams

Anthony said:
I realise I can do this by writing block code to process the array, but
I feel there must be a cleaner way to store and manipulate data of this
form in Ruby. Something along the lines of Struct would be great to remove
the excess labels, but Struct doesn't help with the array manipulation.

Take a gander at Dr Nic's map_by_method gem; it has this functionality,
including sort_by, group_by, and index_by.

http://drnicwilliams.com/category/ruby/map_by_method/
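
For comparison, much of this is also available with standard classes alone. The sketch below is not the gem's API. It assumes Ruby 1.8.7 or later (where Symbol#to_proc and group_by are built in) and uses a Struct called Employee, a name picked purely for illustration, together with the ordinary Enumerable methods.

```ruby
# Hypothetical Employee struct; the field names follow the sample table.
Employee = Struct.new(:id, :name, :age)

employees = [
  Employee.new(1,  "Mary Smith",  23),
  Employee.new(3,  "Frank Zappa", 52),
  Employee.new(19, "Mary Jones",  41)
]

# &:name turns the symbol into a block that calls #name on each element.
names   = employees.map(&:name)
by_age  = employees.sort_by(&:age)
over_40 = employees.select { |e| e.age > 40 }

# group_by returns a Hash keyed by whatever the block returns.
by_first_name = employees.group_by { |e| e.name.split.first }

p names
p by_age.map(&:name)
p over_40.map(&:name)
p by_first_name.keys
```

So a Struct removes the repeated labels from the source code, and an Array of Structs can still be mapped, sorted and filtered like any other Array.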
 
Brian Candler

Anthony said:
This is fine if I want to extract single pieces of information, but it's
clumsy to manipulate the whole array, for example, extract out all the
names,

arr.map { |e| e[:name] }

sort the array by age,

arr.sort_by { |e| e[:age] }

create a new array containing employees
over 40.

arr.select { |e| e[:age] > 40 }
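
Put together against the sample data, the three calls can be run as a small script. This is only a sketch: arr is assumed to hold the Array of Hashes from the question, and the local variable names are made up for the example.

```ruby
# arr is assumed to be the Array of Hashes from the original post.
arr = [
  { :id => 1,  :name => "Mary Smith",  :age => 23 },
  { :id => 3,  :name => "Frank Zappa", :age => 52 },
  { :id => 19, :name => "Mary Jones",  :age => 41 }
]

names   = arr.map     { |e| e[:name] }      # every name, in original order
by_age  = arr.sort_by { |e| e[:age] }       # new array, youngest first
over_40 = arr.select  { |e| e[:age] > 40 }  # only the matching records

p names
p by_age.map { |e| e[:id] }
p over_40.map { |e| e[:name] }
```

None of these calls modifies arr; each returns a new Array.
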
It's also inefficient as I'm repeating the labels on every
record.

Beware of premature optimisation... do what's simplest or cleanest, and
only when it's too slow, profile it to work out where the improvement is
needed.

However I think you'll find it's pretty efficient as it is. You are
using symbols, so there is only a single object in the system for each
of :name, :age etc. It is the object ID (4 bytes on a 32-bit system)
which is the hash key.
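
A quick way to check this for yourself, as a rough sketch: every mention of a symbol refers to the same object, whereas two Strings with the same contents are separate objects.

```ruby
# The same symbol literal always refers to one object.
same_symbol = :name.equal?(:name)

# Two Strings with equal contents are still two distinct objects.
a = String.new("name")
b = String.new("name")
same_string = a.equal?(b)

# So the :name key is one shared object across all the record hashes.
records = [{ :name => "Mary Smith" }, { :name => "Frank Zappa" }]
shared_key = records[0].keys.first.equal?(records[1].keys.first)

puts same_symbol   # true
puts same_string   # false
puts shared_key    # true
```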

Having said that, it *will* be unpacked to the sequence of characters by
DRb.

irb(main):002:0> Marshal.dump(:name)
=> "\004\b:\tname"

Using a Hash is slightly more efficient from this point of view than a
Struct.

irb(main):006:0> Emp = Struct.new(:id,:name,:age)
=> Emp
irb(main):007:0> frank = Emp.new(3, "Frank Zappa", 52)
=> #<struct Emp id=3, name="Frank Zappa", age=52>
irb(main):008:0> Marshal.dump(frank)
=> "\004\bS:\bEmp\b:\aidi\b:\tname\"\020Frank Zappa:\bagei9"
irb(main):009:0> frank2 = {:id=>3, :name=>"Frank Zappa", :age=>52}
=> {:name=>"Frank Zappa", :age=>52, :id=>3}
irb(main):010:0> Marshal.dump(frank2)
=> "\004\b{\b:\tname\"\020Frank Zappa:\bagei9:\aidi\b"

You'll squidge it down a bit by using an Array for each employee instead
of a Hash, but it's no longer a '"nice" structure' that 'can be easily
understood and manipulated.'
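
To put rough numbers on this, the sketch below dumps the same record as a Struct, a Hash and an Array and compares the sizes. Exact byte counts differ between Ruby versions (newer ones also record each String's encoding), so only the ordering is compared. It also counts how often the label text "name" turns up when several records are dumped in one go, since Marshal writes a symbol out in full once and refers back to it afterwards.

```ruby
# Emp mirrors the Struct from the irb session above.
Emp = Struct.new(:id, :name, :age)

as_struct = Emp.new(3, "Frank Zappa", 52)
as_hash   = { :id => 3, :name => "Frank Zappa", :age => 52 }
as_array  = [3, "Frank Zappa", 52]

# Marshal.dump returns a binary String; bytesize is its length in bytes.
struct_bytes = Marshal.dump(as_struct).bytesize
hash_bytes   = Marshal.dump(as_hash).bytesize
array_bytes  = Marshal.dump(as_array).bytesize

puts "Struct: #{struct_bytes}  Hash: #{hash_bytes}  Array: #{array_bytes}"

# Several records dumped together: count occurrences of the label text.
records = [
  { :id => 1,  :name => "Mary Smith",  :age => 23 },
  { :id => 3,  :name => "Frank Zappa", :age => 52 },
  { :id => 19, :name => "Mary Jones",  :age => 41 }
]
label_count = Marshal.dump(records).scan("name").size
puts "\"name\" appears #{label_count} time(s) in the dump"
```

The Struct dump is the largest because it also carries the class name, and the Array is the smallest because it carries no labels at all.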
 
