Opinions on library interface design?

K

Kirk Haines

I have a question about an interface design issue for Kansas that I wanted
to get some general input on.

Kansas, if you are unaware, is an object-relational mapping library. Result
sets from queries are returned as hash-like objects. However, be object
knows where it came from, and updating the object causes the underlying
database data to be updated transparently. This lets one manipulate the
data without worrying about writing SQL.

i.e.

-----
discontinued_products = ksdbh.select:)Products) {|p|
p.idx.between(104,114)}

discontinued_products.each do |dp|
dp.name = "#{dp.name} (DISCONTINUED)"
dp.status = 'unavailable'
end
-----

This small example would select all products with an idx between 104 and 114
from the database. Then by iterating over the array and changing attributes
on the objects, we change the underlying database. Great. Okay, now for
the question.

There are times where one would like to use the query interface to select
data, but one does not want the result sets that are returned to be coupled
to the database because the data is going to be manipulated in some sort of
temporary way. Maybe we select an array of results and we're going to
iterate over them and display them on a web page, but we want to run the
array through some sort of filtering or modification first. The changes are
for display only and are not to be applied back to the original data in the
database.

I have two thoughts about this. First would be nice way of telling the
select() method to return it's data in a different format than the standard
format. Maybe via a seperate set of options to the library or via an
additional parameter passed to the method.

Second would be to have some method attached to the data objects returned by
Kansas that would have the object return its data in some other object
format. Maybe it could be as simple as having the object honor a to_h()
method or a to_a() method to return a representation of itself as a hash or
an array?

The first approach includes more magic. However, if I am doing a select on
10000 rows, and I want to retrieve them simply as hashes, it'd be nice to be
able to easily tell Kansas that for this query, return the result set as
hashes without having to return the result set as a db coupled object, then
iterate through all of them asking for hash representation of each.

So....what is a nice, elegant way to tell select() that I want the result
set back as hashes, or arrays (or arrayfields), or structs, or whatever,
instead of the default db coupled objects?

Thanks in advance for your ideas,

Kirk Haines
 
H

Hal Fulton

Kirk said:
There are times where one would like to use the query interface to select
data, but one does not want the result sets that are returned to be coupled
to the database because the data is going to be manipulated in some sort of
temporary way. Maybe we select an array of results and we're going to
iterate over them and display them on a web page, but we want to run the
array through some sort of filtering or modification first. The changes are
for display only and are not to be applied back to the original data in the
database.

I have two thoughts about this. First would be nice way of telling the
select() method to return it's data in a different format than the standard
format. Maybe via a seperate set of options to the library or via an
additional parameter passed to the method.

Second would be to have some method attached to the data objects returned by
Kansas that would have the object return its data in some other object
format. Maybe it could be as simple as having the object honor a to_h()
method or a to_a() method to return a representation of itself as a hash or
an array?

The first solution seems nicer to me (although the second one doesn't
seem totally exclusive of that). You might add on a parameter or even
use a different method name (select! comes to mind, but it's probably
not appropriate).

For one thing, I can't help thinking that it's more efficient to return
"uncoupled" data except when you need it coupled.

But I am not a db guy, so my opinion is far from definitive.


Hal
 
A

Ara.T.Howard

Kansas, if you are unaware, is an object-relational mapping library. Result
sets from queries are returned as hash-like objects. However, be object
knows where it came from, and updating the object causes the underlying
database data to be updated transparently. This lets one manipulate the
data without worrying about writing SQL.

i was unaware - sounds super cool - i'll check it out today.

So....what is a nice, elegant way to tell select() that I want the result
set back as hashes, or arrays (or arrayfields), or structs, or whatever,
instead of the default db coupled objects?

i like the idea of NOT using 'select' since, in essence these object have not
been 'selected' as our working set - we would only be looking at them. to me
that suggests a completely new method name, something like

def browse criteria, as = Hash
cb = block_given?
accum = []

query(criteria) do |intermediate|
res =
case as
when Hash
intermediate.to_h
when Array
intermediate.to_a
when Struct
intermediate.to_su
else
as.new intermediate
end

cb ? yield res : accum << res
end

cb ? self : accum
end

which provide some canned types to be returns and would provide user
defineable types to be returned (case - else).

thoughts...

-a
--
===============================================================================
| EMAIL :: Ara [dot] T [dot] Howard [at] noaa [dot] gov
| PHONE :: 303.497.6469
| A flower falls, even though we love it;
| and a weed grows, even though we do not love it.
| --Dogen
===============================================================================
 
K

Kirk Haines

On Thu, 1 Jul 2004 01:02:52 +0900, Ara.T.Howard wrote
i was unaware - sounds super cool - i'll check it out today.

I have a couple of super stupid bugs in the current release at rubyforge.
I'll release a slight update shortly that fixes them. You can find some
docs at:

http://enigo.com/projects/kansas

It can be downloaded at:

http://rubyforge.org/projects/kansas
i like the idea of NOT using 'select' since, in essence these object
have not been 'selected' as our working set - we would only be
looking at them. to me that suggests a completely new method name,
something like

def browse criteria, as = Hash
cb = block_given?
accum = []

query(criteria) do |intermediate|
res =
case as
when Hash
intermediate.to_h
when Array
intermediate.to_a
when Struct
intermediate.to_su
else
as.new intermediate
end

cb ? yield res : accum << res
end

cb ? self : accum
end

which provide some canned types to be returns and would provide user
defineable types to be returned (case - else).

That's a good idea. I like the method name, too. I'm going to give this a
whirl in some code I'm writing now.


Thanks,

Kirk Haines
 
S

Sean O'Dell

I have two thoughts about this. First would be nice way of telling the
select() method to return it's data in a different format than the standard
format. Maybe via a seperate set of options to the library or via an
additional parameter passed to the method.

Second would be to have some method attached to the data objects returned
by Kansas that would have the object return its data in some other object
format. Maybe it could be as simple as having the object honor a to_h()
method or a to_a() method to return a representation of itself as a hash or
an array?

I think, at every row iteration, you should have a "accept" or "discard"
method. Changes probably shouldn't occur automatically without a call to
"accept", and row iterations without a call to "accept" should automatically
enact a "discard" action. That way, only when someone changes data and then
calls "accept" does the underlying data get called.

Doing it that way means that iterating over thousands of rows to build a web
page goes fast; no special calls are made just to read data, and changes are
automatically discarded. Reads are usually more common than writes, so
speeding up the read is probably the best idea.

Transforming rows into hashes at each iteration would be a burden to reads,
which are more common than writes. I would leave your code alone, and
present the data the way you are now, but don't write data changes directly
to the DB. Cache the changes and write to the DB only when "accept" is
called. Discard changes when "discard" is called or when the row is iterated
again.

Sean O'Dell
 
K

Kirk Haines

On Thu, 1 Jul 2004 02:04:47 +0900, Sean O'Dell wrote
Transforming rows into hashes at each iteration would be a burden to
reads, which are more common than writes. I would leave your code
alone, and present the data the way you are now, but don't write
data changes directly to the DB. Cache the changes and write to the
DB only when "accept" is called. Discard changes when "discard" is
called or when the row is iterated again.

Hmmm. What you are describing, basically, is to not autocommit changes.
This is a configurable setting in Kansas. When autocommit is off, changes
to an object only get committed to the database if commit() is called. If
rollback() is called, the object is returned to the state it was in
originally.

That is a viable solution, but I think that I prefer the notion that when
you use Kansas to select() from the database, you are getting back data that
is bidirectionally mapped to the database whereas something like browse()
can be used to perform syntactically identical selects that, rather than
returning bidirectionally mapped data, return something that is more speed
and memory efficient for reading (such as plain old DBI::Row objects or
arrays or whatever), such as what Ara suggested.

This doesn't preclude using the transactional features to avoid accidental
writes to the database. That's actually one very good reason to make use of
the transaction features, IMHO.


Thanks,

Kirk Haines
 
S

Sean O'Dell

On Thu, 1 Jul 2004 02:04:47 +0900, Sean O'Dell wrote


Hmmm. What you are describing, basically, is to not autocommit changes.
This is a configurable setting in Kansas. When autocommit is off, changes
to an object only get committed to the database if commit() is called. If
rollback() is called, the object is returned to the state it was in
originally.

That is a viable solution, but I think that I prefer the notion that when
you use Kansas to select() from the database, you are getting back data
that is bidirectionally mapped to the database whereas something like
browse() can be used to perform syntactically identical selects that,
rather than returning bidirectionally mapped data, return something that is
more speed and memory efficient for reading (such as plain old DBI::Row
objects or arrays or whatever), such as what Ara suggested.

This doesn't preclude using the transactional features to avoid accidental
writes to the database. That's actually one very good reason to make use
of the transaction features, IMHO.

If you prefer a browse method which is analogous to select, except that it's
read-only, then I would make browse simply turn off autocommit and then call
select. That way you are always giving row data to the developers using the
same objects; they don't have to re-write any code to suddenly decide that
they need to make small changes to a row as they iterate through a table.
Your pattern here could be "select creates a return set and changes to fields
are written, browse creates the same return set but changes to fields are
ignored." You could also allow changes to the data set returned by browse to
be written back to the database with an explicit call.

If you want hash-like functionality for your rows for both select and browse,
do something like what Ara did with arrayfields. Write a special method that
accesses fields in a row by name, so you can say:
dataset.fields_by_name['id'] or dataset.fields[0] and if id is the first
field in the row, both return the same value. You could also allow setting
the field to a new value this way.

Sean O'Dell
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top