[ANN] Lafcadio 0.7.0, 0.6.1: Excessively Clever Query Caching

Francis Hwang

Hi everybody,

I've just released the newest dev release of Lafcadio, 0.7.0, and the
bugfix release 0.6.1 for the stable branch.

== What's Lafcadio? ==
An object-relational mapping library for use with MySQL. It supports a
lot of advanced features, including in-Ruby field value checking,
extensive aid in mapping to legacy databases, and an advanced query
engine that lets you form queries in Ruby that can be run against either
the live database or an in-memory mock store for testing purposes.

Lafcadio is more than a year old and is currently in use on production
websites, most notably http://rhizome.org/, an online community that
has a 6-year-old legacy database and gets more than 3 million hits a
month.

== What's new in 0.7.0? ==
Excessively Clever Query Caching goes like this: Every time you run a
select against the DB, Lafcadio caches the results in memory. Then, if
you later run a second select whose results are a subset of the first,
Lafcadio detects it, figures out what it's a subset of, filters the
cached results in memory, and returns them to you. This all happens
transparently.

What does this mean? It means a significantly faster app, because if
you run these three queries:

select * from users where lname = 'Smith'
select * from users where lname = 'Smith' and fname like '%john%'
select * from users where lname = 'Smith' and email like '%hotmail%'

Lafcadio will only ask MySQL for the results for the first select
statement, and do the rest for you without using the DB connection.
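
To make the idea concrete, here is a minimal Ruby sketch of subset-based
caching. It is not Lafcadio's actual implementation or API: the
CachingStore, Equals, and Like names are hypothetical, an in-memory array
stands in for MySQL, and a query is assumed to be a plain AND of
conditions.

```ruby
require 'set'

# Hypothetical condition types for illustration; not Lafcadio classes.
Equals = Struct.new(:field, :value) do
  def match?(row)
    row[field] == value
  end
end

Like = Struct.new(:field, :substring) do
  # Roughly emulates "field LIKE '%substring%'", ignoring case.
  def match?(row)
    row[field].to_s.downcase.include?(substring.downcase)
  end
end

class CachingStore
  attr_reader :db_hits

  def initialize(rows)
    @rows = rows    # stands in for the real database table
    @cache = {}     # Set of conditions => rows that query returned
    @db_hits = 0    # how many times we had to go to the "database"
  end

  def select(*conditions)
    wanted = conditions.to_set
    # A cached query covers this one if every condition it used is also
    # one of ours; our result is then a subset of its result.
    cached_conds, cached_rows = @cache.find { |conds, _| conds.subset?(wanted) }
    if cached_conds
      extra = wanted - cached_conds
      return cached_rows.select { |row| extra.all? { |c| c.match?(row) } }
    end
    @db_hits += 1
    result = @rows.select { |row| wanted.all? { |c| c.match?(row) } }
    @cache[wanted] = result
    result
  end
end

users = [
  { lname: 'Smith', fname: 'John', email: 'john@hotmail.com' },
  { lname: 'Smith', fname: 'Jane', email: 'jane@example.com' },
  { lname: 'Jones', fname: 'John', email: 'jj@hotmail.com' }
]
store   = CachingStore.new(users)
smiths  = store.select(Equals.new(:lname, 'Smith'))
johns   = store.select(Equals.new(:lname, 'Smith'), Like.new(:fname, 'john'))
hotmail = store.select(Equals.new(:lname, 'Smith'), Like.new(:email, 'hotmail'))
puts store.db_hits
```

Only the first select touches the stand-in database; the other two are
answered by filtering the cached Smith rows in memory.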

Francis Hwang
http://fhwang.net/
 
Shashank Date

Hey Francis,

--- Francis Hwang said:
> I've just released the newest dev release of Lafcadio, 0.7.0, and the
> bugfix release 0.6.1 for the stable branch.

Link?

http://rubyforge.org/projects/lafcadio


> == What's Lafcadio? ==
>
> == What's new in 0.7.0? ==
> Excessively Clever Query Caching goes like this:

<snip>

Awesome! This is of great interest to me.

Have you thought about parallel query dispatch over
horizontally partitioned data? I have done something
like this for MS SQL 2000. Interested?


-- shanko




 
Francis Hwang

<snip>

> Awesome! This is of great interest to me.
>
> Have you thought about parallel query dispatch over
> horizontally partitioned data? I have done something
> like this for MS SQL 2000. Interested?

Quite. But seeing as I'm pretty unschooled in DB theory in general,
I've never heard of "parallel query dispatch". Care to explain, or
offer a link?

Also, if you're interested in seeing this feature ported over to a DB
you use (such as MS SQL 2000), I'm open to extending Lafcadio to work
with any other DB as long as I've got people actively testing it on
those DBs. (I always use MySQL, hence Lafcadio's MySQL focus up 'til
now.)

Francis Hwang
http://fhwang.net/
 
Shashank Date

> Quite. But seeing as I'm pretty unschooled in DB theory in general,
> I've never heard of "parallel query dispatch".

Well, of course! My bad: I am using our internal
terminology while talking to the outside world ;-)

The correct term is "Federated Databases". And even
that term is context dependent. Google it in the
context of SQL Server 2K and you will get what I mean.

> Care to explain, or offer a link?

http://www.sql-server-performance.com/federated_databases.asp

> Also, if you're interested in seeing this feature ported over to a DB
> you use (such as MS SQL 2000) I'm open to extending Lafcadio to work
> with any other DB as long as I've got people actively testing them on
> other DBs.

I can surely help testing. Especially if it involves
running test cases in the background. I won't be able
to devote too much time in the foreground though.

> (I always use MySQL, hence Lafcadio's MySQL focus up 'til now.)

No problem. Let me know how I can get started.

-- shanko



 
Francis Hwang


Intriguing stuff. Once you've set this up in MS SQL 2k, what
requirements are there for a client to manage them? I mean, besides
what the database takes care of for you automatically.

And by the way, if you're working with federated databases, how big are
these tables you're dealing with? I'm just wondering how much bigger
the tables at my work can get before I need to look into something like
this.

> I can surely help testing. Especially if it involves
> running test cases in the background. I won't be able
> to devote too much time in the foreground though.

Well, I'll put "port to MS SQL" on my to-do list and let you know when
a new beta release has MS SQL support ... then I just need a steady
supply of specific bug reports to chase down, after that.

Francis Hwang
http://fhwang.net/
 
Shashank Date

Hi Francis,

> Intriguing stuff. Once you've set this up in MS SQL 2k, what
> requirements are there for a client to manage them? I mean, besides
> what the database takes care of for you automatically.

Umm ... maintaining the indexes comes to mind. I don't
know the details since we never actually used it as it
comes out of the box. We found out that the queries
were not being executed in parallel. Hence we wrote
our own version (in Ruby of course) and called it
"parallel query dispatcher" :)

> And by the way, if you're working with federated databases, how big are
> these tables you're dealing with? I'm just wondering how much bigger
> the tables at my work can get before I need to look into something like
> this.

It is not only the size that matters (in this case
;-)) but the nature of the application. Our data is
collected at various data centers and then
coalesced at the central server, so it comes naturally
partitioned. Further, our queries rarely (almost
never) cross partitions. This is a very important
aspect which lends itself to federation.

Add to that the fact that our combined database is
about 100GB and tables are typically over 5 million
rows. So when we did not have the budget to scale up,
we decided to scale out and were reasonably
successful. We were in production for almost a year on
four 3-server clusters, throwing hundreds of queries
at them every day. We did dynamic load balancing and were
working on query caching (like the one you have
provided in Lafcadio) when the project got the
attention of higher-ups and a more generous budget to
scale up ... which almost always is a better
alternative.
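
For readers wondering what dispatching a query in parallel over
partitioned data might look like, here is a rough Ruby sketch. It is an
illustration only, not the dispatcher described above: the PARTITIONS
data, the dispatch method, and its only: parameter are all hypothetical,
and in-memory arrays stand in for the per-data-center databases.

```ruby
# Hypothetical partitions: each key stands in for one data center's
# database, each array for the rows stored there.
PARTITIONS = {
  'east'  => [{ id: 1, site: 'east', reading: 10 },
              { id: 2, site: 'east', reading: 25 }],
  'west'  => [{ id: 3, site: 'west', reading: 30 }],
  'south' => [{ id: 4, site: 'south', reading: 5 }]
}

# Runs the same filter against each selected partition, one thread per
# partition, then merges the per-partition results in partition order.
# In real use each thread would run a query over its own DB connection.
def dispatch(partitions, only: partitions.keys, &filter)
  threads = partitions.slice(*only).map do |_name, rows|
    Thread.new { rows.select(&filter) }
  end
  # Thread#value waits for the thread and returns its block's result.
  threads.flat_map(&:value)
end

# Fan out across every partition.
high = dispatch(PARTITIONS) { |row| row[:reading] > 20 }

# Queries that stay within one partition only need that partition.
west_only = dispatch(PARTITIONS, only: ['west']) { |row| row[:reading] > 20 }
```

Since queries in a setup like this rarely cross partitions, the only:
argument models sending a query to just the partition that holds the
data, while the default fans out to all of them.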

> Well, I'll put "port to MS SQL" on my to-do list and let you know when
> a new beta release has MS SQL support ... then I just need a steady
> supply of specific bug reports to chase down, after that.

Great! Let me know ...

-- shanko



 
Francis Hwang

> Hi Francis,
>
> Umm ... maintaining the indexes comes to mind. I don't
> know the details since we never actually used it as it
> comes out of the box. We found out that the queries
> were not being executed in parallel. Hence we wrote
> our own version (in Ruby of course) and called it
> "parallel query dispatcher" :)

So are you saying that the data came naturally partitioned, and you
left it partitioned, and then used Ruby to analyze queries and dispatch
them to the right database transparently? I suppose I could use a
concrete example to help me grok this.

Francis Hwang
http://fhwang.net/
 
