Object/Relational Mapping is the Vietnam of Computer Science

  • Thread starter Demetrius Gallitzin

Austin Ziegler

If you want a "real programming language" version of SQL, just use
PL/SQL with Oracle. Ew.

Which is a better language than most people think. What's interesting
is that it isn't a version of SQL, but a version of Ada (or Modula 2?)
with SQL cursors as a native data type and built-in recognition of
existing database data types and SQL statements. It's closer to Pro*C
(C/C++ with embedded SQL) than a programming language version of SQL.

It's still saddled with the limitations of SQL.

-austin
 

Phrogz

As you said, the idea isn't original, but as far as I know, the
formulation is unique to me and this particular conversation. I'm
somewhat flattered to have the suggestion, but something like "the
Rule of Data" is probably better ;)

Given that "Data is King", I particularly like the (non-sexual) double-
entendre of "Rule of Data". :)
 

Chad Perrin

As you said, the idea isn't original, but as far as I know, the
formulation is unique to me and this particular conversation. I'm
somewhat flattered to have the suggestion, but something like "the
Rule of Data" is probably better ;)

I'm inclined, then, to go with Ziegler's Rule of Data, or something
along those lines. It's a good'un.
 

Chad Perrin

Which is a better language than most people think. What's interesting
is that it isn't a version of SQL, but a version of Ada (or Modula 2?)
with SQL cursors as a native data type and built-in recognition of
existing database data types and SQL statements. It's closer to Pro*C
(C/C++ with embedded SQL) than a programming language version of SQL.

It's still saddled with the limitations of SQL.

That's really the major problem I have with it -- the limitations of
SQL, thanks to including SQL.

Another way of looking at it is that it's just SQL with Ada-inspired
sugar. While I haven't until now run across the description of it from
the other direction (that it's Ada with embedded SQL), I still think
that calling it SQL with Ada-inspired sugar better encompasses my
distaste for it.
 

Austin Ziegler

Data is a dead fish. Applications are knowing how to fish. There are
minor edge cases where data is more important ("feed me now or I die
of starvation"), but in the grand scheme of things data is
insignificant compared to the applications that produce and transform
it.

Thank ghu I don't have to do business with you, because I wouldn't trust
your programs to work with my most important assets. I assure you that
my data is far more important than the applications which do something
with the data. The applications increase value, but they NEVER provide
value. It's the data.

* What's the most valuable thing that Amazon has? It isn't the programs;
those are constantly updated and occasionally replaced. It's the
customer DATA that they've amassed.
* What's the biggest worry intelligent people have about Google? It
isn't the programs, it's the amount of DATA that Google contains about
people.
* What have a number of commercial firms found themselves in trouble for
in the last eighteen months? Losing personal DATA about their
customers.
* What do hackers and phishers want from you? Your personal DATA. They
don't really give a damn about your programs.

Scientists are worried about losing data from older sources, not the
programs. Data, once available, can be squished and manipulated and
dealt with in many dozens of different ways -- and often MUST be.

I can work with my pictures in iPhoto or LightRoom with no problems. The
pictures are more important than which program I use to edit them. I can
play my MP3s with any of a dozen different programs; the songs are more
important than which program I use.

I reiterate: Data is king. Applications are pawns. You can squawk all
kinds of ways to next Tuesday that this isn't true or that it's "minor
edge cases", but in reality it's just the opposite -- and ALWAYS WILL
BE. The application is more important than the data in the most rare of
cases. This is where a lot of OO-heads screw up. They think that the
application is far more important than the data. This is never true. The
application is, for the most part, a footnote to the data. Businesses
don't care *that* much when they lose an application. They care
*significantly* when they lose data.
We use relational databases as object stores because they're cheap and
easily available, not because they're good for the task.

No, that's why we use SQL databases. The reason that we don't use object
databases is that they're not cheap, they're not easily available, and
they're disastrous for your DATA because they completely lock you into a
single view of that data. Which matters a LOT more than any pissant
little program ever will.

Please. Try a little harder next time before you try analogies that
don't hold up to even the barest of comparisons.

-austin
 

ara.t.howard

Data is a dead fish. Applications are knowing how to fish. There are minor
edge cases where data is more important ("feed me now or I die of
starvation"), but in the grand scheme of things data is insignificant
compared to the applications that produce and transform it.

We use relational databases as object stores because they're cheap and
easily available, not because they're good for the task.

here at the national geophysical data center

http://ngdc.noaa.gov/

we say that data is useless, only the combination of applications and human
reasoning can turn it into __information__. so, with that in mind, i'd say
that data and applications are useless and that it's only by combining the two
using logic (aka business rules) that anything meaningful arises.

case in point: we've 260tb of 'data' sitting in our mass storage device.
less than 0.01% ever comes back out. that small percentage is massaged into
meaningful __information__ via complex application and human logic though and
it's those kernels we're interested in.

2 cts.

-a
 

Austin Ziegler

Really? What's better?

I've debunked Mr Moore's base premise in a separate post, but as far as
simply storing data -- not ensuring transactional integrity or any
number of other things that database management systems provide you --
absolutely nothing beats flat files on the filesystem where the
filesystem provides your indexing and can perform amazingly quickly as
long as you're working with fixed data.

The problem, of course, is that filesystems are hierarchical in nature
and if your data -- or at least your indexing scheme(s) -- can't be
represented hierarchically, you're toast.
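The fixed-data, filesystem-as-index approach can be sketched in a few lines of Ruby. Everything here -- the class name, the key layout, the methods -- is invented for illustration; the point is only that splitting a fixed-width key into a directory path lets the filesystem's own directory lookup serve as the index:

```ruby
require "fileutils"
require "tmpdir"

# Toy flat-file store: each key is split into a directory path, so the
# filesystem's directory lookup does the indexing for us.
class FlatFileStore
  def initialize(root)
    @root = root
  end

  # "20070101" => <root>/20/07/01/01
  def path_for(key)
    File.join(@root, *key.scan(/../))
  end

  def put(key, value)
    path = path_for(key)
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, value)
  end

  def get(key)
    path = path_for(key)
    File.exist?(path) ? File.read(path) : nil
  end
end

store = FlatFileStore.new(Dir.mktmpdir)
store.put("20070101", "sensor reading A")
store.get("20070101")   # the filesystem walks 20/07/01/01 directly
```

This stays fast exactly as long as the keys are fixed and the access pattern follows the one hierarchy; a second access pattern means building a second tree of links or copies, which is the "you're toast" case above.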

-austin
 

Austin Ziegler

here at the national geophysical data ceter

http://ngdc.noaa.gov/

we say that data is useless, only the combination of applications and human
reasoning can turn it into __information__. so, with that in mind, i'd say
that data and applications are useless and that it's only by combining the two
using logic (aka business rules) that anything meaningful arises.

I have to disagree with you, Ara. If you start with a set of data and
your business rules, you can reformulate the applications to derive
*value* from the set of data. On the other hand, if you have a set of
applications that implement your business rules and no data ... you
can't derive value at all.

Without data, you absolutely cannot do anything. If customers have to,
they can buy or write new programs to work with their data. They can
almost *never* recover lost data.

Case in point: Alaska Revenue just had to spend $200,000 in overtime
to rescan paper data that had been lost from their online system and
the backup was unreadable.
case in point: we've 260tb of 'data' sitting in our mass storage device.
less than 0.01% ever comes back out. that small percentage is massaged into
meaningful __information__ via complex application and human logic though and
it's those kernels we're interested in.

Right, but if you needed to, you could easily (fsvo easily) rewrite
the complex application; it would be far harder to try to recreate the
data -- especially the historical data you have from satellite
imagery. It's not as if you can rewind the clock seven days to get a
satellite image you lost a week ago.

You are right that programs help you derive *value* from the data, but
programs are far more easily replaced than data.

-austin
 

Clifford Heath

Chad said:
Technically, it's a "query language". Why come up with more names for
it?

It's not a name, but a description. "describing" language is
not the standard term - Olivier means SQL is a "declarative"
language. But even that's only true of the standard, the actual
implementations have procedural features as well.
 

ara.t.howard

I have to disagree with you, Ara. If you start with a set of data and your
business rules, you can reformulate the applications to derive *value* from
the set of data. On the other hand, if you have a set of applications that
implement your business rules and no data ... you can't derive value at all.

Without data, you absolutely cannot do anything. If customers have to, they
can buy or write new programs to work with their data. They can almost
*never* recover lost data.

Case in point: Alaska Revenue just had to spend $200,000 in overtime to
rescan paper data that had been lost from their online system and the backup
was unreadable.

don't get me wrong - i understand your point. still, it's not quite so clear
cut imho though. for instance, we store both raw and derived satellite
products in our mass store. people tend to consider the raw in just those
terms you are describing - the foundation of it all. however, as the
developer that manages the system which manages that data i can say that there
are literally dozens of small but critical pieces of software which touch
the data before it hits disk. and this doesn't even take into account the
fact that the data has been stored and replayed from a crappy magnetic tape
which then relayed the stuff to a downlink and then bounced a few hops around
the world to get to us. in reality the 'raw' data is only as good as the
weakest link in all those applications and hardware bits.

that might seem far fetched, but my experience is that this sort of thinking -
that data is something hard and real - is pervasive in science and nearly
always wrong. not long ago i got a bunch of data dumped on my lap: hundreds of
cds of ionosonde data from stations all around the world. in theory the data
should all carry a unique signature and all the code this group used made this
assumption. of course they were wrong: i wrote a script to scour the data
looking for 'impossible' contradictions. the results? thousands of dups and
logical contradictions that they didn't even know about.
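the kind of scan described above is only a few lines of ruby. the record fields and signature format here are made up for this sketch, not the actual ionosonde data:

```ruby
# Group records by the signature that is assumed to be unique, and keep
# only signatures that more than one record claims -- those are the
# "impossible" contradictions.
def find_signature_collisions(records)
  records.group_by { |r| r[:signature] }
         .select   { |_sig, group| group.size > 1 }
end

records = [
  { signature: "BC840-2001-144", station: "Boulder" },
  { signature: "WP937-2001-144", station: "Wallops" },
  { signature: "BC840-2001-144", station: "Bogota" }  # shouldn't exist
]

find_signature_collisions(records).each do |sig, group|
  warn "#{sig}: claimed by #{group.size} records"
end
```

the cheap part is the script; the expensive part is that nobody ran one before the assumption got baked into all the downstream code.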

i could tell 20 more stories like this. people think the word 'data' is holy
and that it's somehow different from the data collection software and hardware
which collected it. maybe this is obvious to most people, but it's worth
stating for posterity that the 'data' is often 'crap' because not enough
attention was paid to the software and hardware which collected and verified
it and, in that sense, this whole commentary is a bit circular.

nevertheless i do agree that 'data' is more important when a person is using
the normal divisions we use when thinking about it. it's just that those
divisions can be artificial sometimes without people being aware of it.
Right, but if you needed to, you could easily (fsvo easily) rewrite the
complex application; it would be far harder to try to recreate the data --
especially the historical data you have from satellite imagery. It's not as
if you can rewind the clock seven days to get a satellite image you lost a
week ago.

this is true of course.
You are right that programs help you derive *value* from the data, but
programs are far more easily replaced than data.

it depends on the collection method - but that is nearly always true too.

cheers.

-a
 

Jimmy George

Please. Try a little harder next time before you try analogies that
don't hold up to even the barest of comparisons.

Yes, 100% agree with Austin..
Without DATA there is no need for Applications!.. DATA are 'facts' and they're
sensitive and mostly very hard to recreate (if at all), unlike Apps.

Let me focus back into scripting now... This is clear enough to anyone.. ;-)

Regards,
Jimmy
 

Clifford Heath

Chad said:
That's really the major problem I have with it -- the limitations of
SQL, thanks to including SQL.

The "limitations" of SQL stem from it being a true child
of the 1970's, but also from the relational model it adopts,
and from the requirements of transactional processing. This
last one is the most intractable and the least-commonly
understood. It's easy to pile criticisms on SQL, but it's
funny how the people who do it almost *never* seem to have
a deep understanding of the amazingly complex field of
transactional processing.

Object databases are even more fraught with problems arising
from the model they espouse than relational ones. I could go
on for a week about this, but suffice it to say that generally,
by trying to force persistence into an object model, they've
lost the plot regarding transactional behaviour, despite some
well-meaning attempts and even partial successes.

The right answer is more subtle and simpler than either, and
it's fact-oriented databases. I'll be having more to say about
that as my ActiveFacts project progresses.

Clifford Heath.
 

Ryan Davis

Like Pascal, I have little patience for people who speak out of
ignorance, especially when they say stupid things like "I think
relational databases are evil."

And what about those of us who don't speak out of ignorance and
STILL don't like relational DBs??? Or would you just assume we're
ignorant too???

And I thought I wouldn't touch this topic with a 10 foot pole... I
generally won't touch a thread that is one of your hot topics because
it just isn't worth it (see your comment about Pascal above). You
entered this thread as abusively as you could, pretty much on par
with all your other hot topic threads. I think you do a lot of good
work, but this regrettably makes pretty much most of it unapproachable.
 

Rick DeNatale

That's only slightly disingenuous. I'd put it this way:

"Never call a general resource 'fact-checking', regardless of the
specific example of a general resource."

If you want to check facts, you need to go after the specific,
stand-alone, specialized resources. Always check with primary sources,
or be aware you could be wrong. It doesn't matter whether it's Google,
Wikipedia, Britannica, the OED for etymology, or Ask Chad -- general
resources are not authoritative primary sources on specifics (generally
speaking, har har).

Sorry. I just tend to get my back up a little when someone singles out
a specific general resource, as though the problem isn't endemic to
general resources in general, almost tautologically.

I'm in total agreement. But it does seem like the Wikipedia is a
favorite whipping-boy these days.

A week or so ago a columnist in the local paper*, wrote a piece about
his experience with the accuracy of wikipedia. It seems that he
anonymously inserted** a bunch of imaginative junk in the article
about himself in wikipedia***.

He went on an on about how long such stuff can live on in WP, but the
upshot of the column was that someone discovered his fanciful 'spam'
and deleted it.

All in all the self-policing of the wikipedia seems to work a lot
better than its press would have one believe, and there's some
evidence that the wikipedia is just as, if not more, reliable on
average than more 'respectable' and often more dated sources, like the
Britannica which has been on an intermittent vendetta against it.

Not to belittle the point that it's certainly not a primary source and
shouldn't be considered as such, any more than the other references
cited.

* I can't recall his name, and I don't know if he's syndicated or how widely.

** He wasn't clear if the article in question even existed before he got there.

*** Which of course violated the wikipedia policy of not posting
directly about yourself.
 

Clifford Heath

Gary said:
I'm curious as to why query language development got hung up on SQL.
I've read a little bit about Tutorial D. Is SQL simply
another example of pre-mature standardization?

There's a different kind of standardization?
What would a Ruby interface to the underlying database engine (indexed
tables) look like? Could it get closer to Tutorial D by bypassing the
standard technique of 'marshaling' requests into SQL statements? Is
the impedance mismatch between Ruby (or any other OO language) and
Codd's relational algebra too great to cross smoothly?

Not with a fact-based model. ConQuer, though it's not
yet been realized commercially, is the absolute bees
knees - a raw beginner can compose queries that would
make a seasoned DBA quake. See www.orm.net for more...
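As a thought experiment on Gary's question, here's a rough Ruby sketch of what a no-marshaling interface might look like: relations as plain enumerables of hashes, with restriction, projection, and natural join as ordinary methods instead of generated SQL strings. Every name here is invented; this is not ActiveFacts, ConQuer, or any shipping library:

```ruby
# Relations as enumerables of hashes; the relational algebra as method
# calls rather than SQL text. A sketch only.
class Relation
  include Enumerable

  def initialize(tuples)
    @tuples = tuples
  end

  def each(&block)
    @tuples.each(&block)
  end

  # restriction (SQL WHERE): keep tuples matching the predicate
  def restrict(&pred)
    Relation.new(select(&pred))
  end

  # projection (SQL SELECT a, b): keep only the named attributes
  def project(*attrs)
    Relation.new(map { |t| t.slice(*attrs) }.uniq)
  end

  # natural join on whatever attribute names the two relations share
  def join(other)
    Relation.new(flat_map { |t|
      other.select { |u| (t.keys & u.keys).all? { |k| t[k] == u[k] } }
           .map    { |u| t.merge(u) }
    })
  end
end

people = Relation.new([{ name: "Ann", dept: 1 }, { name: "Bob", dept: 2 }])
depts  = Relation.new([{ dept: 1, dname: "Science" }, { dept: 2, dname: "Ops" }])
people.join(depts).restrict { |t| t[:dname] == "Science" }.project(:name).to_a
```

Whether that crosses the impedance mismatch smoothly is another question, but it does show the algebra itself maps onto method chaining without any string building.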
 

ara.t.howard

Yes, 100% agree with Austin.. Without DATA there is no need for
Applications!.. DATA are 'facts' and they're sensitive and mostly very hard to
recreate (if at all) unlike Apps.

can you give an example of some data which were not processed by at least one
piece of software, hardware, or meatware (people) before they were stored?

it's quite dangerous to assume those links are error free because it loses
sight of the fact that data are merely representations of facts, not facts
themselves.

-a
 

Chad Perrin

All in all the self-policing of the wikipedia seems to work a lot
better than its press would have one believe, and there's some
evidence that the wikipedia is just as, if not more, reliable on
average than more 'respectable' and often more dated sources, like the
Britannica which has been on an intermittent vendetta against it.

Not to belittle the point that it's certainly not a primary source and
shouldn't be considered as such, any more than the other references
cited.

I agree with that assessment 100% -- and not just because I was the
Wikimedia Foundation's first-ever paid employee.

I guess maybe I should have mentioned that disclaimer earlier.
 

Chad Perrin

There's a different kind of standardization?

There are at least four types of standardization:

1. premature standardization
2. post-obsolescence standardization
3. theoretically optimal standardization, which may or may not be real
4. Microsoft standardization, which is anti-standardization with a bow
 

Sam Smoot

Ok. I can stand the SQL love-in no longer. :)

Anyone who's actually used db4o:

* Knows it's a perfectly viable solution for the majority of
applications out today, as they don't approach its limits performance-wise
or storage-wise
* Knows it's a simpler database to develop for than generating reams
of mapping files or accepting the limitations of a system like
ActiveRecord
* Knows the *data* is safe because the database is open-source,
exports *very* easily, and no one is about to timebomb the frameworks
* Knows that for many common scenarios the performance will wipe the
floor with many popular RDBMS's

Oh, and "toy" comments are tired. Most developers would probably still
call Ruby a "toy" language. That doesn't mean they know something you
don't. More than likely they're just uninformed and biased. I'd hope
we could do better.
 
