Good Fowler article on ORM

Lew

Arved said:
http://martinfowler.com/bliki/OrmHate.html

Given some of the threads we've had...

Bookmarked.

This article succinctly explains points I've struggled to make over the years.
It gives an excellent description of the "impedance mismatch" between the
object and relational world views.

Now if all programmers who deal with object-to-relational mapping would
only read it.

Thanks for bringing it to our attention.
 
Roedy Green

http://martinfowler.com/bliki/OrmHate.html

Given some of the threads we've had...

Object/Relational mapping = ORM

We are very close to the point for many problems when you can store
your entire database in RAM or Flash RAM. It would seem to me, that
should spawn a new set of tools for managing data that don't have to
worry about fine tuning access times.

I also think there should be a database that deals with Java objects,
that lets you iterate using ordinary Java for-each loops. If the
whole thing were designed with a Java API in mind, surely it should be
able to do more than SQL with much less futzing.
--
Roedy Green Canadian Mind Products
http://mindprod.com
Programmers love to create simplified replacements for HTML.
They forget that the simplest language is the one you
already know. They also forget that their simple little
markup language will bit by bit become even more convoluted
and complicated than HTML because of the unplanned way it grows.
 
Arne Vajhøj

Object/Relational mapping = ORM

Well - everybody knows that.

We are very close to the point for many problems when you can store
your entire database in RAM or Flash RAM. It would seem to me, that
should spawn a new set of tools for managing data that don't have to
worry about fine tuning access times.

Persistence to disk is not just done because RAM is more expensive
than disk. People like to have their data after a power failure.

Flash RAM is disk, not memory.

I also think there should be a database that deals with Java objects,

It is called an OODBMS.

And if you had read the article you would know how they did not succeed.

Arne
 
Jan Burse

Roedy said:
Object/Relational mapping = ORM

We are very close to the point for many problems when you can store
your entire database in RAM or Flash RAM. It would seem to me, that
should spawn a new set of tools for managing data that don't have to
worry about fine tuning access times.

I also think there should be a database that deals with Java objects,
that lets you iterate using ordinary Java for-each loops. If the
whole thing were designed with a Java API in mind, surely it should be
able to do more than SQL with much less futzing.

You have the Collection classes in Java. You can
more or less map an SQL query onto them if you have
your objects in memory.

But what this doesn't buy you is indexes, or rather the
automatic indexing that databases do nowadays. It is
very tedious to maintain indexes manually, and doing so
doesn't make your domain model easily extensible.
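
To illustrate the point about manual indexes, here is a minimal sketch. The Person class, the city field and the PersonStore name are made up for illustration and are not from any library. It shows a query answered by a full scan over a collection, and the same query answered from a hand-maintained HashMap index that every insert has to keep up to date.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PersonStore {
    // Hypothetical domain object, used only for illustration.
    static final class Person {
        final String name;
        final String city;
        Person(String name, String city) { this.name = name; this.city = city; }
    }

    private final List<Person> rows = new ArrayList<>();
    // Hand-maintained index: city -> all rows with that city.
    private final Map<String, List<Person>> byCity = new HashMap<>();

    void add(Person p) {
        rows.add(p);
        // Every manual index must be updated on every insert.
        byCity.computeIfAbsent(p.city, k -> new ArrayList<>()).add(p);
    }

    // Roughly "SELECT * FROM person WHERE city = ?" as a full scan.
    List<Person> scanByCity(String city) {
        List<Person> result = new ArrayList<>();
        for (Person p : rows) {
            if (p.city.equals(city)) {
                result.add(p);
            }
        }
        return result;
    }

    // The same query answered from the index, without touching other rows.
    List<Person> lookupByCity(String city) {
        return byCity.getOrDefault(city, Collections.emptyList());
    }
}
```

Adding a second searchable field means adding another map and another line in add(), which is the tedium and the extensibility problem described above.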

I have worked on a solution for an interpreted language
and came up with an automatic indexing scheme.
Initially it only sped up the access. Recently
I have also worked out relatively quick updates on the
data (*). The tricky part is not throwing away the
indexes too quickly. And providing logical cursor-like
access is also challenging.

I also went for a custom implementation of some of
the Collection classes, to have the algorithms not
use iterators, but inline loops for speed. This isn't
possible with the existing classes since one cannot
access the fields. Also the custom classes automatically
shrink the indexes, which the normal Collection classes
don't do.

Overall implementation size of the indexer:
- 9 Classes

I think indexes, or what has often been called access
paths, in one way or the other are always key to
databases. The problem carries over to memory
based solutions. But one can do with a library
of only a few classes for memory.

Bye

(*)
https://plus.google.com/u/0/b/103259555581227445618/103259555581227445618/posts/FtcxQBCudjU
 
Jan Burse

Jan said:
Overall implementation size of the indexer:
- 9 Classes

Maybe there are other libraries around, especially
public ones. I haven't researched this yet, since I
wasn't sure about the specs until recently.

Bye
 
David Lamb

I also went for a custom implementation of some of
the Collection classes, to have the algorithms not
use iterators, but inline loops for speed. Isn't
possible with the existing classes since one cannot
access the fields.

Isn't that the sort of optimization a JIT compiler is supposed to be
able to do?
 
Jan Burse

David said:
Isn't that the sort of optimization a JIT compiler
is supposed to be able to do?

That would have to be a very, very good JIT compiler, since the
issue is not simply inlining setters/getters.

The issue is that there are no setters/getters in the first place.
For example, the table field of a HashMap is private.

And then an iterator implies creating a new stateful object.
Take for example the following trivial iteration without
an iterator over a HashMap:

for (int i = 0; i < table.length; i++) {
    Entry e = table[i];      /* head of the chain in bucket i */
    while (e != null) {
        /* do something */
        e = e.next;          /* advance along the chain */
    }
}

When doing the above iteration with an iterator, the iterator
must keep i and e as a state, typically on the heap. Without
an iterator i and e can be registers.

Maybe some JITs are able to eliminate the heap allocation,
but there is a further issue: the above loop does not do
modification checks, which is valid in my application scenario.

So this is the second reason to inline the loops manually
for speed: less functionality is needed.

Of course one loses the encapsulation of iterators. So it only
works for a particular implementation of a HashMap, and
polymorphism is no longer supported.
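
To make the trade-off concrete, here is a small self-contained sketch. It is not the actual indexer code; IntBuckets and its Entry class are invented for illustration, since the real HashMap table field is not accessible. It contains the same traversal twice: once as an inline loop where i and e are plain locals, and once behind the Iterator interface where i and e become fields of an iterator object.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

final class IntBuckets implements Iterable<Integer> {
    // Hypothetical chained-bucket entry, similar in spirit to a hash map entry.
    static final class Entry {
        final int value;
        Entry next;
        Entry(int value, Entry next) { this.value = value; this.next = next; }
    }

    final Entry[] table = new Entry[16];

    void add(int value) {
        int slot = (value & 0x7fffffff) % table.length;
        table[slot] = new Entry(value, table[slot]); // prepend to the chain
    }

    // Inline traversal: no iterator object, no modification checks.
    long sumInline() {
        long sum = 0;
        for (int i = 0; i < table.length; i++) {
            Entry e = table[i];
            while (e != null) {
                sum += e.value;
                e = e.next;
            }
        }
        return sum;
    }

    // The same traversal through the iterator below.
    long sumWithIterator() {
        long sum = 0;
        for (int v : this) {
            sum += v;
        }
        return sum;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int i = 0;               // next bucket to inspect
            private Entry e = advance(null); // next entry to return

            private Entry advance(Entry from) {
                Entry n = (from == null) ? null : from.next;
                while (n == null && i < table.length) {
                    n = table[i++];
                }
                return n;
            }

            @Override public boolean hasNext() { return e != null; }

            @Override public Integer next() {
                if (e == null) throw new NoSuchElementException();
                int v = e.value;
                e = advance(e);
                return v;
            }
        };
    }
}
```

Whether the iterator version is measurably slower depends on the JIT in use; the sketch only shows where the extra state lives, it does not demonstrate a speed difference.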

Bye
 
David Lamb

This would be a very very good JIT compiler. Since the
issues is not simply inlining setters/getters.

I confess to a high degree of ignorance about what current JIT compilers
are capable of, but I'd be surprised if inlining is all they can accomplish.

The issue is that there are at first hand no setter/getters.
For example the table field of a HashMap is private.

Hmm. Seems to me private/public status can't matter that much if a JIT
can inline setter/getters, since those typically access private data also.

And then an iterator implies creating a new stateful object. ....
When doing the above iteration with an iterator, the iterator
must keep i and e as a state, typically on the heap. Without
an iterator i and e can be registers.

Many moons ago I was peripherally involved in a project that was
producing highly optimizing compilers for conventional programming
languages. I seem to recall some discussion of being able to eliminate
some heap allocations via dependency analysis, where one could sometimes
detect that the heap object lifetime didn't extend beyond the invocation
of the procedure that allocated it. Is this not possible with Java/JIT?
 
Jan Burse

David said:
Many moons ago I was peripherally involved in a project that was
producing highly optimizing compilers for conventional programming
languages. I seem to recall some discussion of being able to eliminate
some heap allocations via dependency analysis, where one could sometimes
detect that the heap object lifetime didn't extend beyond the invocation
of the procedure that allocated it. Is this not possible with Java/JIT?

Yes some JITs can do the required escape analysis to some
extent. But when talking about JITs there is always a weak
JIT and a strong JIT, since there are different providers on
the market.
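
As a rough illustration of what escape analysis looks at (EscapeSketch and its Point class are hypothetical names, and whether a given JIT actually removes the allocation is not guaranteed): in the first method the object never leaves the method, in the second it is stored where other code can reach it.

```java
final class EscapeSketch {
    // Hypothetical small value object.
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    // A reference stored here is reachable from outside the allocating method.
    static Point last;

    // p is neither stored nor returned, so it does not escape this method.
    // A JIT with escape analysis may keep x and y in registers instead of
    // allocating on the heap; a weaker JIT will simply allocate.
    static int manhattan(int x, int y) {
        Point p = new Point(x, y);
        return Math.abs(p.x) + Math.abs(p.y);
    }

    // p is assigned to a static field, so it escapes and must live on the heap.
    static int manhattanAndRemember(int x, int y) {
        Point p = new Point(x, y);
        last = p;
        return Math.abs(p.x) + Math.abs(p.y);
    }
}
```

An iterator created and consumed inside one loop is in the same position as p in the first method, which is why a strong JIT can sometimes make it free.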

For example, I am developing the same code base for later
use in both Swing and Android. And the Dalvik JIT for
Android is lagging a little bit behind; you can even read
recommendations to not use setters/getters (sic!) if possible
in code written for Dalvik.

So I am helping the JIT and I am helping the application.
The code fragment under discussion is heavily used internally
in the API, since the indexing is dynamic. It is still possible
for the dynamic multi-indexing API, the package which has 9 classes,
to provide a proper Iterator interface to the outside, for
use by the application.

But if you know that your deployment range will be only
top-notch JITs, you might take on the pain of adding additional
classes to the package for the iterator implementations. This
would blow up the package to 11 classes, counting the
.class files.

Bye
 
Jan Burse

Jan said:
But if you know that your deployment range will be only
top-notch JITs you might go into the pain of adding additional
class to the package for the iterator implementations. This
would blow up the package to 11 classes, counting the
.class files.

But I doubt this is necessary, since these classes will not
be seen by the client. The client only sees:

- Give me all tuples that match a given pattern

Inside the API this is then translated into:

- Oh, the client wants tuples for a given pattern, let's
  first find a suitable index.
  - Alternative 1:
    - Oh, this part of the tuple already has an index,
      let's look up this part of the tuple.
    - Pick the set found by the lookup.
  - Alternative 2:
    - Oh, this part of the tuple could profit from an index,
      but there is none yet, let's build an index.
    - Look up this part of the tuple in the new index.
    - Pick the set found by the lookup.
- Oh, we have a set now, let's check whether the
  set is already suitable.
  - Alternative 1:
    - The set is already small enough or there is
      no further potential sub-index.
    - Return the set.
  - Alternative 2:
    - The set is still large and there is a potential
      sub-index.
    - Continue the use case from the start inside the set.

So the involved data structure is something along:

Index = ArrayList<HashMap<IndexAndTupples>>

But this is not visible to the client. The client
will only see:

Tupples = ArrayList<Tupple>
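
The use case above can be sketched roughly as follows. This is a simplified illustration, not the 9-class package under discussion: LazyIndexedTuples and its methods are made-up names, tuples are plain String arrays of equal length, null in a pattern means "any value", and only one index level is built on demand (the recursive sub-index step is replaced by a plain filter).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class LazyIndexedTuples {
    private final List<String[]> tuples = new ArrayList<>();
    // Column number -> (column value -> tuples with that value). Built on first use.
    private final Map<Integer, Map<String, List<String[]>>> indexes = new HashMap<>();

    void add(String... tuple) {
        tuples.add(tuple);
        // Update the indexes that already exist instead of throwing them away.
        for (Map.Entry<Integer, Map<String, List<String[]>>> ix : indexes.entrySet()) {
            ix.getValue()
              .computeIfAbsent(tuple[ix.getKey()], k -> new ArrayList<>())
              .add(tuple);
        }
    }

    // Returns all tuples matching the pattern; pattern[i] == null is a wildcard.
    List<String[]> match(String... pattern) {
        List<String[]> candidates = tuples;
        for (int col = 0; col < pattern.length; col++) {
            if (pattern[col] != null) {
                // Use the index on the first bound column, building it if needed.
                candidates = indexFor(col)
                        .getOrDefault(pattern[col], Collections.emptyList());
                break;
            }
        }
        // Simplification: filter the remaining columns linearly.
        List<String[]> result = new ArrayList<>();
        for (String[] t : candidates) {
            if (matches(t, pattern)) {
                result.add(t);
            }
        }
        return result;
    }

    private Map<String, List<String[]>> indexFor(int col) {
        Map<String, List<String[]>> ix = indexes.get(col);
        if (ix == null) { // "there is none yet, let's build an index"
            ix = new HashMap<>();
            for (String[] t : tuples) {
                ix.computeIfAbsent(t[col], k -> new ArrayList<>()).add(t);
            }
            indexes.put(col, ix);
        }
        return ix;
    }

    private static boolean matches(String[] t, String[] pattern) {
        for (int i = 0; i < pattern.length; i++) {
            if (pattern[i] != null && !pattern[i].equals(t[i])) {
                return false;
            }
        }
        return true;
    }

    // Number of indexes built so far; exposed only to show the lazy behaviour.
    int indexCount() {
        return indexes.size();
    }
}
```

The client only ever calls match() and gets a list of tuples back; which indexes exist is an internal detail that changes as queries arrive.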
 
markspace

For example I am developing the same code base for later
use in both Swing and Android. And the Dalvik JIT for
Android is lagging a little bit behind, you can even read
recommendations to not use setters/getters (sic!) if possible


Ah, good catch. I'd think that most JVMs intended for large servers
would easily do these sorts of optimizations, but I'd forgotten about the
embedded market.

Just curious: I've never done any Android development. How does one
profile code for that environment? Do you have a Dalvik JVM that runs
on Windows/Unix (i.e. an emulator of some sort)? Is there a good
profiling tool that can attach to certain Android devices?
 
Jan Burse

markspace said:
Just curious: I've never done any Android development. How does one
profile code for that environment? Do you have a Dalvik JVM that runs
on Windows/Unix (i.e. an emulator of some sort)? Is there a good
profiling tool that can attach to certain Android devices?

I guess there are some tools around. Android has its own
way of instrumentation for tracing.
http://developer.android.com/guide/developing/debugging/debugging-tracing.html
Usually you can connect either to an emulator or to a
device connected via USB.

But I have not used it yet. I was just benchmarking my app and
saw that it runs much slower on Android. But that also has
to do with the fact that I was using a tablet with a roughly 1GHz
ARM and was comparing against a 3.4GHz 64-bit Intel.

If you use an emulator instead of a device, you also observe
a slowdown, since Dalvik then runs inside the emulator (sic!).

(All of the above is just a snapshot as of 12 May 2012 of what I
have currently tried/know.)

Bye
 
Lew

Jan said:
Yes some JITs can do the required escape analysis to some
extent. But when talking about JITs there is always a weak
JIT and a strong JIT, since there are different providers on
the market.

True statements.

For example I am developing the same code base for later
use in both Swing and Android. And the Dalvik JIT for
Android is lagging a little bit behind, you can even read
recommendations to not use setters/getters (sic!) if possible
in code written for Dalvik.

"Setter" and "getter" are well-established informal terms with nothing
shameful in their pedigree.

One must be judicious in accepting such recommendations. I don't oppose direct
use of attribute values /per se/, but I do warn against microoptimization
early in the development cycle.

Write the code that most clearly expresses the model and behaviors it implements.

If you do use, say, 'public' variables in a class, strongly consider making
them read-only ('final') references to immutable instances.

It is neither microoptimization nor premature to consider whether data will be
primarily read or frequently written. A good domain model considers the flow
and "shape" of information (size of data packets, frequency of transactions,
proportion of duplicates, etc.) and its transformation, not just the static
object model. Considerations of read-heaviness vs. write-happiness originate
in the domain model and are appropriate topics for early analysis. (Aside:
"write-happiness" was a typing accident that I shall let stand.) Whether an
attribute comes as a variable reference or a method call is an implementation
detail perhaps irrelevant to the domain model. An immutable final variable is
not dangerous and can be justified without fear that it's premature. It
directly expresses the intent, might (!) help on an Android and won't hurt
elsewhere.
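
A minimal sketch of what that looks like, with a made-up Reading class (the name and fields are purely illustrative): every attribute is a 'public final' variable holding either a primitive or an immutable object, so callers can read it directly but nobody can change it after construction.

```java
final class Reading {
    // Read-only attributes: primitives and String are immutable values.
    public final String sensorId;
    public final long timestampMillis;
    public final double value;

    Reading(String sensorId, long timestampMillis, double value) {
        this.sensorId = sensorId;
        this.timestampMillis = timestampMillis;
        this.value = value;
    }

    // "Changing" an attribute means creating a new instance.
    Reading withValue(double newValue) {
        return new Reading(sensorId, timestampMillis, newValue);
    }
}
```

Client code reads r.value instead of calling r.getValue(); an attempted assignment such as r.value = 2.0 is rejected by the compiler.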

OTOH I will continue to write my own Android code with getters and setters.

So I am helping the JIT and I am helping the application.

And I'm sure the JIT is ever so grateful for your undoubtedly most useful
assistance.

You should back up such claims with hard evidence.

Measurable, repeatable tests.

I'm not saying you aren't helping, but "helping the optimizer" is so often
such an utterly outrageous claim that it can never be accepted on the face.

The code fragment under discussion is heavily used internally
to the API, since the indexing is dynamic. It is still possible
for the dynamic multi-indexing API, the package which has 9 classes,
to provide a proper Iterator interface to the outside, and use
this by the application.

But if you know that your deployment range will be only
top-notch JITs you might go into the pain of adding additional
class to the package for the iterator implementations. This
would blow up the package to 11 classes, counting the
.class files.

I don't think I get your last paragraph here. What pain? What additional
classes? Why?

Regardless, source-code structure should nearly always express algorithm, not
platform. Deviations should stem from measured results.

I'm willing to lay odds that for your use cases the difference made by
accessors and mutators is not the low-hanging fruit.

Harmless optimizations that also strengthen code structure are always
acceptable, of course.
 
Lew

markspace said:
Good link, thanks for posting that.

Oh, really? You think that just might have a teensy-weensy little bit to do
with the performance difference, just maybe?

I've worked a bit with Android code and deployment environments here and there.

No question, you have to be rather conservative of resources in Android, but
it's at the platform level for the most part, not the JVM level.

I am highly dubious of the claim that accessor/mutator time was the major
determinant of the putative performance issue.

Android deployments suffer from platform limitations - Jan mentioned one, the
slower ARM vs. the customary wideband desktop, usually multicore. The
programming model differs, too. Android is more like a Xen or other
virtualized meta-OS, with each Dalvik a different virtual host.

So things move in and out of memory differently - think of old-fashioned
memory overlays - than they do on the desktop. Applications move in and out of
memory on Android at the whim of Android, not Dalvik. You do have to bear this
in mind as you write for Android. For example, you have to be ready to resume
your application from a total shutdown at any instant, even just to change
orientation. You're far more bound to the UI than you might be used to.

Your GC is clunkier than Java SE's, and you sort of do have to watch your RAM,
and more importantly, your threads.

But RAM on a typical Android device ranges from a quarter gigabyte up. We're
not talking microwave-oven controllers, here. These are quite literally pocket
supercomputers. 1 GHz is slow? Come on!

Another factor is logging. Android has a syslog called "logcat", to which
everyone, their cousin and the family dog contributes. That surely affects
performance, and it makes it interesting to find your own log data amidst all
the noise.

How many of you are good at designing log output?

Liars. Of the ones who did not raise their hand, about half likely are good at
it. That's why they don't claim to be. The other half have a hope to be someday.
 
