EJB find methods. Why do they return only the primary key?

Andrea Desole

Playing around with EJBs, I'm noticing that the find methods in the home
interface return a complete bean (well, its interface), but the
implementation of the find method in the bean itself returns only the
primary key.
What I know is that the container first gets the primary key using the
find method, and then retrieves the object (maybe using ejbLoad?).
What I don't understand is why I should access the database (or whatever
repository where I can find my information) twice, instead of just
building the bean and returning it in the find method. It can't be
because the client only works on a stub, because all the calls on my
stub will eventually go to the real object, so I'll need it anyway. It
can be because the complete bean is not always needed (for example for a
remove), but is it really a gain?
Am I missing something?
 
hal

Well, you just discovered one of the many reasons why CMP EJBs, and EJB
in general, are a bad but very hyped idea.
http://c2.com/cgi/wiki?WhatsWrongWithEjb

And no, you're not missing anything: accessing n entity beans, by
specification, leads to at least n+1 calls to the database.
http://c2.com/cgi/wiki?EntityBmpFinders
 
Grzegorz Trafny

Hi,

Yes, but we should consider that a well-configured container can combine
many of those calls (your 'n + 1') into one big SELECT. Naturally, this
applies only to CMP entity beans. And if we add caching of entity beans
(as is practically always done), overall performance can be even better
than with plain low-level JDBC calls.

I apologize for the generalities, but I don't know of any official
comparisons that are adequate to the need.

BTW, one cannot ignore the fact that J2EE and EJB additionally introduce
many auxiliary services.

Greetings
GT
 
Doug Pardee

Andrea Desole wrote:
What I don't understand is why I should access the database (or whatever
repository where I can find my information) twice, instead of just
building the bean and returning it in the find method.

There are multiple cases to consider.

The container uses the ejbFindByPrimaryKey method to test if the
database has a record with that primary key. If so, ejbFindByPrimaryKey
returns the primary key; if not it throws an ObjectNotFoundException.
In the event that more than one record exists with the primary key (not
usually possible), the finder throws a FinderException. This process
doesn't necessarily involve doing a database access, but in practice it
almost always does.

The container uses other Single-Object Finders to translate the lookup
criteria into a primary key for a record in the database. If there is
no such record, the finder must throw an ObjectNotFoundException. If
more than one primary key is found, the finder must throw a
FinderException. This lookup might not involve reading the actual
record from the database; e.g., the primary key might be obtained by
querying a different table using the search criteria.

In both of the above cases, if the container receives a primary key
back from the finder, it then looks in its bean pool to see if it
already has a bean instance for that key. It returns that instance if
so, or allocates a new instance if not (or sets up lazy allocation for
it). Thus, it's possible that the bean instance returned is one from
the pool, not the one that executed the finder. It could also be just a
stub for lazy allocation.
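
As a rough illustration of the lookup just described, here is a minimal sketch in plain Java. It is not the EJB API: FinderResolutionSketch, BeanInstance, findByPrimaryKey and resolve are names invented for the example, and IllegalStateException stands in for ObjectNotFoundException so that the code compiles without a container.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a container; none of these names come from the EJB API.
final class FinderResolutionSketch {
    /** Stand-in for an entity bean instance whose state has not been loaded yet. */
    static final class BeanInstance {
        final Integer primaryKey;
        boolean loaded = false; // would be set by an ejbLoad-style call on first use
        BeanInstance(Integer primaryKey) { this.primaryKey = primaryKey; }
    }

    private final Set<Integer> rowsInDatabase;                  // pretend table of primary keys
    private final Map<Integer, BeanInstance> instances = new HashMap<>();

    FinderResolutionSketch(Set<Integer> rowsInDatabase) {
        this.rowsInDatabase = rowsInDatabase;
    }

    /** Plays the role of ejbFindByPrimaryKey: checks existence and returns only the key. */
    Integer findByPrimaryKey(Integer key) {
        if (!rowsInDatabase.contains(key)) {
            // A real finder would throw ObjectNotFoundException here.
            throw new IllegalStateException("no row with key " + key);
        }
        return key;
    }

    /** Plays the role of the container: maps the returned key to an existing or new instance. */
    BeanInstance resolve(Integer key) {
        Integer found = findByPrimaryKey(key);
        return instances.computeIfAbsent(found, BeanInstance::new);
    }

    public static void main(String[] args) {
        FinderResolutionSketch container =
                new FinderResolutionSketch(new HashSet<>(Arrays.asList(1, 2, 3)));
        BeanInstance first = container.resolve(2);
        BeanInstance second = container.resolve(2);
        System.out.println(first == second); // prints "true": same key, same instance
    }
}
```

The point of the sketch is only that the finder hands back a key and the container decides which instance goes with it, so the instance the client gets need not be the one that ran the finder.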

The container uses Multi-Object Finders to perform queries based on the
lookup criteria. The finder returns a Collection of primary keys
(possibly empty) associated with the result set of the query. This
might not involve reading the actual records from the database; e.g.,
the primary keys might be obtained by querying a different table using
the search criteria.

In this case, the container looks in its bean pool to see if it has
bean instances for any of the returned keys. It allocates new instances
(or sets up lazy allocation) for any keys that didn't have pooled
instances, and then returns the lot. Multiple bean instances are
returned, and maybe none of them are the one that executed the finder.
Some might just be a stub for lazy allocation.

The client might then use the bean instance(s) that it received from
the container. The first access to each bean instance will trigger a
call to its ejbLoad method, which typically will issue another database
read.

So, if you call a finder method that returns 'n' bean instances, and
then access each of those instances, you'll typically end up with 'n+1'
database accesses.
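
The arithmetic can be checked with a small simulation. This is a sketch under simplifying assumptions: an in-memory map plays the database, findAllKeys plays a multi-object finder, and load plays ejbLoad. All of the names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation; the map is a stand-in for a database table.
final class NPlusOneSketch {
    private final Map<Integer, String> table = new TreeMap<>();
    int databaseAccesses = 0;

    NPlusOneSketch(int rows) {
        for (int i = 1; i <= rows; i++) {
            table.put(i, "row-" + i);
        }
    }

    /** Multi-object finder: one query that returns only primary keys. */
    List<Integer> findAllKeys() {
        databaseAccesses++;
        return new ArrayList<>(table.keySet());
    }

    /** Stand-in for ejbLoad: one read per bean on first access. */
    String load(Integer key) {
        databaseAccesses++;
        return table.get(key);
    }

    /** Runs the finder, touches every returned bean, and reports the access count. */
    static int accessesFor(int n) {
        NPlusOneSketch db = new NPlusOneSketch(n);
        for (Integer key : db.findAllKeys()) {
            db.load(key);
        }
        return db.databaseAccesses;
    }

    public static void main(String[] args) {
        System.out.println(accessesFor(5)); // prints 6: one finder query plus five loads
    }
}
```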

Some containers (at least WebLogic and JBoss) can be instructed to
preload all of the returned beans if they're CMP beans. In a few cases
this might be wasteful; for example, if the result set was 1000 beans
and you only wanted to look at the top 5. It can also result in your
database going into lock escalation.
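
The trade-off can be sketched in the same simulated style. This does not show how WebLogic or JBoss actually implement preloading; PreloadSketch and its methods are hypothetical, and the only thing modelled is that one query fetches every row up front.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical simulation of a preloading finder; not any container's real mechanism.
final class PreloadSketch {
    private final Map<Integer, String> table = new TreeMap<>();
    private final Map<Integer, String> preloaded = new HashMap<>();
    int databaseAccesses = 0; // number of round trips
    int rowsRead = 0;         // number of rows pulled from the "database"

    PreloadSketch(int rows) {
        for (int i = 1; i <= rows; i++) {
            table.put(i, "row-" + i);
        }
    }

    /** Finder that fetches keys and state together in a single query. */
    List<Integer> findAllAndPreload() {
        databaseAccesses++;
        rowsRead += table.size();
        preloaded.putAll(table);
        return new ArrayList<>(table.keySet());
    }

    /** Bean access: served from the preloaded state when available. */
    String state(Integer key) {
        String cached = preloaded.get(key);
        if (cached != null) {
            return cached;
        }
        databaseAccesses++;
        rowsRead++;
        return table.get(key);
    }

    public static void main(String[] args) {
        PreloadSketch db = new PreloadSketch(1000);
        List<Integer> keys = db.findAllAndPreload();
        for (int i = 0; i < 5; i++) {
            db.state(keys.get(i)); // only the top 5 are actually used
        }
        // prints "1 accesses, 1000 rows read"
        System.out.println(db.databaseAccesses + " accesses, " + db.rowsRead + " rows read");
    }
}
```

One round trip replaces n+1, but all 1000 rows are read even though only 5 are used, which is the wasteful case mentioned above.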

Entity beans are designed to always behave correctly under all
conditions. They are inherently low-performance and should be
approached carefully in any system that is expected to be under heavy
load. In addition to the 'n+1' problem (which can sometimes be
circumvented with some containers), you have scalability challenges
introduced by the limitation that only one client can be accessing an
entity bean at a time. Entity beans must be accessed inside
transactions, which generally is inappropriate for OLAP applications.
And depending on your database, you might end up with unnecessary lock
escalation which can further damage scalability.
 
Sudsy

Thank you for a most eloquent description.

Doug Pardee wrote:
<snip>
Entity beans are designed to always behave correctly under all
conditions. They are inherently low-performance and should be
approached carefully in any system that is expected to be under heavy
load. In addition to the 'n+1' problem (which can sometimes be
circumvented with some containers), you have scalability challenges
introduced by the limitation that only one client can be accessing an
entity bean at a time. Entity beans must be accessed inside
transactions, which generally is inappropriate for OLAP applications.
And depending on your database, you might end up with unnecessary lock
escalation which can further damage scalability.

I agreed with everything you said, save this last paragraph. In cases
where there is a lot of contention for table (or view) rows, CMP entity
EJBs can actually improve performance.
Well-designed implementations draw on the experience gained through
many years and iterations and access methods which provide the best
performance with minimal contention.
I'd be the first to agree that the initial attempts in this area fell
far short of potential. You only have to look at the omission of the
ORDER BY clause in the initial implementation of EJB-QL to see that
they didn't offer what many consider to be essential functionality
at the beginning. Things have improved considerably since then.
I still believe that CMP entity EJBs (especially when you utilize
CMR) provide flexibility and power as a component of an enterprise
application.
And I also believe that many of the performance limitations are behind
us. Now if someone wants to fund a research project to prove that
premise... ;-)
 
Doug Pardee

Sudsy wrote:
Well-designed implementations draw on the experience gained
through many years and iterations and access methods
which provide the best performance with minimal contention.

That is just wishful thinking.

The specification for Entity Beans in both EJB1 and EJB2 is such that
it is simply not possible to handle entity beans in a performant
manner, no matter how clever the container. That is why Marc Fleury
chose to extend the EJB specification for JBoss, with the motto of
"Real tech not spec".

The easiest way to end up with a poorly performing EJB application is
to ignore the effect of entity beans involved in transactions. Client
access to entity beans is single-threaded, with transactions acting as
the locks. As long as each bean is only needed by one client thread,
this is no problem. But if you have a popular bean that many or all
clients need to access, you can end up with single-threaded behavior
(and performance) if you're not careful.

Furthermore, during the course of a transaction, each entity bean that
is accessed is locked up to that client until the transaction
terminates. If you're not paying attention, it's awfully easy to end up
locking up a lot of entity beans that you didn't mean to.
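
To make the "transactions acting as the locks" point concrete, here is one way the behavior could be modelled. It is an assumption-laden sketch, not a description of any real container: BeanLockSketch is a hypothetical class that guards a single bean with a ReentrantLock held for the length of a transaction.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical model: one lock per bean, held from transaction begin to end.
final class BeanLockSketch {
    private final ReentrantLock transactionLock = new ReentrantLock();

    /** The calling client thread takes the bean for the duration of its transaction. */
    void beginTransaction() {
        transactionLock.lock();
    }

    /** Commit or rollback releases the bean. */
    void endTransaction() {
        transactionLock.unlock();
    }

    /** Asks, from a second client thread, whether the bean is available right now. */
    boolean otherClientCanAccess() {
        final boolean[] acquired = new boolean[1];
        Thread other = new Thread(() -> {
            if (transactionLock.tryLock()) {
                acquired[0] = true;
                transactionLock.unlock();
            }
        });
        other.start();
        try {
            other.join(); // join makes the other thread's write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return acquired[0];
    }

    public static void main(String[] args) {
        BeanLockSketch popularBean = new BeanLockSketch();
        popularBean.beginTransaction();
        System.out.println(popularBean.otherClientCanAccess()); // prints "false"
        popularBean.endTransaction();
        System.out.println(popularBean.otherClientCanAccess()); // prints "true"
    }
}
```

In this model a long transaction on a popular bean keeps every other client out until it ends, which is where the single-threaded behavior comes from.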

We've already talked about the "n+1" problem. As noted, it can
sometimes be circumvented (in some containers) if you know about it.

Finders that return multiple beans can be a problem if they return too
many. The container may not be able to create that many bean instances.
The more rows you lock into your transaction, the more likely that
you'll have a conflict. Worse, the database may go into lock
escalation.

For beans that are essentially read-only data, Commit Option A can help
by allowing the bean to cache the data between transactions. However,
you need to be wary of triggering lock escalation in the database,
because the container is required to lock all data rows being held by
Commit Option A beans.
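
The effect of caching between transactions can be sketched as follows. This is a simplified model and not container code: CommitOptionSketch and its cacheBetweenTransactions flag are invented for the example, with true standing for Commit Option A style behavior and false for a bean that reloads its state in every transaction.

```java
// Hypothetical model of a bean whose state may or may not survive between transactions.
final class CommitOptionSketch {
    private final boolean cacheBetweenTransactions;
    private String cachedState; // null means the state must be loaded again
    int loads = 0;              // number of ejbLoad-style database reads

    CommitOptionSketch(boolean cacheBetweenTransactions) {
        this.cacheBetweenTransactions = cacheBetweenTransactions;
    }

    /** Reads the bean's state inside a fresh transaction. */
    String readInNewTransaction() {
        if (cachedState == null) {
            loads++;
            cachedState = "state-from-database";
        }
        String result = cachedState;
        if (!cacheBetweenTransactions) {
            cachedState = null; // state is discarded when the transaction ends
        }
        return result;
    }

    public static void main(String[] args) {
        CommitOptionSketch cached = new CommitOptionSketch(true);
        CommitOptionSketch reloading = new CommitOptionSketch(false);
        for (int i = 0; i < 3; i++) {
            cached.readInNewTransaction();
            reloading.readInNewTransaction();
        }
        System.out.println(cached.loads + " vs " + reloading.loads); // prints "1 vs 3"
    }
}
```

The sketch leaves out the locking cost mentioned above; it only shows why cached state saves reads for data that rarely changes.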

For OLAP applications, transactioning is something that you want to
avoid. There's no reason to cause any data in an OLAP application to be
locked. CMP beans are useless here, but carefully crafted BMP beans can
work.

One of Marc Fleury's pet peeves about the EJB specification is the
requirement that all argument and result data must be serialized and
deserialized. That can significantly add to the overhead, which is why
JBoss provides an extension to skip that and just pass reference
pointers. WebLogic has added a similar extension for Local EJBs.
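
The difference between the two calling styles can be shown with plain Java serialization. This is only a sketch of the semantics, with no EJB or JBoss code involved: CallSemanticsSketch, copyBySerialization and appendItem are hypothetical names, and the serialize/deserialize round trip stands in for what a spec-compliant call does to its arguments.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

// Hypothetical illustration of copy-by-serialization versus pass-by-reference.
final class CallSemanticsSketch {
    /** Copies a value the way a serializing call would: write it out, read it back. */
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T copyBySerialization(T value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(value);
            }
            try (ObjectInputStream in =
                         new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (T) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Stand-in for a bean method that modifies its argument. */
    static void appendItem(ArrayList<String> order) {
        order.add("extra");
    }

    public static void main(String[] args) {
        ArrayList<String> order = new ArrayList<>();
        order.add("widget");

        appendItem(copyBySerialization(order)); // callee sees a copy
        System.out.println(order.size());       // prints 1: caller's list is untouched

        appendItem(order);                      // callee sees the caller's object
        System.out.println(order.size());       // prints 2: caller's list was changed
    }
}
```

Skipping the copy saves the serialization work, but the callee can then change the caller's objects, which is one of the safety properties such an extension gives up.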

If the application is deployed onto clustered servers, it becomes even
more imperative to carefully consider the transaction architecture. And
if you're using entity beans, clustered servers make it even harder to
create a transaction architecture that doesn't bog down performance.

The bottom line is that with entity beans it's easy to make a poorly
performing EJB application. You can avoid some of the performance
problems by knowing your application well and making adjustments,
particularly in the transaction architecture. You can reduce some other
performance problems by knowing what extensions your particular
container provides: n+1 reduction, serialization elimination, special
commit options, etc., while bearing in mind that each of these
extensions bypasses some safety feature of the EJB specification. You
also need to know if your database uses lock escalation, and if so, how
to avoid it.

As I said, "Entity Beans... should be approached carefully in any
system that is expected to be under heavy load." There are a lot of
things that need to be taken into consideration.
 
