pooled connection myth

David McDivitt

Thanks Virgil. Your criticism of the three active shared connections I
propose assumes a pooling approach. The three shared connections are not
pooled, but shared. Round robin is used for load balancing between them. If
the collection of connections grows and shrinks dynamically as you propose,
then literally these are being created and destroyed based on need and there
is no advantage whatsoever to having anything pooled. The concept of pooling
implies connections are not destroyed, meaning no close or logoff ever
happens at the server for them, and new requests get a hot connection. The
idea of pooling is a waste of time. What's needed is connection sharing. If
trans blocks are needed a standalone connection should be obtained instead.
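
As a rough illustration of the round robin hand-out described above, here is a minimal sketch. The class name RoundRobin and its method are made up for this example and are not taken from Padbmt00DBConnectionManager; strings stand in for the three connections.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical round-robin selector over a fixed set of shared items. */
final class RoundRobin<T> {
    private final List<T> items;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobin(List<T> items) {
        if (items.isEmpty()) throw new IllegalArgumentException("need at least one item");
        this.items = List.copyOf(items);
    }

    /** Returns the next item in rotation; the selection itself is thread-safe. */
    T next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int i = Math.floorMod(counter.getAndIncrement(), items.size());
        return items.get(i);
    }

    public static void main(String[] args) {
        // Strings stand in for the three shared connections.
        RoundRobin<String> rr = new RoundRobin<>(List.of("conn1", "conn2", "conn3"));
        for (int n = 0; n < 4; n++) System.out.println(rr.next());
    }
}
```

Note that only the choice of item is made thread-safe here. Nothing in this sketch stops two callers from holding the same item at the same time, which is the point the rest of the thread argues about.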

Yes, these are Struts applications. The classes used are all user written,
or, pasted from something else and changed a bit. They are:

DataSourcePlugIn
Padbmt00DBConnectionManager
BaseDAO
PersistenceMap
PersistenceMapDAO
PersistenceMapFactory

getPooledConnection is in the BaseDAO class and has the following:

// Fragment of getPooledConnection; db is not declared in this fragment.
try {
    db = Padbmt00DBConnectionManager.getInstance();
    Connection conn = db.getConnection();
    // Probe the connection with a trivial DB2 query before returning it.
    String query = "SELECT '1' FROM SYSIBM.SYSDUMMY1";
    Statement statement1 = conn.createStatement();
    ResultSet rs1 = statement1.executeQuery(query);
    rs1.close();
    statement1.close();
    return conn;
}
catch (com.ibm.ejs.cm.portability.StaleConnectionException e) {
    // Stale connection: ask the manager again, this time without the probe.
    try {
        Connection conn =
            Padbmt00DBConnectionManager.getInstance().getConnection();
        return conn;
    }
    catch (Exception ee) {
        ee.printStackTrace();
        return null;
    }
}
catch (Exception e) {
    e.printStackTrace();
    return null;
}
} // end of getPooledConnection

The driver name COM.ibm.db2.jdbc.app.DB2Driver can be found in
datasource.xml.

One high priority project has 12 developers working as needed. Things are so
convoluted, people have recently been instantiating their own driver and
connection and bypassing the connection management.
 
David McDivitt

From: Lee Fesperman said:
Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 19:58:04 GMT



As John said, the usefulness of concurrent sharing for connection objects varies by the
driver. In addition, the Connection object does contain state information. Take a look
at the various setXXX() methods in java.sql.Connection.

Your assumptions about pooling are not very valid. Other types of resources, including
threads, are pooled in Java but rarely concurrently. Concurrent use of pooled resources
would tend to imply that all state be read-only, so pooling as such is not needed.
All you need to do is to keep a common reference (GC will take care of disposing of the
object when no longer needed). While this isn't absolutely true in all cases, it covers
most of the interesting ones.

Also, caching of connection properties is just one aspect and will normally include more
than just user/password. Then there is the very important feature of avoiding the
overhead of making (and ending) a connection to the db and resultant pressure on
resources on the server. In addition, some connection pooling software supports
(automatic, hidden) pooling of prepared statements.

Finally, please do not top-post. It's just basic netiquette.


If more functionality is present in the connection manager I want to see it.
I do not want to take it on faith. What you say here does not represent
enough value compared to increased complexity. Shared connections are good.
Pooled connections are not good, since pooling itself does not accomplish
anything. The only way it can be of value is to intercept close and open
calls and reuse hot connections. This is not being done where I am.
 
Lee Fesperman

David said:
What I'm trying to do is establish some arguments to do things differently.
Just because a package says something does not mean it really does that. The
package may only be presenting an abstract idea. Some people believe
applications should be built with ultimate conceivable robustness in mind,
always, with however many unused pieces to facilitate the idea. I do not
agree with that. Just because a new gimmick implies robustness does not mean
it should be used on that basis alone. The applications we have where I am
have too many of these and code cannot be followed for maintenance.

From what I've seen I see no reason for connection pooling under any
circumstance, even if it was working. A connection manager should return one
of three or four shared connections. An additional method should be present
which returns a standalone connection for times when such might be needed
for tran blocks, etc. Anything beyond this is just more bloated java crap.

My, my, you are very insistent on your point of view. If you really see things that way
for the general case, you might consider responding to the points in my other posting.

While it is possible that your approach has validity in *your special case*, I really
doubt it does for database connections. Let me say that I don't know about the internal
workings of the DB2 server. But....

A related issue was recently discussed on comp.lang.java.databases in reference to
Oracle and other DBMSs. Joe from BEA even asserted that it was true for all major
databases that "individual operations are serialized on the same connection." This tends
to reduce the effectiveness of sharing a connection. Concurrent use of resultsets is not
a good idea, and that of statements is limited. All you have is the use of different
statements concurrently. If transactions aren't important, there seems to be little or
even reduced performance gain over multiple connections. This just leaves you with
reduced use of resources on the client and possibly on the server. Not something to jump
up and down about, except in very special circumstances.

Don't top-post!
 
David McDivitt

From: Lee Fesperman said:
Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 21:23:20 GMT



My, my, you are very insistent on your point of view. If you really see things that way
for the general case, you might consider responding to the points in my other posting.

While it is possible that your approach has validity in *your special case*, I really
doubt it does for database connections. Let me say that I don't know about the internal
workings of the DB2 server. But....

A related issue was recently discussed on comp.lang.java.databases in reference to
Oracle and other DBMSs. Joe from BEA even asserted that it was true for all major
databases that "individual operations are serialized on the same connection." This tends
to reduce the effectiveness of sharing a connection. Concurrent use of resultsets is not
a good idea, and that of statements is limited. All you have is the use of different
statements concurrently. If transactions aren't important, there seems to be little or
even reduced performance gain over multiple connections. This just leaves you with
reduced use of resources on the client and possibly on the server. Not something to jump
up and down about, except in very special circumstances.

Don't top-post!

We seem to be getting to the issue at hand, finally. Pages render very fast
from the app server. There is little if any sustained database access. The
app gets what it needs and displays it. Three shared connections working
round robin is excellent for this. The probability the app would serialize a
new query on a connection, before the previous requester is done with it, is
almost nonexistent. So, increase to five. Even if a very infrequent request
was serialized, it would take no time. Alternatively, if a connection is
created then dropped for each request, that involves much more resources and
is much slower on both the app server and database server.

Again, the efficiency of pooling is a myth. Sharing is better. When needed
for trans blocks a standalone connection should be obtained.
 
John C. Bollinger

David said:
If more functionality is present in the connection manager I want to see it.

Well it certainly depends on the connection manager you use, doesn't it?

I do not want to take it on faith. What you say here does not represent
enough value compared to increased complexity. Shared connections are good.

Shared connections are *BAD*. Connection instances have mutable state,
so safely sharing them in a multithreaded environment requires
considerable effort, whether or not their use is serialized. By the
time you build up enough wrapper code to make the sharing safe, you are
_at least_ as badly off, complexity-wise, as if you had used a proper
connection pool. Moreover, you need to produce and maintain the code
for this scheme *yourself*, whereas JDBC driver providers generally
provide connection pool implementations along with the driver.

Pooled connections are not good, since pooling itself does not accomplish
anything. The only way it can be of value is to intercept close and open
calls and reuse hot connections. This is not being done where I am.

Pooling accomplishes exactly what it aims to do, which is to reduce the
number of distinct connections made to the DB in a manner transparent to
connection users. This is the same goal that you claim for your "shared
connection" idea, so you must consider it a worthy one. Apparently your
organization is not in fact using connection pooling, and perhaps that
is the real cause for your complaint. The code you showed in your
response to Virgil does not exhibit connection pooling, no matter what
the classes and methods involved are named (and as you yourself have
already argued), so it is no basis for any kind of criticism of
connection pooling.

As far as I can tell, you are conducting a prolonged diatribe against
your own company's code. You seem on the one hand to understand that it
does not correctly implement a connection pool, but you seem on the
other hand to be using it as a basis to criticize connection pooling in
general. That just doesn't make sense. You have also aired other
complaints about the convolution and complexity of your company's code.
These may well be legitimate, but other than sympathizing with you
there's not much we can say or do about it.

In my opinion, your company had already lost when it decided to do its
own connection management instead of using a DataSource implementation
-- pooled or not -- provided with the DB driver. I can't say I
especially like the hand-rolled code I saw, but the worst thing about it
is that it probably shouldn't exist at all.
 
Lee Fesperman

David said:
If more functionality is present in the connection manager I want to see it.
I do not want to take it on faith. What you say here does not represent
enough value compared to increased complexity. Shared connections are good.
Pooled connections are not good, since pooling itself does not accomplish
anything. The only way it can be of value is to intercept close and open
calls and reuse hot connections. This is not being done where I am.

Thanks for responding (and for inline posting). I don't know the connection manager that
you are using. Note: ConnectionManager is a specific facility in JCA; this is probably
not the JCA one. The one you're using may be a poor facility, however you also made
assertions about connection pooling in general which aren't valid.

I think I did list the major benefits of connection pooling. The one I missed was
reduced resource requirements and better control of such by the containing software (for
one thing, it gives the container the ability to say: you can't have another connection
at this time.) If those are not of interest to you, fine, though tighter control of
resources is quite important to many containers. For instance, JCA gives the container
control over the use of threads by connectors.

As to shared connections, I see that to be of very limited use. I made comments on that
in the other sub-thread and would be interested in your counter. If you also look at the
thread on comp.lang.java.databases (I'll get the title if you wish), you will see arguments
why a server would not want to increase that capability.
 
David McDivitt

From: Lee Fesperman said:
Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 22:31:32 GMT

Thanks for responding (and for inline posting). I don't know the connection manager that
you are using. Note: ConnectionManager is a specific facility in JCA; this is probably
not the JCA one. The one you're using may be a poor facility, however you also made
assertions about connection pooling in general which aren't valid.

I think I did list the major benefits of connection pooling. The one I missed was
reduced resource requirements and better control of such by the containing software (for
one thing, it gives the container the ability to say: you can't have another connection
at this time.) If those are not of interest to you, fine, though tighter control of
resources is quite important to many containers. For instance, JCA gives the container
control over the use of threads by connectors.

As to shared connections, I see that to be of very limited use. I made comments on that
in the other sub-thread and would be interested in your counter. If you also look at the
thread on comp.lang.java.databases (I'll get the title if you wish), you will see arguments
why a server would not want to increase that capability.

Yes, please give me something for a text search in that newsgroup. I do not
download it so should start.

If a database driver will handle multiple threads on one connection, even if
serialized, that is by far the best scenario. The connection manager I made
makes my application incredibly fast. All it does is return the same three
connections round robin. I had ten people hit it as fast as they could, over
and over, at the same time for a test. No one has said what is actually
"pooled" in connection pooling. It is a myth. When closed the connections go
away. If their number increases and decreases dynamically, that means they
are being opened and closed. Realize that each open is a separate login to
the database server requiring user name and password. There is no possible
way you can show that to be more efficient than sharing a few connections
and never closing them. Limited functionality is given to a caller, yes, but
the only limitation is trans blocks. For those a standalone can be obtained.
 
David McDivitt

From: John C. Bollinger said:
Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 17:29:18 -0500



Well it certainly depends on the connection manager you use, doesn't it?


Shared connections are *BAD*. Connection instances have mutable state,
so safely sharing them in a multithreaded environment requires
considerable effort, whether or not their use is serialized. By the
time you build up enough wrapper code to make the sharing safe, you are
_at least_ as badly off, complexity-wise, as if you had used a proper
connection pool. Moreover, you need to produce and maintain the code
for this scheme *yourself*, whereas JDBC driver providers generally
provide connection pool implementations along with the driver.


Pooling accomplishes exactly what it aims to do, which is to reduce the
number of distinct connections made to the DB in a manner transparent to
connection users. This is the same goal that you claim for your "shared
connection" idea, so you must consider it a worthy one. Apparently your
organization is not in fact using connection pooling, and perhaps that
is the real cause for your complaint. The code you showed in your
response to Virgil does not exhibit connection pooling, no matter what
the classes and methods involved are named (and as you yourself have
already argued), so it is no basis for any kind of criticism of
connection pooling.

As far as I can tell, you are conducting a prolonged diatribe against
your own company's code. You seem on the one hand to understand that it
does not correctly implement a connection pool, but you seem on the
other hand to be using it as a basis to criticize connection pooling in
general. That just doesn't make sense. You have also aired other
complaints about the convolution and complexity of your company's code.
These may well be legitimate, but other than sympathizing with you
there's not much we can say or do about it.

In my opinion, your company had already lost when it decided to do its
own connection management instead of using a DataSource implementation
-- pooled or not -- provided with the DB driver. I can't say I
especially like the hand-rolled code I saw, but the worst thing about it
is that it probably shouldn't exist at all.


Sharing is not BAD. Why? No support is required. If the driver supports it,
what's the big deal? The driver should be used to obtain as much
functionality as possible.
 
Lee Fesperman

David said:
We seem to be getting to the issue at hand, finally. Pages render very fast
from the app server. There is little if any sustained database access. The
app gets what it needs and displays it. Three shared connections working
round robin is excellent for this. The probability the app would serialize a
new query on a connection, before the previous requester is done with it, is
almost nonexistent. So, increase to five. Even if a very infrequent request
was serialized, it would take no time. Alternatively, if a connection is
created then dropped for each request, that involves much more resources and
is much slower on both the app server and database server.

Again, the efficiency of pooling is a myth. Sharing is better. When needed
for trans blocks a standalone connection should be obtained.

Again, thanks for inline posting. I just don't see that you have made a case that shared
connections are better, in general. Of course a decent connection pool will not create
and drop a connection for each request. That's the main purpose of pooling. It simply
does not cost more than your solution.

I think John covered the shared connection situation pretty well. Shared connections
have very limited use and require very special client coding which is hard to ensure is
correct. It could easily produce bugs ... the really bad kind that only show up under
special conditions. In the general case, the possibility of concurrent use of queries is
very real, and your solution prevents that use. The server is likely to provide improved
performance for that situation, if given a chance.

By not using a trans block, I guess you mean using autocommit. Serializing that too will
decrease performance. You also seem to be assuming that the server is never performing
any work except on behalf of web activity, not a valid assumption.

In addition, database servers are not oriented to providing good service for shared
connections. They're concerned about good service to multiple connections. Catering to
special use of individual connections adds complexity that just isn't worth it for
servers.

You are taking control from the app and db server on the premise that you can do it
better. I no tin so (quoting Rita Moreno in response to the question, "Is Canseco a man
with the public good on his mind?")
 
Lee Fesperman

David said:
Yes, please give me something for a text search in that newsgroup. I do not
download it so should start.

Sorry about that; I was being lazy. The title is "Single connection, multiple threads,
one Statement per thread" in comp.lang.java.databases.

If a database driver will handle multiple threads on one connection, even if
serialized, that is by far the best scenario. The connection manager I made
makes my application incredibly fast. All it does is return the same three
connections round robin. I had ten people hit it as fast as they could, over
and over, at the same time for a test. No one has said what is actually
"pooled" in connection pooling. It is a myth. When closed the connections go
away. If their number increases and decreases dynamically, that means they
are being opened and closed. Realize that each open is a separate login to
the database server requiring user name and password. There is no possible
way you can show that to be more efficient than sharing a few connections
and never closing them. Limited functionality is given to a caller, yes, but
the only limitation is trans blocks. For those a standalone can be obtained.

No, regular connection pooling will provide better performance. The connection cost is
the same because connection pooling doesn't reconnect. And the performance on operations
will be better because it won't be serialized and doesn't require autocommit. If you
refuse to believe that connection pooling does work, then try it and/or take a look at
the code of DBCP on SourceForge; it's open source. Repeatedly claiming it is a myth
doesn't advance your case.

Your simplistic testing does not prove it for the general case. A test that forces
serialization will prove the opposite.

What is "pooled" in connection pooling? The *open* driver object implementing
java.sql.Connection. Do we have to say that?

I've already pointed out that trans blocks are not the only state encapsulated by a
Connection, ignoring various active statements/resultsets. Do you want chapter and
verse? You can say they are not important to you, but no way can you say they are not
important to anyone.

Why do you keep repeating the same falsehoods when corrected? I'm beginning to lose
interest in your responses because of your intellectual dishonesty.
 
Chris Smith

David McDivitt said:
No one has said what is actually
"pooled" in connection pooling. It is a myth. When closed the connections go
away. If their number increases and decreases dynamically, that means they
are being opened and closed. Realize that each open is a separate login to
the database server requiring user name and password.

Let me explain this in a slightly different way. The interface that
JDBC applications use to speak to the database -- java.sql.Connection --
is an interface. It can have any number of possible implementations.
At the most basic level, JDBC drivers can provide a direct physical
connection, in which the Connection object represents a session with
the database (with authentication and what-have-you), and close() causes
that session to end.

However, that's *not* the only option. Connection pooling is actually
implemented by providing a different (you could say "fake" if you
wanted) implementation of the same interface. In this new pretend
connection, the Connection object has little to do with the actual login
session with the database. When you acquire the Connection object, a
new session is NOT started with the database; and when you close it, the
database session is NOT ended. The object is merely pretending to be a
database connection. Basically, you're sharing the connection in the
important senses of the word, but it's guaranteed that no one else is
using that connection while you've got it.
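
The "pretend connection" idea above can be sketched in a few lines. This is only an illustration under stated assumptions: TinyPool, borrow, idleCount and stub are hypothetical names invented here, the stub stands in for a real driver connection so the example runs without a database, and no claim is made that any particular pooling product is written this way.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical minimal pool: close() on the wrapper returns the session to the pool. */
final class TinyPool {
    private final Deque<Connection> idle = new ArrayDeque<>();

    TinyPool(Connection... physicalConnections) {
        for (Connection c : physicalConnections) idle.push(c);
    }

    /** Hands out a wrapper that the borrower owns exclusively until it calls close(). */
    synchronized Connection borrow() {
        if (idle.isEmpty()) throw new IllegalStateException("pool exhausted");
        Connection physical = idle.pop();
        AtomicBoolean returned = new AtomicBoolean(false);
        return (Connection) Proxy.newProxyInstance(
            TinyPool.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                if (method.getName().equals("close")) {
                    // Intercept close(): the database session stays open.
                    if (returned.compareAndSet(false, true)) release(physical);
                    return null;
                }
                try {
                    return method.invoke(physical, args); // everything else is delegated
                } catch (InvocationTargetException e) {
                    throw e.getCause();
                }
            });
    }

    private synchronized void release(Connection c) { idle.push(c); }

    synchronized int idleCount() { return idle.size(); }

    /** Stand-in for a driver connection (demo assumption): it only counts physical closes. */
    static Connection stub(AtomicInteger physicalCloses) {
        return (Connection) Proxy.newProxyInstance(
            TinyPool.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                if (method.getName().equals("close")) physicalCloses.incrementAndGet();
                return null;
            });
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger physicalCloses = new AtomicInteger();
        TinyPool pool = new TinyPool(stub(physicalCloses));
        try (Connection c = pool.borrow()) {
            System.out.println("idle while borrowed: " + pool.idleCount());
        }
        System.out.println("idle after close(): " + pool.idleCount()
            + ", physical closes: " + physicalCloses.get());
    }
}
```

A production pool would do considerably more than this sketch: roll back unfinished work and reset connection state on return, reject use of the wrapper after close(), validate connections, and grow or shrink within limits.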

To confuse things further, you seem to have been looking at
ConnectionPoolDataSource (based on your reference to a
getPooledConnection method). That class does *not* implement a pooled
data source; it implements a data source with features that make it
easier to IMPLEMENT pooling. To use connection pooling, you would
typically use an implementation of the DataSource interface -- one which
implements pooling. Such a data source is typically provided and bound
into JNDI in a J2EE app server environment. It's also occasionally
provided as part of a JDBC driver, and there are implementations
available as third-party code.

So you may be right that your code doesn't implement connection pooling;
but you are not right that connection pooling can't work. It does work,
on a very wide scale and every single day. You may as well say that
cars can't be red; the evidence is obvious and plentiful to the
contrary.

There is no possible
way you can show that to be more efficient than sharing a few connections
and never closing them. Limited functionality is given to a caller, yes, but
the only limitation is trans blocks. For those a standalone can be obtained.

I have no doubt that your scheme is fast, when it works. However, it
will occasionally fail; not just with transactions, but any time that
the state of the connection is modified. On the other hand, you could
do connection pooling, and it would be essentially as fast, handle
transactions and other connection state just fine, and would not require
programmers to learn your new set of restrictions. Why not do it that
way?
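
To make the "connection state" concern concrete, here is a small sketch. It is an illustration only: SharedStateDemo and its methods are hypothetical names, and the stub connection (an assumption, so the example runs without a database) remembers nothing except the auto-commit flag. The two callers run one after the other on purpose, to show that the problem does not even need a thread race.

```java
import java.lang.reflect.Proxy;
import java.sql.Connection;
import java.sql.SQLException;

/** Hypothetical demo of one Connection object handed to two callers. */
final class SharedStateDemo {

    /** Stand-in for a driver connection: tracks only the auto-commit flag. */
    static Connection stubConnection() {
        boolean[] autoCommit = { true }; // JDBC connections start in auto-commit mode
        return (Connection) Proxy.newProxyInstance(
            SharedStateDemo.class.getClassLoader(),
            new Class<?>[] { Connection.class },
            (proxy, method, args) -> {
                switch (method.getName()) {
                    case "setAutoCommit":
                        autoCommit[0] = (Boolean) args[0];
                        return null;
                    case "getAutoCommit":
                        return autoCommit[0];
                    default:
                        throw new UnsupportedOperationException(method.getName());
                }
            });
    }

    /** Returns the auto-commit flag caller B observes after caller A changed it. */
    static boolean autoCommitSeenBySecondCaller() throws SQLException {
        Connection shared = stubConnection();
        Connection callerA = shared; // the manager hands the object to A
        Connection callerB = shared; // ...and the same object to B before A is done
        callerA.setAutoCommit(false); // A starts a transaction
        return callerB.getAutoCommit(); // B never asked for this setting
    }

    public static void main(String[] args) throws SQLException {
        System.out.println("B sees autoCommit=" + autoCommitSeenBySecondCaller());
    }
}
```

With a real driver the same applies to the other setters on java.sql.Connection (read-only flag, isolation level, catalog and so on): whatever one holder changes, every other holder of that object inherits.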

--
www.designacourse.com
The Easiest Way To Train Anyone... Anywhere.

Chris Smith - Lead Software Developer/Technical Trainer
MindIQ Corporation
 
John C. Bollinger

David said:
Sharing is not BAD. Why? No support is required. If the driver supports it,
what's the big deal? The driver should be used to obtain as much
functionality as possible.

Drivers *don't* support it -- none of them -- and I explained why. As I
also explained, you can build up your own connection packaging to
support it, but then you're just in the same race on a different horse.
As long as you're on about the driver, why do you refuse to consider
using the connection pools that are distributed with the drivers?
 
David McDivitt

From: Lee Fesperman said:
Subject: Re: pooled connection myth
Date: Fri, 18 Mar 2005 00:53:25 GMT

Sorry about that; I was being lazy. The title is "Single connection, multiple threads,
one Statement per thread" in comp.lang.java.databases.


No, regular connection pooling will provide better performance. The connection cost is
the same because connection pooling doesn't reconnect. And the performance on operations
will be better because it won't be serialized and doesn't require autocommit. If you
refuse to believe that connection pooling does work, then try it and/or take a look at
the code of DBCP on SourceForge; it's open source. Repeatedly claiming it is a myth
doesn't advance your case.

Your simplistic testing does not prove it for the general case. A test that forces
serialization will prove the opposite.

What is "pooled" in connection pooling? The *open* driver object implementing
java.sql.Connection. Do we have to say that?

I've already pointed out that trans blocks are not the only state encapsulated by a
Connection, ignoring various active statements/resultsets. Do you want chapter and
verse? You can say they are not important to you, but no way can you say they are not
important to anyone.

Why do you keep repeating the same falsehoods when corrected? I'm beginning to lose
interest in your responses because of your intellectual dishonesty.

I will read that thread. The title sounds good. I will also get the code at
SourceForge and look at it.

With round robin connection sharing, if the previous user of a connection is
done with it by the time it's handed out again, that ends up being the same
as single ownership. Serialization would be a rare fallback.

Another point about connections not discussed is whether to instantiate new
connection objects from the same driver object. If it is not inefficient to
multi-thread a driver, neither is it inefficient to multi-thread a
connection. If serialization is done that would occur at the driver level
behind the connection. If it is said a connection should not be
multi-threaded because that can't be trusted, then neither can the driver,
and a new driver object should be instantiated for each connection as well.

I can see the advantage of pooling if stuff is really being pooled, which
means connections stay open. I will see about putting together a class to do
that either by writing one or using open source already present. In the
meantime, and until I get that tested, I will use a very simple connection
sharing class where callers do not close the connection.
 
David McDivitt

From: Lee Fesperman said:
Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 23:28:42 GMT

Again, thanks for inline posting. I just don't see that you have made a case that shared
connections are better, in general. Of course a decent connection pool will not create
and drop a connection for each request. That's the main purpose of pooling. It simply
does not cost more than your solution.

I think John covered the shared connection situation pretty well. Shared connections
have very limited use and require very special client coding which is hard to ensure is
correct. It could easily produce bugs ... the really bad kind that only show up under
special conditions. In the general case, the possibility of concurrent use of queries is
very real, and your solution prevents that use. The server is likely to provide improved
performance for that situation, if given a chance.

By not using a trans block, I guess you mean using autocommit. Serializing that too will
decrease performance. You also seem to be assuming that the server is never performing
any work except on behalf of web activity, not a valid assumption.

In addition, database servers are not oriented to providing good service for shared
connections. They're concerned about good service to multiple connections. Catering to
special use of individual connections adds complexity that just isn't worth it for
servers.

You are taking control from the app and db server on the premise that you can do it
better. I no tin so (quoting Rita Moreno in response to the question, "Is Canseco a man
with the public good on his mind?")

I would have to look at driver code, but I surmise serialization at the
connection level is no different than serialization done at the driver
level. If multiple connections are made from one driver object, or the
driver is multi-threaded, multi-threading a connection would be no
different. A test platform would be required to prove this out and see where
performance degrades.

Not using a trans block and not using autocommit is no big deal. Just don't
put in code. There. Problem solved. If these are needed a standalone
connection should be obtained. No biggie. There's no reason a separate
method cannot be present for that.

Multi-use of a connection is not an issue at the server because a thread
should be done with it by the time the same connection is given to another
thread. Pooling gives an advantage, if connections really stay open, because
monitoring of connections would be possible through open and close calls,
which would not be present with shared connections. This is not enough
reason to implement a connection pooling scheme, and another piece of
software, unless it is really needed.
 
David McDivitt

Subject: Re: pooled connection myth
Date: Thu, 17 Mar 2005 23:02:56 -0700



Let me explain this in a slightly different way. The interface that
JDBC applications use to speak to the database -- java.sql.Connection --
is an interface. It can have any number of possible implementations.
At the most basic level, JDBC drivers can provide a direct physical
connection, in which the Connection object represents a session with
the database (with authentication and what-have-you), and close() causes
that session to end.

However, that's *not* the only option. Connection pooling is actually
implemented by providing a different (you could say "fake" if you
wanted) implementation of the same interface. In this new pretend
connection, the Connection object has little to do with the actual login
session with the database. When you acquire the Connection object, a
new session is NOT started with the database; and when you close it, the
database session is NOT ended. The object is merely pretending to be a
database connection. Basically, you're sharing the connection in the
important senses of the word, but it's guaranteed that no one else is
using that connection while you've got it.

To confuse things further, you seem to have been looking at
ConnectionPoolDataSource (based on your reference to a
getPooledConnection method). That class does *not* implement a pooled
data source; it implements a data source with features that make it
easier to IMPLEMENT pooling. To use connection pooling, you would
typically use an implementation of the DataSource interface -- one which
implements pooling. Such a data source is typically provided and bound
into JNDI in a J2EE app server environment. It's also occasionally
provided as part of a JDBC driver, and there are implementations
available as third-party code.

So you may be right that your code doesn't implement connection pooling;
but you are not right that connection pooling can't work. It does work,
on a very wide scale and every single day. You may as well say that
cars can't be red; the evidence is obvious and plentiful to the
contrary.


I have no doubt that your scheme is fast, when it works. However, it
will occasionally fail; not just with transactions, but any time that
the state of the connection is modified. On the other hand, you could
do connection pooling, and it would be essentially as fast, handle
transactions and other connection state just fine, and would not require
programmers to learn your new set of restrictions. Why not do it that
way?


No, it will not fail. On what basis?

I like simplicity. I do not feel something should be implemented if it
cannot be understood and maintained. If I have pooling software which truly
leaves connections open, I will consider using it.

I do not doubt connection pooling works. But I also see many things that
claim to be connection pooling and actually are not.
 
David McDivitt

From: John C. Bollinger
Subject: Re: pooled connection myth
Date: Fri, 18 Mar 2005 09:47:22 -0500



Drivers *don't* support it -- none of them -- and I explained why. As I
also explained, you can build up your own connection packaging to
support it, but then you're just in the same race on a different horse.
As long as you're on about the driver, why do you refuse to consider
using the connection pools that are distributed with the drivers?

What you say makes no sense. Multiple connections can be made from a single
driver object. If a query did happen to get serialized in a connection, that
serialization would be done in the driver, anyway. If connections cannot be
shared then neither should driver objects be shared: one thread per
connection and one connection per driver object.
 
alin

I've been following this thread for quite a while now but I finally
decided it's time to say something. I'm a JDBC driver developer (I
might not be the best around, but I have seen and do understand the
inner workings of one).

David said:
If more functionality is present in the connection manager I want to see it.
I do not want to take it on faith.

This is so full of crap. So do you take the DB2 documentation on faith?
I highly doubt you have the complete DB2 sources to "see". Why don't
you move to MySQL then?

David said:
What you say here does not represent
enough value compared to increased complexity. Shared connections are good.
Pooled connections, are not good, since pooling itself does not accomplish
anything. The only way it can be of value is to intercept close and open
calls and reuse hot connections. This is not being done where I am.

If intercepting close and open calls is not being done where you are
it's because you don't have a connection pool implementation. You have
a connection sharing algorithm in place. That's indeed not good.

Pooling does accomplish a lot of things, such as restoring connection
state (auto commit, transaction isolation etc. -- I know, you are
relying on the fact that no one touches that until it will blow up in
your face), closing resources that the user forgot to close so that you
don't run out of memory because you end up with way too many open
Statements (the garbage collector is not going to save your ass here,
it's _not guaranteed_ to clean up all resources as soon as they become
unavailable) and checking that connections are still alive (I see
that's the one thing you are also doing). Additional connections are
only created when there's need for them, i.e. when there are more
threads requesting connections than connections available; you say this
never happens with your application; then why do you worry about it?

You say that there is little contention on your three connections. Why
do you need three then? Why can't you stick with just one? Have you
wondered what happens if a connection crashes when one thread does some
stupid stuff? Or if it hangs? And even with low connection contention,
do you realize it's very probable that at some point a Statement will
try to execute something immediately after another one has executed a
SELECT that returned a rather large ResultSet, forcing the driver to
cache all that ResultSet into memory?

Well, I guess all this doesn't really count when compared with your
doubt that connection pool managers actually do everything they
advertise. Good luck, then.

Alin,
The jTDS Project.
 
John C. Bollinger

David said:
What you say makes no sense. Multiple connections can be made from a single
driver object. If a query did happen to get serialized in a connection, that
serialization would be done in the driver, anyway. If connections cannot be
shared then neither should driver objects be shared: one thread per
connection and one connection per driver object.

Do not confuse a Driver object (which acts as a factory for Connections)
with a whole JDBC driver, which encompasses at least one implementation
of Driver, Connection, Statement, ResultSet, and several other
interfaces, along with supporting classes. Once a Driver has created a
Connection, it need not have anything to do with communication via that
Connection, so a Driver object's thread safety is unrelated to the
thread safety of the Connections it hands out. If you want to talk
about thread safety of the whole driver (lowercase 'd'), however, then
of course you have to include the Connection implementation(s) in the
discussion.

A key difference here between a Driver and a Connection is that Drivers
are stateless, whereas Connections are explicitly stateful. (The
Connection interface defines methods for manipulation of persistent
state.) Connections' statefulness is what makes them unsuitable for
concurrent use by multiple threads, even if they internally serialize
individual operations. This is much the same reason that the
synchronized collections (Vector, Hashtable,
Collections.synchronizedXXX(foo)) are not safe for concurrent use: the
synchronization / serialization ensures that the objects' internal state
remains consistent, but that's not sufficient to shield one thread from
the effects of other threads' manipulation of the objects.
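The synchronized-collection analogy can be made concrete. The sketch below is only an illustration: the class name CheckThenAct is made up, and a CyclicBarrier is used to force the unlucky interleaving so the outcome is repeatable. Each call on the list is atomic, yet the compound "add if absent" still breaks, much as one thread's setAutoCommit or commit on a shared Connection can land between another thread's statements.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CyclicBarrier;

// Hypothetical illustration; the barrier exists only to make the race deterministic.
class CheckThenAct {
    // Returns the final list size after two threads each run "add x if absent".
    static int run() throws InterruptedException {
        List<String> shared = Collections.synchronizedList(new ArrayList<>());
        CyclicBarrier bothChecked = new CyclicBarrier(2);
        Runnable addIfAbsent = () -> {
            try {
                if (!shared.contains("x")) {   // atomic on its own
                    bothChecked.await();       // wait until the other thread has also checked
                    shared.add("x");           // atomic on its own
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        };
        Thread a = new Thread(addIfAbsent);
        Thread b = new Thread(addIfAbsent);
        a.start();
        b.start();
        a.join();
        b.join();
        return shared.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run()); // prints 2: the "at most one x" invariant is broken
    }
}
```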

If you know what you can and cannot safely do, then yes, you can write
code that shares Connections among multiple threads and never have a
problem. There may or may not be performance consequences relative to
using a connection pool (in either direction, but probably not too
significant). The fact that you must know and abide by nontrivial
restrictions on the use of the shared Connections makes that approach
anything but transparent to Connection users, however, and the fact that
the approach provides fertile ground for concurrency bugs should strike
terror into your heart.

Alternatively, your JDBC driver likely comes with a connection pool
implementation that you can use out of the box. If it doesn't then you
can still get one from a third party and thereby not have to maintain
your own connection manager. Users of the Connections provided by such
a pool do not need to worry about any restrictions on their normal use
of the connections, and there are no special concurrency concerns
involved. A connection pool satisfies exactly the same performance
considerations that your shared connection mechanism does, and moreover
can provide resource management hooks that the shared connection
mechanism does not provide. Tell me again, then, what were the relative
advantages of shared connections?


You seem to be very determined to believe that your shared connection
plan is good, so I suspect that none of this has persuaded you. You are
free to believe whatever you want to believe on the topic, and I am
finished trying to tell you otherwise.
 
Virgil Green

Top posting corrected.

David said:
Thanks Virgil. Your criticism of three active shared connections as I
propose represents a pooling approach.

As I understand it so far, your "Connection Manager" is handing out one of
three connections in a round-robin fashion. I've assumed so far that the
clients of this "Manager" simply ask for a connection and get one of the
three. Is that correct? If so, that is a rudimentary connection pool.

David said:
The three shared connections
are not pooled, but shared.

Please explain to me the difference if my understanding as described above
is incorrect.

David said:
Round robin is used for load balancing
between them. If the collection of connections grows and shrinks
dynamically as you propose, then literally these are being created
and destroyed based on need and there is no advantage whatsoever to
having anything pooled.

You keep creating this fictional situation of a pool closing all connections
and then claiming that pooling is of no merit. If some class claims to be
pooling but closes all connections, then it isn't pooling. The failure of
that particular class to pool connections does not make connection pooling
itself any kind of myth or of no value.

Any good connection pool will begin with a configurable number of
connections or may start with no connections and then maintain a pool that
never drops below a threshold once that number of connections has been
reached. The growing and shrinking I'm talking about are the number of
connections that might be made over and above the minimum. Of course the
minimum number of connections in the pool would be maintained as open
connections at all times (once created).

David said:
The concept of pooling implies connections
are not destroyed, meaning no close or logoff ever happens at the
server for them, and new requests get a hot connection.

Correct.

David said:
The idea of
pooling is a waste of time.

Why do you say this when the statement immediately before described exactly
what you want in pooling and basically describes pooling correctly (minus a
lot of details about stale connections, etc.)?

David said:
What's needed is connection sharing. If
trans blocks are needed a standalone connection should be obtained
instead.

Sharing means serialization on that connection, as has been pointed out by
others in this thread. Transactions (or trans blocks as they seem to have
been dubbed in this thread) should simply be implemented by the client code
requesting a connection from the pool and then not releasing it until the
commit or rollback is requested. There is no conflict between transactions
and pooling, but there is conflict between transactions and sharing.
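As a rough sketch of that pattern, assuming the pool is exposed as a javax.sql.DataSource: the names PooledTx, Work, and inTransaction below are made up for illustration. The caller borrows one connection, keeps it for the whole transaction, and only hands it back after commit or rollback.

```java
import java.sql.Connection;
import java.sql.SQLException;
import javax.sql.DataSource;

// Hypothetical helper; not part of JDBC or of any pool library.
class PooledTx {
    interface Work {
        void run(Connection c) throws SQLException;
    }

    static void inTransaction(DataSource pool, Work work) throws SQLException {
        // Borrow one connection; with a pooling DataSource, close() returns it to the pool.
        try (Connection c = pool.getConnection()) {
            boolean previous = c.getAutoCommit();
            c.setAutoCommit(false);
            try {
                work.run(c);   // every statement of the transaction uses this one connection
                c.commit();
            } catch (SQLException | RuntimeException e) {
                c.rollback();
                throw e;
            } finally {
                // Leave the connection as it was found before releasing it.
                c.setAutoCommit(previous);
            }
        }
    }
}
```

Because nobody else holds the connection between getConnection() and close(), the transaction cannot be interleaved with another thread's statements. Many pools also reset auto-commit themselves on return, so the finally block may be redundant with a given pool.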

David said:
Yes, these are struts applications. The classes used are all user
written, or, pasted from something else and changed a bit. They are:

DataSourcePlugIn
Padbmt00DBConnectionManager
BaseDAO
PersistenceMap
PersistenceMapDAO
PersistenceMapFactory

getPooledConnection is in the BaseDAO class and has the following:

try {
    db = Padbmt00DBConnectionManager.getInstance();
    Connection conn = db.getConnection();
    // Probe the connection with a trivial DB2 query before handing it out.
    String query = "SELECT '1' FROM SYSIBM.SYSDUMMY1";
    Statement statement1 = conn.createStatement();
    ResultSet rs1 = statement1.executeQuery(query);
    rs1.close();
    statement1.close();
    return conn;
}
catch (com.ibm.ejs.cm.portability.StaleConnectionException e) {
    // Stale connection: ask the manager once more, this time without the probe.
    try {
        Connection conn =
            Padbmt00DBConnectionManager.getInstance().getConnection();
        return conn;
    }
    catch (Exception ee) {
        ee.printStackTrace();
        return null;
    }
}
catch (Exception e) {
    e.printStackTrace();
    return null;
}
}


So, who wrote baseDAO? The code above is the getPooledConnection() method?
This is homebrewed code? This is what you're complaining about when you say
that pooling is a myth? There's no pooling going on there unless it has been
hidden in db.getConnection(). Why aren't you beating up on the author of
this code rather than decrying the worth of connection pools? That's like me
putting marbles in a gumball machine, you buying a "gumball" and then you
claiming that gumballs are inedible.

David said:
The driver name COM.ibm.db2.jdbc.app.DB2Driver can be found in
datasource.xml.

One high priority project has 12 developers working as needed. Things
are so convoluted, people have recently been instantiating their own
driver and connection and bypassing the connection management.


I'm not much surprised if the code above is passing for "pooling" code in
your shop.
 
David McDivitt

From: (e-mail address removed)
Subject: Re: pooled connection myth
Date: 18 Mar 2005 07:55:46 -0800

You say some good things. Garbage collection is not an issue. Result sets
and the like go out of scope in the module where coded, not the connection.
Whatever connection scheme used will be used in house only, people will know
what it does, and they will not code adverse to it. I see the benefit of
getting a good connection manager that does real pooling. A lot of time and
investigation must go into this.

I appreciate the time you, John, Lee, and Chris took to answer my questions.
To solve immediate needs I will use a connection sharing technique, but will
see about getting pooling classes and testing some that really work.
 
