Database & 'autonumber' in multithreaded environment

J

Joerg Gippert

Hello all!

I´m working on a webapplication that makes use of a database. I used to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique. So, I´m just thinking of how to
provide my own unique primary keys. The book mentioned above doesn´t
really say how to create your own unique keys. How sophisticated do
those functions have to be, in order to create unique keys? The more
sophisticated it gets, the more the server has to compute, which could
be a problem when there´s a huge load put on the server, no? And how
relieable are those "autonumber" features in RDBMS today (Postgres and
MySQL to be specific, can´t afford Oracle on my server :' ( ... ). Does
anyone has some information or links on that topic?

Any hint is greatly appreciated!

Regards,
Joerg!
 
M

Michael Borgwardt

Joerg said:
Iï½´m working on a webapplication that makes use of a database. Iused to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique.

As long as you use only a single DB host and understand the keys to be unique
only on a per-table basis, there will be no problem.

However, the feature is not standardized across DB systems and thus
very portable.

So, Iï½´m just thinking of how to
provide my own unique primary keys. The book mentioned above doesnï½´t
really say how to create your own unique keys. How sophisticated do
those functions have to be, in order to create unique keys?

The problem has nothing to do with how simple or sophisticated the
"function" is, a simple increment is perfectly sufficient. However,
if you'd want the ID numbers to be unique across multiple servers
or even just multiple tables, you'd incur synchronization overhead
that could, depending on the circumstances, absolutely kill
performance. There is *no* way around that except to settle
for "probably unique" IDs or IDs that are unique only within their
local context.
 
T

Tim Ward

Joerg Gippert said:
I´m working on a webapplication that makes use of a database. I used to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique.

How odd. I wonder what Java does to mess this up then, it's standard
recommended practice in other languages.

One point is that the underlying databases do not do this in a standard way,
because for some inexplicable reason (anyone on the original standards
committee who would like to explain?) this clearly essential feature got
omitted from standard SQL. Being clearly essential, RDBMS vendors added the
feature anyway, but given the lack of a standard each did it in his own way
(although some are now accepting their competitors' syntax as well as their
own, which is helpful).

Whether the Java database inteface layers know about all these non-standard
methods and package them up in a standard Java API I don't know, because
I've never used them, but I would rather hope they did (unlike ODBC).
 
V

VisionSet

Joerg Gippert said:
Hello all!

I´m working on a webapplication that makes use of a database. I used to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique. So, I´m just thinking of how to
provide my own unique primary keys. The book mentioned above doesn´t
really say how to create your own unique keys. How sophisticated do
those functions have to be, in order to create unique keys? The more
sophisticated it gets, the more the server has to compute, which could
be a problem when there´s a huge load put on the server, no? And how
relieable are those "autonumber" features in RDBMS today (Postgres and
MySQL to be specific, can´t afford Oracle on my server :' ( ... ). Does
anyone has some information or links on that topic?

Do a search on 'DAO pattern' for the sun J2EE discussion.
If multiple database implementations are important, you probably need to do
something like this. Then you *can* use the DB autonumber feature, but need
to reimplement it for each DB.
If you know you will only need to use one DB then this isn't an issue.
 
B

Bruce Lewis

Joerg Gippert said:
Hello all!

I´m working on a webapplication that makes use of a database. I used to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique.

The postgresql documentation of the currval() function claims the number
will be not only unique (should always be guaranteed by autoincrement
systems), but also correct for the current thread, i.e. you won't get
some other transaction's currval() when you ask for your own.

If your data doesn't have a natural primary key and you want to avoid
the locking that happens with sequences and autoincrement, check your
database's documentation as to whether CURRENT_TIMESTAMP is guaranteed
to be different for concurrent transactions. If so, you could use a
timestamp column as your primary key.

I didn't see any such guarantee in the postgresql docs, but AFAIK it
handles locking of sequences pretty well.
 
J

Joerg Gippert

Thanks to all that replied to my post. Now, some things cleared up. Just
got worried about that article´s headline "Don´t rely on built-in number
generation". I only use one database (Postgres) and the application is
only running on one server (Tomcat). With "multithreaded" I meant, that
there is concurrent read/write access to the database due the
webapplication can be accessed my multiplpe clients of course. I also do
not plan to use EJB in it either, neither do I really want to make a
portable JDBC app. By knowing all this, I should be save to use the
auto-increment functions, right?

Thanks again for clearing this. And congratulations to Mike for passing
the SCJP ;) I´m going for the exam by the end of October ;)

Bye,
Joerg
 
V

VisionSet

Thanks again for clearing this. And congratulations to Mike for passing
the SCJP ;) I´m going for the exam by the end of October ;)

Gee, thanks! And good luck!!

Threads & Inner classes are the toughie, for me at any rate, still 88%
wasn't bad. And in fact I more or less dropped 1 question in each of the
objectives, guess that makes me a real balanced programmer ;-)
 
R

Roedy Green

However,
if you'd want the ID numbers to be unique across multiple servers
or even just multiple tables, you'd incur synchronization overhead
that could, depending on the circumstances, absolutely kill
performance. There is *no* way around that except to settle
for "probably unique" IDs or IDs that are unique only within their
local context.

I learned this way back in the 70s with primitive databases that did
not support sequential files. The overhead of finding the next record
number was more than writing the record. There is a lot of overhead to
incrementing a counter -- all those before/after looks, logging etc.
Database writes have to do a lot of physical I/O to flush to prepare
for the eventuality of a crash.

Presumably any autoincrement technique done by the SQL engine itself
will be much more efficient.

I wrote an essay on how to generate unique numbers efficiently
with little disk overhead. It works with a single source or multiple
loosely coupled servers that crash frequently.

see http://mindprod.com/projuniquenumber.html
 
R

Roedy Green

One point is that the underlying databases do not do this in a standard way,
because for some inexplicable reason (anyone on the original standards
committee who would like to explain?)

I was not on that committee, but in general the problem is this:

1. vendors with small databases don't want to have to write extra
features. This is expensive for them, and means they can't stay light
and mean.

2. vendors with big database don't want the feature included as
standard, because having their proprietary version gives them bragging
rights. It also helps lock customers in to their particular way of
doing it.

The USERS of databases want standardisation. However, they are not
well represented on committees.
 
J

John C. Bollinger

Joerg said:
Hello all!

I´m working on a webapplication that makes use of a database. I used to
make use of the "autonumber" or "autoincrement" feature of the
underlying database system. However, I read in a Java "best practices"
book, that one should not rely on those features and the number
generated, could possibly *not* unique.

Well, that's really a function of the relevant DBMS, DB schema, and DB
usage scenarios. If you have an autonumbered PK or other uniquely
indexed column then you should expect to get an SQLException in one
thread or another if a duplicate is ever accidentally generated. If
transactions are kept short then the likelihood of such a problem is small.
So, I´m just thinking of how to
provide my own unique primary keys. The book mentioned above doesn´t
really say how to create your own unique keys. How sophisticated do
those functions have to be, in order to create unique keys?

Simple, limited scalability, and [very] dirty (for numeric keys only):
generate random keys until you find one that is unused.

If the web application was full control of the DB then you could set up
an object that provides a monotonic sequence of numbers to the rest of
the app for use as keys; the class for such an object should probably be
a singleton lest there be more than one independant sequence, and there
needs to be a means to persist the current state of the sequence or to
otherwise pick up again if the app is restarted.

I don't know any other approach without a dependency on the DBMS that
gives it at least as many problems as an autoincrement column has.


John Bollinger
(e-mail address removed)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,768
Messages
2,569,575
Members
45,053
Latest member
billing-software

Latest Threads

Top