Not using exceptions only for exceptional conditions

villo · Jun 28, 2006

Hi everybody

I always have agreed that exception shouldn't be used for flow control,
nevertheless I have incurred in a situation where breaking this
principle is probably good, for other reasons than design.

The situation is inserting a row in a table, extremely simple stuff. It
happens that the application is basically only doing this and the table
has ~10^7 records, so I have rewritten the application logic, that
would be correctly described as

ARecordObject r=...

if(aDao.find(r)!=null)
aDao.insert(r);
else
aDao.merge(r,anotherRecordObject);

in the form of

if(!aDao.insert(r))
aDao.merge(r,anotherRecordObject);

where insert catches DataIntegrityViolationException and returns true
iff an insert was made.

Now, I see that the first form is better for many reasons, but if the
latter performs faster should I really
bother?

Cheers
Francesco

lordy · Jun 28, 2006

Hi everybody

The situation is inserting a row in a table, extremely simple stuff. It
happens that the application is basically only doing this and the table
has ~10^7 records, so I have rewritten the application logic, that
would be correctly described as

ARecordObject r=...

if(aDao.find(r)!=null)
aDao.insert(r);
else
aDao.merge(r,anotherRecordObject);

in the form of

if(!aDao.insert(r))
aDao.merge(r,anotherRecordObject);

where insert catches DataIntegrityViolationException and returns true
iff an insert was made.

Now, I see that the first form is better for many reasons, but if the
latter performs faster should I really
bother?

For maximum performance you should push that logic into a stored
procedure on the database.

Failing that the latter method is good and relevant bits of the program
should document the 'flow'.
Have you also considered relative occurence of inserts() to merges() in
your application.
Depeneding on incomming data it may be quicker to try a merge() first.
(Presumable this would be an update that returns zero rows updated
rather than an exception)

Lordy

Rastislav Komara · Jun 28, 2006

villo said:
Hi everybody

I always have agreed that exception shouldn't be used for flow control,
nevertheless I have incurred in a situation where breaking this
principle is probably good, for other reasons than design.

The situation is inserting a row in a table, extremely simple stuff. It
happens that the application is basically only doing this and the table
has ~10^7 records, so I have rewritten the application logic, that
would be correctly described as

ARecordObject r=...

if(aDao.find(r)!=null)
aDao.insert(r);
else
aDao.merge(r,anotherRecordObject);

in the form of

if(!aDao.insert(r))
aDao.merge(r,anotherRecordObject);

where insert catches DataIntegrityViolationException and returns true
iff an insert was made.

Now, I see that the first form is better for many reasons, but if the
latter performs faster should I really
bother?

Cheers
Francesco

It is absolutly bad solution. You cannot achieve reasonable performance
using second way. Exceptions are realy realy slow for building stack
trace. If you do not want to use SQL test over DB use other way to
determine which operation is required. I cant recomend DB stored
porcedures. This is hard to write and bound application to special
Database infrastructure lovering portability and cacheing abilities.

Chris Uppal · Jun 29, 2006

villo said:
if(aDao.find(r)!=null)
aDao.insert(r);
else
aDao.merge(r,anotherRecordObject);

Now, I see that the first form is better for many reasons,

It isn't better unless you are running your database at such a high isolation
level that the failed find() will prevent anyone else adding a record which
/would/ have matched the conditions in the SELECT. Similarly, the find()
could succeed, but the row be removed before your merge() executed; but
forbidding that doesn't require such a high isolation level[*].

OTOH, the second form is not a lot better, since once the error has been
triggered (and /assuming/ you've managed to find a -- database dependent -- way
of recognising the scenario accurately), there is /still/ no guarantee that the
data won't change under you before your alternative code is executed, so you
need some sort of loop somewhere.

Speaking as a non-expert, I prefer the second form, despite that problem.

I'd much prefer a third form with an UPDATE_OR_INSERT atomic action -- but
SQL doesn't provide that[**] :-(

-- chris

[*] I gather from the PostgreSQL documentation, that that DB may be quite
likely to use an isolation level high enough for this, since the level needed
for the second to work (read repeatability) is apparently just a synonym for
the highest level "serializable".

[**] Unless, perhaps, one could cobble something together with select for
update -- I'm no expert, as I mentioned.

Chris Uppal · Jun 29, 2006

Rastislav said:
It is absolutly bad solution. You cannot achieve reasonable performance
using second way. Exceptions are realy realy slow for building stack
trace.

I find it /really/ difficult to believe that the cost of building, throwing,
and catching, an exception is even remotely comparable with the cost of a round
trip to a database. (And that's assuming that a select followed by an insert
runs in approximately the same time as a single insert -- which is probably
true when taking DB caching into account, but only at a far coarser granularity
than the time taken to throw an exception.)

-- chris

villo · Jun 29, 2006

Thanks for the feedback. Just a couple of remarks:
-I know that on the average the cost of building the stacktrace and
throwing the exception
is not comparable with a round trip to the db. I have to admit that I
haven't done any serious
benchmark in my case, which I'll probably do soon. But I have the
feeling that in my case
it can be slightly different (see below)
-Chris made a good point, actually my rdbms (hsql) supports only read
uncommitted, so it is possible that I have a race condition here.
Nevertheless this doesn't represent a problem at the application level:
in case a record has been inserted between find() and merge() I simply
catch the Duplicate...Exception again.
-Select for update could be a way but I'm not sure wheter hsqldb
supports it. And I wonder if I can mimic the merge() method with bare
sql...

The reason for the first point is that due to requirements I'm forced
to use the embedded file storage hsqldb provides. Now Inserting the
data is not a problem, but search time when the table size is in the
order of 10^7 tend to be unacceptable for the current scenario, where
the objective is to write as fast as I can.
Just to give an overview of what I'm doing: I have a thread pool making
axfr requests and writing some information to a single table. Each
record contains basically an IP and the ccTld it belongs to. Laterwards
the collected data will be merged to a separate database to make some
data analysis. This will be done in batches.

So to summarize I guess I will have to make some real benchmarks to see
which approach performs better; I expect that as long as the db is
small the first will be faster, while the second should catch up as the
size increases...Which is the pointcut it is totally unclear right
now

cheers
Francesco

Chris Uppal · Jun 30, 2006

villo said:
The reason for the first point is that due to requirements I'm forced
to use the embedded file storage hsqldb provides. Now Inserting the
data is not a problem, but search time when the table size is in the
order of 10^7 tend to be unacceptable for the current scenario, where
the objective is to write as fast as I can.

Note that inserting into a table requires the logical equivalent of a search
anyway, since rows are unique. If you are seeing a significant difference
between the time taken to do an insert (of a record you know is not already
present), and the time take to search for that record (and not find it), then
it sounds as if something is very wrong -- and presumably fixable ;-)

You might get better performance by indexing the table (only) on an otherwise
meaningless incrementing integer and/or timestamp. The idea is to make writing
this table as near as possible like writing an ordinary sequential file.

-- chris

Simple Login programming: Authentication does not work for me!	1	Nov 28, 2007
For Peer Review	1	Apr 2, 2010
Errata for The C Programming Language, Second Edition, by Brian Kernighanand Dennis Ritchie	4	May 16, 2009
word_set = set() def should_preceed_with_an(phrase): first_word =	1	Jan 26, 2013
Why C Is Not My Favourite Programming Language	132	Feb 5, 2005
Download the JAVA , .NET and SQL Server interview PDF	0	Sep 17, 2006
Download the JAVA , .NET and SQL Server interview with answers	0	Sep 14, 2006
comp.lang.c Answers to Frequently Asked Questions (FAQ List)	15	Apr 1, 2006

Not using exceptions only for exceptional conditions

villo

lordy

Rastislav Komara

Chris Uppal

Chris Uppal

villo

Chris Uppal

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads