Advice requested on mapping a JTable to a large PostgreSQL table

  • Thread starter Joost Kraaijeveld
  • Start date

Joost Kraaijeveld

Hi,

I *must* (as in: I have no choice, and by customer demand it can't be
solved another way) display a large table (> 1,100,000 records) in a
JTable (or something that looks like a JTable to the customer).

The customer wants to be able to search the data and wants the focus (if
the search succeeds) to be on that record in the table.

As a JTable works with rows, I have written code that calculates the
row of the record in the query and maps any row request to an index in
the query.

To make it more clear (?;-):

1. Customer wants the record with 'some text' in column 'a'
2. I calculate the index:
select count(*) from table where a < 'some text'
3. I check if the index is in a local cache (an AbstractTableModel-derived
object that maintains a vector of cached records).
4. If not, I get 100 records before and after the requested record,
including the record itself, and put them in the local cache so that
JTable can call AbstractTableModel.getValueAt(row, column) without
needing to access the database.
5. I set the JTable selection to the index found in step 2, and as long
as a requested row is in my cache I have a snappy response.
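A minimal sketch of the cache in steps 3-5 (all names hypothetical; the loader lambda stands in for the JDBC fetch):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Sketch of steps 3-5: rows are fetched in a window of 100 before and
// after the requested index, so JTable can read them out of the cache
// without touching the database.
public class RowCache {
    public static final int WINDOW = 100; // rows fetched before and after
    private final Map<Integer, String[]> cache = new HashMap<>();
    private final IntFunction<String[]> loader;

    public RowCache(IntFunction<String[]> loader) { this.loader = loader; }

    // what AbstractTableModel.getValueAt(row, column) would read from
    public String[] getRow(int row) {
        if (!cache.containsKey(row)) {
            int from = Math.max(0, row - WINDOW);
            for (int i = from; i <= row + WINDOW; i++) {
                // here the loader is called per row; in practice the
                // whole window would come from one LIMIT/OFFSET query
                cache.put(i, loader.apply(i));
            }
        }
        return cache.get(row);
    }

    public int cachedRows() { return cache.size(); }
}
```

In the real model the loader would be a single SELECT with a row restriction rather than 201 single-row fetches.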

The trouble is that step 2 takes a lot of time (it seems proportional to
the size of the table).

I assume I am not the first person to try something like this. I want to
know whether my way of solving this problem is adequate, or whether
there are other ways of achieving my goal (which is, indeed, emulating
an ISAM database).

TIA

Joost
 

zero

Hi,

I *must* (as in: I have no choice, and by customer demand it can't be
solved another way) display a large table (> 1,100,000 records) in a
JTable (or something that looks like a JTable to the customer).

The customer wants to be able to search the data and wants the focus (if
the search succeeds) to be on that record in the table.

As a JTable works with rows, I have written code that calculates the
row of the record in the query and maps any row request to an index in
the query.

To make it more clear (?;-):

1. Customer wants the record with 'some text' in column 'a'
2. I calculate the index:
select count(*) from table where a < 'some text'

If I understand correctly, your bottleneck is in accessing the database -
which makes sense; it is, after all, I/O. Is it feasible to read the whole
table once and keep it in memory, using hashing techniques to retrieve the
correct row number?
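For what it's worth, the in-memory hashing described above would look something like this (a sketch assuming column 'a' is unique; names hypothetical):

```java
import java.util.HashMap;
import java.util.Map;

// zero's suggestion in miniature: load the table once, build a hash
// index from column value to row number, and look rows up in O(1).
public class HashRowIndex {
    private final Map<String, Integer> rowByValue = new HashMap<>();

    // called once per record while streaming the table into memory
    public void put(String columnA, int rowNumber) {
        rowByValue.put(columnA, rowNumber);
    }

    // returns null if the value is not present
    public Integer rowOf(String columnA) {
        return rowByValue.get(columnA);
    }
}
```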
 

Joost Kraaijeveld

zero said:
If I understand correctly, your bottleneck is in accessing the database -
which makes sense; it is, after all, I/O. Is it feasible to read the whole
table once and keep it in memory, using hashing techniques to retrieve the
correct row number?
Nope, loading the records takes too long (> 1,100,000 bean-based
ValueObjects with some relations).

Joost
 

pascal.lecointe

Joost Kraaijeveld wrote:
Nope, loading the records takes too long (> 1,100,000 bean-based
ValueObjects with some relations).

Joost

Have you created an index on the column that contains the value?
With an index, the query should be much faster.
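For reference, the index suggested here would look like this (table and column names assumed from the earlier query):

```sql
-- B-tree index on the searched column; lets the planner satisfy range
-- predicates such as  a < 'some text'  without a full table scan
CREATE INDEX my_table_a_idx ON my_table (a);
```

Whether the planner actually uses it can be checked with EXPLAIN on the count query.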
 

Richard Wheeldon

Joost said:
2. I calculate the index:
select count(*) from table where a < 'some text'
The trouble is that step 2 takes a lot of time (it is proportional to
the size of the table?).

Sounds like it's the database end that's causing you problems, not
the Java side. What's the exact SQL you use, and what indices exist
on the table?

Richard
 

Roedy Green

The trouble is that step 2 takes a lot of time (it is proportional to
the size of the table?).

What happens if you use an estimated interim count, e.g. the same as the
last time you did this query, get on with the display (hopefully using
just what you have in the cache), and do the count on a separate thread,
touching up afterwards?
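A sketch of that idea (names hypothetical; an ExecutorService stands in for whatever worker thread the real model would use):

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Report the last known count immediately so the table can be shown,
// and correct it when the exact count(*) arrives from a background thread.
public class DeferredCount {
    private volatile int rowCount; // what getRowCount() would report
    private final ExecutorService pool = Executors.newSingleThreadExecutor();

    public DeferredCount(int lastKnownCount) { rowCount = lastKnownCount; }

    public int rowCount() { return rowCount; }

    // Runs the slow exact count off the calling thread; in the real
    // model you would fire a table-changed event on the EDT afterwards.
    public void refresh(Callable<Integer> exactCount) {
        pool.submit(() -> { rowCount = exactCount.call(); return null; });
    }

    // For the sketch only: block until the background work is done.
    public void awaitQuiescence() {
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

In the real table model, the touch-up step would call fireTableDataChanged() on the event dispatch thread.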

You also have the problem of your cache records going stale. I
remember working on a team where they used a home brew database with a
scheme to notify clients of cache invalidations.
 

Joost Kraaijeveld

Richard said:
Sounds like it's the database end that's causing you problems, not
the java side. What's the exact sql you use and what indices exist
on the table ?
The SQL is correct and the database is doing what it is supposed to be
doing (according to the relevant PostgreSQL mailing lists), but I wonder
if the general pattern is correct: is the way I am doing it right? Maybe
there is a *smarter* way of doing the things I do.

Joost
 

pascal.lecointe

Joost Kraaijeveld wrote:
The SQL is correct and the database is doing what it is supposed to be
doing (according to the relevant PostgreSQL mailing lists), but I wonder
if the general pattern is correct: is the way I am doing it right? Maybe
there is a *smarter* way of doing the things I do.

Joost

You can also simplify the query to

SELECT * FROM MY_TABLE;

If the driver gives you a scrollable ResultSet, you can do

ResultSet rs = stmt.executeQuery(qry);
rs.absolute(100000); // this will go to row 100000 of the result,
// without stepping through the preceding rows one by one
 

Chris Uppal

Joost said:
but I wonder if the
general pattern is correct: is the way I am doing it correct? Maybe
there is a *smarter* way of doing the things I do.

I think the underlying problem is that you are abusing the SQL database. A SQL
"table" isn't a table in the same sense as say the concept underlying a JTable.
It doesn't consist of /numbered/ rows. SQL purists will tell you that the rows
are in fact unordered, and in practical terms if you push the database hard
enough (as you are doing) they are right.

What /is/ ordered is the set (or sequence) of results returned by one actual
query. But the DB has no way to tell what will be the 5000000-th row in that
sequence except by /counting/ the rows. (This is implementation-dependent of
course, but it's more likely to be true than not, and is true of PostgreSQL in
my -- limited -- experience).

So one basic, and probably incurable, problem you have is that finding the N-th
row is infeasible for large N. A possible workaround for that would be to
change the UI a bit and /start/ with the row of interest rather than trying to
capture a group of rows around it. In that case you would issue a "select
<whatever> where <whatever> >= ?" and use the next 200 rows (actually you'd
probably restrict the number of rows in the query itself, but I can't remember
the syntax offhand). An alternative workaround would be to include actual row
numbers as an indexed field in the database itself, but (a) that's a stupid
abuse of the data to support a broken UI, and (b) inserts and deletes would
become all-but-impossible.
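The "start with the row of interest" workaround can be sketched in miniature; the TreeSet below stands in for the B-tree index on the column (in PostgreSQL the row restriction would be LIMIT). All names are hypothetical:

```java
import java.util.List;
import java.util.TreeSet;
import java.util.stream.Collectors;

// Seek by value instead of by row number: fetch "the next N rows from
// 'some text' onward", which needs no counting at all. In SQL, roughly:
//   SELECT * FROM my_table WHERE a >= 'some text' ORDER BY a LIMIT 200;
public class SeekPager {
    private final TreeSet<String> index = new TreeSet<>();

    public void add(String value) { index.add(value); }

    // the next `limit` values from `from` (inclusive), in order
    public List<String> pageFrom(String from, int limit) {
        return index.tailSet(from, true).stream()
                    .limit(limit)
                    .collect(Collectors.toList());
    }
}
```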

Of course that will mean changing the UI that the customer has asked for. But
then you'll have to do that anyway. At least, you do if the idea is to provide
the illusion of a JTable which contains /all/ the 1000000 rows -- that is
simply Too Many for a scrolling interface (even if you could fix the
performance problems). If that /is/ what the customer is asking for then I
suggest you create a mock-up with dynamically-generated data filling a virtual
1000000-row table, and give it to the customer to try out.

BTW, we discussed approximately this issue in June/July; you might find it
helpful to review the thread entitled "How to populate a very large recordset
into JTable?".

-- chris
 

Silvio Bierman

Joost Kraaijeveld said:
Hi,

I *must* (as in: I have no choice, and by customer demand it can't be solved
another way) display a large table (> 1,100,000 records) in a JTable (or
something that looks like a JTable to the customer).

The customer wants to be able to search the data and wants the focus (if the
search succeeds) to be on that record in the table.

As a JTable works with rows, I have written code that calculates the row of
the record in the query and maps any row request to an index in the query.

To make it more clear (?;-):

1. Customer wants the record with 'some text' in column 'a'
2. I calculate the index:
select count(*) from table where a < 'some text'
3. I check if the index is in a local cache (an AbstractTableModel-derived
object that maintains a vector of cached records).
4. If not, I get 100 records before and after the requested record,
including the record itself, and put them in the local cache so that JTable
can call AbstractTableModel.getValueAt(row, column) without needing to
access the database.
5. I set the JTable selection to the index found in step 2, and as long as
a requested row is in my cache I have a snappy response.

The trouble is that step 2 takes a lot of time (it seems proportional to the
size of the table).

I assume I am not the first person to try something like this. I want to
know whether my way of solving this problem is adequate, or whether there
are other ways of achieving my goal (which is, indeed, emulating an ISAM
database).

TIA

Joost

Joost, if you are searching for (prefixes of) full column content, you can
search the records using

SELECT * FROM TABLE WHERE COLUMN_A < 'XXX' ORDER BY COLUMN_A DESC

and

SELECT * FROM TABLE WHERE COLUMN_A >= 'XXX' ORDER BY COLUMN_A

By setting both queries to max. 100 rows you could even keep the I/O down.
Remember that only on an indexed column will these perform swiftly;
otherwise a full table scan (twice in this case) will result.

This scheme can be extended with a primary key to guarantee a unique
ordering. In that case you could actually page up and down from the
top, the bottom, or any location.
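The extended scheme, in miniature (names hypothetical; the TreeSet stands in for the composite index, and the SQL comments show the assumed real queries, using row-value comparison):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// A total order on (columnA, primaryKey) lets you page down *and* up
// from any anchor row without ever computing a row number. In SQL:
//   down: SELECT * FROM t WHERE (a, id) >= (?, ?) ORDER BY a, id LIMIT 100
//   up:   SELECT * FROM t WHERE (a, id) <  (?, ?) ORDER BY a DESC, id DESC LIMIT 100
public class KeysetPager {
    private static final Comparator<String[]> ORDER =
        Comparator.comparing((String[] r) -> r[0])
                  .thenComparing((String[] r) -> r[1]);
    private final TreeSet<String[]> index = new TreeSet<>(ORDER);

    public void add(String a, String id) { index.add(new String[] { a, id }); }

    // next `limit` rows at or after the anchor, in ascending order
    public List<String[]> pageDown(String a, String id, int limit) {
        List<String[]> page = new ArrayList<>();
        for (String[] row : index.tailSet(new String[] { a, id }, true)) {
            if (page.size() == limit) break;
            page.add(row);
        }
        return page;
    }

    // `limit` rows strictly before the anchor, nearest first
    public List<String[]> pageUp(String a, String id, int limit) {
        List<String[]> page = new ArrayList<>();
        for (String[] row : index.headSet(new String[] { a, id }, false).descendingSet()) {
            if (page.size() == limit) break;
            page.add(row);
        }
        return page;
    }
}
```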

Oh, and you really need to have a serious chat with your customer...

Silvio Bierman
 
