Improving website responsiveness.

Rico

We have a JSP-based site, and while reading its code I came across some
gems that query the database and store all the relevant results in
Vectors. Then, for each result in a given Vector, the author chose to make
one more query to the db to get, e.g., the corresponding path.

The site feels horribly slow and I thought that the above was the cause
since it's so highly inefficient. So I set to task to store this table
from which we tap corresponding paths for each document in a Hashtable
first and then just loop through the document ID's to extract the matching
Hashtable values.

I was all excited about how much gain it would bring and pictured the
site to be a lot more slick... I was disappointed... I notice hardly any
difference :-(

Was that to be expected?
Could it be that what they tell us about how costly db connections are is
exaggerated? Or is the "optimization" I tried to make, thinking it's
actually the 'normal' way it should have been done, not much of an
optimization? :(

Rico.
 
Will Hartung

Rico said:
The site feels horribly slow and I thought that the above was the cause
since it's so highly inefficient. So I set to task to store this table
from which we tap corresponding paths for each document in a Hashtable
first and then just loop through the document ID's to extract the matching
Hashtable values.

I was all excited about how much gain it would bring and pictured the
site to be a lot more slick... I was disappointed... I notice hardly any
difference :-(

Was that to be expected?

What it shows is what happens when you try to fix a performance problem
without actually knowing what the performance problem is besides "it's
slow". This goes back to the sin of early optimization, which is basically
the same issue.

You need to step back and perform some analysis to get a better idea of
what's really slowing your site down. To quote Apollo 13: "Work the problem.
Don't make it worse by guessing!"

Even wrapping some of your slow pages in a pair of "System.out.println((new
java.util.Date()).getTime());" calls, one at the top and one at the bottom,
will tell you how much time is actually spent within the page and not within
the container or on the network.

It's a crude but effective tool to give you a better idea where things are
slow in your code.
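
A minimal sketch of that kind of crude timing, assuming the slow work can be wrapped in a single call. `PageTimer`, `renderPage` and `timeMillis` are hypothetical names, and the sleep only stands in for real page work. `System.currentTimeMillis()` gives the same millisecond timestamp as `new java.util.Date().getTime()`.

```java
class PageTimer {
    // Hypothetical stand-in for the work a slow JSP page does.
    static void renderPage() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Runs the task and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.currentTimeMillis();
        task.run();
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        long elapsed = timeMillis(PageTimer::renderPage);
        System.out.println("page body took " + elapsed + " ms");
    }
}
```

If the number printed here is small while the browser still waits a long time, the delay is more likely in the container or on the network than in the page code.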

Regards,

Will Hartung
 
Rico

All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

The answer, after I took a break: Hell Yes!!

I timed it: it's slower by a factor of 65 !
A full solid minute compared to one second.
Now, why is the site still so slow then? hummm...



Rico.
 
Bryce

All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

Well, yes, because #1 above makes 3000 SQL queries, whereas #2 makes only
one. #2 lets the database do what it does best.
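
The difference in query count can be sketched without a real database. This is only an illustration under stated assumptions: `FakeTable` is a hypothetical in-memory stand-in that counts simulated round trips, and the `/docs/` paths are made up. It does not measure time; in a real system each round trip adds network and parsing overhead on top of the lookup itself.

```java
import java.util.HashMap;
import java.util.Map;

class RoundTripDemo {
    // Hypothetical in-memory stand-in for the database table (item -> path).
    static final class FakeTable {
        private final String[] paths;
        int queries = 0; // number of simulated round trips so far

        FakeTable(int size) {
            paths = new String[size + 1];
            for (int item = 1; item <= size; item++) {
                paths[item] = "/docs/" + item;
            }
        }

        // Simulates "select * from table where item = x": one round trip per call.
        String selectPath(int item) {
            queries++;
            return paths[item];
        }

        // Simulates "select * from table": one round trip returning every row.
        Map<Integer, String> selectAll() {
            queries++;
            Map<Integer, String> all = new HashMap<>();
            for (int item = 1; item < paths.length; item++) {
                all.put(item, paths[item]);
            }
            return all;
        }
    }

    // Approach #1: one query per item. Returns the number of round trips used.
    static int perItem(FakeTable table, int n) {
        for (int x = 1; x <= n; x++) {
            table.selectPath(x);
        }
        return table.queries;
    }

    // Approach #2: one query up front, then in-memory lookups.
    static int bulk(FakeTable table, int n) {
        Map<Integer, String> cache = table.selectAll();
        for (int x = 1; x <= n; x++) {
            cache.get(x);
        }
        return table.queries;
    }

    public static void main(String[] args) {
        System.out.println("per-item round trips: " + perItem(new FakeTable(3000), 3000)); // 3000
        System.out.println("bulk round trips: " + bulk(new FakeTable(3000), 3000));         // 1
    }
}
```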
 
kaeli

All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

The answer, after I took a break: Hell Yes!!

I timed it: it's slower by a factor of 65 !
A full solid minute compared to one second.
Now, why is the site still so slow then? hummm...

Plus, if you want to do things with the result and re-loop through results
and whatnot, it's a much nicer and quicker way than doing the query over and
over...
It also allows you to get the same field multiple times, which the ResultSet
documentation advises against: for maximum portability, each column should be
read only once, in left-to-right order. (You can do it in the sense that the
compiler won't stop you, but it's unpredictable and can fail utterly.)
It also allows you to pass the data around after the resultset or connection
is closed, which for me is a great plus. I was losing track of open
connections.
It also allows you to implement a backwards/forwards "resultset", which some
drivers or DBMS have some problems with.
I wrote my queries to be able to return hashes or resultsets as desired long
ago for these reasons.
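
One way to get such "hashes" is to copy each row into a map before closing the ResultSet. The sketch below is an assumption about how this could be done, not kaeli's actual code; `RowCopier` and `toRows` are hypothetical names. It reads every column exactly once, left to right, and uses only standard `java.sql` calls.

```java
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowCopier {
    // Copies every remaining row into a list of column-label -> value maps.
    // The returned list stays usable after the ResultSet and Connection are closed.
    static List<Map<String, Object>> toRows(ResultSet rs) throws SQLException {
        ResultSetMetaData meta = rs.getMetaData();
        int columnCount = meta.getColumnCount();
        List<Map<String, Object>> rows = new ArrayList<>();
        while (rs.next()) {
            Map<String, Object> row = new LinkedHashMap<>();
            for (int i = 1; i <= columnCount; i++) { // JDBC columns are 1-based
                row.put(meta.getColumnLabel(i), rs.getObject(i));
            }
            rows.add(row);
        }
        return rows;
    }
}
```

Once the rows are in a list, the same field can be read as often as needed and the list can be walked forwards or backwards, at the cost of holding the whole result in memory.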

As to why the site is slow, it may be server settings. Test servers are set
to check all the pages upon request to see if they've changed so they can be
recompiled. The production server should NOT have this option turned on. It
can slow things down a lot.

There are a lot of things that could be slowing it down. Maybe the JSP
instantiates too many objects and the server doesn't have good memory
management, meaning it has to write to disk a lot. It could be an issue in
the time it takes to send a ton of data out from the server to the client.

Try these things:

View the source of one of the pages that was slow and copy/paste it into a
new document (as is) and save it over to the server. Then request it. Does it
take the same amount of time to get the static version as the JSP version? If
so, this has nothing at all to do with Java.

If the static version came fast, but the JSP was slow, start breaking down
the JSP code into sections and testing them and see which sections are really
taking a lot of time. Keep going to find the bottleneck.

HTH

--
 
John B. Matthews

Rico said:
All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

The answer, after I took a break: Hell Yes!!

I timed it: it's slower by a factor of 65 !
A full solid minute compared to one second.
Now, why is the site still so slow then? hummm...

Rico.

Also, after adding the optimization, be sure to remove the old
query! (Not that _I've_ ever forgotten:)
 
David Hilsee

Rico said:
All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

The answer, after I took a break: Hell Yes!!

I timed it: it's slower by a factor of 65 !
A full solid minute compared to one second.
Now, why is the site still so slow then? hummm...

As Will Hartung indicated, it is best to use metrics. If you can run
something like a profiler that will allow you to determine what exactly is
taking a long time to execute, then you can fix the problem. Don't dive
into the code optimizing sections that have not been identified as
bottlenecks and expect a significant improvement, because that tends to be a
gamble.
 
Tor Iver Wilhelmsen

Rico said:
for x=1 to 3000
select * from table where item = 'x'

If you use Statement, that will be horrendously slow. Use a
PreparedStatement for select * from table where item = ? (no quotes
around the ?, or it becomes a string literal instead of a parameter), and
setInt(1, x);
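
A sketch of that loop with a PreparedStatement, assuming an already open `Connection`. The table name `documents` and the column `path` are hypothetical, as are `PathLookup` and `lookupPaths`. The placeholder is a bare `?`; a quoted `'?'` would be an ordinary string literal and `setInt` would have nothing to bind to.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.HashMap;
import java.util.Map;

class PathLookup {
    // Hypothetical table and column names; note the unquoted placeholder.
    static final String SQL = "SELECT path FROM documents WHERE item = ?";

    // Prepares the statement once and re-executes it for each item in [first, last].
    static Map<Integer, String> lookupPaths(Connection conn, int first, int last)
            throws SQLException {
        Map<Integer, String> paths = new HashMap<>();
        try (PreparedStatement ps = conn.prepareStatement(SQL)) {
            for (int x = first; x <= last; x++) {
                ps.setInt(1, x); // bind the first (and only) parameter
                try (ResultSet rs = ps.executeQuery()) {
                    if (rs.next()) {
                        paths.put(x, rs.getString("path"));
                    }
                }
            }
        }
        return paths;
    }
}
```

This still makes one round trip per item, so it reduces the per-query preparation cost but not the number of queries.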
 
ras_nas

Will Hartung said:
What it shows is what happens when you try to fix a performance problem
without actually knowing what the performance problem is besides "it's
slow". This goes back to the sin of early optimization, which is basically
the same issue.
You need to step back and perform some analysis to get a better idea of
what's really slowing your site down. To quote Apollo 13: "Work the problem.
Don't make it worse by guessing!"

Thanks for the input Will.
But "It is unreasonable to make a 1000 requests on the db only to get just one
string of path each time" is a far cry from "it's slow". A no-brainer really.

Further, when one function turns out to run 65 times slower than my new
version, I am pretty sure that my guess was a pretty good one.

It's just a matter then of locating all the _variants_ of such a function
whose sin is duplicated all over the place: for every little piece of info
I need, ask the db - Screw computer science classes and all those stuffs about
principle of locality and what not.

I'll keep you Guys updated how things turn out after I've fixed that mess.

Regards,
Rico.
 
Bruce Lewis

Rico said:
All right. The post proper is:

Would this:
for x=1 to 3000
select * from table where item = 'x'

be a lot slower than:
write result of (select * from table) into Hashtable
for x=1 to 3000
hashtable.get(x)

The answer, after I took a break: Hell Yes!!

Are you really planning on doing something with each row in that order?

If so, why store 3000 objects in a data structure? What you really want
is:

select exactly those columns you want in case the table gets altered
from table
where item between 1 and 3000
order by item

Then process each row.

And while we're prematurely optimizing, note that if you know your keys
are going to be consecutive integers, an array will be faster than a
hash table.
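
For illustration, a sketch of the array variant, assuming the keys really are the consecutive integers 1 to n. `ConsecutiveKeys`, `buildPaths` and the `/docs/` path format are hypothetical. The item number is used directly as the index, so no hashing or boxing is involved.

```java
class ConsecutiveKeys {
    // Builds a path table indexed directly by item number (1..n); slot 0 is unused.
    static String[] buildPaths(int n) {
        String[] paths = new String[n + 1];
        for (int item = 1; item <= n; item++) {
            paths[item] = "/docs/" + item; // hypothetical path format
        }
        return paths;
    }

    public static void main(String[] args) {
        String[] paths = buildPaths(3000);
        System.out.println(paths[42]); // prints "/docs/42"
    }
}
```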

Good luck finding the real reason why the site is slow. I think your
suspicion that there are other SQL-clueless factors to the slowness is
worth pursuing, but you do have to measure.
 
Rico

Are you really planning on doing something with each row in that order?

No. I only wanted to express the idea that there's a big loop going on
without going into redundant details.

And while we're prematurely optimizing, note that if you know your keys

How can you tell that "we're prematurely optimizing" ?
I actually think that I am rewriting some parts of the code the way they
should have been written in the first place. That's not optimization.

Rico.
 
Will Hartung

ras_nas said:
Thanks for the input Will.
But "It is unreasonable to make a 1000 requests on the db only to get just one
string of path each time" is a far cry from "it's slow". A no-brainer
really.

Yes, but if in fact this isn't being done all the time but only once in a
blue moon, it becomes less of an issue, doesn't it?

Further, when one function turns out to run 65 times slower than my new
version, I am pretty sure that my guess was a pretty good one.

It was my understanding that you tried to fix "the problem" and "it was
still slow". That was my point. Rather than instrumenting your code and
finding where it was ACTUALLY slow based on your use cases, you found
something that "looked" slow, changed it, and it didn't appear to have the
effect you anticipated.

It's just a matter then of locating all the _variants_ of such a function
whose sin is duplicated all over the place: for every little piece of info
I need, ask the db - Screw computer science classes and all those stuffs about
principle of locality and what not.

This is a tuning problem, not necessarily a design problem. This goes back
to early optimization. Why construct a framework to speed up getting
information, particularly early in development, when you don't necessarily
know which information is needed more often than others, or which
information is particularly difficult to get compared to others?

If your code were factored sufficiently, all of those DB calls would already
be wrapped (say, in a Data Access Object). You'd then simply change the
wrapper on the select few calls that your metrics show are the offenders
and cache them (or whatever), and not worry about the rest.

It's always better to have a functional system that you can test and tune
than a "fast" system that doesn't work. Through instrumentation, you can
find the critical paths through your system that really need optimization.

Our system has boatloads of "slow code", but we've tuned out and fixed the
most gross offenders and we meet our performance numbers. When those demands
change, we'll do the tuning process again; meanwhile we can focus on new
functionality that the users really want. It's an incremental process.

For our instrumentation, we rely on the vast amounts of DEBUG level logging
we have in the system anyway, an awk script to digest the log, and a SQL
logging feature we implemented to point out particularly slow queries. We
tried the different analysis tools, optimizers, and whatnot, but they never
helped enough for us to justify what they cost. We run
the SQL log in production to point out whatever horrors the users throw at
our DB.
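
The post does not show how that SQL logging feature works, so the following is only a guess at the general shape: time each query and record the ones that take at least a threshold. `SlowQueryLog` and `timed` are hypothetical names, and this sketch keeps entries in memory and is not thread-safe.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

class SlowQueryLog {
    private final long thresholdMillis;
    private final List<String> entries = new ArrayList<>();

    SlowQueryLog(long thresholdMillis) {
        this.thresholdMillis = thresholdMillis;
    }

    // Runs the query and records its SQL text if it took at least the threshold.
    <T> T timed(String sql, Supplier<T> query) {
        long start = System.nanoTime();
        try {
            return query.get();
        } finally {
            long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
            if (elapsedMillis >= thresholdMillis) {
                entries.add(elapsedMillis + " ms: " + sql);
            }
        }
    }

    List<String> entries() {
        return entries;
    }
}
```

A call site would wrap the actual database call in the supplier, for example `log.timed(sql, () -> runQuery(sql))`, where `runQuery` is whatever method already executes the statement.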
I'll keep you Guys updated how things turn out after I've fixed that mess.

Good luck! Let us know!

Regards,

Will Hartung
 
Will Hartung

Rico said:
How can you tell that "we're prematurely optimizing" ?
I actually think that I am rewriting some parts of the code the way they
should have been written in the first place. That's not optimization.

I posted earlier, but...

If the code you're changing doesn't end up having a net effect on the site,
then it's premature optimization because if it has no net effect, then it's
not really "broken". It may be stylistically ugly, but if the goal is site
performance, it's better to spend time on that factor alone and clean up the
code for the sake of clean code at a later time. Cleaner code may well be
fast code, but if the slow code is functional and not on the critical path,
don't worry about it at this juncture.

Make it work, then make it fast.

Measure to find your real bottlenecks, then fix them one by one.
Inevitably, you may well be led to your ugly code. But, then again, you may
not. Don't fix it until you know.

Regards,

Will Hartung
 
Rico

Good luck! Let us know!

So, what do you think happened Will? :)

The page I worked on is the very first page that everyone accesses after
the login. Doesn't sound like 'once in a blue moon', right?

And it's not "early in development" here. In that JSP page I had
identified a couple of functions that were part of the problem.

It is my understanding that just because I didn't notice a difference you
think it means that I modified the wrong method because it's not "ACTUALLY
slow" there. Did you consider that it's not "ACTUALLY slow" _only_ there?

After modifying another 4 or 5 such similar methods, I achieved what I
wanted.

Rico.
 
marcus

Rico,
I don't have a clue how to solve the various problems you mentioned (bad
code, slow response, whatever), but if I did I would not have posted a
reply. Your attitude is both defensive and insulting. I can only
imagine others who may have had insight similarly failing to respond.

I can hear you now -- "I ended up solving it myself anyway, so F* you all."

-- clh
 
Rico

Rico,
I don't have a clue how to solve the various problems you mentioned (bad
code, slow response, whatever), but if I did I would not have posted a
reply. Your attitude is both defensive and insulting. I can only
imagine others who may have had insight similarly failing to respond.

I can only imagine that you have the right to your opinion. I respect that.

I can hear you now -- "I ended up solving it myself anyway, so F* you all."

My opinion is that you hear wrongly.

Care for a cup of Java? Here... (C)(C) Cheers!

Rico.
 
