OptimizeIt: how to work around / minimize wrong CPU time in Object.wait()


NOBODY

Hi,

OptimizeIt reports imprecise CPU time in Object.wait()
even when you select
"record only CPU usage". (Also for stream reads,
Thread.sleep, etc...)

(partially explained in
http://groups.google.com/[email protected]&rnum=1
)


Is there any way to work around or minimize that effect?


Whether I use sampling or instrumentation mode, the reported
thread time share (%) is off by a large fraction.

Changing sampling interval didn't help either.

I'm stuck! It feels like this $800 piece of software is useless!
(OptimizeIt Profiler 4.2 Enterprise Edition, build 020502,
in case you know anything about that version.)

Thanks.
 

Robert Klemme

NOBODY said:
Hi,

OptimizeIt reports imprecise CPU time in Object.wait()

Times for Object.wait() are quite uninteresting IMHO because you can't
change them anyway. Also, Object.wait() does not do much - most of the time
spent in this method, a thread is stopped because it waits for
Object.notify() (or notifyAll()). The overhead of recording this thread's
status etc. is negligible IMHO.
even when you select
"record only CPU usage". (Also for stream reads,
Thread.sleep, etc...)

(partially explained in
http://groups.google.com/[email protected]&rnum=1
)


Is there any way to work around or minimize that effect?


Whether I use sampling or instrumentation mode, the reported
thread time share (%) is off by a large fraction.

How did you determine that it's that far off?
Changing sampling interval didn't help either.

I'm stuck! It feels like this $800 piece of software is useless!
(OptimizeIt Profiler 4.2 Enterprise Edition, build 020502,
in case you know anything about that version.)

I find OptimizeIt very useful. Maybe you're just using it the wrong way. What
exactly is it that you want to determine?

Kind regards

robert
 

NOBODY

Times for Object.wait() are quite uninteresting IMHO because you can't
change them anyway. Also, Object.wait() does not do much - most of the
time spent in this method, a thread is stopped because it waits for
Object.notify() (or notifyAll()). The overhead of recording this
thread's status etc. is negligible IMHO.

IMHO, you are not on the right track... ;-)
Let me explain again.
I know 'wait()' is uninteresting. And I expect it to disappear from
the results too! But it doesn't, in a very annoying way...

The fact is (if you had read the link I sent), you would have
understood that OptimizeIt may report false non-zero CPU time for a
thread that woke up, did some work (JVMPI thread events flag it as
not blocked) and went back into wait() (or sleep, or an I/O read,
etc...) between two profiler samples (let's say 5 ms apart).

OI will report that this method (known to be blocking only to humans, not
to the profiler) actually did some work, as per the JVMPI events. Even if
the work was only 0.1 ms of a 5 ms time window, OI will report that the
method, here the wait(), worked for 5 ms.

Basically, if I got Will McQueen right, OI is only told when a
thread is blocking/unblocking; it does timestamp those JVMPI events to
sum up CPU time, but doesn't save it per method/line at a lower
granularity than the stack frame seen by the coarser sampler (5 ms). It
only raises an activity flag and assigns the time to that method/line.
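
To make the scenario concrete, here is a minimal sketch (my own illustration, not anybody's actual code) of the kind of loop that triggers it: the thread leaves wait(), works for a fraction of a millisecond, and is parked in wait() again before the next sample fires.

    // Hypothetical worker showing the misattribution pattern: the thread
    // unblocks, does ~0.1 ms of work and blocks again, all inside one
    // 5 ms sampling interval, so the sampler almost always sees wait().
    class Worker implements Runnable {
        private final Object lock = new Object();
        private boolean hasTask = false;

        public void run() {
            synchronized (lock) {
                try {
                    while (true) {
                        while (!hasTask) {
                            lock.wait();   // the stack is parked here at nearly every sample
                        }
                        hasTask = false;
                        doTinyJob();       // the short burst of real CPU work, rarely sampled
                    }
                } catch (InterruptedException e) {
                    // exit on interrupt
                }
            }
        }

        void submit() {                    // called by the producing thread
            synchronized (lock) {
                hasTask = true;
                lock.notify();
            }
        }

        private void doTinyJob() {
            // stand-in for the work done between two waits
        }
    }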

How did you determine that it's that far off?

I profile in sampling mode, per line, not per method; the line is an
Object.wait() and the profiler reports ~70% of the time in that method.
Can't be clearer.
Besides, many other users have complained about that problem. This post
is about finding a solution to a known problem. I don't have to argue
about its existence.

I find OptimizeIt very useful. Maybe you're just using it the wrong way.
What exactly is it that you want to determine?

I have used OI for the last 6 years. I think I know how to use
it. What I want to determine is the real wait() %CPU due to real
contention to reacquire the monitor. Right now, I've put many hours into
tuning and fixing hypothetical performance problems and yet the known design
patterns to improve the performance don't solve the problem, which means
they were not the problem in the first place. I was misled by bad profiler
results.

I want to know how to eliminate/minimize those wrong results.

Thanks.
 

marcus

Man -- even if I knew the answer I wouldn't help you out, you
supercilious b@stard.
 

Robert Klemme

NOBODY said:
IMHO, you are not on the right track... ;-)

If I were you, I wouldn't be so sure of that.
Let me explain again.
I know 'wait()' is uninteresting. And I expect it to disappear from
the results too! But it doesn't, in a very annoying way...

The fact is (if you had read the link I sent),

How do you know I didn't?
you would have
understood that OptimizeIt may report false non-zero CPU time for a
thread that woke up, did some work (JVMPI thread events flag it as
not blocked) and went back into wait() (or sleep, or an I/O read,
etc...) between two profiler samples (let's say 5 ms apart).

OI will report that this method (known to be blocking only to humans, not
to the profiler) actually did some work, as per the JVMPI events. Even if
the work was only 0.1 ms of a 5 ms time window, OI will report that the
method, here the wait(), worked for 5 ms.

Yep, that's perfectly clear to me. It's a typical rounding error.
Basically, if I got Will McQueen right, OI is only told when a
thread is blocking/unblocking; it does timestamp those JVMPI events to
sum up CPU time, but doesn't save it per method/line at a lower
granularity than the stack frame seen by the coarser sampler (5 ms). It
only raises an activity flag and assigns the time to that method/line.



I profile in sampling mode, per line, not per method; the line is an
Object.wait() and the profiler reports ~70% of the time in that method.
Can't be clearer.
Besides, many other users have complained about that problem. This post
is about finding a solution to a known problem. I don't have to argue
about its existence.

But maybe you're drawing the wrong conclusions from your data. Did it occur to
you that other conclusions could be valid or even more appropriate?
Apparently the thread does little more than wait most of the time.
This could either mean that all is well and your program is quite optimal. Or
it means there are too many invocations of wait(). You'd probably then
want to switch from notifyAll() to notify(). Or you have a problem on the
notifier side, for example in a farmer-worker scenario where the farmer
is too slow putting tasks into the task queue; then the workers will starve
and will wait most of the time.
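
For illustration, a minimal sketch of the farmer/worker handoff I have in mind (the class and method names are mine): with notifyAll() every idle worker wakes up for each task and all but one go straight back to wait(), while notify() wakes exactly one.

    // Hypothetical farmer/worker task queue. If the farmer's put() is too
    // slow, the workers starve in take() and the profiler showing them
    // waiting most of the time is then a correct result.
    import java.util.LinkedList;

    class TaskQueue {
        private final LinkedList tasks = new LinkedList();

        public synchronized void put(Runnable task) {      // farmer side
            tasks.addLast(task);
            notify();  // wakes one waiting worker; notifyAll() would wake them all
        }

        public synchronized Runnable take() throws InterruptedException {
            while (tasks.isEmpty()) {                      // worker side
                wait();  // workers starve here when the farmer is too slow
            }
            return (Runnable) tasks.removeFirst();
        }
    }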

Without knowing your app's architecture it's difficult to speculate. I
guess you will have to find out the real reasons yourself. But you should
definitely consider the option that OptimizeIt does give correct results
(apart from the rounding error).
I have used OI for the last 6 years. I think I know how to use
it. What I want to determine is the real wait() %CPU due to real
contention to reacquire the monitor.

So you don't want CPU time but wall clock time. That's something
different. If you monitor CPU time, you won't get an idea of the real
waiting time because a suspended thread does not take up CPU time.
Right now, I've put many hours into
tuning and fixing hypothetical performance problems and yet the known design
patterns to improve the performance don't solve the problem, which means
they were not the problem in the first place. I was misled by bad profiler
results.

... or because you asked the wrong questions, or you interpreted the profiler
results incorrectly.
I want to know how to eliminate/minimize those wrong results.

As I said, I'm not sure that these results are really wrong. Did you
profile with -Xrunhprof or another profiler to get a second opinion?

robert
 

NOBODY

marcus said:
Man -- even if I knew the answer I wouldn't help you out, you
supercilious b@stard.

Thanks marcus, you are at least as constructive as I have been.
 

NOBODY

Try another profiling tool (for example, the YourKit profiler,
http://www.yourkit.com). It has the ability to skip such Object.wait()
times.

I tried to filter out Object.wait, but that only works in method sampling,
not line sampling.

I will give it a try anyway. Thanks.
 

NOBODY

Or it means there are too many invocations of wait().
You'd probably then want to switch from notifyAll() to notify(). Or
you have a problem on the notifier side, for example in a farmer-worker
scenario where the farmer is too slow putting tasks into the
task queue; then the workers will starve and will wait most of the time.

I have already separated the writers' semaphore and the readers' semaphore.
The writer notify()s one blocked reader, the reader notify()s one blocked
writer. Works great. That was the fix I tried, but it didn't change the
profile much.
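
Concretely, the scheme looks something like this (a simplified sketch with made-up names, not my actual code): a bounded buffer with two counting semaphores, one that writers block on and one that readers block on, where each release wakes exactly one thread on the other side.

    // Minimal counting semaphore built on wait/notify
    // (pre-java.util.concurrent).
    class CountingSemaphore {
        private int permits;

        CountingSemaphore(int permits) { this.permits = permits; }

        synchronized void acquire() throws InterruptedException {
            while (permits == 0) {
                wait();
            }
            permits--;
        }

        synchronized void release() {
            permits++;
            notify();                     // wake exactly one blocked thread
        }
    }

    // Writers block on 'slots', readers block on 'items'; put() releases
    // one blocked reader, take() releases one blocked writer.
    class Buffer {
        private final Object[] ring;
        private int putIdx, takeIdx;
        private final CountingSemaphore slots, items;

        Buffer(int capacity) {
            ring = new Object[capacity];
            slots = new CountingSemaphore(capacity);
            items = new CountingSemaphore(0);
        }

        void put(Object x) throws InterruptedException {
            slots.acquire();              // writer blocks while the buffer is full
            synchronized (this) {
                ring[putIdx] = x;
                putIdx = (putIdx + 1) % ring.length;
            }
            items.release();              // wakes one blocked reader
        }

        Object take() throws InterruptedException {
            items.acquire();              // reader blocks while the buffer is empty
            Object x;
            synchronized (this) {
                x = ring[takeIdx];
                takeIdx = (takeIdx + 1) % ring.length;
            }
            slots.release();              // wakes one blocked writer
            return x;
        }
    }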

So you don't want CPU time but wall clock time.

I never wanted clock time. I have already monitored the throughput of my code.

A stack frame could be in wait() while the thread is actually done waiting
("waited", in JVMPI language) and is trying to reacquire the monitor.

Would that part be shown as CPU time or not? Not documented. I supposed
yes, because that would explain the 70% of time in wait() caused by
contention. I agree the CPU is not working for that thread though, and I
shouldn't care, but...

But excessive contention (like you mentioned, through notifyAll()) is
only detectable by guessing whether the invocation count is too high, which
requires instrumentation mode and hence gives a very biased result.

So I don't know if I have high contention or if it is just a rounding
error! That's the whole problem.


I was planning to try other profilers. I guess this is inevitable.


If any reader has the answer to my original question (how to minimize
what is now called the 'rounding error' in OptimizeIt), please answer. I'm
still listening.

Thanks.
 

Robert Klemme

NOBODY said:
I have already separated the writers' semaphore and the readers' semaphore.
The writer notify()s one blocked reader, the reader notify()s one blocked
writer. Works great. That was the fix I tried, but it didn't change the profile much.

I never wanted clock time. I have already monitored the throughput of my code.

Then are you satisfied with the throughput? If yes, why do you bother
with the 70%?
A stack frame could be in wait() while the thread is actually done waiting
("waited", in JVMPI language) and is trying to reacquire the monitor.

Would that part be shown as CPU time or not? Not documented. I supposed
yes, because that would explain the 70% of time in wait() caused by
contention. I agree the CPU is not working for that thread though, and I
shouldn't care, but...

I did a small test where I increased the overall wait time by sleeping before
the notify, and wait didn't even show up in the profiler results
(with -Xrunhprof). Probably testing another profiler is the best you can do
at the moment.
But excessive contention (like you mentioned, through notifyAll()) is
only detectable by guessing whether the invocation count is too high, which
requires instrumentation mode and hence gives a very biased result.

So I don't know if I have high contention or if it is just a rounding
error! That's the whole problem.

Since this seems to affect only one place in the code, did you consider
instrumenting the code with measurements of your own? You could have a
wait counter and a wait time counter and thus determine the average wait time.
Also, you might want to check the average queue size (I assume you have a
queue to transport the tasks or whatever it is).
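
A minimal sketch of that kind of hand-rolled instrumentation (the WaitStats name and the call site are made up for illustration):

    // Hypothetical counter that accumulates wait invocations and waited
    // time, so the average can be computed without the profiler.
    class WaitStats {
        private long waitCount;
        private long totalWaitMillis;

        synchronized void record(long millis) {
            waitCount++;
            totalWaitMillis += millis;
        }

        synchronized double averageWaitMillis() {
            return waitCount == 0 ? 0.0 : (double) totalWaitMillis / waitCount;
        }
    }

    // At the wait site, inside the synchronized block:
    //     long t0 = System.currentTimeMillis();
    //     lock.wait();
    //     stats.record(System.currentTimeMillis() - t0);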
I was planning to try other profilers. I guess this is inevitable.

I'd start with "java -Xrunhprof". For me this gives quite good results
and it's easy to use. You can use HPJmeter to view the values, or a tool named
PerfAnal which you can find on Sun's website. Both are free.

Good luck!

robert
 

NOBODY

It is definitely an OptimizeIt limitation.
OI 6 gives the same shit.
JProbe doesn't.
The weird YourKit Profiler doesn't.
I will try good old-fashioned Sun hprof today (just because this one is
free... and may be more of a reference...)
Then are you satisfied with the throughput?
No. And I'm just trying to apply the 80-20 rule.
Once optimized, I should see code in the hotspots, not wait()!
I have never seen that before, ever, in 6 years.
Modern PCs are probably too fast for OptimizeIt's approach to profiling.
Note that in instrumentation mode, OI gave me sub-stackframes under
wait()!
Like wait() called my other methods! (probably those methods that ran
between the two waits)... I want my money back! hehehe!

I did a small test where I increased the overall wait time by sleeping
before the notify, and wait didn't even show up in the profiler results
(with -Xrunhprof). Probably testing another profiler is the best you can
do at the moment.

You have to notify more, not less.
You want to cause the waiting thread's wait() to exit, do a quick job and go
back to wait(), so that the OI profiler thinks no other method/line of code
was running but the thread was not blocked.

Since this seems to affect only one place in the code, did you
consider instrumenting the code with measurements of your own? You
could have a wait counter and a wait time counter and thus determine
the average wait time.

System.currentTimeMillis() is only milliseconds, not nanoseconds...!


All right, thanks Robert.
I will report a bug to Borland... if at all possible...
Have a good week.
 

Robert Klemme

NOBODY said:
It is definitely an OptimizeIt limitation.
OI 6 gives the same shit.
JProbe doesn't.
The weird YourKit Profiler doesn't.
I will try good old-fashioned Sun hprof today (just because this one is
free... and may be more of a reference...)

I'm quite satisfied with that for profiling. It avoids the overhead of a GUI
application and breaks only rarely. :)
No. And I'm just trying to apply the 80-20 rule.
Once optimized, I should see code in the hotspots, not wait()!
I have never seen that before, ever, in 6 years.
Modern PCs are probably too fast for OptimizeIt's approach to profiling.
Note that in instrumentation mode, OI gave me sub-stackframes under
wait()!
Like wait() called my other methods! (probably those methods that ran
between the two waits)... I want my money back! hehehe!

You won't - not after 6 years. ;-)
You have to notify more, not less.
You want to cause the waiting thread's wait() to exit, do a quick job and go
back to wait(), so that the OI profiler thinks no other method/line of code
was running but the thread was not blocked.

That's basically what my program did (there were a lot of wait-notify cycles),
only that I artificially increased the time between two notifies. The code
between the wait() wakeup and the next wait() is essentially a for loop with
an int increment => fast. Even reducing the period between two notifies didn't
make wait() show up in the profiler output.
System.currentTimeMillis() is only milliseconds, not nanoseconds...!

That might be sufficient or not - that depends on your scenario. If the
workers wait most of the time, then they might wait longer than 1 ms. You
could also round 0 ms up to 1 ms (or 10 ns) to avoid zeros.
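
Continuing the hypothetical WaitStats sketch from my earlier post ('lock' and 'stats' are the made-up names from that sketch; 'condition' stands in for the guarded predicate), that could look like:

    // Never record a zero wait: round a measured 0 ms up to 1 ms so
    // short waits still contribute to the count and the average.
    synchronized (lock) {
        while (!condition) {
            long t0 = System.currentTimeMillis();
            lock.wait();
            stats.record(Math.max(System.currentTimeMillis() - t0, 1L));
        }
    }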
All right, thanks Robert.
I will report a bug to Borland... if at all possible...
Have a good week.

Have fun!

robert
 
