OutOfMemoryError - how to find root cause

kaeli

Very recently, a web application (JSP / IPlanet 6 / Solaris 2.8) has begun to
freeze up. Web server logs indicate:
Internal error: Unexpected error condition thrown
java.lang.OutOfMemoryError,no description), stack: java.lang.OutOfMemoryError

This happens intermittently, with no obvious cause, and it just started
happening this morning (no changes have been made to either the server or the
application). It resolves itself and the application resumes just fine. But
it's irritating for pages to freeze up on people.

I *suspect* another application is using too much memory, but can't figure
out if that is the case.
Raising the JVM heap (in server admin GUI) to 256 didn't help. We can't /
don't want to raise it any higher and don't think we should need to -- we
want to find the root cause of this problem, since it hasn't happened before.

Can anyone tell me:
1. Do applications share the JVM? If so, can we turn that off?
2. How best to isolate the root cause of the problem? By the time we see the
error in the web server logs, ps -efl shows no processes eating a ton of
memory and the app has already fixed itself.

Any insights appreciated.

--
 
Wibble

Run your program in the debugger (or attach to it) and trap on
java.lang.OutOfMemoryError.
 
Robert

You should also try JProbe, which helped us out here tremendously. And
no, apps don't share the JVM.
 
kaeli

You should also try JProbe, which helped us out here tremendously. And
no, apps don't share the JVM.

The problem mysteriously went away into the void from which it came.
It definitely looks like the problem was caused by another application. If
they don't share the JVM, what other conditions might cause OutOfMemoryError?
This is a medium server with 512 MB of RAM and nearly unlimited virtual memory. The
machine admins were flummoxed, so little ole' me is trying to figure this
out.

--
 
kaeli

Run your program in the debugger (or attach to it) and trap on
java.lang.OutOfMemoryError.

How would I do that for JSP (IPlanet 6.0)?

And how would that tell me anything?
If I know exactly what was executing at the time the error occurred, it still
would only be the straw that broke the camel's back, not the actual cause of
the problem, quite likely. I saw the error occur at multiple times, on
different pages, at seemingly random times. The pages would then just work
again in a couple minutes. So I highly suspect that the real problem has
nothing to do with my application and everything to do with someone else's.
The error last occurred yesterday at 10:24 am. It had occurred on and off all
morning. But it's all gone now. Yet everyone is still using my application.

It's a mystery I'd like to solve just so I know WTF happened and can have it
not happen again.

--
 
Marcus Eaton

kaeli said:
How would I do that for JSP (IPlanet 6.0)?

And how would that tell me anything?
If I know exactly what was executing at the time the error occurred, it still
would only be the straw that broke the camel's back, not the actual cause of
the problem, quite likely. I saw the error occur at multiple times, on
different pages, at seemingly random times. The pages would then just work
again in a couple minutes. So I highly suspect that the real problem has
nothing to do with my application and everything to do with someone else's.
The error last occurred yesterday at 10:24 am. It had occurred on and off all
morning. But it's all gone now. Yet everyone is still using my application.

It's a mystery I'd like to solve just so I know WTF happened and can have it
not happen again.

How? Use a Java profiling tool, such as JProbe or JProfiler. I haven't
used JProbe, but with JProfiler you basically plug remote debugging
capabilities into a Java process, which you can then connect to via the
JProfiler tool.

Profilers will be able to show you things like:
Where most memory is being used (which methods, classes, etc.)
Ditto for time usage, and nifty things like that.

Once you have the profiler connected to your app server (a test server;
probably not wise to do this on prod :) ), you can run a load testing tool
or something similar, see where the most memory is being allocated, and
target that for investigation.

Not sure if there are any decent open-source / free Java profiling
tools. Does anyone know of one?

Marcus
 
Gerbrand van Dieijen

kaeli wrote:
The problem mysteriously went away into the void from which it came.
It definitely looks like the problem was caused by another application. If
they don't share the JVM, what other conditions might cause OutOfMemoryError?
This is a medium server with 512 MB of RAM and nearly unlimited virtual memory. The
machine admins were flummoxed, so little ole' me is trying to figure this
out.

Java 1.4 and lower have a fixed maximum heap size.
You can increase it by adding parameters (see java -X for the options, and
the web). This won't explain the cause of the error, but might solve it.
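
To check which heap limits a VM actually picked up, something like the following may help. It is only a sketch: the class name HeapReport is made up for illustration, and it relies on nothing beyond the standard java.lang.Runtime methods (maxMemory() is available from Java 1.4 on).

```java
// Hypothetical helper: prints the heap limits the running VM is using.
class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Upper bound the VM will try to use for the heap (follows -Xmx), in MB. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently reserved by the VM (starts near -Xms and can grow), in MB. */
    static long committedHeapMb() {
        return Runtime.getRuntime().totalMemory() / MB;
    }

    /** Part of the reserved heap occupied by objects, live or not yet collected, in MB. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap:       " + maxHeapMb() + " MB");
        System.out.println("committed heap: " + committedHeapMb() + " MB");
        System.out.println("used heap:      " + usedHeapMb() + " MB");
    }
}
```

Run standalone with, for example, java -Xms64m -Xmx256m HeapReport and compare the reported maximum with what you asked for. Inside an application server the same three calls could be logged from a test page, since the server's own configuration decides which -X options the VM receives.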
 
kaeli

meaton1 said:
Once you have the profiler connected to your app server (A test server.
Probably not wise to do this on prod :)

See, that's the problem.
It isn't happening on the test server.

The problem has come back, with a vengeance, and none of the web stuff is
functioning (even pure html pages and the web server admin GUI), so it's
looking more and more like a machine issue.

--
--
~kaeli~
Not one shred of evidence supports the notion that life is
serious.
http://www.ipwebdesign.net/wildAtHeart
http://www.ipwebdesign.net/kaelisSpace
 
kaeli

kaeli wrote:


Java 1.4 and lower have a fixed maximum heap size.
You can increase it by adding parameters (see java -X for the options, and
the web). This won't explain the cause of the error, but might solve it.

We tried that already. First thing I tried, actually.

I posted in another reply that now the machine isn't even rendering static
html pages properly, so it's looking more like a machine issue than a java
issue. Everything is locked up at the moment.
I was just the first to notice the budding problems, apparently. Lucky me.
;)

--
--
~kaeli~
"When dogma enters the brain, all intellectual activity
ceases" -- Robert Anton Wilson
http://www.ipwebdesign.net/wildAtHeart
http://www.ipwebdesign.net/kaelisSpace
 
John C. Bollinger

kaeli said:
The problem mysteriously went away into the void from which it came.
It definitely looks like the problem was caused by another application. If
they don't share the JVM, what other conditions might cause OutOfMemoryError?
This is a medium server with 512 MB of RAM and nearly unlimited virtual memory. The
machine admins were flummoxed, so little ole' me is trying to figure this
out.

Java applications do not share a VM unless you do work to cause it to be
so. There is, however, some question as to how to define "application."
In this case the "application" is your IPlanet application server,
which conceivably may host more than one service (avoiding the term
"application" for this). All services hosted by IPlanet share a VM (or
multiple VMs in a clustering scenario, but you indicated that there was
only one server). Even if you have only installed one application of
your own on the server, it may come with built-in services (such as
management services) that consume VM resources when active.

The amount of RAM and virtual memory on the box is irrelevant, so long
as it is sufficient to support the Java VM's maximum heap, with enough
to spare to keep the rest of the box running. The VM will throw
OutOfMemoryError (OOME) if it is unable to satisfy an allocation request despite its
best effort to free heap space by expanding the heap (if possible) and
performing garbage collection, and the only way this happens is if the
application server and the services running in it fill up the heap with
reachable objects. It may be possible to recover from the OOME if it
causes enough stack frames to pop and / or enough threads to die that
there is once again sufficient free heap.

There are a few reasonably likely scenarios here:

(1) Your application has an acute packratting problem (i.e. it holds on
to large objects past its need for them, but the problem is localized).
There are some gotchas related to String handling that are a
reasonably good generic candidate here if your application ever deals
with large Strings. A problem such as this could be data-dependent,
which could explain why it didn't manifest before and now has stopped
manifesting -- it might have been related to what some particular client
was doing with the app.

(2) Your application has a chronic packratting problem (i.e. it holds on
to medium sized objects long past its need for them -- possibly
forever). Much of the above applies here as well, but the problem might
not have strong dependencies on specific data. It just slowly fills up
the heap until there is nothing left. I'm assuming here that the OOME
might kill the whole service, but that IPlanet recognizes that and
restarts it.

(3) Your server may have been subject to a DOS attack. A reasonable
level of logging ought to show this up, though it's possible that the
memory situation could make logging fail.
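
For scenario (2), one common source of chronic packratting is a cache or lookup map that only ever grows. A minimal sketch of capping such a map follows; the class name BoundedCache is hypothetical, and it uses only java.util.LinkedHashMap and its removeEldestEntry() hook (both present since Java 1.4, though the generics shown need Java 5 or later).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical size-capped cache: evicts the least recently used entry
// once the cap is exceeded, so it cannot fill the heap on its own.
class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each put(); returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }
}
```

LinkedHashMap is not thread-safe, so in a servlet container a shared instance would need external synchronization (for example Collections.synchronizedMap). Whether a cap like this is appropriate depends on what the map holds; this only illustrates the idea of bounding long-lived collections.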


Overall, I have one general piece of advice: when debugging a service,
look first at the logs. You've already done some of that, but you may
not have considered _all_ the logs. For instance, does IPlanet keep
access logs separate from service and / or error logs? The access logs
are a key part of the picture: what was being requested of the service
when it failed?

I hope some of that helps.
 
kaeli

There are some gotchas related to String handling that are a
reasonably good generic candidate here if your application ever deals
with large Strings. A problem such as this could be data-dependent,
which could explain why it didn't manifest before and now has stopped
manifesting -- it might have been related to what some particular client
was doing with the app.

Ooh, ooh, I deal with some strings. Strings that might be small, but might
also be large. And other objects that can be large.
Do you have more you can say about this issue or links?
Even if it isn't the problem, I'd like to know more about gotchas and garbage
collection in general. I am self-taught in Java (among other languages and
technologies), and there's always more for me to learn.

For example, I was under the impression that once a JSP was done (no
threading, thread-safe set to off so it gets a new instance per request), its
resources were garbage collected, including classes it calls as beans. If I'm
mistaken, that would be a real problem.

Note that we just found out that the JVM heap (in the IPlanet config) was
only set to 1MB (that's the default, apparently!), so that would be a
definite problem. $deity forbid the 'Help' docs actually tell you what the
default is. My server admin had to go poking around in config files owned by
root. The GUI didn't say what the default was, nor the format of the input for
"Maximum Heap Size". We had to guess at that, and our first guess (128) was
very, very wrong. LOL. It needed bytes, not megabytes. Hence the not-compiling
problem.

Anyway, we fixed that (upped it to 128 MB) and are waiting to see if the
problem recurs.

If nothing else, I've learned a lot today. :D

--
 
John C. Bollinger

kaeli said:
Ooh, ooh, I deal with some strings. Strings that might be small, but might
also be large. And other objects that can be large.
Do you have more you can say about this issue or links?
Even if it isn't the problem, I'd like to know more about gotchas and garbage
collection in general. I am self-taught in Java (among other languages and
technologies), and there's always more for me to learn.

The main issue with Strings is that the char[] backing one may have
(much) greater capacity than the String itself has characters. This
happens when you construct a substring in any of several ways
(String.substring(), StringTokenizer, probably regex captured group,
maybe others). If you extract a small substring of a large string in
any of these ways, you may be fooled into thinking that it uses less
heap than it really does. The solution for this issue is to construct a
new String with use of the String(String) constructor, passing in the
substring. That causes a new, right-sized char array to be created for
the new String, and it is, IMO, the only reason to ever use the
String(String) constructor.
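
A minimal sketch of that idiom, with a made-up class and method name. It assumes a VM of this era (Java 1.4), where substring() shares the parent's char[]; on later JDKs (Java 7 update 6 onward) substring() already copies, so the extra constructor call is harmless but unnecessary there.

```java
// Hypothetical example of detaching a small substring from a large String.
class SubstringCopy {
    /** Returns the first n characters of s as a String with its own right-sized storage. */
    static String detachedPrefix(String s, int n) {
        // s.substring(0, n) may share the large backing array of s;
        // new String(...) copies just the characters that are needed.
        return new String(s.substring(0, n));
    }

    public static void main(String[] args) {
        // Build a large String, then keep only a short key from it.
        StringBuffer sb = new StringBuffer();
        for (int i = 0; i < 100000; i++) {
            sb.append("0123456789");
        }
        String big = sb.toString();
        String key = detachedPrefix(big, 8);
        System.out.println(key + " (" + key.length() + " chars kept)");
    }
}
```

If key is then stored somewhere long-lived (a session, a static map), only the eight characters stay reachable rather than the whole million-character array.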

For example, I was under the impression that once a JSP was done (no
threading, thread-safe set to off so it gets a new instance per request), its
resources were garbage collected, including classes it calls as beans. If I'm
mistaken, that would be a real problem.

You should never set JSPs to non-thread-safe or make servlets implement
SingleThreadModel. (My opinion; the two amount to the same thing.)
This slightly impacts memory use, because only one instance of a JSP not
declared non-thread-safe will ever be maintained by the application at
one time. The *reason* to not do this has little to do with memory,
however, and more to do with expectations: setting a JSP to
non-thread-safe has nonobvious semantics, in that it does not completely
insulate the page implementation from thread safety issues. In fact, it
does *absolutely nothing* to insulate a JSP from thread-safety issues
related to accessing external resources (DB connections, I/O objects,
etc.). When writing JSP you need to account for and deal with the fact
that the code will run in a multi-threaded environment, and declaring it
non-thread-safe doesn't do the job.

As for resources, it depends. I find that it helps to keep in mind that
JSP is just a shortcut to Java code; if I know how JSP constructs and
actions map to Java then I can predict a JSP's fine behavioral details.
It is also essential to understand how a JSP container is permitted
and expected to use a JSP. Here are a few, possibly relevant details:

(1) The JSP container translates JSP code into Java servlet code, and
compiles it to get a Java servlet that implements the JSP's behavior.

(2) The container will maintain exactly one instance (or possibly a pool
of several instances if flagged non-thread-safe) of the servlet class to
reuse across many page invocations.

(3) JSP declarations (<%! ... %>) translate into class-level
declarations in the servlet.

(4) Beans declared via <jsp:useBean> translate into "attributes" of the
application (ServletContext), session (HttpSession), request
(HttpServletRequest), or page.

From those points you should recognize:
(a) If you assign an object to a variable declared via a JSP
declaration, then that object will remain reachable long after the end
of processing for a particular request -- typically at least until the
variable is overwritten or the application is shut down.
(b) If you assign a bean to "session" scope, it remains reachable until
you remove it from the session or the session is invalidated. (You _do_
have a reasonable session timeout, right?)
(c) If you assign a bean to "application" scope, it remains reachable
until you remove it or replace it, or until the application shuts down.
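
Point (a) can be illustrated in plain Java without a container. The class below is a hypothetical stand-in for the servlet a container might generate from a JSP; the names are invented, and real generated code looks different and uses the servlet API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a servlet class generated from a JSP.
class GeneratedPageSketch {
    // Plays the role of a JSP declaration such as
    //   <%! List history = new ArrayList(); %>
    // It is a field of the single servlet instance, shared by every request.
    private final List<String> history = new ArrayList<String>();

    // Plays the role of the generated service method; called once per request.
    int service(String requestData) {
        // Like a scriptlet local: eligible for collection once this call returns.
        StringBuffer scratch = new StringBuffer(requestData);
        scratch.reverse();

        // The field outlives the request: each call adds an entry, nothing removes one.
        synchronized (history) {
            history.add(requestData);
            return history.size();
        }
    }
}
```

Calling service() three times on the same instance leaves three entries reachable, which is how a declaration-level collection can slowly fill the heap even though each request finishes normally.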

There are also potential issues shared with plain Java, such as being
certain to close() resources when you are done with them (where applicable).

Note that we just found out that the JVM heap (in the IPlanet config) was
only set to 1MB (that's the default, apparently!), so that would be a
definite problem.

Surely that was the initial heap size, not the maximum. I doubt whether
IPlanet itself could run with that little heap, before any applications
even come into the picture. The initial heap size will have little
relationship to your OutOfMemoryError, though increasing it may get you
slightly faster startup of IPlanet.
 
?

.

See, that's the problem.
It isn't happening on the test server.

The problem has come back, with a vengeance, and none of the web stuff is
functioning (even pure html pages and the web server admin GUI), so it's
looking more and more like a machine issue.

I recently spent 3 days trying to figure out a problem with my app when I
ported it to WebSphere 6.0. It worked fine on WebSphere 5.1 but some of
the admin console settings have moved to different locations. Plus there
were some additional changes like 'Buses'.

I thought it was something to do with the setting of resources at first.
Then I started thinking it was my app doing something that is now obsolete
(might have been deprecated back in WAS 5.1).

Tried Google and found a few other people with the same problem but no
solution. After three days of poking, prodding and reading I tried Google
again. Turned out to be a problem with WAS 6.0.0.2. Got a Fix Pack to
bring me up to 6.0.0.3 and the problem went away. Strange thing is the Fix
Pack has been around for a month but searches with Google and on IBM's
site never found it.

Maybe you are experiencing a similar problem. Do what I did. Ask, poke,
prod, read, but also Google every day. A search that found nothing
yesterday might find something today... or tomorrow.
 
kaeli

Maybe you are experiencing a similar problem. Do what I did. Ask, poke,
prod, read but also google every day. A google that found nothing
yesterday might find something today... or tomorrow.

Specifying 128 MB of memory as the maximum Java heap in IPlanet (instead of
the default) seems to have fixed it. No recurrences the last couple days...

*crosses fingers*

--
 
