Cocoon 2.1.0 with JBoss 3.2.1 freezez under heavy load

B

Blazej Zak

Hello,

I got big problem with cocoon under heavy load. The effects are quite
strange - application server 'freezez' currently no exceptions are
thrown, but still no requests are processed - not even error page is
shown - just TCP timeout message is returned.

Let me tell the story from the begining:

The web application http://www.vide.pl I'm talking about was working
nice and smoothly for two weeks under the load of 200-400 visitors a day
(each about 20 document requests). When the load reached 500 users/day
funny things started to happen.

It happened that the jboss 3.2.1 application server thats holding cocoon
app, has frozen. ie. the processor usage quickly rised from 5-10% up to
70% (not 100%!!) and stayed on this level. When this happens no requests
are processed as i wrote above. jboss accepts incomming connections, but
doesn't respond with anything, the connection is just broken after
timeout time. Under 500 users/day it was happening like 3 times a day.

During the time we were looking for a bug, a ad campaign has started.
Quite big campaign... the first hour we had about 1000 unique visitors,
and... jboss/cocoon was freezing every 3 minutes! That was nightmare!
believe me!

What we found out:

The bug that ocurred quite often was null pointer exception during
relasing avalon component, while the content agreggator is returned to
pool. trace bellow:

INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,171 ERROR [STDERR]
java.lang.NullPointerException
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,328 ERROR [STDERR] at
org.apache.cocoon.components.CocoonComponentManager.release(CocoonComponentManager.java:510)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,390 ERROR [STDERR] at
org.apache.cocoon.components.CocoonComponentManager.release(CocoonComponentManager.java:510)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,468 ERROR [STDERR] at
org.apache.cocoon.environment.AbstractEnvironment.release(AbstractEnvironment.java:521)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,546 ERROR [STDERR] at
org.apache.cocoon.sitemap.ContentAggregator.recycle(ContentAggregator.java:303)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,624 ERROR [STDERR] at
org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,687 ERROR [STDERR] at
org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,765 ERROR [STDERR] at
org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
INFO | jvm 1 | 2003/11/18 11:19:53 | 11:19:53,843 ERROR [STDERR] at
org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:53,906 ERROR [STDERR] at
org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:320)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:53,984 ERROR [STDERR] at
org.apache.cocoon.components.pipeline.AbstractProcessingPipeline.recycle(AbstractProcessingPipeline.java:641)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,062 ERROR [STDERR] at
org.apache.cocoon.components.pipeline.impl.BaseCachingProcessingPipeline.recycle(BaseCachingProcessingPipeline.java:112)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,140 ERROR [STDERR] at
org.apache.cocoon.components.pipeline.impl.AbstractCachingProcessingPipeline.recycle(AbstractCachingProcessingPipeline..java:952)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,218 ERROR [STDERR] at
org.apache.avalon.excalibur.pool.ResourceLimitingPool.put(ResourceLimitingPool.java:438)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,281 ERROR [STDERR] at
org.apache.avalon.excalibur.component.PoolableComponentHandler.doPut(PoolableComponentHandler.java:245)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,343 ERROR [STDERR] at
org.apache.avalon.excalibur.component.ComponentHandler.put(ComponentHandler.java:452)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,421 ERROR [STDERR] at
org.apache.avalon.excalibur.component.ExcaliburComponentSelector.release(ExcaliburComponentSelector.java:336)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,484 ERROR [STDERR] at
org.apache.cocoon.components.ExtendedComponentSelector.release(ExtendedComponentSelector.java:320)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,546 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.InvokeContext.dispose(InvokeContext.java:310)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,609 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:311)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,656 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.MountNode.invoke(MountNode.java:133)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,734 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:84)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,828 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.PreparableMatchNode.invoke(PreparableMatchNode.java:164)
INFO | jvm 1 | 2003/11/18 11:19:54 | 11:19:54,906 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:84)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:54,968 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.ActTypeNode.invoke(ActTypeNode.java:176)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,046 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:84)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,124 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.ActTypeNode.invoke(ActTypeNode.java:176)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,203 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:108)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,296 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.PipelineNode.invoke(PipelineNode.java:164)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,390 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.AbstractParentProcessingNode.invokeNodes(AbstractParentProcessingNode.java:108)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,468 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.sitemap.PipelinesNode.invoke(PipelinesNode.java:161)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,562 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:327)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,640 ERROR [STDERR] at
org.apache.cocoon.components.treeprocessor.TreeProcessor.process(TreeProcessor.java:309)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,734 ERROR [STDERR] at
org.apache.cocoon.Cocoon.process(Cocoon.java:639)
INFO | jvm 1 | 2003/11/18 11:19:55 | 11:19:55,812 ERROR [STDERR] at
org.apache.cocoon.servlet.CocoonServlet.service(CocoonServlet.java:1096)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:55,890 ERROR [STDERR] at
javax.servlet.http.HttpServlet.service(HttpServlet.java:853)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:55,999 ERROR [STDERR] at
org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:360)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,093 ERROR [STDERR] at
org.mortbay.jetty.servlet.WebApplicationHandler.dispatch(WebApplicationHandler.java:294)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,218 ERROR [STDERR] at
org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:558)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,312 ERROR [STDERR] at
org.mortbay.http.HttpContext.handle(HttpContext.java:1714)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,421 ERROR [STDERR] at
org.mortbay.jetty.servlet.WebApplicationContext.handle(WebApplicationContext.java:507)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,499 ERROR [STDERR] at
org.mortbay.http.HttpContext.handle(HttpContext.java:1664)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,593 ERROR [STDERR] at
org.mortbay.http.HttpServer.service(HttpServer.java:863)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,656 ERROR [STDERR] at
org.jboss.jetty.Jetty.service(Jetty.java:460)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,734 ERROR [STDERR] at
org.mortbay.http.HttpConnection.service(HttpConnection.java:775)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,796 ERROR [STDERR] at
org.mortbay.http.HttpConnection.handleNext(HttpConnection.java:939)
INFO | jvm 1 | 2003/11/18 11:19:56 | 11:19:56,874 ERROR [STDERR] at
org.mortbay.http.HttpConnection.handle(HttpConnection.java:792)
INFO | jvm 1 | 2003/11/18 11:19:57 | 11:19:56,937 ERROR [STDERR] at
org.mortbay.http.SocketListener.handleConnection(SocketListener.java:201)
INFO | jvm 1 | 2003/11/18 11:19:57 | 11:19:56,999 ERROR [STDERR] at
org.mortbay.util.ThreadedServer.handle(ThreadedServer.java:289)
INFO | jvm 1 | 2003/11/18 11:19:57 | 11:19:57,062 ERROR [STDERR] at
org.mortbay.util.ThreadPool$PoolThread.run(ThreadPool.java:455)


This exception ocured quite often (but not always) after a message:


INFO | jvm 1 | 2003/10/27 11:04:20 | 11:04:20,822 WARN [jbossweb]
WARNING: OUT OF THREADS: [email protected]:8080


So we increased number of threads for SocketListener from 50 to 300 what
solved the problem at load 500 visitors /day. But didn't work anymore at
heavy load - 1000 visitors / hour. So then we increased the number of
threads again from 300 to 3000 to make sure this is not gonna happen again.

No luck.

Cause the campaign was in progress, jboss was freezing every 3 minutes,
we made drastical steps.
In cocoon-dev group archive i found somewhere (lost the have link
though) that Content Agreggator can have problem with releasing
components when some of the parts it aggregates are taken from cocoon cache.
So we changed the sitemaps to uncacheable, to prevent caching...

No luck.

The next step was to follow the trace quoted above. I saw there
CocoonComponentManager.release(CocoonComponentManager.java:510) and
ContentAggregator.recycle(ContentAggregator.java:303).
Guesed this is problems with releasing components back to pool... so we
turned off pooling of, i think, every single component, by changing
sitemaps like this:
<map:generator logger="sitemap.generator.reprodukcja" name="reprodukcja"
pool-grow="0" pool-max="0" pool-min="0"
src="pl.vide.reprodukcja.GeneratorReprodukcji">
and in few places in cocoon.xconf...

App worked slower, but stable... for whole 5 minutes... so still...

no luck.

Well, it didn't actually throw any exceptions anymore - the logs are
clear. But the effect stayed. it was still freezing sometimes with
characteristic ~ constatnt 70% processor usage.

So at the moment under heavy load cocoon freezez every 7 min, doesn't
throw any exceptions.

Our main hypothesis is deadlock, and active loop somwhere. waiting for
some resources ? componenets from the pool? Since no exception are
thrown anymore I got no idea where to look for it. [and besides turning
off pooling isn't really the best idea..]

We had to stop campaign after 1.5 hour, although we bought it for a
whole day... nightmare...
now under light load everything works stable again... but, we'd really
like to make our page popular...

Well, I looked at the logs again and find recurring warning:


WARN (2003-11-18) 13:45.42:687 [core.manager]
(/vide/kolekcje/reprodukcja.html)
PoolThread-292/ExcaliburComponentManager: Attempted to release a
org.apache.cocoon.components.treeprocessor.sitemap.ComponentsSelector
but its handler could not be located.
WARN (2003-11-18) 13:45.42:687 [core.manager]
(/vide/kolekcje/reprodukcja.html)
PoolThread-292/ExcaliburComponentManager: Attempted to release a
org.apache.cocoon.components.treeprocessor.sitemap.ComponentsSelector
but its handler could not be located.


But that still doesn't enlight me what's going on.


The software configuration of the system is:

Apache 2.0 httpd + mod_proxy, mod_cache
JBoss 3.2.1 + Jetty
Cocoon 2.1.0 with our applicatin
Hibernate, and direct JDBC database access
JTDS 0.5.1 - jdbc driver
MS SQL 2000 Standard

Everything on Windows 2003 Server cause MSSQL won't run anywhere else.
I must add MSSQL works stable and is responding - intranet system uses
the same database, but seperate application server and works fine even
when jboss freezez. The same with httpd.

Hardware is Inter Xeon 2.4 GHz, and 2x18 GB SCSI RAID.


I beg you !! please!! help.


Boss gonna kick my ass for the money wasted on campaign.

greetz,

Blazej Zak
(e-mail address removed)
 
S

Sudsy

Blazej said:
Hello,

I got big problem with cocoon under heavy load. The effects are quite
strange - application server 'freezez' currently no exceptions are
thrown, but still no requests are processed - not even error page is
shown - just TCP timeout message is returned.

Let me tell the story from the begining:

The web application http://www.vide.pl I'm talking about was working
nice and smoothly for two weeks under the load of 200-400 visitors a day
(each about 20 document requests). When the load reached 500 users/day
funny things started to happen.

It happened that the jboss 3.2.1 application server thats holding cocoon
app, has frozen. ie. the processor usage quickly rised from 5-10% up to
70% (not 100%!!) and stayed on this level. When this happens no requests
are processed as i wrote above. jboss accepts incomming connections, but
doesn't respond with anything, the connection is just broken after
timeout time. Under 500 users/day it was happening like 3 times a day.

What you didn't specify about your hardware configuration is the
amount of memory installed. As I was reading the first few
paragraphs I couldn't help but think that the machine was "thrashing".
That's when the machine is doing so much paging that it can't get
any real work done. It's a premise consistent with CPU utilization
but I'd need to see memory stats to be sure and I have absolutely
no idea how to get those on your software platform.
On *nix I could use top and vmstat to confirm or deny my nagging
suspicions.
 
B

Blazej Zak

What you didn't specify about your hardware configuration is the
amount of memory installed. As I was reading the first few
paragraphs I couldn't help but think that the machine was "thrashing".
That's when the machine is doing so much paging that it can't get
any real work done. It's a premise consistent with CPU utilization
but I'd need to see memory stats to be sure and I have absolutely
no idea how to get those on your software platform.
On *nix I could use top and vmstat to confirm or deny my nagging
suspicions.

Sorry, about that...
we have 1GB physical memory installed, jboss allocates around 140 MB,
all together with system cache there's around 600 MB allocated.
Windows Task Manager says there's 400 MB physical memory availible
almost all time.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top