RMI connection refused

FutureScalper

I have a situation where a number of local Java processes register
themselves in a local rmiregistry so they can talk to each other. All
the correct things are being done, and this works 99.9% of the time
perfectly. Each process uses a unique name "XXServer" to bind, etc.

Each process periodically unbinds and rebinds itself to the registry
successfully. But when this Connection refused problem occurs
(rarely), even though a process unbinds and rebinds itself
successfully in the registry, it cannot invoke RMI methods, due to
connection refused.

The URL is correct, specifying the port, etc.; there is no issue in
this area. No firewalls.

This Connection refused runtime problem in the problem process never
resolves itself, and I don't know what I can do to get the process to
recover from this error. All other processes continue to work
normally, until they might rarely experience the same issue.

When I kill and restart the particular application process, then
everything is again normal. So it's that particular process which is
somehow being refused connection due to < insert solution here >. I
can't reproduce the problem easily, but once it happens the process
never recovers.

sun.rmi.transport.tcp.TCPEndpoint.newSocket is where the Connection
refused originates. I wonder how I can avoid what appears to be some
resource limitation problem near TCPEndpoint as in the stack trace
below (no line numbers). I'm trying to invoke an RMI method at the
time of exception.

Java 6 Update 17

java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getAskRunLengthFast(Unknown Source) <------- invoking
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more
 
FutureScalper

Perhaps I need to do this:

sun.rmi.transport.tcp.TCPEndpoint

public static void shedConnectionCaches()

Release idle outbound connections to reduce demand on I/O resources.
All transports are asked to release excess connections.

Seems to me something is accumulating in this area.
 
EJP

That wouldn't help in the slightest. 'Connection refused' has one
meaning only. Nothing is listening at the target host:port.

Is 127.0.0.1 the expected IP address to connect to?
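To see the "nothing is listening" case in isolation, here is a minimal sketch that takes RMI out of the picture and probes a host:port with a plain socket. The class name `PortProbe` and the method `isListening` are made up for this illustration and are not part of the application discussed in the thread.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical helper, not part of the original application. */
class PortProbe {

    /** Returns true if a TCP connect to host:port succeeds, false if it is refused. */
    static boolean isListening(String host, int port, int timeoutMillis) throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (ConnectException refused) {
            // The same java.net.ConnectException that RMI wraps in java.rmi.ConnectException.
            return false;
        } finally {
            socket.close();
        }
    }

    public static void main(String[] args) throws IOException {
        // Open a listener on an ephemeral port, probe it, close it, probe again.
        ServerSocket listener = new ServerSocket(0);
        int port = listener.getLocalPort();
        System.out.println("while listening: " + isListening("127.0.0.1", port, 1000));
        listener.close();
        System.out.println("after close:     " + isListening("127.0.0.1", port, 1000));
    }
}
```

Pointing a probe like this at the port the failing stub actually refers to (which need not be the registry port) would show whether anything is still accepting connections there.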
 
FutureScalper

Well, that can't be the case. There is an RMIRegistry process running
locally and N other apps are bound to it.

Restarting the rmiregistry doesn't help. The client actually has to
be killed and restarted.

Thanks for any other suggestions, as I try and debug this situation
which develops only after some hard usage.

I still think something is accumulating, such as connections, etc., or
some resource.
 
FutureScalper

I explicitly use a Java Web Start property to fully specify the
localhost as:

<property name="java.rmi.server.hostname" value="127.0.0.1" />

I also spec the port explicitly, and periodically unbind and rebind
processes. The convention is XXServer, so, for example,

Core.rebindToRemote [rmi://127.0.0.1:1098/YMServer] serverImpl

is an example of the bind URL for a client. Everything is explicitly
specified.

As I said, this thing works for quite a long time, and then under
circumstances I can't figure out, connection is permanently refused,
and I have so far been unable to get the Java application client to
recover without restart.

But the rmiregistry is there, and running. Now, perhaps it is
refusing connection to a specific client for some reason, such as an
accumulating resource within the rmiregistry process itself?? However,
restarting rmiregistry, and clients rebinding to it, does not fix the
client's connection problems, so I don't think that's it.
 
Roedy Green

You might try snooping on the conversation with something like
Wireshark. There might be something in the messages that would give
you a bit more information.

Is there any sort of debug mode on the RMI server that will log stuff
that could give you a clue?

When you detect the failure in the client, do you go right back to
square 1? Have you profiled to see if there is some strange object
accumulation in either client or server?

After you get a fail, do all other clients thereafter fail, or just
the one that failed?
 
EJP

The number of apps that are bound to the registry isn't relevant. It is
possible that the Registry's accept thread has stopped somehow, which
would cause its backlog queue to fill up, which on Windows also provokes
an ECONNREFUSED. That and a firewall are the only other conditions
besides no listener that cause ECONNREFUSED, and ECONNREFUSED is the
only condition that causes 'connection refused' in Java.

> Restarting the rmiregistry doesn't help. The client actually has to
> be killed and restarted.

That's bizarre. Indicative but bizarre. Does restarting the client help
without restarting anything else?

> Now, perhaps it is refusing connection to a specific client for
> some reason

TCP can't do that. But it could start refusing all clients, as above.

> However, restarting rmiregistry, and clients rebinding to it

Don't you mean servers rebinding to it? (as clients ;-))

> does not fix the client's connection problems, so I don't think
> that's it.

I'm getting confused here. RMI Servers do Registry.bind/rebind. RMI
clients do Registry.lookup. RMI Servers are in fact clients of the
Registry, which is also an RMI server, but let's not add that
complication. If you restart the Registry it will have no bindings, so
servers would have to rebind (or be restarted) before clients would work.

I think it would be worthwhile running the Registry with some RMI
tracing properties - see the links via the RMI home page.
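As a rough sketch of what such tracing could look like, the snippet below sets a few of the RMI logging system properties documented for the Sun/Oracle implementation. The class name `RmiTracing` is hypothetical. It also assumes the properties are set before any RMI class is first used, since the runtime is expected to read them when it initializes.

```java
/** Hypothetical helper: turns on the JDK's built-in RMI logging for this JVM. */
class RmiTracing {

    static void enable() {
        // Standard property: log incoming remote calls (and their exceptions) to System.err.
        System.setProperty("java.rmi.server.logCalls", "true");
        // Implementation-specific properties (Sun/Oracle JDK); levels are SILENT, BRIEF, VERBOSE.
        System.setProperty("sun.rmi.client.logCalls", "true");        // outgoing calls
        System.setProperty("sun.rmi.transport.logLevel", "VERBOSE");   // transport layer
        System.setProperty("sun.rmi.transport.tcp.logLevel", "VERBOSE"); // TCP connections
    }

    public static void main(String[] args) {
        enable(); // call this first, before creating registries, exporting or looking up objects
        System.out.println("RMI tracing properties set");
    }
}
```

For the standalone rmiregistry tool the same properties would have to be passed through to its JVM, for example `rmiregistry -J-Dsun.rmi.transport.tcp.logLevel=VERBOSE 1098`. Verbose transport logging is chatty, so at dozens of calls several times a second it is probably only practical while hunting the problem.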
 
FutureScalper

I'm sorry I neglected to make it clear that EACH process is both a
client and a server.

In other words, it can look up itself using RMI, as well as looking up
other clients.

Each process exposes the same interface both to itself, and to other
clients.

I know, sounds a little weird, but it's a trading application which
has to query both its own state, and potentially the state of other
apps, which analyze other futures contracts. Works perfectly, except
when it doesn't, which is quite RARE.

So part of each process, if you like, is a client, and the other part
is a server.

Thanks to all who made suggestions. I'm following up.
 
FutureScalper

Another thing, is that there is NO FIREWALL being used.

Sorry I am unable to reproduce this problem at will. I've enabled
line number tracebacks in my clients so that if/when this happens
again I'll get precise info on just exactly where it failed.

For performance reasons I usually do not run with debug, as this thing
does dozens of rmi queries as a group, about 6 times per second both
locally and remotely.

So it's pretty fast, and so far quite reliable until I get this
connection refused issue in one of the clients and I can't figure out
how to help it recover.
 
FutureScalper

Here's what it looks like, and each stack trace (sorry no line
numbers) can be seen to be calling a different method on the
interface. Once I get this, I'm not sure what to do to clear the
problem.

I have a watchdog which unbinds/rebinds the server implementation to
the rmiregistry periodically, and also, I am calling that TCPEndpoint
static method

sun.rmi.transport.tcp.TCPEndpoint
public static void shedConnectionCaches()

hoping to "harvest" or recycle whatever may lurk in the connection
caches :)

I don't believe it helps at all, and I call it prior to each
periodic rebind. So I expect it to fail until the next watchdog
rebind, and then to clear itself... but, alas, it doesn't.



ERR: 10.05.06 12:36:02.369: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source) <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:02.853: MainFrame focus LOST.
ERR: 10.05.06 12:36:03.380: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown Source) <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:04.013: FutureScalperBookChart average elapsed (msecs) is: 1.7
OUT: 10.05.06 12:36:04.368: ## ChartFiller run [10m] 17(0) msec (ACTIVE)
ERR: 10.05.06 12:36:04.388: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getBidRunLengthFast(Unknown Source) <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

OUT: 10.05.06 12:36:04.388: displayXTMessage:---fail: B- notAboveTrigger
OUT: 10.05.06 12:36:04.388: displayXTMessage:*** BUY checks per sec: 0.3 <-- normally 6.0 per second
OUT: 10.05.06 12:36:04.594: UnifiedInventory avg(50) elapsed: 1.1 (slowed down due to exception processing, etc.)
ERR: 10.05.06 12:36:05.554: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source) <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more

ERR: 10.05.06 12:36:06.554: java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
java.rmi.ConnectException: Connection refused to host: 127.0.0.1; nested exception is:
    java.net.ConnectException: Connection refused: connect
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(Unknown Source)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(Unknown Source)
    at sun.rmi.server.UnicastRef.invoke(Unknown Source)
    at com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown Source) <---
    at com.twc.trader.TradeEntryManager$AutoTrader.run(Unknown Source)
Caused by: java.net.ConnectException: Connection refused: connect
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.PlainSocketImpl.doConnect(Unknown Source)
    at java.net.PlainSocketImpl.connectToAddress(Unknown Source)
    at java.net.PlainSocketImpl.connect(Unknown Source)
    at java.net.SocksSocketImpl.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.connect(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at java.net.Socket.<init>(Unknown Source)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(Unknown Source)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(Unknown Source)
    ... 6 more
 
EJP

> I am calling that TCPEndpoint
> static method sun.rmi.transport.tcp.TCPEndpoint
> public static void shedConnectionCaches() hoping to "harvest" or
> recycle whatever may lurk in the connection caches :)

Don't do that.

> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source) <---
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown Source)
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getBidRunLengthFast(Unknown Source)
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getLastAct(Unknown Source)
> com.twc.remote.RemoteIndicatorServiceImpl_Stub.getActFastMacdTrend(Unknown Source)

None of these has anything to do with the Registry so I don't know why
you thought restarting the Registry would do anything. They are all in
calls to the *same* remote object, RemoteIndicatorService. So is that
one doing something odd? like deadlocking itself?

BTW are all these objects exported on the same port? They should be.
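To make the distinction between the Registry's port and the remote object's own port concrete, here is a minimal sketch of exporting an object on an explicit port and binding it. Everything in it is illustrative: `IndicatorService` is a one-method stand-in for the thread's RemoteIndicatorService, the value 42 is a placeholder, and the port numbers are up to the caller. Only the name "YMServer" and the hostname property come from the thread.

```java
import java.rmi.NoSuchObjectException;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class FixedPortExport {

    /** Hypothetical stand-in for the thread's RemoteIndicatorService interface. */
    public interface IndicatorService extends Remote {
        int getLastAct() throws RemoteException;
    }

    public static class IndicatorServiceImpl implements IndicatorService {
        @Override
        public int getLastAct() {
            return 42; // placeholder value
        }
    }

    /** Exports on objectPort, binds in an in-process registry, then calls through the stub. */
    static int roundTrip(int registryPort, int objectPort) throws Exception {
        // Same setting as the Web Start <property> quoted in the thread; set before RMI is used.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        IndicatorServiceImpl impl = new IndicatorServiceImpl();
        Registry registry = LocateRegistry.createRegistry(registryPort);
        try {
            // The stub carries host + objectPort; remote calls connect there, not to the registry.
            IndicatorService stub =
                    (IndicatorService) UnicastRemoteObject.exportObject(impl, objectPort);
            registry.rebind("YMServer", stub);

            IndicatorService found = (IndicatorService) registry.lookup("YMServer");
            // The printed form is implementation-specific but normally shows the endpoint.
            System.out.println(found);
            return found.getLastAct();
        } finally {
            try {
                UnicastRemoteObject.unexportObject(impl, true);
            } catch (NoSuchObjectException ignored) {
                // export never happened; nothing to clean up
            }
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }
}
```

Passing the same port number to every `exportObject` call in a process is one way to have all of its remote objects share a single listening port, which also makes it clear which port to inspect when calls start being refused.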
 
FutureScalper

I thank you for the suggestions.

No, concurrency is not an issue, and no deadlocks are taking place as
far as I know; this whole thing is highly thread-tolerant.

I have enough suggestions to work on and I'm also not reproducing the
issue myself since last post.

I'll try and break it again by having a couple of them heavily cross-
referencing each other.

Appreciate your help, will post resolution if I find one.
 
FutureScalper

Thanks again for suggestions. To avoid possible deadlock, I've just
made the RMI server implementation single-threaded even though it's
read only stuff and should not require synchronization.

Don't think that's the issue, but just as a sanity check.

I've experienced the issue once since last post under fairly heavy
usage so it's hard for me to reproduce.
 
FutureScalper

I've been thinking about this issue, and I suspect that the RMI socket
reader thread may be crashing for some unknown reason. I don't have
any direct evidence of this right now, but I smell something like that
happening.

So, I'll look into how I can determine that, and possibly implement my
own more reliable RMI reader thread if that's the issue. Can't think
of anything else that would cause this situation, other than the
socket reader thread dying within my client process (acting as a
server).
 
Esmond Pitt

> I've been thinking about this issue, and I suspect that the RMI socket
> reader thread may be crashing for some unknown reason. I don't have
> any direct evidence of this right now, but I smell something like that
> happening.

*I* smell the remote object deadlocking itself so that nothing can
proceed. You don't have any evidence about the RMI reader thread
crashing and I have personally never seen it in 13 years of RMI. And why
would it only crash for one specific remote object? Obviously the
problem is associated with that object, not the RMI system. You're
barking up the wrong tree.

> Can't think of anything else that would cause this situation

I had already made the suggestion above.
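Both hypotheses (a deadlocked remote object, or a vanished RMI thread) can be checked from inside the affected process the next time it fails. The sketch below uses the standard java.lang.management API. The class and method names are hypothetical, and the "RMI TCP Accept" thread-name prefix in `main` is an assumption about how the Sun/Oracle runtime names its listener threads, so it should be confirmed against a thread dump of a healthy process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical diagnostic helper, e.g. to call from the existing watchdog. */
class RmiThreadCheck {

    /** Names of threads the JVM reports as deadlocked; empty if none are. */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long[] ids = threads.findDeadlockedThreads(); // null when no deadlock is detected
        List<String> names = new ArrayList<String>();
        if (ids != null) {
            for (ThreadInfo info : threads.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    /** Names of live threads whose name starts with the given prefix. */
    static List<String> liveThreadsWithPrefix(String prefix) {
        List<String> names = new ArrayList<String>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().startsWith(prefix)) {
                names.add(t.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked: " + deadlockedThreadNames());
        // Assumed naming convention for RMI listener threads; verify on your JVM.
        System.out.println("accept threads: " + liveThreadsWithPrefix("RMI TCP Accept"));
    }
}
```

Logging both lists when the first ConnectException appears would give direct evidence either way. A full thread dump of the failing process, for instance with the JDK's `jstack` tool, would show the same information without any code changes.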
 
