Help needed: Weird Socket/RMI issue

daniel.tietze

Hi.

I'm having a potential problem at a customer site which
has me completely baffled and with which I could do with
some help.

I've written and deployed at the customer site a client/
server application, developed using RMI. The problem occurs
on the server side.

I've implemented my own SocketFactory and am restricting
the sockets to ports 8090 and 8091 - through the SocketFactory
and through the constructor of java.rmi.server.UnicastRemoteObject.
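
For readers who have not done this before, here is a minimal sketch of what such a port-restricting factory might look like. It is not the code deployed at the customer site: the class name `FixedPortSocketFactory`, the helper `choosePort` and the choice of 8091 as the fallback port are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.rmi.server.RMISocketFactory;

/** Hypothetical factory that maps anonymous-port requests onto one fixed port. */
class FixedPortSocketFactory extends RMISocketFactory {
    private final int fallbackPort;

    FixedPortSocketFactory(int fallbackPort) {
        this.fallbackPort = fallbackPort;
    }

    /** RMI passes 0 when it wants an anonymous port; substitute the fixed one. */
    static int choosePort(int requested, int fallbackPort) {
        return requested == 0 ? fallbackPort : requested;
    }

    @Override
    public ServerSocket createServerSocket(int port) throws IOException {
        // Binds to the wildcard address, which shows up as 0.0.0.0:<port> in netstat.
        return new ServerSocket(choosePort(port, fallbackPort));
    }

    @Override
    public Socket createSocket(String host, int port) throws IOException {
        // Outbound connections are left unrestricted in this sketch.
        return new Socket(host, port);
    }
}
```

A server would typically install it once at startup with `RMISocketFactory.setSocketFactory(new FixedPortSocketFactory(8091))` and pass an explicit port to the `UnicastRemoteObject(int port)` constructor of each exported object.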

The server is waiting for incoming connections. The DGC
regularly does maintenance and garbage collection, triggering
my SocketFactory. Server machine is Windows 2000. The server
machine has local server ports 8090 and 8091 in LISTENING
state (on 0.0.0.0:8090 and 0.0.0.0:8091).

The server, machine A, is waiting for incoming connections from
the client application, on machine B, which is not running. On the
same network segment are machines C, D and E (all Linux machines),
on which NONE of my software is running, and which should have
NOTHING to do with my system. These machines are communicating
amongst themselves, but not with machine A.

The customer claims (I'm not on-site, so I can't check), that
after a random time of running my server application 24/7 (the
first time after about a week, the second time after only 3 hours),
machines C, D and E were no longer able to exchange data. As soon
as my server application on machine A is aborted, systems C,
D and E are again able to exchange data - immediately.

What I don't understand is:
How is it possible, that an RMI server application, which
has opened sockets, accept()s them, has the open sockets
in LISTENING state, but is not actively sending data,
IN ANY WAY influences the traffic on the rest of the subnet?

Do opened Sockets in the accept() state send ANY form of
traffic over the network?

Is there any way I can influence other machines on the
network, simply by opening a bunch of inbound sockets? To
the best of my understanding of networks and TCP/IP there
should be no way to do that. Or am I wrong?

Again, I'd appreciate any help or pointer or even wild
theory related to this problem.

Thanks,

Daniel
 
Chris Uppal

Is there any way I can influence other machines on the
network, simply by opening a bunch of inbound sockets? To
the best of my understanding of networks and TCP/IP there
should be no way to do that. Or am I wrong?

Shouldn't be, I think. Assuming that the machine "knows" its own IP address
correctly, and especially if the other machines it interferes with are not
using ports 809{01}.

Again, I'd appreciate any help or pointer or even wild
theory related to this problem.

I'd get a network sniffer such as:

http://www.ethereal.com

onto the server machine pronto. The server /must/ be sending some sort of data
onto the network (unless you've got poltergeists, or a bad case of
synchronicity) to affect the other machines, and finding out exactly /what/
that data is that it's sending/receiving looks like the best way to diagnose
this.

That might not be too easy, given that it's a remote site. But then if the
customer's running a heterogeneous LAN, they may have someone who is already
skilled-up to tell you what's happening at the network level.

-- chris
 
John C. Bollinger

What I don't understand is:
How is it possible, that an RMI server application, which
has opened sockets, accept()s them, has the open sockets
in LISTENING state, but is not actively sending data,
IN ANY WAY influences the traffic on the rest of the subnet?

It is not possible that an application restricted only to the behavior
you describe has any effect on remote machines. Do note, however, that
if other machines actually connect to the application, intentionally or
unintentionally, then the application will not be restricted to only
those behaviors.

Do opened Sockets in the accept() state send ANY form of
traffic over the network?

You should be thinking at the TCP/IP level for this, in which context
the socket state is LISTENING. Sockets in that state do not send data
over the network, but they do change state when they receive a (real or
apparent) connection request.
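
A small loopback experiment can make this concrete. The sketch below is illustrative only (the class name `ListenDemo` and the helper `acceptOnce` are made up here): an idle listening socket simply times out in accept(), and only an incoming connection request makes accept() return.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

class ListenDemo {
    /** Waits up to timeoutMs for one connection; returns true if one arrived. */
    static boolean acceptOnce(ServerSocket server, int timeoutMs) throws IOException {
        server.setSoTimeout(timeoutMs);
        try (Socket peer = server.accept()) {
            System.out.println("Connection from " + peer.getRemoteSocketAddress());
            return true;
        } catch (SocketTimeoutException e) {
            // Nobody connected: the socket stayed in LISTENING the whole time.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket server = new ServerSocket(0, 50, loopback)) {
            System.out.println("idle: " + acceptOnce(server, 200));
            // A client connecting is what triggers any exchange of packets.
            try (Socket client = new Socket(loopback, server.getLocalPort())) {
                System.out.println("after connect: " + acceptOnce(server, 200));
            }
        }
    }
}
```

The first call returns false and the second returns true. This does not prove that nothing else on the host is transmitting, only that the listening socket itself is passive until someone connects to it.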
Is there any way I can influence other machines on the
network, simply by opening a bunch of inbound sockets? To
the best of my understanding of networks and TCP/IP there
should be no way to do that. Or am I wrong?

You are right as far as you go, but you greatly underestimate the
possibilities. For instance, setting up a listening socket opens up the
possibility that the affected machines will connect to that socket.
They might even make a lot of connections, which could indeed have
effects similar to the client's description. Such a scenario would
probably reflect a flaw in the software running on the affected
machines, but that might just be tough luck for you.

It is also possible that your application itself is not the problem, but
that it is something else on the machine your application runs on (which
may be new to the network, or at the least has been reconfigured to
run your application).

You really need more data to come to any conclusion. Does the client
have any competent networking people? A Linux admin worth mentioning?
It is important to get much more detail on _why_ the Linux machines'
communications freeze up. The first thing I would do is to put a
monitor on those machines' open connections; that might be as simple as
running a suitable netstat every X minutes and capturing the results.
Ideally, you would do something similar on your application's host as
well. You might also start asking for details about what is running on
those systems and how they use the network.
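
If nobody on site wants to script this in the shell, the periodic netstat capture could be sketched in Java roughly as follows. The class name `NetstatLogger`, the five-minute interval and the choice to filter on ports 8090/8091 are assumptions; the only external command used is `netstat -an`, which exists on both Windows and Linux.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

class NetstatLogger {
    /** Keeps only the lines that mention one of the given ports as ":port". */
    static List<String> linesForPorts(List<String> lines, int... ports) {
        List<String> hits = new ArrayList<>();
        for (String line : lines) {
            for (int port : ports) {
                // \b stops ":8090" from also matching ":80901".
                if (line.matches(".*:" + port + "\\b.*")) {
                    hits.add(line.trim());
                    break;
                }
            }
        }
        return hits;
    }

    /** Runs "netstat -an" once and returns its output line by line. */
    static List<String> runNetstat() throws IOException {
        Process p = new ProcessBuilder("netstat", "-an").redirectErrorStream(true).start();
        List<String> out = new ArrayList<>();
        try (BufferedReader r = new BufferedReader(new InputStreamReader(p.getInputStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                out.add(line);
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        long intervalMs = 5 * 60 * 1000L; // assumed interval; tune as needed
        while (true) {
            System.out.println("--- " + LocalDateTime.now());
            linesForPorts(runNetstat(), 8090, 8091).forEach(System.out::println);
            Thread.sleep(intervalMs);
        }
    }
}
```

Redirecting the output to a file gives a timestamped history that can be lined up against the moment the Linux machines stop talking to each other.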
 
marcus

Welcome to windows networking. If you did not hate M$ before, you will
soon.
In the past the Windows OS intentionally abused non-M$ brand software;
M$ intentionally rewrote language specs, etc. Chances are what you are
experiencing is nothing more intentional than the nasty programming
habits of entrenched engineers who only care that Exchange server runs
properly.
Why would any non-M$ app run on a windows network?
My guess is some security protocol installed during an upgrade from
win95 to NT is blacklisting your machine and shutting down network
access. DoS is certainly a possibility as a fourth-party app sniffs
relentlessly at your open ports.

Advice? Accuse your client's employees of installing backdoors and
surfing porn, and learn .NET. (kidding!)
 
Nigel Wade

Hi.

I'm having a potential problem at a customer site which
has me completely baffled and with which I could do with
some help.

I've written and deployed at the customer site a client/
server application, developed using RMI. The problem occurs
on the server side.

I've implemented my own SocketFactory and am restricting
the sockets to ports 8090 and 8091 - through the SocketFactory
and through the constructor of java.rmi.server.UnicastRemoteObject.

The server is waiting for incoming connections. The DGC
regularly does maintenance and garbage collection, triggering
my SocketFactory. Server machine is Windows 2000. The server
machine has local server ports 8090 and 8091 in LISTENING
state (on 0.0.0.0:8090 and 0.0.0.0:8091).

The server, machine A, is waiting for incoming connections from
the client application, on machine B, which is not running. On the
same network segment are machines C, D and E (all Linux machines),
on which NONE of my software is running, and which should have
NOTHING to do with my system. These machines are communicating
amongst themselves, but not with machine A.

I'd want a network sniffer to verify that. I would not assume that just
because they are not *meant* to be communicating with A that they are not.

The customer claims (I'm not on-site, so I can't check), that
after a random time of running my server application 24/7 (the
first time after about a week, the second time after only 3 hours),
machines C, D and E were no longer able to exchange data. As soon
as my server application on machine A is aborted, systems C,
D and E are again able to exchange data - immediately.

You need to have clarified exactly what the customer means by "no longer
able to exchange data". As stated it doesn't mean much at all.

What I don't understand is:
How is it possible, that an RMI server application, which
has opened sockets, accept()s them, has the open sockets
in LISTENING state, but is not actively sending data,
IN ANY WAY influences the traffic on the rest of the subnet?

Do opened Sockets in the accept() state send ANY form of
traffic over the network?

No.

Is there any way I can influence other machines on the
network, simply by opening a bunch of inbound sockets? To
the best of my understanding of networks and TCP/IP there
should be no way to do that. Or am I wrong?

My immediate response, in light of the lack of any other information, would
be to check whether any of the machines have the same IP address. This
might be due to misconfiguration, or a faulty DHCP client or server
implementation. If A has the same IP as one of C,D or E then opening those
ports would mean that there were, potentially, two systems listening on the
same IP:port. Although I would expect all sorts of other problems, and
they most likely would not go away when your software was stopped.

Dump the arp tables on the Linux systems and see what they consider to be
the MAC address associated with each of the IP addresses of A,C,D and E.
Also, monitor the network traffic on the Linux boxes to find out what is
going on on port 8090 and 8091.
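
Comparing the dumps by eye gets tedious, so here is a rough sketch of how the comparison could be automated. It assumes the `arp -a` output from several machines has been collected into one list of lines; the class name `ArpCheck` and its methods are invented for this example, and the regular expressions only look for an IPv4 address and a MAC address on each line rather than parsing any particular platform's format.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ArpCheck {
    private static final Pattern IP =
        Pattern.compile("\\b(\\d{1,3}(?:\\.\\d{1,3}){3})\\b");
    private static final Pattern MAC =
        Pattern.compile("\\b([0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5})\\b");

    /** Maps each IP address to every MAC address it was seen with. */
    static Map<String, Set<String>> macsByIp(List<String> arpLines) {
        Map<String, Set<String>> result = new TreeMap<>();
        for (String line : arpLines) {
            Matcher ip = IP.matcher(line);
            Matcher mac = MAC.matcher(line);
            if (ip.find() && mac.find()) {
                // Normalise Windows-style 00-11-... to 00:11:... in lower case.
                String normalised = mac.group(1).toLowerCase().replace('-', ':');
                result.computeIfAbsent(ip.group(1), k -> new TreeSet<>()).add(normalised);
            }
        }
        return result;
    }

    /** Returns the IPs that resolved to more than one MAC address. */
    static List<String> conflicts(List<String> arpLines) {
        List<String> suspicious = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : macsByIp(arpLines).entrySet()) {
            if (e.getValue().size() > 1) {
                suspicious.add(e.getKey());
            }
        }
        return suspicious;
    }
}
```

Any IP reported by `conflicts` is being claimed by more than one network card, which is what a duplicate address would look like from the outside.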
 
Steve Horsley

Hi.

I'm having a potential problem at a customer site which
has me completely baffled and with which I could do with
some help.
<snip>

I have to agree with the other respondents. There is something
going on at a level other than the level you are thinking on.
This could be anything from the customer getting completely
the wrong end of the stick (your server has nothing to do
with their problem) to strange virus activity from your box
to something weird at the MAC / ARP or default IP route
level.

So I suggest some things:

1) find out exactly how they disable your server - kill the
process, pull out the LAN cable etc.

2) Get Ethereal on your box and take a trace while they
are having trouble - see if there is ANY communication
between yours and theirs.

3) Find out what their communication problem is: Can they
ping each other? If so (sounds mad but do it anyway), do a
traceroute between their machines.

4) Check the IP addresses of all machines (ipconfig and
ifconfig).

5) Check the ARP caches for consistency ("arp -a").

Of these, number 2 will probably tell you most, and reveal
possible completely unexpected happenings.

Steve
 
daniel.tietze

Hi.

Nigel said:
I'd want a network sniffer to verify that. I would not assume that just
because they are not *meant* to be communicating with A that they are not.
[...]
My immediate response, in light of the lack of any other information, would
be to check whether any of the machines have the same IP address.

That's exactly what I suspected as well. The problems
that this might cause (which we're probably all
familiar with) might not be cropping up because
one set of servers on the subnet is Linux and one is
Windows. If they share nothing else over the network,
and aren't running DHCP, maybe the duplicate IP addresses
could go undetected. Even more so if folks don't regularly
check /var/log/messages.

So far, I have not yet heard back from them.

Steve said:
1) find out exactly how they disable your server - kill
the process, pull out the LAN cable etc.

They just kill the process and apparently everything goes
back to normal. This seems to support that one of their machines
is talking to my process, even though it shouldn't be (or my process
is grabbing their response packets, since they're listening on the
same address/port). And if my process (port) goes away, the packets
then reach whoever they're supposed to reach.
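
One cheap way to test that theory from inside the application would be to log every accepted connection. The following is only a sketch: `LoggingServerSocket` is a made-up name, and it would have to be returned from the SocketFactory's createServerSocket() in place of a plain ServerSocket.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.time.Instant;

/** Hypothetical ServerSocket that records the remote end of every connection. */
class LoggingServerSocket extends ServerSocket {
    LoggingServerSocket(int port) throws IOException {
        super(port);
    }

    @Override
    public Socket accept() throws IOException {
        Socket s = super.accept();
        // The remote address shows who is really talking to the server.
        System.err.println(Instant.now() + " accepted " + s.getRemoteSocketAddress()
            + " on local port " + getLocalPort());
        return s;
    }
}
```

If addresses belonging to C, D or E ever show up in that log, the "not communicating with machine A" assumption is wrong. If the log stays empty while the problem occurs, the cause is more likely below the TCP level.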


I appreciate ALL of your inputs - thanks, guys! You confirmed
my suspicions so at least I have a "second opinion" to back
me up (and a third, and a fourth .... :) ).

Daniel
 
John C. Bollinger

That's exactly what I suspected as well. The problems
that this might cause (which we're probably all
familiar with) might not be cropping up because
one set of servers on the subnet is Linux and one is
Windows.

I'm sorry, I didn't realize that you were seriously considering the idea
that simply having Windows and Linux machines together on the same
subnet or LAN was the problem. The idea is complete hogwash. Linux and
Windows machines coexist nicely on the same physical LAN in the same
logical subnet, along with Solaris machines, MacOS 9 and X machines, AIX
machines, IRIX machines, network devices of various makes, etc.. Linux
machines in fact make fairly good file servers for Windows clients
(better than Windows servers, some have claimed) and can even be used as
domain controllers for Windows NT domains.

It is certainly possible to set up a machine, Linux or otherwise, on an
existing network, and to thereby disrupt the network. That's a question
of the machine configuration, its processes and services and their
configurations, and also to some extent a question of the behavior and
configuration of the other systems on the network as well.
 
