TCP client application does not detect network failure

Olivier Merigon

Hi,

There is a behavior of TCP sockets in Java that I don't understand.
Let's say we have a simple client/server application running on TWO different
machines.
The client sends bytes to the server.
The server receives the bytes and waits for about X seconds.
During this waiting time, we disconnect the server from the network (just by
unplugging the server's network cable).
After the waiting time, the server becomes aware of the network failure (a
SocketException is thrown: "connection reset", because it is trying to send
the response).
But the client is still stuck on the "rcv = commandInput.readLine();"
statement (see below for the complete code); it never becomes aware of the
network failure! Even after one hour the client is still waiting to read
something on a closed socket. Is this the normal behavior?

In reality we are dealing with an application that uses FTP in
server-to-server mode (we control only the command sockets; the data transfer
is done between the servers). We have to transfer huge files, so we cannot
set up a timeout. If the scenario above occurs, some of our transfers get
stuck and it is not possible for us to detect the failure. The client will
wait forever for the response.

Has anybody already dealt with this?

Below are the simple client and server I wrote for my experiment.
The client just reads the keyboard input and sends it to the server; the
server answers each request immediately (with the request number), unless the
request is 'wait X', in which case it waits for X seconds before responding.

Thanks in advance,

Olivier MERIGON


CLIENT:
----------------------------------------------------------------------------

package com.iratensolutions.test.ftp;

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.net.Socket;

/**
 * A Client/Server application to test the network failure behavior with TCP.
 * This is the client part. It just sends the keyboard input to the server.
 * @author Olivier MERIGON
 */
public class TestNetworkFailureClient {
    public static void main(String[] args) {
        if (args.length != 1) {
            System.out.println("usage: java com.iratensolutions.test.ftp.TestNetworkFailureClient serverAddress");
            return; // without a server address, args[0] below would fail
        }
        Socket commandSocket = null;
        BufferedReader commandInput;
        BufferedReader keyboardInput;
        PrintWriter commandOutput;
        try {
            commandSocket = new Socket(args[0], 666);
            commandInput = new BufferedReader(new InputStreamReader(commandSocket.getInputStream()));
            keyboardInput = new BufferedReader(new InputStreamReader(System.in));
            commandOutput = new PrintWriter(new OutputStreamWriter(commandSocket.getOutputStream()));
        } catch (Exception e) {
            e.printStackTrace();
            return;
        }
        System.out.println("CLIENT STARTED");
        String snd;
        String rcv = null;
        try {
            do {
                System.out.print("KEYBOARD: ");
                snd = keyboardInput.readLine();
                if (snd != null) {
                    commandOutput.println(snd);
                    commandOutput.flush();
                    System.out.println("SND: " + snd);
                    // Blocks here until the server answers or the connection is reported broken
                    rcv = commandInput.readLine();
                    System.out.println("RCV: " + rcv);
                }
            } while (snd != null && rcv != null); // stop on keyboard EOF or closed connection
        } catch (IOException e1) {
            e1.printStackTrace();
        }
    }
}

----------------------------------------------------------------------------

SERVER:
----------------------------------------------------------------------------

package com.iratensolutions.test.ftp;

import java.io.DataInputStream;
import java.io.IOException;
import java.io.PrintStream;
import java.net.ServerSocket;
import java.net.Socket;

/**
 * A Client/Server application to test the network failure behavior with TCP.
 * This is the server part. The server can wait X seconds before responding to a request.
 * If a wait request is received, the server just answers with the time waited, after
 * the requested wait time; otherwise it responds immediately with the request number.
 * usage: wait intNbSec | any text
 * @author Olivier MERIGON
 */
public class TestNetworkFailureServer {
    public static void main(String args[]) {
        ServerSocket echoServer = null;
        String line;
        DataInputStream is;
        PrintStream os;
        Socket clientSocket = null;
        try {
            echoServer = new ServerSocket(666);
            System.out.println("SERVER STARTED");
        } catch (IOException e) {
            System.out.println(e);
            return; // cannot accept clients without a listening socket
        }
        while (true) {
            try {
                clientSocket = echoServer.accept();
                System.out.println("Handling new client...");
                is = new DataInputStream(clientSocket.getInputStream());
                os = new PrintStream(clientSocket.getOutputStream());
                int loopId = 0;
                do {
                    line = is.readLine();
                    System.out.println("RCV: " + line);
                    if (line == null) {
                        break; // the client closed the connection
                    }
                    if (line.trim().startsWith("wait")) {
                        // Handle "wait" command
                        String[] tab = line.split("\\s");
                        boolean ok = false;
                        try {
                            int nbSec = Integer.parseInt(tab[1]);
                            System.out.println("...waiting for " + nbSec + " sec.");
                            try {
                                Thread.sleep(nbSec * 1000);
                            } catch (InterruptedException e1) {
                            }
                            String resp = "...end of waiting period of " + nbSec + " sec.";
                            os.println(resp);
                            System.out.println("SND: " + resp);
                            ok = true;
                        } catch (Exception e) {
                            ok = false;
                        }
                        if (!ok) {
                            String resp = "usage: wait intNbSec | any text";
                            os.println(resp);
                            System.out.println("SND: " + resp);
                        }
                    } else {
                        // Handle "normal" action
                        os.println("# " + loopId);
                        System.out.println("SND: # " + loopId);
                    }
                    loopId++;
                } while (line != null);
            } catch (IOException e) {
                System.out.println(e);
            }
        }
    }
}
 
Steve Horsley

Olivier said:
Hi,

There is a behavior of TCP sockets in Java that I don't understand.
Let's say we have a simple client/server application running on TWO different
machines.
The client sends bytes to the server.
The server receives the bytes and waits for about X seconds.
During this waiting time, we disconnect the server from the network (just by
unplugging the server's network cable).
After the waiting time, the server becomes aware of the network failure (a
SocketException is thrown: "connection reset", because it is trying to send
the response).
But the client is still stuck on the "rcv = commandInput.readLine();"
statement (see below for the complete code); it never becomes aware of the
network failure! Even after one hour the client is still waiting to read
something on a closed socket. Is this the normal behavior?

Yes.

Your client is patiently waiting for something that YOU know will not arrive,
but IT does NOT know. The normal approach is to implement timeouts. I know
you say you are sending big files and therefore you can't use timeouts, but
I don't see why you can't reset a timer every time a chunk of the file
arrives, rather than timing the whole file.
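
A minimal sketch of the "timer that restarts on every chunk" idea, assuming a plain blocking java.net.Socket. Socket.setSoTimeout() bounds each individual read rather than the whole exchange, so the clock effectively restarts whenever data arrives. The class name IdleTimeoutReader and its method are illustrative and not part of the code posted above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.net.Socket;

public class IdleTimeoutReader {
    /**
     * Reads one line from the socket's reader, giving up if no data at all
     * arrives for idleMillis. Returns null if the peer closed the connection.
     * A java.net.SocketTimeoutException (a subclass of IOException) is thrown
     * when the idle limit is reached; the socket itself stays usable.
     */
    public static String readLineOrTimeout(Socket socket, BufferedReader in, int idleMillis)
            throws IOException {
        socket.setSoTimeout(idleMillis); // applies to each blocking read, not to the whole transfer
        return in.readLine();
    }
}
```

In the client above, replacing the bare commandInput.readLine() with a call like this would turn the endless wait into an exception the application can act on, for example by probing the server or aborting the transfer.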

You can also create an unconnected Socket and call (IIRC) setKeepAlive(true)
before connecting, to enable TCP-level keepalives that would detect a network
failure after a while.
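
A short sketch of enabling keepalives, assuming the standard java.net.Socket API. The KeepAliveSocket class and its open() helper are hypothetical names. Note that the idle time before the first probe is an operating system setting, commonly around two hours by default, so on its own this may not report a failure within the one hour mentioned in the question.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class KeepAliveSocket {
    /** Opens a connection with TCP keepalive probes (SO_KEEPALIVE) enabled. */
    public static Socket open(String host, int port, int connectTimeoutMillis) throws IOException {
        Socket socket = new Socket();       // unconnected socket
        try {
            socket.setKeepAlive(true);      // ask the OS to probe the connection when it is idle
            socket.connect(new InetSocketAddress(host, port), connectTimeoutMillis);
            return socket;
        } catch (IOException e) {
            socket.close();                 // do not leak the socket if connecting fails
            throw e;
        }
    }
}
```

Once the probes go unanswered, a read blocked on this socket should fail with an exception instead of waiting forever.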

Alternatively, you could add a No-Op message to your client-server protocol
and have the client periodically send a no-op message that will be ignored
by the server. This will detect network failures because the TCP layer
would not get TCP-level acknowledgments for the new data you send. I know it
sounds wacky, but sending TCP data that will be ignored by the server WILL
detect the network failures that you are looking for.
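
One way this could look, as a sketch only: a background task writes a no-op line to the socket's OutputStream at a fixed period. It assumes the server ignores (or harmlessly answers) the line "NOOP", which is an assumption about the protocol; NoOpSender and onFailure are illustrative names. It writes to the raw OutputStream because PrintWriter swallows IOExceptions. The first write after a cable pull usually still succeeds, since it only fills the local send buffer; the error tends to surface on a later write, once TCP gives up retransmitting.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class NoOpSender {
    /** Writes one no-op line; an IOException here means the connection is broken. */
    public static void sendNoOp(OutputStream out) throws IOException {
        synchronized (out) { // avoid interleaving with other writers that lock on the same stream
            out.write("NOOP\r\n".getBytes(StandardCharsets.US_ASCII));
            out.flush();
        }
    }

    /** Sends a no-op every periodSeconds and runs onFailure once when a write fails. */
    public static ScheduledExecutorService start(OutputStream out, long periodSeconds, Runnable onFailure) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                sendNoOp(out);
            } catch (IOException e) {
                onFailure.run();      // e.g. close the socket to unblock the reading thread
                scheduler.shutdown(); // stop sending after the first failure
            }
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        return scheduler;
    }
}
```

A typical onFailure would close the command socket, which makes the thread blocked in readLine() fail with an exception.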

HTH
Steve
 
iksrazal

Olivier Merigon said:
In reality we are dealing with an application that uses FTP in
server-to-server mode (we control only the command sockets; the data transfer
is done between the servers). We have to transfer huge files, so we cannot
set up a timeout. If the scenario above occurs, some of our transfers get
stuck and it is not possible for us to detect the failure. The client will
wait forever for the response.

Has anybody already dealt with this?

The best solution I have seen for the "client waits forever" problem
is the heartbeat pattern, as I know it from Wiley Java Design Patterns
Vol 3. That implementation is RMI, but the idea is that the server
sends a message to the client, which is listening for "still alive"
messages. So you might be able to have your command socket, which
seems to be the choke point for all of your calls, cancel the
requests by calling a thread interrupt or something. What I typically
do when I don't have control of the server, which is more often than
not, is open and close a connection once a minute, blocking calls
until the server comes back up.
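
The "open and close a connection once a minute" probe could be sketched as follows, assuming a server that accepts TCP connections on a known port. ServerProbe and isReachable are hypothetical names, and the timeout value is up to the caller.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class ServerProbe {
    /** Returns true if a TCP connection to host:port can be opened within timeoutMillis. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket probe = new Socket()) {
            probe.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;  // connected; the socket is closed again by try-with-resources
        } catch (IOException e) {
            return false; // refused, unreachable, or timed out
        }
    }
}
```

Called from a java.util.Timer once a minute, a false result could be used to abandon the requests that are still waiting on the command socket.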

You could also allow a timeout on activity: count the bytes being
received and, once the count stops incrementing, time out then. That's what I
would try if I were transferring 1 gig files or something. But in
general I use java.util.Timer liberally.
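
A sketch of such an activity timeout built on java.util.Timer. The class ActivityWatchdog and its method names are made up for illustration. The reading thread reports received bytes, and a timer task fires a callback once if the counter has not moved during a whole check interval.

```java
import java.util.Timer;
import java.util.TimerTask;
import java.util.concurrent.atomic.AtomicLong;

public class ActivityWatchdog {
    private final AtomicLong bytesSeen = new AtomicLong();
    private final Timer timer = new Timer("activity-watchdog", true); // daemon thread

    /** Call from the reading thread whenever bytes arrive. */
    public void recordBytes(long count) {
        bytesSeen.addAndGet(count);
    }

    /** Runs onStall once if the byte counter does not change during a full check interval. */
    public void start(long checkIntervalMillis, Runnable onStall) {
        timer.schedule(new TimerTask() {
            private long lastSeen = bytesSeen.get();

            @Override
            public void run() {
                long current = bytesSeen.get();
                if (current == lastSeen) {
                    timer.cancel();   // no progress since the last check: stop watching
                    onStall.run();
                } else {
                    lastSeen = current;
                }
            }
        }, checkIntervalMillis, checkIntervalMillis);
    }

    /** Stops the watchdog, for example after the transfer completed normally. */
    public void stop() {
        timer.cancel();
    }
}
```

If onStall closes the socket, a read blocked on it in another thread fails with a SocketException, which gets the waiting client unstuck.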

HTH

iksrazal
 
Juha Laiho

Olivier Merigon said:
There is a behavior of TCP sockets in Java that I don't understand:
(loss of connection is not seen until trying to send data)

Not specific to Java, it's part of how TCP works.

TCP is designed to be rather immune to intermittent faults, like some
network element being rebooted (or even replaced), as long as neither
end of the connection tries to send anything during the fault time.
Different packets belonging to the same connection may even take
different routes across the network, depending on traffic situations.

Two ways to get notified of these intermittent problems would be to
implement timeouts in the higher-level protocol, or to use TCP
keepalive packets. However, you also need to consider whether you really
need to keep the connection alive at all times (naturally you want
to keep it alive while you're waiting for a response to some command
you've given, but you should be able to come up with some reasonable
timeout within which the command must complete).
 
