SocketException: Connection reset

Discussion in 'Java' started by fb1800, Dec 8, 2010.

  1. fb1800


    Dec 8, 2010
    I am currently working on a server/client application running on a cluster.

    I am running about 100 clients in parallel and one server (32 clients per node, 8 processors per node, so 4 clients per processor). The server is connected to every client via sockets.

    After about 55 iterations (about 20 minutes), one client dies and I receive the following error:

    Client 126 Connection reset
    at java.io.ObjectInputStream$PeekInputStream.peek(
    at java.io.ObjectInputStream$BlockDataInputStream.peek(
    at java.io.ObjectInputStream$BlockDataInputStream.peekByte(
    at java.io.ObjectInputStream.readObject0(
    at java.io.ObjectInputStream.readObject(
    at ServerClient.Server$
    Connection reset

    The failing line is: ServerThread.sockIn.readObject();

    Do you have any idea what could cause the connection reset? The other clients are still running. It seems that the client socket throws the exception when we try to read a file that does not actually exist.

    My main question is: what methods are there to identify the problem with this specific socket? (Note that I am working on a cluster, so I access the nodes over SSH and do not have admin access.)
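
    One option that needs no admin access is to log the socket's endpoints and the exception type around the failing read. The sketch below is only an illustration of that idea: the class ResetDiagnostics and the methods describe and readLogged are hypothetical names, not part of the application, and it assumes the server holds a Socket and an ObjectInputStream per client.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.net.Socket;
import java.time.Instant;

// Hypothetical helper, not from the original application.
public class ResetDiagnostics {

    /** Builds a one-line description of the socket's endpoints and state. */
    static String describe(Socket s) {
        return "local=" + s.getLocalSocketAddress()
                + " remote=" + s.getRemoteSocketAddress()
                + " connected=" + s.isConnected()
                + " closed=" + s.isClosed();
    }

    /**
     * Reads one object from the stream. If the read fails (for example with
     * SocketException "Connection reset" or EOFException), logs a timestamp,
     * the client id, the exception type and the socket details, then rethrows.
     */
    static Object readLogged(int clientId, Socket socket, ObjectInputStream in)
            throws IOException, ClassNotFoundException {
        try {
            return in.readObject();
        } catch (IOException e) {
            System.err.println(Instant.now()
                    + " client " + clientId
                    + " read failed: " + e.getClass().getSimpleName()
                    + ": " + e.getMessage()
                    + " [" + describe(socket) + "]");
            throw e;
        }
    }
}
```

    With a matching timestamped log on the client side, the two logs can be compared to see which node the failing client ran on and whether the client process had already exited or closed its socket before the server saw the reset.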

    I don't understand why one client would suddenly die. Everything worked well for 20 minutes, and then one client died. Do you have any idea, or any advice on how to identify the cause of this socket reset?

    I thought about using Wireshark, but because the application runs on a cluster, it is difficult to capture and analyze the packets. I don't know whether I can do it remotely.

    Do you have any idea how I could debug this problem?

