Problem communicating with socket application

Pep

I am experiencing problems when trying to communicate with a TCP socket
based application that does not always append a <CR> at the end of the data
and so I cannot use readLine on a BufferedReader.

I have tried using a simple read in to a char array but have found that
where the application has sent 4 records using 4 separate socket writes, my
read operation has resulted in them all being read in to one array and I
have no means of determining where one record ends and the next begins,
unless there are <CR> appended to each record, which as I stated is not
always the case :(

Using a normal unix socket read operation in C++ I would not have this
problem as each read operation would result in one record.

How can I get java sockets to operate in a similar manner as unix socket
reads so that one record is obtained in each read operation, regardless of
whether it is appended with a <CR> or not?

TIA,
Pep.
 
Gordon Beaton

I am experiencing problems when trying to communicate with a TCP
socket based application that does not always append a <CR> at the
end of the data and so I cannot use readLine on a BufferedReader.

I have tried using a simple read in to a char array but have found
that where the application has sent 4 records using 4 separate
socket writes, my read operation has resulted in them all being read
in to one array and I have no means of determining where one record
ends and the next begins, unless there are <CR> appended to each
record, which as I stated is not always the case :(

Using a normal unix socket read operation in C++ I would not have
this problem as each read operation would result in one record.

No, you've just been lucky so far. Your C++ application is broken too,
but has been working "by accident". A subtle difference in timing may
make the difference.

How can I get java sockets to operate in a similar manner as unix
socket reads so that one record is obtained in each read operation,
regardless of whether it is appended with a <CR> or not?

TCP is a lowly byte stream, and it does not know anything about record
boundaries, nor does it make any attempts to preserve them. You may
find that multiple records are occasionally combined, and single
records are sometimes broken into two or more parts.

If you need delimited records you need to manage them yourself. The
easiest way is to insert delimiters (special characters like CR or
anything else that can't occur within a record) between the records as
you send them, so the recipient can determine where one record in the
stream ends and the next one begins.

Another way is to precede each record with a short header containing
the length of the record.

If you already know the length of each record in advance, simply read
the correct number of bytes from the stream each time.
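
The length-header and known-length approaches both come down to reading an exact number of bytes. A minimal sketch, assuming the sender can be changed to emit a 4-byte big-endian length before each record (the header format and the class name `LengthPrefixedRecords` are illustrative, not anything from the original application):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class LengthPrefixedRecords {
    // Sender side: a 4-byte big-endian length (assumed header format), then the payload.
    static void writeRecord(DataOutputStream out, String record) throws IOException {
        byte[] payload = record.getBytes(StandardCharsets.ISO_8859_1);
        out.writeInt(payload.length);
        out.write(payload);
        out.flush();
    }

    // Receiver side: read the header, then block until exactly that many bytes arrive.
    static String readRecord(DataInputStream in) throws IOException {
        int length = in.readInt();      // throws EOFException if the stream ends here
        byte[] payload = new byte[length];
        in.readFully(payload);          // keeps reading across short reads
        return new String(payload, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        // A byte array stands in for the socket so the sketch runs on its own.
        ByteArrayOutputStream wire = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(wire);
        writeRecord(out, "CCOK Q458000:Y:a:XXXX48");
        writeRecord(out, "CCOK E458000:Y:a:XXXX34");

        DataInputStream in = new DataInputStream(new ByteArrayInputStream(wire.toByteArray()));
        System.out.println(readRecord(in));
        System.out.println(readRecord(in));
    }
}
```

With a real socket the two streams would wrap `socket.getOutputStream()` and `socket.getInputStream()`. The point is that `readFully` returns only once the whole record is in, however TCP happened to split or merge it.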

Finally, maybe there is some other mechanism you can use in your
client to recognize the end of a record.

/gordon
 
Roedy Green

Another way is to precede each record with a short header containing
the length of the record.

If you already know the length of each record in advance, simply read
the correct number of bytes from the stream each time.

Finally, maybe there is some other mechanism you can use in your
client to recognize the end of a record.

and yet another way is an ObjectStream that deals with breaking the
stream up into objects for you. That won't work though when one end is
C++.
 
Pep

Gordon said:
No, you've just been lucky so far. Your C++ application is broken too,
but has been working "by accident". A subtle difference in timing may
make the difference.

I'm surprised I've been "lucky so far". I'm talking about an application
that services transactions from multiple clients under extreme load and it
has never missed a record yet. The records are passed using a normal socket
write operation so I know how the data is provided.

Still I won't argue with someone that knows better than me and by that I am
actually trying to be sincere not rude :)

TCP is a lowly byte stream, and it does not know anything about record
boundaries, nor does it make any attempts to preserve them. You may
find that multiple records are occasionally combined, and single
records are sometimes broken into two or more parts.

If you need delimited records you need to manage them yourself. The
easiest way is to insert delimiters (special characters like CR or
anything else that can't occur within a record) between the records as
you send them, so the recipient can determine where one record in the
stream ends and the next one begins.

Another way is to precede each record with a short header containing
the length of the record.

If you already know the length of each record in advance, simply read
the correct number of bytes from the stream each time.

Finally, maybe there is some other mechanism you can use in your
client to recognize the end of a record.

/gordon

Unfortunately I do not have control over the data being sent to me now and I
have now found from an Ethereal analysis that sometimes the records have a
<CR> ending and other times they do not.

I'll have to try simply reading a determined number of bytes :(

Cheers,
Pep
 
Pep

Roedy said:
and yet another way is an ObjectStream that deals with breaking the
stream up into objects for you. That won't work though when one end is
C++.

Yep, unfortunately this data is being provided by some windows based c++
program.

Cheers,
Pep.
 
Pep

Pep said:
I'm surprised I've been "lucky so far". I'm talking about an application
that services transactions from multiple clients under extreme load and it
has never missed a record yet. The records are passed using a normal
socket write operation so I know how the data is provided.

Still I won't argue with someone that knows better than me and by that I
am actually trying to be sincere not rude :)


Unfortunately I do not have control over the data being sent to me now and
I have now found from an Ethereal analysis that sometimes the records have
a <CR> ending and other times they do not.

I'll have to try simply reading a determined number of bytes :(

Cheers,
Pep

Sorry I forgot to add that the length of the record is variable so that
without this random <CR> I'm pretty much screwed :(
 
Gordon Beaton

Sorry I forgot to add that the length of the record is variable so
that without this random <CR> I'm pretty much screwed :(

Probably, yes. I thought to add that exact sentiment in my previous
reply, but didn't want to come across as rude!

It sounds odd to me that the existence (or lack) of CR is "random".

Have I understood correctly that your C++ clients work as expected,
and you are now implementing a Java client to an existing C++ server?

Do the C++ clients not have any special logic to deal with record
boundaries? Can you not change the way the server sends messages?

As I mentioned earlier, a small timing change could make a difference.

Basically if there is a short delay between calls to write(), it is
often the case that they will be sent separately by TCP. As long as
the recipient reads() sufficiently quickly, he will receive them
separately as well. This can work ("by accident"), but you really
shouldn't rely on this behaviour.

If the sender sends short messages without delay in between, then TCP
may send them together. Similarly when the reader is slow, messages
will accumulate in his receive buffer, and calls to read() cannot
distinguish between them.
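
To make this concrete, here is a rough sketch of a receiver that never assumes one read() equals one record. It is fed whatever each read() happened to return and hands back only the records completed so far. It assumes every record ends in CR (0x0D), which is what the spec promised rather than what this server actually does, and `RecordAssembler` is a made-up name:

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

class RecordAssembler {
    private static final int CR = 0x0D;   // assumed record delimiter
    // Bytes of a record whose terminator has not arrived yet.
    private final ByteArrayOutputStream pending = new ByteArrayOutputStream();

    // Feed the bytes one read() returned; get back the records completed by them.
    List<String> feed(byte[] chunk, int length) {
        List<String> complete = new ArrayList<>();
        for (int i = 0; i < length; i++) {
            if (chunk[i] == CR) {
                complete.add(new String(pending.toByteArray(), StandardCharsets.ISO_8859_1));
                pending.reset();
            } else {
                pending.write(chunk[i]);
            }
        }
        return complete;
    }

    public static void main(String[] args) {
        RecordAssembler assembler = new RecordAssembler();
        // First chunk: one whole record plus the start of the next (a split record).
        byte[] first = "AB\rCD".getBytes(StandardCharsets.ISO_8859_1);
        // Second chunk: the rest of that record plus another (coalesced records).
        byte[] second = "E\rF\r".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(assembler.feed(first, first.length));    // prints [AB]
        System.out.println(assembler.feed(second, second.length));  // prints [CDE, F]
    }
}
```

In a real client the chunks would come from `int n = in.read(buf)` on the socket's input stream, with `feed(buf, n)` called after each read.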

/gordon
 
Pep

Gordon said:
Probably, yes. I thought to add that exact sentiment in my previous
reply, but didn't want to come across as rude!

It sounds odd to me that the existence (or lack) of CR is "random".

Same here. I was told in the spec that the records would be capped with a
<CR> but they are not always which is something I am arguing with the
designer of the server code.

Have I understood correctly that your C++ clients work as expected,
and you are now implementing a Java client to an existing C++ server?

NO. The original application which I wrote consisted of a C++ server and
client both of which use unix sockets with no EOR delimiter and they work
fine.

Now I am having to replace this with a server that is provided and has been
written using windows based C++ and I have to write the client. So I have
done this using java.

Do the C++ clients not have any special logic to deal with record
boundaries? Can you not change the way the server sends messages?

As I mentioned earlier, a small timing change could make a difference.

Basically if there is a short delay between calls to write(), it is
often the case that they will be sent separately by TCP. As long as
the recipient reads() sufficiently quickly, he will receive them
separately as well. This can work ("by accident"), but you really
shouldn't rely on this behaviour.

Accepted and thanks for that knowledge.

If the sender sends short messages without delay in between, then TCP
may send them together. Similarly when the reader is slow, messages
will accumulate in his receive buffer, and calls to read() cannot
distinguish between them.

Which appears to be my problem here.

Okay this is the code that I am trying to run

===============================================================================
try
{
    while ((running.get()) && (parentProxy.br != null) &&
           (parentProxy.br.ready()))
    {
        try
        {
            mResponse = "";

            if (parentProxy.br != null)
            {
                // read the response from the m server
                mResponse = parentProxy.br.readLine();
                logDebugToFile("TSreader::run processing [" + mResponse + "]");
                // send the result back to the client
                parentProxy.processMResult(mResponse);
            }
        }
        catch(SocketTimeoutException e)
        {
            // do nothing here
        }
    }
}
catch(Throwable e)
{
    logFatalToFile("TSreader::run Error (running) " + e.getMessage(), e);
    e.printStackTrace();
}
===============================================================================

and this is the output of the code

===============================================================================
27 Oct 2005 09:34:15 GMT: DEBUG - {Thread-0} {Thread-4}
TSreader::run processing [CCOK Q458000:Y:a:XXXX48]
27 Oct 2005 09:34:16 GMT: DEBUG - {Thread-0} {Thread-4}
TSreader::run processing []
27 Oct 2005 09:34:16 GMT: FATAL - {Thread-0} {Thread-4}
TSreader::run Error (running) String index out of range: 6
java.lang.StringIndexOutOfBoundsException: String index out of range: 6
at java.lang.String.charAt(Unknown Source)
at TS.MP.processMResult(MP.java:486)
at TS.TSreader.run(TSreader.java:81)
at java.lang.Thread.run(Unknown Source)
27 Oct 2005 09:34:16 GMT: DEBUG - {Thread-0} {Thread-4}
TSreader::run processing [CCOK E458000:Y:a:XXXX34]
===============================================================================

As can be seen, it can read the first record and the third record, but the
second record is coming back empty.

Based on this input on the socket (obtained using ethereal)

===============================================================================
CCOK
Q458000:Y:a:XXXX48C\x9f`C\xd8@^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@(V\xbf@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f
\xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfdP^P\xe4
Y^F^@^@C\x9f`C\xdbC^@<^@^@^@<^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@)^O\xf5@^@}^F\x94\xee^]^H^P\x8f\xc0\xa8j\xac#\x8d\xbc\xd1\x83^G\xfd\xf5\x95\xd4\xd3P^X^^^[Z\x91^@^@^M^@^@^@^@^@C\x9f`C\x8en^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@yV\xc6@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f
\xbc#\x8d\xf5\x95\xd4\xd3\xd1\x83^G\xfeP^X\xe4YW^@^@C:MISTER-48/5 :WXYZ :4111111111111111 :0605:
20.00:JTWB801XXXX48 ^MC\x9f`C\xab\x86^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@?^P\xf5@^@}^F\x93\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d
\xbc\xd1\x83^G\xfe\xf5\x95\xd5$P^X^]\xca\x88g^@^@CCOKR458000:Y:a:XXXX10C\x9f`C8\xb7^K^@\x87^@^@^@\x87^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@yV\xcc@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f
\xbc#\x8d\xf5\x95\xd5$\xd1\x83^H^UP^X\xe4
YW^@^@C:MISTER-18/5 :WXYZ :4111111111111111 :0605:
20.00:KTWB801XXXX18 ^MC\x9f`C\xba^K^@O^@^@^@O^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@A^R\xf5@^@}^F\x91\xd6^]^H^P\x8f\xc0\xa8j\xac#\x8d
\xbc\xd1\x83^H^U\xf5\x95\xd5uP^X^]yE~^@^@^MCCOK
E458000:Y:a:XXXX34^MC\x9f`C^K'^M^@6^@^@^@6^@^@^@^@^N\x83\x9c\x91\xb2^@^K\xdb\x95\xc
(^H^@E^@^@(V\xd1@^@@^F^@^@\xc0\xa8j\xac^]^H^P\x8f
\xbc#\x8d\xf5\x95\xd5u\xd1\x83^H.P^P\xe4Y^F^@^@C\x9f`C\xf3\xdd^M^@M^@^@^@M^@^@^@^@^K\xdb\x95\xc
(^@^N\x83\x9c\x91\xb2^H^@E^@^@?^T\xf5@^@}^F\x8f\xd8^]^H^P\x8f\xc0\xa8j\xac#\x8d
\xbc\xd1\x83^H.\xf5\x95\xd5uP^X^]y\x82F^@^@
===============================================================================

and I am struggling to find out why this is happening :(

If you can see where I am going wrong then I would greatly appreciate your
advice.

The really annoying thing is that I have written a simulator based on the
server's spec of capping the records with a <CR> and my client can process
up to 100,000 records in a 2 hour period. So this is really starting to
piss me off!

Cheers,
Pep.
 
Roedy Green

Same here. I was told in the spec that the records would be capped with a
<CR> but they are not always which is something I am arguing with the
designer of the server code.

If you are using a BufferedOutputStream, you want to do a flush()
after every record or parts of it could stay stuck in the buffer
wrapping the socket until you write some more to push it out.
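
A small sketch of that effect, using a `ByteArrayOutputStream` as a stand-in for the socket's output stream so it can run without a network (the class name `FlushDemo` is illustrative only):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

class FlushDemo {
    // Returns how many bytes reached the underlying stream before and after flush().
    static int[] sizesAroundFlush(String record) throws IOException {
        ByteArrayOutputStream socketStandIn = new ByteArrayOutputStream();
        BufferedOutputStream out = new BufferedOutputStream(socketStandIn);

        out.write(record.getBytes(StandardCharsets.ISO_8859_1));
        int before = socketStandIn.size();   // a short record is still in the buffer

        out.flush();
        int after = socketStandIn.size();    // now it has been pushed through

        return new int[] { before, after };
    }

    public static void main(String[] args) throws IOException {
        int[] sizes = sizesAroundFlush("CCOK Q458000:Y:a:XXXX48");
        System.out.println("before flush: " + sizes[0] + ", after flush: " + sizes[1]);
    }
}
```

This holds for records smaller than the buffer (8192 bytes by default). Larger writes bypass the buffer, which is one more reason not to rely on it for timing.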
 
Pep

Roedy said:
If you are using a BufferedOutputStream, you want to do a flush()
after every record or parts of it could stay stuck in the buffer
wrapping the socket until you write some more to push it out.

I'm using a print writer

pw = new PrintWriter(microgateSocket.getOutputStream(),true);

The really annoying thing about this is that it seems to be related to the
fact that they are not placing a <CR> at the end of the records. My
simulator puts a \n at the end of each record and like I said, my client
will then process over 100,000 records without dropping a single one.

Cheers,
Pep.
 
Pep

Roedy said:
If you are using a BufferedOutputStream, you want to do a flush()
after every record or parts of it could stay stuck in the buffer
wrapping the socket until you write some more to push it out.

Actually as I am looking at the ethereal output, I cannot see a <CR> at the
end of any of the records they are sending back to me at all so I'm now
wondering how the readLine function is working at all?

Cheers,
Pep.
 
Roedy Green

pw = new PrintWriter(microgateSocket.getOutputStream(),true);

The really annoying thing about this is that it seems to be related to the
fact that they are not placing a <CR> at the end of the records.

That is not the official duty of a PrintWriter. It is supposed to put
a platform specific line separator there. If you want a cr
specifically you should do a write( '\r' );
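
For instance, a sketch of sending a record with an explicit CR and nothing platform-dependent. A `ByteArrayOutputStream` stands in for `microgateSocket.getOutputStream()`, and the ISO-8859-1 charset and the class name `ExplicitTerminator` are assumptions made for the example:

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class ExplicitTerminator {
    // Writes the record followed by exactly one CR (0x0D) and returns the bytes sent.
    static byte[] send(String record) {
        ByteArrayOutputStream socketStandIn = new ByteArrayOutputStream();
        PrintWriter pw = new PrintWriter(
                new OutputStreamWriter(socketStandIn, StandardCharsets.ISO_8859_1), false);

        pw.write(record);
        pw.write('\r');   // explicit CR, independent of the platform line separator
        pw.flush();       // auto-flush only applies to println/printf/format

        return socketStandIn.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(send("AB")));   // prints [65, 66, 13]
    }
}
```

By contrast, println() would append whatever `System.lineSeparator()` is on the machine running the client, so the bytes on the wire would differ between Windows and Unix.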
 
Pep

Roedy said:
That is not the official duty of a PrintWriter. It is supposed to put
a platform specific line separator there. If you want a cr
specifically you should do a write( '\r' );

Yeah but they are not using a print writer. They have written their
application using either Visual C++ or Visual Basic so god knows how
they are writing the data to the socket?

I use the print writer in my java client to send the transactions to their
server and I append a <CR> to the data.

I have just done another massive run against their server and they seem to
only be appending a <CR> to maybe around 2% of their transaction reply
records. Which of course means as it is a variable length record with no
delimiter I cannot even handle the protocol myself using a data stream
reader.

Cheers,
Pep.
 
Pep

Roedy said:
That is not the official duty of a PrintWriter. It is supposed to put
a platform specific line separator there. If you want a cr
specifically you should do a write( '\r' );

I have now found out, with the use of ethereal at both ends of the socket,
that the windows application is definitely sending a 0x0D but it is being
transformed in to a 0xDC by the time it reaches my end of the socket.

Similarly my 0x0D0x0A byte sequence is being converted in to a 0x0d.

Cheers,
Pep.
 
Gordon Beaton

I have now found out, with the use of ethereal at both ends of the
socket, that the windows application is definitely sending a 0x0D
but it is being transformed in to a 0xDC by the time it reaches my
end of the socket.

I find it extremely hard to believe that the cable or a switch alone
would be making such selective changes to the data stream.

Are you absolutely certain that you aren't making parts of this
observation in the code itself, where some processing has already
taken place? Or that your data doesn't pass through a proxy of some
kind?

/gordon
 
Pep

Gordon said:
I find it extremely hard to believe that the cable or a switch alone
would be making such selective changes to the data stream.

Are you absolutely certain that you aren't making parts of this
observation in the code itself, where some processing has already
taken place? Or that your data doesn't pass through a proxy of some
kind?

/gordon

At this point I am not sure of anything other than ethereal shows a 0x0D on
the windows end of the socket and a 0xDC on the unix end of the socket.

Similarly that the 0x0D0x0A on the unix end of the socket is a 0x0D when it
reaches the windows end of the socket.

I make no assumptions as to what is causing the change but am relieved to
find out that it is not my client written in Java or the server written in
some windows based language.

Cheers,
Pep.
 
Steve Horsley

Pep said:
At this point I am not sure of anything other than ethereal shows a 0x0D on
the windows end of the socket and a 0xDC on the unix end of the socket.

Similarly that the 0x0D0x0A on the unix end of the socket is a 0x0D when it
reaches the windows end of the socket.

I make no assumptions as to what is causing the change but am relieved to
find out that it is not my client written in Java or the server written in
some windows based language.

Spooky. So what exactly connects the client and server?

As Gordon says, I imagine they must be talking via a proxy. I
would be inclined to compare the traces for IP address, MAC
address, IP sequence numbers, to prove there is some entity
playing piggy in the middle and corrupting the data stream. That
kind of change doesn't happen by accident - you have to re-write
checksums, and even change sequence numbering if you're dropping
bytes from the stream.

Steve
 
Missaka Wijekoon

Pep said:
Actually as I am looking at the ethereal output, I cannot see a <CR> at the
end of any of the records they are sending back to me at all so I'm now
wondering how the readLine function is working at all?

Per the Java API docs:

public String readLine() throws IOException
Read a line of text. A line is considered to be terminated by any
one of a line feed ('\n'), a carriage return ('\r'), or a carriage
return followed immediately by a linefeed.
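
A quick way to see those rules in action, with a `StringReader` standing in for the socket (the class name `ReadLineTerminators` is just for this sketch). Note the second case: two terminators in a row produce an empty line, so a stray CR in front of a record would show up as an empty string from readLine:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class ReadLineTerminators {
    // Collects every line readLine() returns for the given input.
    static List<String> lines(String input) throws IOException {
        List<String> result = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(new StringReader(input))) {
            String line;
            while ((line = reader.readLine()) != null) {
                result.add(line);
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // CR, LF and CR LF all end a line; end of stream ends the last one.
        System.out.println(lines("one\rtwo\nthree\r\nfour"));   // prints [one, two, three, four]
        // Two CRs in a row yield an empty line between the records.
        System.out.println(lines("one\r\rtwo"));                 // prints [one, , two]
    }
}
```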

From some of the conversation, it feels as if there might be a filter
that is converting the stream like dos2unix, etc. Is there a chance
that the socket on the server end is not a true socket, but perhaps a
telnet connection? For example, the telnet protocol requires that 0xFF
be escaped.
 
Roedy Green

I have now found out, with the use of ethereal at both ends of the socket,
that the windows application is definitely sending a 0x0D but it is being
transformed in to a 0xDC by the time it reaches my end of the socket.

So Java has nothing to do with it.

Write a class that reads one record scanning it byte by byte.

You might use http://mindprod.com/jgloss/readblocking.html
as a model. Perhaps it should also convert it to char for you as well
after it has scanned the bytes.
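
Something along these lines, as a rough sketch only. It assumes a record ends at CR (0x0D), LF (0x0A) or end of stream, and that the bytes are ISO-8859-1 text; both are guesses about this protocol, and the class name `ByteRecordReader` is made up:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class ByteRecordReader {
    private final InputStream in;

    // For a socket, pass a BufferedInputStream so the per-byte reads stay cheap.
    ByteRecordReader(InputStream in) {
        this.in = in;
    }

    // Returns the next record, or null when the stream has ended with nothing pending.
    String readRecord() throws IOException {
        ByteArrayOutputStream record = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == 0x0D || b == 0x0A) {   // assumed terminators
                return decode(record);
            }
            record.write(b);
        }
        return record.size() == 0 ? null : decode(record);
    }

    // Converts the scanned bytes to chars only after the record boundary is known.
    private static String decode(ByteArrayOutputStream record) {
        return new String(record.toByteArray(), StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ONE\rTWO\nTHREE".getBytes(StandardCharsets.ISO_8859_1);
        ByteRecordReader reader = new ByteRecordReader(new ByteArrayInputStream(data));
        String record;
        while ((record = reader.readRecord()) != null) {
            System.out.println("[" + record + "]");
        }
    }
}
```

Because each terminator byte ends a record, a CR LF pair comes back as a record followed by an empty one. The caller can skip empty records, or the scan rule can be replaced with whatever really marks the end of a record once that is known.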
 
Pep

Roedy said:
So Java has nothing to do with it.

Thankfully, no.

Write a class that reads one record scanning it byte by byte.

You might use http://mindprod.com/jgloss/readblocking.html
as a model. Perhaps it should also convert it to char for you as well
after it has scanned the bytes.

Just about to run my client on the same network segment as the server to see
if we still have the same problem. If not then we can work outwards from
there to see where it comes in.

Cheers,
Pep.
 
