DatagramSocket receive performance

Ron

My group at work has a tool, written in Java, that receives and
displays UDP packets. We are experiencing a problem where packets go
missing when the message rate gets higher than 50/second.

To investigate further, I have two test programs running on a Windows
PC listening for those UDP packets. One is written in C, and the
other in Java. All these programs do is count received packets and
then print out the current count every 1000 messages.

During my testing I increase the rate of UDP packets at the source.
The C program has no problems. Just as with the tool, the Java test
program starts to miss packets above 50/second (50 is no problem, but
I start to see a problem at my next increment, which is 75/second).

In the Java program, the DatagramSocket, DatagramPacket and the
counter are constructed once at initialization. Besides the call to
receive(), the only thing that happens each time through the loop is
the check for counter % 1000 == 0, but I think that shouldn't be an
issue.

In any case, the C program has no trouble.

Also: java -fullversion reports 1.6.0-b105 on my test platform (in
case you are wondering).

My questions:
- Is it reasonable to expect the Java program to be able to
successfully receive more than 50 packets per second?
- If the answer to that is yes, what can I do to achieve better
performance?

I have not tried it on other platforms, but it is a bit of a moot
point because my team wants to be able to run their tool on a laptop
running Windows.
 
Daniel Pitts

[Ron's original post, quoted in full, snipped]

Is the Java program multithreaded? That may make a difference.
UDP is not a reliable protocol by definition; if you need to make sure
that all data sent gets received, you'd be better off using TCP, as
it's designed to handle this.


Hope this helps,
Daniel.
 
Knute Johnson

Ron said:
[Ron's original post, quoted in full, snipped]

I am surprised that there is that big of a difference in performance.
What tool are you using to generate the UDP packets? How big are the
packets? How fast is the network that you are using?
 
Ron

I am surprised that there is that big of a difference in performance.
What tool are you using to generate the UDP packets? How big are the
packets? How fast is the network that you are using?

The packets are all 125 bytes or less in length. They are being
generated on a DSP board that is connected to my test platform
through a D-Link gigabit switch. The board has an Ethernet device
that supports 10BaseT. The network card in my test platform supports
100BaseT.
 
Ron

Is the java program multithreaded? That may make a difference.

The Java test program is multithreaded. I reused the tool code for
the test program, stripping out all the functionality except the part
of the Swing GUI that prompts the user for the IP address. After the
user enters the address, the program creates a thread with normal
priority to handle reading from the datagram socket.

Today I plan to try again with a simpler test program that does not
use any Swing and just uses the main thread.
UDP is not a reliable protocol by definition; if you need to make sure
that all data sent gets received, you'd be better off using TCP, as
it's designed to handle this.

The fact that UDP is not reliable has to do with delivery of the
packets: they can be discarded and are not retransmitted by the
source. I don't have that problem - the packets are being delivered
to my test platform just fine.

I don't think TCP would help in this situation, but I could be wrong.
Can you provide a technical explanation for why the Socket
implementation would outperform the DatagramSocket implementation in
this situation?
 
Chris Uppal

Ron said:
- Is it reasonable to expect the java program to be able to
successfully receive more than 50 packets per second?

Yes, on the assumption that the processing per packet takes
significantly less time than 20 milliseconds -- of course ;-)

- If the answer to that is yes, what can I do to achieve better
performance?

No idea. But something is very wrong somewhere.

I ran a small test which just hammered one machine with small (8 byte) UDP
packets. The way it worked was a loop sending N packets as fast as it could
then sleeping for a second. I found that sending a burst of 1,000 packets in
one go consistently resulted in zero packets lost at the receiver. Sending
10,000 packets generally resulted in about 10-20% loss. Sending 100,000
packets in a burst generally resulted in the /same/ percentage packet loss.
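
That send loop might be sketched like this (my reconstruction, not
Chris's actual code; the loopback target and the port are
placeholders, and sendBurst is a name I made up):

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class BurstSender {
    // Send 'count' 8-byte datagrams back-to-back; returns the number
    // of send() calls that completed.
    public static int sendBurst(DatagramSocket socket,
                                InetSocketAddress dest,
                                int count) throws Exception {
        byte[] payload = new byte[8];
        DatagramPacket packet =
            new DatagramPacket(payload, payload.length, dest);
        int sent = 0;
        for (int i = 0; i < count; i++) {
            socket.send(packet);
            sent++;
        }
        return sent;
    }

    public static void main(String[] args) throws Exception {
        // Placeholder target: loopback, port taken from the thread.
        InetSocketAddress dest = new InetSocketAddress(
                InetAddress.getLoopbackAddress(), 4242);
        DatagramSocket socket = new DatagramSocket();
        for (int burst = 0; burst < 3; burst++) {  // a few bursts for the demo
            System.out.println("sent " + sendBurst(socket, dest, 1000));
            Thread.sleep(1000);                    // let the receiver catch up
        }
        socket.close();
    }
}
```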

I assume that what's happening is that the sender can generate messages faster than
the receiver can process them, but that a burst of 1000 packets is within the
limits of the queuing provided by the network stack and/or OS, and so, as long
as the receiver got a chance to catch up, it could process all 1000 in a second
(or, in fact, much less). But if the bursts were 10K or greater, then that
caused the queues to overflow, and the resulting rate of packet handling (i.e.
the ones that didn't get dropped) reflects the steady-state rate at which it
could process the packets -- in this case apparently around 80-90% of the speed
at which the sender could pump them out.

A few test details. The sending machine was running Java 1.5 on a Win2K box
(and that box, and its network interface, are both faster than the receiving
machine). It could push out around 40K packets per second (that's with 8 bytes
of payload each). I don't know whether all of those actually reached the
network -- I assume so, but can't prove it. The receiving machine was an
oldish WinXP Pro laptop running 1.6 or 1.5. Neither of the machines runs
internal firewall software. They were connected by an unloaded 100 Mbit LAN
(separated by two switching hubs, but I don't think that affects this test).

-- chris
 
Ron

Can you show us some code?

I certainly can!

I retried the test with a version that does not use Swing and does
not create any additional threads. The source is below. I get the
exact same result.

The message source (DSP board) is sending UDP packets of 125 bytes to
a PC that is running a C and a Java program that do nothing but count
these arrivals and print out message counts every 1000 messages. I
can control the send rate at the source. At 50 messages per second,
the C and Java programs are both printing out 1000, 2000, 3000, etc.
at the same time, and this also matches the count of sent messages at
the source. When I increase to 75 messages per second, the C
program's message count continues to match the send count on the DSP
board, but the Java program starts to lag behind (the C program prints
out 1000... the Java program prints out 1000 several seconds later).

import java.io.IOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetSocketAddress;
import java.net.SocketException;
import java.net.SocketTimeoutException;

public class test
{
    public static final int SERVER_PORT = 4242;

    public static final String SERVER_IP = "192.168.1.10";

    public static final int RECV_BUFFER_SIZE = 1024;

    public static void main(String args[])
    {
        int messageCount = 0;
        byte[] recvBytes = new byte[RECV_BUFFER_SIZE];
        InetSocketAddress address =
            new InetSocketAddress(test.SERVER_IP, test.SERVER_PORT);
        DatagramPacket packet =
            new DatagramPacket(recvBytes, RECV_BUFFER_SIZE);
        DatagramSocket socket;

        try
        {
            socket = new DatagramSocket();
            socket.connect(address);

            // Send registration message
            byte[] sendBuf = new byte[1];
            sendBuf[0] = 1;
            DatagramPacket sendPacket =
                new DatagramPacket(sendBuf, sendBuf.length, address);
            socket.send(sendPacket);

            // Wait for registration response
            socket.setSoTimeout(30000);
            socket.receive(packet);
            socket.setSoTimeout(0);

            while (true)
            {
                socket.receive(packet);

                messageCount++;

                if ((messageCount % 1000) == 0)
                {
                    System.out.println("Messages = " + messageCount);
                }
            }
        }
        catch (IOException ex)
        {
            ex.printStackTrace();
            System.exit(1);
        }
    }
}
 
Ron

I ran a small test which just hammered one machine with small (8 byte) UDP
packets. The way it worked was a loop sending N packets as fast as it could
then sleeping for a second. I found that sending a burst of 1,000 packets in
one go consistently resulted in zero packets lost at the receiver. Sending
10,000 packets generally resulted in about 10-20% loss. Sending 100,000
packets in a burst generally resulted in the /same/ percentage packet loss.

I assume that what's happening is that sender can generate messages faster than
the receiver can process them, but that a burst of 1000 packets is within the
limits of the queuing provided by the network stack and/or OS, and so, as long
as the receiver got a chance to catch up, it could process all 1000 in a second
(or, in fact, much less).

I really appreciate you trying your own performance test. I should
have mentioned up front that I'm not concerned with bursty behaviour
at this time. My source for UDP packets sends them out at a constant
rate. If the Java program can't keep up at the start, then it will
never get a chance to catch up later.

But I agree that my sender is generating messages faster than my
receiver can process them, and I can't figure out why. I don't think
75 messages/second should be too much. The C program can handle it,
so I know the problem isn't with the network or the network card, and
it probably isn't in the OS protocol stack either. I think that
leaves the JVM and my own code. I hope I'm doing something wrong! :)

Can you estimate the rates in your performance test? How long do you
think it took your generator to finish sending out those
1000/10000/100000 messages? Do you think you could time it with a
stop watch or add some timestamps to your program?
 
Robert Klemme

[Ron's follow-up post and test program, quoted in full, snipped]

Just some ideas without testing this myself:

I don't know which JVM you use, but trying "java -server" is one
option. Other than that, since everything is in one method, HotSpot
may not have a chance to kick in. Maybe you can refactor the code
into a method that receives only 1000 packets and then returns; you
could then call this method in an infinite loop.

Also, you might want to reset the packet size after every reception -
there may be issues if you have varying packet sizes.

Kind regards

robert
 
Knute Johnson

Ron said:
[Ron's follow-up post and test program, quoted in full, snipped]

Ron:

Is there any chance that the source is sending a datagram with a
length less than you expect? Just in case, try calling
packet.setLength(RECV_BUFFER_SIZE); after your receive call.
 
Esmond Pitt

Can I suggest that you call DatagramSocket.setReceiveBufferSize(63*1024)
before you start receiving. As the other posters have said, you must
also reset the packet length every time, otherwise it will keep
shrinking. I know your thing is sending a constant 125 bytes at the
moment, but you never know ...
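
A minimal sketch of that suggestion (the 63*1024 figure is Esmond's;
the method name is my own). The OS treats the requested size only as
a hint, so it is worth reading the granted value back:

```java
import java.net.DatagramSocket;
import java.net.SocketException;

public class RecvBufDemo {
    // Ask the OS for a larger socket receive buffer; the granted size
    // may differ from the request, so read it back to check.
    public static int requestReceiveBuffer(DatagramSocket socket, int bytes)
            throws SocketException {
        socket.setReceiveBufferSize(bytes);
        return socket.getReceiveBufferSize();
    }

    public static void main(String[] args) throws Exception {
        DatagramSocket socket = new DatagramSocket();
        int granted = requestReceiveBuffer(socket, 63 * 1024);
        System.out.println("granted receive buffer: " + granted + " bytes");
        socket.close();
    }
}
```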
 
Chris Uppal

Ron said:
Can you estimate the rates in your performance test? How long do you
think it took your generator to finish sending out those
1000/10000/100000 messages?

I said in the original post that it was sending at ~40K packets/second
(remember that the packets had only 8 bytes of payload). I added some
time measurement on the receive side (where there was none before),
and refined the sending-side measurements a little. Sending in bursts
of 100K packets, the sender maintains a rate of about 38K UDP packets
per second. The receiver loses some of them, but the rate at which it
actually processes them is ~32K pps.

-- chris
 
Ron

Thanks to everyone who had suggestions. I didn't try them all, but
setting the packet length each time in my loop did fix the problem.

Just reading the docs for DatagramSocket and DatagramPacket, I didn't
get the impression this would be necessary. Are there some detailed
instructions for using these classes?

I also had a look through the source (Java and native, for Windows),
but I was unable to pinpoint the exact place where not setting the
packet length would cause the problems I observed. In the native
receive0 function, I can see the values for bufLength and offset
being read. I might regret admitting this, but I don't see any place
where they get set by receive0. I can't see anything in the Java
source that modifies these values during receive() either. So if
they were set correctly when I constructed my DatagramPacket, and if
receive() doesn't change them, why does setLength() have such a
beneficial effect? One poster said that my usable buffer would shrink
each time I called receive, and I believe it. I just wasn't able to
quickly find the code that does this, and I don't think I will have
time anytime soon. But I am very curious about it.

If anyone knows, I sure would appreciate a hint. :D

Thanks again to all who responded.
 
Esmond Pitt

Ron said:
If anyone knows, I sure would appreciate a hint. :D

When you receive into a DatagramPacket its length is reset to the length
of the incoming data if it's shorter than the original length. So if you
keep receiving variable-length datagrams into the same DGP the DGP will
keep shrinking, and you will be losing data at the end of the larger
datagrams received.

See src/{solaris,windows}/native/java/net/PlainDatagramSocketImpl.c, end
of the receive0() function.
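
This is easy to demonstrate on the loopback interface. The sketch
below is my own code, not from the thread: it receives a 5-byte
datagram, then a 100-byte one, into the same packet, and shows the
second being truncated until setLength() is called:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.InetSocketAddress;

public class PacketShrinkDemo {
    // Returns the reported lengths of three receives: after a short
    // datagram, after a long one with no reset, and after a long one
    // with the length reset first.
    public static int[] demo() throws Exception {
        DatagramSocket recv =
            new DatagramSocket(0, InetAddress.getLoopbackAddress());
        DatagramSocket send = new DatagramSocket();
        InetSocketAddress dest = new InetSocketAddress(
                InetAddress.getLoopbackAddress(), recv.getLocalPort());

        byte[] buf = new byte[1024];
        DatagramPacket packet = new DatagramPacket(buf, buf.length);

        send.send(new DatagramPacket(new byte[5], 5, dest));
        recv.receive(packet);
        int first = packet.getLength();   // 5: the packet now remembers 5

        send.send(new DatagramPacket(new byte[100], 100, dest));
        recv.receive(packet);
        int second = packet.getLength();  // still 5: datagram was truncated

        packet.setLength(buf.length);     // the fix: reset before receiving
        send.send(new DatagramPacket(new byte[100], 100, dest));
        recv.receive(packet);
        int third = packet.getLength();   // 100

        recv.close();
        send.close();
        return new int[] { first, second, third };
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);  // prints "5 5 100"
    }
}
```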
 
Ron

When you receive into a DatagramPacket its length is reset to the length
of the incoming data if it's shorter than the original length. So if you
keep receiving variable-length datagrams into the same DGP the DGP will
keep shrinking, and you will be losing data at the end of the larger
datagrams received.

See src/{solaris,windows}/native/java/net/PlainDatagramSocketImpl.c, end
of the receive0() function.

Gah. Not sure how I missed that. Thank you.

Unfortunately I'm still confused. I totally understand how this could
result in losing data. But I was actually seeing packets go missing
entirely.

Without setLength(), the count variable in my Java program would trail
the count variable in my C program. As soon as I added setLength(),
the counts in the two programs matched. This was true even if I added
setLength(0) at the start of the loop! The count would be fine, I
just could never see the contents of the packet. I think there must
be more to the story.
 
Timothy Bendfelt

I'll add my 2 cents about the packet loss. Without setting the receive
buffer size on the socket you probably get a default buffer of 8K.
This is pretty shallow for high data rates without flow control. If
the VM turns its back for just a second (GC) and a burst comes from
the server, the socket buffer fills and newly arriving packets are
dropped.

As far as the setLength() call on the reused client DatagramPacket, I
think that Winsock is doing some strange accounting on the socket
buffer when you pull out less data than was delivered. Even with this
call, though, you may see the problem again at a higher data rate, and
you may see it in the Java client before the native one.

Your native client code is either quicker on the draw and never needs
8K of data buffered locally, or it is using a deeper buffer that can
withstand a bigger delivery burst in between scheduling.

You can detect this by putting a sequence number in the payload and
watching at the client to see when they skip or go out of order.
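
One way to sketch that check (assuming, purely for illustration, that
the sender writes a big-endian 32-bit sequence number in the first
four bytes of each payload - a format I'm inventing here):

```java
import java.nio.ByteBuffer;

public class SeqCheck {
    private long expected = -1;  // next sequence number we expect
    private long lost = 0;       // total gap size seen so far

    // Feed each received payload (must be at least 4 bytes); the first
    // 4 bytes are read as an unsigned big-endian sequence number.
    public void onPacket(byte[] payload) {
        long seq = ByteBuffer.wrap(payload).getInt() & 0xFFFFFFFFL;
        if (expected >= 0 && seq > expected) {
            lost += seq - expected;  // these numbers never arrived
        }
        expected = seq + 1;
    }

    public long lostCount() {
        return lost;
    }
}
```

Feeding it sequence numbers 0, 1, 2, 5, 6 would report 2 lost packets
(numbers 3 and 4).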

I've had similar problems working with UDP apps that want a reliable
delivery semantic.

Hope this helps.
 
