Multicast getting garbage message on Linux but not on Solaris!?


tak

Hi.

Here's the scenario...

We have a Java application running on Linux, which basically listens
to multiple multicast IPs, processes them, and does other things. The
issue comes in when, say, Message#1 is being published on IP#1, and
Message#2 is being published on IP#2, and so on. When the application
starts, we get Message#2 on BOTH IP#1 and IP#2! And if the application
is listening to 3 multicast IPs, EACH message comes in on ALL 3
channels all the time. This is bizarre, and it multiplies the data
volume by X (X being the number of IPs we are listening to).

And when we bring this to Solaris, it works perfectly fine. No
duplication. Each message only goes to whichever channel it is
supposed to.

So we wrote a very small program, which basically uses the Java API to
open the multicast sockets and start listening, and we are ABLE to
reproduce this issue.

Here are some notes after further testing.

1. It only happens on Linux.
2. It should NOT be the routing on the box, because we have a C++
application that listens to the same multicast IPs, and it is
working fine.
3. We created a C++ multicast listener that prints out the sequence
numbers (each packet begins with our sequence number), and it does
NOT get duplicate messages.
4. When we create a Java multicast listener that prints out sequence
numbers, it DOES get duplicate messages (if it listens to only 2
IPs).
5. This duplication ONLY happens when multiple feeds are being
published to the same port (i.e. feedHandler#1 is publishing to
234.50.60.100:18900, and feedHandler#2 is publishing to
234.50.60.300:18900; then our Java application gets messages as if
they came in on both channels).
6. Say NO one is publishing to 234.99.99.100:18900. If we start the
little Java mcast app, it receives some data (no idea where it is
coming from) as long as some other feed handlers are using that port,
18900 (even though with a different IP). Meaning, say FH#1 is
publishing to 234.50.60.100:18900, and I start listening on
237.99.99.199:18900 (a completely different IP), I get data!

I have already spent a few days on why this is happening, with no
luck. However, by elimination, I don't think it is the routing on
the box, as it works fine for the C++ application that listens to the
same feed.

I do not think it is the application, because the small Java mcast
app is very simple and is mostly taken directly from the Sun tutorial
(code below), and it works fine on Solaris.

Here is the code for the small mcast app that we used to test. (Note:
each packet begins with our sequence number; our feed handlers put it
there before sending the packet out.) This app has 2 classes.

--- TestingProgram.java ----

public class TestingProgram {
    // Semicolon-separated list of group:port pairs to listen on.
    private static final String DEFAULT_GROUPS =
            "234.63.76.193:8900;234.63.76.197:8900;234.63.76.201:8900";

    public TestingProgram(String str) {
        String[] ips = str.split(";");
        for (int i = 0; i < ips.length; i++) {
            String[] temp = ips[i].split(":");
            System.out.println("Adding FeedDataService Source for "
                    + temp[0] + ":" + temp[1]);
            // We create X TestingClass listeners based on X IPs.
            TestingClass test = new TestingClass(i + 1);
            test.connect(temp[0], Integer.parseInt(temp[1]));
        }
        System.out.println(ips.length + " feed data service initialized");
    }

    public static void main(String[] args) {
        // The listener threads are non-daemon, so they keep the JVM alive.
        new TestingProgram(DEFAULT_GROUPS);
        System.out.println("Main initialized");
    }
}


----- TestingClass.java -----

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class TestingClass extends Thread {

    private static final int PACKET_BUFFER_SIZE = 4096;
    private static final int RECEIVE_BUFFER_SIZE = 4096;
    private final DatagramPacket datagramPacket;
    private final DataInputStream dataInputStream;
    private MulticastSocket socket;
    private String host;
    private final int num;

    public TestingClass(int i) {
        num = i;
        byte[] packetBuffer = new byte[PACKET_BUFFER_SIZE];
        datagramPacket = new DatagramPacket(packetBuffer, packetBuffer.length);
        // The stream reads straight from the packet buffer, so it sees
        // the contents of each newly received datagram.
        dataInputStream = new DataInputStream(
                new ByteArrayInputStream(packetBuffer));
    }

    public void connect(String host, int port) {
        try {
            this.host = host;
            InetAddress groupAddress = InetAddress.getByName(host);
            // Binds to the wildcard address on the given port.
            socket = new MulticastSocket(port);
            socket.setReceiveBufferSize(RECEIVE_BUFFER_SIZE);
            socket.joinGroup(groupAddress);
            System.out.println("Group joined for " + host + ":" + port);
            start();
            System.out.println(host + " Thread started");
        } catch (Exception e) {
            System.out.println("Exception at connect " + e);
        }
    }

    public void run() {
        int count = 0;
        System.out.println(host + " going into receive() now");
        while (true) {
            try {
                datagramPacket.setLength(PACKET_BUFFER_SIZE);
                socket.receive(datagramPacket);
                // Read the leading sequence number, then rewind the
                // stream so the next packet is read from the start.
                short seqNum = dataInputStream.readShort();
                dataInputStream.reset();
                count++;
                System.out.println("Listener #" + num + " got packet #" + count
                        + " from host: " + host + " dgram.getSocketAddress(): "
                        + datagramPacket.getSocketAddress() + " packet length: "
                        + datagramPacket.getLength() + " seqNum: " + seqNum);
            } catch (Exception exception) {
                throw new RuntimeException("Failed to receive packet", exception);
            }
        }
    }
}


Can someone take a look and see what is wrong? Is there something
special that needs to be done on Linux when listening to multiple
multicast IPs?

Thanks!
T
 
Hunter Gratzner

> We have a Java application running on Linux, which basically listens
> to multiple multicast IPs, processes them, and does other things. The
> issue comes in when, say, Message#1 is being published on IP#1, and
> Message#2 is being published on IP#2, and so on. When the application
> starts, we get Message#2 on BOTH IP#1 and IP#2! And if the application
> is listening to 3 multicast IPs, EACH message comes in on ALL 3
> channels all the time. This is bizarre, and it multiplies the data
> volume by X (X being the number of IPs we are listening to).

Well, it is not bizarre. AFAIK this is nasty, but common behavior.

> And when we bring this to Solaris - it works perfectly fine.

Interesting to hear that, so Sun has apparently fixed or circumvented
the problem. Maybe Linux has some optional feature to handle this too,
I don't know.

The problem is relatively simple. The fix in Java? Hmmm...

The IP stack does not discriminate on groups when receiving multicast
datagrams; it leaves the group filtering to the Ethernet hardware. The
Ethernet hardware "fetches" the datagrams with the group addresses out
of the "ether" when it sees them and forwards them to the IP stack.
The IP stack gets these incoming multicast datagrams from the Ethernet
layer as if they were normal packets and just filters them according
to the port number. It then forwards the received datagrams to all
sockets bound to that particular port, which are all your
MulticastSockets...

I have never tried whether the typical solution for the problem can be
made to work in Java. That is, to check the destination multicast
address of each received datagram and to dispatch the datagram
according to that address. Maybe DatagramPacket.getAddress() provides
the information, maybe it doesn't (the API doc makes it appear as if
it doesn't).
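
For what it is worth, a quick way to see what a received DatagramPacket
reports is a loopback probe like the sketch below. It is not from the
original test program; the class name SenderAddressDemo and the method
probe() are made up for illustration. It sends one datagram between two
local sockets and compares the port reported by the received packet
with the sender's own port. If they match, the packet is describing the
sender, not the destination.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

public class SenderAddressDemo {

    /**
     * Sends one datagram over loopback and returns
     * { sender's local port, port reported by the received packet }.
     */
    public static int[] probe() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (DatagramSocket receiver = new DatagramSocket(0, loopback);
             DatagramSocket sender = new DatagramSocket(0, loopback)) {
            receiver.setSoTimeout(2000); // do not hang if the packet is lost
            byte[] payload = {1, 2};
            sender.send(new DatagramPacket(payload, payload.length,
                    loopback, receiver.getLocalPort()));

            DatagramPacket in = new DatagramPacket(new byte[16], 16);
            receiver.receive(in);
            return new int[] {sender.getLocalPort(), in.getPort()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] ports = probe();
        System.out.println("sender port = " + ports[0]
                + ", packet.getPort() = " + ports[1]);
    }
}
```

If the two numbers agree, getAddress() and getSocketAddress() on a
received packet cannot be used to tell which group the datagram was
sent to, so the dispatch-by-destination approach would need some other
source for that information.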

Maybe the Linux IP stack can be tuned to filter on the group address.
Ask in a Linux group, and don't forget to mention your Linux
distribution and kernel version. You didn't mention them here.

Alternatively, consider using different ports.
 
tak

> Well, it is not bizarre, AFAIK this is nasty, but common behavior.

So, that is actually the EXPECTED behavior? You mean that's how it is
supposed to be? But it's a different IP!

So, the issue is actually with Java? Because C++ works totally fine...?

Thanks,
T
 
Hunter Gratzner

> So, that is actually the EXPECTED behavior? You mean that's how it is
> supposed to be?

It is at least the classic way it used to be. I was in fact surprised
that you didn't see it everywhere. Maybe the generally expected
behavior has changed in recent years. It has been some time since I
had to do multicasting (in C), and back then it was the expected way.

> But it's a different IP!

The IP does not matter very much. The Ethernet layer uses MAC
addresses, not IP addresses. The group numbers in multicast IP
addresses are partly mapped to a special range of MAC addresses, and
that's what is filtered on. The IP stack on top is then supposed to
check whether it is really the desired group, but that used to be it.
After the check, the datagram is just dispatched to the sockets bound
to that port.
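
To make the "partly mapped" bit concrete, here is a small sketch of the
standard IPv4 mapping: the Ethernet multicast address is the prefix
01:00:5e followed by the low 23 bits of the group address. The class
name MulticastMac and the second group address are made up for
illustration; only 234.63.76.193 comes from the test program.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class MulticastMac {

    /** Maps an IPv4 multicast group to its Ethernet multicast MAC address. */
    public static String toEthernetMac(String group) {
        byte[] a;
        try {
            a = InetAddress.getByName(group).getAddress();
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Not an IP address: " + group, e);
        }
        // IPv4 multicast addresses are 224.0.0.0 through 239.255.255.255.
        if (a.length != 4 || (a[0] & 0xf0) != 0xe0) {
            throw new IllegalArgumentException(
                    "Not an IPv4 multicast address: " + group);
        }
        // Only the low 23 bits survive: the first octet and the top bit
        // of the second octet are dropped.
        return String.format("01:00:5e:%02x:%02x:%02x",
                a[1] & 0x7f, a[2] & 0xff, a[3] & 0xff);
    }

    public static void main(String[] args) {
        // Group taken from the test program.
        System.out.println(toEthernetMac("234.63.76.193"));
        // Hypothetical second group that differs only in the dropped bit.
        System.out.println(toEthernetMac("234.191.76.193"));
    }
}
```

Both calls print 01:00:5e:3f:4c:c1, so different groups can share one
MAC address, and filtering in the Ethernet hardware alone cannot fully
separate them.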
> So, the issue is actually with Java? Because C++ works totally fine...?

I don't know. I just googled and now know that some socket
implementations allow you to add / turn on filtering according to
multicast IP addresses. Maybe this feature is not used or is turned
off in Java's native code for Linux.

But this is all speculation. Try different Java VMs and ask in a Linux
networking group about the details of your particular IP stack.
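
One thing that may be worth trying from the Java side, offered as an
untested sketch and not as a confirmed fix: bind each socket to the
group address itself, not to the wildcard address that
new MulticastSocket(port) uses. The assumption here is that Linux then
only delivers datagrams whose destination matches the bound address;
other platforms may refuse to bind to a multicast address at all. The
class name GroupBoundListener and its methods are made up for
illustration.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.MulticastSocket;
import java.net.UnknownHostException;

public class GroupBoundListener {

    /** Parses "group:port", the format used by the test program. */
    public static InetSocketAddress parseEndpoint(String spec) {
        String[] parts = spec.split(":");
        InetAddress group;
        try {
            group = InetAddress.getByName(parts[0]);
        } catch (UnknownHostException e) {
            throw new IllegalArgumentException("Bad address: " + parts[0], e);
        }
        if (!group.isMulticastAddress()) {
            throw new IllegalArgumentException(
                    parts[0] + " is not a multicast address");
        }
        return new InetSocketAddress(group, Integer.parseInt(parts[1]));
    }

    /** Opens a socket bound to group:port (not wildcard:port) and joins the group. */
    public static MulticastSocket open(String spec) throws IOException {
        InetSocketAddress endpoint = parseEndpoint(spec);
        // Assumption: binding to the group address makes the stack
        // filter on the destination address as well as the port.
        MulticastSocket socket = new MulticastSocket(endpoint);
        socket.joinGroup(endpoint.getAddress());
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (MulticastSocket socket = open("234.63.76.193:8900")) {
            System.out.println("Bound to " + socket.getLocalSocketAddress());
        }
    }
}
```

In the test program this would mean replacing new MulticastSocket(port)
in connect() with a socket built from the group and port together.
Whether it removes the duplicates depends on the kernel and JVM in use,
so it needs to be checked on the actual box.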
 
