Performance problem AIX 5.2 using RMI

T

Tassilo Kubitz

We are using J2RE 1.4.1 IBM AIX build ca141-20030703a (JIT enabled)
on our AIX 5.2.

I wrote a simple RMI-Server that starts up and listening for remote
requests as well as an client, that will access this server.
All platforms (win32, Linux, zLinux, Solaris 2.7, HP-UX 11.10)
performing well.
Only AIX 5.2 has a performance problem during the methods
LocateRegistry.createRegistry(),
Naming.rebind() and Naming.lookup().
All other platforms takes only 20-35 ms for these methods.
AIX needs between 75-90 sec !!!.

I test all of my code locally (localhost - 127.0.0.1), but I found no
explanation. IBM has reproduced this problem, but has no answer.
Do you have any idea?

Example Code below

TestService.java:
-----------------
import java.rmi.Remote;
import java.rmi.RemoteException;

public interface TestService extends Remote
{

public String getTest()
throws RemoteException;

}

TestServer.java:
----------------
import java.rmi.Naming;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.server.UnicastRemoteObject;

public class TestServer extends UnicastRemoteObject implements
TestService
{
public TestServer() throws RemoteException {
super();
}

public String getTest() {
return "foo.bar";
}

public static void main(String[] args) {
long time,start;
try {
//TestServer testServer = new TestServer();
start = System.currentTimeMillis();
time = start;
System.out.println("before create: "+time);
LocateRegistry.createRegistry(4711);
time = System.currentTimeMillis() - start;
System.out.println("after create: "+time);
start = System.currentTimeMillis();
TestServer testServer = new TestServer();
time = System.currentTimeMillis() - start;
System.out.println("after new: "+time);
start = System.currentTimeMillis();
Naming.rebind("//127.0.0.1:4711/TestServer", testServer);
time = System.currentTimeMillis() - start;
System.out.println("after rebind: "+time);

} catch (Exception e) {
e.printStackTrace();
}
}

}

TestClient.java:
----------------
import java.rmi.Naming;

public class TestClient
{
public TestClient() {
}

public static void main(String args[]) {
TestClient testClient = new TestClient();
testClient.run();
}

static TestService testStub;

void run() {
long time, start;
try {
start = System.currentTimeMillis();
time = start;

System.out.println("before lookup: "+time);
testStub = (TestService)Naming.lookup("//127.0.0.1:4711/TestServer");
time = System.currentTimeMillis() - start;
System.out.println("after lookup: "+time);
start = System.currentTimeMillis();
System.out.println(testStub.getTest());
time = System.currentTimeMillis() -start;
System.out.println("end: "+time);
} catch (Exception e) {
e.printStackTrace();
}


testStub = null;
}
}

Note: JIT is not the problem

Many thanks
Tassilo
 
A

Andy Fish

Tassilo Kubitz said:
We are using J2RE 1.4.1 IBM AIX build ca141-20030703a (JIT enabled)
on our AIX 5.2.

I wrote a simple RMI-Server that starts up and listening for remote
requests as well as an client, that will access this server.
All platforms (win32, Linux, zLinux, Solaris 2.7, HP-UX 11.10)
performing well.
Only AIX 5.2 has a performance problem during the methods
LocateRegistry.createRegistry(),
Naming.rebind() and Naming.lookup().
All other platforms takes only 20-35 ms for these methods.
AIX needs between 75-90 sec !!!.

I test all of my code locally (localhost - 127.0.0.1), but I found no
explanation. IBM has reproduced this problem, but has no answer.
Do you have any idea?

this is a long shot, but is there a possibility that something in the
innards is trying to do a reverse DNS lookup and that is taking a long time
to timeout?
 
T

Tassilo Kubitz

this is a long shot, but is there a possibility that something in the
innards is trying to do a reverse DNS lookup and that is taking a long time
to timeout?

I had this idea too and I tested this before.
If I do a nslookup everything is fast and ok
localhost <-> 127.0.0.1 ok
FQHN <-> IP ok

I suppose that there must a time-out problem inside the AIX JVM, but
I can't figure out what happens inside. I 'truss'ed the Java process
and got a lot of polls ending in _nsleep... and after a huge amount of
time
everything works as expected. can you see anything inside this:

749646: 0.0003: __loadx(0x07000000, 0x3493E5B8, 0x0000001C,
0x35687748, 0x34935E38) = 0x356879BC
749646: 0.0003: gethostname(0x2FF215D8, 256) = 0
749646: 0.0003: _getpid() = 749646
749646: 0.0002: open("/etc/resolv.conf", O_RDONLY) = 23
749646: 0.0002: kioctl(23, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: = 0
749646: 0.0001: kioctl(23, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: kread(23, " d o m a i n\t u n i x .".., 4096) = 86
749646: kread(23, " d o m a i n\t u n i x .".., 4096) = 0
749646: 0.0002: close(23) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0003: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0003: send(23, 0x2FF201A0, 48, 0) = 48
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 1
749646: 0.0003: nrecvfrom(23, 0x2FF20F50, 1024, 0, 0x2FF1FEE0,
0x2FF1FEA0) = 101
749646: 0.0002: close(23) = 0
749646: 0.0003: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0005: send(23, 0x2FF201A0, 48, 0) = 48
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 1
749646: 0.0003: nrecvfrom(23, 0x2FF20F50, 1024, 0, 0x2FF1FEE0,
0x2FF1FEA0) = 96
749646: 0.0002: close(23) = 0
749646: 0.0002: socket(2, 2, 0) = 23
749646: 0.0002: getsockopt(23, 65535, 4104, 0x2FF1FE34,
0x2FF1FE30) = 0
749646: 0.0002: connext(23, 0x34879DF0, 16) = 0
749646: 0.0002: send(23, 0x2FF201A0, 24, 0) = 24
749646: 0.0002: _poll(0x2FF1FEA8, 1, 5000) = 0
749646: 0.0043: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 0
749646: 0.0064: _nsleep(0x346CE9B0, 0x346CEA28) = 0

<snip snip>

749646: 0.0102: _nsleep(0x355E99F0, 0x355E9A68) = 1
749646: 0.0060: nrecvfrom(24, 0x2FF20160, 1024, 0, 0x2FF1F0F0,
0x2FF1F0B0) = 24
749646: 0.0002: close(24) = 0
749646: 0.0004: _getpid() = 749646
749646: 0.0002: socket(2, 2, 0) = 24
749646: 0.0002: bind(24, 0x2FF20450, 16) Err#13 EACCES
749646: 0.0002: close(24) = 0
749646: 0.0002: getuidx(1) = 503
749646: 0.0002: _getpid() = 749646
749646: 0.0005: socket(2, 2, 17) = 24
749646: 0.0003: _getpid() = 749646
749646: 0.0002: bind(24, 0x2FF20280, 16) Err#13 EACCES
749646: 0.0002: kfcntl(24, F_GETFL, 0x00000000) = 2
749646: 0.0002: kfcntl(24, F_SETFL, 0x00000006) = 0
749646: 0.0002: kioctl(24, -2147195266, 0x2FF202F0, 0x00000000)
= 0
749646: 0.0002: sendto(24, 0x3577D098, 52, 0, 0x3577CE90, 16) =
52
749646: 0.0004: _select(2000, 0x2FF1E300, 0x00000000,
0x00000000, 0x2FF1C290) = 1
749646: 0.0003: nrecvfrom(24, 0x3577CF08, 400, 0, 0x2FF1C2A8,
0x2FF1C280) = 36
749646: 0.0002: close(24) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0002: socket(2, 2, 17) = 24
749646: 0.0002: _getpid() = 749646
749646: 0.0002: bind(24, 0x2FF20280, 16) Err#13 EACCES
749646: 0.0002: kfcntl(24, F_GETFL, 0x00000000) = 2
749646: 0.0002: kfcntl(24, F_SETFL, 0x00000006) = 0
749646: 0.0002: kioctl(24, -2147195266, 0x2FF202F0, 0x00000000)
= 0
749646: 0.0002: kfcntl(24, F_SETFD, 0x00000001) = 0
749646: 0.0002: bind(24, 0x2FF203B0, 16) = 0
749646: 0.0002: ngetsockname(24, 0x2FF203C0, 0x2FF203A0) = 0
749646: 0.0002: _getpid() = 749646
749646: 0.0002: statx("/var/yp/binding/nis.beta.2", 0x3577B958,
76, 0) = 0
749646: 0.0003: ngetsockname(24, 0x2FF20378, 0x2FF20370) = 0
749646: 0.0002: sendto(24, 0x35780D38, 80, 0, 0x3577EA60, 16) =
80
749646: 0.0002: _select(2000, 0x2FF1E370, 0x00000000,
0x00000000, 0x2FF1C300) = 1
749646: 0.0003: nrecvfrom(24, 0x3577EAD8, 8800, 0, 0x2FF1C318,
0x2FF1C2F0) = 32
749646: 0.0002: open("/etc/hosts", O_RDONLY) = 25
749646: 0.0002: kioctl(25, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: 0.0002: kfcntl(25, F_SETFD, 0x00000001) = 0
749646: 0.0003: kioctl(25, 22528, 0x00000000, 0x00000000)
Err#25 ENOTTY
749646: kread(25, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 1920
749646: kread(25, " # @ ( # ) 4 7\t 1 . 1".., 4096) = 0
749646: 0.0002: close(25) = 0
 
C

Chris Uppal

Tassilo said:
I suppose that there must a time-out problem inside the AIX JVM, but
I can't figure out what happens inside. I 'truss'ed the Java process
and got a lot of polls ending in _nsleep... and after a huge amount of
time everything works as expected.

If nobody can suggest anything better, then it might be worthwhile putting a
network sniffer on the box and seeing what (if anything) it's trying to
communicate with and what it's saying.

There seems to be a version of ethereal <www.ethereal.com> for AIX.

-- chris
 
N

Nigel Wade

I had this idea too and I tested this before. If I do a nslookup
everything is fast and ok localhost <-> 127.0.0.1 ok
FQHN <-> IP ok

nslookup goes directly to DNS. It bypasses the resolver libraries,
nsswitch.conf, NIS, local files etc. 127.0.0.1 should never go to DNS, to
resolve localhost shouldn't require a DNS server or even a network at all.

I'd start by looking at resolv.conf/nsswitch.conf (or whatever file AIX
uses) on that system. Check what order it's going to look in local files,
DNS and NIS. If necessary, get it to look in local files first, and make
sure there's an entry for 127.0.0.1 in /etc/hosts which identifies
localhost. You may find that the resolver is setup to use NIS and finally
local files. If NIS is not working this could be the problem.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Similar Threads


Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,057
Latest member
KetoBeezACVGummies

Latest Threads

Top