Class dependency problem

Kenneth P. Turvey

I'm looking at doing some class loading in batch for an application I'm
working on. The problem with this is that I need to get a list of all the
classes that are used in the class I'm looking at so I can load all the
dependencies. There won't be any use of dynamically loading classes, but
I still can't seem to find a way to do this before actually running the
method I'm going to call.

Any ideas?

Thanks.

--
Kenneth P. Turvey <[email protected]>
 
Chris Uppal

Kenneth said:
I'm looking at doing some class loading in batch for an application I'm
working on. The problem with this is that I need to get a list of all the
classes that are used in the class I'm looking at so I can load all the
dependencies. There won't be any use of dynamically loading classes, but
I still can't seem to find a way to do this before actually running the
method I'm going to call.

I'm not at all clear on what you are trying to achieve. What do you mean by
"load" that you are going to do in a batch operation ? And why are you going
to do it at all ?

As far as getting a list of dependencies goes, perhaps you could explain what's
wrong with just eyeballing the code and making a list. I'm sure that doesn't
meet your requirements, but it may help clarify if you explain why.


-- chris
 
Kenneth P. Turvey

Chris Uppal said:
I'm not at all clear on what you are trying to achieve. What do you mean by
"load" that you are going to do in a batch operation ? And why are you going
to do it at all ?

I want to be able to programmatically offload a calculation to another
machine. I would like to be able to send the machine a serialized class
and have it take a look at the class and request any other classes it
might need but can't resolve locally. That's the idea anyway.

So, I've got class A that implements some interface that includes the
method (I'm making this up now):

Object doCalculation();

I send this to my remote machine and the remote machine takes a look at my
class and realizes that when it gets around to running my doCalculation()
method it is going to need the MyFastBitSet class and so it says, "please
also send me MyFastBitSet so I'll have it later when I run your
doCalculation() method."

Right now I'm just looking at this to see if it is practical. I may spend
some time on the idea in the future.

Chris Uppal said:
As far as getting a list of dependencies goes, perhaps you could explain what's
wrong with just eyeballing the code and making a list. I'm sure that doesn't
meet your requirements, but it may help clarify if you explain why.

I don't want to have to put together a jar file and handle a bunch of
distribution issues. I want to be able to simply implement an interface
and make a couple function calls and let the system work out the details.
The idea is that others could do the same without an intimate knowledge of
how all the details work.

Thanks.

--
Kenneth P. Turvey <[email protected]>
 
Chris Uppal

Kenneth said:
I want to be able to programmatically offload a calculation to another
machine. I would like to be able to send the machine a serialized class
and have it take a look at the class and request any other classes it
might need but can't resolve locally. That's the idea anyway.

I've never used it, but I thought that RMI does (or can be configured to do)
that sort of thing for you automatically.

An easier way to do this would be to use a custom classloader on the remote
machine which "knows" how to copy across classfiles on demand. That way the
JVM will work out what classes are needed and when, and all you have to do is
write the code to copy the data across. Depending on your setup you might
even be able to get away with using a vanilla URLClassLoader.
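
A minimal sketch of such a loader, under stated assumptions: the class name `OnDemandClassLoader` and the `fetcher` callback are hypothetical, and the callback stands in for whatever code actually copies the class file across from the sending machine. Only `ClassLoader.findClass` and `ClassLoader.defineClass` are standard API here.

```java
import java.util.function.Function;

/**
 * Sketch of a class loader that fetches missing classes on demand.
 * The byte source is a hypothetical callback; in a real system it would
 * ask the sending machine for the class file over the network.
 */
class OnDemandClassLoader extends ClassLoader {
    // Maps a binary class name (e.g. "com.example.MyFastBitSet")
    // to the bytes of its class file, or null if it cannot be supplied.
    private final Function<String, byte[]> fetcher;

    OnDemandClassLoader(ClassLoader parent, Function<String, byte[]> fetcher) {
        super(parent);
        this.fetcher = fetcher;
    }

    // loadClass() calls this only after the parent loaders have failed,
    // so classes already available locally are never fetched.
    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = fetcher.apply(name);
        if (bytes == null) {
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }
}
```

Because the JVM resolves references lazily through the loader that defined the referring class, any class loaded this way pulls in its own dependencies through the same callback. If the class files are simply reachable at a URL, a plain `java.net.URLClassLoader` pointed at that URL covers the same ground without custom code.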

It is, IIRC, a little fiddly to tell an object serialisation stream to use your
own classloader. I think you have to create a trivial subclass which doesn't
use the application classloader. I forget the details, I'm afraid.
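
One way this is commonly done is to override `ObjectInputStream.resolveClass`. The sketch below assumes that approach; the class name `LoaderAwareObjectInputStream` is made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectStreamClass;

/** ObjectInputStream that resolves classes through a caller-supplied loader. */
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // initialize=false: locate the class without running static initializers yet
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup, which also handles primitive types
            return super.resolveClass(desc);
        }
    }
}
```

Passing the on-demand loader to this stream means that deserialising the task object is itself enough to trigger fetching of the task's class.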

-- chris
 
Nigel Wade

Kenneth said:
I want to be able to programmatically offload a calculation to another
machine. I would like to be able to send the machine a serialized class
and have it take a look at the class and request any other classes it
might need but can't resolve locally. That's the idea anyway.

So, I've got class A that implements some interface that includes the
method (I'm making this up now):

Object doCalculation();

I send this to my remote machine and the remote machine takes a look at my
class and realizes that when it gets around to running my doCalculation()
method it is going to need the MyFastBitSet class and so it says, "please
also send me MyFastBitSet so I'll have it later when I run your
doCalculation() method."

Right now I'm just looking at this to see if it is practical. I may spend
some time on the idea in the future.


I don't want to have to put together a jar file and handle a bunch of
distribution issues. I want to be able to simply implement an interface
and make a couple function calls and let the system work out the details.
The idea is that others could do the same without an intimate knowledge of
how all the details work.

Thanks.

Isn't RMI able to provide that functionality? It's not something I've personally
had any experience of; whenever I've used RMI I've made the necessary classes
directly available to the RMI registry. However, my cursory reading of the
subject indicates that this ought to be possible, but I've no idea of how
complex it would be to set up.

This document might provide some useful pointers, and at least tell you if it's
feasible: http://research.sun.com/techrep/2006/smli_tr-2006-149.pdf
 
Patrick May

Kenneth P. Turvey said:
I want to be able to programmatically offload a calculation to another
machine. I would like to be able to send the machine a serialized
class and have it take a look at the class and request any other
classes it might need but can't resolve locally. That's the idea
anyway.

Look into Jini (http://www.jini.org) and the JavaSpace service in
particular. It will do exactly what you want.

Artima has a series of articles from around 2000 that explain the
basics (http://www.artima.com/jini/jiniology/js1.html) or you can just
Google for '"compute server" jini javaspace'.

Regards,

Patrick
 
Kenneth P. Turvey

Patrick May said:
Look into Jini (http://www.jini.org) and the JavaSpace service in
particular. It will do exactly what you want.

Artima has a series of articles from around 2000 that explain the
basics (http://www.artima.com/jini/jiniology/js1.html) or you can just
Google for '"compute server" jini javaspace'.

I've read through this and I don't really think it addresses the problem
that I'm looking at now. It may be that it is easily addressed, but I
just don't see how at this time.

The problem is that I would like the "Master" from the compute server to
be able to connect to the space and put a bunch of tasks in that it would
like the "Worker" nodes to compute. Then the "Master" should be able to
disconnect and be turned off.

At some later point in time I would like the "Worker" to be able to
connect and complete one of the tasks added by the "Master". From what
I've read about the compute farm using JavaSpaces, it doesn't look like
this model works. It sounds like the "Master" must be connected so that
the "Workers" can get the classes they need to perform the tasks.

I hope I'm making this clear.

The problem is that by the time the worker node gets to doing the task, I
would like the master node to be disconnected, turned off, doing something
else. At some later time (or date) the master node would reconnect and
retrieve its results.

It seems that in order to do this the master would have to be able to
identify exactly which classes will be needed by the worker. I don't
really see how this is done offline.

Any suggestions would be greatly appreciated.

Thanks,

--
Kenneth P. Turvey <[email protected]>
 
Chris Uppal

Kenneth said:
The problem is that I would like the "Master" from the compute server to
be able to connect to the space and put a bunch of tasks in that it would
like the "Worker" nodes to compute. Then the "Master" should be able to
disconnect and be turned off.

But this is a completely different problem from the one you originally
described, where the worker could ask for classes it needed but didn't have.
The new version, fire-and-forget, is perfectly reasonable, but requires a
different architecture.

For a start, the job fired off to the worker must have /all/ the necessary
information for computation to complete (configuration, input data, classfiles,
...). One natural medium for delivery of all that data would be a JAR file
containing everything needed. Note that in a fire-and-forget environment it is
better to be pessimistic about providing data, in particular the effort of
identifying a truly minimal set of required classes would be a waste of time.

So I would just fire off a JAR file containing all the calculation classes with
every request. There would be a lot of duplication, but so what ? It would
make your life much easier.
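
Building such a JAR from code is short. The following is only a sketch: `JobJarBuilder` and `buildJar` are illustrative names, and it assumes the compiled calculation classes already sit in one directory tree.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class JobJarBuilder {
    /** Packs every .class file under classesDir into jarFile, preserving package paths. */
    static void buildJar(Path classesDir, Path jarFile) throws IOException {
        List<Path> classFiles;
        try (Stream<Path> walk = Files.walk(classesDir)) {
            classFiles = walk.filter(Files::isRegularFile)
                             .filter(p -> p.toString().endsWith(".class"))
                             .sorted()
                             .collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(jarFile);
             JarOutputStream jar = new JarOutputStream(out)) {
            for (Path file : classFiles) {
                // JAR entry names always use '/' regardless of platform
                String entryName = classesDir.relativize(file).toString().replace('\\', '/');
                jar.putNextEntry(new JarEntry(entryName));
                Files.copy(file, jar);
                jar.closeEntry();
            }
        }
    }
}
```

Configuration and input data could be added as further entries in the same way.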

If you do want to try to cut down the number of classes sent, then you'll have
to do it on the master machine, or in advance (perhaps by hand). If you want
to generate such a list automatically (and I repeat that I don't think it would
be worthwhile), then a good way to do it would be to scan the bytecodes of the
relevant classes, forming a transitive closure of the class references starting
at the "entry-point". The easy, but coarse, way to track inter-class
references is to look at each classfile's constant pool for references to other
classes. If you are up for a bit of extra effort you could do the same kind of
thing, but at method granularity. In either case, a classfile library like ASM
or BCEL would do the bulk of the work for you. Note that you would have to
find some way of handling reflective references to classes (Class.forName()) --
the simplest way of doing that would be to forbid them. Alternatively look for
literal strings which are valid class names (in JNI syntax).
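
The coarse constant-pool approach can be sketched without any library, since the class-file header and constant pool are simple to parse. This is an illustration rather than a replacement for ASM or BCEL: the names `ConstantPoolScanner`, `referencedClasses` and `closure` are invented here, platform classes under `java/` are assumed to exist on every worker and are skipped, and reflective references are not detected at all.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Set;
import java.util.TreeSet;

/** Coarse dependency scan: collects class names from class-file constant pools. */
class ConstantPoolScanner {

    /** Returns internal names (e.g. "java/util/BitSet") of all CONSTANT_Class entries. */
    static Set<String> referencedClasses(InputStream classFile) throws IOException {
        DataInputStream in = new DataInputStream(classFile);
        if (in.readInt() != 0xCAFEBABE) {
            throw new IOException("not a class file");
        }
        in.readUnsignedShort(); // minor version
        in.readUnsignedShort(); // major version
        int count = in.readUnsignedShort(); // number of entries + 1
        String[] utf8 = new String[count];
        int[] classNameIndex = new int[count]; // 0 means "not a Class entry"
        for (int i = 1; i < count; i++) {
            int tag = in.readUnsignedByte();
            switch (tag) {
                case 1: utf8[i] = in.readUTF(); break;                     // Utf8
                case 7: classNameIndex[i] = in.readUnsignedShort(); break; // Class
                case 8: case 16: case 19: case 20: in.readFully(new byte[2]); break;
                case 15: in.readFully(new byte[3]); break;                 // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12: case 17: case 18:
                    in.readFully(new byte[4]); break;
                case 5: case 6: in.readFully(new byte[8]); i++; break;     // Long/Double use two slots
                default: throw new IOException("unknown constant pool tag " + tag);
            }
        }
        Set<String> names = new TreeSet<>();
        for (int index : classNameIndex) {
            if (index != 0) {
                names.add(utf8[index]);
            }
        }
        return names;
    }

    /** Transitive closure from an entry point, reading class files through the given loader. */
    static Set<String> closure(String entryPoint, ClassLoader loader) throws IOException {
        Set<String> found = new TreeSet<>();
        Deque<String> todo = new ArrayDeque<>();
        todo.add(entryPoint);
        while (!todo.isEmpty()) {
            String name = todo.remove();
            if (name.startsWith("[")) {
                // Array descriptor such as "[Ldemo/A;": keep the element class, drop primitives
                if (!name.endsWith(";")) continue;
                name = name.substring(name.indexOf('L') + 1, name.length() - 1);
            }
            // Assumption: platform classes are present on every worker, so skip them
            if (name.startsWith("java/") || found.contains(name)) continue;
            try (InputStream in = loader.getResourceAsStream(name + ".class")) {
                if (in == null) continue; // class file not visible to this loader
                found.add(name);
                todo.addAll(referencedClasses(in));
            }
        }
        return found;
    }
}
```

A call such as `ConstantPoolScanner.closure("com/example/A", A.class.getClassLoader())` (with a hypothetical class `A`) would return the internal names of every non-platform class reachable from `A` through constant-pool references.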

There are tools available which will do this kind of analysis in order to
reduce the size of JAR files. You might be able to apply such a tool directly,
or "borrow" some of its code.

-- chris
 
Patrick May

Kenneth P. Turvey said:
I've read through this and I don't really think it addresses the
problem that I'm looking at now. It may be that it is easily
addressed, but I just don't see how at this time.

The problem is that I would like the "Master" from the compute
server to be able to connect to the space and put a bunch of tasks
in that it would like the "Worker" nodes to compute. Then the
"Master" should be able to disconnect and be turned off.

At some later point in time I would like the "Worker" to be able to
connect and complete one of the tasks added by the "Master". From
what I've read about the compute farm using JavaSpaces, it doesn't
look like this model works. It sounds like the "Master" must be
connected so that the "Workers" can get the classes they need to
perform the tasks.

A Jini-based compute server can accomplish this easily. Consider
a Task class that exposes an execute() method. The Master configures
Task instances and writes them into a JavaSpace. Worker processes
take Tasks from the space. If the Worker has the classes required by
the Task in its classpath, it simply calls execute(). If the Worker
does not know about the necessary classes, they are automatically
downloaded from the class server.
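
The shape of that arrangement can be sketched in plain Java. Everything below is a stand-in so the example runs without Jini: the `Task` interface, `SumTask` and `ComputeFarmSketch` are hypothetical names, and an in-memory queue plays the role of the JavaSpace. In a real deployment the task type would also implement the Jini `net.jini.core.entry.Entry` interface and be written to and taken from an actual space.

```java
import java.io.Serializable;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical task contract; the only type the workers must know in advance. */
interface Task extends Serializable {
    Object execute();
}

/** Example task: sums the integers from 1 to n. */
class SumTask implements Task {
    private static final long serialVersionUID = 1L;
    private final int n;

    SumTask(int n) {
        this.n = n;
    }

    @Override
    public Object execute() {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}

class ComputeFarmSketch {
    // Stand-in for the JavaSpace: an in-memory queue, NOT a real space
    static final BlockingQueue<Task> space = new LinkedBlockingQueue<>();

    /** Master side: write a task and walk away. */
    static void submit(Task task) {
        space.add(task);
    }

    /** Worker side: take one task, run it, return the result. */
    static Object workOnce() throws InterruptedException {
        return space.take().execute();
    }
}
```

The worker only ever refers to `Task`; which implementation class arrives, and where its bytecode comes from, is left to the class-loading machinery.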

Naturally, the use of a class server is more flexible because the
Workers don't need to be redeployed when the Task implementation
changes.

Kenneth P. Turvey said:
The problem is that by the time the worker node gets to doing the
task, I would like the master node to be disconnected, turned off,
doing something else. At some later time (or date) the master node
would reconnect and retrieve its results.

The class server doesn't need to run on the same machine as the
Master process. Unless performance requirements indicate otherwise,
the logical place to run it is the same machine running the JavaSpace.

Regards,

Patrick
 
Kenneth P. Turvey

Patrick May said:
A Jini-based compute server can accomplish this easily. Consider
a Task class that exposes an execute() method. The Master configures Task
instances and writes them into a JavaSpace. Worker processes take Tasks
from the space. If the Worker has the classes required by the Task in its
classpath, it simply calls execute(). If the Worker does not know about
the necessary classes, they are automatically downloaded from the class
server.

This is pretty much what I had in mind. The question that arises is how
do we know what classes to give to the class server? The master connects
to the space (or the class server) and says "I want this task done". The
next step is for the master to hand off the necessary classes to the class
server. How does it know which ones to hand off?

Patrick May said:
Naturally, the use of a class server is more flexible because the
Workers don't need to be redeployed when the Task implementation changes.

Yes, this is exactly what I'm looking for.

Patrick May said:
The class server doesn't need to run on the same machine as the
Master process. Unless performance requirements indicate otherwise, the
logical place to run it is the same machine running the JavaSpace.

So this works? I'm still unclear on how.

--
Kenneth P. Turvey <[email protected]>
 
Patrick May

Kenneth P. Turvey said:
This is pretty much what I had in mind. The question that arises is
how do we know what classes to give to the class server?

In this case, the Workers would need to know the Task interface,
but the implementation class(es) for the Tasks would be in a jar file
provided by the class server.

Kenneth P. Turvey said:
The master connects to the space (or the class server) and says "I
want this task done".

The Master connects only to the JavaSpace (which it has found via
the Jini Lookup Service) and writes a Task to the space. The Task
interface must be a subtype of the Jini Entry interface.

Kenneth P. Turvey said:
The next step is for the master to hand off the necessary classes to
the class server. How does it know which ones to hand off?

The jar file containing the necessary classes must be deployed to
the class server before starting processing.

Kenneth P. Turvey said:
So this works? I'm still unclear on how.

The class server is no more than a simple HTTP server. The
configuration file for the Master and Worker processes specifies which
class servers to use on startup.

I'll try to dig up an example that I can strip down to the bare
essentials this weekend.

Regards,

Patrick
 
Kenneth P. Turvey

Chris Uppal said:
But this is a completely different problem from the one you originally
described, where the worker could ask for classes it needed but didn't
have. The new version, fire-and-forget, is perfectly reasonable, but
requires a different architecture.

I looked at the previous articles in the thread. I think what I said is
pretty clear, but you know how that goes... I do appreciate your effort to
decipher what I'm writing.

Chris Uppal said:
For a start, the job fired off to the worker must have /all/ the necessary
information for computation to complete (configuration, input data,
classfiles, ...). One natural medium for delivery of all that data would
be a JAR file containing everything needed. Note that in a fire-and-forget
environment it is better to be pessimistic about providing data, in
particular the effort of identifying a truly minimal set of required
classes would be a waste of time.

You have a point about the complexity of making sure all files are
available. Some of this can be handled by using a custom security model
(which is what I was planning on doing anyway) to disallow the kinds of
programming constructs that would cause problems.

Chris Uppal said:
So I would just fire off a JAR file containing all the calculation classes
with every request. There would be a lot of duplication, but so what ?
It would make your life much easier.

That might be the way to go. It just isn't as clean as calling a method
doMyWork().

Chris Uppal said:
If you do want to try to cut down the number of classes sent, then you'll
have to do it on the master machine, or in advance (perhaps by hand). If
you want to generate such a list automatically (and I repeat that I don't
think it would be worthwhile), then a good way to do it would be to scan
the bytecodes of the relevant classes, forming a transitive closure of the
class references starting at the "entry-point". The easy, but coarse, way
to track inter-class references is to look at each classfile's constant
pool for references to other classes. If you are up for a bit of extra
effort you could do the same kind of thing, but at method granularity. In
either case, a classfile library like ASM or BCEL would do the bulk of the
work for you. Note that you would have to find some way of handling
reflective references to classes (Class.forName()) -- the simplest way of
doing that would be to forbid them. Alternatively look for literal
strings which are valid class names (in JNI syntax).

I'll think about this a bit.

Chris Uppal said:
There are tools available which will do this kind of analysis in order to
reduce the size of JAR files. You might be able to apply such a tool
directly, or "borrow" some of its code.

I used to use a tree-shaker under Lisp. I hadn't seen anything similar
under Java. I guess it doesn't surprise me that such a beast lives.

--
Kenneth P. Turvey <[email protected]>
 
