A Java Brainteaser - a Static Factory method Narrative

Lee Fesperman

Ross said:
This is a misunderstanding. In Java the term 'referential roots' has no
meaning. The JLS simply uses it to refer to a top-level reference that
is in scope. If an object has no roots in the referential set (I.e. in
scope vars of *all kinds*) then it is, by definition, unreachable.
This is how the whole thing about losing a Map full of objects works,
simply by throwing away the map ref (your root reference in this case).

Ok, in general.
The GC Implementation is part of the JVM, but Garbage Collection,
referential roots, reference counting and all that predate Java. The
root set is simply a detail of the JVM implementation and has no bearing
on this argument whatsoever.

No modern JVM/GC uses reference counting.
In the same way that when a reference goes out of scope, the reference
you set to null is removed from that object's reference count when you
make the assignment (or just after I think). Periodically, the JVM runs
through all its objects (for example), and every one it finds with a
zero reference count (indicating it's not reachable) is eligible for GC.
Note this doesn't necessarily mean they *will* be collected, especially
when we factor in active threads and whatnot, but you get the idea.

Again, reference counting is not the way GC works, today; it works purely by
reachability analysis. Reference counting can produce 'orphans'. I think you should
educate yourself better because you are adding to the confusion in this thread.
 
Ross Bamford

Therefore, we should not design our java apps with System.gc calls to
release memory, as they will not behave the same from one run to another.

In fact, I would have preferred a strategy where no GC 'suggestion' is
offered at all in the standard Runtime, and integrators must insert
custom implementation (perhaps of a specific GCManager so as not to
expose too much) to get any kind of handle on the management.

I've never liked 'go do what you do, or don't, I don't care' methods,
especially when there may be a dramatic performance hit, or even VM-wide
freeze-up for a second or two in a busy system... (Witness numerous
Swing apps where the fashion is to do System.gc() on certain menu
commands).

Ross
 
Ross Bamford

No modern JVM/GC uses reference counting.
Again, reference counting is not the way GC works, today; it works purely by
reachability analysis. Reference counting can produce 'orphans'. I think you should
educate yourself better because you are adding to the confusion in this thread.

I'd argue that a lot more confusion comes from messages highlighting
out-of-date facts that are actually irrelevant to the discussion. Okay,
so reference counting isn't used any more - the point is, I don't need
to know or care from a development point of view, because it 'just
works' regardless of where or how I instantiate new objects.

In any case the intent is the same - an unreachable object has no
references. Whether you counted them, tracked them, or registered
details in a database every time one was used (;)), the point is that
you know, in the end, that there are none.

However, I do apologise for my error - I must admit I am only really an
enthusiast when it comes to the JVM's deep internals.

Ross
 
Daniel Sjöblom

Ross said:
In the same way that when a reference goes out of scope, the reference
you set to null is removed from that object's reference count when you
make the assignment (or just after I think). Periodically, the JVM runs
through all its objects (for example), and every one it finds with a
zero reference count (indicating it's not reachable) is eligible for GC.
Note this doesn't necessarily mean they *will* be collected, especially
when we factor in active threads and whatnot, but you get the idea.

I'm not aware of any JVMs that use reference counting for GC. The
problem with reference counting is that objects that circularly point
to each other will be considered live, even though there is no way to
reach them from 'the outside' so to speak. I'll explain a bit further
about the approach normally taken in JVMs.

The basic idea in most JVMs is to keep track of which objects at time A
could possibly be accessed by the program, with the help of something
called reachability analysis. If we consider a simplified JVM with no
static variables and no JNI, and only one thread running, the program at
a point A looks something like this on the stack:

-------------
| main method
-------------
| method a
-------------
| method b
-------------
| method c <- Currently executing method c.
-------------

Where the method call chain is main -> a -> b -> c. Suppose we suspend
the program at this point, and wish to compute the set of all currently
reachable objects. How do we do this? If we do some preparation, it is
not too hard. Before we run any method, we calculate a garbage
collection map for the method. This will simply be a data structure that
contains information about which local variables in the method can
contain references to objects. For instance, it might look like this (*):

public class GCMap
{
    int[] framePointerOffsets;
}

where framePointerOffsets contains the addresses (relative to the
framepointer) of local variables that contain references.

So, now all we need to do this is walk the stack backwards and with the
help of the garbage collection maps add all the references currently
contained in the local variables of each method to a set, which we will
call the root set. The code could look like this:

Set rootSet = empty

for stackframe in stack
    for offset in framePointerOffsets in GCMap of method at stackframe
        rootSet = rootSet U (value at (framePointer + offset))


So now we have our root set. The last thing we need to do is follow all
of the references in the root set to find other live references and from
those references go to yet others and so on. This can be done with a
standard graph search. If some object is not reachable from the root set
of references, it cannot possibly be accessed by the program anymore,
and as such is garbage which can be collected.
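
The graph search described above can be sketched in Java as a rough illustration. This is only a toy model, not how any particular JVM does it: Node, Reachability and mark are hypothetical names, and a Node stands in for a heap object that knows nothing but its outgoing references.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a heap object: it only knows its outgoing references.
final class Node {
    final String name;
    final List<Node> refs = new ArrayList<>();

    Node(String name) { this.name = name; }
}

final class Reachability {
    // Returns every node reachable from the root set, using a depth-first graph search.
    static Set<Node> mark(List<Node> rootSet) {
        Set<Node> live = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(rootSet);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (live.add(n)) {          // first visit: follow its references too
                pending.addAll(n.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b");
        Node x = new Node("x"), y = new Node("y");
        a.refs.add(b);                  // a -> b, and a is in the root set
        x.refs.add(y);                  // x <-> y is a cycle that no root points to
        y.refs.add(x);

        Set<Node> live = mark(Collections.singletonList(a));
        System.out.println(live.contains(b));   // true
        System.out.println(live.contains(x));   // false
    }
}
```

Anything not in the returned set is garbage in this model, including the x/y cycle, whose members still refer to each other.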

This is a simplified explanation of a very simple reachability analysis,
in reality things are more complicated (and there are various
alternative procedures for achieving the same goals). This example
should however also explain why references may need to be 'nulled' at
times. In the gathering of references to the root set, we are only
considering the state of the program at point A, and ignoring anything
that may happen in the future. For instance, we may have a reference in
a local variable at point A which is never read (never used) again in
the future. But our analyzer does not know this! Hence we (as
programmers) may need to manually 'null' references in certain methods,
when we know that a reference points to an object we will not use anymore.

* we assume all local variables are on the stack
 
Lee Fesperman

Ross said:
I'd argue that a lot more confusion comes from messages highlighting
out-of-date facts that are actually irrelevant to the discussion. Okay,
so reference counting isn't used any more - the point is, I don't need
to know or care from a development point of view, because it 'just
works' regardless of where or how I instantiate new objects.

Ok, that's true enough in terms of most of the confusion on this thread and in terms of
where or how you instantiate new objects, but ...
In any case the intent is the same - an unreachable object has no
references. Whether you counted them, tracked them, or registered
details in a database every time one was used (;)), the point is that
you know, in the end, that there are none.

The problem with reference counting (and, probably, the other methods you mentioned) is
circular references. In a circular reference, 2 or more objects reference each other, so
even though they are not reachable from live code, they don't have a reference count of
zero. These could be called orphans, which the OP did mention, thus I didn't want you to
cause him concern over that issue.
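
A toy model may make the circular-reference problem concrete. This is a sketch of naive reference counting only, written by hand for illustration; Counted, pointTo and CycleDemo are hypothetical names, and no JVM is claimed to work this way.

```java
// Toy model of naive reference counting; purely illustrative.
final class Counted {
    int refCount;       // number of references currently pointing at this object
    Counted next;       // the single outgoing reference this object holds

    // Re-point 'next', adjusting the counts of the old and new targets.
    void pointTo(Counted target) {
        if (next != null) next.refCount--;
        next = target;
        if (target != null) target.refCount++;
    }
}

final class CycleDemo {
    public static void main(String[] args) {
        Counted a = new Counted();
        Counted b = new Counted();
        a.refCount++;           // simulate a local variable referring to a
        a.pointTo(b);
        b.pointTo(a);           // a <-> b now reference each other
        a.refCount--;           // the local variable goes out of scope

        // Neither count reaches zero, although nothing outside the cycle refers to a or b.
        System.out.println(a.refCount + " " + b.refCount);  // 1 1
    }
}
```

A collector that only frees objects whose count is zero would never reclaim this pair, which is the orphan case described above.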
 
Ross Bamford

Ok, that's true enough in terms of most of the confusion on this thread and in terms of
where or how you instantiate new objects, but ...


The problem with reference counting (and, probably, the other methods you mentioned) is
circular references. In a circular reference, 2 or more objects reference each other, so
even though they are not reachable from live code, they don't have a reference count of
zero. These could be called orphans, which the OP did mention, thus I didn't want you to
cause him concern over that issue.

Which, then, is where the fact that even if an object has one or more
actual references, if it cannot be referenced from the 'root set' (see
earlier) that fact is discounted (since it isn't reachable). Of course,
it makes sense. I was even contradicting myself ;)

I wonder if you could recommend a good link for a (quick) knowledge
update?

Ross
 
Nick Malik [Microsoft]

Has it occurred to you that the problem isn't the return of the newly
created object, but perhaps it's in the section you labeled 'yada yada'?
public static MyWidget staticMethod() {
    // yada yada
    return new MyWidgetObject();
}

Caveat: I'm not a Java developer. If I say something goofy, please forgive.

Are you creating other things, or manipulating static variables in your
class (or calling another static method on this or another class that
manipulates heap memory) during this 'yada yada'? The problem may be that
these memory locations are shared between threads. Could you be running
into concurrency issues where you have referenced a shared value without
considering the thread-safety of these values?

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
 
Laurent Bossavit

Krasicki,
The larger argument is this. The static method I'm describing is
invoked for every web-based visitor. The server is rebooted every
night due to performance problems. [...] I'm trying to determine
if this static factory method might be contributing to this problem.

It's not; or rather, the sole fact that it's static is not a sufficient
or a necessary condition for this kind of problem to arise. It's
irrelevant. A thread's stack is the root of all object references,
together with static instance *variables*. The keyword static means
something different altogether for methods - unfortunate overloading of
the name. At any rate, a stack frame for a static method is not treated
differently from a stack frame of a regular method.
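
To illustrate the distinction, here is a minimal sketch. The names Widget, WidgetFactory and EVERY_WIDGET_EVER are made up for this example, and nothing is known about what the original 'yada yada' actually does; the point is only that a static *field* keeps objects reachable, while a static *method* by itself does not.

```java
import java.util.ArrayList;
import java.util.List;

final class Widget { }

final class WidgetFactory {
    // Hypothetical leak: a static field is a root, so everything this list holds stays reachable.
    static final List<Widget> EVERY_WIDGET_EVER = new ArrayList<>();

    // Static method with no static state: the caller ends up holding the only reference.
    static Widget create() {
        return new Widget();
    }

    // Same factory, but each instance is also registered in the static list.
    static Widget createAndRemember() {
        Widget w = new Widget();
        EVERY_WIDGET_EVER.add(w);
        return w;
    }
}
```

A widget returned by create() becomes unreachable as soon as the caller drops it. One returned by createAndRemember() stays reachable until it is removed from the list, however the caller treats it.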

Get yourself a copy of OptimizeIt or similar. It's possible to debug
such problems "statically" - by which I mean through the sole means of
reading the source code; I've done it a few times. A profiler is a
quicker way, and it will locate the problem even if it's not due to your
own code but rather to some third-party code.

Interaction between the language and the GC should be covered by the
Java Language Specification - have you read that ? Section 12.6 covers
object finalization.

See also:

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html

Laurent
 
Tony Morris

This is the correct way to expose construction of objects.
Non-private constructors violate encapsulation.
If only this very fundamental, simple and obvious concept was taught from
the beginning.
Book containing explanation pending...

--
Tony Morris
Software Engineer, IBM Australia.
BInfTech, SCJP 1.4, SCJP 5.0, SCJD

http://www.jtiger.org/ JTiger Unit Test Framework for Java
http://qa.jtiger.org/ Java Q&A (FAQ, Trivia)
http://xdweb.net/~dibblego/
 
krasicki

Tony said:
This is the correct way to expose construction of objects.

Don't you mean 'A' correct way.
Non-private constructors violate encapsulation.

I'm not sure where you're going here.
If only this very fundamental, simple and obvious concept was taught from
the beginning.
Book containing explanation pending...

Tony, you're awfully sparse on words here.
 
krasicki

Nick,

Sorry, no hidden gotchas. I was on my way to my son's Pony League
baseball game and was being as straightforward and brief as could be.

However, much of the surrounding code is ugly in a juvenile way which
is why I questioned this technique. I'm guessing whoever wrote the
stuff remembered a shortcut but didn't pay attention to elementary
things like if-then-else (I have multiple if statements one after
another (lots o' them) in a while loop).

This onion would make you all cry.

Thanks, Nick - I hear you and you're correct in asking but what
prompted my question was my inability to find good literature on this
specific subject. I'm being intentionally belligerent to shake it all
out well so that others looking for a comprehensive treatment might get
it here.

Everyone responding has been great and the thread, IMO, is worth a
read.

cheers,

Frank



Has it occurred to you that the problem isn't the return of the newly
created object, but perhaps it's in the section you labeled 'yada yada'?

Caveat: I'm not a Java developer. If I say something goofy, please forgive.

Are you creating other things, or manipulating static variables in your
class (or calling another static method on this or another class that
 
krasicki

Laurent said:
Krasicki,
The larger argument is this. The static method I'm describing is
invoked for every web-based visitor. The server is rebooted every
night due to performance problems. [...] I'm trying to determine
if this static factory method might be contributing to this
problem.

It's not; or rather, the sole fact that it's static is not a sufficient
or a necessary condition for this kind of problem to arise. It's
irrelevant. A thread's stack is the root of all object references,
together with static instance *variables*. The keyword static means
something different altogether for methods - unfortunate overloading of
the name. At any rate, a stack frame for a static method is not treated
differently from a stack frame of a regular method.

Bingo - best answer. Let's repeat it, *The keyword static means
something different altogether for methods - unfortunate overloading of
the name.* This is what confused me most and something I have never
seen said in a technical discussion.
Get yourself a copy of OptimizeIt or similar. It's possible to debug
such problems "statically" - by which I mean through the sole means of
reading the source code; I've done it a few times. A profiler is a
quicker way, and it will locate the problem even if it's not due to your
own code but rather to some third-party code.

Yes, good advice for shops that take advice.
Interaction between the language and the GC should be covered by the
Java Language Specification - have you read that ? Section 12.6 covers
object finalization.

See also:

http://java.sun.com/docs/books/performance/1st_edition/html/JPAppGC.fm.html
Thanks. Yeah I read a lot but it's overwhelming and everything gets
dated very quickly. I had not run across this before and being someone
who likes OO programming I don't look for workarounds - hence a
blindspot with this subject.
 
krasicki

Andrew said:
The main problem with lots of static methods (although I'd call them
'functions'), is that they make the implementation of your application,
procedural rather than OO.

By this I mean, we lose all of the benefits of objects, as these
functions do not live on an object instance - they live on the class.

They can be useful for utility functions like 'System.out.println("..");'
and are very useful for factory methods like yours or on a Singleton.

Other than that, try to prefer an instance method of an object.
- snip -

Yes. This is really the crux of the argument. This is procedural
stuff poking its way into OO territory and I always get the feeling
that people doing it get a secret pleasure out of it - like kicking a
vending machine.
 
Laurent Bossavit

Krasicki,
Yes, good advice for shops that take advice.

There you have it - the Consultant's Ultimate Frustration. Pretty much
the same as the architect's, right ?

Laurent
 
Tony Morris

krasicki said:
Don't you mean 'A' correct way.
No.


I'm not sure where you're going here.

Think about it - it's not a secret (that is, I didn't dream it up - it's a
fundamental fact known to all practising purists).
Tony, you're awfully sparse on words here.

I intended only to provoke thought on behalf of the reader, not write the
book.



--
Tony Morris
Software Engineer, IBM Australia.
BInfTech, SCJP 1.4, SCJP 5.0, SCJD

http://www.jtiger.org/ JTiger Unit Test Framework for Java
http://qa.jtiger.org/ Java Q&A (FAQ, Trivia)
http://xdweb.net/~dibblego/
 
Tor Iver Wilhelmsen

Tony Morris said:
Think about it - it's not a secret (that is, I didn't dream it up - it's a
fundamental fact known to all practising purists).

Since a lot of programmers disagree with you (ie. they think that
constructors can be non-private), calling it a "fundamental fact"
makes you look like an arrogant snob.
 
Joona I Palaste

Tony Morris scribbled the following:
This is the correct way to expose construction of objects.
Non-private constructors violate encapsulation.
If only this very fundamental, simple and obvious concept was taught from
the beginning.
Book containing explanation pending...

I'd love to see that book of yours. I know something about Java, but I
can't think of any useful way to create new objects in Java without using
non-private constructors at some point. Unless you want to use the
static factory pattern in *every* class, which I find ugly.
 
Alvin Ryder

krasicki said:
I had never run into this before but have recently encountered code
that is something like:

// This code runs - I'm generalizing so forgive unintentional oversights

// from a servlet
MyWidget mw = FactoryClass.staticMethod();
:

and in FactoryClass we find:

public static MyWidget staticMethod() {
    // yada yada
    return new MyWidgetObject();
}


My questions are many but first and foremost is, *how is the memory
managed?* Is each new object a new static object with immutable memory
assigned it or is the static memory recycled with each assignment to
mw?

Disclaimer: YES,YES,YES - I know, I know... but take it for what it's
worth.

Hi,

Static factories are a viable, and at times preferred, way of
constructing objects.

One advantage is they can return types other than that of the class
they're contained in:
public static MyShape createSquare() {...}
public static MyShape createCircle() {...}

Another advantage is you can use meaningful names, "staticMethod()"
doesn't qualify. Something like createWidget() would be better.

Josh Bloch talks about static factories in "Effective Java".

Ultimately each object is created with 'new' so there are no additional
memory concerns for you or the gc.
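
As a rough sketch of the two advantages above: MyShape, createSquare and createCircle come from the lines above, while the area() method, the nested Square and Circle classes and the constructor parameters are assumptions added for illustration.

```java
abstract class MyShape {
    abstract double area();

    // Named factories: callers see only MyShape, never the concrete classes.
    static MyShape createSquare(double side)   { return new Square(side); }
    static MyShape createCircle(double radius) { return new Circle(radius); }

    // Hypothetical concrete types, hidden behind private constructors.
    private static final class Square extends MyShape {
        private final double side;
        private Square(double side) { this.side = side; }
        @Override double area() { return side * side; }
    }

    private static final class Circle extends MyShape {
        private final double radius;
        private Circle(double radius) { this.radius = radius; }
        @Override double area() { return Math.PI * radius * radius; }
    }
}
```

Each factory still ends in a plain 'new', so the returned object is an ordinary heap object like any other.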

Cheers.
 
krasicki

Alvin said:
Hi,

Static factories are a viable, and at times preferred, way of
constructing objects.

One advantage is they can return types other than that of the class
they're contained in:
public static MyShape createSquare() {...}
public static MyShape createCircle() {...}

Another advantage is you can use meaningful names, "staticMethod()"
doesn't qualify. Something like createWidget() would be better.

Josh Bloch talks about static factories in "Effective Java".

Ultimately each object is created with 'new' so there are no additional
memory concerns for you or the gc.


That would explain it. I have found Josh Bloch's book to be the most
useless and difficult to implement in the environments I work. While
I'm sure he's got his merits, almost nothing he writes about sticks in
any practical way. At least that was my mileage a few years ago.

OTOH, Peter Haggar, Dov Bulka, and many article writers seem to be spot
on in providing useful memorable implementation.

Now, as neat as the implementations are that you illustrate, how does
this promote OO encapsulation? Wouldn't Fortran be more appropriate?
 
Alvin Ryder

krasicki said:
That would explain it. I have found Josh Bloch's book to be the most
useless and difficult to implement in the environments I work. While
I'm sure he's got his merits, almost nothing he writes about sticks in
any practical way. At least that was my mileage a few years ago.

James Gosling says "many people think he doesn't need a book on Java"
but then he says "but that is one book he needs" ... so I'm not too
surprised if the rest of us find it 'a little out there' ... (Though I
don't regret buying it).

OTOH I can't remember what was said either so I'll have to make up my
own answers ... ;-)
OTOH, Peter Haggar, Dov Bulka, and many article writers seem to be spot
on in providing useful memorable implementation.

Now, as neat as the implementations are that you illustrate, how does
this promote OO encapsulation? Wouldn't Fortran be more appropriate?

I agree, from what I can see there's nothing to be gained in the
example you gave by using static factories. Maybe "yada yada" changes
the situation?

1. The problem is sometimes "new" alone is not enough.

What if you need to do some pre-processing, exactly where would you put
that code?

Shape o = new [not sure exactly what subtype to create];
Shape o = Shape.createShape ("given this info");

If you use Shape.createShape() you can decide exactly what type of
object to create, access its private variables and encase (or
encapsulate) all of that in a sensible place. No other class needs
private variable access or intimate construction details.

The result is an object of the correct type is constructed, Fortran
functions cannot do that ;-)
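
A minimal sketch of point 1 might look like this. Shape.createShape is from the example above; the "<kind> <size>" spec format, the kind() method and the two nested classes are assumptions made up for illustration.

```java
abstract class Shape {
    abstract String kind();

    // Hypothetical spec format: "<kind> <size>", e.g. "circle 2.0".
    static Shape createShape(String spec) {
        String[] parts = spec.trim().split("\\s+");
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected '<kind> <size>': " + spec);
        }
        double size = Double.parseDouble(parts[1]);
        // The pre-processing and the choice of subtype live in one place.
        switch (parts[0]) {
            case "circle": return new Circle(size);
            case "square": return new Square(size);
            default: throw new IllegalArgumentException("unknown shape: " + parts[0]);
        }
    }

    private static final class Circle extends Shape {
        private final double radius;
        private Circle(double radius) { this.radius = radius; }
        @Override String kind() { return "circle"; }
    }

    private static final class Square extends Shape {
        private final double side;
        private Square(double side) { this.side = side; }
        @Override String kind() { return "square"; }
    }
}
```

The caller writes Shape.createShape("circle 2.0") and never needs to name Circle or see its fields.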


2. You may not *necessarily* want to create an object, maybe it exists
already.
OTOH 'new' must create an object when called or throw an exception.

Info i = Info.getInstance (oid);
/*
* Maybe load from XML, maybe from cache,
* maybe use 'new' for a default
* object.
*
* "getInstance" can totally encapsulate
* that logic and keep Info internals
* private.
*/
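
One possible shape for such a getInstance, as a sketch only: the String type of oid, the in-memory cache and the default-object fallback are all assumptions, and the XML loading mentioned in the comment is left out.

```java
import java.util.HashMap;
import java.util.Map;

final class Info {
    // Hypothetical cache keyed by object id.
    private static final Map<String, Info> CACHE = new HashMap<>();

    private final String oid;

    private Info(String oid) { this.oid = oid; }

    String oid() { return oid; }

    // Returns the cached instance for this oid, creating a default one on first use.
    static synchronized Info getInstance(String oid) {
        return CACHE.computeIfAbsent(oid, Info::new);
    }
}
```

Note that the cache is a static field, so every Info it holds stays reachable until it is removed; a cache like this needs some eviction policy if the set of oids is unbounded.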

3. Static factories can be used to alter the actual creation time.
You can prebuild a pool of certain slow to build objects and then dish
them out on demand later.

OTOH "new" must create when called.

Connection c = ConnectionPool.getConnection();

vs

Connection c = new SlowConnection(); // Do it now.
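
A bare-bones pool along those lines could look like the sketch below. ConnectionPool.getConnection and SlowConnection are from the lines above; the Connection interface here is a local placeholder (not java.sql.Connection), and prebuild, release and idleCount are hypothetical helpers.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Local placeholder type for this sketch; not java.sql.Connection.
interface Connection { }

final class SlowConnection implements Connection {
    SlowConnection() { /* imagine an expensive handshake here */ }
}

final class ConnectionPool {
    private static final Deque<Connection> IDLE = new ArrayDeque<>();

    // Pay the construction cost ahead of time.
    static synchronized void prebuild(int count) {
        for (int i = 0; i < count; i++) {
            IDLE.push(new SlowConnection());
        }
    }

    // Hand out a prebuilt connection if one is idle, otherwise build one now.
    static synchronized Connection getConnection() {
        Connection c = IDLE.poll();
        return (c != null) ? c : new SlowConnection();
    }

    // Return a connection to the pool for later reuse.
    static synchronized void release(Connection c) {
        IDLE.push(c);
    }

    static synchronized int idleCount() {
        return IDLE.size();
    }
}
```

The caller's code is the same whether the connection was built a minute ago or just now, which is the point of hiding 'new' behind the factory.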



HTH,
Cheers.
 
