Complex object graph and "this" escaping constructor.

Daniel Pitts · Oct 28, 2007

So, I know why it is bad for the "this" reference to escape the
constructor, and that many people advocate using a static factory to
help alleviate this problem, but it seems to me that you could run into
the same kind of trouble.
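
For readers who want the two cases side by side, here is a minimal Java sketch. All names (EventSource, Listener, LeakyWidget, SafeWidget) are hypothetical and not taken from the project discussed in this thread. LeakyWidget lets "this" escape by registering itself inside its constructor; SafeWidget uses a static factory that registers the instance only after the constructor has returned. The factory helps in this simple case, but it does not by itself protect an object that still needs further wiring after create() returns.

```java
import java.util.ArrayList;
import java.util.List;

interface Listener {
    void onEvent(String event);
}

class EventSource {
    private final List<Listener> listeners = new ArrayList<>();

    void register(Listener listener) { listeners.add(listener); }

    void fire(String event) {
        for (Listener listener : listeners) listener.onEvent(event);
    }
}

// Unsafe: registers itself inside the constructor, before 'name' is assigned.
class LeakyWidget implements Listener {
    final List<String> seen = new ArrayList<>();
    private final String name;

    LeakyWidget(EventSource source) {
        source.register(this);   // "this" escapes here
        source.fire("early");    // simulates an immediate callback from elsewhere
        this.name = "widget";
    }

    @Override
    public void onEvent(String event) {
        seen.add(name + ":" + event);   // name is still null during "early"
    }
}

// Safer: the constructor only sets fields; the factory publishes afterwards.
class SafeWidget implements Listener {
    final List<String> seen = new ArrayList<>();
    private final String name;

    private SafeWidget() { this.name = "widget"; }

    static SafeWidget create(EventSource source) {
        SafeWidget widget = new SafeWidget();
        source.register(widget);   // published only once fully constructed
        return widget;
    }

    @Override
    public void onEvent(String event) {
        seen.add(name + ":" + event);
    }
}
```

In this sketch the leaky version records "null:early", because the callback runs before the final field has been assigned.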

Any time you pass an instance to another class before it is fully
initialized, you run the risk of that other class doing things that it
shouldn't. As far as I know there isn't a way to ensure an atomic
operation that expresses relationships between many objects to many
other objects. Any setSomething method has the right to call any public
method on the Something instance it is passed. It seems like it would
be a burden to add a "completely initialized" runtime check on every
behavior/knowledge querying method. The best it could do is throw an
IllegalStateException. If you have a lot of places this could happen it
becomes onerous to implement them all.

I suppose you could create some sort of universal contract/convention
that relationship builders (generally constructors and setters) don't do
anything to the objects they receive or already know about, other than
to save the given reference. This seems fragile at best, because the
compiler has no way to enforce this, and someone may unwittingly access
an object before all its required relationships are fully configured.

This is especially troublesome with circular references, which I've been
told should be avoided (probably for just this reason). Unfortunately,
I don't see a rule that can be applied to systematically avoid circular
references during initialization. In my particular case,
I have a class Robot, which has a Computer object, which has a Memory
object and a Bus object. The Computer needs to know about the Robot,
and vice versa. I can instantiate them separately, and then pass the
references of each other to each other, but the Computer implementor
would have to explicitly know that they can't access the Robot object
during that reference exchange.
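
A minimal Java sketch of that reference exchange, combined with the "completely initialized" runtime check mentioned earlier. The method names (setRobot, setComputer, finishWiring, heading) and the boolean flag are assumptions made for illustration, not the actual design. The Computer setter deliberately breaks the "only store the reference" convention to show what the guard catches.

```java
class Computer {
    private Robot robot;

    // A setter that breaks the "only store the reference" convention:
    // it queries the robot while the exchange is still in progress.
    void setRobot(Robot robot) {
        this.robot = robot;
        robot.heading();   // hypothetical premature access
    }
}

class Robot {
    private Computer computer;
    private boolean wired;   // flipped once all relationships are in place

    void setComputer(Computer computer) { this.computer = computer; }

    void finishWiring() {
        if (computer == null) throw new IllegalStateException("no computer attached");
        wired = true;
    }

    // Every query method has to repeat this guard, which is the burden described above.
    int heading() {
        if (!wired) throw new IllegalStateException("Robot not fully initialized");
        return 0;
    }
}

class WiringDemo {
    public static void main(String[] args) {
        Robot robot = new Robot();
        Computer computer = new Computer();
        robot.setComputer(computer);
        try {
            computer.setRobot(robot);   // robot.finishWiring() has not run yet
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

The guard turns a silent misuse into an exception, but only at run time, and only in methods where someone remembered to add it.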

Of course, I could just be confusing matters and making this into a
bigger problem than it is. Any thoughts?

Lasse Reichstein Nielsen · Oct 29, 2007

Daniel Pitts said:
This is especially troublesome with circular references, which I've
been told should be avoided (probably for just this reason).
Unfortunately, I don't see a rule that can be applied to
systematically avoid circular references during initialization. In my
particular case,
I have a class Robot, which has a Computer object, which has a Memory
object and a Bus object. The Computer needs to know about the Robot,
and vice versa.

Why? What is the relationship between Robot and Computer that the
references model? Does that relationship really need to be created at
the time the Robot and/or Computer is created?
Can a computer not be moved to another robot?
Are the Robot and Computer really separate entities, or does a Robot
have a Computer aspect to it ... e.g., by implementing a Computer
interface?

I can instantiate them separately, and then pass the
references of each other to each other, but the Computer implementor
would have to explicitly know that they can't access the Robot object
during that reference exchange.

"Computer" sounds more generic than something that inherently knows
about a Robot.

Of course, I could just be confusing matters and making this into a
bigger problem than it is. Any thoughts?

I have this problem with MVC all the time

/L

Roedy Green · Oct 29, 2007

So, I know why it is bad for the "this" reference to escape the
constructor, and that many people advocate using a static factory to
help alleviate this problem, but it seems to me that you could run into
the same kind of trouble.

There is a similar problem of calling virtual methods in constructors
that may access uninitialised (but zeroed) fields.

I think what Java aims to do is prevent you from doing anything that
will crash the JVM. However, you are free to write programs that give
meaningless results.

The idea is they will put in a safety net wherever it is easy to do
so, but feel no guilt about allowing you to write silly code.
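
A small Java sketch of the virtual-method case, with hypothetical class names (Base, Derived). The superclass constructor calls an overridable method, and the override reads a subclass field whose initializer has not run yet, so it sees the zeroed default.

```java
class Base {
    Base() {
        describe();   // virtual call: runs the subclass override, if any
    }

    void describe() {
        // default does nothing
    }
}

class Derived extends Base {
    // Deliberately not final: a final int with a constant initializer would be
    // inlined by the compiler and hide the effect.
    private int size = 42;

    // No initializer here, otherwise it would overwrite the value recorded below.
    String seen;

    @Override
    void describe() {
        seen = "size=" + size;   // runs before "size = 42" has executed
    }

    int size() { return size; }
}

class VirtualCallDemo {
    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seen);     // prints "size=0"
        System.out.println(d.size());   // prints "42"
    }
}
```

Nothing crashes here; the program simply records a meaningless value, which matches the point above.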

Daniel Pitts · Oct 29, 2007

Lasse said:
Why? What is the relationship between Robot and Computer that the
references model? Does that relationship really need to be created at
the time the Robot and/or Computer is created?
Can a computer not be moved to another robot?
Are the Robot and Computer really separate entities, or does a Robot
have a Computer aspect to it ... e.g., by implementing a Computer
interface?

Really, a computer is an aspect of a Robot, but it is complex enough
that I'd like to separate it out. A robot also has a weapon, heatsink,
engine, shield, etc... The computer is the one that actuates all of
these, so it needs a reference either to those, or to the robot which
owns them.

Of course, I realized after the fact that I should probably just use an
event-driven system instead. E.g., the weapon registers itself as a
listener on the appropriate bus channel of the computer.

"Computer" sounds more generic than something that inherently knows
about a Robot.

In this simulation, it is coupled to the robot. However, using the event
model I can decouple it.

I have this problem with MVC all the time

Yeah, that's the other issue I'm dealing with on the same project

Thanks.

Daniel Pitts · Oct 29, 2007

Roedy said:
There is a similar problem of calling virtual methods in constructors
that may access uninitialised (but zeroed) fields.

I think what Java aims to do is prevent you from doing anything that
will crash the JVM. However, you are free to write programs that give
meaningless results.

The idea is they will put in a safety net wherever it is easy to do
so, but feel no guilt about allowing you to write silly code.

Calling possibly overridden methods is (maybe not obviously) the same as
allowing a this reference to escape.

In any case, I'm trying to avoid writing silly code. To me, silly means
code that is too complex and too defensive, or code that is too
simplistic and not "correct".

Thanks,
Daniel

Lasse Reichstein Nielsen · Oct 29, 2007

Daniel Pitts said:
Lasse Reichstein Nielsen wrote:

Really, a computer is an aspect of a Robot, but it is complex enough
that I'd like to separate it out. A robot also has a weapon,
heatsink, engine, shield, etc... The computer is the one that
actuates all of these, so needs a reference either to those, or to the
robot which owns them.

Ah. I still think the computer could exist on its own, and it should
then accept references to the parts it really needs, and the Robot,
which is the concept combining the parts, injects references to its
parts into the computer when it is built.

Until the computer is placed inside a robot, it won't have any parts
to control.
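
A rough Java sketch of that arrangement. The Part interface, the attach method, and the choice of Weapon and Shield as parts are assumptions for illustration. The Computer never sees a Robot; it only receives parts that are already fully constructed.

```java
import java.util.ArrayList;
import java.util.List;

interface Part {
    String name();
}

class Weapon implements Part {
    public String name() { return "weapon"; }
}

class Shield implements Part {
    public String name() { return "shield"; }
}

// Exists on its own; knows parts, never the Robot.
class Computer {
    private final List<Part> parts = new ArrayList<>();

    void attach(Part part) { parts.add(part); }   // only stores the reference

    int partCount() { return parts.size(); }
}

// The concept that combines the parts and injects them into the computer.
class Robot {
    private final Computer computer;
    private final Weapon weapon = new Weapon();
    private final Shield shield = new Shield();

    Robot(Computer computer) {
        this.computer = computer;
        // The parts are complete objects by now; "this" is never handed out.
        computer.attach(weapon);
        computer.attach(shield);
    }

    Computer computer() { return computer; }
}
```

A Computer created on its own reports zero parts; after new Robot(computer) it reports two, and at no point does it hold a reference to a half-built Robot.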

Yes, I'm a fan of dependency injection

Is the robot any more than the sum of its parts?

Of course, I realized after the fact that I should probably just use
an event-driven system instead. E.g., the weapon registers itself as a
listener on the appropriate bus channel of the computer.

That's another approach, which allows even greater decoupling.
The computer doesn't even need to know that there is a recipient for
its events.
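
As an illustration only, a listener-based bus could look like the Java sketch below. The integer port numbers, the subscribe and write methods, and the use of IntConsumer are assumptions, not the actual simulation API. The listener is registered from outside the Weapon, after it has been constructed, so no "this" escapes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.IntConsumer;

// The bus knows only ports and listeners, not what the listeners are.
class Bus {
    private final Map<Integer, List<IntConsumer>> channels = new HashMap<>();

    void subscribe(int port, IntConsumer listener) {
        channels.computeIfAbsent(port, p -> new ArrayList<>()).add(listener);
    }

    // Writing to a port with no listeners is simply a no-op.
    void write(int port, int value) {
        List<IntConsumer> listeners = channels.getOrDefault(port, Collections.emptyList());
        for (IntConsumer listener : listeners) listener.accept(value);
    }
}

class Weapon {
    int shotsFired;

    void fire(int count) { shotsFired += count; }
}

class BusDemo {
    public static void main(String[] args) {
        Bus bus = new Bus();
        Weapon weapon = new Weapon();
        bus.subscribe(16, weapon::fire);   // hypothetical port number
        bus.write(16, 3);                  // reaches the weapon
        bus.write(17, 5);                  // nobody is listening on this port
        System.out.println(weapon.shotsFired);   // prints "3"
    }
}
```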

/L

Dmitry A. Kazakov · Oct 29, 2007

So, I know why it is bad for the "this" reference to escape the
constructor, and that many people advocate using a static factory to
help alleviate this problem, but it seems to me that you could run into
the same kind of trouble.

Any time you pass an instance to another class before it is fully
initialized, you run the risk of that other class doing things that it
shouldn't. As far as I know there isn't a way to ensure an atomic
operation that expresses relationships between many objects to many
other objects.

The problem is not whether operations are atomic or not. Clearly public
operations have to honor object invariants. The problem is the language
design and its improper construction model. While being constructed the
object does not yet exist, hence a call to *any* of its public methods is
illegal. As simple as that. When the specific object is constructed you can
call any *specific* method, but it is still illegal to call any
*dispatching* ones, or any method potentially dispatching from inside. Only
when the polymorphic object is finally constructed can one dispatch. This
is the semantics of construction of a class instance, which you should
mentally map onto its [admittedly inconsistent] implementation in the
given programming language (like C++ or Java).

This is especially troublesome with circular references, which I've been
told should be avoided (probably for just this reason). Unfortunately,
I don't see a rule that can be applied to systematically avoid circular
references during initialization.

No such thing can exist. It is as above. A reference may not exist before
the target object. And the target object does not exist before construction.
Hence, there cannot be any circular references upon initialization. Your
problem is that semantically, you take a raw pointer (to specific object)
and then cast it to a reference to a polymorphic object. That is
semantically illegal, even if the compiler does not tell you so.

I have a class Robot, which has a Computer object, which has a Memory
object and a Bus object. The Computer needs to know about the Robot,
and vice versa.

Then one object should be able to live without another. That would break
the circle. You first create, say, a computer without any robot and then
attach a robot to it. Further, the computer should be functional when it
has no robot attached.

However, circular dependencies usually indicate a design problem. Perhaps
you should reconsider your design.

Mark Nicholls · Oct 29, 2007

So, I know why it is bad for the "this" reference to escape the
constructor, and that many people advocate using a static factory to
help alleviate this problem, but it seems to me that you could run into
the same kind of trouble.

yes...not sure about the static factory bit...but let's leave it.

Any time you pass an instance to another class before it is fully
initialized, you run the risk of that other class doing things that it
shouldn't. As far as I know there isn't a way to ensure an atomic
operation that expresses relationships between many objects to many
other objects. Any setSomething method has the right to call any public
method on the Something instance it is passed. It seems like it would
be a burden to add a "completely initialized" runtime check on every
behavior/knowledge querying method. The best it could do is throw an
IllegalStateException. If you have a lot of places this could happen it
becomes onerous to implement them all.

hmmmm

it is 'logically valid' to consider objects with circular references...
the problem, as you say, comes at construction.....I need to create my
chicken and egg at the same time.....which I clearly can't.

But the problem in some sense is not an OO one....it's a real world
one....see below....

I suppose you could create some sort of universal contract/convention
that relationship builders (generally constructors and setters) don't do
anything to the objects they receive or already know about, other than
to save the given reference. This seems fragile at best, because the
compiler has no way to enforce this, and someone may unwittingly access
an object before all its required relationships are fully configured.

This is especially troublesome with circular references, which I've been
told should be avoided (probably for just this reason). Unfortunately,
I don't see a rule that can be applied to systematically avoid circular
references during initialization. In my particular case,
I have a class Robot, which has a Computer object, which has a Memory
object and a Bus object. The Computer needs to know about the Robot,
and vice versa. I can instantiate them separately, and then pass the
references of each other to each other, but the Computer implementor
would have to explicitly know that they can't access the Robot object
during that reference exchange.

Of course, I could just be confusing matters and making this into a
bigger problem than it is. Any thoughts?

your robot/computer example is quite nice.....

but if I were to create a robot and a computer in the real world how
would I do it? At what point does it become a robot?

In the real world I would create a mechanical robot chassis.....and
then add a computer to it....i.e. the construction of a 'robot'
requires a constructed computer....in the real world the computer does
not have to be embedded in a robot...so why would you do this in
software?

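    interface IComputer
    {
        // ..... bla....
    }

    interface IRobot
    {
        // ..... bla....
    }

    class Robot : IRobot
    {
        private readonly IComputer computer;

        // the robot is built from an already constructed computer
        public Robot(IComputer computer)
        {
            this.computer = computer;
            // ....bla....
        }
    }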

Daniel Pitts · Oct 29, 2007

Lasse said:
Ah. I still think the computer could exist on its own, and it should
then accept references to the parts it really needs, and the Robot,
which is the concept combining the parts, injects references to its
parts into the computer when it is built.

Until the computer is placed inside a robot, it won't have any parts
to control.

Yes, I'm a fan of dependency injection

I'm a fan of DI, but more specifically (or is it generally) a fan of
inversion of control.

Is the robot any more than the sum of its parts?

That's another approach, which allows even greater decoupling.
The computer doesn't even need to know that there is a recipient for
its events.

Right, and its events will be as generic as Memory at Address 6 set to,
or Port number 16 read.

Thanks for the reply.

H. S. Lahman · Oct 29, 2007

Responding to Pitts...

Any time you pass an instance to another class before it is fully
initialized, you run the risk of that other class doing things that it
shouldn't. As far as I know there isn't a way to ensure an atomic
operation that expresses relationships between many objects to many
other objects. Any setSomething method has the right to call any public
method on the Something instance it is passed. It seems like it would
be a burden to add a "completely initialized" runtime check on every
behavior/knowledge querying method. The best it could do is throw an
IllegalStateException. If you have a lot of places this could happen it
becomes onerous to implement them all.

I would rephrase the first sentence. It is bad practice to instantiate
relationships with other objects if the object in hand is not properly
initialized. Generally we use factory objects to manage instantiation
for two reasons.

The first is to encapsulate the business rules and policies of
instantiation because those rules and policies are likely to be
different than those governing collaboration within the problem
solution. A corollary is that the factory object's /only/ business with
the object is instantiation. So invoking other methods on the object
should not be an issue and should be easily enforceable with good naming
conventions and peer reviews.

A second, related reason is that we can control when the object's
relationships are instantiated. One practice the factory can enforce is
that the object is fully initialized prior to instantiating any
relationships. That way no other object has access to the object until
it is properly prepared for collaboration.

A corollary is that a factory method provides a scope that is more
manageable. Thus when concurrency is an issue one only needs to block
around the scope of the factory method, which is usually relatively
easy to do and can be mapped readily into distributed environments. So
long as the object is instantiated, fully initialized, and all its
relationships are instantiated within the scope of the factory method,
referential and data integrity should be a minor issue.

Conversely, as soon as initialization and/or instantiation of
unconditional relationships is split among multiple object methods, one
is in a Pandora's Box situation and the developer must take special
precautions in the OOA/D to ensure that referential and data integrity
is not violated. Generally such situations are quite fragile, especially
during maintenance, so OO reviewers are going to want a lot of
justification for not encapsulating all the instantiation in a single
method.

BTW, I don't see why one needs static factory methods or what using a
'this' pointer outside constructors has to do with this issue.

This is especially troublesome with circular references, which I've been
told should be avoided (probably for just this reason). Unfortunately,
I don't see a rule that can be applied to systematically avoid circular
references during initialization. In my particular case,
I have a class Robot, which has a Computer object, which has a Memory
object and a Bus object. The Computer needs to know about the Robot,
and vice versa. I can instantiate them separately, and then pass the
references of each other to each other, but the Computer implementor
would have to explicitly know that they can't access the Robot object
during that reference exchange.

First, I am not sure why the computer hardware is being modeled in a
game simulation(?)

But, assuming there is a reason for that, let's get a bit more specific:

        [Robot]
           | 0..*
           | controls
           |
           | R1
           |
           | 1        1  R2 contains  1
      [Computer] ----------------- [Memory]
           | 1
           | located in
           |
           | R3
           |
           | 1
         [Bus]

Here the unconditional nature of the R2 and R3 relationships requires
that all three objects be instantiated and the relationships are in
place before any collaborations with any of them. One likely approach
would be to have a single ComputerFactory::create method that
instantiates all three objects and their relationships within its scope.
In C++ (I don't do Java) that method might look something like:

    Computer* ComputerFactory::create()
    {
        // get any attribute values needed for initialization

        // instantiate objects
        Computer* myComputer = new Computer; // with any attribute values
        Memory* myMemory = new Memory;       // with any attribute values
        Bus* myBus = new Bus;                // with any attribute values

        // instantiate relationships; nothing else can see these objects yet
        myComputer->setMemory(myMemory);
        myComputer->setBus(myBus);

        // hand out the Computer only after it is fully wired
        return myComputer;
    }

If instead the objects are instantiated in separate methods, then you
need to be very careful that there is no way to collaborate with any of
them until they all have been instantiated. One way to ensure that would
be to instantiate R2 and R3 in the last factory method in the sequence.
[Just make sure nobody changes the sequence during maintenance! That's
why such instantiation is fragile.]

Now the R1 relationship conditionality requires that a Computer be
available when a Robot is instantiated, but doesn't require a Robot to
be around when a Computer is created. So we could use
RobotFactory::create to create the Robot instance and instantiate R1.
But we would have to provide that method with the proper Computer
reference, which implies that we need to invoke ComputerFactory::create
before RobotFactory::create.
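
Since the original question was about Java, the same ordering might be sketched there as follows. This is an illustration under assumptions: the constructor signatures and accessor names are invented, and the factories use instance methods rather than static ones. The Computer is complete before the Robot is created, and the Robot rejects a missing Computer, which mirrors the conditionality of R1.

```java
import java.util.Objects;

class Memory { }

class Bus { }

class Computer {
    private final Memory memory;
    private final Bus bus;

    // R2 and R3 are unconditional, so both ends must exist up front.
    Computer(Memory memory, Bus bus) {
        this.memory = Objects.requireNonNull(memory, "memory");
        this.bus = Objects.requireNonNull(bus, "bus");
    }

    Memory memory() { return memory; }

    Bus bus() { return bus; }
}

class Robot {
    private final Computer computer;

    // R1 requires a Computer to exist when the Robot is instantiated.
    Robot(Computer computer) {
        this.computer = Objects.requireNonNull(computer, "computer");
    }

    Computer computer() { return computer; }
}

class ComputerFactory {
    // Instantiates all three objects and their relationships in one scope.
    Computer create() {
        Memory memory = new Memory();
        Bus bus = new Bus();
        return new Computer(memory, bus);
    }
}

class RobotFactory {
    // Needs a finished Computer, so ComputerFactory.create must run first.
    Robot create(Computer computer) {
        return new Robot(computer);
    }
}
```

Usage would be along the lines of: Computer c = new ComputerFactory().create(); Robot r = new RobotFactory().create(c);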

Bottom line: the conditionality of the problem space relationships
usually dictates how the instantiation should be handled.

*************
There is nothing wrong with me that could
not be cured by a capful of Drano.

H. S. Lahman
(e-mail address removed)
Pathfinder Solutions
http://www.pathfindermda.com
blog: http://pathfinderpeople.blogs.com/hslahman
"Model-Based Translation: The Next Step in Agile Development". Email
(e-mail address removed) for your copy.
Pathfinder is hiring:
http://www.pathfindermda.com/about_us/careers_pos3.php.
(888)OOA-PATH
