Wormholes

Arne Vajhøj

it would have to have default scope. I was hoping there was a technique
that would behave like that, but which would not expose the variable
to the entire package, when really only one method should use that
variable.

The techniques that I can think of are default scope variable or pass
value down call chain to where it is used.

You should pass it down.

Arne
 
Zermelo

Roedy Green said:
The techniques that I can think of are default scope variable or pass
value down call chain to where it is used. Are there any others?

You can also overload method y and add an extra parameter.
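
A minimal sketch of the overload idea, combined with passing the value down the chain. Only x and y echo the thread; the class name Worker, the method intermediate, and the String "hint" value are hypothetical stand-ins, since the original code was never posted.

```java
// Hypothetical names throughout; only x and y come from the thread.
class Worker {
    // Existing callers keep using the one-argument form.
    static String y(String input) {
        return y(input, "default");
    }

    // New overload: the extra value arrives as an explicit parameter.
    static String y(String input, String hint) {
        return input + "[" + hint + "]";
    }

    // The intermediate step has to forward the value for it to reach y.
    static String intermediate(String input, String hint) {
        return y(input.trim(), hint);
    }

    // x knows the value and starts the chain.
    static String x(String input) {
        return intermediate(input, "from-x");
    }
}
```

The overload keeps old call sites compiling, but every method between x and y still has to carry the extra parameter, which is the ripple the original question wanted to avoid.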
 
Robert Klemme

Steven Simpson said:
Assuming that you can't improve your structure or refactor, a
ThreadLocal might be appropriate as your wormhole.

I've had to do this lately. The intermediate steps were recursive calls
through a hierarchy. y was a custom implementation of a node nested in
that hierarchy. x started off the recursive call.

If I was only using the hierarchy once, for one call by x, I would have
simply embedded the context in y. But the hierarchy was to be re-used,
and y needed a different context for each call by x. Also, calls by x
could be concurrent.

However, I could guarantee that any single call by x would invoke y by
the same thread. That allowed the context to be passed by a
ThreadLocal, set by x just before the call.

I knew finally someone would suggest ThreadLocal for this. This might
be even worse than global variables, especially since you pass hidden
state which usually makes testing more difficult. The proper approach
would be to pass the state down the call chain.

IMHO the best usage for ThreadLocal is to cache state *inside a class*
if calls may be concurrent and the cost of creating the state is
significantly high. But using it to pass information between classes
because one wants to avoid adding method parameters is asking for trouble.
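
For contrast, here is a sketch of the "cache state inside a class" usage. It assumes Java 8 or later (for ThreadLocal.withInitial) and uses SimpleDateFormat as a stand-in for state that is not thread-safe and not free to create; the class name IsoDateFormatter is hypothetical.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example class; the ThreadLocal never leaves it.
class IsoDateFormatter {
    // SimpleDateFormat is not thread-safe, so each thread lazily gets its own copy.
    private static final ThreadLocal<SimpleDateFormat> FORMAT =
        ThreadLocal.withInitial(() -> {
            SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd", Locale.ROOT);
            f.setTimeZone(TimeZone.getTimeZone("UTC"));
            return f;
        });

    // Callers see an ordinary static method; no hidden input is involved.
    static String format(long epochMillis) {
        return FORMAT.get().format(new Date(epochMillis));
    }
}
```

Here the ThreadLocal is private and only affects performance, so a test can call format() without setting anything up first.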

Also, you need to be aware that the lifetime of these objects can be
quite long (there was a discussion about various aspects of ThreadLocal
in light of thread pools here earlier).
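
The lifetime point can be sketched with a single-thread pool: a value set by one task is still there when an unrelated task runs later on the same worker. The class and field names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class.
class StaleThreadLocalDemo {
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Runs two tasks on one pooled thread and returns what the second task sees.
    static String valueSeenBySecondTask() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Runnable write = () -> CONTEXT.set("left over from task 1");
        Callable<String> read = CONTEXT::get;
        try {
            pool.submit(write).get();        // task 1 sets a value and never clears it
            return pool.submit(read).get();  // task 2 runs on the same worker thread
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

The second task never called set(), yet it reads the first task's value, and the object stays reachable for as long as the worker thread lives.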

Kind regards

robert
 
Steven Simpson

I knew finally someone would suggest ThreadLocal for this. This might
be even worse than global variables, especially since you pass hidden
state which usually makes testing more difficult.

Quite; that's why I started with: "Assuming that you can't improve your
structure or refactor, ..." That is, others' advice is to be tried first.

The proper approach would be to pass the state down the call chain.

You're probably right, but there's not enough information in the stated
problem. I'd recently experienced a specific version of the problem,
and mentioned how it was solved. For me, the "incommensurate ripples"
would be a use-specific change in an API.

IMHO the best usage for ThreadLocal is to cache state *inside a class*
if calls may be concurrent and the cost of creating the state is
significantly high. But using it to pass information between classes
because one wants to avoid adding method parameters is asking for
trouble.

I'll be more specific with the example I gave. Here's an abridged API
for a hierarchical structure that can be serialized:

abstract class Box {
    List<Box> children;
    abstract InputStream getFieldContent();

    final InputStream getChildContent() {
        List<InputStream> streams = new ArrayList<>(children.size());
        for (Box child : children)
            streams.add(child.getContent());
        return new SequenceInputStream(Collections.enumeration(streams));
    }

    final InputStream getContent() {
        return new SequenceInputStream(getFieldContent(), getChildContent());
    }
}

Several library-defined extensions are provided, implementing
getFieldContent() in various useful ways.

Outside the library, there's a user creating a custom box type, making a
hierarchy including it, and caching it:

class MyAppSpecBox extends Box {
    InputStream getFieldContent() {
        ...
    }
}

// Create hierarchy out of library Box extensions.
Box root = ... ;

// Add the custom box type somewhere in the hierarchy.
Box myBox = new MyAppSpecBox();
root.children.get(2).children.get(1).children.add(myBox);

cache.store(key, root);

Fetch it later, and serialize it:

Box root = cache.fetch(key);
InputStream in = root.getContent();

Suppose we want MyAppSpecBox.getFieldContent() to use a context which is
known only at the point of fetching from the cache. We don't control
the Box API, and even if we did, we couldn't add an application-specific
parameter to the getContent() family of methods. How would we add a
generic one, one that would be usable by several users independently and
simultaneously (other than the Context<T> class I suggested, which is
just a variation on ThreadLocal<T>)?

If we could locate myBox from root, we could pass the context to it
after fetching. However, traversing the full hierarchy or even knowing
the correct path seem clumsy ways to locate it. Also, its storage of
the context would not be thread-safe.

So, we throw in a ThreadLocal:

static ThreadLocal<Context> context = ...;

class MyAppSpecBox extends Box {
    InputStream getFieldContent() {
        Context ctxt = context.get();
        ...
    }
}

We set it before invoking the hierarchy:

Box root = cache.fetch(key);
Context ctxt = new Context(...);
context.set(ctxt);
InputStream in = root.getContent();

Also, you need to be aware that the lifetime of these objects can be
quite long (there was a discussion about various aspects of
ThreadLocal in light of thread pools here earlier).

That use of ThreadLocal was preserving state from one 'prong' of the
stack to the next, presumably with no way to inject a
ThreadLocal.set(null) at a common vertex of those prongs. This use of
ThreadLocal only pushes values up the stack, which allows us to be more
rigorous:

Box root = cache.fetch(key);
Context ctxt = new Context(...);
InputStream in;  // declared outside the try so it stays usable afterwards
context.set(ctxt);
try {
    in = root.getContent();
} finally {
    context.set(null);
}
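
The same set/try/finally pattern can be wrapped in a small helper so callers cannot forget the cleanup. This is only a sketch: ScopedContext, current() and callWith() are hypothetical names, a String stands in for the application's Context, and remove() is used where the snippet above uses set(null).

```java
import java.util.function.Supplier;

// Hypothetical helper; a String stands in for the application's Context type.
class ScopedContext {
    private static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // What code deeper in the call stack would use to read the value.
    static String current() {
        return CONTEXT.get();
    }

    // Makes value visible to everything body calls on this thread, then clears it.
    static <R> R callWith(String value, Supplier<R> body) {
        CONTEXT.set(value);
        try {
            return body.get();
        } finally {
            CONTEXT.remove();  // drops the entry even if body throws
        }
    }
}
```

Usage would look like ScopedContext.callWith("ctx", () -> root.getContent()). The helper does not restore a previous value, so nested calls on the same thread would need extra care.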


Cheers,

Steven
 
markspace

it would have to have default scope. I was hoping there was a technique


I don't think default scope would work. Consider the case where
multiple threads call the method in question. If the second thread
over-writes the first thread's value before it gets used, you'll have
serious errors.

Something like a ThreadLocal is required here. Something that simulates
the call stack frame.
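
A small sketch of that isolation, with hypothetical names: two threads each store a value, wait until both have written, and only then read back. With a single shared field the later write would win; with a ThreadLocal each thread still sees its own value.

```java
import java.util.concurrent.CountDownLatch;

// Hypothetical demo class.
class PerThreadValueDemo {
    static final ThreadLocal<String> PER_THREAD = new ThreadLocal<>();

    // Returns what thread 0 and thread 1 each read back after both have written.
    static String[] run() {
        String[] seen = new String[2];
        CountDownLatch bothWritten = new CountDownLatch(2);
        Thread[] threads = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            threads[i] = new Thread(() -> {
                PER_THREAD.set("value-" + id);
                bothWritten.countDown();
                try {
                    bothWritten.await();  // wait until the other thread has written too
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                seen[id] = PER_THREAD.get();
            });
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
        return seen;
    }
}
```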

that would behave like that, but which would not expose the variable
to the entire package, when really only one method should use that
variable.

The techniques that I can think of are default scope variable or pass
value down call chain to where it is used. Are there any others?


Is there any way you can show us the code? I'm curious as to why it's
so hard to refactor.
 
Andreas Leitgeb

Steven Simpson said:
So, we throw in a ThreadLocal:

static ThreadLocal<Context> context = ...;

class MyAppSpecBox extends Box {
    InputStream getFieldContent() {
        Context ctxt = context.get();
        ...
    }
}

We set it before invoking the hierarchy:

Box root = cache.fetch(key);
Context ctxt = new Context(...);
context.set(ctxt);
InputStream in = root.getContent();

What strikes me as odd in this scenario is the coincidence of
two things:
1. There's *only one* specific Box-subclass that requires
   specific extra information.
2. The code that kicks off processing of the object tree does
   know about this specific need and does know how to
   cater to it specifically (and the specific piece of
   information is even available at that place).

That looks really bolted (or duct-taped) on.

At some point in future, such things tend to accumulate.

Then either you end up with "invoking" code, that sets up
a number of different ThreadLocal<...>s for those specific
Boxes (that may or may not actually show up in the tree),
or you still set up only one such ThreadLocal<Context>,
and a couple of specific Boxes use it.

In the former case it seems like the Box abstraction
completely missed the point.

In the latter case, it seems as if it might then turn out to
be reasonable to modify the Box baseclass to pass along the
Context directly (even if some Boxes still don't need it):

class Box {
    InputStream getContent(Context ctx) {
        child-loop { ... child.getContent(ctx); }
        ... getFieldContent(ctx);
        ...
    }
    InputStream getFieldContent(Context ctx) {
        return getFieldContent(); // default impl: ignore the context
    }
    InputStream getFieldContent() { return null; }
}

Subclasses that need the context would override getFieldContent(Context);
the others (like the old ones) would still just override getFieldContent().
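
A runnable sketch of that base-class change, under loud assumptions: it uses Strings instead of InputStreams to stay short, and Node, RenderContext, PlainNode and UserNode are hypothetical stand-ins for Box, Context and its subclasses.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the application's Context.
class RenderContext {
    final String user;
    RenderContext(String user) { this.user = user; }
}

// Hypothetical stand-in for Box; content is a String rather than an InputStream.
class Node {
    final List<Node> children = new ArrayList<>();

    // Old-style hook: subclasses that ignore the context override this one.
    String fieldContent() { return ""; }

    // New hook: by default it drops the context and delegates to the old hook.
    String fieldContent(RenderContext ctx) { return fieldContent(); }

    // The traversal passes the same context instance to every node.
    final String content(RenderContext ctx) {
        StringBuilder sb = new StringBuilder(fieldContent(ctx));
        for (Node child : children) {
            sb.append(child.content(ctx));
        }
        return sb.toString();
    }
}

// An "old" subclass that never hears about the context.
class PlainNode extends Node {
    @Override String fieldContent() { return "plain;"; }
}

// A subclass that needs the context overrides the new hook instead.
class UserNode extends Node {
    @Override String fieldContent(RenderContext ctx) { return "user=" + ctx.user + ";"; }
}
```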
 
Steven Simpson

What strikes me as odd in this scenario is the coincidence of
two things:
1. There's *only one* specific Box-subclass that requires
   specific extra information.
2. The code that kicks off processing of the object tree does
   know about this specific need and does know how to
   cater to it specifically (and the specific piece of
   information is even available at that place).

The odd part is that the custom box is lost in the structure before a
context can be supplied to it, which is the result of caching the
structure, and only knowing the context just when recovering the
structure from the cache. That means you either find the custom box to
inform it of the context, or you leave the context somewhere the box
will look.

Then either you end up with "invoking" code, that sets up
a number of different ThreadLocal<...>s for those specific
Boxes (that may or may not actually show up in the tree),
or you still set up only one such ThreadLocal<Context>,
and a couple of specific Boxes use it.

In the former case it seems like the Box abstraction
completely missed the point.

I don't get what you mean, but the point of the Box abstraction is just
to handle the aspects common to all boxes: they have some
box-type-specific fields, followed by zero or more nested boxes. It's
not aware of the application-specific Context.

In the latter case, it seems as if it might then turn out to
be reasonable to modify the Box baseclass to pass along the
Context directly (even if some Boxes still don't need it):

class Box {
InputStream getContent(Context ctx) {

This is out of the question - the library to which Box belongs knows
nothing about the application that defines Context.


Cheers!
 
Andreas Leitgeb

Steven Simpson said:
The odd part is that the custom box is lost in the structure before a
context can be supplied to it, which is the result of caching the
structure, and only knowing the context just when recovering the
structure from the cache. That means you either find the custom box to
inform it of the context, or you leave the context somewhere the box
will look.

I still think the requirement itself is probably screwed.

Such screwed requirements just happen, and if they happen, then a
screwed implementation may seem like the correct answer - until
something harmless happens, that crashes and burns the screwed
solution and leaves you in the void searching for an alternative
screwed solution that survives at least until next time...

Back to the Box-example: as the Box-framework may not expect the
getContent() call to depend on any outer local context, it might
evaluate it earlier, and just return a cached result on your call.
This of course will crash and burn your approach, as you built on
the unfounded assumption that each Box's getFieldContent() wouldn't
be called beforehand. If the Box-framework really gave such a promise,
then it had better provide a context parameter already, lest it be
screwed itself.
 
Steven Simpson

Back to the Box-example: as the Box-framework may not expect the
getContent() call to depend on any outer local context, it might
evaluate it earlier, and just return a cached result on your call.

No, not in this case. The whole point of this framework is to create an
internal structured representation of a file whose format is defined in
terms of nested boxes - i.e. there's a parsing/deserialization phase too
- and to allow those boxes to be manipulated. After that, the point of
the getContent() method is to serialize the hierarchy in its present
state; getContent() gains nothing by caching, because the
hierarchy is not ready to be serialized until getContent() is called.
Its implementation is quite transparent - getFieldContent() will be
called on each box as if by a depth-first traversal. It is therefore
relatively easy to offer informal guarantees about how a user's
getFieldContent() will be invoked by getContent(), and the finality of
Box's methods helps to prevent that from being undermined by subclasses.

Also, the extra context I'm providing does not affect the functional
behaviour of the box structure. It's there to regulate the arrival of
bytes, so I can have the next batch already requested before the current
batch has finished.

(Actually, it seems I haven't got that regulation quite right, so I've
disabled it for now. But I can add and remove it without changes to the
box framework.)

If the Box-framework really gave such a promise,
then it had better provide a context parameter already, lest it be
screwed itself.

The possibility of being screwed only comes if the getContent or
getChildContent methods decide to call getFieldContent() indirectly by
starting another thread. For this framework, I don't foresee it
happening. Even if it did, I might still be able to make a lesser
guarantee, that such threads are descendants of the original thread, so
an InheritableThreadLocal would still work.
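
A minimal sketch of that weaker guarantee, with hypothetical names: a thread created after the value is set receives a copy of it when it is constructed.

```java
// Hypothetical demo class.
class InheritDemo {
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    // Sets a value in the calling thread and reports what a freshly created child sees.
    static String valueSeenByChild(String value) {
        CONTEXT.set(value);
        String[] seen = new String[1];
        // The child copies the parent's inheritable values when it is constructed.
        Thread child = new Thread(() -> seen[0] = CONTEXT.get());
        child.start();
        try {
            child.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
        }
        return seen[0];
    }
}
```

This only covers threads constructed by the calling thread after the value was set; a thread that already existed would not see it.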

If even that's no good, and an extra parameter is required, what should
it be, since the framework has no knowledge of how many or what types of
contexts are required? I already suggested a (different) Context and
ContextLocal<T> pair of classes, but that is almost a dual of the
ThreadLocal technique, but with the extra parameter made explicit,
instead of implicitly being Thread.currentThread(). In other
circumstances with fewer or weaker guarantees, I would do that, but I
don't foresee ThreadLocal being a problem for this box framework.
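
The Context and ContextLocal<T> classes themselves are not shown in this thread, so the following is only a guess at their shape. The class names come from the post; the get/set/read/write methods and the map-based storage are assumptions. The framework would pass an opaque Context down the call chain, and each application would own typed ContextLocal<T> keys into it.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Opaque carrier a framework could pass down; it knows nothing about what is stored.
// The internals here are assumptions, not the author's actual classes.
final class Context {
    private final Map<ContextLocal<?>, Object> values = new IdentityHashMap<>();

    Object read(ContextLocal<?> key) { return values.get(key); }

    void write(ContextLocal<?> key, Object value) { values.put(key, value); }
}

// Typed key: plays the role ThreadLocal<T> plays, but the carrier is an explicit
// argument instead of implicitly being Thread.currentThread().
final class ContextLocal<T> {
    @SuppressWarnings("unchecked")
    T get(Context ctx) { return (T) ctx.read(this); }

    void set(Context ctx, T value) { ctx.write(this, value); }
}
```

Independent users would each create their own ContextLocal keys, so they can share one Context without knowing about each other. This sketch is not thread-safe; a Context shared between threads would need synchronization.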
 
Andreas Leitgeb

Steven Simpson said:
No, not in this case. The whole point of this framework is ...

That was more than I really cared to know about that Box-framework ;-)

The possibility of being screwed only comes if the getContent or
getChildContent methods decide to call getFieldContent() indirectly by
starting another thread. For this framework, I don't foresee it
happening. Even if it did, I might still be able to make a lesser
guarantee, that such threads are descendants of the original thread, so
an InheritableThreadLocal would still work.

Still an unsafe assumption: just after you have changed ThreadLocal to
InheritableThreadLocal and started to feel safe again, the framework
will be optimized further to use pool threads, of course ;-)

If even that's no good, and an extra parameter is required, what should
it be, since the framework has no knowledge of how many or what types of
contexts are required?

Some Context class that contains/references the complete last-minute
knowledge of the site where getContent() is called from. It should
be the same instance for the whole tree of boxes.

Even if some of the "possible framework changes" won't happen with your
Box framework, they might do so for whatever framework the OP uses.
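
The pool-thread concern raised above can be sketched as follows, with hypothetical names and assuming the standard Executors.newSingleThreadExecutor(), which creates its worker thread on the first submit. The worker copies the inheritable value once, when it is constructed, and later callers get that old copy.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class.
class PooledInheritDemo {
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    // Returns what the pooled worker sees for two consecutive "callers".
    static String[] run() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Callable<String> read = CONTEXT::get;
        try {
            CONTEXT.set("first caller");
            String a = pool.submit(read).get();  // worker is created here and copies the value
            CONTEXT.set("second caller");
            String b = pool.submit(read).get();  // same worker, still holding the old copy
            return new String[] { a, b };
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }
}
```

Both results come back as "first caller", so the second call silently runs with the wrong context.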
 
