dynamic_cast is ugly!

Juha Nieminen

H. S. Lahman said:
(1) Use separate homogeneous collections rather than heterogeneous
collections. Then the client who needs specific types of objects can
navigate the appropriate relationship to the right collection.

In my particular case I am (usually) keeping objects of a certain type
in a container of that type, and I access this container directly when
it's possible to do so (e.g. to perform an action on some or all the
objects of that type, as long as this action is independent of any other
objects in the system).

The problem is that not all actions I need to perform can be applied
to objects independently of other objects. In particular, sometimes some
actions need to be performed in a certain order. The two problems are
that the objects in the object-specific container might not be in this
order already, and secondly, even if they were, some actions have to be
performed to all existing objects (regardless of their type) in a strict
order, and cannot be performed on a per-type basis.

Currently I'm maintaining this order by putting base-class-type
pointers in a common container. The order of the pointers in this
container defines the order of all the objects. (Note that there are
also other reasons for doing this. Ordering is just one of them.)

One possibility would be to add a virtual function in the common base
class to perform this specific action. This way all the object types
could implement this action by implementing this virtual function, and
there would be no need for dynamic_cast.

The problem with this is that this base class and some of the objects
derived from it (which are often used here and there in the program) are
in a library. It would mean that I would have to add these new (often
program-specific) features to the library.

This would not only be cumbersome, but I also feel it goes against
proper OO design: You don't go and add application-specific features to
an abstract base class or existing objects in a library. If I did this
for every feature of every program I develop, it would make the library
and the base class less abstract and more bloated.

If dynamic_cast has to be avoided at all costs, some other solution is
needed. Maybe a complete re-design of the library. But even so, I really
haven't come up with a better solution. That's why I'm interested in
hearing new ideas.

(2) Provide the type information to the collection and provide an
interface to the collection that clients can access by providing a
desired type. Then the collection manages the {type, object} tuples in
its implementation.

I'm not exactly sure what this means, and how it is radically
different from using dynamic_cast. (It sounds to me like dynamic_cast
would simply have been replaced with some kind of checking of the 'type'
inside those tuples. If this is so, then it would simply make the whole
implementation more complicated with no real benefit.)
 
Robert Martin

Before you put the
object in the container, you knew its type, then you threw it away
even though you needed it in some later portion of the program. The
key is to not throw that information away. How you pass that
information from the creator to the user depends on the situation.

If you can use dynamic_cast to reclaim the type of the object, you
really haven't thrown it away. Sometimes (not often, but sometimes)
the best way to reclaim that type is with a dynamic_cast.
 
Robert Martin

If dynamic_cast has to be avoided at all costs, some other solution is
needed.

If dynamic_cast must be avoided at all costs, then virtual functions
must also be avoided; since I can implement dynamic cast with nothing
more elaborate than two virtual function deployments.

dynamic_cast might be ugly syntax; but semantically it is no more ugly
than double dispatch. Indeed it can be implemented with double
dispatch.
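
To make that concrete, here is a minimal sketch of a "cast to Square" built from nothing but two virtual calls. Everything in it is hypothetical: Shape, Square and Circle are stand-in classes, and ShapeQuery, identify, on_square, on_circle and as_square are names invented for the illustration.

```cpp
class Square;
class Circle;

// Hypothetical query interface: one virtual hook per concrete type.
class ShapeQuery {
public:
    virtual ~ShapeQuery() = default;
    virtual void on_square(Square&) {}   // default: not the type asked for
    virtual void on_circle(Circle&) {}
};

class Shape {
public:
    virtual ~Shape() = default;
    // First dispatch: selected by the dynamic type of the shape.
    virtual void identify(ShapeQuery& query) = 0;
};

class Square : public Shape {
public:
    // Second dispatch: selected by the dynamic type of the query.
    void identify(ShapeQuery& query) override { query.on_square(*this); }
};

class Circle : public Shape {
public:
    void identify(ShapeQuery& query) override { query.on_circle(*this); }
};

// Emulates dynamic_cast<Square*>(&shape): null unless the shape is a Square.
Square* as_square(Shape& shape) {
    class WantSquare : public ShapeQuery {
    public:
        Square* result = nullptr;
        void on_square(Square& s) override { result = &s; }
    };
    WantSquare query;
    shape.identify(query);
    return query.result;
}
```

Note that, unlike dynamic_cast, this version forces the base class to name every concrete type through ShapeQuery, which is the coupling the rest of the thread argues about.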
 
Darin Johnson

Agreed. If those were our only two options, then you are 100% correct.
Fortunately, there are other options. This is exactly why OO books warn
against "manager" objects.

And in real life you fix the code you're given and
don't tell your "manager" that you're first going
to spend a few months redesigning the legacy code
to remove the need for ugly workarounds.

Of course, if you're designing new code then the
idea of having to create a template to make dynamic
casting easier should be a warning sign that the
design needs improvement. In older legacy code that
you can't redesign you may as well leave in the ugly
syntax.
 
Daniel T.

James Kanze said:
Isn't this exactly what dynamic_cast does?

No, dynamic_cast punts. It's a query on object state.

What's the difference between a member function
"supportsInterfaceX()" and dynamic_cast< X* >? Except that
providing the member function means that the base interface must
know about the extended interface.

Nothing.
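
For readers who want the two variants side by side, a small sketch follows. It assumes supportsInterfaceX() returns a pointer to the extended interface (the quoted text does not say what it returns), and all class names here (X, Base1, Impl1, Base2, Impl2) are invented for the example.

```cpp
// The extended interface that only some objects provide.
struct X {
    virtual ~X() = default;
    virtual int extended() = 0;
};

// Variant 1: the base interface knows about X and offers a query for it.
struct Base1 {
    virtual ~Base1() = default;
    virtual X* supportsInterfaceX() { return nullptr; }  // default: not supported
};

struct Impl1 : Base1, X {
    X* supportsInterfaceX() override { return this; }
    int extended() override { return 42; }
};

// Variant 2: the base interface has never heard of X.
struct Base2 {
    virtual ~Base2() = default;
};

struct Impl2 : Base2, X {
    int extended() override { return 42; }
};

// Client code for variant 1: ask the object through its base interface.
int call_via_member(Base1& b) {
    X* x = b.supportsInterfaceX();
    return x ? x->extended() : -1;
}

// Client code for variant 2: ask the language (a cross-cast from Base2 to X).
int call_via_cast(Base2& b) {
    X* x = dynamic_cast<X*>(&b);
    return x ? x->extended() : -1;
}
```

The client code has the same shape in both cases; the difference is only in whether Base has to mention X.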


The problem with this is that it means that everything any derived
class supports must be known and declared in the base class.

No, it doesn't.

And how do you handle the different levels of functionality? What
happens if you tell an object to do something it's not capable of?

What happens is, you have a design error.

(Note that the article you cite is mainly concerned with not
exposing mutable state, in order to reduce coupling.

The type of the object is part of the object's state. The word "mutable"
is nowhere to be found in the article. "as the caller, you should not be
making decisions based on the state of the called object that result in
you then changing the state of the object."

The idea is that you shouldn't be telling the object to do X, you
shouldn't be telling it to change its state to X, you should be telling
it about changes in its environment.
 
Daniel T.

James Kanze said:
Agreed, but you can't really have it both ways. At least
dynamic_cast is like finding a detailed inventory at the top of
the box when you open it (or arguably, in an envelope pasted to
the outside of the box).

I can see it now. I order some RAM for my computer, and the producer
ships me 30 boxes for which I have to check each invoice to see if it is
what I *really* want, shipping the others back to the producer... How
about this instead. I ask for type X and I get type X?

And you still haven't responded to Juha's question about what
you do when the transporter is managing several different
producer/consumer relationships, using different types, and the
order is significant.

I also haven't answered what a designer should do if all his classes
derive from Object and all his pointers are Object*, forcing him to
dynamic_cast every time he wants to call a member-function. There are
uncountable numbers of designs that would require dynamic_cast. If the
design requires you to use dynamic_cast, then (as I've said before) use
it. But IMHO, the design could have been better.

And so on. The fact remains that I've found cases where
dynamic_cast was justified, in the sense that it was more cost
efficient than the alternatives. (With cost being measured in
program readability and maintainability, which basically end up
the equivalent of monetary cost in the end.)

This is simply a difference of experience I guess. I'm certainly not
going to second guess your sense that dynamic_cast is justified, all I
can say is that I have never found a case in OO code where it is.
 
Daniel T.

Andy Champ said:
Now you've got me curious. What do you print your UML diagrams on?
I've spent over a year trying to get a large-format printer. It's well
over 100 classes BTW, I lost count at 250 today and don't have a tool
that'll do it manually - but less than your 550.

Print UML? You got to be kidding. :) The tool I used to count classes
is called "find 'class'". :)

But if our projects take us a couple of years, and yours 6-9 months,
something doesn't add up.

Maybe we have much smaller classes?

And I can't give too many details of what we're up to, commercial
confidentiality and all that.

The project with 550 classes (BTW that doesn't include structs :) was
this one: (http://www.gorilla.com/#hannah)
 
Dmitry A. Kazakov

If dynamic_cast must be avoided at all costs, then virtual functions
must also be avoided; since I can implement dynamic cast with nothing
more elaborate than two virtual function deployments.

dynamic_cast might be ugly syntax; but semantically it is no more ugly
than double dispatch. Indeed it can be implemented with double
dispatch.

Right, dynamic cast is doubly dispatching. However, "implemented by" /=
"is". (BTW one of the types in dynamic cast is statically known, so it is a
weaker case of double dispatch.)

But the difference is in the interfaces and substitutability. Dynamic cast
is an operation on pointers, while the method is supposed to work on the
target type. This clutter, as well as exposure of pointers, is what makes
it bad. A consequence of this design is that the dispatch emulated by cast
+ dereference + method call may fail at run time. So the semantics of the
method gets changed. We mean one thing but call another. (I omit the rest
as it falls under "why dynamically-typed is untyped" :)-))
 
Nick Keighley

Daniel T. wrote:

  I'm afraid this is just too vague to be of any help to me...

I'm finding this entire thread like that!
Could someone start using a concrete example please!

Why do the objects need to be ordered? (some sort of
ordering was the nearest to an example I've seen).

So will a graphical editor do?

class Shape
{
public:
    virtual ~Shape() {}  // polymorphic base, so dynamic_cast works on Shape*
};

class Square: public Shape
{
};

class Circle: public Shape
{
};

So a heterogeneous collection might be

std::vector<Shape*> picture;

and you'd use a dynamic cast if you needed to know if a shape
was a Square (though why you'd need to know that...).

Is the suggestion that picture holds type info?
 
Nick Keighley

Responding to Nieminen...


I think what Daniel T. is saying is that from a design perspective you
have two choices:

(1) Use separate homogeneous collections rather than heterogeneous
collections. Then the client who needs specific types of objects can
navigate the appropriate relationship to the right collection.

with my Shape example (there are some heavy trimmers on this group...)
use separate Circle and Square collections rather than a heterogeneous
Shape collection.

(2) Provide the type information to the collection and provide an
interface to the collection that clients can access by providing a
desired type. Then the collection manages the {type, object} tuples in
its implementation.

the Shape collection also holds a field with type info (yuck!)
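
One way to read option (2), sketched under assumptions: the collection stores {kind, pointer} pairs and hands out typed views on request. The Kind enum, the Picture class and its member names are all invented here; nothing in the thread prescribes this layout.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

class Shape  { public: virtual ~Shape() = default; };
class Square : public Shape {};
class Circle : public Shape {};

// Hypothetical tag recorded when an object enters the collection.
enum class Kind { Square, Circle };

class Picture {
public:
    // The static type is known at insertion time, so it is stored, not discarded.
    void add(Square* s) { items_.emplace_back(Kind::Square, s); }
    void add(Circle* c) { items_.emplace_back(Kind::Circle, c); }

    // Clients ask for a type; the tag check replaces dynamic_cast.
    std::vector<Square*> squares() const {
        std::vector<Square*> out;
        for (const auto& item : items_) {
            if (item.first == Kind::Square) {
                out.push_back(static_cast<Square*>(item.second));
            }
        }
        return out;  // in picture order
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::pair<Kind, Shape*>> items_;  // insertion order is the picture order
};
```

The static_cast is safe only because add() is the sole way in and always records the matching tag.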

Caveat. In both cases the client should understand some problem space
context that happens to map into a type of the collected objects rather
than the object type itself.

for instance? Could you give a simple concrete example?

Then the relationships in (1) can be
instantiated and navigated based on that context while the 'type' in (2)
can be a separate context variable. That allows the client to be
properly decoupled from the object type since the access is defined in
terms of the problem space context rather than OOPL implementation
types. IOW, the mapping of the object type to the problem context that
requires access to a specific type is isolated and encapsulated in
whoever defines the collection content; everyone else deals with the
problem space context.

<snip>
 
Nick Keighley

Actually, I'm advocating choice 3:
Have the producer provide the type information directly to the
consumer, this doesn't necessarily have to be through the collection.

so if the consumer wants Squares give him Squares and only Squares.
If he wants all the Shapes in a picture, in order, then give him that.
Is "order" in the nearly-example an ordering (a <= b <= c ...) and
not a purchase ("I'd like to order 3000 left handed blivets")?

If the collection's job is to keep track of ordering information, then
that should be its only job. The consumer shouldn't be expected to use
that collection to *also* get all objects of a particular type.

so if you had two "collections" or streams of objects or whatever
then you'd have two different producers. One "ordered" and one "all
objects"?
 
Nick Keighley

Isn't this exactly what dynamic_cast does?

could *you* give a simple concrete example. Why don't you know the
type?

What's the
difference between a member function "supportsInterfaceX()" and
dynamic_cast< X* >?  Except that providing the member function
means that the base interface must know about the extended
interface.

yes

what is "ordering information"? Z order? creation order?

The problem with this is that it means that everything any
derived class supports must be known and declared in the base
class.

but why would you want all the objects in certain order
unless you were going to apply a common operation to them?

If they don't all support the common operation why do you want them?

Would the Visitor pattern be any use?


 And how do you handle the different levels of
functionality?  

for example?

What happens if you tell an object to do
something it's not capable of?

why would you do that?

(Note that the article you cite
is mainly concerned with not exposing mutable state, in order to
reduce coupling.  There's nothing "mutable" about the results of
dynamic_cast.  In fact, used correctly, it reduces coupling, by
keeping not only the state, but the extended interface, out of
sight except to those who need to know.)

<snip>
 
Daniel T.

I'll hazard a guess: Phlip was just trying to throw me a bone by asking for
something everybody is already objecting to.

I'm not so sure I would categorically object to no-op defaults. They
are a valid design choice in some situations IMO. Valid in more
situations than down casting for sure.

To Ian's comment, adding no-op defaults requires only that the base
class interface be designed for the contexts it is used in. Which of
course, has to be done in any case.
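
A minimal sketch of what a no-op default looks like, assuming a Shape hierarchy and an operation called round_corners (both made up for the illustration):

```cpp
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;
    // No-op default: shapes that have no corners simply ignore the request.
    virtual void round_corners() {}
};

class Square : public Shape {
public:
    void round_corners() override { rounded = true; }
    bool rounded = false;
};

class Circle : public Shape {};  // inherits the no-op

// Walks the picture in order; no type test and no cast is needed.
void round_all(const std::vector<Shape*>& picture) {
    for (Shape* s : picture) {
        s->round_corners();
    }
}
```

The price is that Shape now declares round_corners, so this only works when the base class can be changed.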
 
Juha Nieminen

Nick said:
std::vector<Shape*> picture;

and you'd use a dynamic cast if you needed to know if a shape
was a Square (though why you'd need to know that...).

It's not really that you need dynamic cast to *know* if the shape is a
Square. In this case you need it if you want to *do* something to the
object if it's a Square, and this operation is not supported by Shape,
only by Square.
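
Roughly like this, as a sketch only: round_corners stands for any operation that Square has and Shape lacks (the name is invented), and Shape is assumed to have a virtual destructor so that dynamic_cast is legal on it.

```cpp
#include <vector>

class Shape {
public:
    virtual ~Shape() = default;  // makes Shape polymorphic
};

class Square : public Shape {
public:
    void round_corners() { rounded = true; }  // Square-only operation
    bool rounded = false;
};

class Circle : public Shape {};

// Applies the Square-only operation in picture order and
// returns how many shapes it was applied to.
int round_all_squares(const std::vector<Shape*>& picture) {
    int count = 0;
    for (Shape* s : picture) {
        if (Square* sq = dynamic_cast<Square*>(s)) {  // null for non-Squares
            sq->round_corners();
            ++count;
        }
    }
    return count;
}
```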

You could have two vectors:

std::vector<Square*> squares;
std::vector<Circle*> circles;

But then you lose their relative ordering. If you need to perform some
operations to all the shapes in order, you can't do it with only that
information.

You could have both:

std::vector<Shape*> picture;
std::vector<Square*> squares;
std::vector<Circle*> circles;

The pointers in 'picture' point to the same objects as the pointers
in the two other vectors. Now you can both directly access the Squares
and Circles, and you can perform something to all objects in order.

However, that's not good either. Now many operations become
exceedingly complicated. For example, removing the 5th shape in 'picture'.

Another problem is that if you need to perform an operation to all
Squares, but in the order in which they are in 'picture', it also
becomes complicated: Either you need to use dynamic_cast, or you need to
keep 'squares' in the same order as the pointers are in 'picture'. The
latter can be very complicated if the relative order of the shapes is
changed.
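
To show what "complicated" means for the removal case, here is a sketch of removing the n-th shape while keeping all three vectors consistent. Drawing, erase_ptr and remove_at are names made up for this example, and the vectors are assumed to be non-owning.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

class Shape  { public: virtual ~Shape() = default; };
class Square : public Shape {};
class Circle : public Shape {};

// Hypothetical bundle of the three containers from the post.
struct Drawing {
    std::vector<Shape*>  picture;  // all shapes, in order
    std::vector<Square*> squares;  // same objects, by type
    std::vector<Circle*> circles;
};

// Removes every pointer to 'victim' from a per-type vector.
template <class T>
void erase_ptr(std::vector<T*>& v, const Shape* victim) {
    v.erase(std::remove_if(v.begin(), v.end(),
                           [victim](const T* p) { return p == victim; }),
            v.end());
}

// Removes the shape at 'index' in picture order. Every per-type vector
// has to be searched, because the index says nothing about the type.
void remove_at(Drawing& d, std::size_t index) {
    Shape* victim = d.picture.at(index);  // throws if index is out of range
    d.picture.erase(d.picture.begin() + static_cast<std::ptrdiff_t>(index));
    erase_ptr(d.squares, victim);
    erase_ptr(d.circles, victim);
}
```

Each new shape type adds another vector and another erase_ptr call here, and reordering 'picture' needs similar bookkeeping.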
 
James Kanze

Not quite. Don't forget Fortran 4's main control structure was
the "arithmetic if", which could jump backwards ;-/

So you could replace all of the goto's with arithmetic if's in
which all three targets were the same:).

The classical implementation of a while loop in Fortran-IV did
use a goto, however:

10 IF (condition) 20,30,20
20 C While loop code here...
GOTO 10
30 CONTINUE
 
Phlip

andrew said:
I'll hazard a guess: Phlip was just trying to throw me a bone by asking for
something everybody is already objecting to.

Sorry - I didn't read the thread.

Going forward, if you can't change that source, you have an "outer problem". Not
an OO problem. Dependency management does not mean "never change this source".
That's a good goal; it just makes the times when you _do_ change the base class
more safe and meaningful.
 
James Kanze

I'm not so sure I would categorically object to no-op
defaults.

I don't think anyone would, although it always depends.

They are a valid design choice in some situations IMO. Valid
in more situations than down casting for sure.

The problem is encapsulation. The base class shouldn't
necessarily know about what additional interfaces the derived
class may decide to add. (Sometimes, it's appropriate that it
know it, and other times, it isn't.)

To Ian's comment, adding no-op defaults requires only that the
base class interface be designed for the contexts it is used
in. Which of course, has to be done in any case.

I think you're missing the point. The base class interface is
only designed for a specific use. Some derived classes may want
to define additional functionality. The reason the client code
uses a dynamic_cast is precisely because the base class wasn't
designed to be used in the given context.

Consider a concrete example. Objects are identified by DN's.
You certainly don't want all of the functionality of every
possible object declared down in the basic object type
manipulated by the directory services. All you see there are
the basic functionalities: create, destroy, get, set and action.
Suppose that certain "attributes" are used to set up
relationships; when you receive a set request, the attribute
value is a DN. But of course, in your actual object, you don't
support arbitrary relationships; the relationship depends on the
type of the target object. So in the set request, you ask the
directory service for the object (a pointer to the object, in
fact), and dynamic_cast it to the type you support. If the
dynamic_cast fails, you reject the request with an error; if it
succeeds, you establish the relationship (whatever that means in
the context of the object). Thus, for example, a connection
point or a termination point has an implementedBy relationship
(which in this case will normally be specified as an argument to
the constructor, rather than using a set method, later); the
only way to communicate it is via a DN, so you have to
dynamic_cast, and reject the argument if it isn't the right type
(e.g. if someone requests the creation of a termination point
implemented by an event forwarding discriminator).
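
In code, the set request described above might look something like the sketch below. It is not the real system: ManagedObject, Directory, Equipment and every member name are stand-ins invented here, and DNs are simplified to plain strings.

```cpp
#include <map>
#include <string>

// What the directory service sees: only the basic object type.
class ManagedObject {
public:
    virtual ~ManagedObject() = default;
};

class Equipment : public ManagedObject {};                     // hypothetical valid target
class EventForwardingDiscriminator : public ManagedObject {};  // wrong kind of target

// Hypothetical directory: maps a DN to a base-class pointer, nothing more.
class Directory {
public:
    void bind(const std::string& dn, ManagedObject* obj) { objects_[dn] = obj; }
    ManagedObject* lookup(const std::string& dn) const {
        auto it = objects_.find(dn);
        return it == objects_.end() ? nullptr : it->second;
    }
private:
    std::map<std::string, ManagedObject*> objects_;
};

class TerminationPoint : public ManagedObject {
public:
    // Returns false (request rejected) if the DN is unknown
    // or names an object of a type this relationship does not support.
    bool setImplementedBy(const Directory& dir, const std::string& dn) {
        Equipment* target = dynamic_cast<Equipment*>(dir.lookup(dn));
        if (target == nullptr) {
            return false;
        }
        implementedBy_ = target;
        return true;
    }
    const Equipment* implementedBy() const { return implementedBy_; }
private:
    Equipment* implementedBy_ = nullptr;
};
```

The directory never learns what Equipment is; only the object that cares about the relationship does.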

As soon as you're dealing with any sort of middleware (even
middleware in the same process), you'll probably need
dynamic_cast at the receiving end---the middleware should not
know the details of what it is connecting. As soon as you have
to deal with multiple versions in the clients or the servers,
you're likely to need dynamic_cast as well---you can avoid it,
but avoiding it more or less comes down to reimplementing what
it does yourself. I suspect that there are other cases as well,
but I've had to deal with those two often enough in practice to
be aware of them. (When we implemented the system I described
above, dynamic_cast didn't exist yet in the language. So we
invented our own version of it. With a lot more work than would
have been necessary with direct support in the language.)
 
James Kanze

No, dynamic_cast punts. It's a query on object state.

Sorry, but I fail to see any difference between having the
producer provide type information, and having the compiler do it
for him. In the end, it boils down to the same thing, except
that 1) having the producer do it explicitly is more work for
the programmer, and 2) the producer can lie.

No, it doesn't.
What happens is, you have a design error.
The type of the object is part of the object's state. The word
"mutable" is nowhere to be found in the article. "as the
caller, you should not be making decisions based on the state
of the called object that result in you then changing the
state of the object."

The article didn't mention "mutable", but all of the examples
and the reasoning it applied only applied to mutable.

The idea is that you shouldn't be telling the object to do X,
you shouldn't be telling it to change its state to X, you
should be telling it about changes in its environment.

Agreed. You tell it to do something. In practice, however,
objects may pass through many layers of intermediate software,
which knows nothing about (or should know nothing about) what
services the provider actually offers, and what services the
consumer needs. Real software also often has
to deal with different versions: you must query whether your
partner supports some new functionality, and be prepared to use
an alternate strategy if it doesn't. Those are, IMHO, two cases
where dynamic_cast is called for---in the first case, it
actually helps design (by encapsulating the functionality at the
relevant level).
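
The versioning case can be sketched as follows, with the caveat that all of it is invented for illustration: Peer is the interface every version supports, BatchCapable is a later addition, and fetchAll is the client that has to cope with both.

```cpp
#include <string>
#include <vector>

// Interface that every version of the partner supports.
class Peer {
public:
    virtual ~Peer() = default;
    virtual std::string fetch(int id) = 0;
};

// Newer, optional functionality; old partners do not implement it.
class BatchCapable {
public:
    virtual ~BatchCapable() = default;
    virtual std::vector<std::string> fetchMany(const std::vector<int>& ids) = 0;
};

class OldPeer : public Peer {
public:
    std::string fetch(int id) override { ++calls; return "item" + std::to_string(id); }
    int calls = 0;  // counts requests, to make the strategy visible
};

class NewPeer : public Peer, public BatchCapable {
public:
    std::string fetch(int id) override { ++calls; return "item" + std::to_string(id); }
    std::vector<std::string> fetchMany(const std::vector<int>& ids) override {
        ++calls;
        std::vector<std::string> out;
        for (int id : ids) out.push_back("item" + std::to_string(id));
        return out;
    }
    int calls = 0;
};

// Queries the partner for the new functionality and
// falls back to one request per id if it is missing.
std::vector<std::string> fetchAll(Peer& peer, const std::vector<int>& ids) {
    if (BatchCapable* batch = dynamic_cast<BatchCapable*>(&peer)) {
        return batch->fetchMany(ids);
    }
    std::vector<std::string> out;
    for (int id : ids) out.push_back(peer.fetch(id));
    return out;
}
```

Whatever sits between the client and the partner only ever handles Peer and does not need to know that BatchCapable exists.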
 
James Kanze

could *you* give a simple concrete example. Why don't you know
the type?

Because the middleware didn't provide it. Because the
middleware doesn't care about the type---it's just there to
ensure communication.

Because I have no control over what version of the software the
client is running. So I have to be prepared to handle two
different versions.

what is "ordering information"? Z order? creation order?

The order of the requests?

but why would you want all the objects in certain order
unless you were going to apply a common operation to them?

Who knows? But it doesn't seem so unreasonable to me that
different operations may still require ordering between them.

If they don't all support the common operation why do you want
them?
Would the Visitor pattern be any use?

Maybe.

My point is just that there are a number of different tools
available, and no one tool is always the right one.

for example?

The fact that the middleware doesn't know (and isn't supposed to
know) the services provided by the higher levels.

In large projects, the software tends to be structured very much
like network protocols. The IP layer doesn't know beans about
NNTP, and shouldn't. Similarly, the various layers that connect
the different components of a large application shouldn't know
beans about the services provided by those components.
 
James Kanze

That's probably because static typing is not a desirable
feature in programming in the first place.

That depends on whether you want your program to work reliably
or not. Violate the type system (static or dynamic), and the
code doesn't work. If the language you're using uses static
type checking, you get a compile time error. If it uses dynamic
type checking, you get long debugging sessions at the customer
site.

It is added complexity, which is (arguably) sometimes
necessary, more necessary with some languages than others. The
languages I know that put emphasis on static type correctness
(for example, ML and Haskell) are extremely unwieldy to
program in.

I know a couple of Smalltalk experts who migrated to Java, and
had nothing but good things to say about the static type
checking provided by that language. I tend to work on more or
less critical applications, and find that even Java's static
typing is too weak for my tastes. Apparently, I'm not alone,
since I've used it, the Java authors have felt it
necessary to add generics (improved static type checking).

No fun at all. Nobody I think, not even here, would say, "oh,
I miss static typing. It was so neat." It's a burden on the
programmer best avoided if one can.

It's a burden on the programmer, yes. It means that he actually
has to design what he writes, and document it up front, in a
standard way that even a compiler can understand.
 
