Expensive Recursion + Method Calls?

J

Jason Cavett

In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node. This is demonstrated in the following method. Note that this
method is in DataModel (which is a node in the tree, and an abstract
class that has to be implemented by any node in the tree).

public <E extends DataModel> List<E> getAllDescendentsOfType(
Class<? extends E> classType, boolean getArchived) {
List<E> foundDescendents = new ArrayList<E>();

for (DataModel child : children) {
if (!child.isArchived() || getArchived) {
if (classType.isInstance(child)) {
foundDescendents.add(classType.cast(child));
}
}

foundDescendents.addAll(child.getAllDescendentsOfType(
classType, getArchived));
}

return foundDescendents;
}

This method is relatively quick. (See the next part of my post for
why I'm saying this.)
So, because I wanted to separate the traversal of the tree from what
I'm actually trying to do on the tree (the implementation of an
algorithm), I decided to implement a visitor pattern where the
developer can provide a visitor that will perform whatever task it is
I want performed on a node and the traversal happens in the DataModel
itself. So, something like this...

// In DataModel
public <E extends DataModel> List<E> getAllDescendentsOfType(
Class<E> classType, boolean getArchived) {
AllDescendantsOfTypeVisitor<E> visitor = new
AllDescendantsOfTypeVisitor<E>(
classType, getArchived);
this.visit(visitor);

return visitor.getDescendants();
}

// Also in DataModel
public final void visit(Visitor visitor) {
visitor.startVisit(this);

synchronized (childrenLock) {
// Recurse through the children.
for (DataModel child : children) {
child.visit(visitor);
}
}

visitor.stopVisit(this);
}

// A visitor to find all the descendants of a certain type.
public class AllDescendantsOfTypeVisitor<E extends DataModel>
implements
Visitor {

private Class<E> classType;
private List<E> descendants = new ArrayList<E>();
private boolean archived = false;

public AllDescendantsOfTypeVisitor(Class<E> classType, boolean
archived) {
this.classType = classType;
this.archived = archived;
}

public List<E> getDescendants() {
return descendants;
}

@Override
public void startVisit(DataModel model) {
if (!model.isArchived() || this.archived) {
if (classType.isInstance(model)) {
descendants.add(classType.cast(model));
}
}
}

@Override
public void stopVisit(DataModel model) {
}
}


Basically, I pass in an instance of the visitor to
DataModel.visit(Visitor) and when the visit is completed, this
specific visitor will have the children I am looking for. (And I can
do any task on the tree - searching, preparing for writing, copying,
etc. It doesn't really matter.)

When I switched to this method, the application began running
noticeably slower. After profiling, there is seconds difference
between the first method that I implemented and the visitor method.
Why would this be happening? Are Java method calls so expensive that
I would notice this. (Please note that I do this search a lot since
I'm doing a lot of tree traversal to find things. We're talking
millions of traversals, so if a method is expensive, I would notice
it.)

Or...am I just doing something stupid that makes this more expensive
than it should be? Any help would be appreciated, because this is a
bit exasperating.


Thanks!
 
J

Jason Cavett

In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node.  This is demonstrated in the following method.  Note that this
method is in DataModel (which is a node in the tree, and an abstract
class that has to be implemented by any node in the tree).

public <E extends DataModel> List<E> getAllDescendentsOfType(
            Class<? extends E> classType, boolean getArchived) {
        List<E> foundDescendents = new ArrayList<E>();

        for (DataModel child : children) {
            if (!child.isArchived() || getArchived) {
                if (classType.isInstance(child)) {
                    foundDescendents.add(classType.cast(child));
                }
            }

            foundDescendents.addAll(child.getAllDescendentsOfType(
                classType, getArchived));
        }

    return foundDescendents;

}

This method is relatively quick.  (See the next part of my post for
why I'm saying this.)
So, because I wanted to separate the traversal of the tree from what
I'm actually trying to do on the tree (the implementation of an
algorithm), I decided to implement a visitor pattern where the
developer can provide a visitor that will perform whatever task it is
I want performed on a node and the traversal happens in the DataModel
itself.  So, something like this...

// In DataModel
    public <E extends DataModel> List<E> getAllDescendentsOfType(
            Class<E> classType, boolean getArchived) {
        AllDescendantsOfTypeVisitor<E> visitor = new
AllDescendantsOfTypeVisitor<E>(
                classType, getArchived);
        this.visit(visitor);

        return visitor.getDescendants();
    }

// Also in DataModel
    public final void visit(Visitor visitor) {
        visitor.startVisit(this);

        synchronized (childrenLock) {
            // Recurse through the children.
            for (DataModel child : children) {
                child.visit(visitor);
            }
        }

        visitor.stopVisit(this);
    }

// A visitor to find all the descendants of a certain type.
public class AllDescendantsOfTypeVisitor<E extends DataModel>
implements
        Visitor {

    private Class<E> classType;
    private List<E> descendants = new ArrayList<E>();
    private boolean archived = false;

    public AllDescendantsOfTypeVisitor(Class<E> classType, boolean
archived) {
        this.classType = classType;
        this.archived = archived;
    }

    public List<E> getDescendants() {
        return descendants;
    }

    @Override
    public void startVisit(DataModel model) {
        if (!model.isArchived() || this.archived) {
            if (classType.isInstance(model)) {
                descendants.add(classType.cast(model));
            }
        }
    }

    @Override
    public void stopVisit(DataModel model) {
    }

}

Basically, I pass in an instance of the visitor to
DataModel.visit(Visitor) and when the visit is completed, this
specific visitor will have the children I am looking for.  (And I can
do any task on the tree - searching, preparing for writing, copying,
etc.  It doesn't really matter.)

When I switched to this method, the application began running
noticeably slower.  After profiling, there is seconds difference
between the first method that I implemented and the visitor method.
Why would this be happening?  Are Java method calls so expensive that
I would notice this.  (Please note that I do this search a lot since
I'm doing a lot of tree traversal to find things.  We're talking
millions of traversals, so if a method is expensive, I would notice
it.)

Or...am I just doing something stupid that makes this more expensive
than it should be?  Any help would be appreciated, because this is a
bit exasperating.

Thanks!

Just for the record, I profiled both methods, and the visitor pattern
takes approx. 3 times longer to perform the same task. My guess is
method overhead.
 
J

John B. Matthews

Jason Cavett said:
In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node.  This is demonstrated in the following method.  Note that
this method is in DataModel (which is a node in the tree, and an
abstract class that has to be implemented by any node in the tree). [...]
        synchronized (childrenLock) {
[...]
Just for the record, I profiled both methods, and the visitor pattern
takes approx. 3 times longer to perform the same task. My guess is
method overhead.

Was the original synchronized, also?
 
J

Jason Cavett

In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node.  This is demonstrated in the following method.  Note that
this method is in DataModel (which is a node in the tree, and an
abstract class that has to be implemented by any node in the tree). [...]
        synchronized (childrenLock) {
[...]
Just for the record, I profiled both methods, and the visitor pattern
takes approx. 3 times longer to perform the same task.  My guess is
method overhead.

Was the original synchronized, also?

Ah. Yes it was. I forgot to put that in the second method when I was
typing into the newsgroup. Sorry about that. (And, for the record, I
tried to take out the synchronization block just to see what would
happen, and there were no speed improvements.)
 
J

Jason Cavett

In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node.  This is demonstrated in the following method.  Note that
this method is in DataModel (which is a node in the tree, and an
abstract class that has to be implemented by any node in the tree). [...]
        synchronized (childrenLock) { [...]
Just for the record, I profiled both methods, and the visitor pattern
takes approx. 3 times longer to perform the same task.  My guess is
method overhead.
Was the original synchronized, also?

Ah.  Yes it was.  I forgot to put that in the second method when I was
typing into the newsgroup.  Sorry about that.  (And, for the record, I
tried to take out the synchronization block just to see what would
happen, and there were no speed improvements.)

So, I've now also tried to use a stack to implement recursion and came
up with this.

public final void visit(Visitor visitor) {
Deque<DataModel> stack = new ArrayDeque<DataModel>();
stack.push(this);

while (stack.size() > 0) {
DataModel current = stack.pop();
visitor.startVisit(current);

Enumeration<DataModel> children = current.children();
while (children.hasMoreElements()) {
stack.push(children.nextElement());
}
}
}

However, although it's slightly faster (via profiling), there's no
real noticeable change. The only method being called here is
startVisit (I don't use stopVisit right now, so I decided to remove it
for the moment). This removes the need for synchronization which is
nice, but that's about it.

I can't figure out why this is THAT much slower.
 
J

John B. Matthews

 Jason Cavett <[email protected]> wrote:
In my original code, I implemented a method that searched through a
tree structure to find children of a certain type beneath a parent
node.  This is demonstrated in the following method.  Note that
this method is in DataModel (which is a node in the tree, and an
abstract class that has to be implemented by any node in the tree).
[...]
        synchronized (childrenLock) {
[...]
Just for the record, I profiled both methods, and the visitor pattern
takes approx. 3 times longer to perform the same task.  My guess is
method overhead.
Was the original synchronized, also?
[...]
Ah.  Yes it was.  I forgot to put that in the second method when I was
typing into the newsgroup.  Sorry about that.  (And, for the record, I
tried to take out the synchronization block just to see what would
happen, and there were no speed improvements.)

So, I've now also tried to use a stack to implement recursion and came
up with this.

public final void visit(Visitor visitor) {
Deque<DataModel> stack = new ArrayDeque<DataModel>();
stack.push(this);

while (stack.size() > 0) {
DataModel current = stack.pop();
visitor.startVisit(current);

Enumeration<DataModel> children = current.children();
while (children.hasMoreElements()) {
stack.push(children.nextElement());
}
}
}

However, although it's slightly faster (via profiling), there's no
real noticeable change. The only method being called here is
startVisit (I don't use stopVisit right now, so I decided to remove it
for the moment). This removes the need for synchronization which is
nice, but that's about it.

I can't figure out why this is THAT much slower.

I would expect the double dispatch to be only fractionally slower.
Presumably, your original getAllDescendentsOfType() traversed the data
model recursively, examining each node only once. Is it possible that
your visitor is visiting some of the same nodes repeatedly?
 
M

Mark Space

Jason said:
However, although it's slightly faster (via profiling), there's no
real noticeable change. The only method being called here is
startVisit (I don't use stopVisit right now, so I decided to remove it
for the moment). This removes the need for synchronization which is
nice, but that's about it.

I can't figure out why this is THAT much slower.


I'm interested too, but I'm feeling too lazy right now to implent the
code for profiling. Could you post a full working version with both
algoritms? Please include a method to fill the tree to a reasonable
size with random data. I could be motivated to profile that myself and
then take time to try some optimizations.
 
J

Jason Cavett

I'm interested too, but I'm feeling too lazy right now to implent the
code for profiling.  Could you post a full working version with both
algoritms?  Please include a method to fill the tree to a reasonable
size with random data.  I could be motivated to profile that myself and
then take time to try some optimizations.

So, it turns out that the problem was related to a locking issue -
nothing even to do with this visitor. I figured it out and fixed
locking and now everything works at a good speed again.

Thanks anyway!
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,884
Messages
2,569,953
Members
46,283
Latest member
Bernardo23

Latest Threads

Top