ArrayList.Iterator.remove()

D

Donkey Hottie

Just noticed a 'hidden feature of Java' ;)

Removing items an ArrayList (at least with an Iterator) takes ages.

Better - while not obvious - solution is to create an new ArrayList with the
remaining elements.

So it seems. removal is futile.
 
E

Eric Sosman

Donkey said:
Just noticed a 'hidden feature of Java' ;)

Removing items an ArrayList (at least with an Iterator) takes ages.

If the ArrayList contains a lot of items, yes. No, wait,
for "yes" read "of course" or even "obviously."
Better - while not obvious - solution is to create an new ArrayList with
the remaining elements.

Depends on how many items there are, how many you're
deleting, and where they're positioned.
So it seems. removal is futile.

Nonsense. No, wait, for "nonsense" read "balderdash" or
even "baloney." If you brush your teeth with a broom and have
an unpleasant experience, it does not follow that brushing your
teeth is futile.
 
L

Lew

Just noticed a 'hidden feature of Java' ;)

Removing items an ArrayList (at least with an Iterator) takes ages.

Better - while not obvious - solution is to create an new ArrayList with the
remaining elements.

So it seems. removal is futile.

Not so hidden, really.

The size, isEmpty, get, set, iterator, and listIterator operations run
in constant time. The add operation runs in amortized constant time, that
is, adding n elements requires O(n) time. All of the other operations run
in linear time (roughly speaking).

That's for *each* removal.

That's because 'ArrayList#remove()' (through an iterator or
otherwise)
hifts any subsequent elements to the left (subtracts one from their indices).

<http://java.sun.com/javase/6/docs/api/java/util/ArrayList.html#remove
(int)>

The documentation for the collection classes usually indicates the big-
O time of the common operations.

I would expect copying to take longer, however, depending on where one
is in the iteration. Perhaps there's extra machinery in the iterator
logic to normalize the iterator's position within the list.

Or are you copying individual elements as you iterate, skipping over
the "removed" ones? That should be much faster than multiple
removals.
 
D

Donkey Hottie

Eric Sosman said:
If the ArrayList contains a lot of items, yes. No,
wait, for "yes" read "of course" or even "obviously."


Depends on how many items there are, how many you're
deleting, and where they're positioned.


Nonsense. No, wait, for "nonsense" read "balderdash"
or even "baloney." If you brush your teeth with a broom
and have an unpleasant experience, it does not follow
that brushing your teeth is futile.

* Loaded 223673 addresses
* Sorting...
* Merging...
* Merged 22577 address ranges.

Removed 22577 items from 223673. Don't know it that is many or not.
 
D

Donkey Hottie

..
I would expect copying to take longer, however, depending
on where one
is in the iteration. Perhaps there's extra machinery in
the iterator
logic to normalize the iterator's position within the
list.

Or are you copying individual elements as you iterate,
skipping over
the "removed" ones? That should be much faster than
multiple
removals.

Well, after I discovered the slow speed of removal, I rewrote the method now
creates a new List and always just adds the suitable elements and replaces
the List instance.

10% of the original List seem to be removed, and it is 1000 times faster
now.
 
D

Donkey Hottie

Patricia Shanahan said:
Also, if access to a List is dominated by iterators with
remove, LinkedList is likely to be more efficient than
ArrayList. ArrayList is optimized for indexed access.

Thanks!

I was not aware of that kind of a List being in Java library (thought it
would be an Apache solution or something).

LinkedList now belongs to my toolkit allright.
 
L

Lew

Donkey Hottie said:
* Loaded 223673 addresses
* Sorting...
* Merging...
* Merged 22577 address ranges.

Removed 22577 items from 223673. Don't know it that is many or not.

I would expect, based on the extremely non-hidden documentation of
ArrayList, for that to take time roughly

T = (22577 * 223673 / 2) * k

where 'k' is the time to copy an element from one array location to
another.

If 'k' is around 100 nanoseconds, T would be around 252 seconds, or
somewhat over 4 minutes. A 'k' of 10 ns would require about 25
seconds of removal time.
 
D

Donkey Hottie

Lew said:
I would expect, based on the extremely non-hidden
documentation of ArrayList, for that to take time roughly

T = (22577 * 223673 / 2) * k

where 'k' is the time to copy an element from one array
location to another.

If 'k' is around 100 nanoseconds, T would be around 252
seconds, or somewhat over 4 minutes. A 'k' of 10 ns
would require about 25 seconds of removal time.

Great ;)

I'm writing a rival to a Windows application, which does that merge at least
5 minutes in a Athlon XP 1900+ machine. My Java implementation does it now
in ten seconds in a Pentium Pro 3 machine.

And I though as a C++ programmer (for 10 years) that Java is sluggish..
 
K

Knute Johnson

Donkey said:
.

Well, after I discovered the slow speed of removal, I rewrote the method
now creates a new List and always just adds the suitable elements and
replaces the List instance.

10% of the original List seem to be removed, and it is 1000 times faster
now.

I'd be really curious to know if using the LinkedList is significantly
faster than ArrayList. If you try it, please post back.

Thanks,
 
E

Eric Sosman

Donkey said:
Great ;)

I'm writing a rival to a Windows application, which does that merge at
least 5 minutes in a Athlon XP 1900+ machine. My Java implementation
does it now in ten seconds in a Pentium Pro 3 machine.

And I though as a C++ programmer (for 10 years) that Java is sluggish..

As almost always, you get a lot more improvement out of
choosing the right data structures and algorithms than out of
shaving cycles or even out of choosing implementation language.

You haven't fully explained what you're trying to do, but
at a guess you're collecting a big pile of numeric "addresses"
from somewhere, sorting them, possibly eliminating duplicates,
and finally combining groups of neighboring addresses into
"ranges." ArrayList doesn't strike me as a good candidate for
the elimination and combining stages, precisely because of the
time required to squeeze out the vacated slots. If you *must*
use ArrayList (for reasons the sketchy problem description does
not reveal), consider working on it from the end toward the
beginning instead of from the beginning toward the end. (But
since you're removing only 10% of the total, I don't hold out
a lot of hope for a huge improvement therefrom.)
 
A

Arne Vajhøj

Donkey said:
* Loaded 223673 addresses
* Sorting...
* Merging...
* Merged 22577 address ranges.

Removed 22577 items from 223673. Don't know it that is many or not.

It is many.

If all the items were removed first in the list you would have moved
approx. 4.8 billion elements because ArrayList is backed by an array
and need to move things around after a delete.

Arne
 
A

Arne Vajhøj

Donkey said:
Well, after I discovered the slow speed of removal, I rewrote the method
now creates a new List and always just adds the suitable elements and
replaces the List instance.

10% of the original List seem to be removed, and it is 1000 times faster
now.

Most likely it would be better with another data type than
ArrayList.

Arne
 
A

Arne Vajhøj

Donkey said:
Great ;)

I'm writing a rival to a Windows application, which does that merge at
least 5 minutes in a Athlon XP 1900+ machine. My Java implementation
does it now in ten seconds in a Pentium Pro 3 machine.

And I though as a C++ programmer (for 10 years) that Java is sluggish..

I would expect a C++ program using STL vector to have the same
behavior.

Arne
 
R

Roedy Green

ArrayList doesn't strike me as a good candidate for
the elimination and combining stages, precisely because of the
time required to squeeze out the vacated slots.

It is very quick if you create a new array. See
http://mindprod.com/products1.html#SORTED for a set of classes for
merging and deduping etc. Unfortunately, the classes were written
before generics were introduced.
--
Roedy Green Canadian Mind Products
http://mindprod.com

"Deer hunting would be fine sport, if only the deer had guns."
~ William S. Gilbert of Gilbert and Sullivan
 
M

markspace

Donkey said:
Thanks!

I was not aware of that kind of a List being in Java library (thought it


What?! Are you kidding me? I'm sorry but this is bull****. How the
hell can you program in Java at all with out at least bumping into the
Collections classes? This is really amazing to me. They're in every
tutorial. It's basic algorithms and should be instantly familiar to
anyone who made it through their lower division course work.... I'm just
aghast. Seriously.

Especially a C++ programmer should be on the lookout for something in a
new language to replace the STL, and Collections is a big part of Java's
answer to the STL.
 
E

Eric Sosman

Roedy said:
ArrayList doesn't strike me as a good candidate for
the elimination and combining stages, precisely because of the
time required to squeeze out the vacated slots.

It is very quick if you create a new array. [...]

Assuming the items being removed are fairly numerous. (As
it seems they are, although the O.P. let slip this information
only in follow-ups and not in the original "removal is futile"
post.)

If only a few items are being deleted, especially if they
are known to appear near the end, the work involved in sliding
all their successors one place to the left may well be less than
the work of copying "all except" to an entirely new array. Horses
for courses, and don't blame the horse if he falters when you
ride him up the ski jump.
 
K

Kevin McMurtrie

"Donkey Hottie said:
Just noticed a 'hidden feature of Java' ;)

Removing items an ArrayList (at least with an Iterator) takes ages.

Better - while not obvious - solution is to create an new ArrayList with the
remaining elements.

So it seems. removal is futile.

Skipped school? This is in the basics of information science. It has
nothing to do with Java.

When I think of hidden Java features I think of secret accessor methods,
generics insanity, -XX switches, FinalReference, and in-place math
operators lacking narrowing checks.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,054
Latest member
TrimKetoBoost

Latest Threads

Top