Best way to loop through ArrayList and remove elements on the way?

Discussion in 'Java' started by Kevin, Jan 28, 2008.

  1. Kevin

    Kevin Guest

    Hi guys,

    Just want to confirm:

    for a ArrayList, in single thread mode (only one thread will access
    this ArrayList), what is the best way to:

    1) loop through all the element of this arraylist (for example, each
    item of it is a String).
    2) do some check on each item (for example, check if it equals to
    "abc").
    3) remove this item from the arraylist if the above check is true.

    Is using iterator() and then use iterator.remove() the best way?
    Like:

    for (Iterator it = myarraylist.iterator(); it.hasNext(); )
    {
    String s = (String) it.next();
    if (s.equals("abc"))
    {
    it.remove();
    }
    };


    Thanks a log.
    Kevin, Jan 28, 2008
    #1
    1. Advertising

  2. Re: Best way to loop through ArrayList and remove elements on theway?

    On 28.01.2008 23:46, Kevin allegedly wrote:
    > Hi guys,
    >
    > Just want to confirm:
    >
    > for a ArrayList, in single thread mode (only one thread will access
    > this ArrayList), what is the best way to:
    >
    > 1) loop through all the element of this arraylist (for example, each
    > item of it is a String). 2) do some check on each item (for example,
    > check if it equals to "abc"). 3) remove this item from the arraylist
    > if the above check is true.
    >
    > Is using iterator() and then use iterator.remove() the best way?
    > Like:
    >
    > for (Iterator it = myarraylist.iterator(); it.hasNext(); ) { String s
    > = (String) it.next(); if (s.equals("abc")) { it.remove(); } };
    >
    >
    > Thanks a log.
    >


    "Best" with respects to what?

    The code above will be more efficient on a LinkedList than on an
    ArrayList. As a general rule, when doing filtering operations
    (conditional removing/keeping) on array-backed structures (ArrayList,
    StringBuffer, etc.), it is more efficient, especially if the structure
    is big, to copy those elements which are to be kept into a new
    structure, and then to discard the old one, instead of perform multiple
    deletions. Only if there would be a significant number (more than a
    couple) of deletions, of course. Your mileage may vary, but it takes a
    lot off the CPU and adds only little memory overhead.

    DF.
    Daniele Futtorovic, Jan 28, 2008
    #2
    1. Advertising

  3. Re: Best way to loop through ArrayList and remove elements on theway?

    Kevin <> writes:

    > for a ArrayList, in single thread mode (only one thread will access
    > this ArrayList), what is the best way to:
    >
    > 1) loop through all the element of this arraylist (for example, each
    > item of it is a String).
    > 2) do some check on each item (for example, check if it equals to
    > "abc").
    > 3) remove this item from the arraylist if the above check is true.


    First of all, an ArrayList takes linear time to remove a single
    element, so repeated removal is a bad idea.

    Either use a LinkedList, which can remove in constant time during
    iteration, or first collect the elements in a HashSet and then remove
    them at the end using Collection.removeAll(Collection). I suggest the
    latter.

    For ArrayList, the removeAll method takes time proportional to the
    size of the list times the lookup time for the argument collection.
    Using a HashSet as argument should minimize the time it takes.

    If you only remove a few values, any collection will probably suffice.

    I.e., either:

    LinkedList tmpLinkedList = new LinkedList(myarraylist);
    for(Iterator iter = tmpLinkedList.iterator(); iter.hasNext();) {
    if (test(iter.next)) { iter.remove(); }
    }
    myarraylist.clear();
    myarraylist.addAll(tmpLinkedList);

    or

    HashSet toRemove = new HashSet();
    for(Iterator iter = myarraylist.iterator(); iter.hasNext();) {
    Object elem = iter.next();
    if (test(elem)) { toRemove.add(elem); }
    }
    myarraylist.removeAll(toRemove);

    This is assuming the test is complicated. If it's just equality
    check, then you might be able to use removeAll directly.

    > Is using iterator() and then use iterator.remove() the best way?


    For the above reasons, I'd say no.

    > Like:
    >
    > for (Iterator it = myarraylist.iterator(); it.hasNext(); )
    > {
    > String s = (String) it.next();
    > if (s.equals("abc"))
    > {
    > it.remove();
    > }
    > };


    In this simple case, where you only compare for equality (and even to
    only a single value), it would suffice to do:
    myarraylist.removeAll(Collections.singletonSet("abc"));
    If you have more values, but still only do equality checks, just create
    a collection and remove them all.
    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
    Lasse Reichstein Nielsen, Jan 28, 2008
    #3
  4. Kevin

    Kevin Guest

    Re: Best way to loop through ArrayList and remove elements on theway?

    Thanks guys!

    So remove(object) is linear time with respect to the ArrayList's
    lenght, right?

    Is using iterator.remove() still O(n) time? I think the iterator
    already keep the "reference" to the current item (for example, in my
    previous example code, it "points" to the current item already), and
    if we want to remove it, there is no need to look for it in the
    arraylist, so it just removes it directly, which should be constant
    time, right?

    Thanks.
    Kevin, Jan 29, 2008
    #4
  5. Re: Best way to loop through ArrayList and remove elements on theway?

    On 29.01.2008 00:30, Lasse Reichstein Nielsen allegedly wrote:
    > Kevin <> writes:
    >
    >> for a ArrayList, in single thread mode (only one thread will access
    >> this ArrayList), what is the best way to:
    >>
    >> 1) loop through all the element of this arraylist (for example,
    >> each item of it is a String). 2) do some check on each item (for
    >> example, check if it equals to "abc"). 3) remove this item from the
    >> arraylist if the above check is true.

    >
    > First of all, an ArrayList takes linear time to remove a single
    > element, so repeated removal is a bad idea.
    >
    > Either use a LinkedList, which can remove in constant time during
    > iteration, or first collect the elements in a HashSet and then remove
    > them at the end using Collection.removeAll(Collection). I suggest
    > the latter.
    >
    > For ArrayList, the removeAll method takes time proportional to the
    > size of the list times the lookup time for the argument collection.
    > Using a HashSet as argument should minimize the time it takes.


    I don't quite understand that advice. The docs for removeAll(Collection)
    in AbstractCollection (neither AbstractList nor ArrayList override that
    method) state:
    "This implementation iterates over this collection, checking each
    element returned by the iterator in turn to see if it's contained in the
    specified collection. If it's so contained, it's removed from this
    collection with the iterator's remove method".

    Which is what the OP was doing -- except for the part of the lookup.
    Sure, using removeAll is cleaner, and, as opposed to removing one "type"
    (as defined by equality relations in this context) at once, it is
    superior. But only in that it avoids multiple iterations. The most
    itensive operation here, however, isn't the iteration, but the removal.
    Which your suggestion does not affect.

    As far as I can surmise, there are only two ways of substantially
    improving the algorithm: either switch to a sequential list (and then
    using removeAll would be a good idea too), or give up on having the
    filtering modify the list at all, but rather yield a new one.

    DF.
    Daniele Futtorovic, Jan 29, 2008
    #5
  6. Re: Best way to loop through ArrayList and remove elements on theway?

    Kevin wrote:
    > Hi guys,
    >
    > Just want to confirm:
    >
    > for a ArrayList, in single thread mode (only one thread will access
    > this ArrayList), what is the best way to:
    >
    > 1) loop through all the element of this arraylist (for example, each
    > item of it is a String).
    > 2) do some check on each item (for example, check if it equals to
    > "abc").
    > 3) remove this item from the arraylist if the above check is true.
    >
    > Is using iterator() and then use iterator.remove() the best way?
    > Like:
    >
    > for (Iterator it = myarraylist.iterator(); it.hasNext(); )
    > {
    > String s = (String) it.next();
    > if (s.equals("abc"))
    > {
    > it.remove();
    > }
    > };
    >
    >
    > Thanks a log.
    >


    From the point of view of coding simplicity, I think it is the best way.

    If it becomes a performance bottleneck, investigate whether you should
    switch to LinkedList or take other steps to reduce the cost.

    Patricia
    Patricia Shanahan, Jan 29, 2008
    #6
  7. Kevin

    Roedy Green Guest

    On Mon, 28 Jan 2008 14:46:35 -0800 (PST), Kevin
    <> wrote, quoted or indirectly quoted
    someone who said :

    >Is using iterator() and then use iterator.remove() the best way?
    >Like:


    yes.
    --
    Roedy Green, Canadian Mind Products
    The Java Glossary, http://mindprod.com
    Roedy Green, Jan 29, 2008
    #7
  8. Kevin wrote:
    > Thanks guys!
    >
    > So remove(object) is linear time with respect to the ArrayList's
    > lenght, right?


    Yes. So is remove(int), but remove(object) has a larger constant.
    remove(m) for an ArrayList of size n needs to:

    1. copy the references numbered from m+1 to n-1 to the positions m
    to n-2 respectively
    2. Set the reference at m-1 to null.

    remove(o) has to

    a. Find the index m of the first o in the ArrayList
    b. perform steps 1 and 2 above.

    >
    > Is using iterator.remove() still O(n) time?


    Yes, because it amounts to remove(int).
    Mike Schilling, Jan 29, 2008
    #8
  9. In article
    <>,
    Kevin <> wrote:

    > Hi guys,
    >
    > Just want to confirm:
    >
    > for a ArrayList, in single thread mode (only one thread will access
    > this ArrayList), what is the best way to:
    >
    > 1) loop through all the element of this arraylist (for example, each
    > item of it is a String).
    > 2) do some check on each item (for example, check if it equals to
    > "abc").
    > 3) remove this item from the arraylist if the above check is true.
    >
    > Is using iterator() and then use iterator.remove() the best way?
    > Like:
    >
    > for (Iterator it = myarraylist.iterator(); it.hasNext(); )
    > {
    > String s = (String) it.next();
    > if (s.equals("abc"))
    > {
    > it.remove();
    > }
    > };
    >
    >
    > Thanks a log.


    Removing from ArrayList requires shifting elements so it can be slow.
    If an ArrayList is still the right tool, it may be faster in some cases
    to build a new list based on what wouldn't be removed. If you can't
    rebuild, you can still get an improvement by going backwards through the
    list.

    --
    I don't read Google's spam. Reply with another service.
    Kevin McMurtrie, Jan 29, 2008
    #9
  10. Re: Best way to loop through ArrayList and remove elements on theway?

    Daniele Futtorovic <> writes:

    > I don't quite understand that advice. The docs for removeAll(Collection)
    > in AbstractCollection (neither AbstractList nor ArrayList override that
    > method) state:
    > "This implementation iterates over this collection, checking each
    > element returned by the iterator in turn to see if it's contained in the
    > specified collection. If it's so contained, it's removed from this
    > collection with the iterator's remove method".


    True, my mistake.

    I was looking at the GNU Classpath implementation of ArrayList,
    which does implement an optimized removeAll, without noticing
    that it wasn't the Sun version.
    <URL:http://www.docjar.com/html/api/java/util/ArrayList.java.html>

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
    Lasse Reichstein Nielsen, Jan 29, 2008
    #10
  11. Re: Best way to loop through ArrayList and remove elements on theway?

    On 29.01.2008 08:47, Lasse Reichstein Nielsen allegedly wrote:
    > Daniele Futtorovic <> writes:
    >
    >> I don't quite understand that advice. The docs for
    >> removeAll(Collection) in AbstractCollection (neither AbstractList
    >> nor ArrayList override that method) state: "This implementation
    >> iterates over this collection, checking each element returned by
    >> the iterator in turn to see if it's contained in the specified
    >> collection. If it's so contained, it's removed from this collection
    >> with the iterator's remove method".

    >
    > True, my mistake.
    >
    > I was looking at the GNU Classpath implementation of ArrayList, which
    > does implement an optimized removeAll, without noticing that it
    > wasn't the Sun version.
    > <URL:http://www.docjar.com/html/api/java/util/ArrayList.java.html>
    >
    > /L


    I see. They've forgotten to modify the Javadoc accordingly, though. In
    their AbstractCollection class they still write it's done using an
    Iterator. Which, at that level, is correct, of course. They would have
    to override it ArrayList solely for the doc, I suppose.


    I started wondering why this hadn't been added to the core classes,
    fired up the bugparade, and here we are:
    <http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6529800>
    .... is flagged as "Closed, fixed" (took them til 1.6!!).

    The ArrayList class in my SDK doesn't contain any such implementation,
    but it's 1.6.0_02 (yeah, I should update). However, I can't seem to find
    the corresponding bugID in the release notes either:
    <http://java.sun.com/javase/6/webnotes/ReleaseNotes.html>

    :scratches head:

    DF.
    Daniele Futtorovic, Jan 29, 2008
    #11
  12. Kevin

    Boris Stumm Guest

    Kevin wrote:
    > for a ArrayList, in single thread mode (only one thread will access
    > this ArrayList), what is the best way to:
    >
    > 1) loop through all the element of this arraylist (for example, each
    > item of it is a String).
    > 2) do some check on each item (for example, check if it equals to
    > "abc").
    > 3) remove this item from the arraylist if the above check is true.
    >
    > Is using iterator() and then use iterator.remove() the best way?


    I do not think so, as stated in the other replies. An Iterator moves
    from the beginning of the ArrayList to the end, so that the complete
    operation will terminate in O(n^2). However, as long as speed is not
    an issue (or the lists are small), this approach is the cleanest one.

    The fastest way to check all elements in an ArrayList, and remove
    some of them is probably the following (assuming that the list is NOT
    SORTED!)

    ArrayList list = ...;
    for (int i = list.size() - 1; i >= 0; i--) {
    if (elementNeedsToBeRemoved(list.get(i))) {
    list.set(i, list.get(list.size() -1 ));
    list.remove(list.size() - 1);
    }
    }

    This will work in O(n).
    Boris Stumm, Jan 29, 2008
    #12
  13. Re: Best way to loop through ArrayList and remove elements on theway?

    Boris Stumm wrote:
    > Kevin wrote:
    >> for a ArrayList, in single thread mode (only one thread will access
    >> this ArrayList), what is the best way to:
    >>
    >> 1) loop through all the element of this arraylist (for example, each
    >> item of it is a String).
    >> 2) do some check on each item (for example, check if it equals to
    >> "abc").
    >> 3) remove this item from the arraylist if the above check is true.
    >>
    >> Is using iterator() and then use iterator.remove() the best way?

    >
    > I do not think so, as stated in the other replies. An Iterator moves
    > from the beginning of the ArrayList to the end, so that the complete
    > operation will terminate in O(n^2). However, as long as speed is not
    > an issue (or the lists are small), this approach is the cleanest one.


    Isn't the iterator-based version O(n*(m+1)) where n is the length of the
    list and m is the number of items being removed?

    Whether this is equivalent to O(n), O(n^2), or something in between
    depends on the relationship between n and m.

    Patricia
    Patricia Shanahan, Jan 29, 2008
    #13
  14. Re: Best way to loop through ArrayList and remove elements on theway?

    Patricia Shanahan <> writes:

    > Isn't the iterator-based version O(n*(m+1)) where n is the length of the
    > list and m is the number of items being removed?


    It's O(n*m*k) where n is the size of the list, m is the number of
    elements removed (it's an average, removing the first element is more
    expensive than the last element), and k is the time it takes to check
    whether an element is in the argument collection.

    > Whether this is equivalent to O(n), O(n^2), or something in between
    > depends on the relationship between n and m.


    Worst case is O(n^2), so it's fair to use that as an upper bound on
    the complexity of the algorithm.

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
    Lasse Reichstein Nielsen, Jan 30, 2008
    #14
  15. Re: Best way to loop through ArrayList and remove elements on theway?

    Lasse Reichstein Nielsen wrote:
    > Patricia Shanahan <> writes:
    >
    >> Isn't the iterator-based version O(n*(m+1)) where n is the length of the
    >> list and m is the number of items being removed?

    >
    > It's O(n*m*k) where n is the size of the list, m is the number of
    > elements removed (it's an average, removing the first element is more
    > expensive than the last element), and k is the time it takes to check
    > whether an element is in the argument collection.


    I don't understand why you multiply together m and k.

    The cost of examining each element is O(n*k), and the cost of removing
    the elements that need removing is O(n*m*j), where j is the cost of
    shifting one element down the array. We can treat at least one of k and
    j as being one, because constant factors don't affect complexity, so
    that can reduce to O(n*k+n*m)=O(n*(k+m)). I was treating them both as
    being one unit, so I reduced it to O(n*(m+1)).

    >
    >> Whether this is equivalent to O(n), O(n^2), or something in between
    >> depends on the relationship between n and m.

    >
    > Worst case is O(n^2), so it's fair to use that as an upper bound on
    > the complexity of the algorithm.
    >
    > /L


    However, in many real applications of this sort of operation the number
    of removed elements is very small, so I think it is important to
    remember that the complexity depends on the removal rate.

    Patricia
    Patricia Shanahan, Jan 30, 2008
    #15
  16. Re: Best way to loop through ArrayList and remove elements on theway?

    Patricia Shanahan <> writes:

    > Lasse Reichstein Nielsen wrote:


    >> It's O(n*m*k)

    ....
    > I don't understand why you multiply together m and k.


    Neither do I, now that you mention it. I should have said O(n*m+n*k),
    but I was too busy multiplying :)

    > The cost of examining each element is O(n*k), and the cost of removing
    > the elements that need removing is O(n*m*j), where j is the cost of
    > shifting one element down the array. We can treat at least one of k and
    > j as being one, because constant factors don't affect complexity, so
    > that can reduce to O(n*k+n*m)=O(n*(k+m)). I was treating them both as
    > being one unit, so I reduced it to O(n*(m+1)).


    Indeed.

    >> Worst case is O(n^2), so it's fair to use that as an upper bound on
    >> the complexity of the algorithm.


    > However, in many real applications of this sort of operation the number
    > of removed elements is very small, so I think it is important to
    > remember that the complexity depends on the removal rate.


    Nah, being practical shouldn't get in the way of a good theory :)
    But ofcourse, you are right. Knowing the actual problem being solved
    is more important than general theory. Like Bubble sort being the best
    sorting algorithm ... for almost sorted lists.

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
    Lasse Reichstein Nielsen, Jan 30, 2008
    #16
  17. Kevin

    Roedy Green Guest

    On Mon, 28 Jan 2008 14:46:35 -0800 (PST), Kevin
    <> wrote, quoted or indirectly quoted
    someone who said :

    >Is using iterator() and then use iterator.remove() the best way?
    >Like:


    Another faster method if the lists are long is to copy leaving behind
    the undesired elements.

    This is the technique used in the SortedArrayList class.

    See http://mindprod.com/products2.html#SORTED
    --
    Roedy Green, Canadian Mind Products
    The Java Glossary, http://mindprod.com
    Roedy Green, Jan 30, 2008
    #17
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Saravanan Rathinavelu

    Iterate through ArrayList using an another ArrayList

    Saravanan Rathinavelu, Aug 16, 2003, in forum: ASP .Net
    Replies:
    3
    Views:
    2,729
    Natty Gur
    Aug 19, 2003
  2. Kaidi
    Replies:
    4
    Views:
    2,335
    Kaidi
    Jan 3, 2004
  3. xz
    Replies:
    16
    Views:
    2,346
  4. Philipp
    Replies:
    6
    Views:
    904
    Arne Vajhøj
    May 28, 2008
  5. Isaac Won
    Replies:
    9
    Views:
    350
    Ulrich Eckhardt
    Mar 4, 2013
Loading...

Share This Page