Thursday, September 14, 2023

How to empty an ArrayList in Java? clear() vs removeAll() method Example Tutorial

Many times we want to reset an ArrayList for the reusing purpose, by resetting we mean clearing it or removing all elements. There are two ways to reset an ArrayList in Java, by using the clear() method or calling removeAll(). If your ArrayList is small enough like it contains only 10 or 100 elements then you can use any of these two methods without worrying too much, but, if you have a huge list of lots of objects like an ArrayList containing 10M entries, then the choice of clear() vs removeAll() can make a huge difference in the performance of your Java application.

Sometimes it's even better to create a new ArrayList instead of resetting the old one, especially if resetting takes a long time, but this also has a caveat, you need to make sure that the old ArrayList is eligible for garbage collection, otherwise, there is a huge risk of java.lang.OutOfMemoryError: Java Heap Space.

Coming back to the clear() vs removeAll() method, you should always use clear(), because it gives you O(n) performance, while removeAll(Collection c) is worse, it gives O(n^2) performance, that's why you see a huge difference in time taken by clearing a large ArrayList by these two methods.

Things will be obvious, when you will run our example program and see the code of clear() and removeAll() method from JDK API. 

By the way, if you are in doubt, use the clear() method, and if not then always prefer clear over removeAll in Java.



Clear() vs RemoveAll(Collection c) Example in Java

In order to compare the performance of both these methods, it's very important to see their code. You can check the source code of the clear() method in java.util.ArrayList class, for convenience I have included it here.

 This code is from JDK version 1.7.0_40. By the way, If you want to learn more about performance measurement and tuning then I strongly suggest joining these best Java performance online courses for beginners and experienced developers.  

It covers both processes and tools you need to monitor and tune the performance of Java applications. 

   /**
     * Removes all of the elements from this list.  The list will
     * be empty after this call returns.
     */
    public void clear() {
        modCount++;
        // clear to let GC do its work
        for (int i = 0; i < size; i++)
            elementData[i] = null;
        size = 0;
    }

You can see that this method loop over ArrayList and assign null to every element to make them eligible for garbage collection, of course, if there is no external reference. 

How to empty an ArrayList in Java? Clear vs RemoveAll Example Tutorial



Similarly, you can check the source code of java.util.AbstractCollection class to look at how the removeAll(Collection c) method works, here is a snippet:

public boolean removeAll(Collection c) {
        boolean modified = false;
        Iterator it = iterator();
        while (it.hasNext()) {
            if (c.contains(it.next())) {
                it.remove();
                modified = true;
            }
        }
        return modified;
 }

This implementation iterate over the collection, checking each element returned by the iterator, in turn, to see if it's contained in the specified collection.  If it's present then it is removed from this collection by using Iterator's remove method

Because of using contains() method, removeAll() performance goes into the range of O(n^2), which is an absolutely NO, especially if you are trying to reset a large ArrayList. Now let's see their performance in action to reset an ArrayList of just 100K entries.

If you are interested more in Java Performance measurement and tuning then I also suggest you take a look at Java Performance The Definitive Guide By Scott Oaks, one of the best books on Java profiling.

clear vs removeAll two ways to remove all elements form ArrayList




Removing all elements from ArrayList with 100K Objects

I have initially tried to run this example with 10M elements but after waiting for more than half an hour to let removeAll() finish, I decided to reduce the number of objects to 100K, even then the difference between the time taken by clear() vs removeAll() is quite significant. The removeAll(Collection c) is taking 10000 times more time than clear to reset.

Actually, the purpose of clear() and removeAll(Collection c) are different in API, the clear() method is meant to reset a Collection by removing all elements, while removeAll(Collection c) only removes elements that are present in the supplied collection. This method is not designed to remove all elements from a Collection.

So, if your intention is to delete all elements from a Collection, then use clear(), while if you want to remove only some elements, which are present in another Collection, e.g. list of closed orders, then use the removeAll() method.

import java.util.ArrayList;

/**
 * Java Program to remove all elements from list in Java and comparing
 * performance of clearn() and removeAll() method.
 *
 * @author Javin Paul
 */
public class ArrayListResetTest {
    private static final int SIZE = 100_000;
    public static void main(String args[]) {
     
        // Two ArrayList for clear and removeAll
        ArrayList numbers = new ArrayList(SIZE);
        ArrayList integers = new ArrayList(SIZE);
        
        // Initialize ArrayList with 10M integers
        for (int i = 0; i &lt; SIZE; i++) {
            numbers.add(new Integer(i));
            integers.add(new Integer(i));
        }
     
        // Empty ArrayList using clear method
        long startTime = System.nanoTime();
        numbers.clear();
        long elapsed = System.nanoTime() - startTime;
        System.out.println("Time taken by clear to empty
                ArrayList of 1M elements (ns): " + elapsed);
     
 
       // Reset ArrayList using removeAll method
        startTime = System.nanoTime();
        integers.removeAll(integers);
        long time = System.nanoTime() - startTime;
        System.out.println("Time taken by removeAll to reset 
             ArrayList of 1M elements (ns): " + time);
    }
}

Output:
Time taken by clear to empty ArrayList of 100000 elements (ns): 889619
Time taken by removeAll to reset ArrayList of 100000 elements (ns): 36633112126

Make sure you provide sufficient memory to run this program because it uses two ArrayList to store Integers, especially if you want to compare the performance of clear() and removeAll() for List with 1M elements.



You also need Java 7 to run this program because I am using underscore with the numeric literal feature. If you don't have JDK 7 then just remove underscores from SIZE constants, those are just for improving readability.


That's all about how to reset an ArrayList in Java. We have not only learned two ways to remove all elements from ArrayList but also learned about the difference between the clear vs removeAll method. We have seen that removeAll performs poorly when the list is large and that's why you should always prefer clear() over removeAll() in Java.

By the way, if clearing ArrayList is taking significant time, consider using the new ArrayList, as Java is pretty fast in creating objects.



Other Java ArrayList tutorials you may like

If you are interested to learn more about the ArrayList class, one of the most useful classes from the Java collection framework, then you might like the following articles as well:
  • Java - ArrayList and HashMap Performance Improvement in JDK 7 (read more)
  • Java - How to convert ArrayList to Set? (read more)
  • How to sort an ArrayList in reverse order in Java? (solution)
  • How to remove duplicate elements from ArrayList in Java? (solution)
  • How to clone an ArrayList in Java? (solution)
  • How do you convert a Map to a List in Java? (solution)
  • Java - Performance comparison of contains() vs binarySearch() (read more)
  • Java - How to initialize an ArrayList with values in Java? (example)
  • Java - The ArrayList Guide (tutorial)
  • The difference between an ArrayList and a Vector in Java? (answer)
  • How to make an ArrayList unmodifiable in Java? (solution)
  • 21 ArrayList interview questions with answers (ArrayList questions)
  • How to sort ArrayList by multiple fields in Java? (sort a list)

Thanks for reading this article so far. If you like this tutorial and my example of ArrayList.clear() and ArrayList.removeAll() method to empty an ArrayList in Java then please share with your friends and colleagues. If you have any questions, please drop a note.

P. S. - If you are keen to level up your Java collections skills and need resources then I highly recommend you to check out these best Java collections and Stream Courses to start with. It contains some high-quality online courses to master both Collections and Stream API in Java. 

Also, what is your favorite way to empty an ArrayList in Java? clear(), removeAll() or anything else? Let me know in comments. 

4 comments :

Anonymous said...

I think these two methods got totally diferent meaning and purpose. Like clear() removes all elements and resets the collection where as removeAll(Collection c) takes a input parameter and removes the elements that we passed to the method. So we use clear() to remove all elements where as removeAll(Collection c) is used when we want a list of elements to be removed.

So the point what i m trying to make is clear() and removeAll() got completely different meaning to be compared. Correct me if i am wrong.

Verhás Péter said...

removeAll() is O(n^2) but not for the reason you mention. Calling 'contains()' is O(n) as an average. When you call 'al.removeAll(al)' on an array list on itself the call to 'contains()' finds the first element in one iteration, and that is O(1). That would make the it still O(n). The quadratic performance comes from the array nature of array list: each time an element is removed the rest of the elements are copied one position down. That is an O(n) operation.

Saikat said...

Don't think this is the reason. This is not about performance. The both methods have different meaning.

@Test
void whenClear_thenListBecomesEmpty() {
Collection collection = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));

collection.clear();

assertTrue(collection.isEmpty());
}

As we can see the collection is empty after clear() being called.

Let's create a second test with two collections, one with numbers from 1 to 5 and the other with numbers from 3 to 7. After that, we'll call removeAll() on the first collection with the second one as a parameter.

We'll expect only the numbers 1 and 2 to remain in the first collection (while the second one is unchanged).

@Test
void whenRemoveAll_thenFirstListMissElementsFromSecondList() {
Collection firstCollection = new ArrayList<>(
Arrays.asList(1, 2, 3, 4, 5));
Collection secondCollection = new ArrayList<>(
Arrays.asList(3, 4, 5, 6, 7));

firstCollection.removeAll(secondCollection);

assertEquals(
Arrays.asList(1, 2),
firstCollection);
assertEquals(
Arrays.asList(3, 4, 5, 6, 7),
secondCollection);
}

And our expectations are met. Only the numbers 1 and 2 are remaining in the first collection and the second one hasn't been changed.

We've seen the purposes of Collection.clear() and Collection.removeAll().

Despite what we might think at first, they aren't doing the same thing. clear() deletes every element from the collection and removeAll() one only removes the elements matching those from another Collection.

javin paul said...

@Saikat, you have got a point there and clear should be used to empty the collection, while removeAll(), though it can also empty the collection is best to remove a subset of elements.

Post a Comment