Saturday, April 29, 2017

Difference between PriorityQueue and TreeSet in Java?

The PriorityQueue and TreeSet collection classes has a lot of similarities e.g. both provide O(log(N)) time complexity for adding, removing, and searching elements, both are non-synchronized and you can get element from both PriorityQueue and TreeSet in sorted order, but there is fundamental difference between them, TreeSet is a Set and doesn't allow a duplicate element, while PriorityQueue is a queue and doesn't have such restriction. It can contain multiple elements with equal values and in that case head of the queue will be arbitrarily chosen from them. Another key difference between TreeSet and PriorityQueue is iteration order, though you can access elements from the head in a sorted order e.g. head always give you lowest or highest priority element depending upon your Comparable or Comparator implementation but iterator returned by PriorityQueue doesn't provide any ordering guarantee.

Only guarantee PriorityQueue gives that head will always be smallest or largest element. On the other hand, TreeSet keeps all elements in sorted order and iterator returned by TreeSet will allow you to access all elements in that sorted order.

This is one of the frequently asked Collection interview questions and what makes it interesting is the subtle difference between a lot of similarities between PriorityQueue and TreeSet.  You can use one in place of another in some scenarios but not in all scenarios and that's what Interviewer is looking for when he ask this question to you on Interview.




Similarity between PriorityQueue and TreeSet

Before looking at the difference between PriorityQueue and TreeSet, let's first understand the similarities between them. This will help you to think of scenarios where you can use a PriorityQueue in place of TreeSet or vice-versa. 

ThreadSafety
First similarities between PriorityQueue and TreeSet is that both are not thread-safe, which means you cannot share them between multiple threads. If multiple threads are required to modify the TreeSet at the same time, then you must synchronize their access externally.

Ordering
The third similarity between PriorityQueue and TreeSet is that both can be used to access elements in a particular order. TreeSet is a SortedSet, hence it always keeps the elements in the order defined by their Comparator or natural order if there is no Comparator defined, while PriorityQueue will always make sure that head contains the lowest or highest value depending upon the order imposed by Comparator or Comparable.



Eligibility 
When I say eligibility, which means which objects can be stored in PrioritySet and TreeSet? Is there any restriction or all objects are allowed? Well, there is, you can only store objects which implement Comparable or Comparator in both PriorityQueue and TreeSet because the collection classes are responsible for keeping their commitment i.e. PriorityQueue must adjust after every insertion or deletion to keep the lowest or highest element in head position. Similarly, TreeSet must re-arrange elements so that they remain the sorted order specified by Comparator or natural order imposed by Comparable.


Performance
This is one point where you will see both similarity and difference between PriorityQueue and TreeSet in Java i.e. both provides O(log(N)) complexity for adding, removing and searching elements, but when you want to remove the highest or lowest priority element then PriorityQueue gives O(1) performance because it always keep the element in head, much like a heap data structure i.e. minimum heap where root is always the minimum value.  If you are not familiar with heap data structure, I suggest you reading a good book on data structure e.g. Algorithms 4th Edition by Robert Sedgewick.

TreeSet vs PriorityQueue in Java


Difference between PriorityQueue and TreeSet

Now, that you know and understand similarities between TreeSet and PrioritySet, let's see how different they are? and why you cannot use PrioritySet in place of TreeSet in Java?


Underlying Data Structure
This first and foremost difference is underlying data structure. PriorityQueue is a Queue and that's why it provides the functionality of FIFO data structure, while TreeSet is a Set and doesn't provide the Queue API.


Duplicate Elements
The second difference between them can be derived from the first difference i.e. properties of their underlying data structure. Since TreeSet is a Set it doesn't allow duplicate elements but PriorityQueue may contain duplicates. In the case of ties, the head of the priority queue is chosen arbitrarily.



Performance
The third difference between TreeSet and PrirityQueue comes from their relative performance. The PriorityQueue provides largest or smallest element in O(1) time, which is not possible by TreeSet. Since TreeSet is backed by a red black tree, the search operation will take O(logN) time. This is why if you are developing an application where priority matters e.g. a job scheduling algorithm where you always want to execute the job with the highest priority, you should use PriorityQueue data structure.


Availability
The 5th and last difference between PriorityQueue and TreeSet class are that former was added in JDK 1.5 while TreeSet was available from JDK 1.4 itself. This is not a very significant difference in the age of Java 8, but if you are working with legacy systems still running on Java 1.5, a point worth remembering.


Ordering
The fourth difference, which is more subtle than you think because in similarities I have told that both are responsible for keeping some ordering. The key point is that in TreeSet all elements remain in the sorted order, while in priority queue apart from root, which is guaranteed to be smallest or largest depending upon Comparing logic, rest of element may or may not follow any ordering.

This means if you store same elements in the TreeSet and PriorityQueue and iterate over them, then their order will be different. TreeSet will print them in sorted order but PriorityQueue will not until you are always getting the element from the head.  You can read a good core Java book e.g. Java: The Complete Reference, Ninth Edition to learn more about PriorityQueue implementation in Java.

Difference between PriorityQueue and TreeSet in Java



Using PriorityQueue and TreeSet in Java Program

Now, let's some code to highlight the difference between PriorityQueue and TreeSet in Java. If you remember the most subtle difference I mention in the previous paragraph was that TreeSet keeps all elements in the sorted order, while PriorityQueue only keeps the root or head into order. I mean the lowest element will always be at root in PriorityQueue. If you use poll() which consumes the head and removes it, you will always retrieve the elements in increasing order from PriorityQueue, as shown in following Java program.

import java.util.Collections;
import java.util.Date;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.Set;
import java.util.TreeSet;

/**
 * Java Program to demonstrate difference between PriorityQueue
 * and TreeSet. 
 * 
 * @author WINDOWS 8
 *
 */
public class App {  

  public static void main(String args[]) {

      Set setOfNumbers = new TreeSet<>();
      Queue queueOfNumbers = new PriorityQueue<>();
      
      // inserting elements into TreeSet and PriorityQueue
      setOfNumbers.add(202);
      setOfNumbers.add(102);
      setOfNumbers.add(503);
      setOfNumbers.add(33);
      
      queueOfNumbers.add(202);
      queueOfNumbers.add(102);
      queueOfNumbers.add(503);
      queueOfNumbers.add(33);
      
      // Iterating over TreeSet
      System.out.println("Iterating over TreeSet in Java");
      Iterator itr = setOfNumbers.iterator();
      while(itr.hasNext()){
        System.out.println(itr.next());
      }
      
      // Iterating over PriorityQueue
      System.out.println("Iterating over PriorityQueue in Java");
      itr = queueOfNumbers.iterator();
      while(itr.hasNext()){
        System.out.println(itr.next());
      }
      
      // Consuming numbers using from head in PriorityQueue
      System.out.println("polling a PriorityQueue in Java");
      while(!queueOfNumbers.isEmpty()){
        System.out.println(queueOfNumbers.poll());
      }
  }
     

}

Output:
Iterating over TreeSet in Java
33
102
202
503
Iterating over PriorityQueue in Java
33
102
503
202
polling a PriorityQueue in Java
33
102
202
503

You can see that the order is different from the Iterator returned by PriorityQueue and HeadSet (highlighted by red) but when you use the poll() method to consume all elements in PriorityQueue, you have got them in the same order they were present in TreeSet i.e. lowest to highest. This proves the point that TreeSet keeps all elements in sorted order while PriorityQueue only care about the root or head position. This kind of details are very important from Java certification perspective as well, you will find a lot of questions based upon such subtle details in mock exams as well as on the real exam.


Summary

Finally, here is the summary of all the differences between TreeSet and PriorityQueue in Java on the nice tabular format for easy reference:

Difference between PriorityQueue vsTreeSet in Java


That's all about the difference between TreeSet and PrirorityQueue in Java. Though both provides some sort of sorting capability there is a huge difference between the behavior of these two collection classes. The first and most important difference is one is Set and other is Queue, which means one doesn't allow duplicate while other is FIFO data structure without any restriction on duplication.  If you want to learn more about advanced data structure and collection classes in Java, I suggest to read a good book core Java e.g. Core Java For the Impatient by Cay. S. Horstmann.

Further Learning
Java Fundamentals: The Java Language by Jim Wilson
Java Fundamentals: Collections
Java Programming Interview Exposed

Other Java Collection Interview Questions you may like
  • Difference between ArrayList and HashSet in Java? (answer)
  • Difference between Hashtable and HashMap in Java? (answer)
  • Difference between HashMap and LinkedHashMap in Java? (answer)
  • Difference between ConcurrentHashMap and HashMap in Java? (answer)
  • Difference between ArrayList and Vector in Java? (answer)
  • Difference between ArrayList and LinkedList in Java? (answer)
  • Difference between TreeMap, HashMap, and LinkedHashMap in Java? (answer)
  • Difference between fail-fast and fail-safe Iterator in Java? (answer)
  • Difference between HashMap, Hashtable, and ConcurrentHashMap in Java? (answer)
  • Difference between an Array and ArrayList in Java? (answer)
  • Difference between synchronized ArrayList and CopyOnWriteArrayList in Java? (answer)

Thanks for reading this article so far, if you like this article then please share with your friends and colleagues. If you have any questions or feedback then please drop a comment.

13 comments :

Robin Richtsfeld said...

The PriorityQueue should have complexity Of(log(n)), but the Java implementation is actually O(n).
This was very disturbing when I found out, so I had to rewrite it completely.

Javin Paul said...

Hello Robin, for which operation you saw O(n) time complexity? could you please explain little more? Thanks

Robin Richtsfeld said...

Hello Javon Paul, the problem is the 'indexOf' method using a plain old for loop and which is used subsequently by 'remove' and 'contains', you get the idea...

Javin Paul said...

Hello Robin,
Indeed, this is seriously weired and it hasn't been rectified yet, even on Java 8. I checked the PriorityQueue code for both Java 1.7 and Java 8 and I still see same code:

public boolean contains(Object o) {
return indexOf(o) != -1;
}

private int indexOf(Object o) {
if (o != null) {
for (int i = 0; i < size; i++)
if (o.equals(queue[i]))
return i;
}
return -1;
}

Did you raise this to Oracle Java Forum?

Robin Richtsfeld said...

Hello Javin Paul,
I even tried to submit an alternative implementation but didn't cope with how the JDK development was organized. But now that you mention it, Oracle Java Forum seems to be a good starting point. May I reference this blog post and comments from there?

Javin Paul said...

Hello Robin, sure, you can. You are helping the community, thank you.

Duc Nguyen said...

I know that PriorityQueue is implemented by a heap behind the sense, and searching an element in a heap take O(N) time.
So, there are any way to improve the contains() or remove() function in Java?

Duc Nguyen said...

Another problem is you said: "but when you want to remove the highest or lowest priority element then PriorityQueue gives O(1) performance because it always keep the element in head, much like a heap data structure".
Why it take O(1), heap take O(logn) to remove the root and then siftDown() the heap?
So it should take O(logn)??

Javin Paul said...

@Duc, if you remove it just take O(1) time but yes if you take re-arranging the heap them it takes O(logN) time.

Robin Richtsfeld said...

The whole point of a PriorityQueue is that its elements are ordered. This allows for O(logN) for most operations.
Removing the first element takes so long because it doesn't make use of this and instead moves one element at a time, hence it has O(n).

Duc Nguyen said...

Hi Robin,
You said that 'remove' and 'contains' of PriorityQueue in Java is bad (It take O(n)). So, how can we make it faster?
I still not get your ideal.
Thanks,

Robin Richtsfeld said...

@Duc When removing from head, just increment a pointer and make the array cyclic -> O(1). When removing from middle, use the improved indexOf -> O(logN) and System.arraycopy and contains using improved indexOf -> O(logN).
Currently indexOf is O(N) but it could be O(logN) when implemented correctly.

Anonymous said...

The PriorityQueue in Java is implemented as Heap but the PriorityQueue class is derived from AbstractQueue. It's still a heap. A priority queue is an abstract data structure, while a heap is a non-abstract data structure satisfying all requirements of a priority queue.

Post a Comment