Java Stream distinct(): Get Unique Values from Stream

Lokesh Gupta

Added in Java 8, the Stream.distinct() method returns a new Stream containing only the distinct elements of the given Stream; duplicate elements are removed. The snippet below collects the result with Stream.toList(), which is available since Java 16.

List<T> distinctItems = stream.distinct().toList();

1. Stream.distinct() Method

The distinct() method is a stateful intermediate operation: while processing each new element, it uses state from the elements it has already seen in the Stream.

Stream<T> distinct()
  • The distinct() method returns the distinct elements of the given stream. Elements are compared for equality with the equals() method.
  • For ordered streams, such as those backed by a List, the selection is stable: among equal elements, the one appearing first in the encounter order is preserved.
  • For unordered streams, no stability guarantees are made.
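
The stability guarantee for ordered streams can be illustrated with a minimal sketch. The class name EncounterOrderDemo and the sample values are made up for this illustration; a List is used as the source so that the stream is ordered.

```java
import java.util.List;
import java.util.stream.Collectors;

public class EncounterOrderDemo {

  // Returns the distinct elements of an ordered source (a List).
  static List<String> distinctInEncounterOrder(List<String> source) {
    return source.stream()
        .distinct()                      // keeps the first occurrence of each element
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> source = List.of("b", "a", "b", "c", "a");
    System.out.println(distinctInEncounterOrder(source)); // [b, a, c]
  }
}
```

The result follows the order in which each value first appears in the source, not a sorted order.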

2. Find Distinct Elements in a Stream of Strings or Primitives

It is easy to find distinct items from a list of simple types such as String and the wrapper classes. These classes already implement the required equals() method, which compares the values they hold.

In the given example, we have a List of strings and we want to find all distinct strings in it. We will stream over the String elements and collect the distinct ones into another List using the Stream.collect() terminal operation.

Collection<String> list = Arrays.asList("A", "B", "C", "D", "A", "B", "C");
 
List<String> distinctChars = list.stream()
                        .distinct()
                        .collect(Collectors.toList());    //[A, B, C, D]
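
For primitive values, the specialized streams offer the same operation. The following is a small sketch using IntStream.distinct(); the class and method names are assumptions made for this example only.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

public class PrimitiveDistinctDemo {

  // Removes duplicate values from an int array using IntStream.distinct().
  static int[] distinctInts(int... values) {
    return IntStream.of(values)
        .distinct()
        .toArray();
  }

  public static void main(String[] args) {
    int[] result = distinctInts(1, 2, 2, 3, 1, 4);
    System.out.println(Arrays.toString(result)); // [1, 2, 3, 4]
  }
}
```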

3. Stream Distinct by Field or Property

In real-world applications, we will be dealing with a stream of custom classes or complex types (representing some system entity).

By default, every Java class inherits the equals() method from the Object class, and that default implementation compares references. Two instances with identical field values are therefore not equal unless they are the same object. So, it is highly recommended to override equals(), together with hashCode(), and define custom logic for object equality. If we do not, distinct() treats such instances as different elements and keeps all of them. Records are an exception: they derive equals() and hashCode() from all of their components.
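
The effect of the inherited, reference-based equals() can be seen in a short sketch. The Point class below is hypothetical and deliberately does not override equals() or hashCode().

```java
import java.util.List;

public class ReferenceEqualityDemo {

  // Hypothetical class that does NOT override equals() or hashCode().
  static class Point {
    final int x;
    final int y;

    Point(int x, int y) {
      this.x = x;
      this.y = y;
    }
  }

  // Counts the elements that survive distinct().
  static long countDistinct(List<Point> points) {
    return points.stream().distinct().count();
  }

  public static void main(String[] args) {
    // Two separate instances with the same field values are not equal by reference.
    System.out.println(countDistinct(List.of(new Point(1, 2), new Point(1, 2)))); // 2

    // The same instance twice is a genuine duplicate.
    Point p = new Point(1, 2);
    System.out.println(countDistinct(List.of(p, p))); // 1
  }
}
```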

3.1. Override equals() to Define Object Equality

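Let’s create a Person record for our example. It has three components: id, fname and lname. Two persons are equal if their ids are the same. A record's generated equals() and hashCode() use all components, so we override both to use only the id. Overriding equals() without a matching hashCode() breaks the contract between the two methods, and hash-based duplicate detection may then not work as expected.

import java.util.Objects;

public record Person(Integer id, String fname, String lname) {

  // Two persons are equal when their ids are equal.
  @Override
  public boolean equals(final Object obj) {
    if (this == obj) {
      return true;
    }
    if (obj == null) {
      return false;
    }
    if (getClass() != obj.getClass()) {
      return false;
    }
    Person other = (Person) obj;
    return Objects.equals(id, other.id);
  }

  // Keep hashCode() consistent with equals(): hash on the id only.
  @Override
  public int hashCode() {
    return Objects.hashCode(id);
  }
}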
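
To see id-based equality at work when the other fields differ, consider the following sketch. It assumes a hypothetical Employee record (not part of the original example) whose equals() and hashCode() both use only the id. In the OpenJDK implementation, distinct() tracks the elements it has seen in a hash-based set, which is why the two methods should agree.

```java
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

public class IdEqualityDemo {

  // Hypothetical record whose equality is based on the id only.
  record Employee(Integer id, String name) {

    @Override
    public boolean equals(Object obj) {
      return obj instanceof Employee other && Objects.equals(id, other.id);
    }

    // Must agree with equals(): equal employees produce equal hash codes.
    @Override
    public int hashCode() {
      return Objects.hashCode(id);
    }
  }

  static List<Employee> distinctEmployees(List<Employee> employees) {
    return employees.stream().distinct().collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Employee> employees = List.of(
        new Employee(1, "Lokesh"),
        new Employee(1, "L. Gupta"),   // same id, different name
        new Employee(2, "Brian"));

    System.out.println(distinctEmployees(employees));
    // [Employee[id=1, name=Lokesh], Employee[id=2, name=Brian]]
  }
}
```

The second element is dropped even though its name differs, because only the id takes part in the comparison.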

3.2. Demo

Let’s test the code. We will add a few duplicate person records in the List. Then we will use the Stream.distinct() method to find all instances of Person class with unique id.

Person lokeshOne = new Person(1, "Lokesh", "Gupta");
Person lokeshTwo = new Person(1, "Lokesh", "Gupta");
Person lokeshThree = new Person(1, "Lokesh", "Gupta");
Person brianOne = new Person(2, "Brian", "Clooney");
Person brianTwo = new Person(2, "Brian", "Clooney");
Person alex = new Person(3, "Alex", "Kolen");
 
Collection<Person> list = Arrays.asList(alex, 
                                        brianOne, 
                                        brianTwo, 
                                        lokeshOne, 
                                        lokeshTwo, 
                                        lokeshThree);

// Get distinct people by id
List<Person> distinctElements = list.stream()
            .distinct()
            .collect( Collectors.toList() );

System.out.println( distinctElements );

Program output (wrapped here for readability; elements appear in encounter order):

[
Person[id=3, fname=Alex, lname=Kolen],
Person[id=2, fname=Brian, lname=Clooney],
Person[id=1, fname=Lokesh, lname=Gupta]
]

4. Find Distinct Items by Complex Keys or Multiple Fields

We may not always want distinct items based on the natural equality rules. Sometimes, the business needs distinct items based on custom logic. For example, we may need to treat people as duplicates when their full names match, whatever their ids are. In this case, we must check the equality based on the fname and lname fields of Person.

The Stream API has no built-in operation for finding distinct elements by comparing them through a user-supplied function. So we will create our own utility function and then use it.

We can use the information on the linked post to find the items that are distinct by multiple fields.

4.1. Create distinctByKey() Method

The distinctByKey() function uses a ConcurrentHashMap instance to find out if there is an existing key with the same value – where the key is obtained from a function reference.

The parameter to this function is a lambda expression used to generate the map key for the comparison. If the key is a custom type, do not forget to override its hashCode() and equals() methods. Note that ConcurrentHashMap does not accept null keys, so the key extractor must not return null.

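// Requires: java.util.Map, java.util.concurrent.ConcurrentHashMap,
//           java.util.function.Function, java.util.function.Predicate
public static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) 
{
    // putIfAbsent() returns null only the first time a key is seen.
    Map<Object, Boolean> map = new ConcurrentHashMap<>();
    return t -> map.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}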

We can pass any accessor method, such as Person::fname, as the argument, which causes that field's value to act as the key of the map.
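
As an illustration of passing an accessor as a method reference, here is a self-contained sketch. The City record and the firstCityPerCountry() helper are hypothetical names introduced only for this example, and this variant tracks seen keys with ConcurrentHashMap.newKeySet() rather than a Map; that is one possible design, not the only one.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class DistinctByKeyDemo {

  // Hypothetical record used only for this sketch.
  record City(String name, String country) {}

  // Same idea as above, tracking seen keys in a concurrent Set instead of a Map.
  static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Set<Object> seen = ConcurrentHashMap.newKeySet();
    return t -> seen.add(keyExtractor.apply(t)); // add() returns false for a repeated key
  }

  static List<City> firstCityPerCountry(List<City> cities) {
    return cities.stream()
        .filter(distinctByKey(City::country))    // accessor passed as a method reference
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<City> cities = List.of(
        new City("Paris", "France"),
        new City("Lyon", "France"),
        new City("Rome", "Italy"));

    System.out.println(firstCityPerCountry(cities));
    // [City[name=Paris, country=France], City[name=Rome, country=Italy]]
  }
}
```

Because the returned predicate holds state, a new one should be created for every stream pipeline instead of being reused.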

4.2. Demo

Check how we are using distinctByKey(p -> p.fname() + " " + p.lname()) in the filter() method. Since Person is a record, its accessors are fname() and lname().

Person lokeshOne = new Person(1, "Lokesh", "Gupta");
Person lokeshTwo = new Person(2, "Lokesh", "Gupta");
Person lokeshThree = new Person(3, "Lokesh", "Gupta");
Person brianOne = new Person(4, "Brian", "Clooney");
Person brianTwo = new Person(5, "Brian", "Clooney");
Person alex = new Person(6, "Alex", "Kolen");
 
Collection<Person> list = Arrays.asList(alex, 
                                        brianOne, 
                                        brianTwo, 
                                        lokeshOne, 
                                        lokeshTwo, 
                                        lokeshThree);

// Get distinct objects by key
List<Person> distinctElements = list.stream()
            .filter( distinctByKey(p -> p.fname() + " " + p.lname()) )
            .collect( Collectors.toList() );

System.out.println( distinctElements );

Program output (wrapped here for readability; elements appear in encounter order):

[
Person[id=6, fname=Alex, lname=Kolen],
Person[id=4, fname=Brian, lname=Clooney],
Person[id=1, fname=Lokesh, lname=Gupta]
]
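
A key built by concatenating strings can collide: "Mary Ann" + " " + "Smith" and "Mary" + " " + "Ann Smith" produce the same text. One way to avoid this, sketched below, is to use a List of the fields as the composite key, since lists compare element by element. The Person record here is a plain copy without custom equality, the sample names are invented, and List.of() assumes the names are non-null.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class CompositeKeyDemo {

  // Plain copy of the record for this sketch (no custom equals()).
  record Person(Integer id, String fname, String lname) {}

  static <T> Predicate<T> distinctByKey(Function<? super T, ?> keyExtractor) {
    Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
  }

  // Distinct by (fname, lname), using a List as the composite key.
  static List<Person> distinctByFullName(List<Person> people) {
    return people.stream()
        .filter(distinctByKey(p -> List.of(p.fname(), p.lname())))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Person> people = List.of(
        new Person(1, "Mary Ann", "Smith"),
        new Person(2, "Mary", "Ann Smith"),    // same concatenated text, different fields
        new Person(3, "Mary Ann", "Smith"));   // duplicate of id=1 by name

    System.out.println(distinctByFullName(people));
    // [Person[id=1, fname=Mary Ann, lname=Smith], Person[id=2, fname=Mary, lname=Ann Smith]]
  }
}
```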

5. Conclusion

The primary purpose of Stream.distinct() is to eliminate duplicate elements from a given stream, guaranteeing that only distinct elements remain in the resulting stream. When applied to a stream, the distinct() operation leverages the equals() and hashCode() methods of the objects within the stream to identify and remove duplicates.

While filtering out duplicates, the distinct() operation preserves the encounter order of the elements for ordered streams. For unordered streams, no such guarantee is made.

Happy Learning !!

Sourcecode on Github
