Learn to use the Stream.distinct() method to find the distinct elements of a Stream, including elements that are distinct by a field. We can use the information in the linked post to find items that are distinct by multiple fields.
List<String> distinctItems = list.stream().distinct().collect(Collectors.toList());
1. Stream distinct() API
The distinct() method is a stateful intermediate operation: it uses state from previously seen elements of the Stream while processing new items.
Stream<T> distinct()
- The distinct() method returns a new stream consisting of the distinct elements from the given stream. The equals() method is used to check the equality of the stream elements.
- The distinct() method preserves the ordering for streams backed by an ordered collection. For ordered streams, the element appearing first in the encounter order is the one preserved.
- For unordered streams, no stability guarantees are made.
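The ordering rule above can be illustrated with a minimal sketch. The class and method names (DistinctOrderDemo, distinctInOrder) are made up for this illustration; the list is an ordered source, so the first occurrence of each value is the one that survives.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctOrderDemo {

    // Returns the distinct elements of an ordered list,
    // keeping the first occurrence of each value in encounter order.
    static List<Integer> distinctInOrder(List<Integer> input) {
        return input.stream()
                .distinct()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // 3 and 1 appear twice; only their first occurrences are kept.
        System.out.println(distinctInOrder(List.of(3, 1, 3, 2, 1))); // [3, 1, 2]
    }
}
```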
2. Find Distinct in Stream of Strings or Primitives
It is easy to find distinct items in a list of simple types such as String and the wrapper classes. These classes implement the required equals() method, which compares the values stored in the instances.
In the given example, we have a List of strings and we want to find all distinct strings from the List. We will use a Stream to iterate over all the String elements and collect the distinct String elements into another List using the Stream.collect() terminal operation.
Collection<String> list = Arrays.asList("A", "B", "C", "D", "A", "B", "C");
List<String> distinctChars = list.stream()
    .distinct()
    .collect(Collectors.toList()); //[A, B, C, D]
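For primitive values, the primitive stream types such as IntStream offer their own distinct() method, so no custom equality logic is needed. The following is a small sketch; the class and method names (PrimitiveDistinctDemo, distinctInts) are assumptions made for the example.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class PrimitiveDistinctDemo {

    // Removes duplicate int values, then boxes the result into a List<Integer>.
    static List<Integer> distinctInts(int... values) {
        return IntStream.of(values)
                .distinct()
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(distinctInts(1, 2, 2, 3, 1)); // [1, 2, 3]
    }
}
```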
3. Find Distinct Objects By Field
In real-world applications, we will be dealing with streams of custom classes or complex types (representing some system entity).
By default, all Java classes inherit the equals() method from the Object class. The default equals() method compares references to check the equality of two instances, so two separately created objects with identical field values are not equal. It is therefore highly recommended to override equals() (and, consistently, hashCode()) to define custom logic for object equality. If we do not override equals() in our custom type, distinct() treats such objects as different elements and keeps all of them.
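To make the effect of the default equals() concrete, here is a minimal sketch. The Tag class is hypothetical and deliberately does not override equals() or hashCode(), so two instances with the same name still count as two distinct elements.

```java
import java.util.List;

class DefaultEqualsDemo {

    // Hypothetical class that does NOT override equals() or hashCode().
    static class Tag {
        final String name;

        Tag(String name) {
            this.name = name;
        }
    }

    // Counts the elements that survive distinct().
    static long countDistinct(List<Tag> tags) {
        return tags.stream().distinct().count();
    }

    public static void main(String[] args) {
        Tag first = new Tag("java");
        Tag second = new Tag("java");

        // Same field value, different references: both are kept.
        System.out.println(countDistinct(List.of(first, second))); // 2

        // Same reference twice: only one is kept.
        System.out.println(countDistinct(List.of(first, first))); // 1
    }
}
```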
3.1. Override equals() Method
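Let’s create a Person record for our example. It has three fields: id, fname and lname. A record already generates equals() and hashCode() from all of its components, so we override both to make two persons equal whenever their ids are the same. Keep hashCode() consistent with equals(); otherwise, object equality will not work as expected in hash-based lookups.
import java.util.Objects;

public record Person(Integer id, String fname, String lname) {

    // Two persons are equal when their ids match; fname and lname are ignored.
    @Override
    public boolean equals(final Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        Person other = (Person) obj;
        return Objects.equals(id, other.id);
    }

    // Must agree with equals(): equal persons produce the same hash code.
    @Override
    public int hashCode() {
        return Objects.hashCode(id);
    }
}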
3.2. Demo
Let’s test the code. We will add a few duplicate person records to the List. Then we will use the Stream.distinct() method to find all instances of the Person class with a unique id.
Person lokeshOne = new Person(1, "Lokesh", "Gupta");
Person lokeshTwo = new Person(1, "Lokesh", "Gupta");
Person lokeshThree = new Person(1, "Lokesh", "Gupta");
Person brianOne = new Person(2, "Brian", "Clooney");
Person brianTwo = new Person(2, "Brian", "Clooney");
Person alex = new Person(3, "Alex", "Kolen");
Collection<Person> list = Arrays.asList(alex,
    brianOne,
    brianTwo,
    lokeshOne,
    lokeshTwo,
    lokeshThree);

// Get distinct people by id
List<Person> distinctElements = list.stream()
    .distinct()
    .collect( Collectors.toList() );

System.out.println( distinctElements );
Program output (line breaks added for readability; the elements keep the encounter order of the list):
[
Person[id=3, fname=Alex, lname=Kolen],
Person[id=2, fname=Brian, lname=Clooney],
Person[id=1, fname=Lokesh, lname=Gupta]
]
4. Find Distinct Objects by Complex Keys
We may not always want distinct items based on the natural equality rules. Sometimes the business needs distinct items based on custom logic. For example, we may need to treat people as duplicates when their full names are the same, regardless of their id. In this case, we must check the equality based on the fname and lname fields of the Person class.
The Stream API does not have a built-in method for finding distinct objects by comparing them with a user-provided function. So we will create our own utility function and then use it.
4.1. Create distinctByKey() Method
The distinctByKey() function uses a ConcurrentHashMap instance to find out whether a key with the same value has already been seen, where the key is obtained by applying the supplied function to each element.
The parameter to this function is a lambda expression used to generate the map key for the comparison. If the key is a custom type, do not forget to override its hashCode() and equals() methods.
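// Requires java.util.Map, java.util.concurrent.ConcurrentHashMap,
// java.util.function.Function and java.util.function.Predicate.
public static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor)
{
    // Remembers every key seen so far; ConcurrentHashMap rejects null keys.
    Map<Object, Boolean> map = new ConcurrentHashMap<>();
    // putIfAbsent() returns null only the first time a key is added.
    return t -> map.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}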
We can pass any field accessor as the argument, which causes that field's value to act as the key in the map. Because the returned predicate holds state, create a new one for each stream.
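As a sketch of passing an accessor directly, the example below uses a method reference as the key extractor. The wrapper class DistinctByFieldDemo, the helper distinctByLastName and the sample data are assumptions made for illustration; the nested Person record is a simplified stand-in without the custom equals().

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByFieldDemo {

    // Simplified stand-in for the article's Person type.
    record Person(Integer id, String fname, String lname) {}

    // Same utility as above, repeated so the example is self-contained.
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Keeps the first person encountered for each last name.
    static List<Person> distinctByLastName(List<Person> people) {
        return people.stream()
                .filter(distinctByKey(Person::lname))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person(1, "Lokesh", "Gupta"),
                new Person(2, "Amit", "Gupta"),
                new Person(3, "Alex", "Kolen"));

        // Persons 1 and 2 share a last name, so only person 1 is kept.
        System.out.println(distinctByLastName(people));
    }
}
```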
4.2. Demo
Check how we are using distinctByKey(p -> p.fname() + " " + p.lname()) in the filter() method. Since Person is a record, its accessors are fname() and lname() rather than getFname() and getLname().
Person lokeshOne = new Person(1, "Lokesh", "Gupta");
Person lokeshTwo = new Person(2, "Lokesh", "Gupta");
Person lokeshThree = new Person(3, "Lokesh", "Gupta");
Person brianOne = new Person(4, "Brian", "Clooney");
Person brianTwo = new Person(5, "Brian", "Clooney");
Person alex = new Person(6, "Alex", "Kolen");

Collection<Person> list = Arrays.asList(alex,
    brianOne,
    brianTwo,
    lokeshOne,
    lokeshTwo,
    lokeshThree);

// Get distinct objects by key
List<Person> distinctElements = list.stream()
    .filter( distinctByKey(p -> p.fname() + " " + p.lname()) )
    .collect( Collectors.toList() );

System.out.println( distinctElements );
Program output (line breaks added for readability; the first person encountered for each full name is kept):
[
Person[id=6, fname=Alex, lname=Kolen],
Person[id=4, fname=Brian, lname=Clooney],
Person[id=1, fname=Lokesh, lname=Gupta]
]
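Concatenating fields into a single string can make different name pairs collide (for example, "Ann Marie" + "Smith" and "Ann" + "Marie Smith" produce the same key). One possible alternative, sketched below, is to use a List of the fields as the composite key, since List implements equals() and hashCode() element by element. The class name CompositeKeyDemo, the helper distinctByFullName and the sample data are assumptions for illustration, and the sketch assumes the fields are never null because List.of() rejects null elements.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class CompositeKeyDemo {

    // Simplified stand-in for the article's Person type.
    record Person(Integer id, String fname, String lname) {}

    // Same utility as in section 4.1, repeated so the example is self-contained.
    static <T> Predicate<T> distinctByKey(Function<? super T, Object> keyExtractor) {
        Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Uses a two-element List as the key so each field is compared separately.
    static List<Person> distinctByFullName(List<Person> people) {
        return people.stream()
                .filter(distinctByKey(p -> List.of(p.fname(), p.lname())))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> people = List.of(
                new Person(1, "Ann Marie", "Smith"),
                new Person(2, "Ann", "Marie Smith"),
                new Person(3, "Ann", "Marie Smith"));

        // Persons 1 and 2 differ field by field, so both are kept; person 3 repeats person 2.
        System.out.println(distinctByFullName(people).size()); // 2
    }
}
```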
Happy Learning !!