Getting Distinct Stream Items by Comparing Multiple Fields

Learn to collect or count the distinct objects in a stream when distinctness is determined by comparing several fields of the class.

The Stream API has no direct support for finding items that are distinct by a chosen set of fields: Stream.distinct() relies on the element's equals() method and cannot be told to compare only selected fields. So, we will create a custom Predicate for this purpose.

1. Finding Distinct Items by Multiple Fields

The following function accepts a varargs parameter and returns a Predicate instance. We can pass it multiple key extractors, one for each field on which we want to filter duplicates.

For each stream item, the function builds a List of the extracted field values, and this List acts as a single composite key for that item. Two items have equal keys only when all of the extracted values are equal.

These keys are then inserted into a ConcurrentHashMap, which holds each key only once. Because putIfAbsent() returns null only when the key was not already present, the predicate passes just the first item seen for each combination of values.

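// Returns a stateful predicate that passes only the first item
// seen for each distinct combination of extracted key values.
@SafeVarargs
private static <T> Predicate<T>
    distinctByKeys(final Function<? super T, ?>... keyExtractors)
{
    final Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();

    return t ->
    {
      // The list of extracted values acts as one composite key
      final List<?> keys = Arrays.stream(keyExtractors)
                  .map(ke -> ke.apply(t))
                  .collect(Collectors.toList());

      // putIfAbsent() returns null only for a key not seen before
      return seen.putIfAbsent(keys, Boolean.TRUE) == null;
    };
}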
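The key mechanism can be checked in isolation. The following is a minimal sketch, not part of the original example: the class name CompositeKeyDemo and the helper firstTimeSeen are hypothetical names chosen for illustration. It shows that two lists holding equal values count as the same map key, so putIfAbsent() reports a combination as new only once.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class CompositeKeyDemo {

    // Hypothetical helper: returns true only the first time
    // a given combination of values is offered to the map.
    static boolean firstTimeSeen(Map<List<?>, Boolean> seen, Object... values) {
        // List.equals() and List.hashCode() work element by element,
        // so equal combinations of values resolve to the same key.
        List<?> key = Arrays.asList(values);
        return seen.putIfAbsent(key, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        System.out.println(firstTimeSeen(seen, "Brian", "Clooney")); // true
        System.out.println(firstTimeSeen(seen, "Brian", "Clooney")); // false
        System.out.println(firstTimeSeen(seen, "Brian", "Suxena"));  // true
    }
}
```

Because the map is created once and captured by the predicate, each call to distinctByKeys() should be used for a single stream pipeline; reusing the returned predicate would carry over the keys it has already seen.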

In the given example, we are finding all persons having distinct combinations of first name and last name. We should have only 3 records as output.

Collection<Person> list = Arrays.asList(alex, brianOne, 
        brianTwo, lokeshOne,
        lokeshTwo, lokeshThree);

List<Person> distinctPersons = list.stream()
      .filter(distinctByKeys(Person::firstName, Person::lastName))
      .collect(Collectors.toList());

Here Person may be a class or record.

record Person(Integer id, String firstName, String lastName, String email) {
}
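Putting the pieces together, here is a self-contained sketch of the approach that can be compiled and run as is (Java 16 or later, for records). The sample data is assumed for illustration only: the names, ids, and email addresses are hypothetical stand-ins for alex, brianOne, and the other instances, and the class name DistinctByKeysDemo is likewise invented. It also shows the counting variant.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeysDemo {

    record Person(Integer id, String firstName, String lastName, String email) {}

    @SafeVarargs
    static <T> Predicate<T> distinctByKeys(final Function<? super T, ?>... keyExtractors) {
        final Map<List<?>, Boolean> seen = new ConcurrentHashMap<>();
        return t -> {
            final List<?> keys = Arrays.stream(keyExtractors)
                    .map(ke -> ke.apply(t))
                    .collect(Collectors.toList());
            return seen.putIfAbsent(keys, Boolean.TRUE) == null;
        };
    }

    // Hypothetical sample data: six persons, three distinct full names.
    static List<Person> samplePersons() {
        return List.of(
                new Person(1, "Alex", "Kolen", "alex@example.com"),
                new Person(2, "Brian", "Clooney", "brian.one@example.com"),
                new Person(3, "Brian", "Clooney", "brian.two@example.com"),
                new Person(4, "Lokesh", "Gupta", "lokesh.one@example.com"),
                new Person(5, "Lokesh", "Gupta", "lokesh.two@example.com"),
                new Person(6, "Lokesh", "Gupta", "lokesh.three@example.com"));
    }

    // Collects the first person encountered for each (firstName, lastName) pair.
    static List<Person> distinctByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKeys(Person::firstName, Person::lastName))
                .collect(Collectors.toList());
    }

    // Counts the distinct (firstName, lastName) pairs without collecting them.
    static long countDistinctByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKeys(Person::firstName, Person::lastName))
                .count();
    }

    public static void main(String[] args) {
        List<Person> persons = samplePersons();
        distinctByName(persons).forEach(System.out::println);
        System.out.println(countDistinctByName(persons)); // 3
    }
}
```

On a sequential stream over an ordered source such as this list, the item retained for each key is the first one in encounter order. On a parallel stream the predicate is still thread-safe, but which of the duplicates is retained is not guaranteed.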

2. Distinct by Multiple Fields using Custom Key Class

Another possible approach is to have a custom class that represents the distinct key for the POJO class.

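For the previous example, we can create a record CustomKey containing the firstName and lastName values. The distinct elements of the list are then selected by the distinct combinations of values of these fields.

In the given example, we are again finding all records having a unique combination of first and last names. Note that in this approach, we are only replacing the List with the CustomKey type. It works as a map key because a record automatically provides equals() and hashCode() based on its components; a regular class used this way would have to override both methods.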

record CustomKey(String firstName, String lastName) {
  public CustomKey(final Person p) 
  {
    this(p.firstName(), p.lastName());
  }
}

Let us see how CustomKey::new is used for filtering the distinct elements from the list based on the given multiple fields.

Collection<Person> list = Arrays.asList(alex, brianOne, 
    brianTwo, lokeshOne,
    lokeshTwo, lokeshThree);

List<Person> distinctPersons = list.stream()
      .filter(distinctByKeyClass(CustomKey::new))
      .collect(Collectors.toList());

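// Method accepting a function that maps each item to its custom key.
// The key type must implement equals() and hashCode(), and the function
// must not return null, because ConcurrentHashMap rejects null keys.
public static <T> Predicate<T>
    distinctByKeyClass(final Function<? super T, ?> keyExtractor)
{
    final Map<Object, Boolean> seen = new ConcurrentHashMap<>();
    return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
}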
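For reference, the custom key approach can be assembled into one runnable sketch (Java 16 or later). As before, the sample persons and the class name DistinctByKeyClassDemo are assumptions made for illustration, not data from the original example.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.Predicate;
import java.util.stream.Collectors;

class DistinctByKeyClassDemo {

    record Person(Integer id, String firstName, String lastName, String email) {}

    // A record gets equals() and hashCode() from its components,
    // which is what makes it usable as a map key here.
    record CustomKey(String firstName, String lastName) {
        CustomKey(final Person p) {
            this(p.firstName(), p.lastName());
        }
    }

    static <T> Predicate<T> distinctByKeyClass(final Function<? super T, ?> keyExtractor) {
        final Map<Object, Boolean> seen = new ConcurrentHashMap<>();
        return t -> seen.putIfAbsent(keyExtractor.apply(t), Boolean.TRUE) == null;
    }

    // Hypothetical sample data: six persons, three distinct full names.
    static List<Person> samplePersons() {
        return List.of(
                new Person(1, "Alex", "Kolen", "alex@example.com"),
                new Person(2, "Brian", "Clooney", "brian.one@example.com"),
                new Person(3, "Brian", "Clooney", "brian.two@example.com"),
                new Person(4, "Lokesh", "Gupta", "lokesh.one@example.com"),
                new Person(5, "Lokesh", "Gupta", "lokesh.two@example.com"),
                new Person(6, "Lokesh", "Gupta", "lokesh.three@example.com"));
    }

    // Keeps the first person encountered for each CustomKey.
    static List<Person> distinctByName(List<Person> persons) {
        return persons.stream()
                .filter(distinctByKeyClass(CustomKey::new))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Person> distinct = distinctByName(samplePersons());
        System.out.println(distinct.size()); // 3
    }
}
```

Compared with the List-based key, a dedicated key type names the fields that define uniqueness and keeps them type-checked, at the cost of one extra declaration per combination of fields.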

Happy Learning !!
