This Java tutorial discusses the different ways to iterate over a HashMap and compares the performance of each technique so we can make an informed decision.
It is recommended to learn how HashMap works internally to know how the data is stored in a HashMap. In my last similar post, we compared the different “for loops” available in Java. These studies usually help in setting up best practices for your next project.
1. Introduction
In this post, I decided to compare the performance of different ways to traverse a HashMap in Java. HashMap is a very widely used class, and most of the time, we fetch a value using the get(Object key) method provided by the class. But it is sometimes required to iterate over the whole Map and fetch all key-value pairs stored in it.
For example, analyzing all request parameters sent from the client. In such cases, for every client, we iterate over the whole map at least once during request processing.
If we use this type of iteration in many places in the code and there are many requests, then we would surely like to optimize the HashMap iteration code. The analysis below will help us decide our next step.
2. Different Ways to Iterate over a Map
Let us start with different ways to iterate over HashMap. The HashMap has been defined as follows:
Map<String, Integer> map = new HashMap<>();
2.1. HashMap.forEach()
Since Java 8, we can use the forEach() method to iterate over the keys and values stored in Map.
map.forEach((key, value) -> {
System.out.println(key + ": " + value);
//...
});
2.2. Iterating over Map Entries
The entrySet() method returns all the entries as a Set view that we can iterate over, similar to a normal HashSet in Java.
for (Map.Entry<String, Integer> entry : map.entrySet()) {
String key = entry.getKey();
Integer value = entry.getValue();
//...
}
2.3. Iterating over Map Keys and Accessing Values
The keySet() method returns all keys as a Set. We can iterate over all the keys and use them to access the values from the Map. Note that it requires an additional lookup to access each value, so in most cases, it is not the recommended approach.
for (String key : map.keySet()) {
Integer value = map.get(key);
//...
}
2.4. Using Iterator
As we get the Set view from entrySet() and keySet(), we can use the Iterator to iterate through the Map.
The following code iterates over the Map entries using the Iterator.
Iterator<Map.Entry<String, Integer>> entryIterator = map.entrySet().iterator();
while (entryIterator.hasNext()) {
Map.Entry<String, Integer> entry = entryIterator.next();
String key = entry.getKey();
Integer value = entry.getValue();
//...
}
Similarly, the following code iterates over the Map keys using the Iterator and accesses the values using keys.
Iterator<String> keySetIterator = map.keySet().iterator();
while (keySetIterator.hasNext()) {
String key = keySetIterator.next();
Integer value = map.get(key);
//...
}
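The techniques above can be put side by side in one small program. This is only a sketch: the class name MapIterationDemo and its method names are made up for illustration, and each method simply sums the values so that the results are easy to compare.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class: sums the values of a map with each iteration technique.
class MapIterationDemo {

    // 2.1: forEach() with a lambda. A one-element array holds the running total
    // because a lambda can only capture effectively final local variables.
    static long sumWithForEach(Map<String, Integer> map) {
        long[] total = {0};
        map.forEach((key, value) -> total[0] += value);
        return total[0];
    }

    // 2.2: for-each loop over the entry set.
    static long sumWithEntrySet(Map<String, Integer> map) {
        long total = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            total += entry.getValue();
        }
        return total;
    }

    // 2.3: for-each loop over the key set, with one get() per key.
    static long sumWithKeySet(Map<String, Integer> map) {
        long total = 0;
        for (String key : map.keySet()) {
            total += map.get(key);
        }
        return total;
    }

    // 2.4: explicit Iterator over the entry set.
    static long sumWithEntryIterator(Map<String, Integer> map) {
        long total = 0;
        Iterator<Map.Entry<String, Integer>> iterator = map.entrySet().iterator();
        while (iterator.hasNext()) {
            total += iterator.next().getValue();
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 1; i <= 5; i++) {
            map.put(String.valueOf(i), i);
        }
        // All four calls print the same total for the same map.
        System.out.println(sumWithForEach(map));
        System.out.println(sumWithEntrySet(map));
        System.out.println(sumWithKeySet(map));
        System.out.println(sumWithEntryIterator(map));
    }
}
```

All four methods visit every entry exactly once. They differ only in who drives the loop and in whether the value needs a separate lookup.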
3. Comparing the Performance of Different Techniques
Now let us compare the performance of all types of iteration for a common data set stored in the Map. We are storing 1 million key-value pairs in the Map.
Map<String, Integer> map = new HashMap<>();
for (int i = 0; i < 1_000_000; i++) {
map.put(String.valueOf(i), i);
}
We will iterate over the map in all four ways. We will also fetch the key and value from the map for all entries in the most suitable way. We are using JMH for benchmarking the performance of all the methods.
@Benchmark
public void usingForEach(Blackhole blackhole) {
map.forEach((key, value) -> {
blackhole.consume(key);
blackhole.consume(value);
});
}
@Benchmark
public void usingEntrySetWithForLoop(Blackhole blackhole) {
for (Map.Entry<String, Integer> entry : map.entrySet()) {
blackhole.consume(entry.getKey());
blackhole.consume(entry.getValue());
}
}
@Benchmark
public void usingKeySetWithForLoop(Blackhole blackhole) {
for (String key : map.keySet()) {
blackhole.consume(map.get(key));
}
}
@Benchmark
public void usingEntrySetWithForIterator(Blackhole blackhole) {
Iterator<Map.Entry<String, Integer>> entryIterator = map.entrySet().iterator();
while (entryIterator.hasNext()) {
Map.Entry<String, Integer> entry = entryIterator.next();
blackhole.consume(entry.getKey());
blackhole.consume(entry.getValue());
}
}
4. Result
The benchmark scores of the above program are:
Benchmark                                             Mode  Cnt   Score   Error  Units
c.h.c.collections.map...usingForEach                  thrpt  15  89.044 ± 2.703  ops/s
c.h.c.collections.map...usingEntrySetWithForLoop      thrpt  15  54.906 ± 6.326  ops/s
c.h.c.collections.map...usingKeySetWithForLoop        thrpt  15  52.163 ± 4.517  ops/s
c.h.c.collections.map...usingEntrySetWithForIterator  thrpt  15  63.494 ± 4.334  ops/s
Although the differences between the above methods are not large, we can see that:
- Using a for loop with keySet() and with entrySet() performs almost equally, and their error margins overlap.
- Using the entrySet() iterator() explicitly scores somewhat higher than the two for loops.
- The forEach() method achieves the highest throughput. The scores are in operations per second, so a higher score means faster iteration. A likely reason is that the forEach method uses internal iteration, which means that the iteration logic is handled within the HashMap implementation. It walks the internal table directly and calls the provided lambda expression for each element in the HashMap, without creating an Iterator or calling hasNext() and next() for every element.
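Because the scores above are JMH throughput figures (mode thrpt, unit ops/s), it can help to turn them into the time taken by one full pass over the map. The following sketch does that arithmetic; the class and method names are illustrative only, and the scores are copied from the result listing.

```java
// Hypothetical helper: converts throughput scores (ops/s) into milliseconds per pass.
class ThroughputToLatency {

    // One operation here is one complete iteration over the map,
    // so 1000 / (ops per second) gives milliseconds per iteration.
    static double millisPerOp(double opsPerSecond) {
        return 1000.0 / opsPerSecond;
    }

    public static void main(String[] args) {
        String[] names = {
            "usingForEach",
            "usingEntrySetWithForLoop",
            "usingKeySetWithForLoop",
            "usingEntrySetWithForIterator"
        };
        // Scores taken from the benchmark listing above.
        double[] scores = {89.044, 54.906, 52.163, 63.494};

        for (int i = 0; i < names.length; i++) {
            System.out.printf("%-30s %.2f ms per pass%n", names[i], millisPerOp(scores[i]));
        }
    }
}
```

With these numbers, one pass over the million entries takes roughly 11 ms for the highest score and roughly 19 ms for the lowest, so the gap per pass is a few milliseconds.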
For the sake of clean coding, we can conclude that:
- If the Map contains only a few entries, using HashMap.forEach() is the best way to iterate over a HashMap.
- If the Map contains a million or more entries, iterating over HashMap.entrySet() is a solid choice, since each entry gives direct access to both the key and the value.
5. Conclusion
One million (10 lakh) entries is a very big number for most application requirements. The difference between the techniques is not very substantial in milliseconds, unlike the case of for-loops, where it was very big. I believe most of us can live with such a minor difference.
But if you want a single takeaway, iterating over the entry set is the more powerful option, since it provides the key and the value together, and it performs at least as well as iterating over the key set and calling get(). The results vary by 20% to 50% when the above program is executed multiple times.
Please do let me know your thoughts about the above analysis.
Happy Learning !!
So, I’ve found that entry set is only faster after the first loop through (as the entry objects are cached, I’d assume).
Using entrySet() in for-each loop for key/value #1: 95
Using entrySet() in for-each loop for key/value #2: 66
Using keySet() in for-each loop for key/value: 68
Using entrySet() in for-each loop for key: 33
Using keySet() in for-each loop for key: 31
Using entrySet() in for-each loop for value: 33
Using values() in for-each loop for value: 33
I’m assuming your example is “optimizing” a bit since many of your methods have no-op code in the loop itself. Add a System.out.println for each of your values so Java can’t ignore what’s happening in the loop and you’ll probably get different results. Also, if you really want to do timing, don’t use the Calendar. System.nanoTime() is much more accurate. (System.currentTimeMillis() isn’t guaranteed to be consistently as accurate, last I checked, anyway.)
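A rough sketch of what this comment suggests: time the loops with System.nanoTime() and make each loop produce a result that is stored somewhere, so the JIT compiler has less room to discard the work. The class name NaiveTimingSketch and the static sink field are assumptions made for this illustration, and hand-rolled timing like this is only indicative; it is not a replacement for JMH.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical hand-rolled timing, shown only to illustrate the comment above.
class NaiveTimingSketch {

    // Each loop stores its result here so the work has an observable effect.
    static long sink;

    static long timeEntrySetNanos(Map<String, Integer> map) {
        long start = System.nanoTime();
        long total = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            total += entry.getKey().length() + entry.getValue();
        }
        sink = total;
        return System.nanoTime() - start;
    }

    static long timeKeySetNanos(Map<String, Integer> map) {
        long start = System.nanoTime();
        long total = 0;
        for (String key : map.keySet()) {
            total += key.length() + map.get(key);
        }
        sink = total;
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        for (int i = 0; i < 1_000_000; i++) {
            map.put(String.valueOf(i), i);
        }
        // Several rounds: the first ones include JIT warm-up and are usually slower.
        for (int round = 1; round <= 5; round++) {
            long entrySetMs = timeEntrySetNanos(map) / 1_000_000;
            long keySetMs = timeKeySetNanos(map) / 1_000_000;
            System.out.printf("round %d: entrySet %d ms, keySet %d ms%n", round, entrySetMs, keySetMs);
        }
    }
}
```

Running several rounds also shows the effect described in the earlier comment, where the first loop is slower than later ones.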
It’s always better to use JMH for nano/microbenchmarks ;)
https://gist.github.com/AFulgens/42ef34d625dd5b00f62d9ed77e727ccb
Bottom line: I would say it’s a best practice to use `entrySet()` over `keySet()` when you are iterating over a Map.
Thanks for sharing your analysis, and confirming my conclusion.
Hi Lokesh,
I am a little confused about how the entrySet method works.
public Set<Map.Entry<K,V>> entrySet() {
return entrySet0();
}
private Set<Map.Entry<K,V>> entrySet0() {
Set<Map.Entry<K,V>> es = entrySet;
return es != null ? es : (entrySet = new EntrySet());
}
When are the values assigned into this entrySet? I can see that it only calls the constructor, which is the default one. Initially this entrySet is null, but when are the entries actually put into it?
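One way to look at this question: the Set returned by entrySet() is documented as a view backed by the map, so entries are not copied into it. The sketch below only demonstrates that documented behaviour through the public API; the class and method names are made up for illustration, and it does not show the internals of any particular JDK version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical demo of the "view" behaviour of entrySet().
class EntrySetViewDemo {

    // The view is obtained while the map is still empty, yet it reflects later puts.
    static int sizeSeenThroughEarlierView() {
        Map<String, Integer> map = new HashMap<>();
        Set<Map.Entry<String, Integer>> entries = map.entrySet();
        map.put("a", 1);
        map.put("b", 2);
        return entries.size();
    }

    // Changing a value through an entry writes through to the map itself.
    static int valueAfterWriteThrough() {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            entry.setValue(entry.getValue() * 10);
        }
        return map.get("a");
    }

    public static void main(String[] args) {
        System.out.println(sizeSeenThroughEarlierView()); // 2
        System.out.println(valueAfterWriteThrough());     // 10
    }
}
```

So nothing has to be "assigned into" the set: it reads the map's own entries whenever it is iterated or queried.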
The keySet() iterator is bound to be slower, since there is an extra step to look up each value.
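The extra step can be made visible by counting calls to get(). In this sketch, CountingMap is a hypothetical HashMap subclass written only for the demonstration; it is not part of the JDK.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical HashMap subclass that counts how often get() is called.
class CountingMap<K, V> extends HashMap<K, V> {
    int getCalls = 0;

    @Override
    public V get(Object key) {
        getCalls++;
        return super.get(key);
    }
}

class LookupCountDemo {

    // Iterating keys and fetching each value costs one get() per entry.
    static int lookupsViaKeySet(CountingMap<String, Integer> map) {
        map.getCalls = 0;
        for (String key : map.keySet()) {
            Integer value = map.get(key);
        }
        return map.getCalls;
    }

    // Iterating entries reads the value from the entry, with no get() at all.
    static int lookupsViaEntrySet(CountingMap<String, Integer> map) {
        map.getCalls = 0;
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            Integer value = entry.getValue();
        }
        return map.getCalls;
    }

    public static void main(String[] args) {
        CountingMap<String, Integer> map = new CountingMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        System.out.println(lookupsViaKeySet(map));   // 3
        System.out.println(lookupsViaEntrySet(map)); // 0
    }
}
```

Each of those extra get() calls hashes the key and searches its bucket again, which is the additional cost the comment refers to.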
Great comparison!
Thanks for that! That helped me to solve a problem I had using Primefaces bar charts… thanks!
Hi Lokesh,
Firstly, a brilliant article on how HashMap works.
QUERY: Will there be an issue if I create the HashMap inside the static block itself?
I don’t see any adverse effect on the HashMap when you create it inside a static block. The only thing that concerns me is that the HashMap instance will also be static, so all the objects it refers to, as either keys or values, will not be garbage collected for a longer time. Usually, static variables in an application live longer than their non-static counterparts.
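For reference, a minimal sketch of a HashMap populated inside a static block. The class StatusCodes and its contents are made up for this example; wrapping the map with Collections.unmodifiableMap() is an optional extra that prevents later modification through the published reference.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical lookup table that is built once, when the class is initialized.
class StatusCodes {

    static final Map<Integer, String> NAMES;

    static {
        Map<Integer, String> names = new HashMap<>();
        names.put(200, "OK");
        names.put(404, "Not Found");
        // Publish a read-only view so callers cannot add or remove entries.
        NAMES = Collections.unmodifiableMap(names);
    }

    public static void main(String[] args) {
        System.out.println(NAMES.get(404)); // Not Found
    }
}
```

As noted in the reply above, the map and everything it references stay reachable for as long as the class stays loaded.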
Got the point. :) :)
But the first thing still holds true: the article was immensely helpful.
Please explain it a bit more, i.e. in the same way you explained the loop concepts.
Hi Lokesh,
As per my analysis, if you want to retrieve a value from the Map based on a “key” search, then keySet would be a little faster. But if you want to parse the whole Map, then entrySet would be 30 to 40% faster, depending on data size. If required, I can post the code as well.
Please share.
Hi Lokesh,
I also tried to find the best way to parse a Map and found that results are better with keySet. Here I am posting the code that I used for my analysis. Kindly have a look…and please let me know if you find something valuable to share…Thanks in advance..
—————————————————————————————————–
package com.himanshu;

import java.util.HashMap;
import java.util.Map;
import java.util.Map.Entry;

public class Test
{
    // Key whose value each method looks for while scanning the whole map
    static Integer finalKey = 10000;

    public static void main(String[] args)
    {
        Map<Integer, String> map = new HashMap<>();
        for (int i = 1; i <= 1000000; i++)
        {
            map.put(i, "a");
        }
        long first = parseThroughEntrySet(map);
        long second = parseThroughKeySet(map);
        System.out.println("First :" + first);
        System.out.println("Second :" + second);
    }

    // Scans all entries and returns the elapsed time in milliseconds
    static long parseThroughEntrySet(Map<Integer, String> map)
    {
        long startTime = System.currentTimeMillis();
        String value;
        for (Entry<Integer, String> entry : map.entrySet())
        {
            Integer key = entry.getKey();
            if (key.equals(finalKey))
            {
                value = entry.getValue();
            }
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        return totalTime;
    }

    // Scans all keys, looks up the matching value, and returns the elapsed time in milliseconds
    static long parseThroughKeySet(Map<Integer, String> map)
    {
        long startTime = System.currentTimeMillis();
        String value;
        for (Integer key : map.keySet())
        {
            if (key.equals(finalKey))
            {
                value = map.get(key);
                System.out.println("Done in parseThroughKeySet !!!" + value);
            }
        }
        long endTime = System.currentTimeMillis();
        long totalTime = endTime - startTime;
        return totalTime;
    }
}
————————————————————————————————————————————–
It would be good if you could explain the reason for the performance differences.
Hi ,
I have tried this and got different results every time. However, I find that the results are better with the key set; I even read somewhere that the key set is better than the entry set.
Regards,
chandra
Can you please share the code you used?