How hashmap works in java

Most of you will agree that HashMap is most favorite topic for discussion in interviews now-a-days. I have gone through several discussions with my colleagues time to time and it really helped. Now, I am continuing this discussion with you all.

I am assuming that if you are interested in internal working of HashMap, you already know basics of HashMap, so i am skipping that part. But if you are new to concept, follow official java docs.

Before moving forward, i will highly recommend reading my previous post : Working with hashCode and equals methods in java

Sections in this post

    1) Single statement answer
    2) What is Hashing
    3) A little about Entry class
    4) What put() method actually does
    5) How get() methods works internally
    6) Key Notes
    7) [Update] Improvements in Java 8

Single statement answer

If anybody asks me to describe “How HashMap works?“, I simply answer: “On principle of Hashing“. As simple as it is. Now before answering it, one must be very sure to know at least basics of Hashing. Right??

What is Hashing

Hashing in its simplest form, is a way to assigning a unique code for any variable/object after applying any formula/algorithm on its properties. A true Hashing function must follow this rule:

Hash function should return the same hash code each and every time, when function is applied on same or equal objects. In other words, two equal objects must produce same hash code consistently.

All objects in java inherit a default implementation of hashCode() function defined in Object class. This function produce hash code by typically converting the internal address of the object into an integer, thus producing different hash codes for all different objects.

Read more here : Working with hashCode() and equals() methods in java

A little about Entry class

A map by definition is : “An object that maps keys to values”. Very easy.. right?

So, there must be some mechanism in HashMap to store this key value pair. Answer is YES. HashMap has an inner class Entry, which looks like this:

static class Entry<K ,V> implements Map.Entry<K ,V>
{
	final K key;
	V value;
	Entry<K ,V> next;
	final int hash;
	...//More code goes here
}

Surely Entry class has key and value mapping stored as attributes. Key has been marked as final and two more fields are there: next and hash. We will try to understand the need of these fields as we go forward.

What put() method actually does

Before going into put() method’s implementation, it is very important to learn that instances of Entry class are stored in an array. HashMap class defines this variable as:

	/**
	 * The table, resized as necessary. Length MUST Always be a power of two.
	 */
	transient Entry[] table;

Now look at code implementation of put() method:

/**
* Associates the specified value with the specified key in this map. If the
* map previously contained a mapping for the key, the old value is
* replaced.
*
* @param key
*            key with which the specified value is to be associated
* @param value
*            value to be associated with the specified key
* @return the previous value associated with <tt>key</tt>, or <tt>null</tt>
*         if there was no mapping for <tt>key</tt>. (A <tt>null</tt> return
*         can also indicate that the map previously associated
*         <tt>null</tt> with <tt>key</tt>.)
*/
public V put(K key, V value) {
	if (key == null)
	return putForNullKey(value);
	int hash = hash(key.hashCode());
	int i = indexFor(hash, table.length);
	for (Entry<K , V> e = table[i]; e != null; e = e.next) {
		Object k;
		if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
			V oldValue = e.value;
			e.value = value;
			e.recordAccess(this);
			return oldValue;
		}
	}

	modCount++;
	addEntry(hash, key, value, i);
	return null;
}

Lets note down the steps one by one:

1) First of all, key object is checked for null. If key is null, value is stored in table[0] position. Because hash code for null is always 0.

2) Then on next step, a hash value is calculated using key’s hash code by calling its hashCode() method. This hash value is used to calculate index in array for storing Entry object. JDK designers well assumed that there might be some poorly written hashCode() functions that can return very high or low hash code value. To solve this issue, they introduced another hash() function, and passed the object’s hash code to this hash() function to bring hash value in range of array index size.

3) Now indexFor(hash, table.length) function is called to calculate exact index position for storing the Entry object.

4) Here comes the main part. Now, as we know that two unequal objects can have same hash code value, how two different objects will be stored in same array location [called bucket].
Answer is LinkedList. If you remember, Entry class had an attribute “next”. This attribute always points to next object in chain. This is exactly the behavior of LinkedList.

So, in case of collision, Entry objects are stored in LinkedList form. When an Entry object needs to be stored in particular index, HashMap checks whether there is already an entry?? If there is no entry already present, Entry object is stored in this location.

If there is already an object sitting on calculated index, its next attribute is checked. If it is null, and current Entry object becomes next node in LinkedList. If next variable is not null, procedure is followed until next is evaluated as null.

What if we add the another value object with same key as entered before. Logically, it should replace the old value. How it is done? Well, after determining the index position of Entry object, while iterating over LinkedList on calculated index, HashMap calls equals method on key object for each Entry object. All these Entry objects in LinkedList will have similar hash code but equals() method will test for true equality. If key.equals(k) will be true then both keys are treated as same key object. This will cause the replacing of value object inside Entry object only.

In this way, HashMap ensure the uniqueness of keys.

How get() methods works internally

Now we have got the idea, how key-value pairs are stored in HashMap. Next big question is : what happens when an object is passed in get method of HashMap? How the value object is determined?

Answer we already should know that the way key uniqueness is determined in put() method , same logic is applied in get() method also. The moment HashMap identify exact match for the key object passed as argument, it simply returns the value object stored in current Entry object.

If no match is found, get() method returns null.

Let have a look at code:

/**
* Returns the value to which the specified key is mapped, or {@code null}
* if this map contains no mapping for the key.
*
* <p>
* More formally, if this map contains a mapping from a key {@code k} to a
* value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise it returns
* {@code null}. (There can be at most one such mapping.)
*
* </p><p>
* A return value of {@code null} does not <i>necessarily</i> indicate that
* the map contains no mapping for the key; it's also possible that the map
* explicitly maps the key to {@code null}. The {@link #containsKey
* containsKey} operation may be used to distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
	if (key == null)
	return getForNullKey();
	int hash = hash(key.hashCode());
	for (Entry<K , V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
		Object k;
		if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
			return e.value;
	}
	return null;
}

Above code is same as put() method till if (e.hash == hash && ((k = e.key) == key || key.equals(k))), after this simply value object is returned.

Key Notes

  1. Data structure to store Entry objects is an array named table of type Entry.
  2. A particular index location in array is referred as bucket, because it can hold the first element of a LinkedList of Entry objects.
  3. Key object’s hashCode() is required to calculate the index location of Entry object.
  4. Key object’s equals() method is used to maintain uniqueness of Keys in map.
  5. Value object’s hashCode() and equals() method are not used in HashMap’s get() and put() methods.
  6. Hash code for null keys is always zero, and such Entry object is always stored in zero index in Entry[].

[Update] Improvements in Java 8

As part of the work for JEP 180, there is a performance improvement for HashMap objects where there are lots of collisions in the keys by using balanced trees rather than linked lists to store map entries. The principal idea is that once the number of items in a hash bucket grows beyond a certain threshold, that bucket will switch from using a linked list of entries to a balanced tree. In the case of high hash collisions, this will improve worst-case performance from O(n) to O(log n).

Basically when a bucket becomes too big (currently: TREEIFY_THRESHOLD = 8), HashMap dynamically replaces it with an ad-hoc implementation of tree map. This way rather than having pessimistic O(n) we get much better O(log n).

Bins (elements or nodes) of TreeNodes may be traversed and used like any others, but additionally support faster lookup when overpopulated. However, since the vast majority of bins in normal use are not overpopulated, checking for existence of tree bins may be delayed in the course of table methods.

Tree bins (i.e., bins whose elements are all TreeNodes) are ordered primarily by hashCode, but in the case of ties, if two elements are of the same “class C implements Comparable<C>“, type then their compareTo() method is used for ordering.

Because TreeNodes are about twice the size of regular nodes, we use them only when bins contain enough nodes. And when they become too small (due to removal or resizing) they are converted back to plain bins (currently: UNTREEIFY_THRESHOLD = 6). In usages with well-distributed user hashCodes, tree bins are rarely used.

I hope, i have correctly communicated my thoughts by this article. If you find any difference or need any help in any point, please drop a comment.

Happy Learning !!

142 thoughts on “How hashmap works in java”
  1. I know in case of two similar keys, Hashmap will replace old key value pair to new key value pair, now see following code

    HashMap hashMap = new HashMap();
    map.put(1, 1);
    map.put(1, 2);

    Now when I will do map.get(1) I will get answer 2, but what if for map.get(1) I want answer as 1(first value inserted)?

  2. If for example I have a hashMap which in it I have instances of Student Class objects and also instaces of Teachers class. Would it be possible that I get only the instances of teacher objects?

    Then my second part of the question is…. would it be possible that within the same method if find the highest salary of one of the teachers?

    Thanks for your help :)

  3. Hi Lokesh,
    I have small doubt regarding put and get method.

    int hash = hash(key.hashCode());

    here i’m not able to understand the line of code means(hash(—-hash code–)).

    1. “2) Then on next step, a hash value is calculated using key’s hash code by calling its hashCode() method”. Reason is given after this in post itself. Please read and let me know if still have doubt.

      1. Hi Lokesh,
        I have small doubt about HashSet. Hashset does not allow duplicate values. HashMap works internally in hashset, is it correct?

          1. i have found the code of HashSet class in rt.jar.

            public class HashSet
            extends AbstractSet
            implements Set, Cloneable, java.io.Serializable

            {
            private transient HashMap map;

            // some code

            private static final Object PRESENT = new Object();

            public HashSet() {
            map = new HashMap();
            }

            // some code ,i.e Other methods in Hash Set

            public boolean add(E e) {
            return map.put(e, PRESENT)==null;
            }
            //some code ——————-
            }

            Based on above code, i think when i create object of HashSet internally another object(hashmap) is created. So hashset internally uses hashmap.
            If i’m wrong please correct me.
            Thanks in Advance

            1. Apologies :-( I misunderstood your statement “HashMap works internally in hashset”. I thought of opposite of what you are saying. Sorry for that.

              Yes, you are correct. Because Map does not allow duplicate keys, so set does not allow duplicate values. Set store it’s values – as key inside map.

  4. hi Lokesh, just tell me hashmap internally uses overriding equals() and hashCode().is there any situation where we need to write these two methods in our hashmap program ???or only hashmap internally uses it???

  5. Hi Lokesh… I found below paragraph on blogspot.com for the same question.

    While rehashing, if two thread at the same time found that now HashMap needs resizing and they both try to resizing. on the process of resizing of HashMap, the element in bucket which is stored in linked list get reversed in order during there migration to new bucket because HashMap doesn’t append the new element at tail instead it append new element at head to avoid tail traversing. If race condition happens then you will end up with an infinite loop.

    Please explain above paragraph. I failed to understand below points –
    1. Why list is getting reversed while rehashing. I mean what advantage we got ?
    2. Why elements are getting added at front end of list ? Why not tail end ?

    1. what this basically means is that tail traversing is a concept in linked list. i will try to explain this with an example. lets say you want to add the following elements to a singly linked list

      23,65,44,12,90

      Ok fine now. you have added 5 elements. so after some time , you need to add a new element 10. so if our algorithm adds elements to end of linked list, it has to traverse through theese five elements to find the tail which can be quite expensive in case of lengthy linked lists. so one efficient way is to add new elements to head instead of tail and change the head pointer to point to new head.so in this case when you add a new element 10, linked list would look like following

      10,23,65,44,12,90

      As you can see, this is a very efficient approach.

      I will answer your second question now(What do they mean by reversing?) So in hashmap when they resize/rehash, they pull up elements from linked list starting from head and make a new linked list and add subsequent elements in order so on each iteration the result will be

      10
      23 10
      65 23 10
      44 65 23 10
      12 44 65 23 10

      90 12 44 65 23 10

      so this is the result of adding new elements to head In short this is a LIFO(Last in First out) structure.

  6. hi
    i just want to know that looping through the hashmap would be faster or through a for loop. I think for loop but just need an expert opinion on this.

  7. The article is really helpful Lokesh.
    I have a querry,
    you said objects are stored in LinkedList in a single bucket. is it like this
    K!V–>K!V–>null
    ?
    Pkease clarify.

  8. Is it possible to write a program in which we can put all elements in same bucket. like overiding some of the methods..

    1. Absolutely. Look at the sourcecode. public Object put(Object obj, Object obj1) is public so you can override it. It uses indexFor(i, table.length) method call for getting the index location. Pass two constants values in this function and it will return the same location everytime; means all elements in same bucket.
      But other operations does not seem so simple. You will need to override a lot of methods to work it properly.

  9. Hello Lokesh,

    Thanks for sharing this nice article. I have a small doubt or say disagreement regarding put method explanation here –

    “So, in case of collision, Entry objects are stored in LinkedList form. When an Entry object needs to be stored in particular index, HashMap checks whether there is already an entry?? If there is no entry already present, Entry object is stored in this location.

    If there is already an object sitting on calculated index, its next attribute is checked. If it is null, and current Entry object becomes next node in LinkedList. If next variable is not null, procedure is followed until next is evaluated as null.”

     public V put(K key, V value) {
            if (key == null)
                return putForNullKey(value);
            int hash = hash(key.hashCode());
            int i = indexFor(hash, table.length);
            for (Entry<k , V> e = table[i]; e != null; e = e.next) {
                Object k;
                if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                    V oldValue = e.value;
                    e.value = value;
                    e.recordAccess(this);
                    return oldValue;
                }
            }
     
            modCount++;
            addEntry(hash, key, value, i);
            return null;
        }
    
    

    So, that for loop in the put will run on a LinkedList at a particular bucket index, and the only purpose of that for loop is to identify if key already exists and if it is replace that and return the old value.

    and if key doesn’t exist it will invoke

     addEntry(hash, key, value, i);  

    i.e it will add entry to the head of linkedlist.

    Hope you got my point, Please let me know if above explanation has some flaws ?

    1. Hi Saurabh, yes you are right. Correct wording should be “added to head of linkedlist”. Thanks for suggesting this small but critical language change.

  10. Hi Lokesh,
    Your article is very informative. I just have one question.
    If the state of an object is changed, the hashCode of that object will be re-calculate and changed automatically or will stay same until calculated again explicitly?

      1. Great care must be exercised if mutable objects are used as set
        * elements. The behavior of a set is not specified if the value of an object
        * is changed in a manner that affects equals comparisons while the
        * object is an element in the set,,,,,,, Does the same not apply for Map

  11. Hi Lokesh,
    It’s the best explanation I ever came across. I have a small doubt. You mentioned in one of your above notes:
    “If they have different hashcodes, they may or may not be in same bucket. Anything is possible. You see if there are 100 objects, all having different hashcodes. Will you need a hashtable of size 100?? NO. you do not need. You can accommodate them in a hashtable of size 10 also. It means some of objects with different hashcodes will come into single bucket.”.

    How is this implemented ? if there are 100 objects, all having different hashcodes, then how can they go on same/single bucket (Entry type array’s element) ???

  12. Hi sir,Iam Having Small Doubt related to Map.Question is like,if hashmap found 2 same keys,then it will override old value with the latest value .but before overriding with new value,is it possible to take or collect the old value from hashmap sir.Correct me if iam Wrong.Thanks Venkata Sriram

    1. Hi Venkata, It’s really good question. And infact easy as well. If you go through java docs of HashMap the put() method is given as:

      public V put(K key, V value)
      Parameters:
      key – key with which the specified value is to be associated
      value – value to be associated with the specified key
      Returns:
      the previous value associated with key, or null if there was no mapping for key. (A null return can also indicate that the map previously associated null with key.)

      It returns the old value associated with key; IF any. I quickly ran below lines to write an example:

      public static void main(String[] args)
      {
        HashMap<String, String> map = new HashMap<String, String>();
        System.out.println(map.put("data", "initial value"));
        System.out.println(map.put("data", "new value"));
      }
      

      It outputs as:

      null
      initial value

      So, when you write the “new value” then it returns the “initial value”.

  13. Rakesh ,

    What is important is not only the hashcode but the number of buckets(Entry table array element is called bucket).Depending on the size of HashMap ie the number of buckets, the objects hashcode , hash and indexFor function will try to put the object in one of the buckets. In case of HashMap of size 16 only lower 4 bits of hash value is considered and in case of size 32 lower 5 bits are considered for &(byte operator) in indexFor function.I have done analysis of the same.
    Please check this blog for a demo of what I have mentioned above. Lokesh , Please correct me if my understanding is wrong somewhere.

    http://ganeshrashinker.blogspot.in/

  14. Note: All objects in java inherit a default implementation of hashCode() function defined in Object class. This function produce hash code by typically converting the internal address of the object into an integer, thus producing different hash codes for all different objects.

    I disagree to the above statement. It is simply not possible…

    Lets say, I have an Employee class….

    ANd I create 1 trillion trillion different objects of Employee class.
    But HashCode range is limited to the size of Integer in java..

    So there will be 2 different objects, which will have same hashcode…So the statement you’ve given above, cannot be always applied… Though it can be strived to do so.

    1. Hi Yogesh, there are other things to consider as well i.e. garbage collection. You will need a really big heap to store so many objects, and other objects to keep them referenced so that GC should not claim them to free memory. I am pretty sure that you will get OutOfMemoryError much before trying to create so many objects.
      Another argument can be that hashCode() comes into consideration only when you store all these objects in hash backed collection such as HashMap. As HashMap uses fixed length array to store key-value pairs, then the array size will require a continuous memory location of really big size. It will again result in OutOfMemoryError.

  15. when i put three object a1,a2,a3 in the hash map how they store in table? my doubt is when i get a1 than no equals will call but i get a2 than equals call one times and when i get a3 than equals will call two time why and how???

    I have create class A and override equals that will return false and hashcode return 1.

  16. Hashmap values of existing keys getting overwritten upon using put to store a new key-value pair. how to prevent it?

  17. Hi Lokesh Nice Explaination. Could you please tell me what is the default size of the Entry[] table array?Thanks.

  18. Good Job Lokesh and looking forward to hear more topics from you. Excellent Stuff . A must read for all Java Developers….

  19. Hi LOkesh,

    In the below snippet both the lines are used for calculating the indexing for storing the entry object, then do we really require ‘indexFor()’ method as we already got the index from the ‘hash()’ method.
    If yes, please explain the significance of the ‘indexFor()’.

    int hash = hash(key.hashCode());
    int i = indexFor(hash, table.length);

  20. Hi Lokesh,

    Here, indexFor(hash,table.length) Never give the exact position for storing Entry Object.

    for different hash code index number could be the same. what i actually want to say plz look into below code…

    i have created two function indexFor() and for hash() is code(), my question is here for every object hascode is different but index position is same, even hash code is not same…so the rule or contract is that if two unequal object has same hascode then linked list will be generated right…but if we look in deep thats not happening here…10 is the size of table[] array and i used the same

    public class CheckDemo implements Cloneable {
    public static void main(String…args) throws CloneNotSupportedException
    {
    CheckDemo obj=new CheckDemo();
    CheckDemo obj1=new CheckDemo();
    CheckDemo obj2=new CheckDemo();
    CheckDemo obj3=(CheckDemo) obj2.clone();
    CheckDemo obj4=new CheckDemo();
    CheckDemo obj5=new CheckDemo();
    CheckDemo obj6=new CheckDemo();
    CheckDemo obj7=new CheckDemo();
    CheckDemo obj8=new CheckDemo();
    CheckDemo obj9=new CheckDemo();
    CheckDemo obj10=new CheckDemo();
    System.out.println(code(obj.hashCode())+”:”+obj.hashCode());
    System.out.println(“&: “+indexFor(code(obj.hashCode()),10));
    System.out.println(code(obj1.hashCode())+”:”+obj1.hashCode());
    System.out.println(“&: “+indexFor(code(obj1.hashCode()),10));
    System.out.println(code(obj2.hashCode())+”:”+obj2.hashCode());
    System.out.println(“&: “+indexFor(code(obj2.hashCode()),10));
    System.out.println(code(obj3.hashCode())+”:”+obj3.hashCode());
    System.out.println(“&: “+indexFor(code(obj3.hashCode()),10));
    System.out.println(code(obj4.hashCode())+”:”+obj4.hashCode());
    System.out.println(“&: “+indexFor(code(obj4.hashCode()),10));
    System.out.println(code(obj5.hashCode())+”:”+obj5.hashCode());
    System.out.println(“&: “+indexFor(code(obj5.hashCode()),10));
    System.out.println(code(obj6.hashCode())+”:”+obj6.hashCode());
    System.out.println(“&: “+indexFor(code(obj6.hashCode()),10));
    System.out.println(code(obj7.hashCode())+”:”+obj7.hashCode());
    System.out.println(“&: “+indexFor(code(obj7.hashCode()),10));
    System.out.println(code(obj8.hashCode())+”:”+obj8.hashCode());
    System.out.println(“&: “+indexFor(code(obj8.hashCode()),10));
    System.out.println(code(obj9.hashCode())+”:”+obj9.hashCode());
    System.out.println(“&: “+indexFor(code(obj9.hashCode()),10));

    }
    //&: is the index position in table array
    private static int indexFor(int i, int j) {
    // TODO Auto-generated method stub
    return i & (j-1);
    }

    private static int code(int h)
    {
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
    }

    1. I am sorry but I am feeling a little difficulty in understanding your question. Based on my best guess, you are asking if all 10 instances have different hash-codes, why some (or all) are pointing to same index position. And they should not.

      Well, Hashtable does not say anything like this. Nor I have said anything like this in above post. The only thing guaranteed is that if two or more instances are generating same hashcode they will be in same bucket.

      If they have different hashcodes, they may or may not be in same bucket. Anything is possible. You see if there are 100 objects, all having different hashcodes. Will you need a hashtable of size 100?? NO. you do not need. You can accommodate them in a hashtable of size 10 also. It means some of objects with different hashcodes will come into single bucket.

      Let me know if I misunderstood you or you have more questions.

  21. Hi Lokesh, Thanks for the gr8 explanation.
    I have doubt in getting the values from the bucket. Suppose for the same hashcode value, if it has a set of values maintains in bucket as LinkedList, then how it will retrieve the exact value as expected from the bucket using the get() of the HashMap. Because If I pass the key in the HashMap get(), it will try to find the index in the entry table. In the index, If it contains 10 values in the bucket how it will internally match to our expected value.

    1. Please re-read the point no. 4 and 5 in key notes. It says that if mutiple keys are found in form of linked list then key’s equals() method is used to found correct key.
      Remember: if two objects are equal, that is obj1.equals(obj2) is true then, obj1.hashCode() and obj2.hashCode() must return same integer. But vice-versa is not mandatory. It means even if hashcode of two objects are same (same bucket location), still obj1.equals(obj2) can be false (placed in different nodes in linked list).

      Watch again the code:

      if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {

      Hope you got the point.

    2. Hello Ramprem,

      please look at this get method code
      for (Entry e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
      Object k;
      if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
      return e.value;
      }

      When you pass a key in get method it just generate the index position of the table[] array(meas found the bucket only, now need to find right element), then it will start to move on linked list just focus on this line e != null; e = e.next) it will move on nodes and will do comparison with key and hash value only if following condition is true than return you value else return null

  22. Great article…
    If you can provide a pictorial representation of hashmap working that would really help, it will take less time in understanding.

  23. Nice article. I have a query though. What if my requirement is about having 2 keys for the same object. For ex:
    I have key as ‘fname, lname’ for value ’emp1′ and also ‘lname, fname’ for value ’emp1′ i.e. for both the keys, same value(emp1) should be fetched. How do I design that? One option I can think of is having 2 maps with maintaining each pair, is there a better way?

    1. What’s problem in storing in single HashMap. HashMap applies all rules on key objects, not on value objects. For hashmap “fname, lname” and “lname, fname” both are different keys. HashMap doesn’t care if they derived from same object. So, if you insert [“fname,lname” -> emp1] and [“lname,fname” -> emp1], both will be stored in HashMap as normal entries without any conflict.

      1. Thanks Lokesh for your quick response. Maintaining separate key, value pair in the same map, for each key, sounds fine, though it looks too simple to be correct(Can’t see any obvious problems though). Can there be any issues like size of the map etc for this solution considering the situation will typically arise for maintaining employee records? I was googling and could get the below link where a similar problem was discussed. Your thoughts?:

        http://stackoverflow.com/questions/822322/how-to-implement-a-map-with-multiple-keys

        1. Actually it is that much simple. Strange, everybody talking about having two maps without providing a solid reason to choose two. May be I am not able to think that way they are thinking, but I do not see any problem with one HashMap instance. If you look at MultikeyMap internal’s it also uses one hashmap only i.e. it is possible.

          http://grepcode.com/file/repo1.maven.org/maven2/net.sourceforge.collections/collections-generic/4.01/org/apache/commons/collections15/map/MultiKeyMap.java#MultiKeyMap.putMultiKey%28java.lang.Object%2Cjava.lang.Object%29

          public V putMultiKey(V value, K... keys) {
          int hashCode = hash(keys);
          int index = map.hashIndex(hashCode, map.data.length);
          AbstractHashedMap.HashEntry, V> entry = map.data[index];
          while (entry != null) {
          if (entry.hashCode == hashCode && isEqualKey(entry, keys)) {
          V oldValue = entry.getValue();
          map.updateEntry(entry, value);
          return oldValue;
          }
          entry = entry.next;
          }

          map.addMapping(index, hashCode, new MultiKey(keys), value);
          return null;
          }

  24. Hi Lokesh,
    I know definition of Interface and programatically also. Where can i use interface actually. Can you please give me an example ????

        1. e.g. Serializable, Comparable, Cloneable etc. are interfaces are used to give certain characteristic/behavior to any class. While String, Integer or Vector are classes. In any application, Report.java would be a class but Printable.java would be interface which Report.java will implement.

          There are other factors also, which differ from case to case. e.g. When you are designing an API then you would want to expose all the functionality through public interfaces to segregate the implementation from client code. This gives flexibility to change the implementation without changing the public interface. e.g. Mostly all frameworks like Struts or Spring expose their extension points though interfaces or abstract classes.

  25. Very good article but I have one doubt.
    Do all the objects in the same bucket have the same HashCode or they might have different HasCode because the indexFor method can return the same index for two different has code.

      1. My question is “Suppose there are two keys with different HashCode but when we pass these two HashCode to indexFor function, it returns the same index then both the keys will get inserted in the same bucket, right?”

      2. Hi Lokesh,

        Thanks for the wonderful articles you write.

        I think your above reply is contradicting with your another reply in this article saying “If they have different hashcodes, they may or may not be in same bucket. Anything is possible. You see if there are 100 objects, all having different hashcodes. Will you need a hashtable of size 100?? NO. you do not need. You can accommodate them in a hashtable of size 10 also. It means some of objects with different hashcodes will come into single bucket.”

        Please clarify.
        Thanks

  26. thats a nice one..
    just one question.. how does this “addEntry(hash, key, value, i);” work ?
    is there any implementation of that ?

  27. The explanation is not correct:

    “If there is already an object sitting on calculated index, its next attribute is checked. If it is null, and current Entry object becomes next node in LinkedList. If next variable is not null, procedure is followed until next is evaluated as null.”

    The quoted/explained code will take care of “updates” of values with the same key, not additions of new pairs. The actual add operation is (a few methods down the stack) here:

    void createEntry(int hash, K key, V value, int bucketIndex) {
    Entry e = table[bucketIndex];
    table[bucketIndex] = new Entry(hash, key, value, e);
    size++;
    }

    Where, most importantly: the new entry is added to the beginning of the linked list, to avoid unnecessary traversal. And the existence of any entries at that bucketIndex is not even checked, if it’s null, that will be the “e” (next) of the new Entry,

    1. Wrong. You looked up the code incorrectly. The method you are mentioning does not participate in HashMap.put() method anywhere. So the logic does not apply here.

      The method “createEntry(int hash, K key, V value, int bucketIndex) ” is called only when you try to “clone a hashmap”. And when you are cloning a map, then actually you do not need to worry about uniqueness of keys, because you know thy have already been checked.

      Also, how they are put in bucket in cloning process, is not the topic in this post.

      1. the only difference between addEntry and createEntry is resizing of the table. But as Szabolcs Parragh said, it will not iterate over the loop, it simply adds the element in the begining. The for loop is just for duplicate key values. but, the tutorial was really helpful :)

  28. Hi sir,really super iam using hashmap,i studied at so many sites but this gave me more clear picture.thanks once again.

  29. The explanation is “Crystal Clear” … thanks Lokesh.
    Do you have a similar page that describes the internal working of ConcurrentHashMap ??

    – Ramesh V L

  30. Great Article!!

    But I have a question here.. What is default length of table? If its an array I m sure table length would be declared somewhere?

      1. DEFAULT_INITIAL_CAPACITY = 16 and instances of Entry class are stored in an array.So does the array store 16 key-value pairs(Eg: ‘a=Aus’,’b=Berlin’ etc)? What about the load factor..At some point of time, it has to dynamically increase its size….right?

        1. Every array index works as a bucket. All 16 can go into single bucket, or all in different buckets. As soon as you insert 17th key-value pair, hashmap grew by load factor. By default load factor in 0.75f

          1. If have readb elow points some where regarding size of HashMap.
            suppose 16 is the initial size and i have added 12 ie. 16*0.75 element into hashmap that rehashing will happen and size will double.
            kindly put some light on this..

  31. The best article i have ever read about Hashmaps. so neat and clean.
    Thanks Lokesh

    can you Please put a article explaining thread locking(Synchronization concepts).
    Looking forward to hear from you

  32. Very well explained …. I read other post also on same topic earlier but this one is very well explained. Thanks buddy

    1. In java, List by default allows the duplicates.. so it means that you need to do something extra for sure. Here some alternatives can be :

      1) Using 3rd party implementations like :
      http://collections15.sourceforge.net/apidocs/net/sf/collections15/list/SetUniqueList.html

      2) Use logical expressions like:

      ArrayList al = new ArrayList();
      if(!al.contains(“red”)){
      al.add(“red”);
      }

      3) Extend the ArrayList class and override all add methods and put above logic in all overridden methods.

      Choice is yours !!

  33. Pingback: Java 中 HashMap 的工作机制 | 水货网

Note:- In comment box, please put your code inside [java] ... [/java] OR [xml] ... [/xml] tags otherwise it may not appear as intended.

Leave a Reply

Your email address will not be published. Required fields are marked *


5 × eight =

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>