How HashMap works ....
If anybody asks me to describe “How HashMap works?“, I simply answer: “On principle of Hashing“. As simple as it is. Now before answering it, one must be very sure to know at least basics of Hashing. Right??
What is Hashing
Hashing in its simplest form, is a way to assigning a unique code for any variable/object after applying any formula/algorithm on its properties. A true Hashing function must follow this rule:
Hash function should return the same hash code each and every time, when function is applied on same or equal objects. In other words, two equal objects must produce same hash code consistently.
Note: All objects in java inherit a default implementation of hashCode() function defined in Object class. This function produce hash code by typically converting the internal address of the object into an integer, thus producing different hash codes for all different objects.
A little about Entry class
A map by definition is : “An object that maps keys to values”. Very easy.. right?
So, there must be some mechanism in HashMap to store this key value pair. Answer is YES. HashMap has an inner class Entry, which looks like this:
Surely Entry class has key and value mapping stored as attributes. Key has been marked as final and two more fields are there: next and hash. We will try to understand the need of these fields as we go forward.
What put() method actually does
Before going into put() method’s implementation, it is very important to learn that instances of Entry class are stored in an array. HashMap class defines this variable as:
Lets note down the steps one by one:
- First of all, key object is checked for null. If key is null, value is stored in table[0] position. Because hash code for null is always 0.
- Then on next step, a hash value is calculated using key’s hash code by calling its hashCode() method. This hash value is used to calculate index in array for storing Entry object. JDK designers well assumed that there might be some poorly written hashCode() functions that can return very high or low hash code value. To solve this issue, they introduced anotherhash() function, and passed the object’s hash code to this hash() function to bring hash value in range of array index size.
- Now indexFor(hash, table.length) function is called to calculate exact index position for storing the Entry object.
- Here comes the main part. Now, as we know that two unequal objects can have same hash code value, how two different objects will be stored in same array location [called bucket].
Answer is LinkedList. If you remember, Entry class had an attribute “next”. This attribute always points to next object in chain. This is exactly the behavior of LinkedList.
Answer is LinkedList. If you remember, Entry class had an attribute “next”. This attribute always points to next object in chain. This is exactly the behavior of LinkedList.
So, in case of collision, Entry objects are stored in LinkedList form. When an Entry object needs to be stored in particular index, HashMap checks whether there is already an entry?? If there is no entry already present, Entry object is stored in this location.
If there is already an object sitting on calculated index, its next attribute is checked. If it is null, and current Entry object becomes next node in LinkedList. If next variable is not null, procedure is followed until next is evaluated as null.
What if we add the another value object with same key as entered before. Logically, it should replace the old value. How it is done? Well, after determining the index position of Entry object, while iterating over LinkedList on calculated index, HashMap calls equals method on key object for each Entry object. All these Entry objects in LinkedList will have similar hash code but equals() method will test for true equality. If key.equals(k) will be true then both keys are treated as same key object. This will cause the replacing of value object inside Entry object only.
In this way, HashMap ensure the uniqueness of keys.
How get() methods works internally
Now we have got the idea, how key-value pairs are stored in HashMap. Next big question is : what happens when an object is passed in get method of HashMap? How the value object is determined?
Answer we already should know that the way key uniqueness is determined in put() method , same logic is applied in get() method also. The moment HashMap identify exact match for the key object passed as argument, it simply returns the value object stored in current Entry object.
If no match is found, get() method returns null.
Let have a look at code:
Above code is same as put() method till if (e.hash == hash && ((k = e.key) == key || key.equals(k))), after this simply value object is returned.
Key Notes
1. Data structure to store Entry objects is an array named table of type Entry.
2. A particular index location in array is referred as bucket, because it can hold the first element of a LinkedList of Entry objects.
3. Key object’s hashCode() is required to calculate the index location of Entry object.
4. Key object’s equals() method is used to maintain uniqueness of Keys in map.
5. Value object’s hashCode() and equals() method are not used in HashMap’s get() and put() methods.
6. Hash code for null keys is always zero, and such Entry object is always stored in zero index in Entry[].
This comment has been removed by the author.
ReplyDelete