C# GetHashCode Method and why we should give the correct implementation to it?

C# GetHashCode Method and why we should give the correct implementation to it?

In this article, we are exploring the Object hashcode method and why should give the correct implementation.

Let's start with the first question

What is the use of GetHashCode()?

GetHashCode () plays an important role while doing the object/instance equality in C#.

GetHashCode () return 32 bit unique integer value called hash value.

The main purpose of the existence of the Hash code is to allow the types to be used in the hash table.

Hash table are heavily used in the collections in the dot net framework. The collection is internally stored their value in the hash table.

The purpose of the hash table is to speed up looking the items in the collection.

Example

Dictionary internally uses the hash table to store the keys of an element.

How hash table work ?

Hash tables divide the collection into smaller collections called buckets. Suppose we have the

Dictionary<string,T> myDictionay=new Dictionary<string,T>();

This collection has 25 value and supposes the Hash table maintain the 5 buckets to store the keys. Those 25 key will store among the 5 buckets from bucket 0 to bucket 4 (bucket index start from 0).

Probably each bucket contains 5 items some buckets contain less than 5 and some contain more than 5 but on an

average contains 5 items.

Searching an item in 5 items is a lot quicker than 25 items if we know in which bucket item is present.

So you must be thinking about how this distribution of items in the bucket will take place and how to identify the bucket?.

so the answer is here suppose we want to store the key “Orange” then it called the GetHashCode () of “orange”

which is divided by a number of buckets and takes the remainder. The remainder will be the index of the bucket, Where the item will go.

indexOfBucket= x.GetHashCode()% nBuckets;

Example 1.

“Orange”.GetHashCode() —> 2372379284;

“Orange”.GetHashCode()%5 —> return 4;

index Of Bucket = 4

So in the 4th bucket “orange” key will go.

example 2.

for next item having key “apple”

“apple”.GetHashCode() —> 2372379280 % 5

Index Of Bucket = 0

So in 0Th bucket “apple” key will go.

So when next time we call myDictionay[“Orange”] the same procedure will be applied and identify the index of the bucket.

“Orange”.GetHashCode() —> 2372379284;

“Orange”.GetHashCode()%5 —> return 4;

index Of Bucket = 4

then items are searched in the 4th bucket only.

Why we should give the correct implementation to it?

suppose we call the myDictionay[“Apple”] then return “value not present in Dictionary”.

GetHashcode() of string method for “Apple” gives different hash value

than “apple” hash value, and it is obvious that “Apple” and “apple” are different strings.

We want to support our mydictionay will be able to find the “Apple” even “apple” in key are present

in the lower case.

we can provide the string comparer

example

Dictionary<string,T> mydictionay=new Dictionary<string,T>(StringComaparer.OrdinalIgnoreCase);

but we don’t want to provide a comparer.

Dictionary will only be able to find the “Apple” if “Apple” and “apple” GetHashCode () give the same hash value which leads to land to the same bucket where “apple” is stored.

Whereas the string GetHashCode () does not satisfy the need.

So while doing equality they must have the equal hash value. if does not Implement properly then Dictionary will not work.

When we override the equal method then we must override GetHashCode() associated with the equality base condition.

What is the correct way to implement the GetHashCode ()? Below is the requirement for the need to correctly implement GetHashCode ().

Hashtable-based collections frequently invoke the GetHashCode () so performance wise it should be fast. Hash Code values are 32-bit integer values and while storing the collection value it should be evenly distributed in the buckets. So the recommended way to implement the GetHashCode() is to use exclusive OR “^” of the same field used in the

object equal condition in equal() method.

class Person : IEquatable<Person>
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public int Age { get; set; }
    public override bool Equals(Person obj)
    {

        if (obj == null || GetType() != obj.GetType())
        {
            return false;
        }

        return this.Age == obj.Age && this.FirstName == obj.FirstName && this.LastName == obj.LastName;

    }

    // override object.GetHashCode
    public override int GetHashCode()
    {
        // TODO: write your implementation of GetHashCode() here
        return Age.GetHashCode() ^ FirstName.GetHashCode() ^ LastName ^ GetHashCode();
    }
}

Why should use exclusive OR “^” ?

Mostly I found that implement of hash code uses the arithmetic operator “+”.

Doing logical operations on the Hash Code has minimum chances of evenly distributing keys in the hash table.

suppose “a” gives hash value 12 and “b” gives 13 then 12+13 =25

using XOR operator 12^13 . 12 in binary 1100 and 13 in binary 1101 1100^1101 = 0001

which is 1. There is no logical relation between inputs and results which leads to fair distribution of a key in the hash table

Logical XOR operators are very fast and they take in the CPU register. It improves the performance very much.

Arithmetic operator evaluations are more costly than the logical operator.

I hope use learn something new

Thanks for reading. keep Learning .....