C# String pooling deep dive

In .NET, the CLR maintains a table for storing string literals, called the intern pool. The intern pool contains a single reference for each unique literal string. For example, given string a = "test"; string b = "test"; both variables point to the same object according to the pooling theory. Conceptually, when b is assigned, the CLR searches the intern pool for the literal "test". If the string is found, the existing reference is retrieved and assigned to b. If it is not found, the literal is added to the intern pool and the new reference is returned. The CLR therefore keeps a single object for both variables. GetHashCode() returns the same value for both variables, but that is true of any two strings with equal content; Object.ReferenceEquals(a, b) is the check that actually shows they are the same object.
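The lookup described above can be observed with a small sketch. The class and method names here are illustrative only; the calls used (Object.ReferenceEquals, String.Intern, and the string(char[]) constructor) are standard .NET APIs. It assumes a default build in which literal interning is not relaxed.

```csharp
using System;

public static class InternPoolDemo
{
    // Two identical literals should resolve to one pooled object.
    public static bool LiteralsShareReference()
    {
        string a = "test";
        string b = "test";
        return object.ReferenceEquals(a, b);
    }

    // A string built at run time has equal content but is a separate object.
    public static bool RuntimeStringIsSeparate()
    {
        string literal = "test";
        string built = new string(new[] { 't', 'e', 's', 't' });
        return literal == built && !object.ReferenceEquals(literal, built);
    }

    // String.Intern returns the pooled reference for the given content.
    public static bool InternReturnsPooledReference()
    {
        string literal = "test";
        string built = new string(new[] { 't', 'e', 's', 't' });
        return object.ReferenceEquals(literal, string.Intern(built));
    }

    public static void Main()
    {
        Console.WriteLine(LiteralsShareReference());
        Console.WriteLine(RuntimeStringIsSeparate());
        Console.WriteLine(InternReturnsPooledReference());
    }
}
```

On a default build each of the three methods should return true: equal content alone does not imply a shared object, but literals and explicitly interned strings do share one.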


Advantage

This way the system avoids allocating memory more than once for identical string literals.

Disadvantage

The memory the CLR allocates for interned string objects is not likely to be released until the CLR terminates. In addition, to intern a string at run time you must first create it, so the memory for that String object still has to be allocated, even though that memory will eventually be garbage collected.

Flexibility

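We also have the option to mark an assembly as not requiring string pooling by applying the CompilationRelaxations attribute and passing the CompilationRelaxations.NoStringInterning enum value to the attribute constructor.

String-literal interning is then no longer required for that assembly. Note that this is a relaxation rather than a guarantee: the runtime may still choose to intern the assembly's literals.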
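As a minimal sketch, the attribute is applied at assembly level, for example in any source file of the project before the first type or namespace declaration. Whether literals end up shared afterwards depends on the runtime, so no particular result is assumed here.

```csharp
using System.Runtime.CompilerServices;

// Marks this assembly as not requiring string-literal interning.
// The runtime is allowed to skip interning, but may still intern literals.
[assembly: CompilationRelaxations(CompilationRelaxations.NoStringInterning)]
```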


Difference between the constant string and local variable string

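The difference between a const string and a local string variable is this: a const is a literal, so it goes into the intern pool, while a local string that does not come from a literal (for example, one built at run time) is an ordinary heap object. Its lifetime is tied to whatever references it, typically the enclosing function, and the GC can reclaim its memory once it goes out of scope and is no longer reachable.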

Example
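A small sketch of that difference, with illustrative names (ConstVersusLocalDemo, Greeting) that are not from the original post. It uses String.IsInterned, which returns the pooled reference for a given content or null if none exists, and assumes a default build.

```csharp
using System;

public static class ConstVersusLocalDemo
{
    // A const string is a compile-time literal, so it lives in the intern pool.
    private const string Greeting = "hello";

    // The const is the very object the pool holds for this content.
    public static bool ConstIsPooledObject()
    {
        return object.ReferenceEquals(Greeting, string.IsInterned(Greeting));
    }

    // A string built at run time is a separate heap object, even though
    // the pool already holds a string with the same content.
    public static bool RuntimeLocalIsSeparateObject()
    {
        string local = new string(new[] { 'h', 'e', 'l', 'l', 'o' });
        return local == Greeting
            && !object.ReferenceEquals(local, string.IsInterned(local));
    }

    public static void Main()
    {
        Console.WriteLine(ConstIsPooledObject());
        Console.WriteLine(RuntimeLocalIsSeparateObject());
    }
}
```

Once RuntimeLocalIsSeparateObject returns, nothing references the local string any more, so it becomes eligible for collection, while the pooled "hello" stays alive.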


There are several reasons for making string immutable, and I think there is a strong connection between the immutability of string and the string intern pool.

The reasons behind immutability

Security

As we know, string is a reference type.

string a = "test";

string b = "test";

Because of intern pooling, both variables point to the same object.

Suppose strings were mutable and b = b + "1"; changed that object in place: the value of a would then also become "test1".

We use strings in many sensitive places, such as database connection strings, so a change leaking through a shared reference like this would be a security risk.
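What actually happens with immutable strings can be sketched as follows. The class and method names are illustrative; the behavior shown (concatenation producing a new string) is standard C#.

```csharp
using System;

public static class ImmutabilityDemo
{
    // Appends to b and returns both variables so the effect on a is visible.
    public static (string A, string B) AppendToB()
    {
        string a = "test";
        string b = "test";   // same pooled object as a
        b = b + "1";         // builds a new string; the pooled "test" is untouched
        return (a, b);
    }

    public static void Main()
    {
        var result = AppendToB();
        Console.WriteLine(result.A); // test
        Console.WriteLine(result.B); // test1
    }
}
```

Only the variable b is rebound to the new object, so a keeps its original value.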

Thread safety

If strings were mutable and multiple threads modified the same string at the same time, the results would be unpredictable.
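A rough sketch of why immutability helps here: many threads can read one shared string and derive their own values from it without any locking, because no thread can alter the shared object. The names are illustrative, and Parallel.For is used only as a convenient way to run the body on several threads.

```csharp
using System;
using System.Threading.Tasks;

public static class SharedStringDemo
{
    public static string ModifyFromManyThreads()
    {
        string shared = "test";

        Parallel.For(0, 100, i =>
        {
            // Each "modification" produces a new string private to this iteration.
            string local = shared + i;
            _ = local.ToUpperInvariant();
        });

        // No thread could change the shared object, so it is still "test".
        return shared;
    }

    public static void Main()
    {
        Console.WriteLine(ModifyFromManyThreads());
    }
}
```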

String intern pool

Creating a string is an overhead for the CLR, which is why we have the concept of intern pooling.

If strings were not immutable, a change made through any one of the shared references would change the string for every other reference as well.

So I think that is why string is immutable: to achieve security, thread safety, and intern pooling.

As for the second question, about scope:

An interned String object can persist after the application, or even the application domain, terminates.

So the memory held by interned string objects is not likely to be released until the CLR process terminates.

I hope this gives you an idea of why it is so important for string to be immutable. Thanks for reading. Keep learning!