Engineering Full Stack Apps with Java and JavaScript
String interning is a method of storing only one copy of each distinct string value and reusing that copy wherever the same value is needed.
The distinct values are stored in a hash table (fixed-size in some JVM implementations), usually referred to as the string intern pool or string pool.
The single copy of each string is called its 'intern'.
Strings in Java are immutable, so this sharing is perfectly safe; it can save memory and lets interned strings be compared by reference.
A table of strings, initially empty, is maintained by the class String.
When a string is interned, if the pool already contains a string equal to it, as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, the String object is added to the pool and a reference to it is returned.
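The lookup-or-add behaviour described above can be sketched with an ordinary map. This is only an illustration: the real pool lives inside the JVM, and the class name InternPoolSketch and its intern method are hypothetical names made up for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; the real string pool is managed by the JVM.
public class InternPoolSketch {
    // Maps each distinct value to its single canonical instance.
    private final Map<String, String> pool = new HashMap<>();

    // Returns the pooled instance equal to s, adding s itself if none exists yet.
    public String intern(String s) {
        String existing = pool.putIfAbsent(s, s);
        return existing != null ? existing : s;
    }

    public static void main(String[] args) {
        InternPoolSketch sketch = new InternPoolSketch();
        String a = new String("hello"); // a distinct object
        String b = new String("hello"); // another distinct object with an equal value
        System.out.println(a == b);                               // false
        System.out.println(sketch.intern(a) == sketch.intern(b)); // true
    }
}
```

The map lookup relies on equals() and hashCode(), which mirrors the "as determined by equals(Object)" rule in the description.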
All string literals and compile-time string-valued constant expressions are automatically interned in Java (as if by calling the intern() method).
This means that any two constant expressions of type String that designate the same character sequence are represented by identical object references.
Example: Two compile-time constant expressions
String var1 = "length: 10";
String var2 = "length: 10";
System.out.println("first and second are equal: " + (var1 == var2));
System.out.println("first and second are equal(with equals): " + (var1.equals(var2)));
The above statements produce the following output:
first and second are equal: true
first and second are equal(with equals): true
Here, both strings are initialized with the same compile-time constant expression ("length: 10"), and hence they refer to the same object through automatic interning.
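A compile-time constant expression does not have to be a single literal. As a minimal sketch (the class and method names below are made up for illustration), a concatenation of literals is also folded by the compiler and therefore shares the pooled object:

```java
public class ConstantExpressionDemo {
    // Compares a plain literal with a literal concatenation.
    static boolean literalConcatIsSameObject() {
        String whole = "length: 10";
        String joined = "length: " + 10; // constant expression, folded at compile time
        return whole == joined;
    }

    public static void main(String[] args) {
        System.out.println("same object: " + literalConcatIsSameObject()); // same object: true
    }
}
```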
Example: Non compile-time constant, but same value
String var1 = "length: 10";
String var2 = "length: " + var1.length();
System.out.println("first and second are equal: " + (var1 == var2));
System.out.println("first and second are equal(with equals): " + (var1.equals(var2)));
These println statements produce the following output:
first and second are equal: false
first and second are equal(with equals): true
Here, at runtime, both var1 and var2 will have the value of "length: 10".
However, var1 is initialized by a constant expression and var2 is not, so automatic interning does not happen for var2. As a result, var1 and var2 are two different objects with the same value:
the == comparison prints false and equals() prints true.
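Whether a concatenation counts as a constant expression depends on its operands. The following sketch (class and method names are assumptions for illustration) contrasts a final local variable, which the compiler treats as a constant, with a non-final one, which forces the concatenation to happen at run time:

```java
public class FinalPrefixDemo {
    // prefix is a constant variable, so prefix + 10 is a constant expression.
    static boolean withFinalPrefix() {
        final String prefix = "length: ";
        String joined = prefix + 10;
        return joined == "length: 10";
    }

    // prefix is not final, so the concatenation creates a new String at run time.
    static boolean withNonFinalPrefix() {
        String prefix = "length: ";
        String joined = prefix + 10;
        return joined == "length: 10";
    }

    public static void main(String[] args) {
        System.out.println("final prefix: " + withFinalPrefix());        // final prefix: true
        System.out.println("non-final prefix: " + withNonFinalPrefix()); // non-final prefix: false
    }
}
```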
We can do explicit string interning using the intern() method.
When the intern() method is invoked, if the pool already contains a string equal to this String object, as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.
For any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.
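The "if and only if" property can be checked with a small sketch. The class name InternEqualityDemo and the helper sameInterned are hypothetical names used only for this illustration:

```java
public class InternEqualityDemo {
    // True exactly when both strings map to the same pooled object.
    static boolean sameInterned(String s, String t) {
        return s.intern() == t.intern();
    }

    public static void main(String[] args) {
        String s = new String("pool");                              // created at run time
        String t = new StringBuilder("po").append("ol").toString(); // also created at run time
        String u = "other";

        System.out.println(s == t);             // false: two different objects
        System.out.println(s.equals(t));        // true: same value
        System.out.println(sameInterned(s, t)); // true: equal values share one intern
        System.out.println(sameInterned(s, u)); // false: different values
    }
}
```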
Example: Intern method
String var1 = "length: 10";
String var2 = "length: " + var1.length();
var2 = var2.intern();
System.out.println("first and second are equal: " + (var1 == var2));
System.out.println("first and second are equal(with equals): " + (var1.equals(var2)));
This will print:
first and second are equal: true
first and second are equal(with equals): true
Here, var1 is initialized by a constant expression but var2 is not, so var2 is not interned by default.
The call var2.intern() looks up the value of var2 in the pool and returns a reference to the pooled string, which is the same object that var1 refers to. The call does not change var2 itself: Strings are immutable, and intern() only returns a reference.
Therefore, we need to assign the returned interned string explicitly to var2, which makes the String object previously referred to by var2 eligible for garbage collection.
Example: Calling intern() without assigning the result
String var1 = "length: 10";
String var2 = "length: " + var1.length();
var2.intern();
System.out.println("first and second are equal: " + (var1 == var2));
System.out.println("first and second are equal(with equals): " + (var1.equals(var2)));
This will print:
first and second are equal: false
first and second are equal(with equals): true
Here again, var1 is initialized by a constant expression but var2 is not, so var2 is not interned by default.
The call var2.intern() still returns a reference to the pooled string that var1 refers to, but the return value is discarded, so var2 keeps referring to its original object.
To change what var2 refers to, we need to assign the returned interned string explicitly to var2.
If we create two Strings using the same literal (first example), then both of them will refer to the same String object through interning.
If we use the new keyword to create a String, however, Java creates a new String object at run time. The JVM is obliged to do so for the new keyword, rather than reusing the object from the string pool.
String var1 = "length: 10";
String var2 = new String("length: 10");
System.out.println("first and second are equal: " + (var1 == var2));
This will print:
first and second are equal: false
As we have seen, you can make the second String refer to the same String object as the first one by manually calling intern():
var2 = var2.intern();
At this point, the String object created at run time using the new keyword becomes eligible for garbage collection, provided no other references to it remain.
For more advanced reading on string interning and on tuning it for better performance, refer to string-interning-and-performance-tuning-in-java.