Take a look at this test:
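import java.lang.reflect.Field;

String s1 = "1234";
String s2 = "123";
// Read the private backing array of each String via reflection
Field field = String.class.getDeclaredField("value");
field.setAccessible(true);
char[] value1 = (char[]) field.get(s1);
char[] value2 = (char[]) field.get(s2);
System.out.println(value1 == value2); // compares array identity, not content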
It prints false, which means that the JVM holds two different char arrays for s1 and s2. Can anybody explain why s1 and s2 cannot share the same char array? It seems like java.lang.String was designed for content sharing, wasn’t it?
Note: I don’t know about all JVMs. This is Oracle’s Java HotSpot(TM) Client VM 22.1-b02 (JRE 1.7).
UPDATE
On the other hand, if partial sharing is rare (it seems to happen only for Strings created by String.substring), then why should every String carry int count and int offset fields? That is 8 useless bytes. And it is not only the size, it is also the creation speed: the bigger the object, the longer its initialization. Here’s a test:
long t0 = System.currentTimeMillis();
for (int i = 0; i < 10000000; i++) {
new String("xxxxxxxxxxxxx");
}
System.out.println(System.currentTimeMillis() - t0);
It takes ~200 ms. If I use this class instead:
class String2 {
char[] value;
String2(String2 s) {
value = s.value;
}
}
It takes ~140 ms.
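Timings like these are easy to skew with JIT warm-up and dead-code elimination, so here is a rough sketch of how the comparison might be repeated a little more carefully. The names AllocBench, Slim, and sink are made up for illustration; Slim is only a stand-in for String2 with an extra constructor so that a first instance can be created. This is not a rigorous benchmark, just a starting point.

```java
class AllocBench {
    // Hypothetical stand-in for String2: one field, no offset/count.
    static final class Slim {
        final char[] value;
        Slim(char[] value) { this.value = value; }
        Slim(Slim s) { this.value = s.value; }
    }

    // Each new object is stored here so the JIT is less likely to drop the allocation.
    static Object sink;

    // Returns elapsed nanoseconds for n String copy-constructions.
    static long timeStrings(int n) {
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink = new String("xxxxxxxxxxxxx");
        }
        return System.nanoTime() - t0;
    }

    // Returns elapsed nanoseconds for n Slim copy-constructions.
    static long timeSlim(int n) {
        Slim seed = new Slim("xxxxxxxxxxxxx".toCharArray());
        long t0 = System.nanoTime();
        for (int i = 0; i < n; i++) {
            sink = new Slim(seed);
        }
        return System.nanoTime() - t0;
    }

    public static void main(String[] args) {
        int n = 10000000;
        // Warm-up pass so both loops are compiled before measuring.
        timeStrings(n);
        timeSlim(n);
        System.out.println("String: " + timeStrings(n) / 1000000 + " ms");
        System.out.println("Slim:   " + timeSlim(n) / 1000000 + " ms");
    }
}
```

The numbers will vary by JVM version and flags, so treat any difference as indicative only.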
They can; they just don’t, probably because JVM start-up time would be impacted by searching the string intern pool for partial matches.
It’s worth noting that non-interned strings can share a char array in certain cases:
…at least through OpenJDK 6. Apparently, in OpenJDK 7 they don’t share anymore (thank you, Marko Topolnik, for teaching me that).
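To make the idea of sharing concrete without poking at String internals, here is a minimal model of the offset/count layout the question mentions. SharedString is a hypothetical class written for illustration, not the JDK source; it assumes that a substring simply reuses the parent's backing array and narrows the window into it.

```java
// Hypothetical model of a String with offset/count fields; not the JDK source.
final class SharedString {
    final char[] value; // backing array, possibly shared with other instances
    final int offset;   // index of this string's first character in value
    final int count;    // number of characters that belong to this string

    SharedString(String s) {
        this(s.toCharArray(), 0, s.length());
    }

    private SharedString(char[] value, int offset, int count) {
        this.value = value;
        this.offset = offset;
        this.count = count;
    }

    // Constant time: reuses the backing array and only narrows the window.
    SharedString substring(int begin, int end) {
        if (begin < 0 || end > count || begin > end) {
            throw new StringIndexOutOfBoundsException();
        }
        return new SharedString(value, offset + begin, end - begin);
    }

    @Override
    public String toString() {
        return new String(value, offset, count);
    }
}

class SharedStringDemo {
    public static void main(String[] args) {
        SharedString s1 = new SharedString("1234");
        SharedString s2 = s1.substring(0, 3);
        System.out.println(s2);                   // 123
        System.out.println(s1.value == s2.value); // true
    }
}
```

In this model the two fields are the price of a cheap substring: every instance carries them, even though only substrings need a window different from the whole array.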
And interestingly, Sun’s JVM 1.6 separates them if you intern:
I get:
I guess it doesn’t like having strings in the intern pool that are subsets of other strings.
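For reference, what interning guarantees can be checked without reflection. The sketch below (InternDemo and sameAfterIntern are names invented here) only shows that the pool returns one canonical instance per distinct content, so "123" and "1234" are separate entries. Whether those entries share a char array underneath is an implementation detail that this code does not observe.

```java
class InternDemo {
    // True when both arguments map to the same canonical instance in the intern pool.
    static boolean sameAfterIntern(String a, String b) {
        return a.intern() == b.intern();
    }

    public static void main(String[] args) {
        String literal = "1234";
        String copy = new String("1234"); // distinct object with equal content

        System.out.println(copy == literal);                 // false
        System.out.println(sameAfterIntern(copy, literal));  // true
        System.out.println(sameAfterIntern("123", literal)); // false
    }
}
```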