This question has bothered me for quite some time, and I can't seem to find an answer in Oracle's Java resources. Does anyone have a clue about this?
I understand that a String is essentially backed by an array of char. However, I wonder how the data is stored in physical memory in the following two cases:
Case 1) A total of 10 Strings are put inside a HashMap. Each String starts with length 0, and every second 1000 bytes are appended to each String until each reaches 1 MB.
Case 2) A total of 10 Strings are put inside a HashMap. Each String starts with a length of 1 MB (filled with spaces), and every second 1000 bytes of each String are replaced, until the whole 1 MB of each has been replaced.
For case 1, will it cause more references to be made in physical memory because the String keeps growing and new allocations need to be made? Or does it “push” the data behind it so that it can use the next available memory?
For case 2, does it mean fewer references are required (or practically none) because the String was initialized with a length of 1 MB in the first place?
Lastly, I wonder whether these two cases have any impact on the garbage collector or on memory allocation performance.
Actually they’re both about the same when dealing with Strings.
Simply put, Strings are immutable. So, if you have a String of 1000 chars and append 1000 chars to it, you end up with a new String of 2000 chars, and the previous one becomes available for garbage collection.
If you have a 1M String and change it, you have a new 1M String, and the old one is available for garbage collection. Since Strings are immutable, there’s no gimmickry of splitting the old string, removing what you want, adding the new part and joining old and new together. Rather, the whole thing is simply copied into the new version.
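To make the append case concrete, here is a minimal sketch. The class and method names (`ImmutableAppendDemo`, `chunk`, `appendChunk`) are made up for illustration, and the map holds a single entry rather than ten.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class ImmutableAppendDemo {
    // Builds a String of n 'x' characters (a stand-in for the 1000-byte chunk).
    static String chunk(int n) {
        char[] c = new char[n];
        Arrays.fill(c, 'x');
        return new String(c);
    }

    // "Appends" to the String stored under key. Because String is immutable,
    // '+' allocates a new String and the map entry is re-pointed to it;
    // the old String is left for the garbage collector.
    static String appendChunk(Map<String, String> map, String key, String piece) {
        String before = map.get(key);
        String after = before + piece;
        map.put(key, after);
        return after;
    }

    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put("s0", chunk(1000));

        String original = map.get("s0");
        String grown = appendChunk(map, "s0", chunk(1000));

        System.out.println("original length: " + original.length());
        System.out.println("grown length:    " + grown.length());
        System.out.println("same object?     " + (original == grown));
    }
}
```

The old String is never modified; the only thing that changes is which object the map entry refers to.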
There are other structures that behave better, but still have similar issues.
For example, if you have a StringBuilder created with the default capacity, it will behave almost exactly like a normal String in the 1000 + 1000 case: when the internal buffer is full, it allocates a larger one and copies the old contents over. However, if you know this is going to happen, you can pre-allocate it to, say, 10,000 and then it will simply copy into the pre-allocated space rather than throwing the entire old kit away.
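A rough sketch of the pre-allocation idea, assuming the 1000-char chunks and 1M target from the question. The names `PreallocatedBuilderDemo` and `fill` are hypothetical.

```java
import java.util.Arrays;

class PreallocatedBuilderDemo {
    static final int CHUNK = 1000;
    static final int TARGET = 1_000_000;

    // Appends 1000-char chunks until the builder holds TARGET chars.
    // initialCapacity decides whether the internal buffer must be
    // re-allocated and copied along the way.
    static StringBuilder fill(int initialCapacity) {
        StringBuilder sb = new StringBuilder(initialCapacity);
        char[] chunk = new char[CHUNK];
        Arrays.fill(chunk, 'x');
        while (sb.length() < TARGET) {
            sb.append(chunk);
        }
        return sb;
    }

    public static void main(String[] args) {
        // Small initial buffer: grows repeatedly, copying the contents each time.
        StringBuilder grown = fill(16);
        // Buffer sized up front: every append lands in the existing space.
        StringBuilder preallocated = fill(TARGET);

        System.out.println("grown capacity:        " + grown.capacity());
        System.out.println("preallocated capacity: " + preallocated.capacity());
    }
}
```

For the in-place replacement in case 2, `StringBuilder.replace` or `setCharAt` on a builder of fixed length works on the existing buffer in the same way.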
Another feature of String immutability is that Strings can be shared.
A simple example is this:
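The snippet itself is missing here. A minimal sketch, assuming it took a `substring` of a six-character String (the names `a` and `b` come from the surrounding text; `SubstringDemo` and `tail` are made up):

```java
class SubstringDemo {
    // Returns the second half of "abc123" as its own String reference.
    static String tail() {
        String a = "abc123";
        String b = a.substring(3); // b holds the characters "123"
        return b;
    }

    public static void main(String[] args) {
        System.out.println(tail());
    }
}
```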
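In older JVMs (before Java 7 update 6), this will have a single array of 6 chars (“abc123”), and BOTH Strings will point to this array; the ‘b’ String simply points to an offset within the original array. Newer JVMs copy the characters into a new array instead, so the sharing described here no longer applies to them.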
The downside of this is:
‘b’ now points to the original buffer that ‘a’ used, even though it only “sees” 3 characters. So, your ‘b’ String is actually holding on to 1M chars of memory.
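The scenario might look like the following sketch. The 1M-char String and the names `SubstringRetentionDemo`, `firstThree` and `firstThreeCopied` are assumptions for illustration. On JVMs that share the backing array, wrapping the result in `new String(...)` was the usual way to get a trimmed copy; on JVMs where `substring` already copies, the extra step is harmless but unnecessary.

```java
import java.util.Arrays;

class SubstringRetentionDemo {
    // On JVMs that share the backing array, the result keeps the whole
    // buffer of 'big' reachable even though only 3 chars are visible.
    static String firstThree(String big) {
        return big.substring(0, 3);
    }

    // Copying the substring into a fresh String lets the large buffer
    // be collected once 'big' itself is no longer referenced.
    static String firstThreeCopied(String big) {
        return new String(big.substring(0, 3));
    }

    public static void main(String[] args) {
        char[] data = new char[1_000_000];
        Arrays.fill(data, 'x');
        String a = new String(data);

        String b = firstThree(a);
        String c = firstThreeCopied(a);
        a = null; // with array sharing, b would still pin the 1M chars

        System.out.println(b.length() + " " + c.length());
    }
}
```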