Using a profiler, I seem to be seeing the following with Apple’s 1.6 Java:
I start with a moderately long Java string. I split it into tokens using String.split("\\W+"). The code then holds references to some of the split up pieces.
It seems, if I believe my eyes in YourKit, that Java has helpfully not copied these strings, so that I'm in fact holding references to the lengthy originals. In my case this leads to a rather large waste of space.
Does this seem plausible? It’s easy enough to add a loop making copies of these guys.
String.split() does not copy the parts of the String into the new objects. Instead, each piece reuses the original String's char[] and only sets its own offset and count fields. When you later access one of these String objects, the offset is added to the index into that shared array. This is indeed done to avoid copying the whole String, and to save space [well, at least usually…].
So basically yes. All of your new objects will have the same char[] reference, which is the original char[] of the original String.
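The copying loop mentioned in the question could look roughly like the sketch below. The class and method names (TokenCopier, splitAndCopy) are made up for illustration. It assumes a JVM like the 1.6 one described here, where new String(piece) allocates a trimmed char[] when the piece is a view into a larger array; on a JVM whose substring already copies, the extra copy is harmless but redundant.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class TokenCopier {

    // Splits text on runs of non-word characters and returns each token
    // as a fresh String, so that holding a token is not meant to keep the
    // original (possibly very long) character array reachable.
    static List<String> splitAndCopy(String text) {
        String[] pieces = text.split("\\W+");
        List<String> copies = new ArrayList<String>(pieces.length);
        for (String piece : pieces) {
            // Assumption: on JVMs where split() returns views sharing the
            // original char[], this constructor copies only the characters
            // that belong to this piece.
            copies.add(new String(piece));
        }
        return copies;
    }
}
```

Calling TokenCopier.splitAndCopy(longText) and keeping only the returned strings should let the original char[] be garbage collected once longText itself is no longer referenced. It is worth confirming in the profiler that the retained size actually drops.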