I’m reading from a binary file and want to convert the bytes to US ASCII strings. Is there any way to do this without calling new on String to avoid multiple semantically equal String objects being created in the string literal pool? I’m thinking that it is probably not possible since introducing String objects using double quotes is not possible here. Is this correct?
    private String nextString(DataInputStream dis, int size)
            throws IOException
    {
        byte[] bytesHolder = new byte[size];
        // readFully blocks until all `size` bytes have arrived; read() may return fewer.
        dis.readFully(bytesHolder);
        return new String(bytesHolder, Charset.forName("US-ASCII")).trim();
    }
You’d have to have a cache mapping byte arrays to strings, then search through the cache for any equal values before creating a new string.
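A minimal sketch of such a byte-keyed cache, assuming Java 8 or later. The class name `AsciiStringCache` and its `get` method are hypothetical, invented here for illustration. It relies on `ByteBuffer` comparing and hashing by content, so a wrapped array can be used as a map key:

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of the original question or answer.
final class AsciiStringCache {
    // ByteBuffer.equals/hashCode work on the remaining bytes, i.e. by content.
    private final Map<ByteBuffer, String> cache = new HashMap<>();

    String get(byte[] bytes) {
        // wrap() does not copy; the lookup key merely views the caller's array.
        String cached = cache.get(ByteBuffer.wrap(bytes));
        if (cached != null) {
            return cached; // cache hit: no new String is created
        }
        String created = new String(bytes, StandardCharsets.US_ASCII).trim();
        // Store a private copy of the bytes so that later changes to the
        // caller's array cannot corrupt the key held by the map.
        cache.put(ByteBuffer.wrap(bytes.clone()), created);
        return created;
    }
}
```

This still allocates a small `ByteBuffer` wrapper per lookup, and the map grows without bound, so it only pays off when the same byte sequences recur often. Because the key is the untrimmed bytes, inputs that differ only in padding would still produce separate String instances.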
You can intern existing strings with intern(), as Yishai posted. That won’t stop you from creating more strings, but it’ll make all but the first one (for any char sequence) very short-lived. On the other hand, it’ll make all the distinct strings live for a very long time indeed.

You can have “pseudo-interning” by using a Map<String, String>: look each new string up in the map and return the stored instance if there is one, otherwise store the new string and return it.

You could even put a bit more effort in and end up with an LRU cache that keeps the N most recently fetched strings, discarding others when it needs to.
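The pseudo-interning map and its LRU variant might look roughly like this. It is a sketch assuming Java 8 or later, and the class and method names (`PseudoInterner`, `LruInterner`, `pseudoIntern`) are made up for illustration. The LRU version leans on `LinkedHashMap` in access order with `removeEldestEntry` overridden:

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names throughout; a sketch, not the answer's original code.
final class PseudoInterner {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the first instance seen for this character sequence.
    String pseudoIntern(String input) {
        String existing = pool.putIfAbsent(input, input);
        return existing != null ? existing : input;
    }
}

// LRU variant: keeps at most maxEntries strings and evicts the least
// recently used one when a new string would exceed that limit.
final class LruInterner {
    private final Map<String, String> pool;

    LruInterner(final int maxEntries) {
        // accessOrder = true orders entries from least to most recently accessed.
        this.pool = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxEntries;
            }
        };
    }

    String pseudoIntern(String input) {
        String existing = pool.get(input); // get() counts as an access
        if (existing != null) {
            return existing;
        }
        pool.put(input, input);
        return input;
    }
}
```

Calling `pseudoIntern(new String(bytes, charset))` returns the pooled instance, leaving the freshly created duplicate eligible for collection straight away. Neither class is thread-safe as written.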
None of that reduces the number of strings created in the first place, as I say – but is that likely to be a problem in your situation? GCs have been tuned to make it very cheap to create short-lived objects.