A recent question about string literals in .NET caught my eye. I know that string literals are interned so that different strings with the same value refer to the same object. I also know that a string can be interned at runtime:
string now = string.Intern(DateTime.Now.ToString());
Obviously a string that is interned at runtime resides on the heap, but I had assumed that a literal is placed in the program’s data segment (and said so in my answer to that question). However, I don’t remember seeing this documented anywhere. I assumed it because it’s how I would do it, and because literals are loaded with the ldstr IL instruction and no allocation seems to take place, which appeared to back me up.
To cut a long story short, where do string literals reside? Are they on the heap, in the data segment, or somewhere I haven’t thought of?
Edit: If string literals do reside on the heap, when are they allocated?
Strings in .NET are reference types, so they are always on the heap (even when they are interned). You can verify this using a debugger such as WinDbg.
If you have the class below
and you call Foo() on an instance, you can use WinDbg to inspect the heap.
For a small program the reference will most likely be stored in a register, so the easiest way to find the reference to the specific string is to run !dso. This gives us the address of the string in question:
Now use !gcgen to find out which generation the instance is in:
It’s in generation zero, i.e. it has just been allocated. Who’s rooting it?
The ESP is the stack for our Foo() method, but notice that we have an object[] as well. That’s the intern table. Let’s take a look.
I reduced the output somewhat, but you get the idea.
In conclusion: strings are on the heap, even when they are interned. The intern table holds a reference to the instance on the heap, so interned strings are not collected during GC because the intern table roots them.
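If you don’t have a debugger at hand, the same conclusion can be sanity-checked from managed code. The following is only a minimal sketch, not part of the WinDbg session above: the class name InternDemo and the "hello world" value are my own, and it assumes an ordinary JIT-compiled build where literals are interned. It compares a literal with an equal string built at runtime, then asks the intern table for its instance.

```csharp
using System;
using System.Text;

// Hypothetical demo class; not the class inspected in the WinDbg session.
class InternDemo
{
    static void Main()
    {
        string literal = "hello world";

        // Built at runtime so the compiler cannot fold it into the literal.
        string built = new StringBuilder().Append("hello ").Append("world").ToString();

        // Same characters, compared by value.
        Console.WriteLine(literal == built);

        // Are the two variables the same heap object?
        Console.WriteLine(object.ReferenceEquals(literal, built));

        // string.Intern returns the instance the intern table already holds
        // for this value; compare it with the literal's reference.
        Console.WriteLine(object.ReferenceEquals(literal, string.Intern(built)));

        // string.IsInterned returns null if no equal string is in the table.
        Console.WriteLine(string.IsInterned(built) != null);
    }
}
```

If literals are ordinary heap objects rooted by the intern table, I would expect the second comparison to report two distinct objects and the third to report that string.Intern hands back the literal’s own instance.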