When is it a good idea to intern strings manually in .NET code?

I have done this in deserialization/materialization code when there is a good chance of repeated values (almost an enum, but not quite). When deserializing thousands of records this can give a significant memory benefit. However, in such cases you might prefer to use a separate intern cache, to avoid saturating the shared one (or maybe the shared one is fine; it depends on the scenario).

But the key point is: you need a scenario where you are likely to have lots and lots of different string instances with the same value. Deserialization is a big candidate there. It should also be noted that there is some CPU overhead in checking the intern cache (progressively more overhead as you add data), so this should only be done if there is a chance that the constructed objects are going to survive beyond gen-0; if they are always going to be collected quickly anyway, then it isn't worth swapping them for interned versions.


Please elaborate on the separate intern cache. Is this something that the .NET Framework provides, or something that needs to be implemented manually? Thank you. – Hamish Grubijan Nov 13 '10 at 21:47

@Hamish - something as simple as a Dictionary-of-string-string will suffice, using the same string in key and value. (I would format that properly, but I'm on iPod) – Marc Gravell♦ Nov 13 '10 at 21:51

Thanks, seeing some code samples would help me a great deal. – Hamish Grubijan Nov 15 '10 at 23:17

@Hamish look at DeserializeImpl here which calls into Intern here – Marc Gravell♦ Nov 16 '10 at 14:23
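The Dictionary-of-string-string idea from the comments could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code linked above: `StringPool` and its `Intern` method are hypothetical names, the pool is not thread-safe, and it uses ordinal comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not a framework type): a private intern cache that
// stores each distinct string once, using the same string as key and value.
public sealed class StringPool
{
    private readonly Dictionary<string, string> _pool =
        new Dictionary<string, string>(StringComparer.Ordinal);

    // Returns the pooled instance with the same value, adding it if it is new.
    public string Intern(string value)
    {
        if (value == null) return null;

        string existing;
        if (_pool.TryGetValue(value, out existing)) return existing;

        _pool.Add(value, value);
        return value;
    }

    // Number of distinct strings currently held by the pool.
    public int Count { get { return _pool.Count; } }
}

public static class StringPoolDemo
{
    public static void Main()
    {
        var pool = new StringPool();

        // Two separately allocated strings with the same value,
        // as a deserializer would produce for repeated field values.
        string a = pool.Intern(new string(new[] { 'U', 'K' }));
        string b = pool.Intern(new string(new[] { 'U', 'K' }));

        Console.WriteLine(ReferenceEquals(a, b)); // True: both point at one instance
        Console.WriteLine(pool.Count);            // 1
    }
}
```

Unlike the shared intern pool, a pool like this becomes collectable as soon as the deserializer drops its reference to it, so the strings only live as long as the objects that still use them.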

It's a good idea to do so when profiling shows that it gives performance benefits.

But before implementing and profiling it, one should know in which situations it might make sense at all. – CodeInChaos Nov 13 '10 at 20:30

That is a bit of a circular argument, though, and it is worth stressing that interning is about memory performance (doing the checks etc. actually reduces CPU performance) – Marc Gravell♦ Nov 13 '10 at 21:40

Interning is done by the runtime, but a language could introduce its own string type with different behavior. It is only done automatically for literal strings. If you want to intern dynamically created strings, you can do so yourself.
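As a small sketch of interning dynamically created strings, the example below uses the framework's `string.Intern` and `string.IsInterned`. The class name `InternDemo` and the sample value are made up for illustration.

```csharp
using System;

public static class InternDemo
{
    public static void Main()
    {
        // Two strings built at run time: equal in value, but separate objects.
        string first = new string(new[] { 'g', 'r', 'e', 'e', 'n' });
        string second = new string(new[] { 'g', 'r', 'e', 'e', 'n' });

        Console.WriteLine(first == second);                // True: same value
        Console.WriteLine(ReferenceEquals(first, second)); // False: two instances

        // string.Intern returns the pooled instance for that value,
        // adding the string to the shared pool if it is not there yet.
        string internedFirst = string.Intern(first);
        string internedSecond = string.Intern(second);

        Console.WriteLine(ReferenceEquals(internedFirst, internedSecond)); // True

        // string.IsInterned returns the pooled instance, or null if the
        // value has not been interned.
        Console.WriteLine(string.IsInterned(second) != null); // True
    }
}
```

After this, `second` itself can be dropped and only the pooled instance kept, which is where the memory saving comes from.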

For one thing, it makes comparing strings really simple, but keep in mind that while some operations will benefit from interning, others will not. For example, interned strings are not released until process shutdown (as they are rooted by the internal structure, see this question for details), so if you intern a lot of strings manually, the process will carry around a lot of memory.
