Thursday, July 19, 2012

String.Empty vs "" - How .NET Handles String Instances

For the longest time, I've known people, myself included, who used String.Empty to represent "" in code. The reasoning was that back in the 1.0 - 1.1 days, every string literal you wrote supposedly created an object in memory, while String.Empty was a static reference to a single object, so using it kept the developer from creating all sorts of empty strings in memory. This is how I understood it for many years, until something changed, and frankly, I didn't get the memo.

String.Empty and "" are now literally the same thing. In fact, any two string literals that match are the same reference as well. This isn't true for value types, like integers for example, but for string literals it is. Have a look:

private static void TestStringInternment()
{
   TestEquality("\"\" and \"\"", "", "");
   TestEquality("\"\" and String.Empty", "", String.Empty);
   var x = "foo";
   var y = "foo";
   TestEquality("x and y", x, y);
   // Concatenating at run time produces a new string, so x no longer matches y.
   x += "!!!";
   TestEquality("x and y again", x, y);
   // The two ints are boxed separately when passed to Object.ReferenceEquals.
   TestEquality("0 and 0", 0, 0);
}

static void TestEquality<T1, T2>(string name, T1 a, T2 b) where T1: IComparable where T2: IComparable
{
   Console.WriteLine("Equal: {0}\tSameReference: {1}\t// {2}", a.Equals(b), Object.ReferenceEquals(a, b), name);
}


Equal: True     SameReference: True     // "" and ""
Equal: True     SameReference: True     // "" and String.Empty
Equal: True     SameReference: True     // x and y
Equal: False    SameReference: False    // x and y again
Equal: True     SameReference: False    // 0 and 0

So as you can see, as long as the strings come from literals with the same value, they're actually the same instance. But why is this happening, and how? Well, the why is pretty simple: strings can be any size in memory, so it's a good idea to avoid keeping duplicate copies around. The how is pretty simple too: since strings are immutable, it's safe to point every variable holding a matching literal at the same reference, because you know the object behind that reference won't change. The CLR interns string literals (and strings the compiler can work out as constants), so each of those variables points to the same instance in the intern pool.

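So why isn't this done with things like int? Int is immutable too! ... The difference is that int is a value type, not a reference type. An int variable holds its 4 bytes directly, so there is no shared object to point at and nothing to pool. The "SameReference: False" for 0 and 0 in the test above comes from boxing: Object.ReferenceEquals takes object parameters, so each 0 gets wrapped in its own new object before the comparison.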
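To make the boxing point concrete, here is a minimal sketch. The class and method names (BoxingDemo, BoxesShareReference) are made up for illustration and are not part of the test code above; the output comments assume a normal console run.

```csharp
using System;

static class BoxingDemo
{
    // Boxes the same int twice and reports whether the two boxes are one object.
    public static bool BoxesShareReference(int value)
    {
        object first = value;   // boxing allocates a new object holding a copy
        object second = value;  // boxing again allocates a second, separate object
        return Object.ReferenceEquals(first, second);
    }

    static void Main()
    {
        Console.WriteLine(BoxesShareReference(0));             // False

        object box = 0;
        object alias = box;     // copies the reference, not the value
        Console.WriteLine(Object.ReferenceEquals(box, alias)); // True
        Console.WriteLine(box.Equals(0));                      // True
    }
}
```

So a reference comparison on ints only tells you about the boxes, never about the values inside them.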

One thing to note, however, is that some code will indeed create a new string instance, StringBuilder for example. Even if its output has the same value as another string, it won't be the instance stored in the intern pool unless you call String.Intern() on it. This doesn't mean you need to intern every string you get from StringBuilder, it just means you should be aware that not all strings are referenced from the intern pool.
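Here is a small sketch of that StringBuilder behavior. The names (InternDemo, Build) are hypothetical, and the output comments assume an ordinary JIT-compiled console app where the literal has been interned as described above.

```csharp
using System;
using System.Text;

static class InternDemo
{
    // Builds "foo!!!" at run time, so the result is a fresh string instance.
    public static string Build()
    {
        return new StringBuilder("foo").Append("!!!").ToString();
    }

    static void Main()
    {
        string literal = "foo!!!";
        string built = Build();

        Console.WriteLine(literal.Equals(built));                     // True
        Console.WriteLine(Object.ReferenceEquals(literal, built));    // False

        // String.Intern returns the pooled instance with the same value.
        string interned = String.Intern(built);
        Console.WriteLine(Object.ReferenceEquals(literal, interned)); // True
    }
}
```

Note that String.Intern doesn't change the string you pass in. It hands back the pooled reference, so you have to use the return value.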

So, I'll admit it, I didn't know this fun fact for WAY too long. I was aware of the intern pool, but I thought that was something that needed to be done explicitly. Now I know better and I figured I would share with my friends, who if they did know, never corrected me. :P Thanks, jerkfaces. LOL

1 comment:

  1. I didn't know about it either! Thanks for posting.

    Personally I thought the intern pool was the smaller swimming area, compared to the executive pool.

