Compared to C++, C#’s garbage collector seems like magic, and you can very easily write code without worrying about the underlying memory. But if you care about performance, knowing how the .NET runtime manages its RAM can help you write better code.
Value Types vs. Reference Types
There are two kinds of types in .NET, which directly affect how the underlying memory is handled.
Value types are primitive types with fixed sizes like int, bool, double, etc. They’re passed by value, meaning if you call someFunction(int arg), the argument is copied and sent over as a new location in memory.
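As a minimal sketch of what "passed by value" means in practice, the example below calls a method that changes its int argument and shows that the caller's variable is unaffected. The class and method names (PassByValueDemo, AddTen, AddTenByRef) are made up for this illustration; the ref version is included only for contrast.

```csharp
using System;

public static class PassByValueDemo
{
    // Receives a copy of the caller's int; changing it has no effect on the caller.
    public static void AddTen(int arg)
    {
        arg += 10;
    }

    // With 'ref', the method works on the caller's variable instead of a copy.
    public static void AddTenByRef(ref int arg)
    {
        arg += 10;
    }

    public static void Main()
    {
        int number = 5;

        AddTen(number);
        Console.WriteLine(number); // 5: only the copy was changed

        AddTenByRef(ref number);
        Console.WriteLine(number); // 15: the original was changed
    }
}
```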
Under the hood, value types are (usually) stored on the stack. This mostly applies to local variables, and there are plenty of exceptions where they’ll instead be stored on the heap. But in all cases, the location in memory where the value type resides holds the actual value of that variable.
The stack is just a special location in memory, initialized with a default size but able to grow up to a fixed limit. The stack is a Last-in, First-out (LIFO) data structure. You can think of it like a bucket: variables are added to the top of the bucket, and when they go out of scope, .NET reaches into the bucket and removes them from the top down, most recent first.
The stack is a lot faster than the heap, but it’s still just a location in RAM, not a special location in the CPU’s cache (though it’s smaller than the heap, and as such is very likely to be hot in the cache, which helps out with performance).
The stack gets most of its performance from its LIFO structure. When you call a function, all the variables defined in that function are added to the stack. When that function returns and those variables go out of scope, the stack clears off everything that function put on it. The runtime manages this with stack frames, which define blocks of memory for different functions. Stack allocations are extremely fast, because it’s just writing a single value to the end of the stack frame.
This is also where the term “StackOverflow” comes from: a stack overflow results when method calls are nested too deeply (for example, through runaway recursion) and fill up the entire stack.
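To get a feel for how many nested calls fit on the stack, here is a rough sketch that recurses until the runtime reports the stack is nearly full, then unwinds safely instead of crashing. It assumes .NET Core 2.0 or later, where RuntimeHelpers.TryEnsureSufficientExecutionStack is available; the StackDepthDemo class is hypothetical, and the number it prints will vary by platform, build settings, and thread stack size.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class StackDepthDemo
{
    // Each call adds one stack frame. Recursion stops when the runtime says
    // there is no longer enough stack left to safely call another method.
    // The "1 +" after the call keeps this from being compiled as a tail call.
    public static int CountFrames()
    {
        if (!RuntimeHelpers.TryEnsureSufficientExecutionStack())
        {
            return 0;
        }
        return 1 + CountFrames();
    }

    public static void Main()
    {
        Console.WriteLine("Nested calls before the stack was nearly full: " + CountFrames());
    }
}
```

Without the guard, the same recursion would end in a stack overflow, which a .NET process cannot catch and recover from.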
Reference types, however, are either too big, don’t have fixed sizes, or live too long to be on the stack. Usually, these take the form of objects and classes that have been instantiated, but they also include arrays and strings, which can vary in size.
Reference types like instances of classes are often initialized with the new keyword, which creates a new instance of the class and returns a reference to it. You can assign this to a local variable, which actually uses the stack to store the reference to the object’s location on the heap.
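The difference between holding a value and holding a reference shows up as soon as two variables point at the same object. The sketch below uses a hypothetical Player class; assigning one variable to another copies only the reference, so both names see the same heap object.

```csharp
using System;

// Hypothetical class used only for this illustration.
public class Player
{
    public int Score;
}

public static class ReferenceDemo
{
    // The parameter is a copy of the reference, so it still points at the caller's object.
    public static void AddPoints(Player player, int points)
    {
        player.Score += points;
    }

    public static void Main()
    {
        Player first = new Player(); // object lives on the heap
        Player second = first;       // copies the reference, not the object

        AddPoints(second, 10);

        Console.WriteLine(first.Score);                           // 10
        Console.WriteLine(object.ReferenceEquals(first, second)); // True
    }
}
```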
The heap can expand and fill up until the computer runs out of memory, which makes it great for storing a lot of data. However, it’s unorganized, and in C# it must be managed with a garbage collector to work properly. Heap allocations are also slower than stack allocations, although they’re still quite fast.
However, there are a number of exceptions to these rules; otherwise, value and reference types would be called “stack types” and “heap types.”
- Outer variables of lambda functions, local variables of IEnumerator blocks, and local variables of async methods are all stored on the heap.
- Value type fields of classes are long-term variables, and are always stored on the heap. They aren’t allocated separately; they’re stored inline, inside the object that contains them.
- Static class fields are also always stored on the heap.
- Custom structs are value types, but they can contain reference types like Lists and strings, which are stored on the heap as normal. Creating a copy of the struct copies only the references, so the copy points at the same heap objects as the original and nothing new is allocated on the heap.
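The struct case is the easiest of these to check for yourself. In this sketch (the Inventory struct is hypothetical), copying the struct duplicates its int field but only copies the reference held in its List field, so both structs end up sharing one list on the heap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical struct used only for this illustration.
public struct Inventory
{
    public int Capacity;       // value type field: copied along with the struct
    public List<string> Items; // reference type field: only the reference is copied
}

public static class StructCopyDemo
{
    public static void Main()
    {
        Inventory original = new Inventory { Capacity = 10, Items = new List<string>() };
        Inventory copy = original; // copies Capacity and the reference stored in Items

        copy.Capacity = 20;      // changes only the copy
        copy.Items.Add("sword"); // changes the list that both structs share

        Console.WriteLine(original.Capacity);    // 10
        Console.WriteLine(original.Items.Count); // 1
    }
}
```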
The most notable exception to the rule of “reference types being on the heap” is the usage of Span<T> with stackalloc, which manually allocates a block of memory on the stack for a temporary array that will be cleaned off the stack as normal when it goes out of scope. This bypasses a relatively expensive heap allocation, and puts less pressure on the garbage collector in the process. It can be a lot more performant, but it’s a bit of an advanced feature, so if you’d like to learn more about it you can read this guide on how to use it properly without causing a StackOverflow exception.
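Here is a small sketch of a stack-allocated buffer, assuming C# 7.2 or later, where a stackalloc expression can be assigned directly to a Span<T>. The StackallocDemo class and the 128-element cap are arbitrary choices for the example; the point of the cap is to keep the buffer small so that an unexpectedly large count can't exhaust the stack.

```csharp
using System;

public static class StackallocDemo
{
    // Sums the squares of 0..count-1 using a temporary buffer on the stack.
    public static int SumOfSquares(int count)
    {
        // Keep stack buffers small; this limit is an arbitrary choice for the sketch.
        if (count < 1 || count > 128)
        {
            throw new ArgumentOutOfRangeException(nameof(count));
        }

        Span<int> buffer = stackalloc int[count]; // lives on the stack, not the heap
        for (int i = 0; i < buffer.Length; i++)
        {
            buffer[i] = i * i;
        }

        int sum = 0;
        foreach (int value in buffer)
        {
            sum += value;
        }
        return sum; // the buffer disappears when this method returns
    }

    public static void Main()
    {
        Console.WriteLine(SumOfSquares(4)); // 0 + 1 + 4 + 9 = 14
    }
}
```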
What Is Garbage Collection?
The stack is very organized, but the heap is messy. Without something to manage it, things on the heap don’t get cleaned up automatically, and your application eventually runs out of memory because nothing is ever freed.
Of course, that’s a problem, which is why the garbage collector exists. It periodically scans your application for objects that are no longer reachable from any live reference, such as a local variable on the stack or a static field, which indicates that the program has stopped caring about the data. The .NET runtime can then come in and clean up, and shift memory around in the process to make the heap more organized.
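You can watch the collector at work with the counters on the GC class. The sketch below creates a lot of short-lived arrays, then forces a collection with GC.Collect() and reads GC.CollectionCount(0) before and after. The GcDemo class is hypothetical, the counts printed will differ between runs and machines, and forcing a collection like this is only for demonstration; in real code it is usually better to let the runtime decide when to collect.

```csharp
using System;

public static class GcDemo
{
    // Allocates short-lived arrays that become garbage as soon as each iteration ends.
    public static void CreateGarbage(int count)
    {
        for (int i = 0; i < count; i++)
        {
            byte[] temp = new byte[1024];
            temp[0] = 1;
        }
    }

    // Forces a collection and returns how many generation 0 collections it added.
    public static int ForceCollection()
    {
        int before = GC.CollectionCount(0);
        GC.Collect();
        return GC.CollectionCount(0) - before;
    }

    public static void Main()
    {
        CreateGarbage(100000);
        Console.WriteLine("Gen 0 collections so far: " + GC.CollectionCount(0));
        Console.WriteLine("Added by forcing one: " + ForceCollection());
    }
}
```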
However, this magic comes at a cost—garbage collection is slow and expensive. It runs on a background thread, but there is a period where program execution must be halted to run the garbage collection. This is the tradeoff that comes with programming in C#; all you can do is try to minimize the garbage you create.
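As one illustration of minimizing garbage, the sketch below builds the same string two ways and compares how many bytes each approach allocates. It assumes a runtime where GC.GetAllocatedBytesForCurrentThread is available (recent versions of .NET); the AllocationDemo class and its method names are made up for this example, and the exact byte counts will vary.

```csharp
using System;
using System.Text;

public static class AllocationDemo
{
    // Repeated concatenation: every += allocates a brand new, longer string.
    public static string JoinWithConcat(int count)
    {
        string result = "";
        for (int i = 0; i < count; i++)
        {
            result += i + ",";
        }
        return result;
    }

    // StringBuilder appends into a growable buffer, so far fewer objects are created.
    public static string JoinWithBuilder(int count)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < count; i++)
        {
            builder.Append(i).Append(',');
        }
        return builder.ToString();
    }

    // Returns the bytes allocated on this thread while running the given function.
    public static long MeasureAllocatedBytes(Func<int, string> build, int count)
    {
        long before = GC.GetAllocatedBytesForCurrentThread();
        build(count);
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    public static void Main()
    {
        Console.WriteLine("Concat:  " + MeasureAllocatedBytes(JoinWithConcat, 1000) + " bytes");
        Console.WriteLine("Builder: " + MeasureAllocatedBytes(JoinWithBuilder, 1000) + " bytes");
    }
}
```

Both methods return the same text, but the concatenation version leaves behind a discarded intermediate string on every iteration, which is exactly the kind of garbage the collector later has to clean up.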
In languages without a garbage collector, you need to manually clean up after yourself, which is faster in many cases but more annoying for the programmer. So, in a sense, a garbage collector is like a Roomba, which cleans up your floors automatically, but is slower than just getting up and vacuuming.