Exploring the internals of the .NET Runtime

I recently appeared on Herding Code and Stackify ‘Developer Things’ podcasts and in both cases, the first question asked was ‘how do you figure out the internals of the .NET runtime’?

This post is an attempt to articulate that process, in the hope that it might be useful to others.

Here are my suggested steps:

  1. Decide what you want to investigate
  2. See if someone else has already figured it out (optional)
  3. Read the ‘Book of the Runtime’
  4. Build from the source
  5. Debugging
  6. Verify against .NET Framework (optional)

Note: As with all these types of lists, just because it worked for me doesn’t mean that it will for everyone. So, ‘your milage may vary’.

Step One - Decide what you want to investigate

For me, this means working out what question I’m trying to answer, for example here are some previous posts I’ve written:

(it just goes to show, you don’t always need fancy titles!)

I put this as ‘Step 1’ because digging into .NET internals isn’t quick or easy work, some of my posts take weeks to research, so I need to have a motivation to keep me going, something to focus on. In addition, the CLR isn’t a small run-time, there’s a lot in there, so just blindly trying to find your way around it isn’t easy! That’s why having a specific focus helps, looking at one feature or section at a time is more manageable.

The very first post where I followed this approach was Strings and the CLR - a Special Relationship. I’d previously spent some time looking at the CoreCLR source and I knew a bit about how Strings in the CLR worked, but not all the details. During the research of that post I then found more and more areas of the CLR that I didn’t understand and the rest of my blog grew from there (delegates, arrays, fixed keyword, type loader, etc).

Aside: I think this is generally applicable, if you want to start blogging, but you don’t think you have enough ideas to sustain it, I’d recommend that you start somewhere and other ideas will follow.

Another tip is to look at HackerNews or /r/programming for posts about the ‘internals’ of other runtimes, e.g. Java, Ruby, Python, Go etc, then write the equivalent post about the CLR. One of my most popular posts A Hitchhikers Guide to the CoreCLR Source Code was clearly influenced by equivalent articles!

Finally, for more help with learning, ‘figuring things out’ and explaining them to others, I recommend that you read anything by Julia Evans. Start with Blogging principles I use and So you want to be a wizard (also available as a zine), then work your way through all the other posts related to blogging or writing.

I’ve been hugely influenced, for the better, by Julia’s approach to blogging.

Step Two - See if someone else has already figured it out (optional)

I put this in as ‘optional’, because it depends on your motivation. If you are trying to understand .NET internals for your own education, then feel-free to write about whatever you want. If you are trying to do it to also help others, I’d recommend that you first see what’s already been written about the subject. If, once you’ve done that you still think there is something new or different that you can add, then go ahead, but I try not to just re-hash what is already out there.

To see what’s already been written, you can start with Resources for Learning about .NET Internals or peruse the ‘Internals’ tag on this blog. Another really great resource is all the answers by Hans Passant on StackOverflow, he is prolific and amazingly knowledgeable, here’s some examples to get you started:

Step Three - Read the ‘Book of the Runtime’

You won’t get far in investigating .NET internals without coming across the ‘Book of the Runtime’ (BOTR) which is an invaluable resource, even Scott Hanselman agrees!

It was written by the .NET engineering team, for the .NET engineering team, as per this HackerNews comment:

Having worked for 7 years on the .NET runtime team, I can attest that the BOTR is the official reference. It was created as documentation for the engineering team, by the engineering team. And it was (supposed to be) kept up to date any time a new feature was added or changed.

However, just a word of warning, this means that it’s an in-depth, non-trivial document and hard to understand when you are first learning about a particular topic. Several of my blog posts have consisted of the following steps:

  1. Read the BOTR chapter on ‘Topic X’
  2. Understand about 5% of what I read
  3. Go away and learn more (read the source code, read other resources, etc)
  4. GOTO ‘Step 1’, understanding more this time!

Related to this, the source code itself is often as helpful as the BOTR due to the extensive comments, for example this one describing the rules for prestubs really helped me out. The downside of the source code comments is that they are bit harder to find, whereas the BOTR is all in one place.

Step Four - Build from the source

However, at some point, just reading about the internals of the CLR isn’t enough, you actually need to ‘get your hands’ dirty and see it in action. Now that the Core CLR is open source it’s very easy to build it yourself and then once you’ve done that, there are even more docs to help you out if you are building on different OSes, want to debug, test CoreCLR in conjunction with CoreFX, etc.

But why is building from source useful?

Because it lets you build a Debug/Diagnostic version of the runtime that gives you lots of additional information that isn’t available in the Release/Retails builds. For instance you can view JIT Dumps using COMPlus_JitDump=..., however this is just one of many COMPlus_XXX settings you can use, there are 100’s available.

However, even more useful is the ability to turn on diagnostic logging for a particular area of the CLR. For instance, lets imagine that we want to find out more about AppDomains and how they work under-the-hood, we can use the following logging configuration settings:

SET COMPLUS_LogFacility=02000000

Where LogFacility is set to LF_APPDOMAIN, there are many other values you can provide as a HEX bit-mask the full list is available in the source code. If you set these variables and then run an app, you will get a log output like this one. Once you have this log you can very easily search around in the code to find where the messages came from, for instance here are all the places that LF_APPDOMAIN is logged. This is a great technique to find your way into a section of the CLR that you aren’t familiar with, I’ve used it many times to great effect.

Step Five - Debugging

For me, biggest boon of Microsoft open sourcing .NET is that you can discover so much more about the internals without having to resort to ‘old school’ debugging using WinDBG. But there still comes a time when it’s useful to step through the code line-by-line to see what’s going on. The added advantage of having the source code is that you can build a copy locally and then debug through that using Visual Studio which is slightly easier than WinDBG.

I always leave debugging to last, as it can be time-consuming and I only find it helpful when I already know where to set a breakpoint, i.e. I already know which part of the code I want to step through. I once tried to blindly step through the source of the CLR whilst it was starting up and it was very hard to see what was going on, as I’ve said before the CLR is a complex runtime, there are many things happening, so stepping through lots of code, line-by-line can get tricky.

Step Six - Verify against .NET Framework

I put this final step in because the .NET CLR source available on GitHub is the ‘.NET Core’ version of the runtime, which isn’t the same as the full/desktop .NET Framework that’s been around for years. So you may need to verify the behavior matches, if you want to understand the internals ‘as they were’, not just ‘as they will be’ going forward. For instance .NET Core has removed the ability to create App Domains as a way to provide isolation but interestingly enough the internal class lives on!

To verify the behaviour, your main option is to debug the CLR using WinDBG. Beyond that, you can resort to looking at the ‘Rotor’ source code (roughly the same as .NET Framework 2.0), or petition Microsoft the release the .NET Framework Source Code (probably not going to happen)!

However, low-level internals don’t change all that often, so more often than not the way things behave in the CoreCLR is the same as they’ve always worked.


Finally, for your viewing pleasure, here are a few talks related to ‘.NET Internals’:

Discuss this post on /r/programming or /r/dotnet