How the .NET Runtime loads a Type
15 Jun 2017
It is something we take for granted every time we run a .NET program, but it turns out that loading a Type or class is a fairly complex process.
So how does the .NET Runtime (CLR) actually load a Type?
If you want the tl;dr: it’s done carefully, cautiously and step-by-step.
Ensuring Type Safety
One of the key requirements of a ‘Managed Runtime’ is providing Type Safety, but what does that actually mean? From the MSDN page on Type Safety and Security:
Type-safe code accesses only the memory locations it is authorized to access. (For this discussion, type safety specifically refers to memory type safety and should not be confused with type safety in a broader respect.) For example, type-safe code cannot read values from another object’s private fields. It accesses types only in well-defined, allowable ways.
So in effect, the CLR has to ensure your Types/Classes are well-behaved and following the rules.
Compiler prevents you from creating an ‘abstract’ class
But let’s look at a more concrete example, using the C# code below:
public abstract class AbstractClass
{
    public AbstractClass() { }
}
public class NormalClass : AbstractClass
{
    public NormalClass() { }
}
public static void Main(string[] args)
{
    var test = new AbstractClass();
}
The compiler quite rightly refuses to compile this and gives the following error, because abstract classes can’t be instantiated; you can only inherit from them.
error CS0144: Cannot create an instance of the abstract class or interface 'ConsoleApplication.AbstractClass'
So that’s all well and good, but the CLR can’t rely on all code being created via a well-behaved compiler, or in fact via a compiler at all. So it has to check for and prevent any attempt to instantiate an abstract class.
Writing IL code by hand
One way to circumvent the compiler is to write IL code by hand using the IL Assembler tool (ILAsm), which does almost no checks on the validity of the IL you give it.
For instance, the IL below is the equivalent of writing var test = new AbstractClass(); (if the C# compiler would let us):
.method public hidebysig static void Main(string[] args) cil managed
{
    .entrypoint
    .maxstack 1
    .locals init (
        [0] class ConsoleApplication.NormalClass class2)
    // System.InvalidOperationException: Instances of abstract classes cannot be created.
    newobj instance void ConsoleApplication.AbstractClass::.ctor()
    stloc.0
    ldloc.0
    callvirt instance class [mscorlib]System.Type [mscorlib]System.Object::GetType()
    callvirt instance string [mscorlib]System.Reflection.MemberInfo::get_Name()
    call void [mscorlib]Internal.Console::WriteLine(string)
    ret
}
Fortunately the CLR has got this covered and will throw an InvalidOperationException when you execute the code. This is due to a check that is hit when the JIT compiles the newobj IL instruction.
Creating Types at run-time
One other way that you can attempt to create an abstract class is at run-time, using reflection (thanks to this blog post for giving me some tips on other ways of creating Types). This is shown in the code below:
var abstractType = Type.GetType("ConsoleApplication.AbstractClass");
Console.WriteLine(abstractType.FullName);
// System.MissingMethodException: Cannot create an abstract class.
var abstractInstance = Activator.CreateInstance(abstractType);
The compiler is completely happy with this; it doesn’t do anything to prevent or warn you, and nor should it. However, when you run the code it will throw an exception, strangely enough a MissingMethodException this time, but it does the job!
The call stack is below:
- Activator CreateInstance(..) (C# code)
- RtType CreateInstanceSlow(..) (C# code)
- RuntimeHandles CreateInstance(..) (extern call)
- RuntimeTypeHandle::CreateInstance(..) (C++ implementation)
- The actual check that throws a MissingMethodException
One final way (unless I’ve missed some out?) is to use GetUninitializedObject(..) in the FormatterServices class, like so:
// FormatterServices lives in the System.Runtime.Serialization namespace
public static object CreateInstance(Type type)
{
    var constructor = type.GetConstructor(new Type[0]);
    if (constructor == null && !type.IsValueType)
    {
        throw new NotSupportedException(
            "Type '" + type.FullName + "' doesn't have a parameterless constructor");
    }
    // Allocates the object without running any constructor
    var emptyInstance = FormatterServices.GetUninitializedObject(type);
    if (constructor == null)
        return null;
    // Run the constructor against the already-allocated instance
    return constructor.Invoke(emptyInstance, new object[0]) ?? emptyInstance;
}
var abstractType = Type.GetType("ConsoleApplication.AbstractClass");
Console.WriteLine(abstractType.FullName);
// System.MemberAccessException: Cannot create an abstract class.
var abstractInstance = CreateInstance(abstractType);
Again the run-time stops you from doing this; however, this time it decides to throw a MemberAccessException instead.
This happens via the following call stack:
- FormatterServices GetUninitializedObject(..) (C# code)
- FormatterServices nativeGetUninitializedObject(..) (extern call)
- ReflectionSerialization::GetUninitializedObject(..) (C++ implementation)
- The actual check that throws a MemberAccessException
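The two reflection paths above throw different exception types, but MissingMethodException derives from MemberAccessException (via MissingMemberException), so calling code that doesn’t care which path rejected the type can catch the base class, or check Type.IsAbstract up front. A minimal sketch of that idea is below; SafeActivator and TryCreate are hypothetical names used only for illustration, not part of the framework.

```csharp
using System;

public abstract class AbstractClass { }
public class NormalClass : AbstractClass { }

public static class SafeActivator
{
    // Hypothetical helper: returns false instead of throwing when the
    // runtime refuses to instantiate the type.
    public static bool TryCreate(Type type, out object instance)
    {
        instance = null;
        if (type.IsAbstract) // true for abstract classes and for interfaces
            return false;
        try
        {
            instance = Activator.CreateInstance(type);
            return true;
        }
        catch (MemberAccessException)
        {
            // Base class of MissingMethodException, so this handler covers
            // both of the exception types seen above.
            return false;
        }
    }
}

public static class SafeActivatorDemo
{
    public static void Main()
    {
        Console.WriteLine(SafeActivator.TryCreate(typeof(AbstractClass), out _)); // False
        Console.WriteLine(SafeActivator.TryCreate(typeof(NormalClass), out _));   // True
        // True: MissingMethodException is a MemberAccessException
        Console.WriteLine(typeof(MemberAccessException).IsAssignableFrom(typeof(MissingMethodException)));
    }
}
```

The up-front IsAbstract check is only a convenience; the run-time checks described above still apply if it is skipped.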
Further Type-Safety Checks
These checks are just one example of what the runtime has to validate when creating types; there are many more things it has to deal with. For instance you can’t:
- instantiate an interface
- create a Function Pointer type
- load a type with invalid IL
- box a type containing stack pointers
- load a type if any of its generic argument types failed to load
- create a subclass of an Array
- create virtual, static methods
- have methods in an enum
- have a class with a method name that is too long (1024 characters if you’re wondering)
- and many, many more (for instance, search classcompat.cpp for BuildMethodTableThrowException and methodtablebuilder.cpp for ThrowTypeLoadException)
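Most of the items above need hand-written IL to trigger, but one related load-time check (not in the list above) can be reached from plain C#: a struct with an explicit layout where an object reference overlaps a non-object field. The compiler accepts it, and I’d expect the runtime to reject the type with a TypeLoadException the first time it is needed. The sketch below assumes that behaviour; Overlapped, LayoutProbe and Probe are hypothetical names.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// The C# compiler accepts this, but an object reference may not share its
// bytes with a non-object field, so the type loader should refuse the type.
[StructLayout(LayoutKind.Explicit)]
public struct Overlapped
{
    [FieldOffset(0)] public object Reference;
    [FieldOffset(0)] public long Number;
}

public static class LayoutProbe
{
    // Kept out of line so the type is first needed when this method is
    // compiled, which happens inside the caller's try block.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void UseOverlapped()
    {
        var value = new Overlapped();
        value.Number = 42;
        Console.WriteLine(value.Number);
    }

    // Returns the name of the exception type, or "loaded" if nothing was thrown.
    public static string Probe()
    {
        try
        {
            UseOverlapped();
            return "loaded";
        }
        catch (TypeLoadException ex)
        {
            return ex.GetType().Name;
        }
    }

    public static void Main() => Console.WriteLine(Probe());
}
```

Note that the exception surfaces where the type is first used, not where it is declared, which is why the probe method is isolated from its caller.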
Loading Types ‘step-by-step’
So we’ve seen that the CLR has to do multiple checks when it’s loading types, but why does it have to load them ‘step-by-step’?
Well, in a nutshell, it’s because of circular references and recursion, particularly when dealing with generic types. Take the code below from section ‘2.1 Load Levels’ in Type Loader Design (BotR):
class A<T> : C<B<T>>
{ }
class B<T> : C<A<T>>
{ }
class C<T>
{ }
These are valid types and class A depends on class B and vice versa. So we can’t load A until we know that B is valid, but we can’t load B until we’re sure that A is valid, a classic deadlock!!
How does the run-time get round this? Well, from the same BotR page:
The loader initially creates the structure(s) representing the type and initializes them with data that can be obtained without loading other types. When this “no-dependencies” work is done, the structure(s) can be referred from other places, usually by sticking pointers to them into another structures. After that the loader progresses in incremental steps and fills the structure(s) with more and more information until it finally arrives at a fully loaded type. In the above example, the base types of A and B will be approximated by something that does not include the other type, and substituted by the real thing later.
(there is also some more info here)
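To convince yourself that the circular definitions really do load, the BotR types can be compiled as-is and inspected via reflection. This is just a small sanity check; the CircularGenericsDemo class is a name I’ve made up for the example.

```csharp
using System;

public class C<T> { }
public class A<T> : C<B<T>> { }
public class B<T> : C<A<T>> { }

public static class CircularGenericsDemo
{
    public static void Main()
    {
        // Each base type mentions the other generic type, yet both load fine.
        Type baseOfA = typeof(A<int>).BaseType;
        Type baseOfB = typeof(B<int>).BaseType;
        Console.WriteLine(baseOfA == typeof(C<B<int>>)); // True
        Console.WriteLine(baseOfB == typeof(C<A<int>>)); // True
    }
}
```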
So it loads types in stages, step-by-step, ensuring each dependent type has reached the same stage before continuing. These ‘Class Load’ stages are shown in the image below and explained in detail in this very helpful source-code comment (Yay for Open-Sourcing the CoreCLR!!)
The different levels are handled in the ClassLoader::DoIncrementalLoad(..) method, which contains the switch statement that deals with them all in turn.
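To make the idea of loading in levels more concrete, here is a toy model of it. This is emphatically not the CLR’s algorithm: the level names are a simplified, hypothetical subset, ToyLoader is invented for this example, and “dependencies” are just strings. It only illustrates why publishing a type early and raising it one level at a time lets a cycle such as A and B terminate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified levels; the real list of load levels is longer.
public enum ToyLoadLevel { Begin, ApproxParents, ExactParents, Loaded }

public sealed class ToyLoader
{
    // Stand-in for metadata: type name -> names of the types it depends on.
    private readonly Dictionary<string, string[]> _dependencies;
    // Published types and the level each one has reached so far.
    private readonly Dictionary<string, ToyLoadLevel> _levels = new Dictionary<string, ToyLoadLevel>();

    public ToyLoader(Dictionary<string, string[]> dependencies) => _dependencies = dependencies;

    public ToyLoadLevel LevelOf(string name) => _levels[name];

    public void Load(string name, ToyLoadLevel target)
    {
        // Publish the type at the lowest level first, so others can refer to it.
        if (!_levels.ContainsKey(name))
            _levels[name] = ToyLoadLevel.Begin;

        while (_levels[name] < target)
        {
            ToyLoadLevel current = _levels[name];
            // Dependencies only have to reach the level this type is leaving,
            // so each recursive call asks for less and a cycle cannot loop forever.
            foreach (string dependency in _dependencies[name])
                Load(dependency, current);
            _levels[name] = current + 1;
        }
    }
}

public static class ToyLoaderDemo
{
    public static void Main()
    {
        var loader = new ToyLoader(new Dictionary<string, string[]>
        {
            ["A"] = new[] { "B" },
            ["B"] = new[] { "A" },
        });
        loader.Load("A", ToyLoadLevel.Loaded);
        // prints "A: Loaded, B: ExactParents"
        Console.WriteLine($"A: {loader.LevelOf("A")}, B: {loader.LevelOf("B")}");
    }
}
```

In this toy version B is left one level behind A; the real loader does more bookkeeping to bring every dependency up to the final level.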
However this is part of a bigger process, which controls loading an entire file, also known as a Module or Assembly in .NET terminology. That entire process is handled by another dispatch loop (switch statement) that works with the FileLoadLevel enum. So in reality the whole process for loading an Assembly looks like this (the loading of one or more Types happens as sub-steps once the Module has reached the FILE_LOADED stage):
- FILE_LOAD_CREATE - DomainFile ctor()
- FILE_LOAD_BEGIN - Begin()
- FILE_LOAD_FIND_NATIVE_IMAGE - FindNativeImage()
- FILE_LOAD_VERIFY_NATIVE_IMAGE_DEPENDENCIES - VerifyNativeImageDependencies()
- FILE_LOAD_ALLOCATE - Allocate()
- FILE_LOAD_ADD_DEPENDENCIES - AddDependencies()
- FILE_LOAD_PRE_LOADLIBRARY - PreLoadLibrary()
- FILE_LOAD_LOADLIBRARY - LoadLibrary()
- FILE_LOAD_POST_LOADLIBRARY - PostLoadLibrary()
- FILE_LOAD_EAGER_FIXUPS - EagerFixups()
- FILE_LOAD_VTABLE_FIXUPS - VtableFixups()
- FILE_LOAD_DELIVER_EVENTS - DeliverSyncEvents()
- FILE_LOADED - FinishLoad()
  - CLASS_LOAD_BEGIN
  - CLASS_LOAD_UNRESTOREDTYPEKEY
  - CLASS_LOAD_UNRESTORED
  - CLASS_LOAD_APPROXPARENTS
  - CLASS_LOAD_EXACTPARENTS
  - CLASS_DEPENDENCIES_LOADED
  - CLASS_LOADED
- FILE_LOAD_VERIFY_EXECUTION - VerifyExecution()
- FILE_ACTIVE - Activate()
  - calls MethodTable::CheckRunClassInitThrowing() and Module::ExpandAll() which trigger/run the static constructors of all the classes in the file/module
We can see this in action if we build a Debug version of the CoreCLR and enable the relevant configuration knobs. For a simple ‘Hello World’ program we get the log output shown below, where LOADER: messages correspond to FILE_LOAD_XXX stages and PHASEDLOAD: messages indicate which CLASS_LOAD_XXX step we are on.
You can also see some of the other events that happen at the same time. These include the creation of static variables (STATICS:), thread-statics (THREAD STATICS:) and PreStubWorker, which indicates methods being prepared for the JITter.
-------------------------------------------------------------------------------------------------------
This is NOT the full output, it's only the parts that reference 'Program.exe' and its modules/classes
-------------------------------------------------------------------------------------------------------
PEImage: Opened HMODULE C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe
StoreFile: Add cached entry (000007FE65174540) with PEFile 000000000040D6E0
Assembly C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe: bits=0x2
LOADER: 439e30:***Program* >>>Load initiated, LOADED/LOADED
LOADER: 0000000000439E30:***Program* loading at level BEGIN
LOADER: 0000000000439E30:***Program* loading at level FIND_NATIVE_IMAGE
LOADER: 0000000000439E30:***Program* loading at level VERIFY_NATIVE_IMAGE_DEPENDENCIES
LOADER: 0000000000439E30:***Program* loading at level ALLOCATE
STATICS: Allocating statics for module Program
Loaded pModule: "C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe".
Module Program: bits=0x2
STATICS: Allocating 72 bytes for precomputed statics in module C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe in LoaderAllocator 000000000043AA18
StoreFile (StoreAssembly): Add cached entry (000007FE65174F28) with PEFile 000000000040D6E0
Completed Load Level ALLOCATE for DomainFile 000000000040D8C0 in AD 1 - success = 1
LOADER: 0000000000439E30:***Program* loading at level ADD_DEPENDENCIES
Completed Load Level ADD_DEPENDENCIES for DomainFile 000000000040D8C0 in AD 1 - success = 1
LOADER: 0000000000439E30:***Program* loading at level PRE_LOADLIBRARY
LOADER: 0000000000439E30:***Program* loading at level LOADLIBRARY
LOADER: 0000000000439E30:***Program* loading at level POST_LOADLIBRARY
LOADER: 0000000000439E30:***Program* loading at level EAGER_FIXUPS
LOADER: 0000000000439E30:***Program* loading at level VTABLE FIXUPS
LOADER: 0000000000439E30:***Program* loading at level DELIVER_EVENTS
DRCT::IsReady - wait(0x100)=258, GetLastError() = 42424
DRCT::IsReady - wait(0x100)=258, GetLastError() = 42424
D::LA: Load Assembly Asy:0x000000000040D8C0 AD:0x0000000000439E30 which:C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe
Completed Load Level DELIVER_EVENTS for DomainFile 000000000040D8C0 in AD 1 - success = 1
LOADER: 0000000000439E30:***Program* loading at level LOADED
Completed Load Level LOADED for DomainFile 000000000040D8C0 in AD 1 - success = 1
LOADER: 439e30:***Program* <<<Load completed, LOADED
In PreStubWorker for System.Environment::SetCommandLineArgs
Prestubworker: method 000007FEC2AE1160M
DoRunClassInit: Request to init 000007FEC3BACCF8T in appdomain 0000000000439E30
RunClassInit: Calling class contructor for type 000007FEC3BACCF8T
In PreStubWorker for System.Environment::.cctor
Prestubworker: method 000007FEC2AE1B10M
DoRunClassInit: Request to init 000007FEC3BACCF8T in appdomain 0000000000439E30
DoRunClassInit: returning SUCCESS for init 000007FEC3BACCF8T in appdomain 0000000000439E30
RunClassInit: Returned Successfully from class contructor for type 000007FEC3BACCF8T
DoRunClassInit: returning SUCCESS for init 000007FEC3BACCF8T in appdomain 0000000000439E30
PHASEDLOAD: LoadTypeHandleForTypeKey for type ConsoleApplication.Program to level LOADED
PHASEDLOAD: table contains:
LoadTypeHandle: Loading Class from Module 000007FE65174718 token 2000002
PHASEDLOAD: Creating loading entry for type ConsoleApplication.Program
PHASEDLOAD: About to do incremental load of type ConsoleApplication.Program (0000000000000000) from level BEGIN
Looking up System.Object by name.
Loading class "ConsoleApplication.Program" from module "C:\coreclr\bin\Product\Windows_NT.x64.Debug\Program.exe" in domain 0x0000000000439E30
SD: MT::MethodIterator created for System.Object.
EEC::IMD: pNewMD:0x65175178 for tok:0x6000001 (ConsoleApplication.Program::.cctor)
EEC::IMD: pNewMD:0x651751a8 for tok:0x6000002 (ConsoleApplication.Program::.ctor)
EEC::IMD: pNewMD:0x651751d8 for tok:0x6000003 (ConsoleApplication.Program::Main)
STATICS: Placing statics for ConsoleApplication.Program
STATICS: Field placed at non GC offset 0x38
Offset of staticCounter1: 56
STATICS: Field placed at non GC offset 0x40
Offset of staticCounter2: 64
STATICS: Static field bytes needed (0 is normal for non dynamic case)0
STATICS: Placing ThreadStatics for ConsoleApplication.Program
THREAD STATICS: Field placed at non GC offset 0x20
Offset of threadStaticCounter1: 32
THREAD STATICS: Field placed at non GC offset 0x28
Offset of threadStaticCounter2: 40
STATICS: ThreadStatic field bytes needed (0 is normal for non dynamic case)0
CLASSLOADER: AppDomainAgileAttribute for ConsoleApplication.Program is 0
MethodTableBuilder: finished method table for module 000007FE65174718 token 2000002 = 000007FE65175230T
PHASEDLOAD: About to do incremental load of type ConsoleApplication.Program (000007FE65175230) from level APPROXPARENTS
Notify: 000007FE65175230 ConsoleApplication.Program
Successfully loaded class ConsoleApplication.Program
PHASEDLOAD: Completed full dependency load of type (000007FE65175230)+ConsoleApplication.Program
PHASEDLOAD: Completed full dependency load of type (000007FE65175230)+ConsoleApplication.Program
LOADER: 439e30:***Program* >>>Load initiated, ACTIVE/ACTIVE
LOADER: 0000000000439E30:***Program* loading at level VERIFY_EXECUTION
LOADER: 0000000000439E30:***Program* loading at level ACTIVE
Completed Load Level ACTIVE for DomainFile 000000000040D8C0 in AD 1 - success = 1
LOADER: 439e30:***Program* <<<Load completed, ACTIVE
In PreStubWorker for ConsoleApplication.Program::Main
Prestubworker: method 000007FE651751D8M
In PreStubWorker, calling MakeJitWorker
CallCompileMethodWithSEHWrapper called...
D::gV: cVars=0, extendOthers=1
Looking up System.Console by name.
SD: MT::MethodIterator created for System.Console.
JitComplete completed successfully
Got through CallCompile MethodWithSEHWrapper
MethodDesc::MakeJitWorker finished. Stub is 000007fe`652d0480
DoRunClassInit: Request to init 000007FE65175230T in appdomain 0000000000439E30
RunClassInit: Calling class contructor for type 000007FE65175230T
In PreStubWorker for ConsoleApplication.Program::.cctor
Prestubworker: method 000007FE65175178M
In PreStubWorker, calling MakeJitWorker
CallCompileMethodWithSEHWrapper called...
D::gV: cVars=0, extendOthers=1
JitComplete completed successfully
Got through CallCompile MethodWithSEHWrapper
MethodDesc::MakeJitWorker finished. Stub is 000007fe`652d04c0
So there you have it, the CLR loads your classes/Types carefully, cautiously and step-by-step!!
As always, here are some more links if you’d like to find out more:
- Type Loader Design (BotR)
- Type System Overview (BotR)
- JIT compiler and type constructors (.cctors) (i.e. ‘When do class constructors (.cctor) get run’?)
- Why Do Initializers Run In The Opposite Order As Constructors? Part Two
- Disallow statics of spans and class instance members of span (PR)
- Span: Add tests to verify type loader checks for ref-like types #8516
- Back to Basics: When does a .NET Assembly Dependency get loaded