How do .NET delegates work?

Delegates are a fundamental part of the .NET runtime and whilst you rarely create them directly, they are there under-the-hood every time you use a lambda in LINQ (=>) or a Func<T>/Action<T> to make your code more functional. But how do they actually work and what’s going on in the CLR when you use them?


IL of delegates and/or lambdas

Let’s start with a small code sample like this:

public delegate string SimpleDelegate(int x);

class DelegateTest
{
    // field read by 'InstanceMethod' below
    string name;

    static int Main()
    {
        // create an instance of the class
        DelegateTest instance = new DelegateTest();
        instance.name = "My instance";

        // create a delegate
        SimpleDelegate d1 = new SimpleDelegate(instance.InstanceMethod);

        // call 'InstanceMethod' via the delegate (compiler turns this into 'd1.Invoke(5)')
        string result = d1(5); // returns "My instance: 5"

        // 'Main' is declared as returning int, so it needs an exit code
        return 0;
    }

    string InstanceMethod(int i)
    {
        return string.Format("{0}: {1}", name, i);
    }
}

If you were to take a look at the IL of the SimpleDelegate class, the ctor and Invoke methods look like so:

[MethodImpl(0, MethodCodeType=MethodCodeType.Runtime)]
public SimpleDelegate(object @object, IntPtr method);

[MethodImpl(0, MethodCodeType=MethodCodeType.Runtime)]
public virtual string Invoke(int x);

It turns out that this behaviour is mandated by the spec, from ECMA 335 Standard - Common Language Infrastructure (CLI):

Delegates in the Common Language Infrastructure (CLI) Spec

So the internal implementation of a delegate, the part responsible for calling a method, is created by the runtime. This is because the runtime needs complete control over those methods: delegates are a fundamental part of the CLR, so any security issues, performance overhead or other inefficiencies would be a big problem.
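One way to observe this from managed code is via reflection. The sketch below is not taken from the runtime: it declares its own copy of SimpleDelegate and a hypothetical DelegateInspector helper (a name invented for this example), then checks whether the ctor and Invoke carry the 'runtime' code type, i.e. whether their bodies are supplied by the runtime rather than by IL.

```csharp
using System;
using System.Reflection;

public delegate string SimpleDelegate(int x);

// Hypothetical helper, invented for this example.
public static class DelegateInspector
{
    // True when the method is marked 'runtime', meaning it has no IL body
    // and the runtime provides the implementation.
    public static bool IsRuntimeImplemented(MethodBase method)
    {
        MethodImplAttributes flags = method.GetMethodImplementationFlags();
        return (flags & MethodImplAttributes.CodeTypeMask) == MethodImplAttributes.Runtime;
    }

    public static void Main()
    {
        Type type = typeof(SimpleDelegate);
        ConstructorInfo ctor = type.GetConstructor(new[] { typeof(object), typeof(IntPtr) });
        MethodInfo invoke = type.GetMethod("Invoke");

        Console.WriteLine(type.BaseType);                 // System.MulticastDelegate
        Console.WriteLine(IsRuntimeImplemented(ctor));    // True
        Console.WriteLine(IsRuntimeImplemented(invoke));  // True
    }
}
```

An ordinary C# method would report False here, because its code type is IL.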

Methods that are created in this way are technically known as EEImpl methods (i.e. implemented by the ‘Execution Engine’), from the ‘Book of the Runtime’ (BOTR) section ‘Method Descriptor - Kinds of MethodDescs’:

EEImpl Delegate methods whose implementation is provided by the runtime (Invoke, BeginInvoke, EndInvoke). See ECMA 335 Partition II - Delegates.

There’s also more information available in these two excellent articles: .NET Type Internals - From a Microsoft CLR Perspective (section on ‘Delegates’) and Understanding .NET Delegates and Events, By Practice (section on ‘Internal Delegates Representation’).


How the runtime creates delegates

Inlining of delegate ctors

So we’ve seen that the runtime has responsibility for creating the bodies of delegate methods, but how is this done? It starts by wiring up the delegate constructor (ctor), as per the BOTR page on ‘method descriptors’:

FCall Internal methods implemented in unmanaged code. These are methods marked with MethodImplAttribute(MethodImplOptions.InternalCall) attribute, delegate constructors and tlbimp constructors.

At runtime this happens when the JIT compiles a method that contains IL code for creating a delegate. In Compiler::fgOptimizeDelegateConstructor(..), the JIT first obtains a reference to the correct delegate ctor, which in the simple case is CtorOpened(Object target, IntPtr methodPtr, IntPtr shuffleThunk), before finally wiring up the ctor, inlining it if possible for maximum performance.

Creation of the delegate Invoke() method

But what’s more interesting is the process that happens when creating the Invoke() method. This uses a technique involving ‘stubs’ of code (raw assembly) that know how to locate the information about the target method and can transfer control to it. These ‘stubs’ are actually used in a wide variety of scenarios, for instance during Virtual Method Dispatch and also by the JITter (when a method is first called it hits a ‘pre-code stub’ that causes the method to be JITted; the ‘stub’ is then replaced by a call to the JITted ‘native code’).

In the particular case of delegates, these stubs are referred to as ‘shuffle thunks’. This is because part of the work they have to do is ‘shuffle’ the arguments that are passed into the Invoke() method, so that they are in the correct place (stack/register) by the time the ‘target’ method is called.

To understand what’s going on, it’s helpful to look at the following diagram taken from the BOTR page on Method Descriptors and Precode stubs. The ‘shuffle thunks’ we are discussing are a particular case of a ‘stub’ and sit in the corresponding box in the diagram:

Figure 3 The most complex case of Precode, Stub and Native Code

How ‘shuffle thunks’ are set-up

So let’s look at the code flow for the delegate we created in the sample at the beginning of this post, specifically an ‘open’ delegate, calling an instance method (if you are wondering about the difference between open and closed delegates, have a read of ‘Open Delegates vs. Closed Delegates’).
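The open/closed distinction can also be seen from the public API. The following is a minimal sketch, assuming CoreCLR behaviour for the Target property; OpenClosedDemo and its method names are invented for this example, and string.ToUpperInvariant is used only because it is a convenient instance method with no overloads.

```csharp
using System;
using System.Reflection;

// Hypothetical demo class, invented for this example.
public static class OpenClosedDemo
{
    // Closed: the receiver is captured when the delegate is created.
    public static Func<string> MakeClosed(string receiver)
    {
        return receiver.ToUpperInvariant;
    }

    // Open: no receiver is captured, the caller passes it as the first argument.
    public static Func<string, string> MakeOpen()
    {
        MethodInfo toUpper = typeof(string).GetMethod("ToUpperInvariant", Type.EmptyTypes);
        return (Func<string, string>)Delegate.CreateDelegate(typeof(Func<string, string>), toUpper);
    }

    public static void Main()
    {
        Func<string> closed = MakeClosed("abc");
        Func<string, string> open = MakeOpen();

        Console.WriteLine(closed());             // ABC
        Console.WriteLine(open("xyz"));          // XYZ
        Console.WriteLine(closed.Target);        // abc
        Console.WriteLine(open.Target == null);  // True
    }
}
```

The closed delegate reports its captured receiver through Target, whereas the open one reports null because the receiver only arrives at call time.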

We start off in the impImportCall() method, deep inside the .NET JIT, which is triggered when a ‘call’ op-code for a delegate is encountered. It then goes through the following functions:

  1. Compiler::impImportCall(..)
  2. Compiler::fgOptimizeDelegateConstructor(..)
  3. COMDelegate::GetDelegateCtor(..)
  4. COMDelegate::SetupShuffleThunk(..)
  5. StubCacheBase::Canonicalize(..)
  6. ShuffleThunkCache::CompileStub(..)
  7. EmitShuffleThunk (specific assembly code for different CPU architectures)

Below is the code from the arm64 version (chosen because it’s the shortest one of the three!). You can see that it emits assembly code to fetch the real target address from MethodPtrAux, loops through the method arguments and puts them in the correct register (i.e. ‘shuffles’ them into place) and finally emits a tail-call jump to the target method associated with the delegate.

VOID StubLinkerCPU::EmitShuffleThunk(ShuffleEntry *pShuffleEntryArray)
{
  // On entry x0 holds the delegate instance. Look up the real target address stored in the MethodPtrAux
  // field and save it in x9. Tailcall to the target method after re-arranging the arguments
  // ldr x9, [x0, #offsetof(DelegateObject, _methodPtrAux)]
  EmitLoadStoreRegImm(eLOAD, IntReg(9), IntReg(0), DelegateObject::GetOffsetOfMethodPtrAux());
  //add x11, x0, DelegateObject::GetOffsetOfMethodPtrAux() - load the indirection cell into x11 used by ResolveWorkerAsmStub
  EmitAddImm(IntReg(11), IntReg(0), DelegateObject::GetOffsetOfMethodPtrAux());

  for (ShuffleEntry* pEntry = pShuffleEntryArray; pEntry->srcofs != ShuffleEntry::SENTINEL; pEntry++)
  {
    if (pEntry->srcofs & ShuffleEntry::REGMASK)
    {
      // If source is present in register then destination must also be a register
      _ASSERTE(pEntry->dstofs & ShuffleEntry::REGMASK);

      EmitMovReg(IntReg(pEntry->dstofs & ShuffleEntry::OFSMASK), IntReg(pEntry->srcofs & ShuffleEntry::OFSMASK));
    }
    else if (pEntry->dstofs & ShuffleEntry::REGMASK)
    {
      // source must be on the stack
      _ASSERTE(!(pEntry->srcofs & ShuffleEntry::REGMASK));

      EmitLoadStoreRegImm(eLOAD, IntReg(pEntry->dstofs & ShuffleEntry::OFSMASK), RegSp, pEntry->srcofs * sizeof(void*));
    }
    else
    {
      // source must be on the stack
      _ASSERTE(!(pEntry->srcofs & ShuffleEntry::REGMASK));

      // dest must be on the stack
      _ASSERTE(!(pEntry->dstofs & ShuffleEntry::REGMASK));

      EmitLoadStoreRegImm(eLOAD, IntReg(8), RegSp, pEntry->srcofs * sizeof(void*));
      EmitLoadStoreRegImm(eSTORE, IntReg(8), RegSp, pEntry->dstofs * sizeof(void*));
    }
  }

  // Tailcall to target
  // br x9
  EmitJumpRegister(IntReg(9));
}
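To make the loop easier to follow, here is a managed-code model of the register-to-register case only. Everything in it (ShuffleModel, Entry, the array standing in for the register file) is hypothetical and invented for illustration; the real thunk is emitted as machine code and the real ShuffleEntry encoding is different. It assumes the simple layout where the delegate arrives in register 0 and each argument slides down by one register.

```csharp
using System;

// Hypothetical model, invented for this example. Not the runtime's data structures.
public static class ShuffleModel
{
    // Stand-in for a shuffle entry: copy register 'Src' into register 'Dst'.
    public struct Entry
    {
        public readonly int Src;
        public readonly int Dst;
        public Entry(int src, int dst) { Src = src; Dst = dst; }
    }

    // Build the moves that overwrite the delegate in register 0 and slide
    // each of the 'argCount' arguments down by one register.
    public static Entry[] BuildEntries(int argCount)
    {
        var entries = new Entry[argCount];
        for (int i = 0; i < argCount; i++)
        {
            entries[i] = new Entry(i + 1, i);
        }
        return entries;
    }

    // Apply the moves in order, as the emitted 'mov' instructions would run.
    // Ascending order means a register is always read before it is overwritten.
    public static void Apply(long[] registers, Entry[] entries)
    {
        foreach (Entry e in entries)
        {
            registers[e.Dst] = registers[e.Src];
        }
    }

    public static void Main()
    {
        // Register 0 holds the delegate (modelled as -1), registers 1 and 2 hold the arguments.
        long[] registers = { -1, 10, 20, 0 };
        Apply(registers, BuildEntries(2));
        Console.WriteLine(string.Join(", ", registers)); // 10, 20, 20, 0
    }
}
```

After the moves, the target method sees its arguments in registers 0 and 1; the stale copy left in register 2 is simply ignored.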

Other functions that call SetupShuffleThunk(..)

Other places in the code also emit these ‘shuffle thunks’. They are used in the various scenarios where a delegate is explicitly created, e.g. via Delegate.CreateDelegate(..).


Different types of delegates

Now that we’ve looked at how one type of delegate works (#2 ‘Instance open non-virt’ in the table below), it will be helpful to see the other different types that the runtime deals with. From the very informative DELEGATE KINDS TABLE in the CLR source:

| # | delegate type | _target | _methodPtr | _methodPtrAux |
|---|---|---|---|---|
| 1 | Instance closed | ‘this’ ptr | target method | null |
| 2 | Instance open non-virt | delegate | shuffle thunk | target method |
| 3 | Instance open virtual | delegate | Virtual-stub dispatch | method id |
| 4 | Static closed | first arg | target method | null |
| 5 | Static closed (special sig) | delegate | specialSig thunk | target method |
| 6 | Static opened | delegate | shuffle thunk | target method |
| 7 | Secure | delegate | call thunk | MethodDesc (frame) |

Note: The columns map to the internal fields of a delegate (from System.Delegate)

So we’ve (deliberately) looked at the simple case, but the more complex scenarios all work along similar lines, just using different and more stubs/thunks as needed e.g. ‘virtual-stub dispatch’ or ‘call thunk’.
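The two static rows of the table can be reproduced from C# as well. This is a rough sketch, again assuming CoreCLR behaviour for the Target property; StaticDelegateDemo and its methods are invented for this example.

```csharp
using System;
using System.Reflection;

// Hypothetical demo class, invented for this example.
public static class StaticDelegateDemo
{
    public static int Length(string s)
    {
        return s.Length;
    }

    // Static open: every argument is supplied by the caller.
    public static Func<string, int> MakeOpen()
    {
        return Length;
    }

    // Static closed: the first argument is bound when the delegate is created.
    public static Func<int> MakeClosed(string firstArg)
    {
        MethodInfo length = typeof(StaticDelegateDemo).GetMethod("Length");
        return (Func<int>)Delegate.CreateDelegate(typeof(Func<int>), firstArg, length);
    }

    public static void Main()
    {
        Func<string, int> open = MakeOpen();
        Func<int> closed = MakeClosed("hello");

        Console.WriteLine(open("abc"));          // 3
        Console.WriteLine(closed());             // 5
        Console.WriteLine(open.Target == null);  // True
        Console.WriteLine(closed.Target);        // hello
    }
}
```

The closed delegate stores the bound first argument where an instance delegate would store ‘this’, which is why Target returns it.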


Delegates are special!!

As well as being responsible for creating delegates, the runtime also treats delegates specially, to enforce security and/or type-safety. You can see how this is implemented in the following places in the runtime source:

In MethodTableBuilder.cpp:

In ClassCompat.cpp:



