Instance Methods Called on null References

12 Jul 2010

In a previous post, I wrote about how you can call extension methods on null references: they are in fact static methods with one extra parameter, the extended object itself.
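As a quick reminder of that behavior, here is a minimal sketch. The names StringExtensions, IsMissing and ExtensionDemo are made up for illustration and do not come from the earlier post.

```csharp
using System;

public static class StringExtensions
{
    // Hypothetical extension method: 'value' is an ordinary parameter
    // of a static method, so it is allowed to be null.
    public static bool IsMissing(this string value)
    {
        return value == null || value.Length == 0;
    }
}

public static class ExtensionDemo
{
    public static bool Run()
    {
        string s = null;
        // The compiler turns this into StringExtensions.IsMissing(s),
        // a plain static call, so nothing is dereferenced here.
        return s.IsMissing();
    }
}
```

Calling ExtensionDemo.Run() returns true instead of throwing a NullReferenceException.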

I’m currently reading CLR via C# (which is a fascinating read), and I was surprised to learn in chapter 6 how the CIL instructions call and callvirt actually work.

What is amazing is that for methods called with the call instruction, the CLR does not check if the referenced object is null. The method call will succeed, but the this reference will be null in the instance method. Actually, in both cases, the reference to the object that the method was called on is passed as a hidden parameter to the method.

Before examining this, another interesting fact is that the C# compiler emits callvirt for almost every instance method call, even for non-virtual methods, and callvirt checks whether the reference is null. To test the call instruction easily, we will have to disassemble, modify, then reassemble the following code:

using System;

public class SomeClass
{
    public String GetHello()
    {
        if (this == null)
        {
            return "Amazing!";
        }

        return "Hello";
    }
}

class Program
{
    static void Main(string[] args)
    {
        var o = null as SomeClass;
        var hello = o.GetHello();

        Console.WriteLine(hello);
    }
}

Pretty dumb, right? Especially the if statement where we check whether this is null. It seems logical to most of us that calling GetHello on a null reference will throw a NullReferenceException before that check is ever reached. However, the point is just to get the compiler to build code that is very close to what we want to achieve, so we don’t have to write the IL ourselves.

After running ILDasm.exe on the assembly, this is what we have in the Main method:

  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       18 (0x12)
    .maxstack  1
    .locals init ([0] class Sandbox09.SomeClass o,
             [1] string hello)
    IL_0000:  nop
    IL_0001:  ldnull
    IL_0002:  stloc.0
    IL_0003:  ldloc.0
    IL_0004:  callvirt   instance string Sandbox09.SomeClass::GetHello()
    IL_0009:  stloc.1
    IL_000a:  ldloc.1
    IL_000b:  call       void [mscorlib]System.Console::WriteLine(string)
    IL_0010:  nop
    IL_0011:  ret
  } // end of method Program::Main

As we can see, the call to GetHello is done with the callvirt instruction. As this instruction checks if the object is null (and in this case, it is), this will fail at runtime.

Just to make sure, I used ILasm.exe to build the assembly and ran it. Here is what it outputs:

Unhandled Exception: System.NullReferenceException: Object reference not set to an instance of an object.
   at Pvle.Program.Main(String[] args)

Now, let’s replace callvirt with call to see how it behaves.

    IL_0004:  call       instance string Sandbox09.SomeClass::GetHello()

Now, run it through ILasm.exe once more and execute the result. Here’s what it outputs:

Amazing!

The actual difference between call and callvirt is that call binds to the method of the object’s compile-time type, so no lookup on the object is needed and the CLR does not check whether the reference is null. The object is passed as a hidden parameter to the method and is referenced as this. It’s very similar to extension methods.

Callvirt, on the other hand, resolves the method to call at runtime, depending on the runtime type of the object, so the object cannot be null. The CLR enforces this check at runtime, even when the target method is not virtual, which is why the compiler-generated callvirt in our example throws.
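If you would rather not round-trip through ILDasm and ILasm, the same comparison can be sketched with System.Reflection.Emit. This is only an illustration under assumptions: Greeter, CallDemo and Build are hypothetical names, and the dynamic method simply pushes null and invokes GetHello with whichever opcode you pass in.

```csharp
using System;
using System.Reflection;
using System.Reflection.Emit;

public class Greeter
{
    public string GetHello()
    {
        // Only reachable with a null 'this' when invoked through 'call'.
        return this == null ? "Amazing!" : "Hello";
    }
}

public static class CallDemo
{
    // Builds a delegate whose body is: ldnull; <opcode> Greeter::GetHello; ret
    // Pass OpCodes.Call or OpCodes.Callvirt to compare the two instructions.
    public static Func<string> Build(OpCode callOpCode)
    {
        MethodInfo target = typeof(Greeter).GetMethod("GetHello");
        var method = new DynamicMethod(
            "InvokeOnNull", typeof(string), Type.EmptyTypes, typeof(CallDemo).Module);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldnull);        // the null reference used as 'this'
        il.Emit(callOpCode, target);    // call or callvirt
        il.Emit(OpCodes.Ret);

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }
}
```

Invoking CallDemo.Build(OpCodes.Callvirt)() throws a NullReferenceException, while CallDemo.Build(OpCodes.Call)() skips the null check, so GetHello runs with this set to null, as in the ILasm experiment above.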

What About Value Types?

For value types, it’s a bit different. As they are implicitly sealed, the only virtual methods they can have are the ones inherited from System.Object. Oh wait, there is another case: if the value type is cast to an interface it implements, calls to methods on that variable will use callvirt, as the value type has to be boxed.

Here is some sample code that demonstrates this:

using System;

public interface ISomeInterface
{
    String GetHelloFromInterface();
}

public struct SomeClass : ISomeInterface
{
    public String GetHello()
    {
        return "Hello";
    }

    public override string ToString()
    {
        return "Hello";
    }

    public String GetHelloFromInterface()
    {
        return "Hello from interface";
    }
}

class Program
{
    static void Main(string[] args)
    {
        var o = new SomeClass();
        var hello = o.GetHello();
        
        o.ToString();
        o.GetHelloFromInterface();
        ((ISomeInterface)o).GetHelloFromInterface();
        o.GetHashCode();
    }
}

And here is the corresponding IL for the Main method:

  .method private hidebysig static void  Main(string[] args) cil managed
  {
    .entrypoint
    // Code size       66 (0x42)
    .maxstack  1
    .locals init ([0] valuetype Pvle.SomeClass o,
             [1] string hello)
    IL_0000:  nop
    IL_0001:  ldloca.s   o
    IL_0003:  initobj    Pvle.SomeClass
    IL_0009:  ldloca.s   o
    IL_000b:  call       instance string Pvle.SomeClass::GetHello()
    IL_0010:  stloc.1
    IL_0011:  ldloca.s   o
    IL_0013:  constrained. Pvle.SomeClass
    IL_0019:  callvirt   instance string [mscorlib]System.Object::ToString()
    IL_001e:  pop
    IL_001f:  ldloca.s   o
    IL_0021:  call       instance string Pvle.SomeClass::GetHelloFromInterface()
    IL_0026:  pop
    IL_0027:  ldloc.0
    IL_0028:  box        Pvle.SomeClass
    IL_002d:  callvirt   instance string Pvle.ISomeInterface::GetHelloFromInterface()
    IL_0032:  pop
    IL_0033:  ldloca.s   o
    IL_0035:  constrained. Pvle.SomeClass
    IL_003b:  callvirt   instance int32 [mscorlib]System.Object::GetHashCode()
    IL_0040:  pop
    IL_0041:  ret
  } // end of method Program::Main

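We can see that when calling the method through the interface, the value type is boxed. The ToString and GetHashCode calls use callvirt with the constrained. prefix instead, which lets the CLR call an override defined on the struct directly (ToString here) and box only when the method is not overridden (GetHashCode here).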
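Boxing is not just an implementation detail; it is observable from C#. The sketch below uses hypothetical names (ICounter, Counter, BoxingDemo) to show that a call made through the interface operates on a boxed copy, not on the original struct.

```csharp
using System;

public interface ICounter
{
    void Increment();
    int Count { get; }
}

public struct Counter : ICounter
{
    private int count;

    public int Count { get { return count; } }

    public void Increment() { count++; }
}

public static class BoxingDemo
{
    public static int[] Run()
    {
        var c = new Counter();
        c.Increment();               // direct call on the struct: c.Count is now 1

        ((ICounter)c).Increment();   // boxes a copy and increments the copy; c is untouched

        ICounter boxed = c;          // box once and keep the reference (copy starts at 1)
        boxed.Increment();
        boxed.Increment();           // both calls mutate the same boxed copy

        return new[] { c.Count, boxed.Count };
    }
}
```

BoxingDemo.Run() returns { 1, 3 }: the struct itself was only incremented once, while the boxed copy kept its own state.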

I find this very interesting for understanding how method calls actually work. Getting your nose in IL is always a good idea when you want to see what is happening under the hood, but I have to admit that this is the first time I have modified it and reassembled it.