Overview of C# 4.0

November 13th, 2008

Note: This article is also posted at The Code Project, where some quite interesting discussions are going on.

The .NET Framework 4.0 CTP has just been released, and I think it’s a good time to explore the new features of C# 4.0. In this post, I will introduce the following features: dynamic lookup, generic covariance and contravariance, and optional and named parameters.

1. Dynamic Lookup
If var in C# 3.0 brought type inference for local variables, mainly to save a few keystrokes, dynamic lookup goes much further in making C# a more dynamic language. When you declare a variable as dynamic, all of its method invocations and member accesses are resolved at runtime. For example, let’s look at the following code:

public static void Main() {
    dynamic obj = "I'm statically System.String";
    obj.NotExistingMethod("param");
}

In the code, we create a string object and, instead of assigning it to a variable of type string, we assign it to a variable declared as dynamic. This instructs the compiler not to attempt to resolve any method call or member access on that variable. Instead, all of these resolutions happen at runtime. As a result, the next line, in which we invoke a non-existing method with an arbitrary parameter, compiles just fine into CIL. At runtime, when the method call is resolved and the runtime finds that there’s no such method in the runtime type (which is System.String), an exception is thrown. If we use a valid method instead, there is no exception and the code runs to completion. For example:

    dynamic obj = "I'm statically System.String";
    Console.WriteLine(obj.ToLower());
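
To make the failure case concrete, here is a minimal sketch that catches the runtime binding error instead of letting it crash the program. It assumes the released version of C# 4.0, where the exception type is Microsoft.CSharp.RuntimeBinder.RuntimeBinderException (the CTP may behave differently); the class and method names are hypothetical.

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

public static class DynamicLookupDemo
{
    // Reports what happened when the call was bound at runtime.
    public static string TryCall(dynamic target)
    {
        try
        {
            // Compiles regardless of the target's type; binding happens at runtime.
            target.NotExistingMethod("param");
            return "call succeeded";
        }
        catch (RuntimeBinderException ex)
        {
            // Assumption: this is the exception type used by the released C# 4.0 binder.
            return "binding failed: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryCall("I'm statically System.String"));

        // A member that does exist on the runtime type binds and runs normally.
        dynamic text = "MIXED Case";
        Console.WriteLine(text.ToLower());
    }
}
```

The point of the sketch is that the mistake surfaces only when the line actually executes, so code paths using dynamic deserve the kind of test coverage you would give code in a dynamic language.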

Not only can we declare a variable as dynamic, we can also declare method parameters that way. For example:

    private static void PrintID(dynamic obj) {
        Console.WriteLine(obj.ID);
    }

    public static void Main() {
        var person = new {ID = 111, Name = "Buu"};
        PrintID(person);

        var account = new {ID = 101, Bank = "Some Bank"};
        PrintID(account);
    }

We create two anonymous types, both having a property called ID for the sake of convenience, and instantiate an object of each. We then pass each object to the method PrintID, which accepts a dynamic object and prints its ID property. The code prints “111” and “101” respectively. And yes, we’ve just seen duck typing in action. In C#.

That example also shows an interesting use of dynamic with anonymous types. We can now pass objects of anonymous types out of their declaration scope (e.g. the method) and still invoke their methods or access their members without resorting to verbose reflection code. (In fact, whenever you see a piece of reflection code in your application, regardless of its purpose, there might be an opportunity to replace it with dynamic lookup to make the code more readable.)

We are not limited to making statically typed .NET objects dynamic; we can also use dynamic lookup to conveniently interact with “actual” dynamic objects available through the Dynamic Language Runtime (DLR) included in .NET Framework 4.0. In fact, one can implement such dynamic objects in C# by implementing the System.Scripting.Actions.IDynamicObject interface, which is also part of the DLR. Regardless of the actual receiver of the dynamic dispatch, the usage of dynamic lookup is exactly the same from the caller’s perspective.
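
As a rough illustration of a C#-implemented dynamic object, the sketch below stores members in a dictionary and resolves them by name at runtime. Note the assumption: instead of the CTP interface named above, it derives from System.Dynamic.DynamicObject, a helper base class that ships with the released .NET Framework 4.0. PropertyBag and PropertyBagDemo are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical dynamic object: members live in a dictionary, not in the type.
// Assumption: built on DynamicObject from the released .NET 4.0, not the CTP API.
public class PropertyBag : DynamicObject
{
    private readonly Dictionary<string, object> values =
        new Dictionary<string, object>();

    // Called for "bag.Name = value".
    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        values[binder.Name] = value;
        return true;
    }

    // Called for "bag.Name"; returning false makes the binder report a failure.
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return values.TryGetValue(binder.Name, out result);
    }
}

public static class PropertyBagDemo
{
    // Sets a member that was never declared, then reads it back.
    public static object RoundTrip(object id)
    {
        dynamic bag = new PropertyBag();
        bag.ID = id;
        return bag.ID;
    }

    public static void Main()
    {
        Console.WriteLine(RoundTrip(111));
    }
}
```

From the caller’s side, bag.ID looks exactly like the member access on the anonymous types earlier, which is the uniformity the paragraph above describes.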

Some of you might wonder what code the compiler generates and whether there is any change to the CLR to support dynamic lookup. To answer that question, let’s look at the CIL generated from the very first code fragment of this post.

The generated CIL is pretty verbose, but a skim through it reveals two important things. Firstly, our “dynamic variable” turns out to be a plain old CLR System.Object instance. Secondly, there’s no new CIL directive or opcode to support dynamic lookup. Instead, dynamic resolution and invocation are handled entirely by framework code. In fact, the CIL is equivalent to the following C# code, written without the dynamic keyword.

    object obj = "I'm statically System.String";
    var payload = new CSharpCallPayload(
        RuntimeBinder.GetInstance(),
        false, false, "NotExistingMethod",
        typeof(object), null);
    var callSite = CallSite<Action<CallSite, object, string>>.Create(payload);
    Action<CallSite, object, string> action = callSite.Target;
    action.Invoke(callSite, obj, "param");

(I simplified the code a bit so that it fits in one method. The actual CIL has a check at IL_000c which looks at the callSite static field of a nested class, generated automatically by the compiler, to see whether it’s null before going ahead and initializing it. In other words, the callSite is cached for subsequent invocations of the same method.)

So there’s really no magic behind the dynamic keyword. What the compiler does is generate a payload containing the information about the invocation so that the call can be made at runtime. If there’s any magic at all, it is in the static method CallSite.Create(), which uses reflection to invoke NotExistingMethod on the given object if that object is not an instance of IDynamicObject.

Now that we know a “dynamic object” is actually a plain old object, it should be clear why a method can accept dynamic parameters. It should also be no surprise that dynamic lookup can be applied to the return value of an instance or static method, and to an instance or static field. After all, unlike the var keyword, which requires the compiler to infer the exact type at compile time, the compiler in dynamic lookup scenarios can simply pick System.Object as the type.


2. Generics Covariance and Contravariance
In previous versions of C#, generic types are invariant. That is, for any two types GenericType<T> and GenericType<K> in which T is a subclass of K or K is a subclass of T, GenericType<T> and GenericType<K> have no inheritance relationship whatsoever to each other. By contrast, if T is a subclass of K, then under covariance GenericType<T> would also be treated as a subclass of GenericType<K>, and under contravariance GenericType<T> would be treated as a superclass of GenericType<K>.

To understand why pre-4.0 C# disallows covariance and contravariance, let’s look at some code:

    var strList = new List<string>();
    List<object> objList = strList; // compile-error, with cast or not

If the code in line 2 were allowed, it would be error-prone. Consider what we could write after line 2:

    // BOOM: we're adding an arbitrary AnyObject to what, at runtime, is a list of strings
    objList.Add(new AnyObject());

On the other hand, if C# supported contravariance, we could have written the following problematic code:

    var objList = new List<object>();
    objList.Add(3);
    objList.Add(new AnyObject());
    List<string> strList = objList;

    // BOOM: we're getting an arbitrary object thinking it's a string
    string element = strList[0];

Because of this invariance restriction, well intentioned as it is, we can’t easily reuse variables and methods across different generic types. That is somewhat unfortunate, because the key thing to realize is that covariance is fine as long as GenericType<T> does not have any method or member accepting arguments of type T (e.g. if we can’t add a bunch of objects into objList, which is really an instance of List<string>, we’ll be safe). Likewise, contravariance is just as safe if T does not appear in any return value from a member or method (e.g. if we can’t get any string out of strList, which is really an instance of List<object>, we’ll be safe).

Fortunately, C# 4.0 gives us an option: if a generic interface or generic delegate has a reference type T as its type parameter and does not have any method or member that takes a parameter of type T, we can declare it to be covariant on T. On the other hand, if that interface or delegate does not have any method or member that returns T, we can declare it to be contravariant on T. (Note the restrictions: only interfaces and delegates have covariance and contravariance support, and the type parameter must be a reference type. C# arrays, on the other hand, have supported covariance from the very beginning.)
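
Array covariance makes the write-side danger easy to observe, because arrays allow the conversion and defer the type check to runtime. The following is a small sketch of that behaviour; the class and method names are hypothetical.

```csharp
using System;

public static class ArrayCovarianceDemo
{
    // Returns true if storing a non-string into the aliased array is rejected.
    public static bool StoreFails()
    {
        string[] strings = { "a", "b" };

        // Allowed at compile time: arrays of reference types are covariant.
        object[] objects = strings;

        try
        {
            // The runtime checks every store against the array's real element type.
            objects[0] = 42;
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreFails());
    }
}
```

This is the same “BOOM” scenario as the List<string> example, except that arrays pay for the flexibility with a runtime check on each store. The in/out rules for generic interfaces and delegates rule out the unsafe operations at compile time instead.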

Let’s look at an example. We have a Generator delegate which basically returns random instances of object of certain type (e.g. string). Its declaration is as follows:

    delegate T Generator<out T>();

Because this delegate does not take any T as an argument, we can safely make it covariant on T. And indeed, the compiler allows us to do so by adding the modifier out before the type parameter. However, if this delegate were declared as, say, delegate T Generator<out T>(T seed), the compiler would complain since it’s no longer safe for covariance. Now, let’s look at the usage:

    Generator<string> strGen = new Generator<string>(StringGenerator);

    // The line below now compiles because Generator<string> is
    // convertible to Generator<object> under the covariance rule
    Generator<object> objGen = strGen;

    // The downcast is also allowed for the same reason
    strGen = (Generator<string>)objGen;

    ...

    private string StringGenerator()
    {
        return "I'm a random string";
    }

With contravariance, you need to use the in keyword. Let’s look at an example making use of both contravariance and covariance:

    delegate K Converter<in T, out K>(T param);

This converter takes an object of type T and converts it to an object of type K (e.g. converts a String to an Object). Since it does not take in any K, it can safely be declared covariant on K. Similarly, since it exposes no T, it can safely be declared contravariant on T. Its usage is as follows:

    var converter = new Converter<object, string>(ConvertImpl);
    Converter<string, object> string2ObjectConverter = converter;
    object result = string2ObjectConverter("A");

    ...

    private string ConvertImpl(object o) {
        return o.ToString();
    }

Note that although the above examples show how to declare covariance and contravariance for delegate types, doing so with interfaces is no different.
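
For completeness, here is a minimal sketch of the same idea applied to interfaces. All type names (IProducer, IConsumer, StringProducer, ObjectCollector) are hypothetical and exist only for illustration.

```csharp
using System;
using System.Collections.Generic;

// Covariant: T appears only in output positions.
public interface IProducer<out T> { T Produce(); }

// Contravariant: T appears only in input positions.
public interface IConsumer<in T> { void Consume(T item); }

public class StringProducer : IProducer<string>
{
    public string Produce() { return "I'm a random string"; }
}

public class ObjectCollector : IConsumer<object>
{
    public readonly List<string> Log = new List<string>();
    public void Consume(object item) { Log.Add(item.ToString()); }
}

public static class InterfaceVarianceDemo
{
    public static string Run()
    {
        // Covariance: a producer of strings is usable as a producer of objects.
        IProducer<object> producer = new StringProducer();

        // Contravariance: a consumer of objects is usable as a consumer of strings.
        var collector = new ObjectCollector();
        IConsumer<string> consumer = collector;

        consumer.Consume((string)producer.Produce());
        return collector.Log[0];
    }

    public static void Main()
    {
        Console.WriteLine(Run());
    }
}
```

If IProducer declared a method taking a T, or IConsumer one returning a T, the compiler would reject the out or in modifier, exactly as it does for delegates.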

Before we finish with covariance and contravariance, let’s look at what code the compiler generates. After all, we know that the compiler must encode something in the CIL to mark covariant and contravariant generic types so that they can be consumed properly by client code. This is how the definitions of our Generator and Converter delegates look when viewed in ildasm.

Do you notice anything? Those little minus and plus signs denote contravariance and covariance respectively. Interestingly enough, it turns out that the CLR has supported such CIL since the introduction of generics in .NET 2.0. Therefore, it’s possible to write CIL using covariance and contravariance under .NET 2.0+. Only now is it possible to do so using C#.

A final thought about this feature: while it is a good enhancement to the language, I like Java’s implementation of generic covariance and contravariance, enabled via wildcards, better for its flexibility. Anyway, let’s just be happy with it for now. We can’t have everything.


3. Optional and Named Parameters
The last two features of C# 4.0 that we’ll discuss are optional parameters and named parameters. These features have been in VB.NET since forever, and I’m glad they are finally implemented in C#.

With optional parameters, we can provide default values for methods’ and constructors’ parameters. That way, we don’t have to write a bunch of overloaded methods and constructors. For example, we can define a constructor like this:

    public Cart(int id, string name = "default cart", double amount = 0d) {…}

Now, you can invoke this constructor with any of these calls:

    new Cart(1);
    new Cart(1, "my cart");
    new Cart(1, "my cart", 105.5);

How exactly does C# implement this feature? If we look at the CIL generated for this code, we’ll see there’s no magic at all. The default values are injected at the call site so that the method invocation happens normally. Here’s the CIL generated by the compiler:

Some might think that the compiler reads the default values from the source code and injects them into the caller’s code. However, that would not work if a method with optional parameters is published as a library. In that case, the compiler has no source from which to get the default values when compiling the library’s client. Therefore, what the compiler actually does is encode the default values right into the method itself. This is the CIL for Cart’s constructor:

Notice the .param directive which basically tells the compiler about the default values for the optional parameters. These parameters are also annotated by the [opt] attribute.
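
Because the defaults are stored in the method’s metadata, they can also be read back through reflection. The sketch below assumes a Cart constructor with the signature used in this post (the empty body is a stand-in for the elided one) and uses ParameterInfo.IsOptional and ParameterInfo.DefaultValue; DefaultValueDemo is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public class Cart
{
    // Same signature as in the post; the body is irrelevant for this sketch.
    public Cart(int id, string name = "default cart", double amount = 0d) { }
}

public static class DefaultValueDemo
{
    // Lists each constructor parameter, with its default value if it has one.
    public static string Describe()
    {
        ConstructorInfo ctor = typeof(Cart).GetConstructors()[0];
        var parts = new List<string>();

        foreach (ParameterInfo p in ctor.GetParameters())
        {
            // IsOptional mirrors the [opt] flag; DefaultValue mirrors the .param data.
            parts.Add(p.IsOptional ? p.Name + "=" + p.DefaultValue : p.Name);
        }

        return string.Join(", ", parts);
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```

This is the same metadata a compiler consults when it compiles a call that omits an optional argument against a referenced library.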

While this is a great feature, there’s a caveat when using optional parameters: since the compiler inlines the default values at the call site, changes to default values in the library won’t be reflected unless the caller is recompiled. In other words, you should consider the default values of parameters part of the published API of a method, and you’d better get them right the first time.

Now, let’s say we want to call the Cart’s constructor with the ID and amount specified but not the name, how would we do that?

To avoid ambiguity (for example when two optional parameters are both strings), C# won’t allow us to simply skip a parameter like this:

    new Cart(1, 15.5d); // compile-error

An approach C# could take is to allow this syntax:

    new Cart(1, , 15.5d);

However, this looks terrible enough with just one missing parameter, let alone several. (How would you like to read code such as MethodWithManyFields(,,,15,,"param",,)?)

For this particular situation, named parameters are an excellent solution:

   new Cart(1, amount: 15.5d);
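
Put together, a minimal sketch looks like the following. The constructor signature is the one from this post; the properties and constructor body are assumptions added so that the result can be inspected.

```csharp
using System;

public class Cart
{
    // Assumed members; the post elides the class body.
    public int Id { get; private set; }
    public string Name { get; private set; }
    public double Amount { get; private set; }

    public Cart(int id, string name = "default cart", double amount = 0d)
    {
        Id = id;
        Name = name;
        Amount = amount;
    }
}

public static class NamedArgumentDemo
{
    public static void Main()
    {
        // "name" is skipped and takes its default; "amount" is supplied by name.
        var cart = new Cart(1, amount: 15.5d);

        Console.WriteLine(cart.Name);
        Console.WriteLine(cart.Amount);
    }
}
```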

This is not the only use of named parameters though. A very important purpose of named parameters is to enhance the readability of code. Let’s say you have a class with a bunch of fields that are to be initialized in a constructor. Without named parameters, a call to such a constructor looks very ugly and is hard to understand without resorting to the documentation, source code, or IntelliSense.

One approach is to create a builder that creates instances of such a class. For example:

    new BigClass.Builder().attr1("value").attr2("value")...attrN("value");

This will make the code more explicit about what values are to be assigned to what fields. However, the drawback is that we need to implement a builder class, which is quite a tedious task if we have to repeat that for many classes in our application.

With C# 3.0, it’s not that bad because we can use an object initializer to do something like this:

    new BigClass {Attr1 = "value", Attr2 = "value", …, AttrN = "value"}

This looks good, but we have to define all the necessary properties for it to work, which is also a tedious thing to do if we don’t really need properties for the class or if the class itself is immutable.

With named parameters, we have a very nice solution without having to write any builder or property if we don’t want to. (Note the colon after the parameter name.)

    new BigClass(attr1: "value", attr2: "value", …, attrN: "value");

    // The below does the same thing
    new BigClass(attrN: "value", attr1: "value", …, attr2: "value");

Regarding the implementation of this at the CIL level, the compiler is smart enough to infer the correct argument order and simply perform a plain old method invocation.
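
One detail worth knowing about that reordering: the argument expressions are still evaluated in the order they are written at the call site, and only then matched to their parameters. The sketch below records the evaluation order to show this; every name in it is hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class EvaluationOrderDemo
{
    // Records the order in which argument expressions are evaluated.
    public static readonly List<string> Trace = new List<string>();

    private static int Note(string label, int value)
    {
        Trace.Add(label);
        return value;
    }

    public static int Subtract(int left, int right)
    {
        return left - right;
    }

    public static int Run()
    {
        Trace.Clear();

        // Arguments are written in the reverse order of the parameters.
        return Subtract(right: Note("right", 1), left: Note("left", 10));
    }

    public static void Main()
    {
        Console.WriteLine(Run());
        Console.WriteLine(string.Join(", ", Trace));
    }
}
```

So each value reaches the right parameter regardless of position, while any side effects in the argument expressions happen in source order.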

That’s it: the key features of C# 4.0. I am personally glad that C# has come to support these. Some people say that C# has become too complex and has started losing its original beauty. While I can understand that view, I think the situation is not that bad. While it’s obvious that C# has more and more constructs to support functional and dynamic programming, the statically typed nature and the old constructs of the language are still there, and no developer is forced to use the new features if they don’t need to. On the other hand, these features bring more options for those who need them, and I’d rather have more options than be handcuffed.

  1. Nguyen Thoai
    November 14th, 2008 at 07:07 | #1

    Thank you, it’s a very nice review.

  2. Duc Nguyen
    November 14th, 2008 at 10:23 | #2

    Very cool article. Thanks, Buu, for sharing.

  3. Jack Harris
    November 24th, 2008 at 01:12 | #3

    Where can I download the C# 4.0 binaries?

  4. November 24th, 2008 at 09:35 | #4

    @Jack: MS bundles everything (.NET 4.0, VS.NET 2010) in a MS Virtual PC image which can be downloaded at this URL: http://www.microsoft.com/downloads/details.aspx?FamilyId=922B4655-93D0-4476-BDA4-94CF5F8D4814&displaylang=en.

  5. December 10th, 2008 at 12:33 | #5

    The dynamic lookup feature makes C# like JavaScript: we can create an anonymous type and its methods (maybe only properties, I haven’t checked yet ^^) and then use it somewhere else. I can see the first advantage: using anonymous types as entities for binding data to GUI controls.

    Thanks for your research, I got the main points of C# 4.0 :D

  6. December 11th, 2008 at 00:41 | #6

    Very good article.

  7. December 11th, 2008 at 19:00 | #7

    @Duy: In fact, it makes C# look like all dynamic languages, not just JavaScript :-) . I don’t think you can have methods in anonymous types though (assuming we’re still talking about C# here). But yeah, dynamic keyword makes it much easier to use anonymous types outside their declaration scope.

    @Michael: I’m glad you like it.
