.NET Collections and Generics in C#

This post is based on Chapter 10 of Andrew Troelsen’s “Pro C# 2008 and the .Net Platform” available from Apress.

Right – yet another wondrous adventure with Mr. Troelsen awaits us, this time he will explain just what collections and generics can do for us. As always, this is a brief outline, if you really want to know, and be a better developer – we highly recommend getting hold of Mr. Troelsen’s book! It is quite simply very good, bordering on superb.

Collections

So collections are tools for containing objects. An array is the most simple container, but as you will know this has some serious limitations for adding, removing and resizing. So, to the rescue comes the Systems.Collections namespace. Troelsen gives us an outline of the interfaces in here:

System.Collections Interface Meaning in Life
ICollection Defines general characteristics (e.g., size, enumeration, thread safety) for all nongeneric collection types.
IComparer Allows two objects to be compared.
IDictionary Allows a nongeneric collection object to represent its contents using name/value pairs.
IDictionaryEnumerator Enumerates the contents of a type supporting IDictionary.
IEnumerable Returns the IEnumerator interface for a given object.
IEnumerator Enables foreach style iteration of subtypes.
IHashCodeProvider Returns the hash code for the implementing type using a customized hash algorithm.
IList Provides behavior to add, remove, and index items in a list of objects. Also, this interface defines members to determine whether the implementing collection type is read-only and/or a fixed-size container.

Also, we should know that these are organised as follows:

ScreenShot004

So, what we can see is that the ICollection is the most basic interface, and also we have a lot of enumeration going on (see our previous post on this).
Let’s look at all this in some more detail:

ICollection is described in the table above. The IDictionary uses name/value pairs of object type to organise objects, so you could for example store objects of any type (the value) and retrieve these by an ID or string Key – or indeed another object. The IDictionaryIterator simply returns a strongly typed iterator with a Entry property of type DictionaryEntry for each value that is retrieved (so you get stuff like Key/Value – both of type object.
The IList gives you add/remove and index functionality.

So what classes implement these indexes? Troelsen gives us this table:

System.Collections Class Meaning in Life Key Implemented Interfaces
ArrayList Represents a dynamically sized array of objects. IListICollectionIEnumerable, and ICloneable
Hashtable Represents a collection of objects identified by a numerical key. Custom types stored in a Hashtable should always override System. Object.GetHashCode(). IDictionaryICollectionIEnumerable, andICloneable
Queue Represents a standard first-in, first-out (FIFO) queue. ICollectionICloneable, and IEnumerable
SortedList Like a dictionary; however, the elements can also be accessed by ordinal position (e.g., index). IDictionaryICollectionIEnumerable, andICloneable
Stack A last-in, first-out (LIFO) queue providing push and pop (and peek) functionality. ICollectionICloneable, and IEnumerable

So Troelsen wisely recommends that we focus on the interfaces that are implemented here to gain an understanding of what each class brings to the table. Also, he quickly outlines three core classes, ArrayList, Queue and Stack – and leaves the rest to the 3.5 documentation.

You might have noticed that ArrayList is the only in the above table that implements IList. So we’ve kind of already covered this.

The Queue type is a bit more interesting, this basically works on a first-in-first-out basis, much like shopping in your local corner store. So you got Enqueue() and  Dequeue() methods, both handling objects. You can call Peek() to access the next object without popping this off the queue. Note though that you cannot use this to null check – use Count > 0 in stead.

And finally the Stack class is of course a last-in-first-out container, much like the company you work in. If you’re still working that is of course, if not, you probably did not enjoy that attempt at humour.  So this works much like the Queue class, except we now have Push() and Pop() methods. Count and Peek() still works the same way.

Finally note that we have a load of additional collections in the System.Collections.Specialized namespace. A quick overview from Mr. Troelsens most excellent book:

Member of System.Collections.Specialized Meaning in Life
BitVector32 A simple structure that stores Boolean values and small integers in 32 bits of memory.
CollectionsUtil Creates collections that ignore the case in strings.
HybridDictionary Implements IDictionary by using a ListDictionary while the collection is small, and then switching to a Hashtable when the collection gets large.
ListDictionary Implements IDictionary using a singly linked list. Recommended for collections that typically contain ten items or fewer.
NameValueCollection Represents a sorted collection of associated String keys and String values that can be accessed either with the key or with the index.
StringCollection Represents a collection of strings.
StringDictionary Implements a hashtable with the key strongly typed to be a string rather than an object.
StringEnumerator Supports a simple iteration over a StringCollection.

The Trouble With Boxing and Typed Collections

So, next up we’ll be moving onto Generics, but to understand the point of all this, first we must understand the problem of not using generics – that is, the problem of using the System.Object base class to treat all object in a consistent way (because although it sounds good, it’s really not).

So, what is boxing? Well, boxing is a process that allows you to convert a value type to a reference type, on-the-fly so to speak. And this happens a lot more than you think.
Unboxing is of course the opposite, so we have a reference type and we convert it back into a value type, an example I think:

			//A value type
			int x = 24;
			//Box it into an object (reference type)
			object objX = x;
			//Unbox it back into a value type
			int anotherX = (int)objX;

So, why does this happen more often than I think? – you might well ask. Well, working with collections of objects is the answer, another example we think again:

			ArrayList myListOfObjects = new ArrayList();
			//The Arraylist Add() method expects an object, so here we're boxing each one of these ints:
			myListOfObjects.Add(10);
			myListOfObjects.Add(20);
			myListOfObjects.Add(30);
			myListOfObjects.Add(40);
			//Now we want them back as ints:
			for (int i = 0; i < myListOfObjects.Count; i++)
			{
				 //...so we have to unbox each one:
				Console.WriteLine(((int)myListOfObjects[i]));
			}

As you see, this process relies heavily on boxing and unboxing, and why is that a problem? Well two things:

Firstly we obviously have a type safety issue, we are casting objects back into a set type, and any exceptions from this are not known until runtime. In the above simple example this is not a problem, but in complex systems dealing with many different value types, this is a very real concern.

Secondly there is a serious performance issue – the steps involved in boxing and unboxing are:

  1. Allocate a new object on the heap.
  2. Copy the value of the stack based value type into this object.
  3. When unboxing we have to copy the value of the heap object back into a new stack variable.
  4. Finally we end up with a load of objects lying around in the heap waiting for the garbage collector.

Again the above example would never cause any issues, but imagine you have a list with several thousand objects and you start to see the problem.

So what’s the solution? Well for a long time Typed Collections was the best solutions around. I won’t go into much detail on these (because they are not the answer my friends!) Basically you create a custom collection type for each specific type that you need to deal with. So for instance you might have a OrderCollection and a ProductCollection. The basic idea is that you wrap an ArrayList with strongly typed access methods. However, this only solves the type safety issue – but at the same time it also saddles us with a very labour intensive process and a likely maintenance nightmare. Furthermore, as these are nothing but wrappers around more general collection types, we still encounter the same boxing/unboxing issues.
So not much of a solution at all. Who will save me you ask? Generics will save you, that’s who.

Generics to The Rescue

Yes, indeed we are in need of rescuing, so let’s send in the proverbial cavalry.

So in a nutshell generic collections (System.Collections.Generic) allows you to create typed collections on-the-fly (i.e. specify the type at the point of creation) and at also removes the need for boxing and unboxing value types.

How so? Well magic really – or not. Basically we have a List<T> (T is used by convention, for type) which defines a load of operations that are common for typed collections, and collections generally. In our code we supply something like List<Orders> = new List<Orders>(); – and the compiler then replaces the T with Orders and viola! We have a typed collection – which we can create and use in a generic way.
What great fun.

So, other stuff we can do is to create our own generic methods – but you probably won’t really need to so we’ll skip over that bit. Also note that if you have a generic method that takes parameters, you can optionally leave out the type in the <> and rely on the compiler to do some type inference – which you could probably have some fun with using polymorphism.

Further more you can also create generic classes, again I’ll skip over that a bit lightly as it’s really quite specialised. However note that the default keyword is overloaded and when used in generics this can be used to set a generic parameter to it’s default value. So for instance an int would be zero etc etc.

Finally following along with this line of possibilities, you can of course also create a Custom Generic Collection – and once again you are quite unlikely to do this and it almost defies the whole point of generics. Like normal typed collections this is nothing but a wrapper around a generic List<T> collection.
One advantage to be gained is that you would be able to add your own specialised add/insert/remove methods – so thinking about it this might not be completely away with the fairies, but nevertheless I will leave the details out of this.

One interesting aspect of custom generic classes is the use of the where keyword, like this:

		// MyGenericClass derives from Object, while
		// contained items must be a class implementing IDrawable
		// and support a default ctor.
		public class MyGenericClass<T> where T : class, IDrawable, new()
		{...}

This is obviously interesting from a polymorphic point of view, and you can now do all kinds of operations on this generic collection. The options for the where keyword are:

Generic Constraint Meaning in Life
where T : struct The type parameter <T> must have System.ValueType in its chain of inheritance.
where T : class The type parameter <T> must not have System.ValueType in its chain of inheritance (e.g., <T> must be a reference type).
where T : new() The type parameter <T> must have a default constructor. This is very helpful if your generic type must create an instance of the type parameter, as you cannot assume the format of custom constructors. Note that this constraint must be listed last on a multiconstrained type.
where T :NameOfBaseClass The type parameter <T> must be derived from the class specified by NameOfBaseClass.
where T :NameOfInterface The type parameter <T> must implement the interface specified by NameOfInterface. Multiple interfaces can be separated as a comma-delimited list.

A very useful feature.

Generic Inheritance

Finally – finally, note that you can also inherit from generic classes and collections (you may define virtual and abstract methods as well), and also that you can indeed create generic interfaces and implement these.

Advertisements

One Response to .NET Collections and Generics in C#

  1. Pingback: .NET System Data Types, a few notes on « Blog Bustin' .NET Beats

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: