Garbage Collection and Finalising and/or Disposing Objects in .NET

This post is based on Chapter 8 of Andrew Troelsen’s “Pro C# 2008 and the .NET 3.5 Platform”, available from Apress.

So, yet another chapter from Mr. Andy Troelsen’s brilliant book – quite possibly the best book on .NET I’ve ever read, stuffed full of goodness and gotchas!

Garbage collection, well round here it happens every Wednesday morning, when I’m rudely awoken by a dirty bin man (he rings my bell… to access my backyard… oh my – I’ve gone gay all of a sudden – good thing my girlfriend wouldn’t read this blog if it was a cold day in hell).
Anyways – back in the safety of .NET – garbage collection is not quite so predictable – so let’s begin. I think this time we will try a slightly more abbreviated approach (I spend my life writing these summaries).

Classes, Objects and References (and Stack and Heap Memory)

So, classes are blueprints or definitions, from which objects are created and placed in working memory. The new keyword creates an object on the heap and returns a reference to it; the reference itself, when held in a local variable, is stored on the stack for further use in your application.
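
To make the reference idea a bit more concrete, here is a minimal sketch. The Car class and its names are made up for illustration and are not from the book. Two local variables end up pointing at the same heap object, so a change made through one is visible through the other:

```csharp
using System;

// Hypothetical class, invented purely for this illustration.
class Car
{
    public string Name;
    public Car(string name) { Name = name; }
}

static class Program
{
    static void Main()
    {
        // 'new' allocates the Car on the managed heap; 'first' is a local
        // variable that holds a reference to it.
        Car first = new Car("Zippy");

        // Copying the variable copies the reference, not the object.
        Car second = first;
        second.Name = "Viper";

        Console.WriteLine(first.Name);                             // Viper
        Console.WriteLine(object.ReferenceEquals(first, second));  // True
    }
}
```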

Let’s just pause for air and examine this heap and stack stuff a bit more. Check out an excellent article on this here: http://www.c-sharpcorner.com/UploadFile/rmcochran/csharp_memory01122006130034PM/csharp_memory.aspx?ArticleID=9adb0e3c-b3f6-40b5-98b5-413b6d348b91
Which tells us that (generally speaking) we can:
Think of the Stack as a series of boxes stacked one on top of the next. We keep track of what’s going on in our application by stacking another box on top every time we call a method (called a Frame). We can only use what’s in the top box on the stack. When we’re done with the top box (the method is done executing) we throw it away and proceed to use the stuff in the previous box on the top of the stack. The Heap is similar except that its purpose is to hold information (not keep track of execution most of the time) so anything in our Heap can be accessed at any time. With the Heap, there are no constraints as to what can be accessed like in the stack.

Recall that you get a stack trace in your exceptions. With the knowledge above this now makes sense as a list of method calls, or frames, on the stack. Each frame can hold a number of references pointing to objects lying around in the heap (not in disorder, but freely available). This also explains why memory placed on the stack is freed as soon as the current method returns.
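
If you want to see those frames for yourself, something along these lines should do it. This is only a sketch: the Frames class and its method names are invented for the example, and the NoInlining attributes are there because I am assuming an optimised build might otherwise merge the methods and hide a frame:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper class for this illustration.
static class Frames
{
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Inner()
    {
        throw new InvalidOperationException("boom");
    }

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void Outer()
    {
        Inner();
        // Work after the call keeps Outer's frame on the stack while Inner runs.
        Console.WriteLine("never reached");
    }

    static void Main()
    {
        try
        {
            Outer();
        }
        catch (InvalidOperationException ex)
        {
            // One line per stack frame, innermost first.
            Console.WriteLine(ex.StackTrace);
        }
    }
}
```

The trace lists the frame for Inner first and the frame for Outer below it, which is the stack of boxes described above, read from the top down.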

So – my learned friends, what happens, when a method (or frame) returns and we have all these objects in the heap that are no longer needed? Well, let’s sit back and let Mr Andy Troelsen explain for us:

The .NET Garbage Collector

Once it’s determined that a heap object is no longer reachable from your application (i.e. nothing refers to it any more: no local variable or method parameter on the stack, and no static field) it becomes eligible for GC and will be destroyed (i.e. removed from memory) the next time the GC runs. That is whenever the .NET runtime deems it prudent or, in exceptional cases, when your code explicitly asks for a collection.
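
One way to watch reachability in action is with a WeakReference, which lets you ask whether an object is still around without keeping it alive yourself. A rough sketch follows; the class and method names are mine, and the forced GC.Collect() is for demonstration only. The rooted object is guaranteed to survive, while the unreachable one will usually, but not necessarily, be gone after the collection:

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical helper class for this illustration.
static class Reachability
{
    // Allocates an object and hands back only a weak reference to it. Once this
    // method returns, no local, parameter or static field refers to the object.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static WeakReference CreateUnreachable()
    {
        return new WeakReference(new byte[1024]);
    }

    static void Main()
    {
        byte[] rooted = new byte[1024];
        WeakReference weakRooted = new WeakReference(rooted);
        WeakReference weakLost = CreateUnreachable();

        GC.Collect(); // forced only so there is something to observe

        Console.WriteLine("rooted still alive: {0}", weakRooted.IsAlive);
        Console.WriteLine("lost still alive:   {0}", weakLost.IsAlive);

        // Keeps 'rooted' reachable up to this point, whatever the JIT decides.
        GC.KeepAlive(rooted);
    }
}
```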

What this means is of course that you don’t have to explicitly dispose or null objects – although the latter certainly does no harm, and the former can be good practice in some cases – but more on that later (below).

Anyways – the GC sounds pretty inefficient; you can’t help but imagine some nasty process that is continually scanning through working memory, checking whether there is an active reference to each of the in-memory heap objects.
Well, it’s not, because it’s optimised and designed to run only when necessary (low memory being the only certain trigger). When the GC does run, it suspends all threads and then uses “Object Generations” to scan through memory efficiently.
Object Generations is a scheme whereby objects that survive a GC (i.e. are checked but found to be in use) are promoted to an older generation. There are three generations: 0 for newly allocated objects, then 1 and 2 for survivors. When the GC runs it always scans the youngest generation first, and if this frees sufficient memory, no further collection occurs. If it does not, the GC continues with the next generation, and so on up through the generations.
If you keep the stacked methods or boxes on the stack memory in mind this will make sense to you. The end result is that long-lived heap objects that are used throughout the application will not be checked again and again, whereas new local objects will be checked first.
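
You can watch an object being promoted with GC.GetGeneration(). This is just a sketch, and the forced collections are only there to make something happen. On most runtimes you will see the number climb after each collection, but the exact values are up to the runtime, so I have not written them in:

```csharp
using System;

static class Program
{
    static void Main()
    {
        object survivor = new object();

        Console.WriteLine("Highest generation: {0}", GC.MaxGeneration);
        Console.WriteLine("After allocation:   gen {0}", GC.GetGeneration(survivor));

        GC.Collect(); // the object is still referenced, so it survives
        Console.WriteLine("After one GC:       gen {0}", GC.GetGeneration(survivor));

        GC.Collect();
        Console.WriteLine("After two GCs:      gen {0}", GC.GetGeneration(survivor));

        // Make sure the reference is considered live for the whole method.
        GC.KeepAlive(survivor);
    }
}
```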

The System.GC Class

As a quick aside, note that this is a classic example of a static utility class, which it would never make sense to instantiate in code.
Anyways, moving on: this static class allows you to work with the GC manually, triggering collections, checking memory and generations and so on. I’ve never known or seen this done in any application, but Troelsen tells us that it can be useful if a) you want to make sure the GC does not run during a certain procedure (so you force a collection beforehand) or b) you know that a (very) large number of objects have just gone out of scope.

More interesting are methods like GetTotalMemory(), which gives you an estimate of the bytes currently allocated on the managed heap. You can also get the generation of a given object by passing a reference to it to GetGeneration().
Another interesting method is AddMemoryPressure(long bytesAllocated), which “Informs the runtime of a large allocation of unmanaged memory that should be taken into account when scheduling garbage collection” (from MSDN). Remember of course that the GC only looks after managed memory, so if you are for instance calling some COM library, this can be a method to bear in mind.
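
Here is a small sketch pulling a few of these System.GC members together. The 10 MB figure is an arbitrary number I picked to stand in for some unmanaged allocation, and RemoveMemoryPressure() is the matching call you make once that memory has been released:

```csharp
using System;

static class Program
{
    static void Main()
    {
        // false = do not force a collection first, just give the current estimate.
        Console.WriteLine("Managed heap (estimate): {0} bytes", GC.GetTotalMemory(false));
        Console.WriteLine("Gen 0 collections so far: {0}", GC.CollectionCount(0));

        // Hypothetical size of an unmanaged buffer allocated elsewhere.
        const long unmanagedBytes = 10 * 1024 * 1024;
        GC.AddMemoryPressure(unmanagedBytes);
        try
        {
            // ... work with the unmanaged resource here ...
        }
        finally
        {
            // Must be balanced with the same amount once the memory is freed.
            GC.RemoveMemoryPressure(unmanagedBytes);
        }

        // Forcing a collection: rarely needed, shown only for completeness.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        Console.WriteLine("Gen 0 collections now: {0}", GC.CollectionCount(0));
    }
}
```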

…which leads us nicely onto

Finalizable and Disposable Objects

So, what do we need these for when we’ve got our lovely little GC?
Well, you need these when you are working with managed code (i.e. your .NET code) that references unmanaged code!
Makes sense. So, what are they and how do we build them?

Finalizable Objects

The base Object class has a protected virtual method called Finalize(), which is called when the GC collects an object or when the application shuts down. Object’s own Finalize() does nothing, but you are of course free to override it in any class. However, you do not use the override keyword; instead you use your class name prefixed with a tilde, with no access modifier (it is implicitly protected) and no parameters, so for example: ~MyClass() { /* finalize here */ }
Kind of like a constructor. The tilde is borrowed from C++ destructor syntax; the C# compiler turns it into an override of Finalize() that also calls the base class’s Finalize().

When objects are placed on the heap, the .NET runtime determines whether they have overridden the Finalize method, and if so adds them to a queue of objects to be finalized on removal from the heap.
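
A bare-bones finalizable class might look like the sketch below. ResourceWrapper and the counter are my own inventions, and a real finalizer would release an unmanaged handle rather than bump a number. When, and even whether, the finalizer has run by the time the count is printed is up to the runtime, so treat the output as indicative only:

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

// Hypothetical class for this illustration.
class ResourceWrapper
{
    public static int FinalizedCount;

    // Finalizer: no access modifier, no parameters, never called directly.
    ~ResourceWrapper()
    {
        // Real code would free unmanaged resources here.
        Interlocked.Increment(ref FinalizedCount);
    }
}

static class FinalizerDemo
{
    // Creates an object and immediately drops the only reference to it.
    [MethodImpl(MethodImplOptions.NoInlining)]
    public static void CreateAndDrop()
    {
        new ResourceWrapper();
    }

    static void Main()
    {
        CreateAndDrop();

        GC.Collect();                  // demonstration only
        GC.WaitForPendingFinalizers(); // block until queued finalizers have run

        Console.WriteLine("Finalizers run so far: {0}", ResourceWrapper.FinalizedCount);
    }
}
```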

Disposable Objects

A disposable object implements the IDisposable interface, and unlike a finalizable object, it’s assumed that the object user will manually call Dispose() once finished with the object. At least they should, and btw, the object user is you!
So to be clear: the GC has no clue about the IDisposable interface, and to be frank, it just doesn’t want to know. So Dispose should be called by your code, while the object is still alive and well (and referenced) on the heap. Of course, how else are you going to call its Dispose() method?

So what does one do when implementing the Dispose() method? Simples, you do two things: clean up all unmanaged resources, and then call Dispose on any Disposable managed objects! Note that you can of course use the is keyword to check if an object is IDisposable.
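
Those two steps could be sketched like this. Parent and Child are made-up names, and there are no real unmanaged resources in the example, so step one is just a comment:

```csharp
using System;

// Hypothetical disposable member.
class Child : IDisposable
{
    public bool Disposed;
    public void Dispose() { Disposed = true; }
}

// Hypothetical owner that holds a mixed bag of objects.
class Parent : IDisposable
{
    private readonly object[] parts;
    public bool Disposed;

    public Parent(params object[] parts) { this.parts = parts; }

    public void Dispose()
    {
        // 1. Clean up unmanaged resources (none in this sketch).

        // 2. Call Dispose on any managed members that are disposable.
        foreach (object part in parts)
        {
            if (part is IDisposable)
            {
                ((IDisposable)part).Dispose();
            }
        }
        Disposed = true;
    }
}

static class Program
{
    static void Main()
    {
        Child child = new Child();
        Parent parent = new Parent(child, "not disposable");

        parent.Dispose();
        Console.WriteLine("Child disposed: {0}", child.Disposed);
    }
}
```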

One slightly annoying thing is that things like database connections and file streams also have Close() methods, which for the most part do the same job as Dispose(). You can call both without causing an error, but really, if it’s IDisposable, call Dispose() – and then you can get all geeky and clever when some lesser developer comes to you and accuses you of not having “closed” your database connection (tell him it’s not a ruddy garden gate, and that Close is more or less an alias for Dispose, and watch his eyes blink in confusion).

One better and cleaner alternative to the above is the using keyword (which happily also doubles as the keyword for importing namespaces and setting up aliases).
When you use a using block like the example below, the IL code generated is actually a try/finally block, with a call to Dispose on the object placed in the finally block (however this does of course not mean that any exceptions are handled!):

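			//Note you can place >1 object of the same type in a using block
			//(needs "using System.IO;" at the top of the file; the file names are just placeholders)
			using(FileStream fileStream = new FileStream("first.txt", FileMode.OpenOrCreate),
				fileStream2 = new FileStream("second.txt", FileMode.OpenOrCreate))
			{
				//Do your thing here
			}
			//Know that both objects have been disposed (aka "closed") here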
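
To see what the compiler is doing for you, here is a sketch that puts a using block next to a hand-written try/finally that does roughly the same thing. Tracker is an invented class whose Work() method always throws, so you can see that Dispose still runs and that the exception still escapes:

```csharp
using System;

// Hypothetical disposable type for this illustration.
class Tracker : IDisposable
{
    public static int DisposeCount;

    public void Work()
    {
        throw new InvalidOperationException("something went wrong");
    }

    public void Dispose()
    {
        DisposeCount++;
        Console.WriteLine("Dispose called");
    }
}

static class UsingDemo
{
    // What you write.
    public static void UsingVersion()
    {
        using (Tracker t = new Tracker())
        {
            t.Work();
        }
    }

    // Roughly what the compiler turns it into: try/finally, no catch.
    public static void ExpandedVersion()
    {
        Tracker t = new Tracker();
        try
        {
            t.Work();
        }
        finally
        {
            if (t != null) ((IDisposable)t).Dispose();
        }
    }

    static void Main()
    {
        try
        {
            UsingVersion();
        }
        catch (InvalidOperationException ex)
        {
            // Dispose has already run, but the exception still reaches us.
            Console.WriteLine("Caught: " + ex.Message);
        }
        // Prints "Dispose called", then "Caught: something went wrong".
    }
}
```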

Whenever possible, always use using blocks – they are good, clean and we like them!

Finally, we have the GC.SuppressFinalize(this) method call, which can be used to build a foolproof object that combines the best of finalizable and disposable objects.
The general idea is that the object user might forget to call Dispose(), in which case we want the object to clean up its unmanaged resources when the GC collects and finalizes it. The method is simple: put your clean-up code in a private method and call this from both the Dispose and Finalize methods, but in Dispose, follow it with a call to GC.SuppressFinalize(this). As you might have guessed, this tells the GC that the object no longer needs finalizing, so its finalizer is skipped.
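
Put together, the combined approach might look something like this. It is a sketch under my own assumptions: SafeResource, CleanUp and the counter are illustrative names, and the bool parameter is one common way of telling the clean-up method whether it is safe to touch other managed objects (it is not when the finalizer is calling):

```csharp
using System;

// Hypothetical class combining Dispose and a finalizer.
class SafeResource : IDisposable
{
    private bool disposed;

    public bool IsDisposed { get { return disposed; } }
    public int CleanUpCount { get; private set; }

    // Called by the object user.
    public void Dispose()
    {
        CleanUp(true);
        // Clean-up is done, so the finalizer no longer needs to run.
        GC.SuppressFinalize(this);
    }

    // Safety net in case the object user forgets to call Dispose.
    ~SafeResource()
    {
        CleanUp(false);
    }

    private void CleanUp(bool disposing)
    {
        if (disposed) return; // makes repeated Dispose calls harmless

        if (disposing)
        {
            // Dispose managed members here. Skipped during finalization,
            // because they may already have been finalized themselves.
        }

        // Release unmanaged resources here (none in this sketch).
        CleanUpCount++;
        disposed = true;
    }
}

static class Program
{
    static void Main()
    {
        SafeResource resource = new SafeResource();
        resource.Dispose();
        resource.Dispose(); // second call does nothing

        Console.WriteLine("Clean-ups performed: {0}", resource.CleanUpCount); // 1
    }
}
```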

Simples!

