Archive → August, 2008

Bayesian filtering

Currently I’m thinking a lot about Bayesian filtering and new ways to apply it to my everyday software.

For those who aren’t familiar with the subject, I recommend reading A Plan for Spam by Paul Graham. Wikipedia has a few articles of interest as well; I’d recommend the ones on Bayes’ theorem and Bayesian spam filtering.
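To give a rough idea of what such a filter computes, here is a minimal sketch of the step that combines per-token probabilities into one score, following the formula described in A Plan for Spam. The class and method names (SpamScore, Combine) are made up for illustration, and the per-token probabilities are assumed to have been estimated elsewhere from word counts in spam and non-spam mail.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only.
public static class SpamScore
{
    // Combines per-token spam probabilities into a single score:
    //   p = prod(p_i) / (prod(p_i) + prod(1 - p_i))
    // Each p_i is clamped to [0.01, 0.99] so that a single token
    // cannot force the result to exactly 0 or 1.
    public static double Combine(IEnumerable<double> tokenProbabilities)
    {
        double spam = 1.0;
        double ham = 1.0;
        foreach (double raw in tokenProbabilities)
        {
            double p = Math.Min(0.99, Math.Max(0.01, raw));
            spam *= p;
            ham *= 1.0 - p;
        }
        // With no tokens at all this yields 0.5, i.e. "no evidence".
        return spam / (spam + ham);
    }
}
```

Two tokens with probabilities 0.9 and 0.8 combine to roughly 0.97, so a couple of moderately suspicious words already push the score high. For long messages the plain products can underflow; summing logarithms would be one way around that.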

Currently I see a lot of areas where this technique can be applied – generally everything revolves around e-mail, RSS feeds and automatic website crawling.

I hope that I’ll have the time in the near future to actually implement something.

Memory consumption

I’m currently devoting some of my spare time to developing a tool that can track the memory consumption of individual running processes and display the data on a graph.

To narrow this description down a bit – the tool lets you select a process and shows a graph of its used memory over the last few minutes. Pretty nifty.
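The sampling part of such a tool could look roughly like the sketch below. This is not the actual tool, just an illustration assuming the working set is the value of interest; the class name MemorySampler and its members are hypothetical, and the timer and graph drawing are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical helper, for illustration only.
public sealed class MemorySampler
{
    private readonly Process _process;
    private readonly int _capacity;
    private readonly Queue<long> _samples = new Queue<long>();

    public MemorySampler(Process process, int capacity)
    {
        if (process == null) throw new ArgumentNullException("process");
        if (capacity <= 0) throw new ArgumentOutOfRangeException("capacity");
        _process = process;
        _capacity = capacity;
    }

    // Takes one sample; call this from a timer, e.g. once per second.
    public void Sample()
    {
        // Process caches its values, so refresh before reading.
        _process.Refresh();
        if (_samples.Count == _capacity)
        {
            _samples.Dequeue(); // drop the oldest sample
        }
        _samples.Enqueue(_process.WorkingSet64); // bytes of physical memory
    }

    // Returns the samples, oldest first, ready to be plotted.
    public long[] GetHistory()
    {
        return _samples.ToArray();
    }
}
```

With one sample per second and a capacity of 300, GetHistory() always holds the last five minutes. Process.PrivateMemorySize64 could be sampled the same way if private bytes are more relevant than the working set.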

Windows XP/Vista only lets me track these values as plain numbers, with no history – so a feature like this would be very useful in my everyday work.

At the moment I’m trying to narrow down some performance and memory consumption issues in our product which – if solved properly – would allow some very large data sets to be imported.

Double keyed tables

Do you ever find yourself in need of double keyed tables? Well I do!

If you’re not quite sure what I mean by this, I’ll try to sum it up. Imagine a Hashtable where you supply a key pair instead of a single key.

myHashtable.Put(key1, key2, value);

Or, if you prefer that flavor:

myHashtable[key1, key2] = value;

So far I haven’t found a good implementation of this, which leads me to the only solution: do it yourself!

Edit: During my work yesterday I made a generic class for double keyed tables. It took about 30 minutes overall and was really worth it! Since it was written at work, I naturally can’t reveal the source code here.
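For readers who want to try the idea, the following is a minimal sketch of one possible approach. It is not the class written at work; it simply assumes a dictionary of dictionaries, and the names DoubleKeyedTable and TryGetValue are chosen for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch: a table addressed by a pair of keys,
// stored as one inner dictionary per first key.
public sealed class DoubleKeyedTable<TKey1, TKey2, TValue>
{
    private readonly Dictionary<TKey1, Dictionary<TKey2, TValue>> _rows =
        new Dictionary<TKey1, Dictionary<TKey2, TValue>>();

    // Adds the value, or overwrites an existing one for the same key pair.
    public void Put(TKey1 key1, TKey2 key2, TValue value)
    {
        Dictionary<TKey2, TValue> row;
        if (!_rows.TryGetValue(key1, out row))
        {
            row = new Dictionary<TKey2, TValue>();
            _rows[key1] = row;
        }
        row[key2] = value;
    }

    // Returns false (and default(TValue)) when the key pair is absent.
    public bool TryGetValue(TKey1 key1, TKey2 key2, out TValue value)
    {
        Dictionary<TKey2, TValue> row;
        if (_rows.TryGetValue(key1, out row))
        {
            return row.TryGetValue(key2, out value);
        }
        value = default(TValue);
        return false;
    }

    // Enables the table[key1, key2] syntax for reading and writing.
    public TValue this[TKey1 key1, TKey2 key2]
    {
        get
        {
            TValue value;
            if (!TryGetValue(key1, key2, out value))
            {
                throw new KeyNotFoundException("No value stored for this key pair.");
            }
            return value;
        }
        set { Put(key1, key2, value); }
    }
}
```

Both flavors from above then work: table.Put("a", 1, "x") and table["a", 1] = "x". Nesting dictionaries also makes it cheap to add a method that lists everything stored under one first key; a single dictionary keyed on a small pair struct would be the obvious alternative if that is never needed.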

Updating .NET Versions

When Microsoft releases a new version of the .NET Framework there is no doubt that it will contain some nifty new features that we can all benefit from.

Lately the releases have contained data and presentation oriented additions to the framework which would allow us to save many lines of code if used properly. The only problem is – our users need the framework!

It is safe to say that 95% of our users have .NET version 1.0 and 1.1 installed. When it comes to version 2.0 (which our program is built against) the number drops to about 85-90%, and framework versions 3.0 and 3.5 are installed on fewer than 5% of user systems. The 10-15% of our users that lack the required framework version 2.0 is bad enough! For various reasons that I won’t go into here, we can’t use the usual approach and bundle the installer in an MSI package that automatically installs the needed framework – we have to do this manually! This means that two developers spent most of a day extending our installation script so that it checks for the proper .NET version and downloads and installs version 2.0 if needed.

I really think that Microsoft could help us out a lot here by simply making all versions of the .NET Framework mandatory updates through Windows Update. This would raise the installation rate among our users to around 95% or even higher – they simply do as told.

This problem is actually one of the main reasons that the Java platform was discarded during the planning phase of the project. With Java the install base drops below the 5% mark, which means that every install would need a bundled runtime install. The impact on the users would be enormous. They would see 150+ MB of disk space used and much longer download times. Not acceptable.

Annoying merge

At work we use a Subversion server to store, branch and merge all our code. Generally this works out great! The quality of life of the programming staff has risen quite a few notches since we introduced it. Even for a single developer I can only recommend it – the possibility of branching and merging whilst hopping back and forth between different revisions is great! Simply great!

Anyway – one thing that constantly annoys me is the merge tool’s lack of code knowledge. This means that often when two developers have been editing the same lines of the same file there is a conflict that the merge tool can’t resolve properly. Often when files are merged the result doesn’t compile! This means more management – which means an unhappy developer. Me!

To the best of my knowledge the merge tool could do a much better job if it simply knew what it was merging. A merge tool written specifically for C# code would in my opinion do a much better job in these special cases. I’m not even going to start on the subject of merging designer-generated GUI code. We do our very best to avoid this – it is always a source of great failure!

I’ll conclude this rant by noting that my irritation obviously hasn’t peaked yet. If it had, I would’ve put some effort into a solution.