Archive for the ‘Development’ Category.

Mammoth VPS is now live

After many months of hard work, our new service Mammoth VPS was launched this week. The service is a good fit for us, I think: it makes use of the many years of experience we’ve had managing servers at Telstra but, being virtualized, requires relatively little physical hands-on work at the data centre.

It was also an interesting project from a development point of view, figuring out how to automate all the various Xen-related tasks. We have also taken the presumably unusual route of running the load-balanced ASP.NET web application on both Windows and Linux. This required partially reimplementing the built-in FormsAuthentication class (which behaves differently in Mono) to allow the session cookie to be readable on both platforms, but otherwise it is performing well. The website’s load times are really snappy, so I’m happy in that regard.

UrlRoutingModule, StaticFileHandler and IIS6

Today I was making sure that GZIP and the HTTP Expires header were enabled on all the content of our new ASP.NET MVC application. We have a little bit of code in our Global.asax to set up compression and expiry appropriately.

I was surprised to find that, despite using wildcard-mapping and IIS6 (which would result in all HTTP requests being ‘visible’ to Global.asax in a regular WebForms application), our stylesheets were no longer being set up correctly.

Fiddling with the HTTP Header Expires setting in IIS confirmed that the stylesheet file appeared to be processed directly by IIS, which I had previously thought was impossible when using wildcard + IIS6.

I also found it strange that an MVC application would differ from WebForms: my understanding was that the new System.Web.Routing.UrlRoutingModule simply passed on-disk files to the appropriate handler, in this case System.Web.StaticFileHandler.

However, with a bit of testing I found this was not the case: modifying the web.config with:

<system.web>
  <httpHandlers>
    <add verb="*" path="*.css" type="System.Web.StaticFileHandler"/>
  </httpHandlers>
</system.web>

resulted in the Global.asax processing the request. This blog post mentions the potential behaviour of System.Web.DefaultHttpHandler in terms of passing files back to IIS (though it’s not mentioned where or how this was learned), and indeed this does appear to be exactly the case in MVC.

Poking through System.Web.DefaultHttpHandler in Reflector, it’s not clear to me how this actually works, but it’s clear from my testing that a typical MVC application will have its static on-disk files sent directly by IIS.

How much engineering is too much?

Jared sent me this blog post, which I found quite interesting. The article itself is, I feel, worded rather poorly, right from the title (I don’t think anyone really argues about data access any more), but the core concept of “how much engineering is too much?” is discussed with more rigour in the comments.

As an employer who is both supportive and encouraging of the ALT.NET patterns and processes (though there’s little agreement on exactly what those are), I have my own perspective on this issue. While it can sometimes be tempting to think about good software development principles in a vacuum, it’s also important as developers to remember why we are employed: to build software.

As employees, we are a cost (the input) that is subtracted from the revenue our employer gains from the software we make (the output). Since for most employees their salary does not vary according to output, the most valuable employees are those that create the most software per unit of time. However, many of us have rightfully rallied against RAD tools as creating unmaintainable solutions that are hard to alter as requirements change.

If you want to be a valuable employee (and chances are you do, since a valuable employee is a well-paid employee) it is still your primary responsibility to figure out how you can develop as much software as you can per unit of time. There are really three parts to this:

  • Time to market: how long does it take to get each version ready for deployment, given a predictable incremental approach to requirements?
  • Defect rate: defects have two costs: the time to fix them, and the negative impact they have on customer satisfaction.
  • Malleability: how long does it take to get successive versions ready for deployment as requirements change?

RAD tools focus too heavily on the first criterion while ignoring the others. Over-engineering focuses too much on the third criterion while ignoring the first. Conery summarises this as “Simplicity is beautiful”; the best developers will adopt an approach that satisfies all three criteria.

Custom data binding with MonoRail 1.0

Yesterday I had a situation where I wanted to receive a domain entity as an action parameter, similar in concept to the built-in ActiveRecord data-binding. In my situation though I wanted to use a slug to identify the entity. My URLs look like this:

/news/slug-goes-here

which I want to result in calling into:

public class NewsController
{
    public void Index(NewsItem item)
    {
    }
}

The first step here is to set up routing to suit. We currently just use an XML file to hold our routing rules, so this route entry just looks like:

<rule>
    <pattern>^/news/([^/]+)$</pattern>
    <replace>/news/index?item=$1</replace>
</rule>

With this in place, our Index action will receive item as an argument. All that’s left to do is tell MonoRail how to bind the item argument to our NewsItem type. There are two approaches available here: you can derive a class from DataBinder and pass it to the SmartDispatcherController constructor, or you can create a new attribute that implements IParameterBinder and attach it to the action’s parameter. Unfortunately in MonoRail 1.0 the DataBinder class, from what I can tell, is really only extensible via inheritance, so I have used the attribute approach.

MonoRail’s source is a real help here, since you can inspect DataBindAttribute.cs to see what the class should be doing. Here is the implementation I ended up with:

public class NewsItemAttribute : Attribute, IParameterBinder
{
    // From, prefix, exclude, allow and the helper methods used below
    // (CreateBinder, ConfigureValidator, BindInstanceErrors,
    // PopulateValidatorErrorSummary) are copied from DataBindAttribute
    // and omitted here for brevity.

    public int CalculateParamPoints(SmartDispatcherController controller, ParameterInfo parameterInfo)
    {
        CompositeNode node = controller.ObtainParamsNode(From);
        IDataBinder binder = CreateBinder();
        return binder.CanBindObject(typeof(string), prefix, node) ? 10 : 0;
    }

    public object Bind(SmartDispatcherController controller, ParameterInfo parameterInfo)
    {
        IDataBinder binder = CreateBinder();
        ConfigureValidator(controller, binder);
        CompositeNode node = controller.ObtainParamsNode(From);

        // Bind the raw request value as a string slug, then look up the entity.
        var slug = (string) binder.BindObject(typeof(string), prefix, exclude, allow, node);
        var instance = IoC.Resolve<INewsItemRepository>().GetBySlug(slug);

        BindInstanceErrors(controller, binder, instance);
        PopulateValidatorErrorSummary(controller, binder, instance);
        return instance;
    }
}

The only line of note here is the IoC.Resolve<INewsItemRepository>() call. To my knowledge you can’t use DI with attributes, so we’re stuck with using the service locator approach. The rest of the code is straight from the DataBindAttribute source.

I’ve not yet had the chance to use ASP.NET MVC in a project, but from what I’ve read the ModelBinder facility appears to be a much cleaner approach to this problem than is possible with MonoRail 1.0.

Increasing the Windows XP IIS connection limit

Stefan passed on this handy blog post discussing the IIS connection limit in Windows XP. The default is 10 connections which I had always presumed could not be changed, but it can actually be increased to 40 connections – which is a bit more useful if you need to do any performance testing.

Avoiding the Complexity Spiral

At work I intermittently talk about “complexity spirals”: the phenomenon where someone makes the wrong (usually overcomplicated) choice in software design, but because of time constraints or some other reason the development group is unable to go back and make the right choice.

Sooner or later, users want a new feature. That feature is complicated to implement on top of the wrong solution. If you go ahead and implement it, you’ve now got two complicated bits of code instead of two simple ones. Soon you spiral out of control, as the complexity piles up and it becomes very time consuming to add functionality.

All this can be avoided if you recognise that the original design was wrong. This article on Raymond’s Windows blog interested me today, as it’s a well-known example of where the “wrong” decision (.lnk files instead of symbolic links as first-class members of the file system) is so incongruous with the rest of the operating system that implementing what seems like a simple behaviour that users will want is essentially impossible.

When should you deploy a new build?

Working for a small company doing ongoing contract work for a large company, there can often be disagreements over how to solve problems. Large companies like to solve problems with “process”: a series of steps that must be performed to carry out some task.

The key thing about a process is that it absolves everyone of responsibility: if you encounter failure after carrying out the process, it is the process at fault. Unfortunately process failure leads to process review, with the end result typically being more steps added to the process.

One process that is familiar to many developers is that of website deployment: the process of actually changing what’s online from an old release to a new release. At Mammoth we firmly believe in many of the tenets of agile development and so have a release every one to two weeks. We store our releases in Subversion so the actual mechanics of getting a new release onto the site are both easy and repeatable.

The matter of when during the day (or night) to actually deploy a build seems to have an easy answer at first. Let’s say for example that you have spent the day testing your application; it’s now 4pm and time to deploy. The deployment is carried out, and your database immediately falls over: it turns out your application has a bug that causes 1000 extra queries on a popular page.

This outrageous mistake has caused ten minutes of downtime until the deployment was rolled back to the prior release (you do have a rollback plan, don’t you?). At this point, the deployment process comes under attack: “Why is there a deployment at 4pm? We should deploy at 4am when no one is using the site!”

At first (and second) glance, this line of thought is hard to argue with, and it is in fact the current process utilised by our client. It does appear to work: the site essentially never crashes immediately after being deployed.

Upon closer analysis though, this is not the least bit surprising. The load on our website at 4am is essentially the same as that created during testing. If the website was going to crash with this level of load, it would already have done so during the test process, and the defect would already have been fixed.

So at this point everyone pats themselves on the back for another successful deployment and goes to bed. The reality is somewhat different however: what has been deployed is essentially a ticking time bomb. The site may very well still have a bug in it that, when exposed to the level of load that 4pm entails, causes the entire site to crash.

This I think is the crux of why 4am deployments (on their own) are bad: they foster a false sense of safety. Recall that our actual problem was that a defect caused the number of users at 4pm to crash the site. There is only one way to actually prevent this problem: ensure that such a defect is not in the site at 4pm. A couple of methods spring to mind as ways to do this:

  • Utilise a load testing process prior to each deployment. This should catch most of your problems.
  • Ensure there is adequate monitoring of your database, CPU, memory and other resources, so that it is obvious when a defect causes a sudden day-over-day spike.

A secondary problem is that our client is rigid about the one-release-per-week rule: if a release doesn’t “take”, the only option is to roll back. This creates somewhat of a quandary when combined with 4am releases: if you do not roll back until say 4pm, customers have been exposed for 12 hours to new functionality that suddenly disappears.

What’s my answer then? My preference is for 10am deployments: our load peaks at night, so 10am is still relatively quiet, and the development team is at work and so on hand to look at any issues. But whatever the time of the release, I think being able to make a secondary update later in the day is crucial: with a limited budget and weekly releases, the occasional defect will make it onto the site; that, I think, is simply reality. What is important then is to focus on keeping the customer happy by both avoiding peak-time deployments (potential downtime) and avoiding rollbacks (feature loss).

Performance of concurrent XmlReader construction

Hosting FeedZero on a Linux server using Mono can be trying at times. It’s not that Mono itself is particularly troublesome (I actually think it’s amazingly good), but more that Mono is the red-headed stepchild of the wider .NET community, much like those who try to build various command-line UNIX tools on Windows.

Usually running .NET assemblies on Mono “just works” but when you encounter a bug – particularly in third party libraries – it seems I am often the first person to have run into it (judging by Google) and so have no real choice except to diagnose it myself.

Worse, many .NET libraries – even open source ones – do not even try to support Mono. Unlike GameCreate, FeedZero utilises a number of third-party libraries and so is more exposed to these sorts of problems. I encountered three separate problems in three separate libraries this week all of which I had to diagnose and either fix or find a work around for.

The most recent issue I encountered was a problem with Argotic, which calls itself a “syndication framework” but is more plainly described as a library for parsing RSS feeds. I noticed while inspecting our service logs that, with relative regularity, our updates would pause for minutes at a time, often 5 or more. After inspecting a few occurrences, the common theme was that after each pause an error was displayed due to the RSS feed being invalid in some way.

Armed with the log data, we set about building a test executable that parsed a list of RSS feeds. Running the test against a sample list obtained from the service logs gave the same results as observed in production: given a list of 16 feeds, 3 of which were invalid, the test would pause for multiple minutes before finally finishing. However, running the test executable with a single-item list consisting of any one of the three bad RSS feeds did not cause the pausing. Worse, Windows did not seem to have this behaviour.

Due to the number of feeds we need to scan, FeedZero does RSS feed updates in multiple threads at once; the above behaviour was consistent with some sort of concurrency bug. These three classes of bug (concurrency-related, third-party library, Mono-only) are probably the most tiresome to diagnose, and here was an issue that fell into all three categories.

I eventually narrowed my test case down to updating any two invalid RSS feeds at once and still had the pause. For a third-party library (and indeed any distinct module of code), the easiest way to test for a concurrency issue is to simply prevent that module or library from being called by more than one thread at a time. This is very easy to do using C#’s built-in object locking feature:

private static readonly object _lock = new object();
public void UpdateFeed(string url)
{
    ...
    GenericFeed feed = new GenericFeed();
    lock (_lock)
        feed.Load(url);
    ...
}

Sure enough, with the simple two-line addition of a static object on which to lock, the problem went away. Unfortunately for me, FeedZero really needs to get good performance on its feed updating, so I would need to identify the underlying cause in Argotic. Through some tedious but necessary analysis I was able to narrow the problem down to the construction of System.Xml.XPath.XPathDocument, a part of the BCL.

My next step was to write a program from scratch that demonstrates the problem, is not reliant on any other libraries and can be freely distributed. I ended up with a command-line executable that created two XPathDocument objects simultaneously with an HTML page as the input, with a command-line argument permitting a lock to be used on the constructor.
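For illustration, a program of that shape might look roughly like the sketch below. It is not the original test executable: the class name, the `--lock` argument and the inline HTML string are all assumptions made for the example, and because it parses an in-memory string rather than a downloaded page it should not be expected to reproduce the timings described here.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;
using System.Xml;
using System.Xml.XPath;

// Sketch only: builds two XPathDocuments at once from non-XML input,
// optionally serialising construction with a lock.
public static class ConcurrentParseTest
{
    private static readonly object _lock = new object();

    // Hypothetical stand-in for a downloaded HTML page (deliberately not well-formed XML).
    private const string Html =
        "<html><head><title>Not a feed</title></head><body><p>Hello<br></body></html>";

    // Attempts to construct an XPathDocument; returns false if the parser rejects the input.
    public static bool TryParse(string content, bool useLock)
    {
        try
        {
            if (useLock)
            {
                lock (_lock) { new XPathDocument(new StringReader(content)); }
            }
            else
            {
                new XPathDocument(new StringReader(content));
            }
            return true;
        }
        catch (XmlException)
        {
            return false;
        }
    }

    // Runs two parses simultaneously and returns the elapsed wall-clock time.
    public static TimeSpan Run(bool useLock)
    {
        Stopwatch watch = Stopwatch.StartNew();
        Thread[] threads = new Thread[2];
        for (int i = 0; i < threads.Length; i++)
        {
            threads[i] = new Thread(() => TryParse(Html, useLock));
            threads[i].Start();
        }
        foreach (Thread t in threads) t.Join();
        return watch.Elapsed;
    }

    public static void Main(string[] args)
    {
        bool useLock = args.Length > 0 && args[0] == "--lock";
        Console.WriteLine("lock={0} elapsed={1}", useLock, Run(useLock));
    }
}
```

Running it once with `--lock` and once without gives the two timings to compare.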

This simple program shows the problem; on my PC it took 6 seconds using the lock and over 5 minutes without it. I then turned to Windows, where my first run took 9 seconds without the lock but a subsequent run took almost 2 minutes. Through additional reading, I determined that XPathDocument apparently uses a System.Xml.XmlReader internally and adjusted my test executable to construct XmlReaders instead; the problem remained.

Finally I altered my executable to perform the test 10 times and report the runtime of each so I could look for average runtime; on both Windows and Linux with the lock I received about 6 seconds – but without the lock, the results are more interesting. While Windows seems to be not too bothered without the lock on the first attempt, on the second attempt it threw some sort of internal timeout exception.

So, both Mono and Windows essentially have very poor performance when attempting to parse two non-XML documents simultaneously; I will package up my test executable and submit a bug to Novell for the Mono issue.

So, the final outcome is that I really had no choice but to lock; however, a secondary observation from this process is that relying on XmlReader to throw an exception on non-XML takes a pretty long time (multiple seconds), which is too long for my purposes. Finally I settled on passing the downloaded data through the following regular expression to determine if it’s likely to be a feed:

private static readonly Regex _isFeedRegex = new Regex("<([\\w_-]+:)?(feed|rss|rdf)",
    RegexOptions.IgnoreCase | RegexOptions.Compiled);

If there is no match, then we can skip the entire parsing attempt; with this change we can now fail invalid feeds in a few hundred milliseconds.
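As a rough sketch of how such a pre-check could be wired in, the pattern (written with a literal `<`) can be wrapped in a small helper that is called before any parsing is attempted. The `FeedSniffer` class and `LooksLikeFeed` method names are made up for this example and are not part of FeedZero or Argotic.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FeedSniffer
{
    // Matches an opening <feed, <rss or <rdf tag, with an optional namespace prefix.
    private static readonly Regex _isFeedRegex = new Regex("<([\\w_-]+:)?(feed|rss|rdf)",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    // Cheap heuristic: true if the content contains something resembling a feed root element.
    public static bool LooksLikeFeed(string content)
    {
        return !string.IsNullOrEmpty(content) && _isFeedRegex.IsMatch(content);
    }

    public static void Main()
    {
        Console.WriteLine(LooksLikeFeed("<?xml version=\"1.0\"?><rss version=\"2.0\"></rss>")); // True
        Console.WriteLine(LooksLikeFeed("<rdf:RDF>"));                                           // True
        Console.WriteLine(LooksLikeFeed("<html><body>404</body></html>"));                       // False
    }
}
```

This is only a heuristic: an HTML page containing a tag such as `<feedback>` would also match, so a positive result still has to go through the real parser.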

Update: Further testing shows that the standard reader obtained from XmlReader.Create notices immediately (on first attempted read) that the document isn’t XML, which makes an even easier way to help out XPathDocument.
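A minimal sketch of that kind of check is below, assuming the downloaded data is already held in a string; the `XmlSniffer` class and `StartsLikeXml` method are hypothetical names for illustration. It only looks at the start of the document, so input that begins like XML but is malformed further in will still pass.

```csharp
using System;
using System.IO;
using System.Xml;

public static class XmlSniffer
{
    // Returns false if the reader rejects the content before reaching a root element.
    public static bool StartsLikeXml(string content)
    {
        if (string.IsNullOrEmpty(content)) return false;
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(content)))
            {
                // MoveToContent skips the XML declaration, comments and whitespace
                // and stops at the first content node.
                return reader.MoveToContent() == XmlNodeType.Element;
            }
        }
        catch (XmlException)
        {
            return false;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StartsLikeXml("<rss version=\"2.0\"></rss>")); // True
        Console.WriteLine(StartsLikeXml("404 Not Found"));               // False
    }
}
```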