Sunday 16 December 2012

SSD, GPT, EFI, TLA, OMG!

I finally bought an SSD, so I took the drive change as an excuse to try out some other nifty new technologies as well: UEFI and GPT. Getting them to work (along with dual-booting two operating systems - Gentoo + Windows 7) wasn't trivial, so I'll describe what was required to get it all humming nicely.

The hardware part was easy. The laptop I have came with a 1TB 5.4k extra-slow hard drive plugged into its only SATA 3.0 port, but that's not a problem. There's another SATA 2.0 port, dedicated to a DVD drive - why would anyone need that? I replaced the main drive with a fast Intel SSD (450MBps write, 500MBps read, 22.5K IOPS - seriously, they've become so cheap that if you're not using one you must be some kind of masochist who likes to stare blankly at the screen waiting for hard drive LEDs to blink), ordered a "Hard Driver Caddy" off eBay ($9 including postage, although it took 24 days to arrive from Hong Kong) and started system installation.

HDD and SSD on an open laptop

Non-chronologically, but sticking to the hardware topic: the optical drive replacement caddy comes in three different sizes (for slot drives/slim 9.5mm/standard 12.7mm) and that's pretty much the only thing you have to check before you order one. Connectors and even the masking plastic bits are standardised, so the replacement operation is painless. The caddy itself weighs about 35g (as much as a small candy bar), so your laptop will end up a bit lighter than before.

DVD and an HDD in the caddy:

DVD and HDD in a replacement caddy

You'll want to remove the optical drive while it's ejected, as the release mechanism is electrical, and one of the two hooks holding the bezel is only accessible when the drive is open. I used a flat screwdriver to unhook it, but be careful, as the mask is quite flimsy and might break. It's only a cosmetic problem, but still. Here are the hooks:

That's pretty much everything that's needed from the hardware side - now to the software. I was following a Superuser post, Make UEFI, GPT, Bootloader, SSD, USB, Linux and Windows work together, which describes the dual-boot installation procedure quite well. My first problem was that I couldn't get a UEFI boot to work from a DVD (when I still had it). I went for the easiest solution: an Ubuntu live USB, which managed to start in UEFI mode just fine.

There are quite a few "gotchas" here: you can't install a UEFI system if you're not already booted into UEFI mode (check dmesg output for EFI messages). The starting payload needs to be 64-bit and reside on a FAT32 partition on a GPT disk (oversimplifying a bit, but those are the requirements if you want to dual-boot with Windows). A side note for inquiring minds: you'll also need a legal copy of Windows 7/8, as the pirate bootloaders require booting in BIOS mode. Oh, and your SATA controller needs to be set to AHCI mode, because otherwise TRIM commands won't reach your SSD and it will get slower and slower as it fills with unneeded (deleted, but not trimmed) data.

Once I had Ubuntu started, I proceeded with a mostly standard Gentoo installation procedure. Make sure you do your GPT partitioning properly (see the Superuser post, although the 100MB for the EFI boot partition might be too much - I have 16MB used on it and that's unlikely to change) and remember to mount the "extra" partition at /boot/efi before you install Grub2. The additional kernel options needed are listed on the Gentoo Wiki, and the Grub2 installation procedure for UEFI is documented there as well. Make sure that your Linux partitions are ext4 and have the discard option enabled.

All of this resulted in my machine starting - from pressing the power button to logging onto the Xfce desktop - in 13 seconds. Now it was time to break it by getting Windows installed. Again, the main hurdle proved to be starting the damn installer in UEFI mode (and you won't find out which mode it runs in until you try to install to a GPT disk and it refuses to continue because of unspecified errors). Finally I got it to work by taking the USB key I had created for Ubuntu, replacing all of the files on the drive with the Windows installation DVD contents and extracting the Windows bootloader. That was the convoluted part, because a "normal" Windows USB key will only start in BIOS mode.

  • Using 7zip, open file sources/install.wim from the Windows installation DVD and extract \1\Windows\Boot\EFI\bootmgfw.efi from it.
  • On your bootable USB, copy the folder efi/microsoft/boot to efi/boot.
  • Now take the file you extracted and place it in efi/boot as bootx64.efi.

This gave me a USB key that starts the Windows installer in UEFI mode. You might want to disconnect the second drive (or just disable it) for the installation, as Windows sometimes decides to put its startup partition on the second drive.

Windows installation done, I went back to the Ubuntu live USB and restored Grub2. The last catch with the whole process is that, due to some bug, Grub2 won't auto-detect Windows, so you need an entry in the /etc/grub.d/40_custom file:

menuentry "Windows 7 UEFI/GPT" {
 insmod part_gpt
 insmod search_fs_uuid
 insmod chain
 search --fs-uuid --no-floppy --set=root 6387-1BA8
 chainloader ($root)/EFI/Microsoft/Boot/bootmgfw.efi
}

The 6387-1BA8 identifier is the partition's UUID; you can easily find it by running ls -l /dev/disk/by-uuid/.

Dual-booting is usually much more trouble than it's worth, but I did enjoy getting this all to work together. Still, probably not a thing for the faint of heart ;-) I also have to admit that after two weeks I no longer notice how quick boot and application start-up are (Visual Studio 2012 takes less than a second to launch with a medium-sized solution - too fast to measure practically); it's just that every non-SSD computer now feels glacially slow.

In summary: why are you still wasting your time using a hard drive instead of an SSD? Replace your optical drive with a large HDD for data and put your operating system and programs on a fast SSD. The hardware upgrade is really straightforward to do!

Sunday 30 September 2012

Handling native API in a managed application

Although Windows 8 and .NET 4.5 have already been released, bringing WinRT with them and promising the end of P/Invoke magic, there's still a lot of time left until programmers can really depend on that. For now, the most widely available way to interact with the underlying operating system from a C# application, when the framework doesn't suffice, remains P/Invoking the Win32 API. In this post I describe my attempt to wrap an interesting part of that API for managed use, pointing out several possible pitfalls.

rusted gears

Let's start with a disclaimer: almost everything you need from your .NET application is doable in clean, managed C# (or Visual Basic or F#). There's usually no need to descend into P/Invoke realms, so please consider again whether you really have to break from the safe (and predictable) world of the Framework.

Now take a look at one of the use cases where the Framework does not deliver the necessary tooling: I have an application starting several child processes, which may in turn start other processes as well, over which I have no control. But I still need to turn the whole application off, even when one of the grandchild processes breaks in a bad way and stops responding. (If this is really your problem, then take a look at KillUtil.cs from CruiseControl.NET, as this is ultimately what I had to do.)
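
For reference, the brute-force fallback boils down to shelling out to taskkill. Here's a minimal sketch of that idea (not the actual KillUtil code; /T kills the whole process tree, /F forces termination of hung processes):

using System.Diagnostics;

internal static class ProcessTreeKiller
{
    // Kills the process with the given id together with everything it has spawned.
    internal static void KillTree(int processId)
    {
        using (Process killer = Process.Start("taskkill",
            string.Format("/PID {0} /T /F", processId)))
        {
            killer.WaitForExit();
        }
    }
}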

There is a very nice mechanism for managing child processes in Windows, called Job Objects. I found several partial attempts at wrapping it into a managed API, but nothing that really fitted my purpose. The entry point for grouping processes into jobs is the CreateJobObject function. This is a typical Win32 API call, requiring a structure and a string as parameters. Also, the meaning of the parameters may change depending on their values. Not really programmer-friendly. There are a couple of articles on how the native types map onto .NET constructs, but it's usually fastest to take a look at PInvoke.net and write your code based on the samples there. Keep in mind that it's a wiki and examples will often contain errors.

What kind of errors? For one, they might not consider 32/64-bit compatibility. If that's important to you, be sure to compile your application in both versions - if your P/Invoke signatures aren't correct you'll see some ugly heap corruption exceptions. Another thing often missing from the samples is error checking. Native functions do not throw exceptions; they return status codes and update the global error status, in a couple of different ways. Checking how a particular function communicates failure is probably the trickiest part of wrapping. For this particular method I ended up with the following signature:

[DllImport("kernel32", SetLastError = true, CharSet = CharSet.Auto)]
private static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string lpName);

The static extern modifiers are required by the P/Invoke mechanism; private is a good practice - calling those methods requires a bit of special handling on the managed side as well. You might also have noticed that I omitted the .dll part of the library name - this doesn't matter on Windows, and Mono will substitute a suitable extension based on the operating system it's running on. For the error reporting to work, it's critical that the status is checked as soon as the method returns. Thus the full call is as follows:

IntPtr result = CreateJobObject(IntPtr.Zero, null);
if (result == IntPtr.Zero)
    throw new Win32Exception();

On failure, this will read the last reported error status and throw a descriptive exception.

Every class holding unmanaged resources should be IDisposable and also include proper cleanup in its finalizer. Since I'm only storing an IntPtr here, I'll skip the finalizer, because I might not want the job object to be closed in some scenarios. In general that's a bad pattern; it would be better to have a parameter controlling the cleanup instead of "forgetting" the Dispose() call on purpose.
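
A minimal sketch of such a wrapper might look like this (CreateJobObject and CloseHandle are the real kernel32 entry points; the class itself is only an illustration, not the full wrapper from the end of this post):

using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public sealed class JobObject : IDisposable
{
    [DllImport("kernel32", SetLastError = true, CharSet = CharSet.Auto)]
    private static extern IntPtr CreateJobObject(IntPtr lpJobAttributes, string lpName);

    [DllImport("kernel32", SetLastError = true)]
    private static extern bool CloseHandle(IntPtr hObject);

    private IntPtr handle;

    public JobObject()
    {
        handle = CreateJobObject(IntPtr.Zero, null);
        if (handle == IntPtr.Zero)
            throw new Win32Exception();
    }

    // No finalizer on purpose - see above; in the general case you would want one.
    public void Dispose()
    {
        if (handle != IntPtr.Zero)
        {
            CloseHandle(handle);
            handle = IntPtr.Zero;
        }
    }
}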

There's quite a lot of tedious set-up code involved in job group control that I won't be discussing in detail (it's at the end of this post if you're interested), but there are a couple of tricks I'd like to point out. The first, pointed out multiple times in the P/Invoke documentation (yet still missing from some samples), is the [StructLayout (LayoutKind.Sequential)] attribute, instructing the runtime to lay out your structures in memory exactly as they are declared in the source. Without it, padding might be applied or the members might even get reordered because of memory access optimisation, which would break your native calls in ways that are difficult to diagnose (especially if the size of the structure still matched).
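
For example, a structure destined for a native call gets declared alongside the other interop code roughly like this (an abbreviated sketch mirroring the Win32 basic job limit structure - double-check the exact field list against the SDK headers before relying on it):

[StructLayout(LayoutKind.Sequential)]
private struct JobObjectBasicLimitInformation
{
    public long PerProcessUserTimeLimit;  // LARGE_INTEGER
    public long PerJobUserTimeLimit;      // LARGE_INTEGER
    public uint LimitFlags;               // DWORD
    public UIntPtr MinimumWorkingSetSize; // SIZE_T - pointer-sized, differs between 32 and 64 bit
    public UIntPtr MaximumWorkingSetSize; // SIZE_T
    public uint ActiveProcessLimit;       // DWORD
    public UIntPtr Affinity;              // ULONG_PTR
    public uint PriorityClass;            // DWORD
    public uint SchedulingClass;          // DWORD
}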

As I mentioned before, Win32 API calls often vary the meaning of their parameters based on their values, in some cases expecting differently sized structures. When this happens, information on the size of the structure is also required. Instead of counting bytes manually, you can rely on Marshal.SizeOf (typeof (JobObjectExtendedLimitInformation)) to do this automatically.
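
A generic helper along these lines shows the pattern (the SetInformationJobObject signature is the real kernel32 one; the helper and its names are my own sketch, not the wrapper from the end of the post):

using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

internal static class JobObjectNative
{
    [DllImport("kernel32", SetLastError = true)]
    private static extern bool SetInformationJobObject(
        IntPtr hJob, int infoClass, IntPtr lpJobObjectInfo, uint cbJobObjectInfoLength);

    // Marshals any blittable struct to unmanaged memory and passes its size along,
    // so the native side knows which variant of the structure it received.
    internal static void SetInformation<T>(IntPtr job, int infoClass, T info) where T : struct
    {
        int length = Marshal.SizeOf(typeof(T));
        IntPtr buffer = Marshal.AllocHGlobal(length);
        try
        {
            Marshal.StructureToPtr(info, buffer, false);
            if (!SetInformationJobObject(job, infoClass, buffer, (uint)length))
                throw new Win32Exception();
        }
        finally
        {
            Marshal.FreeHGlobal(buffer);
        }
    }
}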

The third tip is that native flags are best represented as enum values and OR'ed/XOR'ed like normal .NET enums:

[Flags]
private enum LimitFlags : ushort
{
    JobObjectLimitKillOnJobClose = 0x00002000
}

Wrapping an unmanaged API often reveals other problems with its usage. In this case, the first problem was that Windows 7 launches Visual Studio in Compatibility Mode, which wraps it - and every program it starts - in a job object. Since a process can't (at least not in Windows 7) belong to more than one job, my new job group assignment would fail and the code would never work inside a debugger. As usual, StackOverflow proved to be helpful in diagnosing and solving this problem.

However, my use case is still not fulfilled: if I add my main process to the job group, it will be terminated as well when I close the group. If I don't, then a child process might spin off children of its own before it is added to the group. In native code, this would be handled by creating the child process as suspended and resuming it only after it has been added to the job object. Unfortunately for me, it turns out that Process.Start performs a lot of additional set-up that would be much too time-consuming to replicate. Thus I had to go back to the simple KillUtil approach.
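
Adding an already running process to the job is the easy part - the race window between Process.Start and this call is exactly the problem described above (AssignProcessToJobObject is the real kernel32 function; the surrounding helper is illustrative):

using System;
using System.ComponentModel;
using System.Diagnostics;
using System.Runtime.InteropServices;

internal static class JobAssignment
{
    [DllImport("kernel32", SetLastError = true)]
    private static extern bool AssignProcessToJobObject(IntPtr hJob, IntPtr hProcess);

    // Anything the child spawns *after* this call lands in the job as well,
    // but whatever it managed to start before then is a stray.
    internal static void AddToJob(IntPtr job, Process process)
    {
        if (!AssignProcessToJobObject(job, process.Handle))
            throw new Win32Exception();
    }
}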

I've covered a couple of the most common problems with calling native methods from a managed application and presented some useful patterns that make working with them easier. The only part missing is the complete wrapper for the API in question:

Friday 31 August 2012

Dynamic log level with log4net

Out of all the features of log4net, the most useful and the least known at the same time is the possibility for the logger to dynamically change the logging level based on future events. Yes, future! Nothing like a little clairvoyance to produce clean and usable log files.

log4net can buffer incoming events and, when an error occurs, write out the sequence of actions that led to it - and if nothing goes wrong, the excess messages are dropped. The class that allows for that is BufferingForwardingAppender. It wraps around another log appender (e.g. file or console or smtp or database or eventlog or whatever else you would like log4net to write to) and uses an evaluator to decide when to flush the buffered data. Let's have a look at a sample configuration (app.config file):

<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <configSections>
    <section name="log4net" type="log4net.Config.Log4NetConfigurationSectionHandler, log4net" />
  </configSections>
  <log4net>
    <!-- see http://logging.apache.org/log4net/release/config-examples.html for more examples -->
    <appender name="ConsoleAppender" type="log4net.Appender.ConsoleAppender">
      <threshold value="WARN" />
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%-4timestamp [%thread] %-5level %logger %ndc - %message%newline" />
      </layout>
    </appender>
    <!-- you should use a RollingFileAppender instead in most cases -->
    <appender name="FileAppender" type="log4net.Appender.FileAppender">
      <file value="my_application.log" />
      <!-- pattern is required or nothing will be logged -->
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%-4timestamp [%thread] %-5level %logger %ndc - %message%newline" />
      </layout>
    </appender>
    <appender name="BufferingForwardingAppender" type="log4net.Appender.BufferingForwardingAppender" >
      <evaluator type="log4net.Core.LevelEvaluator">
        <threshold value="ERROR" />
      </evaluator>
      <bufferSize value="50" />
      <lossy value="true" />
      <appender-ref ref="FileAppender" />
    </appender>
    <!-- root is the main logger -->
    <root>
      <!-- default is INFO, this performs initial filtering -->
      <level value="DEBUG"/>
      <!-- messages are sent to every appender listed here -->
      <appender-ref ref="BufferingForwardingAppender"/>
      <appender-ref ref="ConsoleAppender" />
    </root>
  </log4net>
</configuration>

Now this is a wall of text. What is going on here?

  • configSections is a standard .NET configuration section declaration
  • then we declare a ConsoleAppender that will print everything of level WARN or above to console - you can configure a ColoredConsoleAppender instead to have prettier output
  • following that is a FileAppender, which simply outputs everything to a file
  • next one is the magical BufferingForwardingAppender, containing an evaluator that triggers for every message of level ERROR or above, a lossy buffer of size 50 (meaning that when more messages are buffered, the oldest ones are discarded) and a target appender that will receive the messages when they are flushed
  • last element is the root logger, which is the default sink for all the messages - it contains references to our appenders and will feed messages to them

So far so good. log4net now needs to be instructed to parse this configuration - my preferred way is with an assembly attribute:

[assembly: log4net.Config.XmlConfigurator (Watch = true)]

You can specify a file path in this attribute if you don't want to store your configuration inside app.config. A simple way to create a logger is just

private static readonly log4net.ILog log = log4net.LogManager.GetLogger ( System.Reflection.MethodBase.GetCurrentMethod ().DeclaringType );

and we're good to go. Now all that remains is dumping some log messages into our log.

for (int i = 0; i < 1025; i++)
{
    log.DebugFormat("I'm just being chatty, {0}", i);
    if (i % 2 == 0)
        log.InfoFormat("I'm just being informative, {0}", i);
    if (i % 20 == 0)
        log.WarnFormat("This is a warning, {0}", i);
    if (i % 512 == 0)
        log.ErrorFormat("Error! Error! {0}", i);
}

When you execute this sample code you will see every warning and error printed to the console. The contents of my_application.log, however, will look different: the file will contain only the errors and the 50 messages that were logged before each of them. Now that's much easier to debug, isn't it?

Please also take a look at how I include parameters in the logging calls: using the DebugFormat() overloads means that the strings are not formatted until it's necessary - so if a log message is suppressed, no new string will be allocated and no ToString() will be called. This might not change your application's performance a lot, but it's a good practice that is worth following. And one last thing to remember: log4net, by default, does not do anything. In order to get any output, you need to explicitly request it - most likely through configuration.
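
To make the difference concrete, here's a small illustrative comparison (count, elapsed and BuildExpensiveDiagnosticDump are made up for the example; IsDebugEnabled is the standard ILog property):

// Formatted only if DEBUG is actually enabled for this logger:
log.DebugFormat("Processed {0} items in {1} ms", count, elapsed);

// Allocates the concatenated string even when the message is discarded:
log.Debug("Processed " + count + " items in " + elapsed + " ms");

// For genuinely expensive arguments, guard the call explicitly:
if (log.IsDebugEnabled)
    log.Debug(BuildExpensiveDiagnosticDump());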

Wednesday 1 August 2012

NuGet proxy settings

This post is based on code present in NuGet 2.0.

NuGet reads web proxy settings from three distinct sources, in order:

  • configuration files
  • environment variables
  • current user's Internet Options

While the layout of IE's Connection Settings is probably familiar to you if you are behind a corporate firewall and require proxy configuration to access the Internet, the first two options require a bit of explanation.

For configuration files, NuGet first considers .nuget\NuGet.config and then falls back to %APPDATA%\NuGet\NuGet.config. The relevant configuration entries are http_proxy, http_proxy.user and http_proxy.password. You can either edit them manually, by adding a line under the <settings> node:

<add key="http_proxy" value="http://company-squid:3128" />

or you can add them from NuGet command line:

nuget.exe config -set http_proxy=http://company-squid:3128

If those variables aren't found in the configuration files, NuGet will fall back to checking the standard environment variables for proxy configuration. By pure coincidence, the variables have the same names as the configuration options ;-) . The names are not case-sensitive, but you might have to experiment a bit until you get NuGet to properly parse your settings if you have a space where it wouldn't expect one (e.g. in your user name).

Finally, if you are running NuGet under your own user account, rather than a service account (e.g. on a continuous build server), it will simply pick up whatever you have configured in the Control Panel as the system proxy server. All credentials configured there (including the Active Directory single sign-on mechanism) should work without any effort on your part.

Monday 4 June 2012

Why aren't C# methods virtual by default?

Recently, during GeeCON 2012 conference, I had a very interesting conversation with Martin Skurla on differences between the .NET runtime and the Java Virtual Machine. One of the more surprising divergences is centred around the virtual keyword.

Virtual methods are one of the central mechanisms of polymorphic objects: they allow a descendant object to replace the implementation provided by the base class with its own. In fact, they are so important that in Java all public methods are virtual by default, even though this carries a small runtime overhead. Virtual method dispatch is usually implemented using a virtual method table, so each call to such a method requires an additional memory read to fetch the code address - it cannot be inlined by the compiler. On the other hand, a non-virtual method can have its address inlined in the calling code - or can even be inlined whole, as is the case with trivial methods such as most C# properties.

There are several ways of dealing with this overhead: the HotSpot JVM starts program execution in interpreted mode and does not compile the bytecode into machine code until it gathers some execution statistics - among those is information, for every method, on whether its virtual dispatch has more than a single target. If not, then the method call does not need to hit the VTable. When additional classes are loaded, the JVM performs what is called a de-optimization, falling back to interpreted execution of the affected bytecode until it re-verifies the optimization assumptions. While technically complex, this is a very efficient approach. .NET takes a different path, akin to the C++ philosophy: don't pay for it if you don't use it. Methods are non-virtual by default and the JIT performs the optimization and machine code compilation only once. Because virtual calls are much rarer, the overhead becomes negligible. Non-virtual dispatch is also crucial for the aforementioned special 'property' methods - if they weren't inlineable (and equivalent in performance to straight field access), they wouldn't be as useful. This somewhat simpler approach also has the benefit of allowing full compilation - the JVM needs to leave some trampoline code between methods so that it can de-optimize them selectively, while the .NET runtime, once it has generated the binaries for an invoked method, can replace (patch) the references to it with simple machine instructions.

I am not familiar with any part of the ECMA specification that would prohibit the .NET runtime from performing the de-optimization step, and thus from taking the HotSpot approach to the issue (apart from the huge Oracle patent portfolio covering the whole area). What I do know is that since the first version of the C# language did not choose virtual as the default, future versions will not change this behaviour - it would be a huge breaking change for existing code. I've always assumed that the performance trade-off rationale was the reason for the difference in behaviour - and this was also what I explained to Martin. Mistakenly, as it turns out.

As Anders Hejlsberg, the lead C# architect, explains in one of his interviews from the beginning of the .NET Framework, a virtual method is an important API entry point that does require proper consideration. From a software versioning point of view, it is much safer to assume method hiding as the default behaviour, because it allows full substitution according to the Liskov principle: if the subclass is used instead of an instance of the base class, the code behaviour will be preserved. The programmer has to consciously design with substitutability in mind; he has to choose to allow derived classes to plug into certain behaviours - and that prevents mistakes. C# is on its fifth major release, Java on its seventh, and each of those releases introduces new methods into some basic classes. Methods which, if your code has a derived class that already used the new method's name, constitute breaking changes (if you are using Java) or merely compilation warnings (on the .NET side). So yes, a good public API should definitely expose as many plug-in points as possible, and most methods in publicly extendable classes should be virtual - but C# designers did not want to force this additional responsibility upon each and every language user, leaving it up to a deliberate decision.
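
A tiny example of my own shows why hiding-by-default is the versioning-safe choice: if Describe() is added to Base in a later release, a Derived class that already had its own Describe() keeps compiling, callers holding a Base reference keep getting the base behaviour, and the compiler merely warns until you state your intent with new or override.

using System;

class Base
{
    // Non-virtual: calls made through a Base reference bind here at compile time.
    public string Describe() { return "Base"; }
}

class Derived : Base
{
    // 'new' hides the base method instead of overriding it; without the keyword
    // the compiler emits warning CS0108, but the behaviour is the same.
    public new string Describe() { return "Derived"; }
}

class Program
{
    static void Main()
    {
        Base viaBase = new Derived();
        Console.WriteLine(viaBase.Describe());       // prints "Base" - hiding, not overriding
        Console.WriteLine(new Derived().Describe()); // prints "Derived"
    }
}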

Tuesday 1 May 2012

Tracking mobile visitors with Google Analytics

I've seen some strange approaches to tracking mobile visits using Google Analytics, which is quite surprising - especially considering that this is something that Analytics does out of the box. Granted, the Standard Reporting -> Audience -> Mobile page does not show much, apart from mobile operating system and resolution, but there's a very nice tool that allows any report to be filtered by a custom parameter.

I'm not talking about Profiles, which, although powerful, are only applied as data is gathered, and cannot be selectively enabled and disabled for existing statistics. Advanced segments are a very powerful, yet not well-known tool. They can filter any existing report (e.g. Content, to see which pages should be the first to get a mobile-friendly layout). Most importantly - they can be mixed and matched, to show multiple facets of your site's traffic at once:

Visitors by browser

As Google has today enabled sharing of custom reports and advanced segments, you can just click my link to add Advanced segment - Mobile to your Google Analytics dashboard. If you would rather define it manually (and you should - as you'll probably want to define other advanced segments for your site), then proceed as follows:

  • Go to Standard Reporting -> Advanced Segments and click New Custom Segment
  • In the new form, set Name to Mobile, and parameters to Include, Mobile, Exactly matching, Yes
  • Press Save Segment and you're done.
Defining Advanced Segment for Mobile

To choose which segments are used for displaying the data, press Advanced Segments again, select the ones you want and press Apply. All Visitors brings you back to an unfiltered view.

Choosing active segments

And finally, a screenshot of the Mobile segment in action:

Mobile visitors vs. total traffic

Tuesday 24 April 2012

I want to live forever!

There is a concept of a singularity in general relativity, describing a place where gravitational forces become infinite and the rules of the universe no longer apply. This area is bounded by the event horizon, from which no knowledge of the internal state of the singularity can escape. By analogy, Vernor Vinge in 1982 coined the term technical singularity to describe the moment in the history of technology when the rate of acceleration of future development becomes infinite from the point of view of a bystander. This is based on the observation that all knowledge growth is self-propelling, and - as Ray Kurzweil argues - Moore's observation of exponential growth of computation capabilities extends both into the far past and the oncoming future.

Not surprisingly, such a topic is a potent source of inspiration for science fiction writers, bringing forth numerous stories. Doctor Manhattan from Watchmen, Amber from Accelerando and Adam Zamoyski from Perfect Imperfection are just a few of my favourite characters, taking positions on the curve of progress that are well beyond human capabilities. However, the singularity now seems close enough that it no longer resides in the realm of pure fiction - well-established futurologists place their bets as well, trying to proclaim the date of the breakthrough. Reading through the list of such predictions amassed by Ray Kurzweil, a curious pattern emerges: each of the prophets places the date within his own lifespan, hoping to experience the event himself.

Those bets may not be that far off: just from last year, I recall two large pharmaceutical companies starting clinical trials with yet another batch of medications promising to delay the aging process and to push it beyond the hundred-year milestone. The first journalist comments on the story also mentioned - with outrage - how this would necessitate another extension of the retirement age. Which is a bit ironic, considering that initially the Old Age Pension introduced by Otto von Bismarck covered workers reaching 70 years of age, who were only a small percentage of the overall workforce at that time. Before you comment with dismay, consider that passing - or even approaching - the technical singularity means a true end to the scarcity economy. It's a world close to the one shown in Limes inferior, Crux or the books of Cory Doctorow: a real welfare state, where every citizen can be provided with almost anything he needs.

Interestingly, Terry Pratchett hid a gem of an idea of how such a society is born in his book Strata: once a dependable life-prolonging technique is available, anyone earning enough per year to extend his life by at least another year becomes effectively immortal. The most amazing - and brutal - events happen at the brink of this revolution, for that truly is the event horizon: beyond the extension threshold, people are on their way to becoming gods and living forever. Being left behind is one of the scariest things that I can imagine. And unlike the gravitational singularity, this one has a border that permits communication. One-way, mostly, as it's not possible for an ant to understand the giant, but that makes the division even more glaring.

Those that are able to partake in the transition will be, in a way, the last human generation. Oh, surely we will not stop procreating, but the relation of power between the children and the parents will change dramatically: no longer are they raising an heir, an aid for their old days. As with the vampires of old tales, a child becomes a very expensive burden that only the wealthiest can afford, and a competitor for limited resources. I did mention before that this will be a post-scarcity economy, but some goods still remain in limited supply. A Mona Lisa, for example.

And if you are lucky enough to be a member of the chosen caste, why wouldn't you desire something so unique? After all, your wealth will be unimaginable, with unlimited time for gathering the spoils, and only so few from your generation to share this gift of time. That's the real meaning of the last generation - for others, too, will in the future rise to this plateau of eternal life. But being late to the party, most of them will never have the chance to amass such wealth and power.

I don't claim to know when the breakthrough will come. However, when it does - wouldn't it be terrible to miss it just by a few years? We already know some ways to extend one's life. If I can get ten, even five years more, my chances of participating in the singularity grow.
And so, I run.

Tuesday 6 March 2012

Converting NAnt build files to MSBuild projects

TL;DR: I have a NAnt-to-MSBuild converter available at https://github.com/skolima/generate-msbuild.

Initially, I envisioned implementing as faithful a translation of the build script as possible. However, after examining the idioms of both NAnt and MSBuild scripts I decided that a conversion producing results in accordance with those established patterns is a better choice. Investigating the build process of available projects revealed that converting the invocation of the csc task is enough to produce a functional Visual Studio solution. Translating tasks such as mkdir, copy, move or delete, while trivial to perform, would actually be detrimental to the final result. Those tasks are mostly employed in NAnt to prepare the build environment and to implement the “clean” target – the exact same effect is achieved in MSBuild by simply importing the Microsoft.CSharp.targets file. In a .csproj project conforming to the conventional file structure, such as is generated by the conversion tool, targets such as “PrepareForBuild” or “Clean” are automatically provided by the toolkit.

I planned to use the build listener infrastructure to capture the build process as it happens. The listener API of NAnt is not comprehensively documented, but exploring the source code of the project provides examples of its usage. Registering an IBuildListener reveals some clumsiness that suggests this mechanism has not seen much usage:

protected override void ExecuteTask()
{
  Project.BuildStarted += BuildStarted;
  Project.BuildFinished += BuildFinished;
  Project.TargetStarted += TargetStarted;
  Project.TargetFinished += TargetFinished;
  Project.TaskStarted += TaskStarted;
  Project.TaskFinished += TaskFinished;
  Project.MessageLogged += MessageLogged;

  // this ensures we are propagated to child projects
  Project.BuildListeners.Add(this);
}

The last line of this code sample is crucial, as it is a common practice to split the script into multiple files, with a master file performing initial setup and separate per-directory build files, one for each output assembly. This allows shared tasks and properties to be defined once in the master file and inherited by the child scripts. Surprisingly, build listeners registered for events are not passed on to the included scripts by default.

Practically every operation in the NAnt build process is broadcast to the project’s listeners, with *Started events providing an opportunity to modify the subject before it is executed and *Finished events exposing the final properties state, along with information on the step’s execution status (success or failure). Upon receiving each message the logger is able to access and modify the current state of the whole project.
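
As a rough sketch of the kind of handler involved (the property names follow the NAnt.Core.BuildEventArgs API as I recall it - treat this as an approximation rather than a verbatim excerpt from the converter):

private void TaskFinished(object sender, BuildEventArgs e)
{
    // Ignore failed steps and events that do not carry a task.
    if (e.Exception != null || e.Task == null)
        return;

    // Only successful <csc> invocations are interesting to the converter.
    if (e.Task.GetType().Name == "CscTask")
    {
        // record the output assembly, sources, references, defines...
    }
}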

Typical MSBuild use case scenarios

I have inspected several available open source projects to establish common MSBuild usage scenarios. I determined that although the build script format allows for deep customization, most users do not take advantage of this, instead relying on Visual Studio to generate the file automatically. One notable exception to this usage pattern is NuGet, which employs MSBuild's full capabilities for a custom deployment scenario. However, in order to comply with the limitations that the Visual Studio UI imposes on script authors, the non-standard code is moved to a separate file and invoked through the BeforeBuild and AfterBuild targets.

Thus, in practice, users employ the convenience of the .targets files’ “convention over configuration” approach (as mentioned in the previous post) and restrict their changes to those that can be performed through the graphical user interface: setting compiler configuration property values; choosing references, source files and resources to be compiled; or extending pre- and post-build targets. When performing an incremental conversion, those settings are preserved, so the user does not need to edit the build script manually.

The only exception to this approach is the handling of the list of source files included in the build: it is always replaced with the files used in the recorded NAnt build. I opted for this behavior because it is coherent with what developers do in order to conditionally exclude and include code in the build – instead of decorating Item nodes with Condition attributes, they wrap code inside the source files with #if SYMBOL_DEFINED/#else/#endif preprocessor directives. This technique is employed, for example, in the NAnt build system itself and has been verified to work correctly after conversion. It has the additional benefit of being easily malleable within Visual Studio – conditional attributes, on the other hand, are not exposed in the UI.

NAnt converter task

Because I meant the conversion tool to be as easy as possible for the developer to use, I have implemented it as a NAnt task. It might be even more convenient if the conversion were available as a command line switch to NAnt, but this would require the user to compile a custom version of NAnt instead of using it as a simple, stand-alone drop-in. To use the current version, you just have to add <generate-msbuild/> as the first item in the build file and execute a clean build.

As I showed in my previous post, the Microsoft Build project structure is sufficiently similar to NAnt’s syntax that an almost verbatim element-to-element translation is possible. However, as the two projects mature and introduce more advanced features (such as functions, in-line scripts and custom tasks), the conversion process becomes more complex. Instead of a shallow translation of unevaluated build variables, the converter I designed captures the flow of the build process and maps all known NAnt tasks to appropriate MSBuild items and properties. The task registers itself as a build listener and handles the TaskFinished and BuildFinished events.

Upon each successful execution of a csc task, its properties and sub-items are saved as appropriate MSBuild constructs. When the main project file execution finishes (because a NAnt script may include sub-project files, as is the case with the script NAnt uses to build itself), a solution file is generated which references all the created Microsoft Build project files.

As I mentioned earlier, I initially anticipated that translators would be necessary for numerous existing NAnt tasks. However, after performing a successful conversion of NAnt and CruiseControl.NET, I found out that only a csc to .csproj translation is necessary. The converter captures the output file name of the csc invocation and saves a project file with the same name, replacing the extension (.dll/.exe) with .csproj. If the file already exists then its properties are updated, to the extent possible. In the resulting MSBuild file all variables are expanded and all default values are explicitly declared.
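
A hedged sketch of how a captured csc invocation might be turned into a project file with the Microsoft.Build.Construction API (the variable names and the chosen properties are illustrative, not the converter's actual code):

using Microsoft.Build.Construction;

static class ProjectWriter
{
    static void SaveProject(string projectPath, string assemblyName, string[] sources)
    {
        ProjectRootElement root = ProjectRootElement.Create();

        ProjectPropertyGroupElement properties = root.AddPropertyGroup();
        properties.AddProperty("OutputType", "Library");
        properties.AddProperty("AssemblyName", assemblyName);

        ProjectItemGroupElement items = root.AddItemGroup();
        foreach (string source in sources)
            items.AddItem("Compile", source);

        // Importing the common targets pulls in Build, Clean, PrepareForBuild and friends.
        root.AddImport(@"$(MSBuildToolsPath)\Microsoft.CSharp.targets");

        root.Save(projectPath);
    }
}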

All properties that are in use by the build scripts on which the converter was tested have been verified to be translated properly. Several known items (assembly and project references, source files and embedded resources) are always replaced, but other items are preserved. Properties are set without any Condition attribute, thus if the user sets them from the Visual Studio UI, those more specific values will override the ones copied from the NAnt script.

I initially developed and tested the MSBuild script generator on the Microsoft .NET Framework, but I always planned for it to be usable on Mono as well. I quickly found out that Mono had no implementation of the Microsoft.Build assembly. This is a relatively new assembly, introduced in Microsoft .NET Framework version 4.0. As this new API simplified development of the converter greatly, I decided that instead of re-writing the tool using classes already existing in Mono, I would implement the missing classes myself.

Mono Project improvements

I created a complete implementation of Microsoft.Build.Construction namespace, along with necessary classes and methods from Microsoft.Build.Evaluation and Microsoft.Build.Exceptions namespaces. The Construction namespace deals with parsing the raw build file XML data, creating new nodes and saving them to a file. It contains a single class for every valid project file construct, along with several abstract base classes, which encapsulate functionality common to their descendants, e.g. ProjectElement is able to load and save a simple node, storing information in XML attributes, while ProjectElementContainer extends it and can also store child sub-nodes.

While examining the behavior of the Microsoft implementation of those classes strongly suggests that they store the raw XML in memory (as they are able to save the loaded file without any formatting modifications), the documentation does not require this behavior. As this would bring no additional advantages, and is detrimental to memory usage, my implementation only stores the parsed representation of the build script. The two exceptions to this are ProjectExtensionsElement and ProjectCommentElement, as they represent nodes that have no syntactic meaning from the MSBuild point of view and it is not possible to parse them in any way – thus their raw XML is kept and saved as-is.

A project file is parsed using an event-driven parsing model, also known as SAX. This is preferable because of performance reasons – the parser does not backtrack, and there is no need to ever store the whole file in memory. As subsequent nodes are encountered, the parent node checks whether its content constitutes a valid child, and creates an appropriate object.

As is suggested for Mono contributions, the code was created using a test-driven development approach, with NUnit test cases written first, followed by class stubs to allow the code to compile, and finally the actual API implementation. As the tests’ correctness was first verified by executing them against the Microsoft .NET implementation, this method ensures that the code conforms to the expected behavior even in places where the MSDN documentation is vague or incomplete.
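
An NUnit test in that spirit might look like this (an illustrative example of mine, not one lifted from the Mono test suite):

using Microsoft.Build.Construction;
using NUnit.Framework;

[TestFixture]
public class ProjectRootElementTest
{
    [Test]
    public void AddItemCreatesContainingItemGroup()
    {
        ProjectRootElement root = ProjectRootElement.Create();
        root.AddItem("Compile", "Program.cs");

        // AddItem is expected to create the enclosing ItemGroup on demand.
        Assert.AreEqual(1, root.ItemGroups.Count);
        Assert.AreEqual(1, root.Items.Count);
    }
}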

Evaluation in practice

After completing the implementation work, I tested the tool using two large open source projects that employ NAnt in their build process: Boo and IKVM.NET.

The Boo project consists mostly of code written in Boo itself and ships with a custom compiler, NAnt task and a Boo.Microsoft.Build.targets file for MSBuild, so a full conversion would require referencing those additional assemblies and would not provide much value. However, the compiler itself and the bootstrapping libraries are written in C#, providing a suitable test subject.

Executing the conversion tool required forcing the build to use the 4.0 .NET Framework (instead of 3.5) and disabling the Boo script that the project uses internally to populate MSBuild files. The initial conversion attempt revealed a bug in my implementation, as Boo employs a different layout of NAnt project files than the previously tested projects. Once I fixed the converter to take this into account and generate paths rooted against the .csproj file location instead of the NAnt .build file, the tool executed successfully and produced a fully working Visual Studio 2010 project that can be used for building the C# parts of the Boo project.

Testing with IKVM.NET followed a similar path, as most of the project consists of Java code, which cannot be compiled using MSBuild and does not lend itself to conversion. After I managed to perform the daunting task of getting IKVM.NET to compile, the <generate-msbuild/> task was executed and produced a correct Visual Studio solution, with no further fixes or manual tweaks necessary. The update functionality also worked as expected, setting build properties copied from NAnt where they were missing from the MSBuild projects.

Monday 6 February 2012

Build systems for the .NET Framework

When on 13th of February 2002 Microsoft released the first stable version of the .NET Framework, the ecosystem lacked an officially supported build platform. However, since early betas had been available shortly after the July 2000 Professional Developers Conference, a native solution – NAnt – emerged in August 2001, months before the framework itself became officially available. But it was not until 7th of November 2005 that Microsoft presented its own tool: MSBuild. For two years the competing systems coexisted in the .NET world, as MSBuild was a new, and relatively unpolished, product. When on 19th of November 2007 a second version of MSBuild (labeled 3.5, to match the .NET Framework version it accompanied) was released, it brought multiple improvements that developers had asked for. The community’s focus switched from NAnt to the Microsoft solution, and NAnt 0.86-beta1, released on 8th of December 2007, was the last release for almost three years. Although NAnt development started again in April 2010, this long stagnation led many of its previous users to believe the open source solution had been abandoned.

MSBuild 4.0 offers multiple improvements over NAnt: it ships with packaged Target files for commonly used project types, in accordance with “convention over configuration” paradigm; it has an ever-growing collection of community Tasks which perform various commonly executed build operations; it supports parallel builds; it integrates with Team Build (a Continuous Integration component of Microsoft Team Foundation Server) and other CI systems; and most importantly, it is used internally by Visual Studio, which presents most build options through a graphical user interface – developers creating a build project with the help of an IDE may not even be aware that MSBuild is being used underneath.

Nowadays MSBuild is the de facto standard tool for build automation in the .NET ecosystem. However, multiple projects still employ a legacy NAnt build system – the main problems preventing migration being the complexity of the existing build infrastructure and the need to support Mono, which, until 2.4 (released on 8th of December 2009), lacked an MSBuild implementation. Although the Mono version of MSBuild 3.5 is now relatively complete, version 4.0 is still virtually non-existent.

Pre-existing Solutions

Apart from the two already mentioned build platforms, there are several others. The first of them, dating way back to the Unix days, is called Autotools, officially known as the GNU Build System. The core of Autotools – make – was released in 1997. Although this system is widely used by projects developed in C or C++, such as the Mono runtime engine, it has no built-in support for .NET-specific compilers, requiring a large amount of custom per-project work by developers. It also has a reputation of being convoluted and unfriendly, although extremely powerful.

Developers and users of other build systems, such as CMake, Ant or Maven, have on numerous occasions undertaken efforts to enhance .NET support. The Maven community especially has spawned numerous .NET-targeted clones – NPanday, Byldan, NMaven – none of which has gained any traction. The only exception seems to be maven-dotnet-plugin, which delegates the build process back to MSBuild.

An interesting new tool that is worth mentioning is FAKE – F# Make. Although still very much an experimental project, started on 30th of March 2009, this tool is under active development by several contributors. It borrows heavily from ideas explored by Rake (written in, and for the needs of projects in, the Ruby language), and allows users to describe the build process configuration in the same language they are using to write their code.

This post looks in depth at three existing build platforms employed on the .NET Framework: NAnt – which used to be the de facto standard, Microsoft Build – the officially supported tool, and FAKE – an interesting build tool employing an entirely different build description paradigm.

All three tools present the same basic functionality of a build platform: a project file contains tasks, enclosed in targets, which may have specified dependencies upon other targets. During the build process, those targets are first sorted topologically and then tasks within each target are executed in sequence. However, the structure of a project file differs greatly between tools.

NAnt

When Stefan Bodewig announced the first official Ant release on 19th July 2000, the project had already undergone over a year of public development as part of the Tomcat servlet container, and had been used for a year before that as an internal tool at Sun Microsystems (under the name Another Neat Tool). In August 2001, Gerry Shaw made the decision to base the new .NET build platform on the existing Ant file syntax (initial code for a .NET Beta 1 Ant clone was written by David Buksbaum of Hazware and released under the name of XBuild). Keeping with the open source tradition of self-recursive names, he aptly named this new tool NAnt, from NAnt is not Ant.

After almost ten years of separate development, NAnt’s Project.build is still difficult to distinguish from Ant’s build.xml file; the only obvious giveaway being the use of C#’s csc compiler task instead of Java’s javac. An absolutely minimal working NAnt build file looks as follows:

<project default="build">
  <target name="build">
    <csc target="exe" output="Hello.exe">
      <sources>
        <include name="*.cs" />
      </sources>
    </csc>
  </target>
</project>

This short example contains a single target (build), which in turn contains a single task, with a simple nested fileset. Executing this file starts the default target, which invokes the csc task to compile the code using the appropriate C# compiler.

NAnt projects consist of several basic entities: tasks, types, properties, functions and loggers. Tasks wrap fundamental operations, such as copying a file, performing source control operations or invoking the compiler. Types represent strongly typed parameters; they are aware of their content and validate its correctness on creation. A fileset is perhaps the most often used type – it is a lazily evaluated collection of files (the sources element in the example above is a fileset). Properties can be used for storing text values that are used multiple times. They are evaluated in the place of their declaration. Functions, along with operators, can be used in any attribute value, and are evaluated when the attribute is read (usually upon task execution). Loggers are usually employed for reporting build progress to the user through various front-ends, but can also serve for tracking project execution for other purposes. NAnt ships with a large collection of predefined elements; additional ones can be either loaded from external assemblies or defined in-line using a script task. Scripts can be written in any .NET language that has a System.CodeDom.Compiler.CodeDomProvider available.

A more advanced example, showing properties, functions and global tasks (not enclosed inside a target):

<project>
  <property name="is-mono"
    value="${string::contains(framework::get-target-framework(), 'mono')}" />
  <property name="runtime-engine"
    value="${framework::get-runtime-engine(framework::get-target-framework()) }" />
  <echo message="Checking Mono version" if="${is-mono}"/>
  <exec program="${runtime-engine}" commandline="-V" if="${is-mono}" />
  <echo message="Using non-Mono runtime engine: '${runtime-engine}'"
    unless="${is-mono}" />
</project>

Global tasks are always executed in the order they are declared and are used for setting up the project. Functions and properties are evaluated inside ${} blocks, they can be distinguished by the fact that functions use :: to separate the prefix from the function name. Also visible in this example are the if and unless attributes which are available on every task and are used for conditional task execution.

While NAnt inherited Ant’s mature syntax, along with such brilliant constructs as a distinction between * (match in current directory) and ** (recursive directory match) for file inclusion/exclusion, it also inherited Ant’s deficiencies. The most glaring one is the inherent single threaded nature of the build process – although the engine itself can be relatively easily extended to invoke targets in parallel, existing build files rely on targets being executed sequentially.

Microsoft Build

MSBuild 2.0 (releases are numbered after the Microsoft .NET Framework they accompany, thus the first release is labeled 2.0, second – 3.5 and third – 4.0) was released on the 7th of November, 2005, as part of the Microsoft .NET 2.0 release. It came bundled as the default build tool for Visual Studio 2005. MSBuild’s initial design was similar to NAnt’s, but because at that time company policy forbade Microsoft employees from looking at the implementation of open source solutions (in order to prevent intellectual property violation claims), it does differ in many subtle ways.

Visual Studio 2005 used Microsoft Build for compiling C# and Visual Basic projects; all other solution types were still handled by the built-in mechanisms inherited from the 2003 release. Before version 2.0 MSBuild was not used internally by Microsoft, but as soon as it had reached the Release To Manufacturing stage, an intense build process conversion effort was launched, and by early November 2005 it was already building about 40% of the Visual Studio project itself. This internal version added support for parallel builds (released to the general audience on 19th November 2007 as 3.5) and compiled all types of projects available in Visual Studio, including Visual C++ (this last feature was released on 12th April 2010 as part of the 4.0 version). Another important improvement released with Visual Studio 2010 was a graphical debugging tool.

A minimalistic MSBuild Project.proj looks as follows:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
  DefaultTargets="Build">
  <Target Name="Build">
    <ItemGroup>
      <Compile Include="*.cs" />
    </ItemGroup>
    <CSC Sources="@(Compile)" OutputAssembly="Hello.exe"/>
  </Target>
</Project>

Although a different naming convention is used (uppercase identifiers instead of lowercase), this file shows great similarity to NAnt’s Project.build. It contains a single target, which in turn contains an item group and a task. The namespace definition is required and uses the same schema regardless of the MSBuild version. Executing this file starts the default target, named Build, which calls the CSC task to compile the code. File collections (and item groups in general) can be declared at target or project level, but (unlike NAnt) cannot be nested inside tasks (some tasks allow for embedding item groups and property groups, but this is rare behavior). Prior to version 4.0, items could not be modified once declared.

Despite being a valid MSBuild file, the above example would not be recognized by Visual Studio (or by most .NET developers). Instead of requiring the user to describe the whole build process verbosely, MSBuild offers .targets files which allow a “convention over configuration” approach to the build process: the user only specifies those settings and actions that differ from the default ones. MSBuild projects use the .proj extension for generic build scripts, and language-specific extensions are used for files importing specific .targets (for example .csproj for C# projects). Thus, a minimal Project.csproj might be written as:

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <ItemGroup>
    <Compile Include="*.cs" />
  </ItemGroup>
  <Import Project="$(MSBuildBinPath)\Microsoft.CSharp.targets" />
</Project>

By replacing an explicit invocation of the CSC task with an Import directive, this file inherits the whole build pipeline defined for Visual Studio, including automatic dependency tracking (should one declare Reference items), graphical user interface for configuring the build, targets for cleaning and rebuilding the assembly, and standardized extension points.

The basic entities in an MSBuild project are properties, items and tasks. Properties represent simple values. Items are untyped key-value collections, mostly used to represent files. Both types are evaluated as soon as they are encountered. They must be wrapped in groups, but this only allows them to share a Condition: properties cannot be bundled, and items are always grouped by name (in the example above the ItemGroup generates items named Compile, one for each matching file). MSBuild has a mechanism named batching that splits items sharing a name according to a specified metadata value – when this is used, a task defined once will be executed separately for each batch of items. Item definitions allow setting default item metadata values. MSBuild, like NAnt, distinguishes between * (match inside the current folder) and ** (recursive directory match). Loggers can be used for tracking project execution, but they must be attached from the command line. There is a quite extensive task collection available out of the box, and many of the tasks are direct replacements for NAnt tasks. Since 4.0 it is also possible to define a task in-line with the help of UsingTask.

An example of using functions for evaluating task conditions (this example does not work as of Mono 2.10 because functions are still not implemented):

<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003"
  InitialTargets="Info">
  <PropertyGroup>
    <IsMono>$(MSBuildBinPath.Contains('mono'))</IsMono>
    <RuntimeEngine>$(MSBuildBinPath)/../../../bin/mono</RuntimeEngine>
  </PropertyGroup>
  <Target Name="Info">
    <Message Text="Checking Mono version" Condition="$(IsMono)"/>
    <Exec Command="$(RuntimeEngine) -V" Condition="$(IsMono)"/>
    <Message Text="Using non-Mono runtime engine: '$(MSBuildBinPath)'"
      Condition="!$(IsMono)"/>
  </Target>
</Project>

Property values and functions are evaluated inside $() blocks; basic operators (such as ==) are also recognized outside of those markers. The @() syntax is used for referencing collections of items and %() triggers the batching mode using item metadata. MSBuild 4.0 keeps track of the actual underlying type of each property value and is able to invoke any .NET instance method defined on such an object – however, because of security concerns, only methods marked as safe (number/date/string/version manipulation and read-only file system access) are available in scripts (this security mechanism can be disabled by setting the environment variable MSBUILDENABLEALLPROPERTYFUNCTIONS to 1). The syntax for method invocation comes from PowerShell – instance methods are called with a simple Value.Method(), while static methods can be invoked with [Full.Type.Name]::Method().

The MSBuild syntax draws heavily from NAnt and should feel quite familiar to any developer once one grasps how items differ from NAnt’s strongly typed collections. The tool is under active development, has extensive support from both Microsoft and the community, and – since Mono 2.4 was released on 8th of December 2009 – is usable as a cross-platform build system.

FAKE

Fake was published by Steffen Forkmann on the 1st of April, 2009. His goal was to create a build platform using the same language he wrote his programs in – F# (this trend is also observed in Ruby, Python and other languages which allow executable domain-specific languages to be defined at the language level). Three years later, Fake still remains more of an academic exercise than a widely deployed tool, but it does explore a very interesting approach to build management. Fake executes its scripts through the F# interpreter, extending the syntax of the language with three simple additions: defining build steps (Target? TargetName), declaring coupling between targets (For? TargetName <- Dependency? AnotherTargetName) and specifying default targets (Run? TargetName).

A basic build.fsx might look as shown in the listing below:

#I @"tools\FAKE"
#r "FakeLib.dll"
open Fake

Target? Default <-
  fun _ ->
    let appReferences  = !+ @"**.csproj" |> Scan
    let apps = MSBuildRelease @".\build\" "Build" appReferences
    Log "AppBuild-Output: " apps

Run? Default

The first three lines import the Fake namespace from the FakeLib.dll file in the tools\FAKE directory. Following them is a target definition, with a fileset wildcard match pipelined (using F#’s |> operator) to the Scan function, then an MSBuildRelease task invocation, log output, and finally – the declaration of the default target. It should be noted here that Fake does not have built-in tasks for compiling code – it relies on the presence of MSBuild instead. There is also no need for a special in-line task definition syntax, as arbitrary F# code can be embedded anywhere in the script. This can be seen in the following example:

#I @"tools\FAKE"
#r "FakeLib.dll"
open Fake
open System

let isMono = Type.GetType ("Mono.Runtime") <> null
let stringType = Type.GetType ("System.String")
let corlibLocation = IO.Path.GetDirectoryName (stringType.Assembly.Location)
let notMono = String.Format("Using non-Mono runtime engine: {0}", corlibLocation)

Target? Info <-
  fun _ ->
    if isMono then
      trace "Running on Mono"
    else
      trace notMono

Run? Info

The let keyword declares an F# value, which is equivalent to the property declarations used by NAnt and MSBuild. However, unlike those two tools, Fake allows the developer to invoke any .NET method, without security constraints.

As an experimental project, Fake does have some shortcomings. It does not execute its targets in parallel, although the code inside them can be easily parallelized. It also does not track whether a target’s outputs are up to date, executing the target's commands during every project rebuild, which makes it unsuitable for large projects. There is no support for using Fake under operating systems other than Windows. And the F# language itself still remains exotic to most .NET developers, making the build scripts hard to understand and maintain.

Wednesday 18 January 2012

Upgrading from CruiseControl.NET 1.5 to 1.6 / 1.7

I've finally decided to update the version of CruiseControl.NET I use, going from 1.5 straight to 1.7 nightly build. My previous attempt ended with cryptic error messages, but, as this time the build server was already having some problems and required maintenance, I went through with the update (after fixing the problems first, of course).

Most important thing, if you don't already know it: there's a validator included in the downloadable package, which you can use to check how the server will interpret your pretty configuration file spaghetti. If you are making use of the pre-processor feature, the validator is indispensable. A neat trick while using it is copying the output (processed) configuration, changing the input files and copying the new output to a separate file, then running diff on those two to check whether the actual change you just introduced is what you were intending to make. In my case, I was checking whether I got exactly the same output from a two-years-newer release by running my original configuration through the 1.5 validator and trying to get identical results from the 1.7 parser.
The initial result you'll get will most likely be this:

Unused node detected: xmlns:cb="urn:ccnet.config.builder"

Oh. Not good. StackOverflow has an answer that claims to fix this problem, only to result in this:

Unused node detected: xmlns="http://thoughtworks.org/ccnet/1/5"

Well - not exactly a change for the better. What is the problem? The changes in the configuration parser made it a bit more picky about the files it accepts. Now they have to start with the XML preamble and include the version information (1.5 or 1.6, there's no 1.7 schema yet). The required beginning of the main configuration file is now as follows:

<?xml version="1.0" encoding="utf-8"?>
<cruisecontrol xmlns:cb="urn:ccnet.config.builder"
xmlns="http://thoughtworks.org/ccnet/1/5">

Also, while 1.5 allowed you to include files containing a "naked" node (e.g. to reuse svn version control configuration), 1.6 requires the top level node in the included file to be either a <cb:config-template> or <cb:scope>. Thus, to be on the safe side, start each of your configuration sub-files with the following:

<?xml version="1.0" encoding="utf-8"?>
<cb:config-template xmlns:cb="urn:ccnet.config.builder"
xmlns="http://thoughtworks.org/ccnet/1/5">

With those changes in place, my configuration file results in the same pre-processor output both in CruiseControl.NET 1.5 and 1.7.