Open Source Development

November 27th, 2008

I have been working on quickly integrating an open-source library into an application I have been writing. This is my first large-scale experience with open source, and it has been an interesting one. I did think twice about naming the library, as it is very popular and well respected and I have had problems integrating it.

Disclaimer: The only intention of this post is to illustrate my experience (and probable ignorance) of the open source development process. It is not meant to stir up controversy.

The library (and application suite) in question is Subversion. The point of this post is to describe issues I have had around using open source projects. In no particular order:

Getting The Libraries.

It is fairly obvious where to find the source code and headers. It is also fairly obvious where to get the binaries. It is not obvious to me where to get the complete set of binaries, source code, static libs and headers.

It seems that there are relatively few public projects using the Subversion libraries. Those that are open source expose only their source code in their repositories.

Call me old fashioned, but if I build something I usually want all the parts from front-to-back wherever possible. At the very least I want the minimal set of libs, headers, and binaries to get ‘hello world’ compiling and working.

To do this I had to:

  • Get the binary distribution.
  • Get the libs and headers distribution.
  • Search around for some other pieces (including libdb44.dll).

No matter how hard I searched, I could not find a full ‘developer’ distribution of the Subversion 1.5.x libraries.

As far as I am concerned this should include a complete cut of all source code for Subversion, plus the necessary headers and binaries of any other projects that Subversion is built on.

Documentation and C++

I use C++. So straight away I started wrapping the C-api to make it more friendly and abstracted for the things I want to do. This of course means using RAII and other commonplace patterns. My main issue here is that I could not find any documentation with a clear, well-explained ‘Hello World!’ style example (to perpetuate the problem, I will not post an example either).

The Doxygen documentation of the C-api, whilst useful, was far from complete. As I work on Windows, I am somewhat lucky to be using Visual Studio and Visual Assist which between them give me a fairly powerful and extensive Intellisense/code completion. I found this to be considerably more helpful than the available documentation.

Skinning the Apr-Cat

This was also my first experience of any project built on top of the Apache Portable Runtime. I don’t really have any comment about this. The headers and binaries are shipped with Subversion, and they are fairly well and widely documented. I have no reason to use it, other than what is required by Subversion.

However, to make apr more C++-like, I started to use the RAII idiom, as mentioned earlier, and very quickly came across the RapidSVN C++ Subversion wrapper, which coincidentally appeared to be almost identical to the C++ wrapper I had started to write.
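As an illustration only (this is not the code described in this post, nor the svncpp implementation), a minimal RAII sketch might look like the following. The `demo_pool` functions are hypothetical stand-ins for a C-style create/destroy pair of the kind a pool API offers; the class name `Pool` and the counter `g_live_pools` are likewise assumptions made for the example.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>

// Hypothetical C-style API, standing in for a create/destroy pair
// such as a memory-pool library would provide.
struct demo_pool { int unused; };
static int g_live_pools = 0;  // lets the example observe create/destroy pairing

demo_pool* demo_pool_create() {
    demo_pool* p = static_cast<demo_pool*>(std::malloc(sizeof(demo_pool)));
    if (p) { p->unused = 0; ++g_live_pools; }
    return p;
}

void demo_pool_destroy(demo_pool* p) {
    if (p) { --g_live_pools; std::free(p); }
}

// RAII wrapper: the constructor acquires the handle and the destructor
// releases it, so every exit path from a scope cleans up.
class Pool {
public:
    Pool() : handle_(demo_pool_create()) {
        if (!handle_) throw std::runtime_error("demo_pool_create failed");
    }
    ~Pool() { demo_pool_destroy(handle_); }

    // Copying would lead to a double destroy, so it is disabled.
    Pool(const Pool&) = delete;
    Pool& operator=(const Pool&) = delete;

    // Raw handle for passing to the underlying C functions.
    demo_pool* get() const { assert(handle_ != nullptr); return handle_; }

private:
    demo_pool* handle_;
};

// Creates a Pool in an inner scope and reports how many pools are
// still alive once that scope has ended.
int live_pools_after_scope() {
    {
        Pool pool;
        (void)pool.get();
    }
    return g_live_pools;
}
```

The point of the wrapper is that the caller never writes the destroy call: leaving the scope, whether normally or via an exception, releases the handle.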

RapidSVN / svncpp

At this point I threw away the code that I had started to write (hence the lack of my own ‘Hello World!’ example) and chose to use the RapidSvn SvnCpp wrapper. I would strongly recommend any C++ developer looking to integrate Subversion use this.

However, there are a few things wrong with the version I picked up. Firstly, I’m assuming, perhaps incorrectly, that whilst there are named branches and tags in the RapidSvn svn repository, taking the trunk will give me the most up to date and complete API. I found that a few api calls only partially populated a struct. This was relatively simple to fix.

My Own Repository

Every developer has their own style of coding and file layout. As a Windows developer, and being well versed in how awkward Visual Studio can be, I often spend some time moving projects around to suit my build structure. My personal preference is to have the minimal number of include lines in my projects. This is usually just ‘$(SolutionDir)’. In other words, my VS solution sits at my root level, and everything else is underneath it.

The one problem I had with RapidSvn/SvnCpp is that I like my libraries to be self-contained with minimal interfaces. I was somewhat disappointed to find that, after I finally built svncpp and included it in my project, I had to add extra include paths to my main projects for the Subversion headers that svncpp includes in its own headers. In other words, from my perspective, svncpp ‘leaks’ some Subversion headers.

So there and then I wrapped the svncpp code in another lightweight wrapper that contained a pimpl. This nicely fits in with how I like to build my projects.
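For readers unfamiliar with the pimpl idiom, here is a minimal single-file sketch of the idea. It is not the wrapper described above: the class name `RepoClient`, its methods, and the placeholder revision number are all hypothetical, and the two halves would normally live in a header and a .cpp file respectively.

```cpp
#include <cassert>
#include <memory>
#include <string>

// ---- Part 1: what would live in the public header ----
// Nothing here mentions the wrapped library, so client projects
// need no extra include paths.
class RepoClient {
public:
    explicit RepoClient(const std::string& url);
    ~RepoClient();  // defined below, where Impl is a complete type

    RepoClient(const RepoClient&) = delete;
    RepoClient& operator=(const RepoClient&) = delete;

    std::string url() const;
    int latest_revision() const;

private:
    struct Impl;                  // forward declaration only
    std::unique_ptr<Impl> impl_;  // the "pointer to implementation"
};

// ---- Part 2: what would live in the .cpp file ----
// In a real wrapper this is the only place that would include the
// third-party headers; here Impl just stores plain data.
struct RepoClient::Impl {
    std::string url;
    int revision;
};

RepoClient::RepoClient(const std::string& url)
    : impl_(new Impl{url, 42})  // 42 is a placeholder, not a real revision
{}

RepoClient::~RepoClient() = default;

std::string RepoClient::url() const {
    assert(impl_ != nullptr);
    return impl_->url;
}

int RepoClient::latest_revision() const {
    assert(impl_ != nullptr);
    return impl_->revision;
}
```

Because the header only forward-declares `Impl`, changes to the wrapped library's headers force a rebuild of one .cpp file rather than of every project that uses the wrapper.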

Finally

So I finally have a built project of svncpp (containing svn) that is self contained that I can include and link to without any adjustment to my main project. In fact I have two versions, for Subversion 1.4 and 1.5.

As a result of all this extra work and configuration, in my own repository I now have complete snapshots of everything I need to build and ship an application built on top of Subversion. It is also in a state where I can drop in new revisions of both the svncpp and Subversion source code, headers, and binaries without any effort.

More Open Source?

This has been an interesting process to go through, though it does raise a question about Open Source development. It seems that lots of projects use lots of other projects, with considerable amounts of configuration performed by scripts (make configure). Perhaps Visual Studio is part of the problem, but my feeling is that if Project A needs Project B version 1 and Project C version 2, then Projects B and C should provide, at the very least, a zip file containing headers and binaries where necessary, and Project A should contain a link to that in a readme file or something similar.

Zero-configuration is also something I prefer. I should be able to build things ‘out of the box’, which is often what ‘make configure’ does on *nix systems.

More intelligent building and configuration could be used. I cannot see that it is particularly difficult to write a cross-platform Perl script that, on an internet-connected machine, does an svn checkout of the extra bits and pieces of projects B and C that are required (assuming they use Subversion, of course). The only requirement on developers (particularly on Windows) is to have the svn binaries and a Perl installation in their PATH. If you’re a developer this should not be an issue. If it is an issue, you shouldn’t be a developer.

Whilst I very much enjoy Open Source software, the development process (being a user of, not a developer of) has left me a little uneasy due to the inherent disconnectedness of projects and project dependencies. If I cannot build ‘out of the box’ I feel there is something wrong (see item 2 for a one click build).

I would just like to finally emphasize that I am not singling out Subversion (or RapidSvn) here as they are both great projects, I am just stating my experiences with multiple Open Source projects.

Uncategorized

Manifest Madness

October 24th, 2008

It seems that manifests were supposed to reduce the amount of dll hell that the average Windows developer experiences.  Whilst I am sure this is true, there seems to be a paucity of information about them on the web.

I came across a manifest issue recently that I think is worth describing here precisely due to the lack of information elsewhere.

Disclaimer:  most of the opinions on OS behaviour are based on non-exhaustive observations of our particular problem, especially in the absence of any particularly obvious documentation.  They may not necessarily be correct.

The Technical Problem.

Quite simply, we have a C++/native command line process (an exe and lots of dlls) built under Visual Studio 2005 on XP and Vista.  The release build works as expected.  However, on a machine that does not have VS2005 installed the application fails to work:  it fails silently on the command line.  We ship the VC runtime redistributable dlls in the same folder as our exe along with their manifests.

Common wisdom on the web, which is basically solving the problem without actually understanding what the problem is, is to “just install the VC8 redistributable runtime on your machine”.  Personally, I cannot abide ‘fixing’ something without understanding what I’ve fixed and why it needed fixing;  you almost always find that you reuse the knowledge later.

The Non-Technical Problem.

This is a work problem.  At my company, applications are automatically distributed to locked down machines that have no admin access.  Rolling out the redistributable patch, for purely bureaucratic reasons, is not a solution.  Also, I only have one test machine, and therefore have to be very careful with what I install on it, otherwise I have to wait a long time for the machine to be rebuilt.

Preamble

Since it is obvious that installing VS2005, and most likely the redistributable package, would solve the problem, diagnosing it properly means putting only the minimal set of tools on a machine, in this case Windows XP.

At this point, I wasn’t aware of what the problem actually was, so the first stop was our good old friend depends.exe (http://www.dependencywalker.com).

Hint 1:  I would suggest to anyone working in a corporate or otherwise locked down environment to either ship a set of tools with your application or make them available on an internal network drive.

Once I had loaded the application into depends, it appeared that everything was fine.  Later, however, it was obvious that this was not the case.  Buried in one of the collapsed nodes was a problem that I did not see.

The next step was to fire up the venerable DrWatson.  Straight away it was clear that the application was doing the Windows equivalent of a core dump.  The dump itself was apparently pointing to an empty (non-pure) virtual function that returned void.  At this point I was still under the misguided impression that the problem could be diagnosed by analysing a debug or core dump.

After some head scratching, I decided to get windbg installed on the test machine.  This would have a low impact on the machine configuration as it would not install lots of unnecessary junk in order to work.  It did, of course, require admin rights:  exactly what you don’t need in a corporate environment.

Firing up the Windows debugger and attaching it to our process, it was clear on the very first run that the problem lay after a particular dll load event.  In this instance the debugger was also pointing to a different dll and a different part of the code.

Hint 2:  ship .map and .pdb files with your release build, and keep a copy of the build available: source code, pdb, objects etc.

I now started to investigate the loading of the dlls.  Next stop was psmon from the Sysinternals/Microsoft website.  Running this against my process confirmed the loading that I saw in the Windows debugger, with a little more interesting detail, including the attempted load of ‘our.dll.2.manifest’.  Whilst this would not appear to be an error, it is what first drew my attention back to manifest files.

Back to depends.exe;  I now tried to load ‘our.dll’ into depends.exe and received the message:

“The side-by-side configuration information for ‘our.dll’ contains errors.  This application has failed to start because the application configuration is incorrect.  Reinstalling the application may fix the problem.”

How is that for a helpful message?  At least it is a start.  So the problem with the silent failure of our application is now firmly identified to lie within ‘our.dll’.  There were also additional system log messages in the event viewer, including the usual ‘the command completed successfully’ in a reported error.

Debugging manifests

The next stop was Google.  What does the error mean, and how do I debug it for more information?  I then found this “classic” (for all the wrong reasons).

The summary is that you can debug the manifest/sxs but only on Vista.

Yes, that’s right, Windows XP has manifests, but you can’t get a dump of the manifest files.  If you know otherwise, leave a comment below.

As luck would have it, I have a Vista box.  Unfortunately, it has Visual Studio 2005 installed, and therefore the offending application works on that machine.  Giving sxstrace a go didn’t reveal anything I didn’t already expect to see.  Running it was a little strange:  start a console window, and run

sxstrace trace -logfile:sxstrace.etl

In a second console:

myapp.exe

Wait for it to finish, back to console 1, hit enter to stop the trace, then:

sxstrace parse -logfile:sxstrace.etl -outfile:sxstrace.txt

to make it human readable.  This then gives you a summary of the manifests that the dll depends on.  Because I was running on a machine where the app worked, this told me nothing new, so I went back to depends.exe.

Hint 3:  In depends.exe expand all nodes, and switch on the full path option for the dlls.

At this point I noticed that part of the dll in question was referencing another dll that was in turn referencing an out of date VC8 runtime.

This is where the manifest madness begins.  As far as I am aware, from Windows 2000 onwards, the search behaviour for a dll named ‘widget.dll’ (for example) is to search the local folder first, except in the presence of manifests or redirection.  The behaviour with manifests is to load dlls via their manifest descriptions first.

Hint 4:  One way to see the manifest of a dll is to just drag it into Visual Studio, and look at the second resource ordinal.

Consequently our app appeared to be loading two versions of the VC80 runtime.  There are two workarounds to this.  The first is to rebuild the offending dll against a newer version of the VC runtime.

If you cannot do that, for example, it is a third party dll, then you can redirect the loading by creating an ‘ourapp.exe.config’ file.  Of course, this didn’t work for us, but we were lucky in that we could recompile the dll.  However, the extra manifest entry still appeared to be there.

The reason why the app.exe.config doesn’t work is also interesting.  As far as I can find there is no documentation that states this explicitly:  application config files can only do binding redirects if the dlls they are redirecting are installed in the WinSxS folder.  The behaviour I was hoping for (which may not be correct, since this mechanism covers the .NET world as well), is that the redirect occurs on and to a manifest file and dll in the current working directory of your application.

Back to the problem;  the recompiled dll still showed that the second version of the vc runtime was in the manifest.  At this point depends was telling us that nothing directly depended on the oldest vc runtime.

So the question then was “where is this extra manifest entry coming from?”  In the land of Unix:

find . -type f -print0 | xargs -0 grep -l 50608

will find all files that contain the string ‘50608’, which is part of the version number of the offending vc runtime.  Without using such a scattergun approach, we know that the value can only be coming from a limited number of places:  the source code, or something that is linked in.  We knew it wasn’t the source code, so that left some static libs.
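On a machine without Unix tools, the same kind of search can be sketched in a few lines of C++.  This is only an illustration, not what we actually ran:  the function names are made up for the example, and it assumes the version string appears in the file as plain single-byte text (as it evidently did in our static lib).

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Returns true if `needle` occurs anywhere in `data`, treated as raw bytes
// (embedded NUL characters are handled because std::string tracks its length).
bool contains_bytes(const std::string& data, const std::string& needle) {
    assert(!needle.empty());
    return data.find(needle) != std::string::npos;
}

// Reads a whole file in binary mode; returns an empty string if the
// file cannot be opened.
std::string read_binary_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::in | std::ios::binary);
    if (!in) return std::string();
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}

// Returns the subset of `paths` whose contents contain `needle`.
std::vector<std::string> files_containing(const std::vector<std::string>& paths,
                                          const std::string& needle) {
    std::vector<std::string> hits;
    for (std::size_t i = 0; i < paths.size(); ++i) {
        if (contains_bytes(read_binary_file(paths[i]), needle)) {
            hits.push_back(paths[i]);
        }
    }
    return hits;
}
```

Calling `files_containing` with the list of static libs and the needle "50608" returns the ones worth a closer look.  Reading each file fully into memory is the simplest approach and is fine for libs of modest size.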

Dragging each one into notepad finally revealed that one of the static libs that we use was referencing the old vc runtime.  This meant that the static lib in question had been built at some point by VS2005 RTM, and not by VS2005 SP1, which the rest of the application was being built with.  By some good fortune it was our (not my!) source code, and we resurrected it, rebuilt it, and guess what?

Problem solved!

Summary

After this somewhat lengthy article, the summary is brief:  I located the problem to be in ‘our.dll’, and that the problem was in turn due to an old lib built by an older compiler that was being linked in.

It is really disappointing that on Windows XP, at least, there does not appear to be a tool for further diagnosing dll problems such as this, other than a random scattering of blog postings, and poor documentation.

At the very least there should be a tool that lists the manifests of each dll, and where they were found.  The manifest tool, in verbose mode, does not do this.
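A very crude approximation of part of such a tool can be sketched as follows.  It does not parse manifests properly and does not report where they were found:  it simply looks, in a blob of bytes read from a dll or lib, for an assembly name followed shortly by a quoted version attribute.  The function name, the 128-byte search window, and the assumption that the name attribute comes before the version attribute are all choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Collects every distinct version string that follows `assembly_name`
// in `data`, e.g. name='Microsoft.VC80.CRT' version='8.0.50608.0'.
std::set<std::string> find_assembly_versions(const std::string& data,
                                             const std::string& assembly_name) {
    assert(!assembly_name.empty());
    std::set<std::string> versions;
    const std::string key = "version=";
    const std::size_t window = 128;  // how far past the name we accept a match

    std::size_t pos = 0;
    while ((pos = data.find(assembly_name, pos)) != std::string::npos) {
        pos += assembly_name.size();
        const std::size_t v = data.find(key, pos);
        if (v == std::string::npos || v - pos > window) continue;

        const std::size_t open = v + key.size();
        if (open >= data.size()) continue;
        const char quote = data[open];
        if (quote != '\'' && quote != '"') continue;

        // The version is whatever sits between the matching quotes.
        const std::size_t close = data.find(quote, open + 1);
        if (close != std::string::npos && close - open <= window) {
            versions.insert(data.substr(open + 1, close - open - 1));
        }
    }
    return versions;
}
```

If the returned set for ‘Microsoft.VC80.CRT’ contains more than one version, the binary is asking for more than one runtime, which is the situation described in this article.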

Final note: if you choose to leave a comment, please be helpful and polite. If you have questions about manifests, it’s unlikely I’ll be able to help beyond what I learnt whilst writing this article.

Troubleshooting manifests

This is the list of links that I gradually worked through to figure out what on earth was going on:

http://channel9.msdn.com/forums/TechOff/22266-Side-by-side-screwup/?CommentID=272900

http://blogs.msdn.com/jreddy/archive/2005/12/23/troubleshooting-c-c-isolated-applications-and-side-by-side-assemblies-scenario-based-with-solutions.aspx

http://blogs.msdn.com/dsvc/archive/2008/08/07/part-2-troubleshooting-vc-side-by-side-problems.aspx


Version Control

October 6th, 2008

As with most development issues, everyone has some opinion on what language to use, the best tool or editor, build system and version control system.

This post basically boils down to one point:

Don’t procrastinate.

It really is quite simple.  You are a developer.  Your main job is writing code.  So write code!

If you do a web search, you will probably find all manner of threads in newsgroups and forums where people argue over the merits of Git, versus Subversion versus CVS versus Perforce, Darcs, SCCS, Clearcase and so on.  Frankly, who cares?  Pick a version control system that suits you, does what you want, and allows you to get on with it.

Whilst it is true that you can engineer yourself into a corner with any particular tool and make it hard to switch, and also true that you can pick the wrong tool, it is a simple fact that you could spend your time more productively by getting on with your primary task of writing code.  If you pick the wrong version control system by whatever criteria you prefer, take a cut of the code, preferably at a stable major revision, and just import it all into the new version control system.  No fuss, no trouble.


Sceptic

September 30th, 2008

I cannot recommend Phil Plait’s Bad Astronomy website highly enough.  It has all manner of excellent posts from interesting astronomy articles to careful explanations of many misunderstood phenomena.

Let’s be clear:  the moon landings were not faked, and there aren’t aliens under Denver airport.  If you “believe” in something, I suggest you visit his site and read it slowly and actually think about things logically.

Good stuff!


Another theme

September 30th, 2008

I’ve applied another theme to the site.

This is a good resource for all things WordPress at Smashing Magazine.



Hello world!

August 31st, 2008

It’s back.  I might have something interesting on here at some point.  We’ll see ;-)
