On Auto-Updates

Despite my best googling efforts, I’ve yet to find any posting on designing an auto-update feature for my code, so in the tradition of NIH (Not Invented Here), I rolled my own.  It seems, ultimately, that the simplest solution is, well, the simplest solution.

Since I work on Windows, it is easy to get the current application’s name and version (provided you have set it and maintain it) from the resource file, either in .NET or Win32. So straight away you have your application name and your current application version. The next thing is to store the new version somewhere the application can fetch it. Putting updates on the web suggests a web page, and hence HTML; but going back to my comment about the simplest solution, parsing HTML is, to say the least, difficult.

HTML can be tackled in a variety of ways, from a regular expression (probably the worst of them) up to a proper parser such as http://htmlagilitypack.codeplex.com/. Either way, in HTML, you end up with a solution not dissimilar to:

<span class='name'> My App </span>
<span class='version'> My Version </span>

and parsing it relies on you trusting whoever edits the HTML to preserve the tags, classes and/or ids.

So back once again to ‘simple’. My solution is this: a text file containing comma separated values. Why? It is easy, it is human readable, and it is pretty near impossible to get wrong. When scanning it, you can just skip everything in there except your application information. It could contain a header, plus the information:

Name,Version,Installer
My App, 1.0, http://localhost/my_installer_1.0.msi
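
As a rough sketch of the "skip everything except your application" scan, here is one way it might look in C++. The UpdateInfo struct and the trim and find_update functions are names I have made up for illustration, and the code assumes the three-column layout shown above with no commas or quotes inside a field:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical record type: one row of the update file.
struct UpdateInfo {
    std::string name;
    std::string version;
    std::string installer;
};

// Strip leading and trailing whitespace from a field.
static std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    const std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Scan the CSV text line by line and fill 'out' from the first row whose
// name column matches app_name. The header row and rows for other
// applications are skipped because their name does not match.
bool find_update(const std::string& csv_text, const std::string& app_name,
                 UpdateInfo& out) {
    assert(!app_name.empty());
    std::istringstream lines(csv_text);
    std::string line;
    while (std::getline(lines, line)) {
        std::istringstream fields(line);
        std::string name, version, installer;
        if (!std::getline(fields, name, ',')) continue;     // blank line
        if (!std::getline(fields, version, ',')) continue;  // too few columns
        if (!std::getline(fields, installer)) continue;     // rest of the line
        if (trim(name) != app_name) continue;
        out.name = trim(name);
        out.version = trim(version);
        out.installer = trim(installer);
        return true;
    }
    return false;
}
```

Calling find_update with the contents of the downloaded file and the application name from your resource file gives you the published version and installer URL, or false if your application is not listed.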

At this point someone reading this (yes, you!) will say: “but what about the web page?” Well, that is now fairly trivial with modern JavaScript libraries. A little bit of jQuery, and a little bit more, and you’re most of the way there. Making the installer a hyperlink is left as an exercise. Or you could simply maintain a static HTML page to go with the text file.

For C#, the code to get the file is remarkably trivial. For a blocking single-threaded web request:

System.Net.WebClient webClient = new System.Net.WebClient();
//  blocking: returns once the whole file has been written to disk
webClient.DownloadFile(new System.Uri("http://localhost/updates.csv"), @"C:\updates.csv");

Put this into an appropriate try/catch block, and download to a file of your choosing.

On Version Naming.

What’s the difference between version 3.2 and version 3.12? As version numbers, 3.12 (minor version twelve) is later than 3.2 (minor version two), but code that compares them may well decide otherwise. A naive implementation of an auto-update system that checks versions would do this:

if (new_version > old_version) {  run_installer(); }

Except for two things:

1. You might have chosen new_version and old_version to be doubles. Then 3.2 is actually 3.20 which IS greater than 3.12 and your installer never gets run.

2. You might have been smarter and chosen new_version and old_version to be strings, so you could have 3.2a and 3.12b, but you would still be in trouble because a string comparison compares one character at a time, and once it gets to the '1' in 3.12 and the '2' in 3.2, it once again decides that 3.2 is greater than 3.12, and once again your installer doesn’t start.

So your run_installer clause ends up being:

if (new_version != old_version) { run_installer(); }

which has the side effect of allowing rollbacks to prior versions if you so choose (assuming of course that the MSI installers don’t prevent you from doing that).
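
If you do want a real ordering rather than a simple inequality test, one option is to compare the dotted components as integers. What follows is only a sketch in C++: parse_version and compare_versions are names I have made up for illustration, letter suffixes such as the 'a' in 3.2a are simply ignored, and missing components are treated as zero.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split "3.12" into {3, 12}. strtol stops at the first non-digit, so a
// trailing letter such as the 'a' in "3.2a" is ignored in this sketch.
std::vector<long> parse_version(const std::string& text) {
    std::vector<long> parts;
    std::istringstream in(text);
    std::string piece;
    while (std::getline(in, piece, '.')) {
        parts.push_back(std::strtol(piece.c_str(), nullptr, 10));
    }
    return parts;
}

// Returns a negative value if a is older than b, zero if they are equal,
// and a positive value if a is newer. Missing components count as zero,
// so "3.2" and "3.2.0" compare equal.
int compare_versions(const std::string& a, const std::string& b) {
    const std::vector<long> va = parse_version(a);
    const std::vector<long> vb = parse_version(b);
    const std::size_t n = va.size() > vb.size() ? va.size() : vb.size();
    for (std::size_t i = 0; i < n; ++i) {
        const long x = i < va.size() ? va[i] : 0;
        const long y = i < vb.size() ? vb[i] : 0;
        if (x != y) return x < y ? -1 : 1;
    }
    return 0;
}
```

With this, compare_versions("3.2", "3.12") is negative, so 3.12 is treated as the later version, which is what both the double and the string comparison get wrong. The price is that you lose the free rollback that the inequality test gives you.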

Release build issues

I’m sure I’m not the only person to come across this:  release builds that die in the middle of STL somewhere at runtime after being built by Visual Studio (2005 onwards I expect).  The short answer to this is that if you have put _SECURE_SCL=0 as a preprocessor define in one project, you need to ensure that it is in all projects.  Otherwise you end up with two or more slightly differing expansions of the same templates linked into one program, and the mismatch only shows up at runtime.

Manifest Madness

It seems that manifests were supposed to reduce the amount of dll hell that the average Windows developer experiences.  Whilst I am sure this is true, there is a paucity of information about them on the web.

I came across a manifest issue recently that I think is worth describing here precisely due to the lack of information elsewhere.

Disclaimer:  most of the opinions on OS behaviour are based on non-exhaustive observations of our particular problem, especially in the absence of any particularly obvious documentation.  They may not necessarily be correct.

The Technical Problem.

Quite simply, we have a C++/native command line process (an exe and lots of dlls) built under Visual Studio 2005 on XP and Vista.  The release build works correctly as expected.  However, on a machine that does not have VS2005 installed the application fails to work:  it fails silently on the command line.  We ship the VC runtime redistributable dlls in the same folder as our exe along with their manifests.

Common wisdom on the web, which basically solves the problem without actually understanding what the problem is, is to “just install the VC8 redistributable runtime on your machine”.  Personally, I cannot abide ‘fixing’ something without understanding what I’ve fixed and why it needed fixing.  Besides, you almost always find that you reuse the knowledge you have gained.

The Non-Technical Problem.

This is a work problem.  At my company, applications are automatically distributed to locked down machines that have no admin access.  Rolling out the redistributable patch is, for purely bureaucratic reasons, not a solution.  Also, I only have one test machine, and therefore have to be very careful with what I install on it, otherwise I have to wait a long time for the machine to be rebuilt.

Preamble

It is obvious that installing VS2005 would solve the problem, as most likely would the redistributable package.  To actually diagnose the problem, though, we need to put only the minimal set of tools on a machine, in this case one running Windows XP.

At this point, I wasn’t aware of what the problem actually was, so the first stop was our good old friend depends.exe (http://www.dependencywalker.com).

Hint 1:  I would suggest to anyone working in a corporate or otherwise locked down environment to either ship a set of tools with your application or make them available on an internal network drive.

Once I had loaded the application into depends, it appeared that everything was fine.  Later, however, it was obvious that this was not the case.  Buried in one of the collapsed nodes was a problem that I did not see.

The next step was to fire up the venerable Dr Watson.  Straight away it was clear that the application was doing the Windows equivalent of a core dump.  The dump itself was apparently pointing to an empty (non-pure) virtual function that returned void.  At this point I was still under the misguided impression that the problem could be diagnosed by debugging or by analysing a crash dump.

After some head scratching, I decided to get windbg installed on the test machine.  This would have a low impact on the machine configuration as it would not install lots of unnecessary junk in order to work.  It did, of course, require admin rights:  exactly what you don’t need in a corporate environment.

Firing up the Windows debugger and attaching it to our process, it was clear on the very first run that the problem lay after a particular dll load event.  In this instance the debugger was also pointing to a different dll and a different part of the code.

Hint 2:  ship .map and .pdb files with your release build, and keep a copy of the build available: source code, pdb, objects etc.

I now started to investigate the loading of the dlls.  Next stop was psmon from the Sysinternals/Microsoft website.  Running this against my process confirmed the loading that I saw in the Windows debugger, with a little more interesting detail, including the attempted load of ‘our.dll.2.manifest’.  Whilst this would not appear to be an error, it is what first drew my attention back to manifest files.

Back to depends.exe;  I now tried to load ‘our.dll’ into depends.exe and received the message:

“The side-by-side configuration information for ‘our.dll’ contains errors.  This application has failed to start because the application configuration is incorrect.  Reinstalling the application may fix the problem.”

How is that for a helpful message?  At least it is a start.  So the problem with the silent failure of our application is now firmly identified to lie within ‘our.dll’.  There were also additional system log messages in the event viewer, including the usual ‘the command completed successfully’ in a reported error.

Debugging manifests

The next stop was google.  What does the error mean, and how do I debug it for more information?  I then found this “classic” (for all the wrong reasons).

The summary is that you can debug the manifest/sxs but only on Vista.

Yes, that’s right, Windows XP has manifests, but you can’t get a dump of the manifest files.  If you know otherwise, leave a comment below.

As luck would have it, I have a Vista box.  Unfortunately, it has Visual Studio 2005 installed, and therefore the offending application works on that machine.  Giving sxstrace a go didn’t reveal anything I didn’t already expect to see.  Running it was a little strange:  start a console window, and run
sxstrace trace -logfile:sxstrace.etl

In a second console:
myapp.exe

Wait for it to finish, back to console 1, hit enter to stop the trace, then:
sxstrace parse -logfile:sxstrace.etl -outfile:sxstrace.txt

to make it human readable.  This then gives you a summary of the manifests that the dll depends on.  Since the trace came from a machine where the app worked, it told me nothing new, so I went back to depends.exe.

Hint 3:  In depends.exe expand all nodes, and switch on the full path option for the dlls.

At this point I noticed that part of the dll in question was referencing another dll that was in turn referencing an out of date VC8 runtime.

This is where the manifest madness begins.  As far as I am aware, from Windows 2000 onwards, the search behaviour for a dll named ‘widget.dll’ (for example) is to search the local folder first, except in the presence of manifests or redirection. The behaviour with manifests is to load dlls via their manifest descriptions first.

Hint 4:  One way to see the manifest of a dll is to drag the dll into Visual Studio, and look at the second resource ordinal.

Consequently our app appeared to be loading two versions of the VC80 runtime.  There are two workarounds to this.  The first is to rebuild the offending dll against a newer version of the VC runtime.

If you cannot do that, for example, it is a third party dll, then you can redirect the loading by creating an ‘ourapp.exe.config’ file.  Of course, this didn’t work for us, but we were lucky in that we could recompile the dll.  However, the extra manifest entry still appeared to be there.

The reason why the app.exe.config doesn’t work is also interesting.  As far as I can find there is no documentation that states this explicitly:  application config files can only do binding redirects if the dlls they are redirecting are installed in the WinSxS folder.  The behaviour I was hoping for (which may not be correct, since this mechanism covers the .NET world as well), is that the redirect occurs on and to a manifest file and dll in the current working directory of your application.

Back to the problem;  the recompiled dll still showed that the second version of the vc runtime was in the manifest.  At this point depends was telling us that nothing directly depended on the oldest vc runtime.

So the question then was “where is this extra manifest entry coming from?”  In the land of Unix:
find . | xargs grep 50608

will find all files that contain the string '50608', which is part of the version number of the offending vc runtime.  Without using such a scatter gun approach, we know that the value can only be coming from a limited number of places:  the source code, or something that is linked in.  We knew it wasn’t the source code, so that left some static libs.
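
If there is no grep on the machine, the same check is easy to sketch in C++. The file_contains function below is a name I have made up for illustration; it reads one file in binary mode and looks for the plain byte string, so it assumes the version number is stored as ordinary single-byte text (a UTF-16 encoded copy would not match):

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Returns true if the file at 'path' contains the byte sequence 'needle'.
// The whole file is read into memory, which is fine for libs and dlls of
// ordinary size. A file that cannot be opened counts as "not found".
bool file_contains(const std::string& path, const std::string& needle) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    const std::string bytes((std::istreambuf_iterator<char>(in)),
                            std::istreambuf_iterator<char>());
    return bytes.find(needle) != std::string::npos;
}
```

Calling file_contains on each static lib in turn with "50608" as the needle narrows the search down to the libs that mention the old runtime.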

Dragging each one into notepad finally revealed that one of the static libs that we use was referencing the old vc runtime.  This therefore meant that that particular static lib was built at some point by VS2005 RTM, and not VS2005 SP1 which the rest of the application was being built with.  By some good fortune it was our (not my!) source code, and we resurrected it, rebuilt it, and guess what?

Problem solved!

Summary

After this somewhat lengthy article, the summary is brief:  I located the problem to be in ‘our.dll’, and that the problem was in turn due to an old lib built by an older compiler that was being linked in.

It is really disappointing that on Windows XP, at least, there does not appear to be a tool for further diagnosing dll problems such as this, other than a random scattering of blog postings, and poor documentation.

At the very least there should be a tool that lists the manifests of each dll, and where they were found.  The manifest tool, in verbose mode, does not do this.

Final note: if you choose to leave a comment, please be helpful and polite. If you have questions about manifests, it’s unlikely I’ll be able to help beyond what I learnt whilst writing this article.

Troubleshooting manifests

This is the list of links that I gradually worked through to figure out what on earth was going on:

http://channel9.msdn.com/forums/TechOff/22266-Side-by-side-screwup/?CommentID=272900

http://blogs.msdn.com/jreddy/archive/2005/12/23/troubleshooting-c-c-isolated-applications-and-side-by-side-assemblies-scenario-based-with-solutions.aspx

http://blogs.msdn.com/dsvc/archive/2008/08/07/part-2-troubleshooting-vc-side-by-side-problems.aspx