Monday, August 18, 2008


Last night, I did something stupid on my primary desktop computer (a Vista box) and needed to restore the system to a recent backup. I use Windows Home Server on my home network, so I was confident in my ability to roll back the system to a previous night's backup. I booted my machine using the WHS Client Restore CD, chose the appropriate backup, waited (im)patiently for about two hours while the bits were restored, the system rebooted...
...and that's when I saw the Blue Screen of Death... specifically, a STOP 0x0000C1F5. Shit.
Now, my first instinct was that I had a sketchy backup image in WHS, and perhaps I should try a slightly older one. I repeated the restore process with three older backups and got the same result. On the verge of going off on a major "WHS sucks" tirade, I instead opted for some Googling on a still-working system to see if I could find any clues. It seems as though reports of STOP 0x0000C1F5 problems are becoming more frequent, with most people attributing the issue to a bad Vista SP1 (or prepare-for-SP1) update or patch. Microsoft acknowledges the problem in KB946084, but there is no public hotfix or workaround save for "clear the MBR and reinstall", which IMHO is unacceptable.
Looking at the problem a little more closely, it seems that if the $TxfLog file is corrupted, the Common Log File System Driver shits the bed at boot time, causing the BSOD. The particularly nasty thing about this problem is that you cannot even boot the Vista distribution DVD to use its repair tools; the BSOD occurs when you boot from DVD too! Basically, it crashes whenever a Windows box tries to mount the file system.
Soooo... a fix might be possible by accessing the disk using an operating system that doesn't give a rat's ass about Windows file systems (e.g. Linux).
At this point, I broke out one of my favorite sysadmin tools, SystemRescueCD. This is a Linux-based live distro that has all sorts of diagnostic and repair goodies on it. I figured that if I booted the SystemRescueCD disk, I might be able to diagnose, and maybe even repair, the problem.
(Unsolicited plug alert: take a minute to download SystemRescueCD, burn a copy, and add it to your sysadmin bag of tricks. The folks who make and maintain this disc do a helluva good job... it has saved my bacon more than once. Check it out.)
So, here's an overview of how I fixed my system. For part 1, you need a SystemRescueCD disc. Don't forget that Linux commands are case-sensitive, so pay careful attention to upper and lower case letters and spaces between items on the command line. Also note that several of these file names contain dollar signs ($), and the $ must be escaped from interpretation by the shell by preceding it immediately with a backslash (\), e.g. "\$foo" when referring to a file named $foo.
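If the escaping rule seems abstract, here is a small sketch you can run in a scratch directory before touching a real disk. The file name $foo and the variable value "bar" are just illustrations; nothing in it refers to your Windows partition.

```shell
#!/bin/sh
# Scratch-directory demo of escaping "$" in file names.
# Nothing here touches a real disk.
demo_dir=$(mktemp -d)
cd "$demo_dir"

foo=bar        # a shell variable that happens to be named foo
touch \$foo    # backslash-escaped: creates a file literally named $foo

# Unescaped: the shell expands $foo to "bar", and no file named bar exists.
if [ -e $foo ]; then unescaped=found; else unescaped=missing; fi

# Escaped with a backslash: the name is taken literally.
if [ -e \$foo ]; then escaped=found; else escaped=missing; fi

# Single quotes also stop the shell from interpreting the $.
if [ -e '$foo' ]; then quoted=found; else quoted=missing; fi

echo "unescaped=$unescaped escaped=$escaped quoted=$quoted"

# Clean up the scratch directory.
cd /
rm -rf "$demo_dir"
```

The unescaped test looks for the wrong name entirely, which is exactly the mistake to avoid when the real target is a folder like $Extend.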
1. Boot the SystemRescueCD disc, answering any localization questions as required, until you get to a shell prompt.
2. Mount your hard drive at /mnt/windows using ntfs-3g, e.g. "ntfs-3g /dev/sda1 /mnt/windows". You may have to "ls /dev/hd*" or "ls /dev/sd*" or "fdisk -l" to figure out the correct device to mount. If you are using a RAID device for your root file system, run "dmraid -ay" to activate all available RAID sets, then "ls /dev/mapper" and look for your device. Also, if the NTFS file system is corrupted (which it probably is if you are reading this post) you may have to add the "-o force" flag to the mount, e.g. "ntfs-3g /dev/sda1 /mnt/windows -o force".
3. Verify that you have the correct file system mounted by "ls /mnt/windows". You should see the content of "C:" or whatever is your boot drive in Windows... if you don't, repeat Step 2 until you mount the correct device.
4. Navigate to the first hidden folder: "cd /mnt/windows/\$Extend". Note the backslash before the $; that is important as it keeps the command shell from interpreting the $ (it is really part of the folder name).
5. Navigate to the second hidden folder: type "cd \$RmMetadata". Once again, note that the $ is escaped with a backslash.
6. Type "ls". Among the files/folders listed you should see "$TxfLog".
7. Take a deep breath and recursively remove the $TxfLog folder: "rm -rf \$TxfLog". Once again, note that the $ is escaped with a backslash.
8. Use "ls" to verify that it has been deleted. (You should see the same listing as in Step 6 except the $TxfLog folder is now missing.)
9. Run "cd /", "umount /mnt/windows", and "init 6" to unmount the drive and reboot, removing the CD when appropriate.
At this point, your system will no longer bluescreen, but it won't boot, either. To fix that, here's part 2, for which you'll need a Vista DVD.
1. Boot the Vista DVD and choose "Repair your computer".
2. When the system looks for Vista installations to repair, it probably won't find any. Don't panic; just click Next.
3. In the System Recovery Options list, choose Startup Repair. The system will process for a minute or two, then state that it needs to reboot to finish its repair. Allow it to reboot.
4. Remove the DVD at the appropriate time and allow the system to boot from the hard drive.
5. If the system complains that it was not shut down properly, choose "Start Windows Normally".
That's it. With any luck at all you should have a bootable system again.
The STOP 0x0000C1F5 bug is a nasty one, and I am confident that Microsoft will release a hotfix and/or Windows Update for it soon. In the meantime, if you are experiencing the problem, I hope this article helps to get you running again.