Thursday, December 23, 2010

Was this for our own good? (Part III) - a.k.a. "Fault Tolerant Heap??? I don't want your Fault Tolerant Heap..."

I've been doing some work on the AIRAC database loader for our upcoming FSLabs A320 addon... mainly to make it load faster. Well, that's an excuse - the real reason was, I read an article about the new Concurrency features that Visual Studio 2010 provides and I was really curious if using the Concurrency library would make things faster by parallelizing some specific pieces of my code - and guess what! It does!

The test case I chose was the AIRAC database file that contains the NAV FIX points - all 183 thousand of them - which was previously being read serially from start to finish. I thought, wouldn't that be a good candidate for parallelism?
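To give a rough idea of what parallelizing such a loader can look like, here is a minimal sketch. It is not the actual FSLabs loader: the record layout ("NAME LAT LON"), the `Fix` struct and both function names are made up for illustration, and it uses plain C++11 `std::thread` as a portable stand-in rather than the Visual Studio 2010 Concurrency library mentioned above. Each worker writes only to its own slice of a pre-sized vector, which is one simple way to keep threads from touching the same container storage at the same time.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical record layout: "NAME LAT LON", whitespace separated.
struct Fix {
    std::string name;
    double lat = 0.0;
    double lon = 0.0;
};

// Parse one line; fields that fail to parse are left empty or zero.
Fix parseFix(const std::string& line) {
    Fix fix;
    std::istringstream in(line);
    in >> fix.name >> fix.lat >> fix.lon;
    return fix;
}

// Parse all lines using up to `workers` threads. Each thread fills a disjoint
// index range of a pre-sized output vector, so the vector is never resized
// and no element is written by more than one thread.
std::vector<Fix> parseFixesParallel(const std::vector<std::string>& lines,
                                    unsigned workers) {
    std::vector<Fix> fixes(lines.size());
    if (lines.empty()) return fixes;
    if (workers == 0) workers = 1;

    // Ceiling division, so every line lands in exactly one chunk.
    const std::size_t chunk = (lines.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (std::size_t begin = 0; begin < lines.size(); begin += chunk) {
        const std::size_t end = std::min(begin + chunk, lines.size());
        pool.emplace_back([&lines, &fixes, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                fixes[i] = parseFix(lines[i]);
            }
        });
    }
    for (std::thread& t : pool) t.join();  // wait for every chunk to finish
    return fixes;
}
```

Calling `parseFixesParallel(lines, std::thread::hardware_concurrency())` on the lines of the file returns the fixes in the same order as the input, since each result is stored at the index of the line it came from.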

I changed the code to allow for Concurrency - and had a simple heap corruption error... that I couldn't find at first (don't worry, I've found it since). Three test runs later, the debug version of my test application started running VERY slowly... and I mean VERY slowly (about 100 times slower per 5000 fixes read in), making things quite undebuggable. ("But I hadn't changed anything, I swear!") That was one of those moments when I started staring at the screen, not quite knowing "WTF". (I saw somewhere an explanation of this from a father to his child... Sir, it does NOT mean "Welcome To Facebook" - but I digress...)

Then an epiphany occurred. One of those light bulb moments, when you look at the debugger output window and notice an extra line there that wasn't there before...


 "Fault tolerant heap shim applied to current process. This is usually due to previous crashes."

A bit of Googl- er, Microsoft Bing-ing later, and I could find the reason: Windows 7, in all its Microsoft spirit ("Don't worry, we'll fix it for ya!"), decided that my heap was getting corrupted too often by this irresponsible and stupid application executable and needed some more totalitarian help... ("Sir, nothing to see here, move along"), so it added the (debug) executable to its internal list of "applications which don't behave".

Long story short, when FTH (Fault Tolerant Heap) services are active for a specific application, it starts CRAWLING instead of running properly, because every heap operation (and for 183 thousand entries that means 183 thousand of them) gets monitored and followed...

The solution was simple enough: Kill the specific FTH registry entry for that app, restart - all is well again.

The registry key where all this 'magic' exists is

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\FTH\State

If you find ANYTHING in there other than the (Default) value, please be aware that each app listed will behave very poorly, because "Big Brother" is watching it.

Wouldn't the solution be simpler if, instead of making our apps silently crawl, Microsoft simply popped up a message saying "Your app sucks. Fix it, or else"?

1 comment:

Bill Roper said...

Sadly, I have the same problem in Windows Server 2008 R2, but the registry entries don't exist there.