Thursday, May 26, 2011

The dreaded Blue screen of death - a driver crash. Or debugging the Windows BSOD with WinDbg

Well it happens. There you are working on something important and wham, a blue screen of death appears. Or BSOD for short. You wait patiently as it writes "stuff" somewhere and as the seconds tick by you just hope it reboots. Then you hope you can recover some of your work. (Save often - and in multiple places if you can!)  

Welcome to Windows. Hey, you'll get used to it. And don't imagine something similar doesn't occasionally happen in alternative operating systems like UNIX or the recent UNIX-derived Apple OS. Anyone who's worked on Apple hardware for example knows that whilst "closed shop" or proprietary designs can keep potentially clashing hardware and software designs under a tighter rein, when they do break it's often far less simple - and more expensive - to get to the root cause. And it's the very openness of the original "IBM compatible" and later Intel and Microsoft hardware/software partnership that both rewards us with plentiful alternatives at lower cost and delivers us into the looser, vaguer world of mismatched versions and uncontrolled design. But enough of that. How do you fix it?

Well the BSOD usually gives you a clue. I just got one - and it blamed "NV4_disp.dll". You don't have to be Einstein to realise that "NV" is probably NVIDIA and "disp" is probably "display". It's your smoking gun, usually.

The "dll" bit is a Dynamic Link Library file, simply a file that provides one or more particular functions and/or some data for other programs to use. Not every DLL is a device driver, but when one gets named on a BSOD - and I'll use NV4_disp.dll as my example here - it's usually a device driver of some sort, or part of one. So in this example NV4_disp.dll is happily driving the screen (or monitor or display if you like) and we call it the video driver because having many names for simple things is cool. Then you start up a new or recently updated video player (like I just updated Real because it asked me to) and it innocently makes a call upon NV4_disp.dll that just doesn't make complete sense. Perhaps your version of NV4_disp.dll is (like mine) 6 months old or more and is subtly different from "today's" standard. Somewhere along the road an error crops up that doesn't get handled properly and Windows itself steps in to save us all from disaster - by shutting down. Extreme, I know, but probably safer than letting things go from bad to worse.

The best fix here is simply to update NV4_disp.dll. Whatever the BSOD identifies is usually the culprit, unless it simply can't work out what broke first. In which case you need to dig deeper (see the end of this story for more clues).

But how do you update a video driver? Or any other driver for that matter? Well the Internet can be your friend here - just search for say "NV4_disp.dll update" and choose the most likely - like the NVIDIA website. They have a tool there that searches automagically for the right driver. If that doesn't work (it didn't for me and my Windows XP SP3 machine) then go to your control panel and open the NVIDIA control panel. Look for "system information" and bingo, you have your driver data.

Plug that info in manually and it'll come up with the latest driver. Download and install that. Remember it's safer to download from the manufacturer directly, if you can. Run a virus check on the file just to be safe. 

Hopefully that'll fix it.

But what if you need more clues, Sherlock?

Well if the BSOD is clueless, try "WhoCrashed", a program that does the hard yards for you - and for free if you are a home user. (Search the Internet for it but remember to be careful who you download from and run a virus scan on the file.) WhoCrashed may ask where your source files are - and these are your "minidump" files. Minidump is simply the folder where Windows keeps its crash-logging files and is usually found at C:\WINDOWS\Minidump or similar.
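If you'd like to see what's sitting in that folder before pointing a tool at it, here's a rough Python sketch that lists the newest dump files first. It's only an illustration: the function name newest_minidumps is my own invention, and it assumes the dumps carry a .dmp extension and live in C:\WINDOWS\Minidump, which may not be true on your machine.

```python
from pathlib import Path


def newest_minidumps(folder, limit=5):
    """Return up to `limit` .dmp files found in `folder`, newest first."""
    dump_dir = Path(folder)
    if not dump_dir.is_dir():
        # Folder missing: minidumps may be turned off or stored elsewhere.
        return []
    dumps = [
        p for p in dump_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".dmp"
    ]
    # Sort by last-modified time so the most recent crash comes first.
    dumps.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return dumps[:limit]


if __name__ == "__main__":
    # Assumed default location; change it if your dumps live elsewhere.
    for dump in newest_minidumps(r"C:\WINDOWS\Minidump"):
        print(dump.name)
```

The first name printed is most likely the crash you just suffered, and that's the file to open first.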

I've rarely found Minidump turned off but some "tune up" software may turn it off to save space (not that it would save much). If it's turned off, turn it on. You'll find it via the "Help and Support Centre" in Windows: simply click on "Use tools.. diagnose problems" then "System Restore" and "System Restore settings". Phew. Then open "Advanced" and "Startup and recovery" then "Settings". Still with me? Inside settings you should have a tick in "write an event to the system log" and "Small Memory dump" selected as the type of debugging information to write. The dump folder will then default to %SystemRoot%\Minidump. Easy. Press OK to save and exit.
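For the curious, the same setting can also be read from the registry rather than clicking through all those dialogs. The sketch below is an aside and not part of the steps above: it assumes the setting lives in the CrashDumpEnabled value under the CrashControl key (0 = none, 1 = complete, 2 = kernel, 3 = small memory dump), and the function names are hypothetical. It only reads the value and changes nothing.

```python
# Assumed registry location of the crash dump settings (under HKEY_LOCAL_MACHINE).
CRASH_CONTROL_KEY = r"SYSTEM\CurrentControlSet\Control\CrashControl"

# Assumed meaning of the CrashDumpEnabled value.
DUMP_TYPES = {
    0: "none",
    1: "complete memory dump",
    2: "kernel memory dump",
    3: "small memory dump (minidump)",
}


def describe_dump_setting(value):
    """Translate a CrashDumpEnabled number into plain English."""
    return DUMP_TYPES.get(value, "unknown value %r" % (value,))


def read_dump_setting():
    """Read CrashDumpEnabled from the registry; return None when not on Windows."""
    try:
        import winreg  # only available in Python on Windows
    except ImportError:
        return None
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, CRASH_CONTROL_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "CrashDumpEnabled")
    return value


if __name__ == "__main__":
    setting = read_dump_setting()
    if setting is None:
        print("Not running on Windows, nothing to check.")
    else:
        print("Crash dump setting:", describe_dump_setting(setting))
```

If it reports anything other than the small memory dump, go back to the dialog described above and change it there.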

WhoCrashed will spit out a report. Read it; it will probably help to determine what, or perhaps who, actually crashed. If it identifies specific hardware or software then follow that trail with updates, reinstalls or rollbacks as needed. Search on the Internet for more opinions if you like, too. Often there are multiple solutions as well as countless false trails.

And if you prefer to use the genuine Microsoft debugger, it's called WinDbg and it comes with the genuine Windows set of debugging tools, downloadable from the MSDN website (just search for it in the usual way). You'll also need the Symbols download, or you can use the Microsoft symbol server like WhoCrashed does. Install it all (it's big but beautiful) and run WinDbg. You'll need to set your source files folder to C:\WINDOWS\Minidump and your symbols folder to C:\WINDOWS\symbols (or wherever you put them). Then select "open crash dump" and the specific file you want - likely to be the most recent.

When set up, click away and it'll open a report. Read it, I'll wait here.

It'll probably suggest "Use !analyze -v to get detailed debugging information" somewhere down that report, so do that as well. Again, read the report and you'll usually get the gist of what the fault was. Usually. Take what action seems reasonable (i.e. fix, update, upgrade or throw it in the bin and buy a new machine).
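If you'd rather skip the clicking, the same analysis can be run from a command prompt with kd.exe, the console debugger that ships in the same Debugging Tools package. The Python sketch below only builds the command line and doesn't run anything. Treat the details as assumptions: the dump file name is made up, the helper names are mine, and the symbol path uses the SRV*cache*server form pointing at Microsoft's public symbol server with C:\WINDOWS\symbols as the local cache.

```python
def symbol_path(cache_dir, server="https://msdl.microsoft.com/download/symbols"):
    """Build a symbol path that caches downloaded symbols in `cache_dir`."""
    return "SRV*{0}*{1}".format(cache_dir, server)


def analyze_command(dump_file, cache_dir=r"C:\WINDOWS\symbols", debugger="kd.exe"):
    """Return the argument list to open a dump and run !analyze -v."""
    return [
        debugger,
        "-z", dump_file,               # -z: open this crash dump file
        "-y", symbol_path(cache_dir),  # -y: where to find the symbols
        "-c", "!analyze -v; q",        # -c: run the analysis, then quit
    ]


if __name__ == "__main__":
    # Hypothetical dump file name, for illustration only.
    cmd = analyze_command(r"C:\WINDOWS\Minidump\Mini052611-01.dmp")
    print(cmd)
```

On a machine with the debugging tools installed you could hand that list to subprocess.run and capture the report as text, which is handy if you have a pile of dumps to get through.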

If any of the above sounds ludicrously complex then just don't do it. Take it to a shop - or (if you are on the Central Coast of NSW) call me - instead.
