My First Linux Crash

For the first time in nearly three years, my Linux box crashed. Actually, that headline is misleading, as my problems yesterday were hardware related rather than Linux related, and left me with a system that wouldn't boot into its operating system.

I bought the base unit on eBay nearly three years ago, and it has always had issues with overheating. As a result, I've run it with the side panel off all that time.

Having recently cleared my desk in preparation for a dual-monitor setup, yesterday morning I turned my attention to underneath the desk. I tidied up all the cabling, dusted the cobwebs out of the computer (it's amazing the crap that collects in there, especially if you have your computer sitting on the floor) and installed a new case fan. With the new fan in place, I chose to put the side panel back on.

Big mistake!

With everything sorted I settled down for an afternoon with the horses. Midway through the session, the CPU got too hot and caused everything to completely freeze. Fortunately I wasn't in the middle of a trade at the time. I had to cut the power and reboot to get things going again, only it didn't.

At first, the BIOS wouldn't recognise the hard drive. A couple of restarts later and it was fine, except that it wouldn't get past the Grub 2 boot loader, which kept reporting 'Error 2'.

A quick search on the Internet via a Live CD revealed that this error is caused by the boot loader not finding the hard disk in question, or by the disk not being reported by the BIOS. However, I knew the disk was OK. The BIOS was indeed finding it and I could access it via the Live CD.

I then spent much of the remainder of the day trying various solutions I found on the Internet. However, these tended to address an issue with Grub, whereas I was certain my issue was hardware related. I couldn't see why it was causing a problem since the drive was recognised and accessible.

As is the nature of these things, the solution is often born out of a hunch.

I have two hard disks in my machine, configured as Master and Slave. The slave holds a previous version of Ubuntu, which I don't use. Indeed, that drive is never mounted when I'm working in the latest version. I acted on the hunch that perhaps the crash had caused the Grub 2 boot loader to look at the slave drive for its boot-up information.

I cut the power to the slave, rebooted, and hey presto, I was straight into my normal system. The weird thing is, when I fed power to the slave drive again and rebooted, everything was back to normal.

As an engineer, I hate that: fixing something without knowing exactly what caused the problem in the first place. Anyway, problem resolved, and the side panel will remain off the case.
