Have I said lately that I love Linux? No? Okay then, "I love Linux!" I found yet another reason to promote Linux over other operating system software this week. Bad RAM chips. Some of you may now be thinking to yourselves, "What? Is this guy nuts?" I assure you I am completely sane. I will explain.
One of my business clients is also a Linux fanatic. I introduced her to Linux several years ago and she almost instantly adored it. Having that cute "Tuxie" helped, I think. She first began using Linux with KDE, then tried and liked Gnome as well, and switched back and forth between the two for specific tasks. However, she became seriously disenchanted with KDE when the poor decision was made to foist an unfinished KDE 4 upon the unsuspecting public. So I helped her switch completely to Gnome on her AMD Quad-core based office PC in 2010. She had been a happy camper ever since the switch, until a recent upgrade from Mandriva 2010.0 to Mandriva 2010.2 seemed to make her system go flakey.
After this upgrade in late December 2010 she started having problems with Nautilus (Gnome's default file manager) running slowly and hanging. Then certain applications such as Firefox began to crash regularly. She called on her computer guy, me, to take a look at the system earlier this week after her Linux system experienced a hard hang that required a press of the reset switch. The hard hang was bad enough that she could not even ssh into her PC from her laptop to try to kill a runaway process, as I had shown her how to do. She knows that the Linux kernel just does not hang like this, so something had to be wrong with her hardware. I immediately suspected a RAM problem after she enumerated all the hangs, crashes and especially the hard hang of the complete system.
Enter Memtest86+ to check that RAM. Sure enough, Memtest reported memory errors in the address range 0x0d646aa8 to 0x0d646f68 (hexadecimal). This is a very small range of memory in her 2 GB of RAM, and tossing the RAM out was not an option at the moment as money is tight. But she still needed to use her PC. So I went looking for a solution, as I had once seen an oblique mention of a kernel hack called "badram". I found Rick van Rein's pages for his badram hack, and a little further research showed that Mandriva has included it as a patch in their kernels for quite a while. So I edited /boot/grub/menu.lst on my client's PC and added a new kernel stanza:
title linux-badram
kernel (hd0,0)/boot/vmlinuz BOOT_IMAGE=linux root=UUID=bb2a27be-33ed-4a18-b576-37adb9bdfa3b splash=silent vmalloc=256MB vga=788 badram=0x0d646aa8,0x0d646f68
initrd (hd0,0)/boot/initrd.img
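For anyone curious just how small that faulty range is, here is a quick back-of-the-envelope sketch in Python. It is not part of the fix, only arithmetic on the two addresses Memtest reported. The 4 KiB page size is an assumption (the usual x86 default), and the variable names are made up for illustration. The last two values show what an address,mask pattern covering that one page could look like, on the assumption that the badram parameter is read as address,mask pairs as van Rein's BadRAM pages describe. That is a hypothetical alternative, not the stanza that was actually used here.

```python
PAGE_SIZE = 4096           # assumption: 4 KiB pages, the usual x86 default

first_bad = 0x0d646aa8     # first failing address reported by Memtest86+
last_bad = 0x0d646f68      # last failing address reported by Memtest86+
total_ram = 2 * 1024**3    # the 2 GB installed in the PC

# Distance between the first and last failing address.
span_bytes = last_bad - first_bad

# Which pages do the two addresses fall into?
first_page = first_bad // PAGE_SIZE
last_page = last_bad // PAGE_SIZE
pages_affected = last_page - first_page + 1

# Hypothetical address,mask pattern for the page holding the faults
# (assumes the address,mask reading of the badram parameter).
page_base = first_page * PAGE_SIZE
page_mask = 0xFFFFFFFF & ~(PAGE_SIZE - 1)

print(f"faulty span:    {span_bytes} bytes")
print(f"pages affected: {pages_affected}")
print(f"share of RAM:   {pages_affected * PAGE_SIZE / total_ram:.6%}")
print(f"page pattern:   {page_base:#010x},{page_mask:#010x}")
```

Under those assumptions the two addresses are 1216 bytes apart and sit inside a single page, so only a tiny sliver of the 2 GB is actually suspect.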
I rebooted her PC, chose the linux-badram menu option and told her to try the PC for a couple of days and let me know if that solved her hangs and crashes. If so, I would make that the default for her. Two days passed with no report from her, so I asked her about the system this morning. She was so excited to report a stable system that I had to smile. She said, "Nautilus is fast again and not one application has crashed." Prior to this she had been having multiple crashes of her X applications every day. Problem solved with the Linux kernel and the badram patch. Thanks, Linux kernel team and Mr. van Rein, for your attention to little details like this. This is part of why I love Linux.
So, if you are looking at some misbehaving RAM in your own PC, consider using the badram patch before you toss out otherwise good RAM. Of course, if you are not using Linux I suppose you will just have to throw out that flakey RAM and buy new RAM … or you can send it to me.
Update Thu Feb 3 11:26:43 CST 2011: This labor was a warranty job, so the client paid nothing for us to implement this fix. The RAM is under a limited lifetime warranty from Kingston and is going to be replaced. In the meantime, Linux with badram is allowing her to use her PC.