Badtux the Snarky Penguin

In a time of chimpanzees, I was a penguin.

Religious fundamentalists are motivated by the sneaking suspicion that someone, somewhere, is having fun -- and that this must be stopped.


Sunday, March 20, 2005

Penguins and Linux: A love-hate relationship

If you are wondering where I have been for the past two days, it is simple: I have been buried deep in the guts of my computer, which locked up from time to time and then, when reset, came back up rebuilding one of my RAID arrays.

First, a quick run-down of my computer's configuration. This is not an el-cheapo setup by any means. I make my living with this thing, and I long ago learned that skimping on the tools of your trade is no virtue; quality pays. This thing has a top-of-the-line Antec case and power supply. A Promise 4-port SATA controller. A top-of-the-line (at the time) Soyo motherboard with a one-notch-below-top-of-the-line (at the time) VIA chipset that has two SATA ports onboard, as well as normal IDE, Ethernet, Firewire, and USB 2.0. A middle-of-the-road NVIDIA video card (ultimate video performance is not required for my purposes, whereas absolute stability is). A 2.1GHz AMD Athlon XP processor (one notch below top of the line at the time) and 512MB of RAM. And, most importantly, three 160GB Maxtor 7200RPM SATA hard drives, arranged as one RAID1 (mirroring) array across the bottom for my boot partition, and five Linux RAID5 arrays splitting up the rest of the drives for my various partitions. All of this is running Red Hat Fedora Core 3 Linux. To round things out, I have a Firewire DVD burner and a 250GB Firewire external hard drive that get shared with my laptop for backups, and of course a couple of printers (a laser printer for invoices and other business printing, an Epson inkjet for my personal printing).

Okay, the first notion was that because I heard a loud click from the hard drive area when it locked up, it was a hard drive locking up. The problem is this: Why in the world should a hard drive locking up take out the whole freakin' computer?! I mean, the whole point of RAID is that if a hard drive goes AWOL, only that one hard drive goes out, not the whole friggin' system! So I did a read test of all the drives. Unfortunately, they all passed the read test (i.e., I could read every byte on every drive). It was clear that a read test wasn't going to diagnose the problem.
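For the curious, a read test of this kind can be sketched in a few lines of Python. This is only an illustrative sketch, not necessarily the tool used here: the function name `read_test`, the 1 MiB chunk size, and the scratch-file demo are all assumptions made for the example. Pointed at a real block device (which needs root), it would simply read from the first byte to the last and let any I/O error surface as an exception.

```python
import os
import tempfile


def read_test(path, chunk_size=1024 * 1024):
    """Read `path` from start to end and return the number of bytes read.

    An unreadable region shows up as an OSError raised by read().
    (Hypothetical helper, named for illustration only.)
    """
    total = 0
    # buffering=0 asks for raw, unbuffered reads from the OS.
    with open(path, "rb", buffering=0) as dev:
        while True:
            chunk = dev.read(chunk_size)
            if not chunk:  # an empty read means end of file/device
                break
            total += len(chunk)
    return total


if __name__ == "__main__":
    # Demonstrate on a scratch file rather than a real disk.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"\0" * 3_000_000)
        scratch = f.name
    try:
        print(read_test(scratch), "bytes read without error")
    finally:
        os.remove(scratch)
```

A clean pass only shows that every byte was readable on that particular run. It says nothing about a drive that hangs intermittently under load, which is consistent with the read test finding nothing here.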

At the time I had the on-board SATA disabled, and was using only the four-port Promise SATA. So I decided, hmm, maybe Promise's driver for Linux sucks. So I enabled the onboard SATA and moved two drives to it, and rolled a new initrd with the driver for the onboard SATA and rebooted into the new configuration.

Well, it locked up again when I tried to back up everything to the firewire hard drive. So I said to myself, "Hmm, maybe it's the one that's still on the Promise controller." So I swapped cables.

And sure enough, it locked up again the next time 'round... but *THIS* time, using the VIA SATA driver, it printed out which drive was gumming the works. So I unplugged that drive, rebooted (into degraded mode), backed up my stuff to the Firewire hard drive with no problem (which 100% verified that the drive in question was the evil one gumming up the works), and then I had to repair the thing.

So a quick trip to Fry's Electronics, grab a new drive, install it, boot into rescue, copy the partition table from one of the other drives, manually assemble the RAID arrays in two-drive (degraded) mode, hot add the new partitions to the RAID arrays, wait for the RAID arrays to rebuild (25 minutes -- this is a FAST computer), and... It works!
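The repair sequence above can be sketched as a small Python helper that only prints the commands, in order, without running anything. Everything specific in it is an assumption: the post does not say which tools were used, so `sfdisk` and `mdadm` are stand-ins, and the device names (`/dev/sda`, `/dev/sdb`, `/dev/sdc`), the array-to-partition mapping, and the function name `recovery_plan` are hypothetical.

```python
def recovery_plan(survivors, new_disk, arrays):
    """Return the rebuild commands as strings, in order, without running them.

    survivors: the two disks still in the arrays (hypothetical names)
    new_disk:  the replacement disk (hypothetical name)
    arrays:    maps each md device to its partition number on every disk
    """
    # 1. Copy the partition table from a surviving drive to the new one.
    cmds = [f"sfdisk -d {survivors[0]} | sfdisk {new_disk}"]

    # 2. Assemble each array from the survivors only; --run starts it
    #    even though a member is missing (degraded mode).
    for md, part in arrays.items():
        members = " ".join(f"{disk}{part}" for disk in survivors)
        cmds.append(f"mdadm --assemble --run {md} {members}")

    # 3. Hot add the matching partition on the new disk to each array.
    for md, part in arrays.items():
        cmds.append(f"mdadm {md} --add {new_disk}{part}")

    # 4. Rebuild progress is reported in /proc/mdstat.
    cmds.append("cat /proc/mdstat")
    return cmds


if __name__ == "__main__":
    # Two arrays shown for brevity; the layout is made up for the example.
    plan = recovery_plan(
        ["/dev/sda", "/dev/sdb"], "/dev/sdc", {"/dev/md0": 1, "/dev/md1": 2}
    )
    for line in plan:
        print(line)
```

Building the plan as plain strings keeps the sketch safe to run anywhere, and it makes the ordering explicit: partition table first, degraded assembly second, hot add last.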

Lessons learned:

  1. Linux sucks.
  2. Linux, out of the box, on commodity hardware, is not ready for mission-critical purposes because it locks up in situations, such as a single misbehaving drive in a RAID array, where a lockup is absolutely unacceptable. (Note: There are Linux systems that do not share this flaw, but they are running special hardware, not commodity hardware.)
  3. That said, I would not have fared any better under Windows. Indeed, with all the hardware-swapping involved, I probably would have ended up having to re-install Windows (I have had very poor luck getting Windows to deal with massive hardware changes).
  4. Penguins still love Linux, but hate it too.

And now that this real-world exercise in technological sado-masochism is over, maybe I can get some work -- and blogging -- done :-}.

- Badtux the Linux Penguin

Posted by: BadTux / 3/20/2005 08:07:00 PM  
