9:28pm: This is not what is supposed to happen
Rough day at work today. And by rough I mean one step above total disaster. It's only a minor disaster.
A hard drive died in the development database server. That happens. Normally it's not a big deal. You put a spare in, rebuild the RAID array, and life goes on.
For reasons that we don't quite understand yet, this was not a normal situation. Write failures appeared across the board, the server crashed, and there was major corruption of the array. An entire partition went missing (we were able to recover it using a tool called Recover My Files). On the partitions that survived, there are randomly corrupted files and parts of files. We've got log files that look normal except for one section of total gibberish. Batch files that don't work anymore, data files that have one bad block out of a million, and so on.
To call it a mess would be an understatement. Making it worse is that the DBA is out due to a family emergency (and for good reason), so we're left trying to deal with it without really having the knowledge we need. Oracle is a fickle thing when you're learning on the fly in disaster recovery.
Now, this is an older server, but it's still a server. RAID 5 is not supposed to behave like this when one disk goes. Something VERY wrong happened here. My day wasn't spent trying to figure out what went wrong; it was spent trying to get things working again... which was somewhat less than entirely successful. I'm not sure it can be done without a full restore at this point, given how widespread and seemingly random the damage is (one database only has tables starting with the letter A now, go figure!).
We're going to try again tomorrow, and after that we'll decide what the next step is.
ps - I hate summer.
Current Mood: exhausted