Sunday, September 05, 2004
A real sysadmin horror story :~(
I've been reading sysadmin horror stories on the net (http://www.kenthamilton.net/humor/admin-horror.html#MiscStories). They're kind of funny, but they're also mind-boggling, and that sort of mistake might get you fired by your employer :-) It has happened to me many times (I'm apparently never fully awake, so I don't learn from my mistakes, eh?). The latest was a few months ago, when I was doing a disk cleanup on one of the servers I handle. The machine is a Sun SPARC, a pretty ancient server but still going strong as a primary DNS hosting a few top-level domains. It runs OpenBSD 2.9 with the GENERIC kernel (there isn't enough space left on / and /usr to download the kernel source and build a custom one). It had actually been running for a year without a reboot or shutdown, but that day, I don't know why, I felt sooo workaholic and in need of extra work.
At first I was just looking for logs and core dumps that I could rm -rf to get some free space. After cleaning up all the "space-eating" stuff I was about to log out of the remote terminal, but my fingers were itching to cd /, and I got really curious about what this boot* file was and why the hell it was in the slice. It annoyed me, and my irritated mind ordered me to rm -rf boot* as fast as I could.
Then I typed shutdown -r now, satisfied with the disk cleaning I had done for the day. But things didn't go as smoothly as I expected: I waited a while for the machine to reboot and come back, and it didn't!! I tried to log in remotely, but nothing answered on port 22. I was like, "what the fuck is going on down there???" So I searched Google for the boot* file in OpenBSD. And y'know what? The machine needs that file to boot the OS, and without it the machine won't boot up!! I panicked!!!!! like never before. Fcuk fuck fuck!!!!!!! ^%&%&&$%^!@##!@!#@# and a lot more swear words came out spontaneously. I felt like I had wasted all the time and effort I'd just put into the maintenance job.
A couple of minutes later some users started complaining that they couldn't browse websites (yess, because the domain resolver was offline), and then that they couldn't reach some public websites hosted by the organization either (yesss, because there was no DNS to accept and answer queries). It was really the horror of the decade for me :~(
It was Saturday morning, so I quickly ran downstairs, picked up my bike and sped straight to the machine's location, just a few hundred metres from home. I switched on the 21" SPARC monitor and saw that it was stuck at the boot prompt, with errors about a failed disk mount and something else. Unfortunately I hadn't backed up all the DNS zone and reverse files, or some other important files, to another machine or medium, and the slave NS wasn't doing any AXFR fetches from that machine. I was going to have to reread the docs on the net about how to configure OpenBSD on SPARC (I forgot!) and redo all the DNS files from zer0! Fuck! Luckily I found some backup DNS configuration files that I had uploaded to another machine, but the backup was a year old. I nearly cried, because quite a lot of the personal files in my $HOME were gone. I searched the net for a way to recover boot* without reinstalling the machine from scratch, but to no avail.
Then I grabbed a floppy disk and made an OpenBSD boot floppy to install OpenBSD 3.4 via FTP. I had to sacrifice my time and stay in the server room until Sunday night to redo everything. It really is a horror, aight? Afterwards I remembered a quote that a fellow sysadmin shouts out to every sys/network admin out there:
-Do backups more often than you go to church !!!
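For what it's worth, here is roughly what that advice could look like in practice. It's only a minimal sketch, not what was running on the machine in the story: the function name backup_dir, the keep parameter and the paths in the usage note are all made up for illustration. It packs one directory (say, the DNS zone files) into a timestamped tarball and throws away the oldest archives.

```python
import tarfile
import time
from pathlib import Path


def backup_dir(src, dest_dir, keep=7):
    """Archive `src` into `dest_dir` as a timestamped .tar.gz.

    Hypothetical helper for illustration: keeps the newest `keep`
    archives of this directory and deletes the older ones.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    src = Path(src).resolve()
    dest_dir = Path(dest_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"nothing to back up: {src}")
    dest_dir.mkdir(parents=True, exist_ok=True)

    # The timestamp sorts lexicographically, so sorting by name is
    # the same as sorting from oldest to newest.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest_dir / f"{src.name}-{stamp}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(src, arcname=src.name)

    # Prune everything except the newest `keep` archives.
    archives = sorted(dest_dir.glob(f"{src.name}-*.tar.gz"))
    for stale in archives[:-keep]:
        stale.unlink()

    return archive
```

Something like backup_dir("/var/named", "/mnt/backup/dns") run daily from cron would do (both paths are assumptions, adjust them to wherever your zone files and backup media really are). The archive still has to be copied to another machine or medium afterwards, because a backup sitting on the same disk would not have helped here.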

