
Present Perfect



Filed under: General — Thomas @ 12:22

2003-12-31
12:22

Just got back from GUADEC, it was my first time and it was a great one. So sad to be at home now ;)

I was going to post some more on it tomorrow, but seeing Gman's list brought a proverbial tear to my eye and it makes me wonder why he forgot to mention our sight of a Sun engineer doing releases on a roof. Pants off and bork !

Filed under: General — Thomas @ 12:21

12:21

I had the shittiest day at work ever and I don't even want to talk about it right now.

The only reason I'm writing anything is because I want to try out this new advodiary python script everyone seems to be using.

So if you're reading this I guess it worked.

Filed under: General — Thomas @ 12:20

12:20

I finally got back a replacement SCSI drive for the external RAID system we have. Can you believe that ? Two full months. I've had the RAID shut off for most of it. Honestly, what's the point in having a RAID if you can't replace bad disks fast enough ?

In any case, as soon as I got it back, I stuck it in the free slot and turned on the machine. Then the beeping started again. I shut off the alarm in the card's BIOS, then tried to rebuild the disk(1) array. It started doing so, then stopped at 0% and went off beeping again. Sigh.

What's a responsible sysadmin to do in a case like that ? Not tell his boss and spend the rest of his day challenging fate and defying odds, that's what. You know, sometimes computers really are more a case of black art and intuition than anything even closely resembling engineering practices or science.

As luck would have it, I happened to have an external enclosure with *FOUR* disk slots, and I use only three of them. So I still had the top slot free. Being in a really illogical state, ready to try anything, I decided to stick the drive in the top slot instead of the bottom one.

Hm, it didn't fit. Strange. Try an empty disk holder, that works. Try with the drive again, still doesn't work. Ok, try to take a drive out of slot 2 and stick that in slot 1. That works. Hm, strange. Take out drive 1 from slot 2 again, insert new drive in slot 1, doesn't work. Spend about twenty minutes physically comparing both the disk drive and the disk holders. They look identical. Mount new drive in holder 1 from slot 2 and try in slot 1. Doesn't work. Drive 1 in holder 3 from slot 4 works though. This isn't making sense at all. Especially since I can really feel that with the new drive, it slides in all the way except for the last bit and the latch at the front of the disk array that needs to go in a clamp-style thingy can't reach it.

Re-swap drives and holders, jiggle the SCSI connectors a bit since they look rather wobbly (don't try this at home), and try again. Now it slides in and it works. Anyone care to explain ? ;)

So, back to BIOS, turn off beeping again, ask for a rebuild, and after five very long minutes the 0% counter reaches 1%. Now I just have to hope that by the time it is rebuilt, the array will have been smart enough to do everything RAID is supposed to do ... and THEN I can tell my boss what happened, and assure him that everything's fixed. Phew ...

(1) Can anyone tell me why we use "disk" for hard drives and floppy disks but "disc" for cd-rom's, dvd's, cd-rw's and frisbees ?

Filed under: General — Thomas @ 12:19

12:19

Prologue : A meeting room at Adaptec, two years ago. Engineers are discussing the design of the 2100S Raid controller...

  • Engineer 1 : wouldn't it be great if we put a beeper on the card to warn people when one of the disks fails ?
  • Engineer 2 : yeah, great idea. We can charge more for the card then, and people will be really happy when it fails because they'll be able to tell immediately !

Happy engineers leave the meeting room, congratulating each other on a job well-to-be-done.


So I arrive at work this morning. I am there for all of, oh, five minutes, and a really annoying beeping sound makes itself heard. I look around trying to figure out which of the ten PC's in my immediate vicinity is making the sound. It seems to come from one of the server machines. Yep, it's the big server with the external storage unit (which, if you've read through other entries of mine, you might know is connected through an internal SCSI cable to the inside of the server machine).

What seems to be the matter ? One of the LEDs on the storage unit is flashing. I start looking for manuals on the storage unit, but it doesn't help me further. Meanwhile, the beeping is hurting my ears, but I don't want to switch off the machine until I see there's no other option.

I quickly give up and turn off the machine anyway, as a result of peer pressure. Good thing it only started beeping after I arrived. I don't dare to think what people would have done at six o'clock to get rid of the sound - or, worse yet, at the start of my week-long holiday the day after tomorrow.

Hm, ok, so judging from the BIOS, one of the three disks has failed. No matter, it's a RAID. I can back up the drive and ask for a replacement, right ? Let's see, when did I buy this unit ?

Hm, too bad. Exactly one year and two weeks ago. The drives have a warranty of one year. That sucks ...

Ok, they're also pretty expensive. Hm, how am I going to bring this up with my boss ?

Meanwhile, I still need to back up. The Adaptec site mentions how to turn off the alarm in about six articles, so I'm guessing they've had complaints about that bad design decision (TM) in the past. Only, the site mentions a command-line utility which I have, but it doesn't work. I run raidutil -h first but that doesn't do much. It should print help info, no ?

I check the man page. raidutil -h creates a hot spare. Oops. Luckily I didn't supply arguments, who knows what might have happened ;( The man page itself is messy and plain wrong. It mentions a -a argument and contains the word "alarm" there twice, but all of the arguments mention other actions. From what I gather from the six Adaptec articles, the alarm should be set using -A, and there is also a -a option for other stuff, so the man page seems to mix both up together ;( Talk about bad quality control.

Anyway, none of the options seem to work. Probably having upgraded the machine from RH62 to RH72 has something to do with it, since the Adaptec tools are probably geared towards the previous kernel. Maybe the utilities can't speak dpt-ese at the moment.

So I reboot with the Adaptec bootable CD. The card beeps again, of course. It's a custom Red Hat boot cd. It starts by autoprobing my video card and then starts X. Surprise surprise, X is botched up.

By now I'm pretty pissed and I decide I'll get this fixed no matter what. I'll spare you the details, but poking around the CD-ROM allowed me to get X configured properly in 800x600. So I start X and Adaptec's software starts up in a really ugly window manager I seem to vaguely remember from the dark ages.

The software tells me I have a failed disk. Yes, but why ? Luckily I can turn off the alarm.

So what now ? Make backups ? It's 60 GB. Probably not a good idea to do that over the network, but let's try, just for fun. Ok, so there is a module for my network card, but no network tools like ping, and it doesn't seem like the network card wants to work with the driver. So this isn't going to work out either.

Shut down, attach a spare IDE drive to copy stuff to, reboot, repeat process, open terminal. Hm, what device is the RAID system now ? It used to be /dev/sdb. But that's my main SCSI disk now, and the CD-ROM is /dev/sda for some reason. And I cannot find the RAID controller anywhere else. And there's no dmesg or /var/log/messages output. It's probably been turned off in this custom kernel.

By now I feel I could really make good use of a shotgun. Luckily, I don't have one.

*SIGH* Check drivers and software on the site, download RPM's for older Red Hat versions, restart the machine, try to install them (during the beeping of course), debug the messages from the console to get a clue of what is going wrong, and FINALLY connect to the Adaptec storage manager and be able to turn the bloody beeping OFF. By this time I've already spent quite a few hours trying to do this the right way and my mood has gone to an all-time low this year because of it. I really need that holiday !

So right now I'm copying all of the data off the drive (well, almost all - I have 50 GB of free space and 60 GB of data to copy and I'm deciding who to piss off by taking risks with their data) and I'm composing a mail to the hardware vendor trying to persuade them to take back the drive under the warranty.


On to better news : I've experimented with GTK+ this weekend. I wanted to code a better panel application for Dave/Dina, to be used with the Infrared Controller. Since I'm a GTK+ newbie, I started out with GTK+ 2. Might as well try to get it right from the start.

I must say it was a pleasant experience. After the pains of messing about with Xlib for another project of mine, Xmsgd, this was fairly easy to code. I have a small demo application that shows how the panel would react. I'll probably put it on-line soon at http://davedina.apestaart.org/ so other people can comment on the UI. It's bare-bones, but it'll be functional until some GTK wizard helps out in development ;)

Meanwhile, I'm getting patches for some of my projects and it is really satisfying to be a part of this invisible open-source network.

Oh, and I forgot to mention that GStreamer released 0.3.1 last week ! We're setting up guidelines for 0.4.0 at the moment. And Wim wrote a new capabilities negotiation system (the thing that makes plugins decide if they want to talk to each other).

And now I need to blow off some steam ...

Filed under: General — Thomas @ 12:18

12:18

I finally got out a first release of columbus. If you have a laptop or a computer you take with you to various places and you're sick of having to somehow manually change your system configuration, then help me out in improving columbus.

Columbus is simple, a bit hackish, but rather elegant. It has worked flawlessly on my laptop for the past two weeks. I take out my network cable at work, put the laptop to sleep with apm, come home, wake it up and plug the cable back in, and presto : everything works the way it should. Right IP (without DHCP), right hosts file, right hostname, ...

All it needs is a few MAC and IP addresses and some configuration files to actually use.
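
For the curious, the matching idea can be sketched in a few lines of Python. This is only a rough illustration and not columbus's actual code or configuration format: the KNOWN_LOCATIONS table, the example MAC addresses and the function names are all made up, and it assumes you already have a list of MAC addresses seen on the network you just plugged into.

```python
from typing import Dict, Iterable, Optional

# Hypothetical table: MAC address of a machine known to live on each
# network -> name of the configuration profile to activate there.
KNOWN_LOCATIONS: Dict[str, str] = {
    "00:11:22:33:44:55": "work",
    "66:77:88:99:aa:bb": "home",
}


def normalize_mac(mac: str) -> str:
    """Lower-case a MAC address and use ':' as the separator."""
    return mac.strip().lower().replace("-", ":")


def pick_profile(seen_macs: Iterable[str],
                 known: Dict[str, str] = KNOWN_LOCATIONS) -> Optional[str]:
    """Return the profile of the first recognised MAC, or None."""
    table = {normalize_mac(mac): profile for mac, profile in known.items()}
    for mac in seen_macs:
        profile = table.get(normalize_mac(mac))
        if profile is not None:
            return profile
    return None


if __name__ == "__main__":
    # The MAC would normally come from probing the network; hard-coded here.
    print(pick_profile(["66-77-88-99-AA-BB"]))
```

Once a profile name comes back, the rest is just copying the right hosts file and friends into place; if nothing matches, the safest thing is probably to leave the current configuration alone.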

So try it out and let me know if it works for you.
