A new Fedora, a new decision on which machines to upgrade. Usually I try to stagger the three machines I use most - my work desktop, my home desktop, and my laptop. I had updated my work machine and laptop to F-15 when it came out, and kept my home desktop at F-14.
I actually have two or three root partitions on each of those machines, and I typically do a fresh install on a separate root, so I can try things, poke around, and make sure everything I will need works. When I do the install, I don't mount my /home partition, because I don't want to have the new version upgrade things for me on my user config.
I have a pretty long checklist by now that I go through on each install/upgrade: installing the packages I use a lot, setting up specific configuration, copying over ssh keys, ...
I actually liked F-15 a lot, and though GNOME 3 has its issues (which I still want to document in a separate post), I overall enjoyed the experience. At home, I noticed myself using the Windows key or moving my mouse to the top left corner expecting something to happen.
That is how you know you really are ready for GNOME 3.
So I thought, what the heck, let's get to upgrading all of them. I started with my laptop, as usual. That mostly went fine, except for hurdle number one. My laptop actually has /home encrypted. And I did not add it to my custom layout in anaconda. So, the system dropped me in a rescue shell after booting. It took me quite a while to figure out that I had to copy over /etc/crypttab from the old system. After that, things worked again.
Arguably, hurdle #1 may not be Fedora's fault. Maybe normal users don't encrypt home drives, or use custom partitioning like I do (although on a few Fedora upgrades this saved my bacon when it turned out certain things I needed didn't work in the new Fedora, like VMware).
And yes, GNOME 3.2 is a slight improvement. Enough to make a difference at least. All the usual applications seem to work, so I can now mount my old /home directory.
That's when I ran into hurdle number 2: the default uid/gid numbering change. My thomas user now was 1000:1000 as opposed to 500:500 on all my machines before Fedora 16.
In this day and age, I still have to shell it up to fix things like that:
find / -uid 500 -exec chown 1000 {} \;
find / -gid 500 -exec chown :1000 {} \;
If I had less shame I'd tell you how embarrassing it is if you do this for a few users on your system, and start thinking "let's put this in a for loop", and because it's already 1 AM you start doing things like
for a in 0 1 3; do find / -uid 50$a -exec chown 10$a {} \; ; find / -gid 50$a -exec chown 100$a {} \; done
Note how I got the number of 0's wrong in the first find, and how I actually forgot the : in the second. You can imagine how amusing it is to fix the effect of those commands.
But I'm a shameful person so I won't tell you about this bit. Instead, suffice it to say that this took a long time.
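For anyone facing the same renumbering, here is a minimal sketch of what that loop was presumably aiming for. It is not the command from the story: the function names `new_id` and `remap_cmds` are made up for this example, it assumes every old id simply moves up by 500, and it only prints the commands so they can be read with a clear head before anything runs as root.

```shell
#!/bin/sh
# Sketch: remap old Fedora uids/gids (500+) to the new range (1000+).
# Assumption: every id moves up by exactly 500 (500 -> 1000, 501 -> 1001, ...).

# new_id OLD: print the id that OLD maps to (hypothetical helper).
new_id() {
    echo $(( $1 + 500 ))
}

# remap_cmds OLD [ROOT]: print the two find commands for one id.
# Nothing is executed here; review the output first, then run it as root.
remap_cmds() {
    old=$1
    root=${2:-/}
    new=$(new_id "$old")
    # -xdev keeps find on one file system, so repeat per mounted file system.
    # chown -h changes a symlink itself instead of the file it points to.
    echo "find $root -xdev -uid $old -exec chown -h $new {} +"
    echo "find $root -xdev -gid $old -exec chown -h :$new {} +"
}
```

Something like `for a in 0 1 3; do remap_cmds 50$a /home; done` then prints six commands, with the right number of zeroes and the colon in place for the group change.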
Ok, so now /home is mounted on the laptop, and for the most part things worked fine.
On comes the weekend, so I turn to the home machine. I tend to keep the work machine for last, because I don't want to spend work time on fixing distro problems. And I usually take a whole weekend to upgrade at home. The home machine turned out to be more of a problem. I ran headlong into hurdle number three. You see, there is this new thing called GPT for your partition table, and it is now the default, and it means that fdisk will no longer work, and now you should use gdisk (which sadly is not installed on the rescue bit of the install DVD, boo!), and this is all so we can have grub2, which is supposed to be better or something.
I'm sure one day I will be thankful. But on my home machine, I didn't know any of this, and just had anaconda tell me something about the boot image being too large and there was no space for it and my system may not boot. (I am not sure why I did not run into this problem on my laptop - presumably, looking at the disk layout now, because I kept the original install, which includes Windows, and just shrunk that and added Linux - so it's probably the Windows thing doing the booting). And sure enough, the Fedora 16 install did not boot. It dropped me into my friend, the shell.
So here's the thing. This new way of doing things needs more space than your average MBR, so you actually need to create a primary partition for this, and it needs to be in the first 2 TiB. So you know what time it is now. It's resize-o-clock time - I get to learn the joys and mysteries of shrinking ext4-on-software-raid so I can make space for this new partition, which doesn't need to be big, apparently 5 MB is more than enough. Aren't I happy now that I stubbornly stuck to having a /boot partition as the primary one on my machines, so I can just shrink that a little?
So shrinking an ext partition I already had down pat. I learnt about shrinking software raid partitions, and again I got into the land of not understanding which of the many types of numbers (sectors ? blocks ? bytes ? cylinders ? Mebi vs Mega ?) are understood the same way by the tools, or not understanding how much of those numbers you need to count extra because of the layer of indirection being added (encryption on logical volume on LVM on software RAID anyone ?). So to be safe I end up shrinking 10% on each layer of the onion as I go deeper - then let the tools handle growing to the maximum space again, since that's the one thing they're usually decent at.
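To make the unit soup a bit more concrete, here is a small sketch that expresses one size in the units the various tools tend to report. The helper names are invented for this example, and the 512-byte sector and 4096-byte block sizes are assumptions; the real values on a given machine may differ and should be checked first.

```shell
#!/bin/sh
# Sketch: convert between the units that partitioning tools mix up.
# Assumptions: 512-byte sectors and 4096-byte file system blocks. Check the
# real values (e.g. blockdev --getss, tune2fs -l) before trusting these.
SECTOR_BYTES=512
BLOCK_BYTES=4096

# sectors_to_bytes N: size in bytes of N sectors.
sectors_to_bytes() { echo $(( $1 * SECTOR_BYTES )); }

# bytes_to_blocks N: whole file system blocks in N bytes (rounded down).
bytes_to_blocks() { echo $(( $1 / BLOCK_BYTES )); }

# bytes_to_mib N: mebibytes, i.e. units of 2^20 bytes (rounded down).
bytes_to_mib() { echo $(( $1 / 1048576 )); }

# bytes_to_mb N: megabytes, i.e. units of 10^6 bytes (rounded down).
bytes_to_mb() { echo $(( $1 / 1000000 )); }
```

For example, 500 MiB is 524288000 bytes, which `bytes_to_mb` reports as 524 and `bytes_to_mib` as 500. A gap of that size is exactly where a shrink that looked safe in one tool stops being safe in the next.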
But you know, if I've done all this, I want to get it right. I don't want a stinking BIOS boot partition sitting after my /boot partitions. That's not how F16 sets it up by default. But I have never actually moved a partition. So, download gparted, look at it, figure out how it can let me do that, make sure I ask it to count by cylinders so it doesn't leave gaps, be puzzled at why it doesn't let me enter fractions for MiB sizes of partitions, and work around it in some other way. And so I finally have those two software raid /boot-wearing partitions where I want them - sitting right behind this new BIOS partition.
I create a new partition in fdisk (which is what I'm used to), but I can't actually set the partition type to EF02, which has four characters where I expect two. But really that is what BIOS BOOT should be.
And now the internet tells me I need to set some flag on it using a tool called parted - some flag called bios_grub. Except when I type that magical command that sets the flag, it tells me it can't exist:
[root@otto ~]# parted /dev/sda set 6 bios_grub
parted: invalid token: bios_grub
Flag to Invert?
Isn't this tool nicely written for only the writer of the tool instead of for human beings? Of course I don't know this when it barfs this at me, but at the end of this story I figured a bunch of things out that this tool could have told me.
You see, invalid token just means that it doesn't accept the flag named bios_grub. I know this because I'm a programmer so I know the programmer used a token parser - a thing normal people shouldn't have to know about. What's that you're asking? Flag to Invert? How about the Belgian flag, I would quite like to see the colors go in the opposite direction. No, that's a prompt to choose a different flag to invert than bios_grub. Apparently bios_grub is a flag, not a setting, and I'm trying to invert it, instead of setting it. Can you tell me what flags you do know about, dear parted ?
(parted) help set
set NUMBER FLAG STATE change the FLAG on partition NUMBER
NUMBER is the partition number used by Linux. On MS-DOS disk labels,
the primary partitions number from 1 to 4, logical partitions from 5
onwards.
FLAG is one of: boot, root, swap, hidden, raid, lvm, lba, hp-service,
palo, prep, msftres, bios_grub, atvrecv, diag, legacy_boot
STATE is one of: on, off
Wait, what ? You do know about bios_grub ? But you don't let me set it ?
I seriously spent 30 minutes on trying to figure that one out.
In the end, it's because a) I should run gdisk b) parted won't let you set that flag on a normal MBR drive c) gdisk should convert to using GPT and d) the messages gdisk prints by default are SUPER scary and the docs say that this is intentional to keep away stupid Windows users (I am not making this shit up). Well, that's why I use software RAID, isn't it ? How about we take our chances, dive in deep, and let this gdisk thing do the conversion to GPT on the first disk. Gulp.
OK, I got lucky. That actually worked. I can now create this partition, with the proper flag set. While I'm at it, why don't we try this 'sort partitions' option in gdisk so that this new partition, which is now at the start, but listed as number 4 out of 4, shows up as number 1. Sure, it will renumber all other partitions, but let's just hope that most things use UUID's and labels and what not by now, and if not I should be able to figure things out.
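In hindsight, the whole dance boils down to two steps: convert the disk label to GPT, then set the flag, with the STATE argument that `help set` asks for. The sketch below only prints the commands. It swaps interactive gdisk for sgdisk, its scriptable sibling, and assumes sgdisk is installed; the function name `gpt_bios_boot_cmds` is made up, the BIOS boot partition is assumed to exist already, and there had better be a backup (or a second RAID disk) before any of this runs for real.

```shell
#!/bin/sh
# Sketch: print the commands to convert an MBR disk to GPT and mark one
# partition as the BIOS boot partition. Nothing is executed here.
# Assumptions: sgdisk is available, the partition already exists, and
# backups exist.

# gpt_bios_boot_cmds DISK PARTNUM (hypothetical helper)
gpt_bios_boot_cmds() {
    disk=$1
    part=$2
    # Convert the MBR partition table to GPT in place.
    echo "sgdisk --mbrtogpt $disk"
    # On a GPT label parted accepts bios_grub; note the explicit "on".
    echo "parted -s $disk set $part bios_grub on"
    # Show the result so the flag can be checked.
    echo "parted -s $disk print"
}
```

Running `gpt_bios_boot_cmds /dev/sda 6` prints the three commands for the disk and partition number used above. If the partitions are sorted afterwards, the number changes, so the flag is best checked again with the print command.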
In what feels like Day 5 in a two-day weekend, the system now boots! I actually see a new grub (wait, why is that text-mode only again ? Fedora guys, you spent years to make everything look graphical, because that was some huge important feature (that mostly got in my way when it took longer than it was supposed to and I had no way to see why except reboot and remove quiet and rhgb from the options), and now you suddenly let grub2 take that back from you? Show us some spine, please), and the system shows me plymouth again. Until it doesn't anymore, and drops me into a terminal screen.
Hurdle number four. Can you guess what it is ? Go on, take a stab. If you've updated your system, I'm sure you know the answer. I'll give you some whitespace to think about it...
SELinux. Riding in to relabel my file system to save it from the evil people out there. And sure, it warns me. This may take a long time. And then it proceeds to throw asterisks in my face. Lots of asterisks. It's not the first time this happens. But every time it does, I cannot help but wonder one thing.
Who thought it was a great idea to throw asterisks at the user? How many asterisks am I supposed to expect? Never mind that you can't actually count them unless you glue your eyeball to the screen, because there are so many they actually scroll off at the top. You know, if you squint hard enough, you can see the maniacally laughing face of the programmer who thought this was a nice way of showing progress. Never mind that tools like fsck can show a progress bar that actually means something (if you trick it into sending data to file descriptor 0) in a sensible way - one line on the console, and visible progress towards an end goal of 100%.
If only I could guess what a long time is going to end up being. Is it a 'get a drink' amount of time? Or 'watch some Dexter'? Or nookie time? Or, get the hell out of the house and do all the shopping for the next three hours because there's no way you'll be doing anything useful with this system for that long?
So I do all of those things, twice, and one even four times (I won't tell you which but I ended up having to pee a lot), and I come back, and the system has rebooted, and there's actually a GUI asking me to log in.
You know, this Fedora 16 better be frigging spectacular after this six day weekend.
I log in, follow my standard upgrade checklist, try out some of my tools. Media keys don't seem to work as before for my prototype music player (it flashes a nasty forbidden sign at me), and even though I set up to have nothing happen on inserting audio CD's (because my LEGO robot is inserting CD's into an external drive about fifty times a day), Rhythmbox craps on and FORCES me to select which of the many CD's with exactly the same name that audio CD might be. So, par for the course so far.
Maybe a reboot will fix that, it may not know about those settings until I have everything installed and upgraded. And if I reboot, I'd better convert my second drive to GPT and fix my /boot and set that flag and all that. So I do. And for some reason I can't figure out how to tell software raid that sda2 and sdb2 (which are both still perfectly mountable as ext file systems and were part of the previous RAID-1 /boot array before I resized them) really are a software raid. So there's this point where I've wasted more time on trying that than it would have taken me to actually manually type every byte on that /boot partition, and I just give up and recreate a software raid on those two partitions and copy stuff over.
And then I reboot. And won't you know it. Effing goddamn selinux relabel all over again. In fact, this way too long entry was typed completely in less than half the time selinux took to complete some work it had already done an hour ago.
I better have a working system after this last relabel finishes. Now excuse me while I go make some comfort food, potatoes and beans and runny eggs with butter sauce. I'm going to eat it while my good friend Dexter comes back from a long holiday. It's the only thing that is going to get me out of this weekend funk. And you know who I will be thinking about every time my friend Dexter tells me of a problem he solved...