squeak troubles

I just had a nasty experience that was compounded by a number of factors. Here's what i figured out had happened:
i put squeak (my computer) to sleep so i could go out for the evening last night at around 18:30. what i didn't notice was that /var was full, and mysqld was choking, waiting for disk space to get freed up. Somehow, this interrupted my sleep request, and squeak stayed awake (though unplugged). for some reason, squeak also decided to try to use the wireless card at about 22:30, and couldn't get the connection he wanted. Since squeak was awake and unplugged, he ended up draining both batteries, and ran out of juice at 3:30 the next morning (on a positive note, that's 9 hours of battery life!).
Meanwhile, i've been tracking etch (testing) pretty closely, and that means that while i'm running kernel 2.6.12, a new version of the kernel (2.6.15) is prepped and ready for squeak, and the new tools for creating a different kind of initial ramdisk (initramfs-tools) snuck into the system along with 2.6.15.
However, squeak uses an encrypted root volume (using cryptsetup), and initramfs-tools doesn't support that yet. Nonetheless, version 0.53 of initramfs-tools actually had the nerve to wipe out my functioning initrd for kernel 2.6.12 and replace it with an initrd that was not capable of handling cryptoroot.
This meant that when i booted squeak after the battery drainage, i couldn't even get a root filesystem mounted, because the initrd i was using was thoroughly incapable of kickstarting dmcrypt and the associated kernel stuff.
Fortunately, i had a spare old copy of the initrd lying around, and i just swapped it in for the new, broken one. the old copy had its own problems (like prompting for a passphrase for my swap partition, which should be pulled from /dev/random at boot time instead), but it could at least boot the machine.
Unfortunately, when it booted, it hung while trying to initialize the mysql server, presumably because /var was still full (i didn't know /var was full at this point). This is double-extra-bad because it means that a full /var on a machine with mysqld on it basically cannot boot. i need to file this as a bug... So, lastly, i booted into recovery mode, made mysqld not start up automatically at boot, and rebooted one more time, only to see the postgresql database daemons also fail to start. However, the postgresql initscripts were better written than the mysql initscripts, and they didn't hang the whole boot. I booted successfully, figured out that /var was full, made some room on it, and got back to a basically functional situation.
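To make the "don't hang the boot" idea concrete, here is a minimal sketch of the kind of guard a startup script could apply: check for free space first, and never wait on the daemon forever. It is written in Python purely for illustration (real initscripts are shell), and it is not how the mysql or postgresql scripts actually work. The names `has_room` and `start_with_timeout`, the 50 MiB threshold, and the 30 second timeout are all hypothetical.

```python
import shutil
import subprocess

# Hypothetical threshold: refuse to start a database with less than 50 MiB free.
MIN_FREE_BYTES = 50 * 1024 * 1024


def has_room(path, min_free=MIN_FREE_BYTES):
    """Return True if the filesystem holding `path` has at least `min_free` bytes free."""
    return shutil.disk_usage(path).free >= min_free


def start_with_timeout(cmd, data_dir="/var", min_free=MIN_FREE_BYTES, timeout=30):
    """Run `cmd` unless `data_dir` is too full, and give up after `timeout` seconds.

    Returns one of "skipped", "timed out", "ok", or "failed", so the caller
    (the boot sequence) always gets an answer instead of blocking.
    """
    if not has_room(data_dir, min_free):
        return "skipped"
    try:
        result = subprocess.run(cmd, timeout=timeout)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        return "timed out"
    return "ok" if result.returncode == 0 else "failed"
```

With something like this, a full /var would produce a "skipped" result and a log line rather than a machine that cannot finish booting.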
not a good way to spend a couple hours, however. And it leaves me with (at least) two bugs to file:
- mysql initscripts should not fail to return if there is no room in /var. something more sophisticated is needed.
- initramfs-tools should not try to make an initrd if it sees that root is on a crypto device. It's one thing to say that it can't make an initrd for a cryptoroot system yet. It's entirely another thing to clobber a functional initrd and replace it with a non-functional one.
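The second request can be sketched roughly as follows. This is not initramfs-tools code (which is shell), just an illustration in Python of the two behaviors i'd want: refuse to replace the initrd when root looks encrypted, and keep a backup of the old image otherwise. The function names, the `.bak` suffix, and the `/dev/mapper/` heuristic are all assumptions made for this example; in particular, LVM volumes also live under `/dev/mapper/`, so the check errs on the side of caution.

```python
import os
import shutil


def root_device(mounts_text):
    """Return the device mounted at "/" in /proc/mounts-style text, or None."""
    device = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "/":
            # Keep the last match: a placeholder "rootfs" entry may come first.
            device = fields[0]
    return device


def looks_like_cryptoroot(device):
    """Heuristic (assumption): device-mapper volumes appear under /dev/mapper/."""
    return device is not None and device.startswith("/dev/mapper/")


def replace_initrd(new_path, target_path, mounts_text):
    """Install a new initrd unless root looks encrypted; keep the old one as .bak.

    Returns True if the initrd was replaced, False if it was left alone.
    """
    if looks_like_cryptoroot(root_device(mounts_text)):
        return False
    if os.path.exists(target_path):
        shutil.copy2(target_path, target_path + ".bak")
    shutil.copy2(new_path, target_path)
    return True
```

A caller would pass in the contents of /proc/mounts as `mounts_text`. Either branch leaves a bootable image on disk, which is the whole point.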
On the plus side, when i get this sorted out, it'll be easier to move to kernel 2.6.15. gah...

jesus, dude. that sucks. this is the kind of thing that makes me really nervous about pushing forward with a system like yours. too many interconnected systems that can fail. i'm really impressed that you got it all up and working again in such short order, though. very very smart of you to keep a backup initrd around. very smart indeed.
and you definitely need to submit a bug to the initrd tools (and the mysql). both of their behaviors are unacceptable as far as i can tell. critical functioning parts of a system should NEVER be removed without suitable warning to the admin. especially something as critical as the initrd which controls booting of the system altogether.