The following took place more than a year ago, but it is still fresh in my mind. After a few colleagues urged me to write about it, I finally decided to do it. If the output of the commands does not exactly match what I saw at the time, bear with me. It’s far from being the point.
I took a break from work last year and decided to go and have some fun in NZ. Oh, did I have fun there!
There’s nothing more frustrating than returning to work, turning on your dusty computer and witnessing the following:
*** An error occurred during the filesystem check. Give root password for maintenance (or type Control-D to continue):
Investigating a bit more, I came to the conclusion that my /home was not mounting. OMG!!! All of my personal customizations and some private data live in /home!
I must admit, it was nothing that couldn’t be reproduced in a reasonable amount of time, but having my neat KDE customizations, I didn’t want to start the process from the beginning. Think about losing your own /home; it’s no fun. I decided I wanted it back.
OK, so I’m running e2fsck:
# e2fsck /dev/sda5
e2fsck 1.39 (29-May-2006)
e2fsck: No such file or directory while trying to open /dev/sda5
The superblock could not be read or does not describe a correct ext2
filesystem. If the device is valid and it really contains an ext2
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
e2fsck -b 8193
#
Oh man, how frustrating, e2fsck can’t read my superblock!
Something I did notice during boot was that the HD was very noisy, on top of a very slow boot process. It wasn’t a new HD and it had worked hard (I had tested our video recording FS infrastructure on it numerous times). Probably its time had come.
I wanted to feel at /home again.
I decided the smartest thing would be to first copy the whole partition aside, as I knew there was a hardware problem with the HD. Once that was solved, I could hopefully handle the missing superblock problem much better.
So I quickly installed a fresh new HD in my machine, disconnected the old faulty HD (its defects made the computer boot very slowly) and ran a network install.
15 minutes later I was back in Linux, with a bare new /home and the faulty HD connected again, slowing the computer down like hell.
I was sure dd would come to the rescue:
# dd if=/dev/sda5 of=home-sweet-home.ext3 bs=4M
After a few minutes of anticipation and cranky HD noises, I was left with:
dd: reading `/dev/sda5': Input/output error
0+0 records in
0+0 records out
# ls -l /home/home-sweet-home.ext3
-rw-r--r-- 1 root root 0 Jul 10 08:26 home-sweet-home.ext3
Great :(. I searched the net for an aggressive dd program, something that instead of giving up on bad sectors would fill them with zeros and carry on (hoping the defects on the HD were confined to a very specific place). I must admit I almost wrote something myself, but finally I found dd_rescue.
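For the curious, the idea behind such an “aggressive dd” is simple enough to sketch in a few lines of Python. This is only an illustration of the zero-fill approach under my own assumptions, not how dd_rescue works internally; the function name rescue_copy and the 4 KiB block size are made up for the example.

```python
import os


def rescue_copy(src_path, dst_path, block_size=4096):
    """Copy src to dst block by block, writing zeros where a read fails.

    Returns (good_blocks, zero_filled_blocks). Hypothetical helper,
    not part of dd_rescue or any other real tool.
    """
    good = bad = 0
    src = os.open(src_path, os.O_RDONLY)
    try:
        # Seeking to the end yields the size of a regular file or a block device.
        size = os.lseek(src, 0, os.SEEK_END)
        with open(dst_path, "wb") as dst:
            for offset in range(0, size, block_size):
                want = min(block_size, size - offset)
                try:
                    data = os.pread(src, want, offset)
                except OSError:
                    data = b""  # unreadable block (e.g. Input/output error)
                if len(data) == want:
                    good += 1
                else:
                    # Failed or short read: pad with zeros so that every
                    # later byte keeps its original offset in the image.
                    data = data.ljust(want, b"\0")
                    bad += 1
                dst.write(data)
    finally:
        os.close(src)
    return good, bad
```

Called as rescue_copy("/dev/sda5", "/home/home-sweet-home.ext3"), it would return how many blocks were read cleanly and how many had to be zero-filled. A real rescue tool is far smarter about retrying and about how much it reads around a bad spot.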
And off we go:
# dd_rescue -e 1 -A /dev/sda5 /home/home-sweet-home.ext3
It ran for hours! dd_rescue had 65GB to tackle, and with a dying HD that can take a lot of time. After roughly 8 hours I was back at my desktop, looking at my old home:
# ls -lh /home/home-sweet-home.ext3
-rw-r--r-- 1 root root 61G Jul 10 20:43 home-sweet-home.ext3
#
OK, that’s it, I have my data. Time to dump the old HD and deal with the logical errors I still had in this partition dump. Mounting the dump gave me the same result as I pasted above: no superblock, no fun!
Oh! But ext3 always creates a few backup superblocks; maybe this was my lucky day, when I would finally get to use one of these backups. You are probably familiar with the following output:
# mke2fs /tmp/e2fsck-test
...
27 block groups
8192 blocks per group, 8192 fragments per group
1992 inodes per group
Superblock backups stored on blocks:
8193, 24577, 40961, 57345, 73729, 204801
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
...
Now go figure out where your backup superblocks are. Trying the obvious 8193 and 32768 did not work for me. I knew there should be more backup superblocks. Google to the rescue again. At this point I was also quite close to writing a small C program that would search the partition dump for ext3 superblock signatures and tell me where the backup superblocks are. But then again, I figured I was probably not the first one to need such a utility, and here TestDisk came to the rescue.
I simply ran TestDisk, which revealed the remaining trustworthy superblocks on my damaged filesystem.
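Such a signature search can be sketched in Python (rather than C, to keep it short). The sketch assumes the standard ext2/ext3 on-disk layout: the 16-bit magic 0xEF53 at byte 56 of the superblock, s_log_block_size at byte 24, s_block_group_nr at byte 90, and superblocks starting on 1 KiB boundaries. The function name find_superblocks is hypothetical, and this is not a description of how TestDisk does it.

```python
import struct

EXT_MAGIC = 0xEF53      # value of s_magic in ext2/ext3 superblocks
MAGIC_OFFSET = 56       # byte offset of s_magic within the superblock
LOG_BS_OFFSET = 24      # byte offset of s_log_block_size
GROUP_NR_OFFSET = 90    # byte offset of s_block_group_nr
STRIDE = 1024           # candidate superblocks start on 1 KiB boundaries


def find_superblocks(image_path, chunk_size=1 << 20):
    """Yield (byte_offset, block_size, group_number) for plausible superblocks.

    chunk_size must be a multiple of STRIDE so candidates stay aligned.
    """
    with open(image_path, "rb") as img:
        base = 0
        while True:
            chunk = img.read(chunk_size)
            if not chunk:
                break
            # Only look at positions with all inspected fields inside the chunk.
            for pos in range(0, len(chunk) - (GROUP_NR_OFFSET + 1), STRIDE):
                (magic,) = struct.unpack_from("<H", chunk, pos + MAGIC_OFFSET)
                if magic != EXT_MAGIC:
                    continue
                (log_bs,) = struct.unpack_from("<I", chunk, pos + LOG_BS_OFFSET)
                if log_bs > 6:
                    continue  # implausible block size, likely a false positive
                (group,) = struct.unpack_from("<H", chunk, pos + GROUP_NR_OFFSET)
                yield base + pos, 1024 << log_bs, group
            base += len(chunk)
```

Each hit is reported as a byte offset into the dump; dividing it by the reported block size gives the filesystem block number of that superblock copy. Any match is only a candidate, since file contents can happen to contain the same two bytes at the right spot.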
Later on I discovered that it is also possible to run mke2fs on a partition of the same size and see where the superblocks get written. However, I think that probing for the superblocks is much cleaner altogether.
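The backup locations can also be computed directly. Here is a minimal sketch, assuming the filesystem was created with the sparse_super feature (backups only in block group 1 and in groups whose number is a power of 3, 5 or 7) and that the first data block is 1 for 1 KiB blocks and 0 for larger ones. The function names are made up for the example. The last helper reflects a detail from the mount(8) man page: the sb= option is given in 1 KiB units, whatever the filesystem block size is.

```python
def backup_groups(num_groups):
    """Block groups that hold a backup superblock under sparse_super."""
    groups = {1} if num_groups > 1 else set()
    for base in (3, 5, 7):
        g = base
        while g < num_groups:
            groups.add(g)
            g *= base
    return sorted(groups)


def backup_superblocks(num_groups, blocks_per_group, first_data_block):
    """Filesystem block numbers of the backup superblocks."""
    return [g * blocks_per_group + first_data_block
            for g in backup_groups(num_groups)]


def mount_sb_value(fs_block, block_size):
    """Convert a filesystem block number to the 1 KiB units used by mount's sb=."""
    return fs_block * block_size // 1024
```

For example, backup_superblocks(27, 8192, 1) gives 8193, 24577, 40961, 57345, 73729 and 204801, the same list as in the output pasted above. With 4 KiB blocks and 32768 blocks per group, group 27 starts at block 884736.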
# mkdir /home2
# mount -o loop,ro,sb=884736 /home/home-sweet-home.ext3 /home2
#
Did it work?
# ls -l /home2
drwxr-xr-x 41 dan dan 8192 Feb 5 13:39 dan
drwx------ 7 distcc distcc 88 Jul 26 15:24 distcc
drwxrwxrwx 2 nobody nobody 1 Feb 5 05:39 public
#
Wow, it did!
So how much of it ended up damaged? Less than 1%!
I found a few garbled files I didn’t need anyway, but I was more than 99% back at /home.
Needless to say, ever since then I have been backing up my /home strictly. But this is obviously not the point.
The simplicity of Linux in general and ext2/3 in particular is something we should adore. I wouldn’t want to imagine what would have happened on a different OS or a different filesystem (and please don’t start a flame war about it now…).