Last Wednesday afternoon, when I tried to empty my MacBook Pro’s Trash, I got an error message. The message said that an error had occurred, and gave me an error code: -36. This disturbed me greatly, since, as an old-time Mac developer, I know that -36 is a serious error. In old Mac (Carbon and pre-Carbon) header files, this error was described as “I/O error (bummers)”, and that’s usually what it means — some serious general error has occurred.
My MacBook Pro is not quite stock; I have replaced its internal SuperDrive with a 120 GB solid-state drive (SSD), leaving the 500 GB internal hard drive in place. This hardware configuration would affect my efforts to repair the damage. Lest you think I have a truly weird setup, consider the fact that Apple currently sells two Macs — the MacBook Air and the server configuration of the Mac mini — that also lack an optical drive.
Concerned for the health of my disk, I fired up Disk Utility and had it check my startup disk. It found several minor issues, and one that it described as an “invalid sibling link.” At this point, I went into recovery mode. I shut down my MacBook Pro, restarted it in FireWire Target Disk Mode, and plugged it into the Mac Pro I have at work.
The first thing I did after connecting the MacBook Pro as a disk to the Mac Pro was to run Disk Utility again and confirm that the SSD was, in fact, corrupted, and that the hard disk was not. I immediately started making a disk image of the SSD, and while that was running, considered my situation.
I back up my laptop to an external hard drive using Time Machine, but thinking back on it, I realized that the last time I plugged in the MacBook Pro was Sunday night, so my backup was about 60 hours old. I had written a bunch of code Monday and Tuesday, but it was all in a source control system on a server, and thus safe. Most of my email is server-based (Exchange at work and Gmail for personal stuff), but I still have two POP accounts that I read using Eudora, and that mail is stored locally. So there was definitely mail that I needed to recover. There were also a few other minor things that I had done on my MacBook Pro since the last backup (new entries in Address Book, and so on) that would be annoying, but not disastrous, to lose.
When the disk image was complete, I locked the resulting file, mounted it, and had Disk Utility check the image. It had the same “invalid sibling link” that the original disk did — no surprise there. I then made another image of just the Users folder, locked it, and checked it with Disk Utility — it was fine.
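For readers who prefer Terminal, the image-then-lock step can be approximated with tools that ship with Mac OS X. This is only a sketch of a command-line counterpart, not what I actually ran (I used Disk Utility); the function names and the dry-run switch are my own invention for illustration. Note that hdiutil verify checks the image file's checksum, which is not the same as Disk Utility's check of the file system inside the image.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers wrapping hdiutil and chflags.

# Print the command instead of running it when DRY_RUN is set,
# so the steps can be reviewed before touching any disk.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# image_and_lock SOURCE_FOLDER IMAGE_PATH
# 1. create a read-only image (UDRO) of the folder,
# 2. set the user-immutable flag (what the Finder shows as "Locked"),
# 3. verify the image file's checksum.
image_and_lock() {
    src=$1
    img=$2
    run hdiutil create -srcfolder "$src" -format UDRO "$img" &&
    run chflags uchg "$img" &&
    run hdiutil verify "$img"
}
```

Running it with DRY_RUN=1 first, for example DRY_RUN=1 image_and_lock /Volumes/SSD/Users /Volumes/Scratch/users.dmg (both paths are placeholders), prints the three commands without executing them.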
At this point I could see three options:
Repair the disk. This would be the simplest and easiest way to get back to where I was when the problem cropped up, but it’s not always possible.
Restore from the Time Machine backup, and then lay recent changes from the disk images on top of the restored system. This would take a bit of effort, but I was pretty sure that it would work.
Erase the disk and install a fresh version of Mac OS X. Then use Migration Assistant to copy my applications, data, and configuration files from the Time Machine backup, and then manually recover recent files from the disk images. Again, this would likely work, but it seemed like the most effort. (I considered using Migration Assistant to recover everything from the backup disk image, but the fact that it had corruption made me worry that something would be messed up in the process.)
At this point, it was 10 PM, so I copied the two disk images I made onto the MacBook Pro’s internal hard disk and went home. Once there, I started the MacBook Pro up in FireWire Target Disk Mode again, and connected it to an iMac. I ran Disk Utility and checked again — still bad. I told it to repair the disk. It spun for several minutes, and announced that it was unable to repair the disk. I said some bad words, and went to bed.
Thursday, I had to teach an all-day class at work, so I fired up DiskWarrior and started it on the disk before I left. It ran for about 30 minutes before I had to leave, and was still going when I came back 9 hours later. I took this as a sign that I needed to update my copy of DiskWarrior from 4.2 to 4.3. The new version took only about 2 hours to run, but was not able to repair the damage. I googled “invalid sibling link”, and found several suggestions that the command-line tool fsck_hfs would be able to fix it. I tried that, with no dice either. So my first option, repairing the disk, wasn’t going to happen. On to restoring from Time Machine.
The recommended way to restore from a Time Machine backup is to boot from a Mac OS X Install DVD, erase the destination disk with Utilities > Disk Utility, and then choose Utilities > Restore from Time Machine. I couldn’t do this, having replaced my SuperDrive with the SSD that was having the problem. Then I remembered that the Intel-based Macs can boot from USB, so I put the contents of a Snow Leopard Install DVD onto a 16 GB USB flash drive.
Unfortunately, my MacBook Pro wouldn’t boot from the USB drive: I could pick the volume if I held down the Option key at boot, but it would never progress past the gray screen. Holding down v (for a verbose boot) or s (to boot into single user mode) didn’t work either. Flummoxed, I installed Snow Leopard onto the MacBook Pro’s internal hard disk as well, but it never got past the gray screen. At this point, I was starting to wonder if there was something else wrong with the MacBook Pro, so I stepped back and did other stuff for a couple of hours to clear my head. I could have removed the SSD, reinstalled the SuperDrive, and replaced the internal hard drive with the SSD such that I could have booted from DVD and restored to the SSD, but I decided to hold that out as a last resort.
After lunch and some yard work, I had an “Aha!” moment. I had put 10.6.0 onto the USB drive (and the internal hard drive), and my 2010 MacBook Pro shipped after Snow Leopard came out. It’s not at all unusual for Macs to refuse to boot from versions of Mac OS X that predate them. I dug out my MacBook Pro’s original discs, noted that they were 10.6.3, and put that onto the USB drive, plugged it into the USB port, and turned the Mac on. Lo! It booted! (And I was very relieved that there wasn’t some more serious problem.)
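The rule of thumb here (an installer older than the system the Mac shipped with probably won't boot it) comes down to comparing two dotted version numbers. Here is a minimal sketch of that comparison in plain sh; the function names are hypothetical, and treating the shipped version as the minimum bootable version is my assumption, not an Apple-documented check.

```shell
#!/bin/sh
# version_lt A B: succeeds (exit 0) when dotted version A is older than B.
# Components are compared numerically, left to right; a missing
# component counts as 0, so 10.6 is treated like 10.6.0.
version_lt() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        x=${a%%.*}
        y=${b%%.*}
        x=${x:-0}
        y=${y:-0}
        if [ "$x" -lt "$y" ]; then return 0; fi
        if [ "$x" -gt "$y" ]; then return 1; fi
        # Drop the component just compared.
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 1   # versions are equal
}

# check_installer INSTALLER_VERSION SHIPPED_VERSION
# Warns when the installer predates the system the Mac shipped with.
check_installer() {
    if version_lt "$1" "$2"; then
        echo "installer $1 predates shipped $2: probably will not boot"
    else
        echo "installer $1 is at least $2: should be worth trying"
    fi
}
```

With the numbers from my case, check_installer 10.6.0 10.6.3 reports that the installer predates the machine, which is exactly the trap I fell into.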
Now I could proceed. I erased the SSD and checked it. Still corrupted. How can that be? It was empty! After more googling, and a suggestion from OWC tech support, I learned that I had to repartition the drive to get Disk Utility to write a fresh volume onto it. Then I was able to restore from Time Machine, which took 2 hours. Once the restore had finished, I checked again with Disk Utility and the disk was fine. I rebooted, logged in, and verified that there were no problems in Disk Utility once again. I may be paranoid, but I was also almost done.
The final task was to find the files on my disk image that were newer than my Time Machine backup. I used the find command-line tool. First, I mounted the locked disk image that I made Wednesday night, fired up Terminal, and typed:
find /Volumes/SSD/Users -mtime -5
That printed out the paths of all the files with a modification date less than five days before (by this point, it was Friday night). I ended up with a long list, most of which I could ignore (files in my Safari cache, for example). However, the find command found all the modified files in my Eudora Folder (which I copied over in its entirety), iChat transcripts, a few files in my Downloads folder, and a few photos in my iPhoto Library. I examined those by hand, and copied them into the right places on my laptop, and all was well again.
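In hindsight, the -mtime -5 search could be tightened in two ways: compare against the actual time of the last backup instead of a round number of days, and skip cache folders up front. The following is a sketch of that variation, not the command I ran; the function name is hypothetical, and it assumes the caches worth ignoring live under a Library/Caches folder.

```shell
#!/bin/sh
# changed_since ROOT STAMP
# Lists regular files under ROOT modified after STAMP, where STAMP uses
# the touch -t format ([[CC]YY]MMDDhhmm). Anything under a
# Library/Caches folder is skipped.
changed_since() {
    root=$1
    stamp=$2
    # Temporary reference file whose modification time is set to STAMP.
    ref=$(mktemp "${TMPDIR:-/tmp}/backup-ref.XXXXXX") || return 1
    touch -t "$stamp" "$ref"
    # -newer matches files modified more recently than the reference file.
    find "$root" -type f -newer "$ref" ! -path '*/Library/Caches/*'
    status=$?
    rm -f "$ref"
    return $status
}
```

It would be called as changed_since /Volumes/SSD/Users followed by the date and time of the last Time Machine backup in that touch format. The output still needs the same by-hand review; this only shortens the list.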
At this point, I had been without my laptop for 48 hours, and I had about 600 email messages waiting for my attention. But in case my experience can be of use to others, I thought I would recap what I had learned.
Having a backup is vital, even if it is not completely up to date. I wish I had plugged in my Time Machine disk more recently, but at least I had a more-or-less recent backup that put a floor under what I would lose. The way Lion will allow Time Machine backups to continue even when the destination disk isn’t present might have saved me, as would a backup program like CrashPlan that backs up constantly to an offsite destination. Backups should not rely on manual intervention.
When faced with potential data loss, Don’t Panic! Instead, stop, take a breath, and think about what you can do, and what data is most important. For me, the important data were those last few days of email and my set of highly configured applications.
Don’t do anything that’s even potentially destructive without first making a copy. If things don’t go your way, or you inadvertently do something wrong, it’s a lot easier to cheerily say “Oops!” if you are working on a copy instead of on the “one and only.”
Having another Mac with lots of disk space is a huge advantage. At the least, you may need access to Google to search for solutions to the problems you’re experiencing. FireWire Target Disk Mode is also a boon, if both of your Macs have FireWire, so make sure you have a FireWire cable around to use it.
If you try something, and it doesn’t work, attempt to figure out why rather than just moving on. If I hadn’t realized why my attempts to install a working system on the MacBook Pro were failing, I might have wasted a lot of effort moving on to rebuilding the contents of the SSD from scratch.
Having an independent way to boot your Mac is key — I’ve ordered an 8 GB flash drive that I’ll put a bootable system on, and carry it in my backpack from now on. Although Lion will have a recovery partition, that won’t help if the drive itself goes south, whereas my USB flash drive will always work.