Recovering from Disk Corruption Without a SuperDrive
Last Wednesday afternoon, when I tried to empty my MacBook Pro’s Trash, I got an error message. The message said that an error had occurred, and gave me an error code: -36. This disturbed me greatly, since, as an old-time Mac developer, I know that -36 is a serious error. In old Mac (Carbon and pre-Carbon) header files, this error was described as “I/O error (bummers)”, and that’s usually what it means — some serious general error has occurred.
My MacBook Pro is not quite stock; I have replaced its internal SuperDrive with a 120 GB solid-state drive (SSD), leaving the 500 GB internal hard drive in place. This hardware configuration would affect my efforts to repair the damage. Lest you think I have a truly weird setup, consider the fact that Apple currently sells two Macs — the MacBook Air and the server configuration of the Mac mini — that also lack an optical drive.
Concerned for the health of my disk, I fired up Disk Utility and had it check my startup disk. It found several minor issues, and one that it described as an “invalid sibling link.” At this point, I went into recovery mode. I shut down my MacBook Pro, and restarted it in FireWire Target Disk Mode, and plugged it into the Mac Pro I have at work.
The first thing I did after connecting the MacBook Pro as a disk to the Mac Pro was to run Disk Utility again and confirm that the SSD was, in fact, corrupted, and that the hard disk was not. I immediately started making a disk image of the SSD, and while that was running, considered my situation.
I back up my laptop to an external hard drive using Time Machine, but thinking back on it, I realized that the last time I plugged in the MacBook Pro was Sunday night, so my backup was about 60 hours old. I had written a bunch of code Monday and Tuesday, but it was all in a source control system on a server, and thus safe. Most of my email is server-based (Exchange at work and Gmail for personal stuff), but I still have two POP accounts that I read using Eudora, and that mail is stored locally. So there was definitely mail that I needed to recover. There were also a few other minor things that I had done on my MacBook Pro since the last backup (new entries in Address Book, and so on) that would be annoying, but not disastrous, to lose.
When the disk image was complete, I locked the resulting file, mounted it, and had Disk Utility check the image. It had the same “invalid sibling link” that the original disk did — no surprise there. I then made another image of just the Users folder, locked it, and checked it with Disk Utility — it was fine.
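A rough command-line equivalent of these imaging steps might look like the sketch below. The device node, the image path, and the run and image_and_lock helper names are all assumptions for illustration; check the real device node with diskutil list before trying anything like this. Setting DRY_RUN=1 prints the commands instead of running them, which is a cheap way to review them first.

```shell
# Sketch only: image a volume, lock the image file, and mount it read-only.
# Device node, paths, and helper names are hypothetical.
# Set DRY_RUN=1 to print each command instead of executing it.
run() {
    if [ -n "$DRY_RUN" ]; then echo "$*"; else "$@"; fi
}

image_and_lock() {
    src_dev=$1   # e.g. /dev/disk2s2 (hypothetical; confirm with `diskutil list`)
    image=$2     # e.g. /Volumes/Work/ssd-backup.dmg (hypothetical)
    # Create a compressed image from the device.
    run hdiutil create -srcdevice "$src_dev" -format UDZO "$image" &&
    # Set the user-immutable flag, which the Finder shows as "Locked".
    run chflags uchg "$image" &&
    # Mount the image read-only so it can be checked without being changed.
    run hdiutil attach -readonly "$image"
}
```

Once the image is mounted, it can be checked in Disk Utility just like a physical disk.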
At this point I could see three options:
- Repair the disk. This would be the simplest and easiest way to get back to where I was when the problem cropped up, but it’s not always possible.
- Restore from the Time Machine backup, and then lay recent changes from the disk images on top of the restored system. This would take a bit of effort, but I was pretty sure that it would work.
- Erase the disk and install a fresh version of Mac OS X. Then use Migration Assistant to copy my applications, data, and configuration files from the Time Machine backup, and then manually recover recent files from the disk images. Again, this would likely work, but it seemed like the most effort. (I considered using Migration Assistant to recover everything from the backup disk image, but the fact that it had corruption made me worry that something would be messed up in the process.)
At this point, it was 10 PM, so I copied the two disk images I made onto the MacBook Pro’s internal hard disk and went home. Once there, I started the MacBook Pro up in FireWire Target Disk Mode again, and connected it to an iMac. I ran Disk Utility and checked again — still bad. I told it to repair the disk. It spun for several minutes, and announced that it was unable to repair the disk. I said some bad words, and went to bed.
Thursday, I had to teach an all-day class at work, so I fired up DiskWarrior and started it on the disk before I left. It ran for about 30 minutes before I had to leave, and was still going when I came back 9 hours later. I took this as a sign that I needed to update my copy of DiskWarrior from 4.2 to 4.3. The new version took only about 2 hours to run, but was not able to repair the damage. I googled “invalid sibling link”, and found several suggestions that the command-line tool fsck_hfs would be able to fix it. I tried that; no dice either. So my first option, repairing the disk, wasn’t going to happen. On to restoring from Time Machine.
The recommended way to restore from a Time Machine backup is to boot from a Mac OS X Install DVD, erase the destination disk with Utilities > Disk Utility, and then choose Utilities > Restore from Time Machine. I couldn’t do this, having replaced my SuperDrive with the SSD that was having the problem. Then I remembered that the Intel-based Macs can boot from USB, so I put the contents of a Snow Leopard Install DVD onto a 16 GB USB flash drive.
Unfortunately, my MacBook Pro wouldn’t boot from the USB drive: I could pick the volume if I held down the Option key at boot, but it would never progress past the gray screen. Holding down v (for a verbose boot) or s (to boot into single user mode) didn’t work either. Flummoxed, I installed Snow Leopard onto the MacBook Pro’s internal hard disk as well, but it never got past the gray screen. At this point, I was starting to wonder if there was something else wrong with the MacBook Pro, so I stepped back and did other stuff for a couple of hours to clear my head. I could have removed the SSD, reinstalled the SuperDrive, and replaced the internal hard drive with the SSD such that I could have booted from DVD and restored to the SSD, but I decided to hold that out as a last resort.
After lunch and some yard work, I had an “Aha!” moment. I had put 10.6.0 onto the USB drive (and the internal hard drive), and my 2010 MacBook Pro shipped after Snow Leopard came out. It’s not at all unusual for Macs to refuse to boot from versions of Mac OS X that predate them. I dug out my MacBook Pro’s original discs, noted that they were 10.6.3, and put that onto the USB drive, plugged it into the USB port, and turned the Mac on. Lo! It booted! (And I was very relieved that there wasn’t some more serious problem.)
Now I could proceed. I erased the SSD and checked it. Still corrupted. How can that be — it was empty! After more googling, and a suggestion from OWC tech support, I learned that I had to repartition the drive to get it to write a fresh volume onto the drive. Then I was able to restore from Time Machine, which took 2 hours. Once the restore had finished, I checked again with Disk Utility and the disk was fine. I rebooted, logged in, and verified that there were no problems in Disk Utility once again. I may be paranoid, but I was also almost done.
The final task was to find the files on my disk image that were newer than my Time Machine backup. I used the find command-line tool. First, I mounted the locked disk image that I made Wednesday night, fired up Terminal, and typed:
find /Volumes/SSD/Users -mtime -5
That printed out the paths of all the files with a modification date less than five days before (by this point, it was Friday night). I ended up with a long list, most of which I could ignore; files in my Safari cache, for example. However, the find command found all the modified files in my Eudora Folder (which I copied over in its entirety), iChat transcripts, a few files in my Downloads folder, and a few photos in my iPhoto Library. I examined those by hand, and copied them into the right places on my laptop, and all was well again.
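A variation on the find command above, offered as a sketch: comparing against a reference file stamped with the time of the last backup avoids guessing a day count, and pruning cache folders shortens the list. The changed_since helper name and the timestamp in the usage note are hypothetical.

```shell
# changed_since DIR STAMP
# Print regular files under DIR modified after STAMP, skipping Library/Caches.
# STAMP uses the `touch -t` format: CCYYMMDDhhmm.
changed_since() {
    # Temporary reference file whose modification time is set to STAMP.
    ref=$(mktemp "${TMPDIR:-/tmp}/changed_since.XXXXXX") || return 1
    touch -t "$2" "$ref"
    # -prune stops find from descending into cache folders;
    # -newer keeps only files modified after the reference file.
    find "$1" -path '*/Library/Caches' -prune -o -type f -newer "$ref" -print
    rm -f "$ref"
}
```

For example, changed_since /Volumes/SSD/Users 201107032200 would list files modified after 10 PM on a made-up backup date of July 3, 2011.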
At this point, I had been without my laptop for 48 hours, and I had about 600 email messages waiting for my attention. But in case my experience can be of use to others, I thought I would recap what I had learned.
- Having a backup is vital, even if it is not completely up to date. I wish I had plugged in my Time Machine disk more recently, but at least I had a more-or-less recent backup that put a floor under what I would lose. The way Lion will allow Time Machine backups to continue even when the destination disk isn’t present might have saved me, as would a backup program like CrashPlan that backs up constantly to an offsite destination. Backups should not rely on manual intervention.
- When faced with potential data loss, Don’t Panic! Instead, stop, take a breath, and think about what you can do, and what data is most important. For me, the important data were those last few days of email and my set of highly configured applications.
- Don’t do anything that’s even potentially destructive without first making a copy. If things don’t go your way, or you inadvertently do something wrong, it’s a lot easier to cheerily say “Oops!” if you are working on a copy instead of on the “one and only.”
- Having another Mac with lots of disk space is a huge advantage. At the least, you may need access to Google to search for solutions to the problems you’re experiencing. FireWire Target Disk Mode is also a boon, if both of your Macs have FireWire, so make sure you have a FireWire cable around to use it.
- If you try something, and it doesn’t work, attempt to figure out why rather than just moving on. If I hadn’t realized why my attempts to install a working system on the MacBook Pro were failing, I might have wasted a lot of effort moving on to rebuilding the contents of the SSD from scratch.
- Having an independent way to boot your Mac is key; I’ve ordered an 8 GB flash drive that I’ll put a bootable system on, and carry it in my backpack from now on. Although Lion will have a recovery partition, that won’t help if the drive itself goes south, whereas my USB flash drive will always work.
Thanks, Marshall!
Useful tips, both in general and in specific.
If I were replacing my SuperDrive with an SSD, I'd definitely spring for the external case so I had a way to boot the system from DVD. This matters much less if you always have another Mac with an optical drive and FW TDM available, but I think I'd still do it.
I had a similar issue last night. My MacBook Pro's hard drive has been showing signs of impending failure (the machine occasionally froze completely for 30 seconds when attempting a disk read). I bought a new drive and attempted to use asr to copy the old drive to the new one, mounted in a NewerTech Voyager, but kept getting read errors which caused asr to bail.
The Time Machine backup (wirelessly to a Time Capsule so it's always up-to-date) was still working. I tried booting from three different OS X DVDs, but apparently my internal SuperDrive is flaky because I never got past the gray screen. I have an external DVD drive in a Primera disk duplicator on my studio iMac. I finally thought to hook that up to the MBP through USB and booted from there.
I was able to restore from my Time Machine backup. It took about seven hours over wired Ethernet from the Time Capsule.
I never thought to copy an OS X DVD to a USB drive. If this ever happens again, I'll do that first.
You could also make a bootable clone of your startup SSD to an external USB drive or, even better, to the 500 GB internal drive. In fact, you could create two 120 GB partitions on the internal drive, one for daily backup/cloning and one for weekly backup/cloning. These backups could run automatically at, say, 6 AM every day and then sleep the machine.
In the event of a boot drive failure, your machine should then be able to start up from one of the internal clones by holding down the option key on startup and selecting one of the clones. Once started, you can reverse clone back to the original boot drive (or its replacement). You can at least test this recovery scenario before it actually happens.
Two popular cloning applications are Carbon Copy Cloner and SuperDuper! (my choice). The SuperDuper! user manual explains most of these options and you can get good support on the company's discussion forum.
Lots of issues here. First, using the SSD on the ODD bus instead of the HDD SATA (rated for 3 Gbps).
Depending on such unreliable tech (SSDs are not made by HDD companies but chip makers) and there is no TRIM for Mac (yet).
Disk utilities were made for ____. SSDs do not necessarily have the same block structure as drives do (512-byte vs 4K). You lose a block on SSD, you lose 8x the data!
Did you know there are NO companies that can recover SSDs? Kinda the cart before the horse.
Using the SSD on the "optical drive" SATA port is not an issue. He's using a 2010 MacBook Pro, thus both his ports are SATA 3 Gbps. Sure, he'll saturate the bus at 375 MBps, but that's still faster than an HDD.
Second of all, why is the fact that most SSDs are made by chip manufacturers an issue? Who else has more experience with memory of all types than folks like Kingston, Crucial, OCZ, etc?
Thirdly, drives with SandForce-based chipsets have their own hardware-based garbage collection that operates in parallel with data writes: TRIM certainly doesn't hurt, but it is NOT necessary.
As to block data loss, yes, this is true. However, newer HDDs use 4k sectors as well. All high-density HDDs that are now in the pipeline use 4k sectors. Plus, SSDs have no read/write head, thus no potential for sector loss due to head crashes.
You are correct on SSDs being less recoverable than HDDs, however. Sadly, this is just the nature of the beast because of wear-leveling and garbage collection.
Guess I'll have to disagree re using current disk utilities with SSD drives.
In another post today I described using DiskWarrior to correct a major clobbering of my SSD used as an external USB boot drive.
TechTool Pro would not mount it, nor would Disk Utility.
I have noted that after short use, the SSD will have mostly scattered files across the whole disk, even with 30 percent of the total disk free. So other than cleaning up the directory from time to time, volume optimization is not only not necessary, it would take a long time.
DiskWarrior gave a similar notice about not having enough 'free' space when cleaning up the directory, warning that loss of power while updating the directory might result in major corruption.
I'm using an OWC Mercury Extreme SSD and have had no other problems.
You can often avoid having to recover mail files from POP accounts by setting your mail program to not delete mail from the server immediately. I usually set it to be several weeks if my mailbox quota is high enough.
That way, if I do have a disk crash, all I have to do after restoring the machine and the previous mail, is to check for new mail. The newer messages will then just come down again. Unfortunately they will all be marked as unread, but better than losing them totally.
Another problem is that you don't get your outgoing messages. I have often wondered about including an automatic BCC: in every outgoing message to an archive account somewhere like Gmail, to make sure I always have access to those outgoing messages.
I get all mails filtered through the good services of Spamcop before downloading them locally (Entourage, one single file!). Then I save them in a monthly backup folder created for that purpose on Spamcop's server, which I erase every two months. Your idea about outgoing mail is a good one, but I suppose it would have to be to an ad hoc mail account, in order not to clutter an existing one.
You could have booted your iMac with the proper installer DVD and attached your laptop in FireWire Target Disk Mode and installed that way. Much easier and faster.
Another cool trick that you could have used is installing Snow Leopard FROM your iMac TO your MacBook Pro in target disk mode, while the iMac was still running OS X. This is essentially the same thing the Lion installer does, running as an application on a booted copy of OS X.
Check it out:
http://hints.macworld.com/article.php?story=20071203051319476
I'm using a 2010 Mac mini (2 GHz dual-core) and an OWC 40 GB SSD outboard in a USB case for normal boot and use with 10.6.8, and had a similar problem. Since my internal disk had a previous software version in a separate partition from my data, I was able to boot into the older version and connect the SSD via USB. Disk Utility simply would not work. TechTool Pro would not even mount the disk. I then used DiskWarrior, which in time managed to correct the headers, file names, and other somehow corrupted files.
FWIW this is the first time in nearly three decades of Mac usage that I have had such a problem that Disk Utility would not mount or handle. But being a belt and suspenders type, I am fairly religious about keeping my data/email/etc on a separate partition on my internal disk along with somewhat timely backups of my data.
I don't use Time Machine, but Intego Backup Manager allows me to make multiple incremental backups for as many days as I wish to (for my purposes, a 1 TB external disk is enough for 5-6 days, roughly). That's how I was able to recover a 5-day-old plist file to replace some printer settings that had been lost after a reset.
It's probably worth noting this article concerning SSD reliability. I'm not an owner of an SSD, but I sure want the speed improvements. However, not at the risk of data loss. I'll hold off until the failure rates improve.
http://foliovision.com/2011/06/26/memory-ssd-reliability