Code42, maker of the CrashPlan backup service for small businesses (but not individuals; see “CrashPlan Discontinues Consumer Backups,” 22 August 2017), has announced that, starting in May 2019, the service will no longer allow users to back up applications, virtual machine image files from apps like Parallels Desktop and VMware Fusion, and some backup files.
Code42 explains the change by saying that excluding these files will likely result in smaller backup archives and faster restores, syncs, and backups. That’s obvious, no? If you eliminate large files from a backup, everything will run faster (and Code42 won’t need to add storage as quickly). More troubling is the comment earlier on in the note, which says:
“We have always recommended you not include applications or large files in your selection as they may not backup correctly.”
Seriously? Code42 is actually admitting that CrashPlan may not back up large files correctly? Isn’t that Job #1 for any backup app?
Needless to say, Code42’s announcement perturbed some users, who notified us of the change. TidBITS member Peter Erbland said, “It seems like it is defeating the purpose of an offsite cloud backup in case of a catastrophic loss of data.”
I was curious about Code42’s performance claims, however, so I checked with CrashPlan competitor Backblaze (which has sponsored TidBITS in the recent past). Yev Pusin, Backblaze’s Director of Marketing, explained that Code42 was doing more than stating the obvious about smaller archives improving performance, for a few reasons:
- On initial backup or when a lot of data changes, large files take a long time to upload. That’s to be expected, of course, but what people may not realize is that smaller files can be blocked from uploading during that time, leaving them unprotected.
- After the initial upload, apps like Backblaze and CrashPlan do block-level data deduplication, which means that they analyze small blocks of each file, compare them to what’s already backed up, and copy only those blocks that are new or changed. It might seem as though large files wouldn’t present a problem after initial backup as long as they didn’t change all that much. However, as Yev pointed out, the resources necessary to analyze all the blocks in a multi-gigabyte file are significant—you need enough drive space to store a copy of the file, and then the backup app has to spend a lot more time and CPU power analyzing all those blocks.
- On the restore side of the equation, if you’ve been backing up a large VM image for months, with changes happening regularly, and then you need to restore it, you’ll hit another performance problem. That’s because the backup app needs to reassemble all those individually backed-up blocks into the current representation of the file, and the more of those there are, the longer it will take the backup servers to provide your file.
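To make the block-level deduplication described above concrete, here is a minimal Python sketch. It is not Backblaze’s or CrashPlan’s actual implementation: the tiny fixed block size, the SHA-256 hashing, the in-memory `store` dictionary, and the function names are all assumptions chosen for illustration.

```python
import hashlib
from typing import Dict, List, Tuple

BLOCK_SIZE = 4  # Unrealistically small, purely for demonstration (assumption).


def split_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Cut the file contents into fixed-size blocks."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def backup(data: bytes, store: Dict[str, bytes]) -> Tuple[List[str], int]:
    """Hash every block, upload only unseen ones.

    Returns the manifest (ordered list of block hashes) and the number of
    blocks that actually had to be stored.
    """
    manifest = []
    uploaded = 0
    for block in split_blocks(data):
        digest = hashlib.sha256(block).hexdigest()  # Every block is hashed, changed or not.
        if digest not in store:
            store[digest] = block  # Stand-in for an upload to the backup server.
            uploaded += 1
        manifest.append(digest)
    return manifest, uploaded


def restore(manifest: List[str], store: Dict[str, bytes]) -> bytes:
    """Reassemble a file version from its blocks, in manifest order."""
    return b"".join(store[digest] for digest in manifest)


store: Dict[str, bytes] = {}
version1 = b"AAAABBBBCCCCDDDD"
version2 = b"AAAABBBBXXXXDDDD"  # Only the third block differs.

manifest1, uploaded1 = backup(version1, store)
manifest2, uploaded2 = backup(version2, store)
print(uploaded1, uploaded2)
```

The second backup stores only the one changed block, but it still has to read and hash all four, which is the analysis cost Yev described for multi-gigabyte files. Likewise, `restore` has to look up every block in the manifest before it can hand back the file.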
For these reasons, Backblaze also excludes VM image files and other large file types (it also doesn’t back up system files or applications), as you can see in the app’s Exclusions screen.
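As a rough illustration of how an extension-based exclusion list might behave, consider the following Python sketch. The extensions, the paths, and the `is_excluded` function are hypothetical; this is not Backblaze’s real default list or matching logic.

```python
from pathlib import PurePosixPath
from typing import FrozenSet

# Hypothetical defaults for illustration only, not the real Backblaze list.
DEFAULT_EXCLUDED_EXTENSIONS = frozenset({"vmdk", "pvm", "hdd", "iso", "dmg"})


def is_excluded(path: str,
                excluded: FrozenSet[str] = DEFAULT_EXCLUDED_EXTENSIONS) -> bool:
    """Return True if any component of the path carries an excluded extension.

    Checking every component (not just the file name) also catches files
    stored inside a bundle such as a .pvm package.
    """
    for part in PurePosixPath(path).parts:
        extension = PurePosixPath(part).suffix.lstrip(".").lower()
        if extension in excluded:
            return True
    return False


vm_file = "/Users/me/Parallels/Windows 10.pvm/config.pvs"
shared_file = "/Users/me/Documents/Meets/spring-meet.mdb"

print(is_excluded(vm_file))      # Inside an excluded bundle.
print(is_excluded(shared_file))  # An ordinary document.

# Removing an extension from the list opts those files back in.
custom = DEFAULT_EXCLUDED_EXTENSIONS - {"pvm"}
print(is_excluded(vm_file, custom))
```

Under these assumptions, a file in an ordinary Mac folder is always backed up, while anything inside the VM bundle is skipped until its extension is removed from the list.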
So what should you do if you have a large VM image file that you want backed up? In fact, I’m in precisely this situation, since I have Parallels Desktop set up to run a Windows app called HyTek Meet Manager for managing Finger Lakes Runners Club track meets. I use Meet Manager infrequently (just a day or two six times per year), but it’s vital that its data be backed up while I’m using it so I don’t have to recreate a meet file if something bad happens.
With Backblaze at least, there are three solutions:
- Share a Mac folder with Windows in the virtual machine, and store all the essential data in that folder. This solution, which is what I do, is by far the easiest and best since it ensures that all the Windows app data is backed up just like Mac app data.
- Back up the VM image file as it exists on the drive. Although Backblaze excludes VM image file types by default, nothing prevents you from editing that exclusion list and removing the file type corresponding to your image file. However, doing so will hurt performance in the ways discussed above.
- Install the Windows version of Backblaze within the virtual machine and set it to back up the important Windows app data. Unfortunately, this approach requires buying another license for Backblaze since it thinks it’s running on another computer. It’s not a very satisfying solution unless you work in the virtual machine all the time and have a lot of data that can’t easily be stored in a shared Mac folder.
As easy as it is to think of Internet backup services as complete backups, they just aren’t. There’s no point in backing up system files in particular because no Internet backup service I’m aware of can perform a “bare-metal” restore after reformatting a Mac’s drive.
(Interestingly, the company that’s in the best position to provide a complete backup service is Apple, since Macs already have Internet Recovery for reinstalling macOS, and iOS devices can back up to and restore from iCloud without needing a computer with iTunes involved. If Apple ever decides it needs more Services revenue, all it needs to do is let Macs back up to iCloud as well and sell a lot more iCloud storage. That would be a significant hit to existing Internet backup services.)
While applications can usually be backed up and restored, that’s an awful lot of data to analyze, transfer, and store when you can generally redownload the apps with just a few clicks. The Microsoft Office suite is 8.6 GB on its own, and the big four of Adobe’s Acrobat, Illustrator, InDesign, and Photoshop weigh in at 5.6 GB. Apple’s iWork suite of Pages, Numbers, and Keynote is svelte in comparison at 1.8 GB.
But remember, an Internet backup service is only one part of a solid backup strategy, and it should always be seen as the backup of last resort, the one you turn to if everything else is lost. Spending a little time reinstalling macOS and downloading apps isn’t the end of the world—losing your actual data is.
So whether you’re using Backblaze or CrashPlan or some other Internet backup service, always make sure that you also have a local versioned backup made by something like Time Machine and a bootable duplicate created by something like SuperDuper, Carbon Copy Cloner, or ChronoSync. The versioned backup lets you recover an older version of a file that became corrupted or was damaged by human error and then saved, and the bootable duplicate enables you to get up and running as quickly as possible if your entire drive dies.