Rich Mogull, 9 April 2012

How to Tell If Your Cloud Provider Can Read Your Data

With the tremendous popularity of services like Dropbox and iCloud there is, rightfully, an incredible amount of interest in cloud data security. Once we start hosting our most sensitive data with cloud services (or any third-party provider) it’s only natural to wonder how secure our data is when it’s in the hands of others. But sometimes it’s hard to figure out exactly who can look at our information, especially since buzzwords like “secure” and “encrypted” don’t necessarily mean you are the only one who can see your data.

How Cloud Providers Protect Your Data — In part because there are numerous ways cloud providers could protect your data, the actual implementation varies from service to service. All consumer cloud services are what we in the cloud world call public and are built for multi-tenancy.

A public cloud service is one that anyone on the Internet can access and use. To support this the cloud providers need to segregate and isolate customers from each other. Segregation means your data is stored in your own little virtual area of the service, and isolation means that the services use security techniques to keep people from seeing each other’s stuff.

Practically speaking, multi-tenancy means your data is co-mingled with everyone else’s on the back end. For example, with a calendar service your events exist in the same database as all the other users’ events, and the calendar’s code makes sure your appointment never pops up on someone else’s screen. File storage services do the same thing: intermingling everyone’s files and then keeping track of who owns what in the service’s database. Some, like Dropbox, will even store only a single version of a given file and merely point at it from different owners. Thus multiple users who happen to have the same file are technically sharing that single instance; this approach also helps reduce the storage needed for multiple versions of a file for a single user.
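
The single-instance storage described above can be sketched in a few lines of Python. This is a minimal illustration and not how Dropbox or any other provider actually implements it; the `DedupStore` class and its method names are hypothetical. Each file's bytes are stored once under a hash of their content, and a separate ownership table points users at that copy.

```python
import hashlib


class DedupStore:
    """Toy single-instance file store (hypothetical, for illustration)."""

    def __init__(self):
        self.blobs = {}   # content hash -> file bytes, stored once
        self.owners = {}  # (user, filename) -> content hash

    def put(self, user, filename, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)  # keep only the first copy
        self.owners[(user, filename)] = digest
        return digest

    def get(self, user, filename):
        # Access goes through the ownership table, so a user reaches
        # only the blobs the table says belong to them.
        return self.blobs[self.owners[(user, filename)]]


store = DedupStore()
store.put("alice", "song.mp3", b"same bytes")
store.put("bob", "track.mp3", b"same bytes")
```

Here Alice and Bob each see their own file name, but the store holds a single copy of the bytes. Everything depends on the ownership table being correct.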

Although multi-tenancy means co-mingling data, the cloud provider uses segregation techniques so you see only your own data when you use the service, and isolation to make sure you can’t maliciously go after someone else’s data when you’re using the system.

The cloud provider’s databases and application code are key to keeping all these bits separate from each other. It isn’t like having a single hard drive, or even a single database, dedicated to your information. That simply isn’t efficient or cost-effective enough for these services to keep running. So multi-tenancy is used for files, email, calendar entries, photos, and every other kind of data you store with a cloud service.
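
To make the role of the application code concrete, here is a minimal sketch of a multi-tenant calendar table using Python's built-in `sqlite3` module. The table layout, the user names, and the `events_for` function are assumptions made up for this example; real services are far more elaborate. Every user's events sit in one table, and a single query condition is all that keeps them apart.

```python
import sqlite3

# One shared table holds every tenant's events.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE events (owner TEXT, title TEXT)")
db.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("alice", "Dentist"), ("bob", "Board meeting"), ("alice", "Lunch")],
)


def events_for(user):
    # Segregation lives in this WHERE clause. If the code forgets it,
    # or lets an attacker choose `user`, one tenant sees another's rows.
    rows = db.execute(
        "SELECT title FROM events WHERE owner = ? ORDER BY title", (user,)
    )
    return [title for (title,) in rows]


alice_events = events_for("alice")
```

A bug in a function like `events_for` is exactly the kind of application mistake that can expose one customer's data to another.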

Not all services work this way, but the vast majority do.

Encryption to the Rescue? — A multi-tenancy architecture has two obvious problems. The first is that if there’s a mistake in the application or database the service runs on, someone else might see your data. We’ve seen this happen accidentally; for example, last year Dropbox accidentally allowed any user access to any other user’s account. There is a long history of Internet sites (cloud and otherwise) inadvertently allowing someone to manipulate a Web page or URL to access unauthorized data, and the bad guys are always on the lookout for such vulnerabilities.

The second problem, which has been in the press a lot lately, is that the cloud provider’s employees can also see your data. Yes, the better services usually put a lot of policy and security controls in place to prevent this, but it’s always technically possible.

One way to mitigate some of these concerns is with encryption, which uses a mathematical process coupled with a digital key (a long string of text) to turn your data into what looks like random gibberish. That key is necessary to decrypt and read the data.

Most cloud providers use encryption to protect your Internet connection to them (via SSL/TLS — look for https URLs) so no one can sniff it on the network. (Unfortunately, some large email providers still don’t always encrypt your connection.) Most of the time when you see “encryption” in a list of security features, this is what they mean. But encrypting data in transit is only half the battle — what about your data in the provider’s data center? Encryption of storage is also necessary for any hope of keeping your data secret from the cloud provider’s employees.

Some providers do encrypt your data in their data center. There are three ways to do this:

1. Encrypt all the data for all users using a single key (or set of keys) that the cloud provider knows and manages.
2. Encrypt each individual user’s data with a per-user key that the cloud provider manages.
3. Encrypt each individual user’s data with a per-user key that the user manages.

By far, most cloud services (if they encrypt at all) use Option #1 — keys that they manage and that are shared among users — because it’s the easiest to set up and manage. The bad news is that it doesn’t provide much security. The cloud provider can still read all your data, and if an attacker compromises the service’s Web application, he can usually also read the data (since it’s decrypted before it hits the Web server).

Why do this level of encryption at all? It’s mostly to protect data if a hard drive is lost or stolen. This isn’t the biggest concern in the world, since cloud providers have vast numbers of drives, and it would be nearly impossible to target a particular user’s data, if the data could be read at all without special software. It also means that providers get to say they “encrypt your data” in their marketing. This is how Dropbox encrypts your data.

Option #2 is a bit more secure. Encrypting every user’s data with an individual key reduces, in some cases, the chance that one user (or an attacker) can get to another’s data. It all depends on where the attacker breaks into the system, and still relies on good programming to make sure the application doesn’t connect the wrong keys to the wrong user. It’s hard to know how many services use this approach, but when done properly it can be quite effective. The major weakness is that the cloud provider’s employees can still read your data, since they have access to the keys.

Option #3 provides the best security. You, the user, are the only one with the keys to your data. Your cloud provider can never peek into your information. The problem? This breaks… nearly everything. First of all it means you are responsible for managing the keys, and if you lose them you lose access to your data. Forever. Also, it is extremely difficult — if not impossible — to allow you to see or work with your data in a Web page since the Web server can’t read your data either. Thus it works for some kinds of services (mostly file storage/sharing) and not others, and only for sophisticated users who are able to manage their own keys.

As is so often the case, these options reveal the tradeoff between security and convenience.
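
The difference between the three options comes down to who holds the key, which the following Python sketch tries to show. It is an illustration under loud assumptions: `toy_cipher` is a stand-in built from the standard library and is not suitable for protecting real data, and all variable names are invented for the example. A real service would use a vetted cipher such as AES.

```python
import hashlib
import hmac
import secrets


def toy_cipher(key, data):
    """XOR data with an HMAC-SHA256 keystream.

    Illustration only, NOT secure for real use (no nonce, no
    authentication). The same call encrypts and decrypts.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = hmac.new(key, counter.to_bytes(8, "big"), hashlib.sha256)
        stream += block.digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


secret = b"tax return 2011"

# Option 1: one key for everyone, held by the provider.
provider_shared_key = secrets.token_bytes(32)
stored1 = toy_cipher(provider_shared_key, secret)

# Option 2: a per-user key, still held by the provider.
provider_user_keys = {"alice": secrets.token_bytes(32)}
stored2 = toy_cipher(provider_user_keys["alice"], secret)

# Option 3: a per-user key that never leaves the user's machine.
alice_own_key = secrets.token_bytes(32)
stored3 = toy_cipher(alice_own_key, secret)

# Every key the provider (or its employees) can get at.
provider_keys = [provider_shared_key, *provider_user_keys.values()]


def provider_can_read(ciphertext):
    return any(toy_cipher(k, ciphertext) == secret for k in provider_keys)
```

With the first two options the provider recovers the plaintext. With the third it holds only ciphertext, and so does Alice if she loses `alice_own_key`.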

How to Tell if Your Cloud Provider Can Read Your Data — In two of the three options I listed above, the provider can read your data, but how can you tell for yourself if this is the case?

There are three different (but similar) indications that your cloud data is accessible to your provider:

If you can see your data in a Web browser after entering only your account password, the odds are extremely high that your provider can read it as well. The only way you could see your data in a Web browser and still have it be hidden from your provider is if the service relied on complex JavaScript code or a Flash/Java/ActiveX control to decrypt and display the data locally.
If the service offers both Web access and a desktop application, and you can access your data in both with the same account password, odds are high that your provider can read your data. This is because your account password is also probably being used to protect your data (usually your password is used to unlock your encryption key). While your provider could technically architect things so the same password is used in different ways to both encrypt data and allow Web access, that really isn’t done.
If you can access the cloud service via a new device or application using your account user name and password, your provider can probably read your data. This is just another variation of the item above.
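
Why does a shared account password give the game away? The sketch below shows one plausible arrangement, assuming the password is used to unlock a stored encryption key as described above. The function names, the example password, and the simple XOR wrapping are assumptions for illustration, not any provider's actual design. The point is that the server receives the password at login, so it can unlock the key itself.

```python
import hashlib
import secrets


def derive_kek(password, salt):
    # PBKDF2 stretches the password into a 32-byte key-encryption key.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


def xor(a, b):
    # Simplified stand-in for a real key-wrapping algorithm.
    return bytes(x ^ y for x, y in zip(a, b))


# Signup: the service creates a data key and stores it wrapped.
salt = secrets.token_bytes(16)
data_key = secrets.token_bytes(32)
wrapped_key = xor(data_key, derive_kek("hunter2", salt))


def server_login(password):
    # The password arrives in the login request, so the server can
    # derive the same key-encryption key and unwrap the data key.
    return xor(wrapped_key, derive_kek(password, salt))
```

Any device or Web page that needs only the account password works the same way: whatever that password unlocks, the service can unlock too at the moment you log in.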

This is how I knew Dropbox could read my files long before that story hit the press. Once I saw I could log in and see my files, or view them on my iPad without using a password other than my account password, I knew that my data was encrypted with a key that Dropbox manages. The same goes for the enterprise-focused file sharing service Box (even though it’s hard to tell when reading their site). Of course, since Dropbox stores just files, you can apply your own encryption before Dropbox ever sees your data, as I explained last year at Securosis.

And iCloud? With iCloud I have a single user name and password. It offers a rich and well-designed Web interface where I can manage individual email messages, calendar entries, and more. I can register new devices and computers with the same user name and password I use on the Web site. Thus, from the beginning, it was clear Apple had the capability to read my content, just as Ars Technica reported recently.

That doesn’t mean Dropbox, iCloud, and similar services are insecure. They generally have extensive controls — both technical and policy restrictions — to keep employees from snooping. But it does mean that such services aren’t suitable for all users in all cases, especially businesses or governmental organizations that are contractually or legally obligated to keep certain data private.

Doing It Right — The backup service CrashPlan is an example of a service that offers flexible encryption to fit different user needs, with three separate options. (For more on choosing the appropriate encryption method for CrashPlan, see Joe Kissell’s “Take Control of CrashPlan Backups.”)

First, by default, your data is encrypted using a key protected by your account password. This still isolates and protects it from other users, while enabling you to view file information through the CrashPlan Web site and the CrashPlan Mobile app. But CrashPlan’s employees could still access your data.

Second, if you want more security, you can add a separate backup password that only you know. This approach still allows access through the CrashPlan Web site and the CrashPlan Mobile app, but CrashPlan employees can’t see your data except (maybe) during a Web session after you enter your separate password. Attackers can’t access your data either, though your password may be susceptible to brute force cracking or social engineering.

Third and finally, you can generate your own per-device encryption keys, which CrashPlan never sees or knows about, rendering your backups readable only by you (or anyone who can beat the key out of you — never underestimate the power of a wrench — props to xkcd!). You could technically use a different encryption key on each device (or share, your choice) so that even if one system were to be compromised, it wouldn’t allow access to backups from your other devices. Clearly, this is much more difficult to manage and well beyond the needs or capabilities of the average user (heck, even I don’t use it).

So if you want to be certain that your data is safe from both attackers and the cloud provider’s employees snooping, look for services that offer additional options for encrypting data, either with a password or an encryption key known only to you. If such an option isn’t available at the next cloud service you check out, you’ll know that the provider’s employees could technically read your data. And when the next big story of a cloud provider reading data hits the headlines, you can smugly inform your friends that you knew it all along.

Comments About How to Tell If Your Cloud Provider Can Read Your Data

Norbert E Fuchs
9 April 2012

Rich

Thanks for your very informative article.

Do you have an assessment of LaCie's service Wuala?

Regards.

--- nef
- Rich Mogull
  10 April 2012
  
  Sorry, I haven't had a chance to look at that one.
Chris Pepper
9 April 2012

I was going to plug SpiderOak as the more-encrypted alternative to Dropbox, but it is currently down, so I'm not recommending it!
Boyan
9 April 2012

Technically it is always possible for a provider to engineer things so that they get access to your data, unless you provide your own encryption. But probably "option 3" providers won't do that, because if it gets out they are most likely out of business. But it is always something to think about for the most paranoid users :)
Simon
10 April 2012

I use Dropbox regularly, so definitely an interesting read.

I've been wondering if I should try out fruux since full iCloud support isn't being backported to Snow Leopard. But the one thing holding me back is that I can't seem to find any analysis of how fruux treats users' data. What about privacy, integrity, etc. on fruux.com? Anybody care to offer some insight?
Howard H Metcalfe
11 April 2012

Why not just encrypt your most sensitive individual files on your hard disk (which is a good idea anyway when not working with them) and then back them up on Dropbox or whatever? There are several file encryption programs available for Macs and PCs.
heavyboots
12 April 2012

SpiderOak is a pretty interesting alternative to Dropbox. Unfortunately, due to the limitations you mention in your article, their setup is crazy complex compared to Dropbox. But if you want to be security-conscious, definitely an alternative.
