Thoughtful, detailed coverage of the Mac, iPhone, and iPad, plus the TidBITS Content Network for Apple consultants.

Oil of OLE: Document Security and You

During the last two weeks, reports of a security problem with Microsoft Office 98 for the Macintosh have been published by Macintosh news venues such as MacAddict, MacFixIt, and MacWEEK. According to these stories, Microsoft Office 98 applications - particularly Microsoft Word - may acquire random data from elsewhere on your computer and incorporate it invisibly into your Office data files.

Here's the bad news: the problem is real and long-standing. Further, the problem applies to all applications using Microsoft's OLE technology on the Mac, not just Office 98, and there's no guaranteed way to work around the problem right now.

Here's the good news: though serious, this problem isn't a tremendous concern to many Mac users, and a fix should be available from Microsoft shortly. Furthermore, you can do simple things right now to reduce your exposure to this problem significantly.

Thanks for the Memory -- The problem seems to stem from applications writing uninitialized OLE data structures to disk, which allows information previously in RAM or on disk to be incorporated into a document's data. Though the OLE applications don't display or use this data, it does become part of the file and can be viewed in that file using other programs, such as BBEdit or a disk editor.

OLE (pronounced "oh-lay") stands for Object Linking and Embedding, a technology created by Microsoft that, in essence, lets applications share code and data. Although it's more established under Windows, OLE has been available on the Mac since at least 1992 and has been incorporated into a variety of mainstream Macintosh applications, including Microsoft Office and Adobe PageMaker. OLE is also the basis of Microsoft's COM (Component Object Model) and ActiveX technologies, and has outlived competing Apple technologies such as Publish & Subscribe and OpenDoc.

So, what's an uninitialized data structure, and why is writing it to disk a problem?

When an application needs to deal with some data, it asks the operating system for a block of RAM to store the information. In general terms, the operating system either responds with an error (if the memory isn't available) or an address pointing to the start of a memory block.

However, when an operating system gives an application a block of memory, that doesn't mean the memory is empty, just available. In fact, the memory probably contains remnants of previously stored data - possibly even data that was put there before the computer was last restarted (although shutting down your Mac will clear out your RAM). This memory is usually described as "uninitialized," because its initial contents can't be easily predicted. Usually, the contents of uninitialized memory don't matter: the application's next action is often to initialize the memory (filling it all with zeros, for example) or fill it with actual data - sometimes, applications do both. But if the application doesn't initialize or overwrite the memory, any pre-existing data remains intact.
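As a rough illustration of the point above, here is a minimal C sketch. It does not call a real operating system allocator, whose leftover contents can't be predicted; instead it uses a small fixed pool (the names pool, get_block, earlier_use, and store_text are all hypothetical) so the effect is repeatable. The pool hands out memory without clearing it, and a short string stored in it leaves the earlier contents in place unless the block is zeroed first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the memory an operating system hands out. */
static unsigned char pool[64];

/* Return the start of the pool without clearing it, much as a real
   allocator may return memory that still holds old data. */
static unsigned char *get_block(size_t size)
{
    assert(size <= sizeof pool);
    return pool;
}

/* Simulate an earlier use of the same memory by some other code. */
static void earlier_use(void)
{
    unsigned char *block = get_block(32);
    memcpy(block, "PIN 1234 account 998877", 24);
}

/* Store a short string in a "fresh" block. If clear_first is nonzero the
   block is zeroed before use; otherwise any bytes past the new text keep
   whatever they held before. */
static unsigned char *store_text(const char *text, size_t block_size,
                                 int clear_first)
{
    unsigned char *block = get_block(block_size);
    size_t len = strlen(text);

    assert(len <= block_size);
    if (clear_first)
        memset(block, 0, block_size);
    memcpy(block, text, len);
    return block;
}
```

Calling earlier_use() and then store_text("Hi", 32, 0) leaves the block beginning "HiN 1234 account...": the two new bytes sit on top of the old data. Passing 1 for clear_first zeroes the block before the text is stored.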

Something similar happens with disk space. When an application writes information to disk, the operating system locates some disk space, then writes the data to it. But, like RAM, the disk space may not be empty, and can contain information previously stored there. (When you delete a file, the areas where that file was stored aren't erased, just marked as available for re-use. That's how data-recovery programs such as Norton Utilities are often able to "unerase" files you've deleted recently.) Once again, an application will usually overwrite any pre-existing data in disk space it plans to use. But if the application doesn't overwrite the entire disk space - and most applications don't always do so completely - then the original data (or a portion of it) will remain.

Oy vey OLE -- Applications that use OLE seem to display two behaviors that constitute a possible security problem. First, information previously stored in disk sectors to which OLE data is written may "show through" unused areas of OLE structured storage, effectively incorporating that pre-existing information into the data of the new file. Second, OLE applications may fail to initialize RAM they've requested from the operating system, then proceed to write that uninitialized memory to disk when they create or save a file.

The net result is that fragments of information that previously existed on your hard disk or in RAM may be stored as part of the data file of an OLE application. There's no realistic way to predict what the information might be: it could be part of an email message, confidential financial information, or a part of an unwanted binhexed email attachment you deleted months ago. Further, although OLE applications ignore the extraneous data when working with the file, the information "sticks with" the file when you copy it to another disk, or send it to someone via email.
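The second behavior (uninitialized memory written out as part of a file) can be sketched in a few lines of C. This is only an illustration under stated assumptions, not a description of how OLE itself works: the fixed record size, the caller-supplied scratch buffer, and the names write_record and count_leftovers are all hypothetical. The sketch writes a whole fixed-size record even though only the first few bytes hold real data, so whatever else was in the buffer goes to disk with it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define RECORD_SIZE 64   /* hypothetical fixed on-disk record size */

/* Write one fixed-size record holding `text`. The caller supplies the
   scratch buffer, which may still hold data from an earlier use.
   Returns 0 on success, -1 on failure. */
static int write_record(FILE *out, unsigned char *scratch,
                        const char *text, int clear_first)
{
    size_t len = strlen(text);

    assert(len <= RECORD_SIZE);
    if (clear_first)
        memset(scratch, 0, RECORD_SIZE);   /* the step a leaky program skips */
    memcpy(scratch, text, len);
    return fwrite(scratch, 1, RECORD_SIZE, out) == RECORD_SIZE ? 0 : -1;
}

/* Count nonzero bytes in the file after the first `skip` bytes: a rough
   measure of leftover data that ended up on disk. Returns -1 on error. */
static long count_leftovers(FILE *in, long skip)
{
    long count = 0;
    int c;

    if (fseek(in, skip, SEEK_SET) != 0)
        return -1;
    while ((c = fgetc(in)) != EOF)
        if (c != 0)
            count++;
    return count;
}
```

With a scratch buffer pre-filled with some other data, a one-character record written with clear_first set to 0 carries 63 bytes of that other data into the file; with clear_first set to 1, the rest of the record is zeros.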

Testing the Waters -- To test these behaviors, I wrote two small applications in C. The first writes a four-byte signature to all free space on a disk, effectively tagging those areas (I used a ShrinkWrap volume as my test disk). The second program fills all available RAM with a different four-byte signature. I used OLE applications I had on hand to create both small (single-character) and large (100K) test files on the tagged ShrinkWrap volume, then examined the contents of those files with tools such as BBEdit and Norton Disk Editor. Between each test, I re-initialized and re-tagged the ShrinkWrap volume and re-tagged available memory. The OLE applications I tested were Microsoft Word 5.0a, 5.1a, 6.0.1, and Word 98; Microsoft Excel 4.0, 5.0, and Excel 98; and PageMaker 5.0, 6.0, and 6.5.2. The non-OLE applications I tested were Nisus Writer 5.1 and FileMaker Pro 4.0.
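The tagging programs themselves aren't reproduced here, but the general idea can be sketched in portable C. Everything below is an assumption made for illustration: the names fill_with_signature and write_tag_file, the 4K chunk size, and the choice to write a caller-specified number of bytes rather than consuming all free space on a volume.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CHUNK 4096   /* a multiple of 4, so the pattern stays aligned across chunks */

/* Fill a buffer with a repeating four-byte signature. */
static void fill_with_signature(unsigned char *buf, size_t size,
                                const unsigned char sig[4])
{
    size_t i;

    assert(buf != NULL && sig != NULL);
    for (i = 0; i < size; i++)
        buf[i] = sig[i % 4];
}

/* Write `total` bytes of the repeating signature to an open file.
   Returns 0 on success, -1 if a write fails (for example, disk full). */
static int write_tag_file(FILE *out, size_t total, const unsigned char sig[4])
{
    unsigned char chunk[CHUNK];

    fill_with_signature(chunk, sizeof chunk, sig);
    while (total > 0) {
        size_t n = total < sizeof chunk ? total : sizeof chunk;

        if (fwrite(chunk, 1, n, out) != n)
            return -1;
        total -= n;
    }
    return 0;
}
```

Writing such a file to a scratch volume and then deleting it leaves the signature in the freed disk space, since deleting a file doesn't erase its contents. The same fill routine could be pointed at a large block of memory to tag RAM.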

The results? Every large test file created by OLE applications contained the test disk signature as part of the file's data, in continuous stretches approaching 4K in size. Additionally, most of the OLE applications also wrote the RAM signature to disk (often using byte ordering common to Intel processors), although in considerably smaller chunks than those containing the disk signature.
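The files in these tests were examined by hand with BBEdit and Norton Disk Editor, but measuring a "continuous stretch" of signature data is easy to automate. The following C sketch is one hypothetical way to do it (the function name longest_signature_run is an assumption): given a file's contents in memory, it returns the length in bytes of the longest unbroken run of the repeating four-byte signature.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the length, in bytes, of the longest unbroken run of the repeating
   four-byte signature found in `data`. A run is only recognized from a
   point where all four signature bytes appear in order, so up to three
   leading bytes of a run that starts mid-signature are not counted. */
static size_t longest_signature_run(const unsigned char *data, size_t size,
                                    const unsigned char sig[4])
{
    size_t best = 0;
    size_t i = 0;

    assert(data != NULL && sig != NULL);
    while (i + 4 <= size) {
        size_t j;

        if (memcmp(data + i, sig, 4) != 0) {
            i++;
            continue;
        }
        /* Extend the run byte by byte for as long as the pattern holds. */
        j = i;
        while (j < size && data[j] == sig[(j - i) % 4])
            j++;
        if (j - i > best)
            best = j - i;
        i = j;
    }
    return best;
}
```

For the 20-byte input "xxTAG!TAG!TAyyTAG!zz" and the signature "TAG!", the function reports 10: the first run is two full copies of the signature plus two trailing bytes, and the later four-byte run is shorter.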

Various applications demonstrated different exposures to these issues, probably due to differences in my test documents and the ways the programs use OLE. Microsoft Word 5.x, for instance, doesn't seem to create OLE objects as part of its file structure by default, thus limiting its exposure to any "see-through" effect. However, Word 5.x documents using OLE objects readily display the problems. Similarly, Word 98 seems to have the greatest exposure, incorporating as much as 10K of "see-through" data in a single-character Word document, presumably because it makes much more extensive use of OLE.

None of the non-OLE applications I tested created files with the disk signature as part of their file data, although one wrote the RAM signature to disk as part of its file data (it created four 16-byte chunks).

I must emphasize I only tested files created from scratch and written to disk once: I did not edit or re-save these files, or conduct tests with pre-existing files. (Many applications have similar - though unrelated - behaviors where deleted material is retained in modified files. Email and database applications - plus programs with "fast save" features - are typical examples.) Also, since I don't have access to the internals of OLE or the applications, these results indicate only a correlation between OLE applications and the reported security issues. Although the findings may be persuasive, they do not constitute absolute proof.

Age Before Beauty -- Armed with my test data, I investigated reports of similar problems with OLE. To the best of my knowledge these issues were not reported on the Macintosh before the middle of June, although they've almost certainly existed since the introduction of Word 5.0.

However, on the Windows platform, OLE apparently has a long history of incorporating pre-existing information on disk into new files. Although the articles don't appear to be available online, Steve Manes of the New York Times reported the problem in October of 1995 when it re-appeared in the version of OLE that shipped with Windows 95 (Microsoft had quietly repaired the problem once before with a revision "c" of its Office applications for Windows in the summer of 1994). Although Microsoft released a fixed version of OLE for Windows via the Internet, the fix never appeared in retail versions of Windows 95, which were available until two weeks ago. I didn't find any reports of Windows versions of OLE writing extraneous information from RAM to disk as part of new files.

Saving Your As -- Microsoft plans to have a fix for these problems available shortly, and should make an announcement at this week's Macworld Expo in New York. Fortunately, the fix should correct these problems in all OLE applications, not just Microsoft Office programs.


In the meantime, concerned users who share or transmit files created by OLE applications can avoid the worst of these problems by using Save As to rewrite a file to a newly initialized disk (like a floppy, RAM disk, or disk image). This will ensure that most "see-through" data in the file is merely blank space from the newly initialized disk. Note that merely copying the file to an initialized disk is not enough: you must use the Save As command. Also, you must initialize any disk you use for this purpose; simply deleting the files it contains is not sufficient. Use a disk utility or the Erase Disk command on the Finder's Special menu to initialize a disk. If you modify or delete a file from your disk, you should initialize it again for the highest degree of safety.

These precautions do not prevent data in RAM from being written to disk; however, in my tests, little data was written from RAM to disk: usually less than 1K total, and always in small chunks. Furthermore, because the data often used Intel byte-ordering (think of it as "backwards" for the Mac), it's less intelligible to Mac users than "see-through" data from a disk.
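To make the "backwards" remark concrete, here is a small C sketch. It assumes the simplest case, a full reversal of a four-byte value (a program that swapped bytes in 16-bit pairs would produce a different arrangement), and the names reverse_signature and find_either_order are hypothetical. It looks for a four-byte signature in either its original or its reversed order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Produce the byte-reversed form of a four-byte signature, as it would
   appear if a 32-bit value were stored in the opposite byte order. */
static void reverse_signature(const unsigned char sig[4], unsigned char out[4])
{
    int i;

    assert(sig != NULL && out != NULL);
    for (i = 0; i < 4; i++)
        out[i] = sig[3 - i];
}

/* Return 1 if `data` contains the signature in its original order,
   2 if it appears only in reversed order, and 0 if it appears in neither. */
static int find_either_order(const unsigned char *data, size_t size,
                             const unsigned char sig[4])
{
    unsigned char rev[4];
    size_t i;
    int seen_reversed = 0;

    assert(data != NULL);
    reverse_signature(sig, rev);
    for (i = 0; i + 4 <= size; i++) {
        if (memcmp(data + i, sig, 4) == 0)
            return 1;
        if (memcmp(data + i, rev, 4) == 0)
            seen_reversed = 1;
    }
    return seen_reversed ? 2 : 0;
}
```

A signature of "RAM!" stored in reversed order reads as "!MAR" in a text editor, which is why such fragments are harder to spot by eye than data in the Mac's native order.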

I don't know where to find a complete list of OLE applications on the Mac. If you're concerned about a particular program, you can use the technique outlined above until a patch is available, or contact the application vendor.

You're on Report -- The issues outlined here have been widely reported by Macintosh news outlets as a new security problem with Office 98 or Word 98, often in alarmist language. Frankly, the Macintosh media's response to this issue has disturbed me. Although I wouldn't characterize the coverage as irresponsible, I would certainly call most of it incomplete and misleading.

It would seem many Macintosh news outlets are primarily concerned with spreading stories rather than investigating or confirming them. Sure, this isn't a simple case: I spent over thirty hours during the Fourth of July weekend tracking these issues and conducting tests. Sometimes that amount of work is necessary to avoid passing off unwarranted speculation under color of authority.

So, in short: this is not an Office 98 problem, it's an OLE problem that's been present since at least 1992. If you use OLE applications, have potentially sensitive information on your computer, and frequently share documents with others, consider saving those documents to a newly initialized disk before sending them off until a fixed version of OLE is available.

