Adam Engst 26 January 2004

The Mac at 20: An Interview with Bruce Horn

Twenty years of Macintosh. At this year’s Macworld Expo, Steve Jobs played a version of the famous "1984" ad that launched the Mac, and Alan Oppenheimer, who was responsible in large part for AppleTalk, gave a fabulous talk about the history of networking on the Mac. What I found most interesting was that although twenty years have passed, many of the original people from those days are not only still around, they’re still producing great work. The history of the Macintosh is not only still being written, some of the same people are still doing the writing.

<http://www.opendoor.com/nethistory/>

Let me introduce you to another member of the original Macintosh team, Bruce Horn, who was responsible for a number of the key aspects of the Mac and who has continued to write innovative code. At Apple, Bruce was responsible for the design and implementation of the Finder (oh, that!), the type/creator metadata mechanism for files and applications, and the Resource Manager (which handled reading and writing of the resource fork in files; a note in Apple’s technical documentation at one point exclaimed, "The Resource Manager is not a database!"). The Dialog Manager and the multi-type aspect of the clipboard also appeared thanks to Bruce’s ingenuity.

So, to commemorate this 20th anniversary of the Macintosh, I wanted to talk with Bruce about not just what he did at Apple, but also what he’s up to now, since in many ways, his current work is both a return to his roots and a glimpse at what might be possible with the Macintosh in the future.

Adam: Bruce, many of the aspects of the original Mac that you worked on revolve around accessing structured data. The Finder was a front end to the filesystem; the Resource Manager, despite that note in the documentation, was a bit like a flat-file database; and type/creator codes were metadata that were just screaming to be used by a database. To what extent was all that planned, or did you just come to these solutions as you were working?

Bruce: Several different goals drove me to these solutions. Having had most of my programming experience in Xerox’s Smalltalk environment, where you could change anything you wanted at runtime (changes made while the program was running), I was looking for a dynamic way to handle objects in the system so data such as localizable strings, menus, images, etc. could be modified by non-programmers without recompiling the source code. At the same time, I was realizing that the kind of data that I needed to manage with the Finder – icons for applications and documents, and bindings to those icons – needed the same sort of mechanism, and I wanted a unified solution. So the Finder’s Desktop Database was the driver for much of what the Resource Manager ended up providing.

The file metadata also was driven by Finder needs. Early on I realized that to provide a double-click-to-open mechanism for documents, I’d need a simple way to link a document to a default application that would open it. Similarly, since multiple applications could open multiple file types, I couldn’t just have a single mapping from a type to an application that would handle all files of that type. Thus the separation of the type code (the actual format of the file) and the creator code (the default application, which could be easily changed). Independent type and creator codes stored in the filesystem also enabled us to avoid polluting the filename with type information, which I felt was a significant advantage of our approach over others.
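A minimal Python sketch of that separation, for illustration only. It is not Apple's code: `Document`, `APPLICATIONS`, `application_for`, `reassign`, and the creator codes `WRIT` and `EDIT` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Document:
    name: str          # whatever the user likes; no extension required
    type_code: str     # four-character code for the file's format
    creator_code: str  # four-character code for the default application


# Hypothetical registry mapping creator codes to application names.
APPLICATIONS = {"WRIT": "Writer", "EDIT": "Editor"}


def application_for(doc: Document) -> Optional[str]:
    """Double-click lookup: the creator code, not the type, picks the app."""
    return APPLICATIONS.get(doc.creator_code)


def reassign(doc: Document, creator_code: str) -> Document:
    """Change the default application without touching the name or format."""
    return replace(doc, creator_code=creator_code)
```

Two documents can share the type `TEXT` yet open in different applications, and reassigning the creator leaves both the filename and the format untouched.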

The Desktop Database was a cache of the bindings between types and creators and the icons representing them, stored as resources. Since application bundles – groups of resources tied together describing document type and icon information – were stored in application resource forks, installing an application simply involved copying the appropriate resources from the application into the Desktop. The redundant information – type and creator information in the directory, and bundle information in application resource forks – made it possible to rebuild the database at any time without losing anything. It turns out that this was important in the early days.
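The rebuild property can be sketched in a few lines of Python. This is an assumption-laden illustration, not the Finder's actual data layout: `Bundle`, `rebuild_desktop_db`, `icon_for`, and the icon names are hypothetical, and real bundles lived in resource forks rather than in Python objects.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass
class Bundle:
    """Hypothetical stand-in for the bundle stored in an application."""
    creator: str            # the application's creator code
    icons: Dict[str, str]   # type code -> icon name for that document type


def rebuild_desktop_db(applications: Iterable[Bundle]) -> Dict[Tuple[str, str], str]:
    """Recreate the cache purely from the bundles the applications carry."""
    db = {}
    for bundle in applications:
        for type_code, icon in bundle.icons.items():
            db[(bundle.creator, type_code)] = icon
    return db


def icon_for(db, creator: str, type_code: str, default: str = "generic-document") -> str:
    """Look up the icon for a file from its creator and type codes."""
    return db.get((creator, type_code), default)
```

Because every entry is derived from data that also lives elsewhere, throwing the cache away and calling `rebuild_desktop_db` again yields the same bindings.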

Resources were, of course, heavily used in factoring out non-program data (like menus and text strings) that could be localized to different languages. With ResEdit, this allowed language experts to quickly create versions of an application without needing access to the source code.

Once I was able to convince Andy Hertzfeld of the utility of the Resource Manager, he rewrote most of the Toolbox to take advantage of it, which saved significant space in the ROM and gave us the ability to easily localize applications in a general way.

Adam: So Mac OS X’s reliance on Unix-style filename extensions for mapping documents to applications is something of a step backward, then?

Bruce: Yes and no. The original rationalization behind this was that Mac OS X needed to be compatible with Windows filename conventions, and to do so we’d need to force filename extensions to be provided. Because there are so many places that a file might leave the sanctity of the Mac OS and go out into the cruel world where extensions are required, it was deemed impossible to translate names from the Mac convention (with types and creators) to the outside world’s convention. As far as compatibility is concerned, this did the trick.

But over time it has become apparent that it is difficult to do this right, and the original mechanism of having redundant type information, and allowing the user to name the files whatever she wants, was more flexible and less prone to error. It turns out that Mac OS X still needed a creator mechanism by which individual documents could be opened by specific applications, so this information is stored in the resource fork of the file (of all places, since Apple is discouraging use of the resource fork), rather than simply in a creator code.

So the filename extension approach has worked, but with a little less elegance than the original.

Adam: Why didn’t you go all out and create a system-level database to handle all this data in the original Mac? Was it a horsepower issue, or were the software problems too tricky at the time?

Bruce: It would have been nice. I had some ideas in mind, but when it came down to fitting it in the 64K ROM, the Resource Manager was all we could fit. It was a real effort on everyone’s part to make code as small as possible. The Resource Manager was 3K, and the Finder 46K – amazing considering the size of applications these days!

Adam: When did you leave Apple, and what caused your departure?

Bruce: I left Apple in the spring of 1984, after doing a "final" version of the Finder. I guess I was just looking for something new to do: having spent several years working intensively on the Mac, I was ready for a break. Being on the Mac team, working with absolutely tremendous people, was one of the most significant things I’ve done, and it still gives me wonderful feelings when I think about those times.

Adam: Can you give us a quick rundown of where you worked after Apple? Were there any common threads among the various projects?

Bruce: After Apple I went to Adobe and worked a bit on a variety of small projects, including a LaserWriter spooler. When I was there I met a couple of Carnegie Mellon grad students, and, to make a long story short, they convinced me that I should go to CMU for graduate school (Chuck Geschke, one of the founders of Adobe, was also a CMU Ph.D.). Grad school was a great experience. I spent some time at the University of Oslo, Norway as a research assistant, did some consulting at Apple now and then, and had a chance to work with some intriguing startups while I was a student. My Ph.D. thesis described the design of a constraint-based object-oriented programming language called Siri, which I’d love to re-implement someday.

After graduating I went back to Apple as a consultant in the Advanced Technology Group and worked on a project called LiveDoc with Tom Bonura and Jim Miller, among others. LiveDoc was an experiment in automatically structuring documents so that various recognizers could determine that, for example, 555-1212 was a phone number and 124 Main Street was an address, and provide contextual actions on those items. It was a lot of fun, and I wish I had LiveDoc today in Mac OS X. Simson Garfinkel’s SBook provides some of these features as a PIM application.

<http://www.sbook5.com/>

But none of these projects really addressed the problem I wanted to solve, which was: how can I design an information browser that works with all types of data, from email messages to images to music files to documents, and provide a unified mechanism for organizing, searching, and viewing this information?

I began the iFile project in 1997 to do this, and worked on it for a couple of years before putting it on the back burner to start my other company, Marketocracy, where I’ve been since the middle of 1999.

Marketocracy is a mutual fund company that I co-founded with my business partner Ken Kam. Our team built a Macintosh-based Web site running WebObjects and a FrontBase database to allow over 50,000 people worldwide to buy and sell stocks in real time (but with fake money) to create a model stock portfolio. We provide a wide variety of tools to help our users to become better portfolio managers, and by watching their performance over time and ranking them, we can find the best people in the world to run our funds. Our Masters 100 Fund, based on the top 100 in our community, has been running for over two years now and has surprised even us with its impressive performance and low risk. It has returned over 39 percent since inception when the market has been essentially flat, and with a beta of 0.47 – half as risky as the market!

<http://www.marketocracy.com/>

Adam: What are you working on now?

Bruce: Recently I’ve picked up where I left off in 1999 with iFile (just a codename for now). iFile is a unified desktop information browser, like the Finder, but with significant architectural improvements. It is based on an object-oriented database of my own design that provides a general way for linking together and organizing objects of all types. The basic unit of organization is called a "collection," which is distinct from a folder in that an object may exist in many collections but in only a single folder. Collections are like iPhoto albums or iTunes playlists, but they can contain anything: text files, images, email messages, music files, contacts, notes, appointments, and so on. While this sounds a bit like BFS (the Be File System) and the BeOS Tracker combined, it is much more general and can be used on any filesystem with the appropriate drivers.

The obvious first application for the iFile technology was in photo organization, an area in which iPhoto does quite well already. However, iFile provides more capability in organization by image metadata (it currently keeps track of 46 different pieces of metadata for each image), and it should scale much more smoothly for large collections than iPhoto. But iFile is not simply a photo manager: it is a general purpose information browser that can be used in a variety of ways, and can easily integrate different information sources, such as PIM, email, and music, among other data types. I think the version of iFile that I will release publicly will provide much more capability in those domains.

Adam: Is it fair to describe iFile as the Finder you’d write today?

Bruce: Possibly. I think it is much more ambitious than I had originally intended. If I can eventually get it scaled down to a level where new users can understand it quickly, it might be a nice alternative to the Finder.

Adam: Have you shown it to people at Apple? What did they think?

Bruce: Back in 1999 I showed it first to the Finder group, then to Avie Tevanian, and finally to Steve Jobs. I think that Apple was strongly focused on solving the problems of getting Mac OS X out the door as soon as possible, and looking at an alternative Finder was low on their priority list. I believe they were intrigued but had already committed to a different direction, and couldn’t turn the ship in time to take advantage of the iFile technology. Given the history of Mac OS X, I think they made the right decision.

Adam: Let’s look at iFile more deeply. There are two aspects to any filing system, getting data in and displaying that data to the user. How would someone get data into iFile?

Bruce: The current version of iFile requires the user to specify the folders that iFile should track; this is done by dragging the folders into the iFile workspace window. Once this is done, iFile tracks any changes to the contents of the folders and automatically updates the database as required. For example, the user can drag in the Pictures folder and be able to browse all the images, create collections, etc., without actually copying any files or moving any data. iFile respects your directory structures and never modifies anything directly, in contrast to iPhoto, which copies images into its own directory hierarchy.

The release version of iFile will not require the user to request that certain folders be scanned. Instead, iFile will initially provide a view on the user’s home directory, and will scan the files and folders in the background automatically.

Adam: Good! The less work users must do, the better. In fact, one of the main problems with any filing system is that few people put enough effort into categorizing and managing their data to be able to find things later reliably. Can iFile automatically categorize files based on metadata and content?

Bruce: Yes, it can. Collections are a way to automatically categorize files by their properties. Because iFile maintains file metadata in the object database, it can search and sort through the metadata very quickly to return the appropriate files. Collections are also "live": specifically, if files appear on the disk that match a collection’s specification, they will be automatically added to that collection, regardless of whether the collection is currently being viewed. One can imagine all sorts of interesting AppleScript scripts that could be triggered based on these events.

Collections also collect files based on their content. Rather than searching for individual words as Google does, collections search for key phrases: a word or a sentence. Files that contain any of the key phrases specified in the collection are automatically gathered into that collection.
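As a rough illustration of key-phrase gathering, here is a short Python sketch. It assumes case-insensitive matching on whole words with whitespace collapsed; the interview does not say how iFile actually matches phrases, and `normalize`, `contains_phrase`, and `gather` are hypothetical names.

```python
import re
from typing import Dict, Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace to single spaces."""
    return " ".join(text.lower().split())


def contains_phrase(text: str, phrase: str) -> bool:
    """True if the phrase occurs in the text, aligned to word boundaries."""
    phrase = normalize(phrase)
    if not phrase:
        return False  # an empty phrase should not match every file
    pattern = r"(?<!\w)" + re.escape(phrase) + r"(?!\w)"
    return re.search(pattern, normalize(text)) is not None


def gather(files: Dict[str, str], phrases: Iterable[str]) -> List[str]:
    """Return the names of files containing ANY of the key phrases."""
    phrases = list(phrases)
    return sorted(
        name for name, text in files.items()
        if any(contains_phrase(text, p) for p in phrases)
    )
```

Matching on any phrase, rather than all of them, follows the description above; the word-boundary check is a design choice of this sketch so that a phrase like "art" does not pull in every file that mentions a "party".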

So collections give you a new way to slice and dice the information you already have, without requiring you to import your data or commit to a completely new organization.

Adam: What do you think about adding a capability along the lines of a Bayesian classifier that would evaluate the contents of a file statistically, much the way some spam filters or the email classifying program POPfile work? That could reduce the user’s effort even further.

Bruce: That is a great idea and has been discussed for quite some time. In fact, Apple had worked on a project that was based on this idea. Piles were automatic groupings of files based on their content:

<http://www.theregister.co.uk/content/archive/30360.html>

One of the challenges here is to determine an appropriate similarity function: how do you decide what the collections should be a priori, to avoid the problems of hundreds of collections, each with one file, or a small number of collections with thousands of files? That will take some work.

Adam: What does iFile do on the display side? Can users create their own "smart folders" (a bit like smart playlists in iTunes) that automatically show files that match a specific query?

Bruce: Absolutely. A collection is essentially a smart folder, with a query specification. For example, it is easy to create a collection that groups together all the images taken by a particular model camera by specifying "<Model> is ‘2500’ and <Make> is ‘Nikon’", since that data is available in the EXIF metadata for the image. Similarly, metadata such as ID3 tags for music; image data such as resolution, width, and height; file data such as filenames, creation and modification dates, and sizes; and so on are all stored in the database for object retrieval and organization.

So collections actually have three mechanisms for grouping: manually via drag-and-drop; automatically via metadata query specification; and automatically via key phrase match.
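A minimal Python sketch of a "live" collection, covering the manual and metadata-query mechanisms (key-phrase matching is left out to keep it short). This is an illustration under stated assumptions, not iFile's design: the `Collection` class, its method names, and the exact-match query format are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Set


@dataclass
class Collection:
    name: str
    # Metadata query: every key must equal the given value (hypothetical format).
    query: Dict[str, Any] = field(default_factory=dict)
    members: Set[str] = field(default_factory=set)

    def matches(self, metadata: Dict[str, Any]) -> bool:
        """True if the file's metadata satisfies the whole query."""
        if not self.query:
            return False  # a purely manual collection matches nothing by itself
        return all(metadata.get(key) == value for key, value in self.query.items())

    def add_manually(self, path: str) -> None:
        """Drag-and-drop: the object joins regardless of its metadata."""
        self.members.add(path)

    def on_file_appeared(self, path: str, metadata: Dict[str, Any]) -> None:
        """Called when a tracked folder gains a file; keeps the collection live."""
        if self.matches(metadata):
            self.members.add(path)
```

With `query={"Make": "Nikon", "Model": "2500"}`, a newly scanned image carrying those EXIF values joins the collection without the user doing anything, and the same path can belong to any number of other collections, since membership is only a reference to the file.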

Adam: iFile’s architecture sounds tremendously appealing, but I suspect the devil is in the details, and thus in the interface. Does iFile stick with the current file/folder metaphor (despite the terminology shift to collections), or does it offer a rethinking of how we interact with our data?

Bruce: You are right that the devil is in the details. I’m currently working on how to present all this information in an appropriately intuitive fashion, and I think I’m getting closer, but there is still clearly work to do.

iFile begins with the traditional, icon-based file and container organization (containers being either folders or collections), but goes further with a variety of different views and layouts. Many of the layouts provide preview views of the contents of the files, and in the case of text files, iFile automatically creates hyperlinks to related collections from within the text. It’s difficult to explain, but once you use iFile you’ll find that some of the views do in fact provide you ways to view your data from different perspectives.

The more you provide iFile with information regarding how you want to see your data, via defining collections, the more it can help you by cross-indexing and showing relationships where they were not clear before.

Adam: Are some of the things you’re attempting in iFile beyond what many users can understand? Lots of people just want to be told what to do, and something with iFile’s flexibility might be lost on them unless it was able to watch their actions and automatically build collections.

Bruce: I agree that iFile can be somewhat intimidating to new users: there are a lot of different things that iFile can do, and there needs to be more immediate gratification when using it. Creating collections automatically is a good approach, and by creating useful collections based on not only images but documents and email, I think that the power of the technology will become more apparent. I’m planning on implementing some of this in the next few months, so stay tuned! For anyone interested in this technology who would like to be contacted when there is a public version available, sign up at the site below, and I’ll keep you up to date. I’d be happy to go into detail about the release version in a future issue of TidBITS.

<http://www.ingenuitysoftware.com/>

Adam: Bruce, thanks for taking the time to chat with me. We’re all looking forward to seeing what you come up with for iFile. Who knows, perhaps now that Apple has stabilized Mac OS X, they’ll be interested in taking another look at what you’ve done.
