The Database Returns
Recently I’ve noticed a trend worth watching: more and more products are putting databases under their hoods. Two Web servers, Web Server 4D and NetWings, are based on ACIUS’s 4th Dimension database. The just-released FireSite (see TidBITS-340) sits on top of a custom relational database, and EveryWare’s Bolero Web tracking tool uses the company’s Butler SQL database. Lest you think this tendency is limited to Web software, a database also drives DiamondSoft’s new font management utility, FontReserve.
What’s causing this trend? Two things, I think. First, serious databases provide flexibility and performance not offered by the built-in pseudo-database functionality of the Mac OS – namely the HFS file system and the Resource Manager. The second, related reason is that database power enables additional transparency that would otherwise require significant manual effort. For instance, basing a Web server on a database simplifies the use of repeating elements within a Web site. Similarly, the way FireSite uses its relational database back-end enables it to monitor usage and predict which files would be best to replicate.
For those unfamiliar with databases, there are two common types: flat-file databases and relational databases. A flat-file database is like a stack of index cards, and each card contains the same fields of information. Flat-file databases are useful when there’s a one-to-one relationship between the data (Name to Telephone Number, for instance). In contrast, a relational database is more akin to several stacks of index cards. Each stack can have one or more different fields of information on it, and each stack can selectively "see" into the other stacks to access common information (like a name or telephone number) so information is only stored in one place. A relational database is useful for one-to-many or many-to-many relationships (Student to Classes, for example). In some ways, you can think of a relational database as a number of flat-file databases that share information.
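To make the index-card analogy concrete, here is a minimal sketch of the Student-to-Classes example. It uses Python's standard sqlite3 module purely as a stand-in for any relational database; the table names, column names, and sample data are all hypothetical. Each table plays the role of one stack of cards, and the enrollments table is what lets one stack "see" into another so that each name is stored only once.

```python
import sqlite3

def build_school_db():
    """Create an in-memory database with three 'stacks' (tables). All names are made up."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
        CREATE TABLE classes  (id INTEGER PRIMARY KEY, title TEXT);
        -- The link between stacks: one row per student/class pairing.
        CREATE TABLE enrollments (
            student_id INTEGER REFERENCES students(id),
            class_id   INTEGER REFERENCES classes(id),
            PRIMARY KEY (student_id, class_id)
        );
    """)
    db.executemany("INSERT INTO students VALUES (?, ?, ?)",
                   [(1, "Ann", "555-0101"), (2, "Bob", "555-0102")])
    db.executemany("INSERT INTO classes VALUES (?, ?)",
                   [(1, "Algebra"), (2, "Biology")])
    # Ann takes both classes; Bob takes only Biology (many-to-many).
    db.executemany("INSERT INTO enrollments VALUES (?, ?)",
                   [(1, 1), (1, 2), (2, 2)])
    return db

def classes_for(db, student_name):
    """Return the class titles a student is enrolled in, sorted by title."""
    rows = db.execute("""
        SELECT c.title FROM classes c
        JOIN enrollments e ON e.class_id = c.id
        JOIN students s ON s.id = e.student_id
        WHERE s.name = ?
        ORDER BY c.title
    """, (student_name,))
    return [title for (title,) in rows]

def students_in(db, class_title):
    """Return the names of students enrolled in a class, sorted by name."""
    rows = db.execute("""
        SELECT s.name FROM students s
        JOIN enrollments e ON e.student_id = s.id
        JOIN classes c ON c.id = e.class_id
        WHERE c.title = ?
        ORDER BY s.name
    """, (class_title,))
    return [name for (name,) in rows]
```

A flat-file version would have to repeat Ann's name and phone number on every card that mentions one of her classes; here her details live in exactly one row.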
It’s easy to see how a database can help a wide variety of programs. For instance, ListSTAR has great flexibility when it comes to processing incoming and outgoing email, but it stores mailing lists as simple unsorted text files. Locating an address within a large list like our 7,000-plus person DealBITS list can take several minutes. A similar search in a decent database should be essentially instantaneous. That’s one reason why we’re using a custom FileMaker Pro database to manage all our mailing lists. There’s a fairly significant overlap between the DealBITS list and the TidBITS list, so it’s silly to maintain two separate lists of subscribers when we can just note in a relational database that any given person subscribes to just DealBITS, just TidBITS, or both. Any future mailing lists benefit from the database as well.
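As a rough illustration of the overlapping-lists idea, the sketch below stores each subscriber once and records list membership separately. It is not our FileMaker Pro setup; it assumes Python's standard sqlite3 module, and the schema, function names, and example.com addresses are invented for the example. The UNIQUE constraint on the email column also gives the database an index, which is what makes looking up one address fast compared with scanning an unsorted text file.

```python
import sqlite3

def build_list_db():
    """Create an in-memory subscriber database. The schema is hypothetical."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE subscribers (id INTEGER PRIMARY KEY, email TEXT UNIQUE NOT NULL);
        CREATE TABLE lists       (id INTEGER PRIMARY KEY, name  TEXT UNIQUE NOT NULL);
        CREATE TABLE subscriptions (
            subscriber_id INTEGER NOT NULL REFERENCES subscribers(id),
            list_id       INTEGER NOT NULL REFERENCES lists(id),
            PRIMARY KEY (subscriber_id, list_id)
        );
    """)
    return db

def subscribe(db, email, list_name):
    """Add a subscription; the person and the list are created only if new."""
    db.execute("INSERT OR IGNORE INTO subscribers (email) VALUES (?)", (email,))
    db.execute("INSERT OR IGNORE INTO lists (name) VALUES (?)", (list_name,))
    db.execute("""
        INSERT OR IGNORE INTO subscriptions (subscriber_id, list_id)
        SELECT s.id, l.id FROM subscribers s, lists l
        WHERE s.email = ? AND l.name = ?
    """, (email, list_name))

def lists_for(db, email):
    """Return the names of the lists an address subscribes to, sorted."""
    rows = db.execute("""
        SELECT l.name FROM lists l
        JOIN subscriptions sub ON sub.list_id = l.id
        JOIN subscribers s ON s.id = sub.subscriber_id
        WHERE s.email = ?
        ORDER BY l.name
    """, (email,))
    return [name for (name,) in rows]
```

Adding a third mailing list in this design means adding one row to the lists table, with no new copy of anyone's address.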
I’ve heard rumblings of other mailing list management programs based on databases, and depending on how well they implement their database functionality, they could present serious competition to ListSTAR unless Quarterdeck is able to graft a database onto the program.
Assuming that a database back-end makes sense for a number of types of programs, I see several ways that a program could take advantage of database technology. A program’s developers could write their own database, which would undoubtedly be a lot of work but would provide the most control. A more efficient method would be to license generalized database code from another company (I suspect this sort of thing already exists, although I don’t know the specifics).
Perhaps the most interesting way to get a database into a product is via a system-level database that any application could utilize. There have been a few starts in this direction, although nothing has gone all the way.
- HyperCard almost seems to fit this bill, especially a few years back when it was likely that any given Macintosh user had and used it. Unfortunately, HyperCard’s file format has never been public, which means data can only be accessed via the HyperCard application. More seriously, HyperCard was never designed to be used solely as a database; though it can sometimes perform capably in that fashion, it requires expertise, add-on tools, or both. Still, HyperCard sports an accessible programming language and easy interface building tools, which ease prototyping if not implementation.
- UserLand Software’s Frontier includes an Object Database, and Frontier users have been doing some experiments with serving Web pages directly out of the database to avoid the performance overhead of the Mac file system. Other applications could take advantage of Frontier’s Object Database as well, although it currently isn’t at the system level.
- Peter Lewis and Quinn’s public domain Internet Config stores Internet-related preferences for any Internet Config-aware program to use. Internet Config isn’t really a database, but it’s a good example of the advantages of sharing information between programs at a system level.
I’d be fascinated to see what might happen if someone, perhaps even Apple, created a system-level general purpose database that any application could use for storing data. In fact, it might have already happened. Although I’ve been using the term "database" in a traditional manner, Apple has a technology coming called V-Twin that enables incredibly fast text indexing and searching. It’s conceivable that something like V-Twin, which is already used in Apple e.g. and to search email in Cyberdog, could stand in for a general purpose database engine.
Although Apple would seem to be the logical choice for defining what such a database could do, I frankly think that some small developers could get together, define some general functionality, and release something far more quickly than Apple could. I’m no database expert, but here are some of the things that I imagine the database needing.
- Speed. Performance is important, especially if multiple applications will be calling this database engine simultaneously.
- Data types. If this is a generalized database engine, it can’t discriminate in terms of data types – it must accept anything. Dealing with different data types would require it to know about the file system to handle aliases, despite the fact that bringing the file system into the mix might hurt performance.
- Relational. Despite the ease-of-use of flat-file databases, they don’t offer enough flexibility, and this database engine would have to be tremendously flexible for it to be useful for all the tasks dreamt up for it.
- Stable. Once lots of applications are using this engine, a single crash could cause incredible damage unless the database were solid and corruption-resistant. I’ve been distressed by the apparent ease with which I can destroy a FileMaker Pro database.
- Individual files. Each application should create its own file to reduce confusion or conflict over data, as well as to keep the individual files smaller.
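To show how a few of these wishes might fit together, here is a deliberately small sketch. It is an assumption-laden illustration, not a design for a real system-level engine: it uses Python's standard sqlite3 module as the engine, and the AppStore class, its methods, and the "kind" type tag are all hypothetical. It touches three items from the list: one file per application, arbitrary data stored as tagged bytes, and all-or-nothing writes so that a failure partway through doesn't leave a half-written file.

```python
import os
import sqlite3

class AppStore:
    """Hypothetical per-application store backed by a single database file."""

    def __init__(self, directory, app_name):
        # Individual files: each application gets its own database file.
        self.path = os.path.join(directory, app_name + ".db")
        self.db = sqlite3.connect(self.path)
        with self.db:
            self.db.execute(
                "CREATE TABLE IF NOT EXISTS items ("
                "key TEXT PRIMARY KEY, kind TEXT NOT NULL, data BLOB NOT NULL)"
            )

    def put_many(self, items):
        """Store (key, kind, data) triples; data is raw bytes of any type."""
        # Stable: the 'with' block commits every row, or rolls back all of
        # them if any single insert raises an error.
        with self.db:
            for key, kind, data in items:
                self.db.execute(
                    "INSERT OR REPLACE INTO items VALUES (?, ?, ?)",
                    (key, kind, data),
                )

    def get(self, key):
        """Return (kind, data) for a key, or None if the key is absent."""
        return self.db.execute(
            "SELECT kind, data FROM items WHERE key = ?", (key,)
        ).fetchone()

    def close(self):
        self.db.close()
```

Two applications would each construct their own AppStore with a different app_name, so a problem in one file cannot touch the other's data. Speed and true relational structure are left out of this sketch entirely.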
Clearly, meeting these requirements would take a tremendous amount of integration. However, there’s already a technology that Apple has released that deals with some of these same integration issues – OpenDoc. Maybe that means that the way to create a system-level database is to create it as a set of Live Objects (the new name for OpenDoc parts) and let anyone tap into its power.
OpenDoc or no, I won’t pretend that such a project would be simple, but I believe that providing such functionality to any application that wished to use it could result in significantly more powerful programs with more transparent interfaces. As an example, a look at the BeOS (used by Jean-Louis Gassee’s BeBox machine and possibly by Power Macintosh machines in the future) is instructive, since the BeOS file system is a relational database. You can use it as though it were a traditional file system or as a database, and an object in the database doesn’t have to be a file, nor does it even have to be on disk. There’s no easy way the Mac OS could change in this fashion and maintain backward compatibility, but the fact that Be designed their file system in this way is telling. The question is, who will listen?