Ingenious bibliographic software offers power and flexibility - but no scriptability
by Matt Neuburg
Throughout my Classics career, the hardest part of scholarly writing was managing the bibliography. My thesis was particularly nightmarish. Like most humanities Ph.D. theses, it involved an extended critique of the existing scholarship - a complete history of claims and contradictions on hundreds of disputed issues. I maintained a vast collection of oversize note cards with holes round the outside; a v-shaped punch and some knitting needles helped me retrieve references bearing on a given matter. My typescript had to follow precisely the official style sheet on footnotes and citations, lest the Dreaded Thesis Secretary reject the whole thing on formal grounds.
I could really have used a program like Papyrus, from Research Software Design.
I've never used any commercial bibliography management software, so before studying Papyrus I imagined an ideal system that might have helped me with my thesis and other writings, and decided that it should:
Act as a database for entry, storage, and flexible retrieval of references.
Do the textual "arithmetic" to combine the fields of each reference into a canonically formed citation.
Integrate with a word-processor to insert citations.
Act as, or integrate with, note-taking software.
Papyrus does all these, and adds a fifth function, obviously valuable in this computer age, but which hadn't occurred to me:
Automatically import text references culled from online and CD-based bibliographical databases.
I'll discuss Papyrus under each of these heads in turn.
The Database -- Papyrus's database is absolutely splendid. The interface is clean and intuitive, revolving around two chief window types: the Reference, where you edit individual references, and the Group, which lists a subset of the stored references. There is powerful use of such devices as drag & drop, double-clicking in a window to open a related window, keyboard navigation, and other well-implemented conveniences too numerous to mention.
Papyrus knows which fields are relevant to each type of material (a book, an article in a journal, an article in a book, and so on), and ingeniously presents these as required fields, optional fields, and fields so rarely needed that they are hidden unless you ask to see them. Text fields are styled and WorldScript-savvy. You can add fields and material types, but you probably won't need to.
As you edit, Papyrus's "intelligence" saves you from errors and unnecessary work. For example, it knows which fields might repeat (there might be more than one author, for instance) and automatically provides a new blank when you fill one in. If you omit the comma in the author field (because you've forgotten that Papyrus expects last name, comma, first name), it prompts you. It also distinguishes automatically between a first name and a first initial, understands a further comma to mean an appendage after the last name ("Dumas, Alexandre, Jr."), capitalizes names for you, and so forth.
Papyrus's database is truly relational, in a transparent, automatic way. Thus, for example, as you're entering entire references, Papyrus is gathering authors' names into a separate internal structure. Therefore, additional information can be associated with an author; plus, you can change a name and propagate that change to every reference that uses it. Also, this permits intelligent lookup: having entered an author in one reference, it suffices to type the first few letters of that name in another.
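To make the relational idea concrete, here is a minimal Python sketch of the arrangement described above: author names are stored once, and references point at them. It is not Papyrus's actual internal design, which the program doesn't expose; the class name, method names, and sample titles are all invented for illustration.

```python
class ReferenceStore:
    """Toy store (hypothetical): authors live in one table, references point at author IDs."""

    def __init__(self):
        self.authors = {}      # author_id -> name, stored exactly once
        self.references = []   # each reference: {"title": str, "author_ids": [int]}

    def lookup(self, prefix):
        """Return IDs of known authors whose name starts with the typed letters."""
        p = prefix.lower()
        return [aid for aid, name in self.authors.items()
                if name.lower().startswith(p)]

    def author_id(self, name):
        """Reuse the existing author record for this name, or create a new one."""
        for aid, existing in self.authors.items():
            if existing == name:
                return aid
        aid = len(self.authors) + 1
        self.authors[aid] = name
        return aid

    def add_reference(self, title, author_names):
        """Store a reference, linking it to author records rather than copying names."""
        ids = [self.author_id(n) for n in author_names]
        self.references.append({"title": title, "author_ids": ids})

    def rename_author(self, aid, new_name):
        """Change the name in one place; every reference sees the change."""
        self.authors[aid] = new_name

    def cite_authors(self, ref):
        """Resolve a reference's author IDs back into names."""
        return "; ".join(self.authors[a] for a in ref["author_ids"])
```

Because each reference holds only an ID, renaming an author once is enough to update every citation that mentions that author, and the lookup by first few letters falls out of the same table.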
You can assemble groups manually or through a query, and both groups and queries can be saved. Even more important, since it's easy to save and open a group listing one project's references, you're likely to have just a single database comprising references for all your projects - so that you take advantage of Papyrus's relational capabilities across all of them at once. It's a brilliant architecture.
Building a Citation -- Papyrus constructs citations from fields by way of a Format, which is a set of instructions you enter partly through a series of dialogs and sub-dialogs, and partly through a grep-like formula describing the desired output. So, for example, to say that authors should be shown as last name, comma, and initials, each initial followed by a period and a space, you drill down to a dialog and check the desired options. But to say that an article should appear as author, period, space, journal name in italics, comma, series if there's a series, volume number in bold, and so forth, you build a formula which looks rather like a Nisus PowerFind expression. Papyrus comes with a large number of formats, corresponding to various citation styles such as American Medical Association, Forestry Chronicle, Chicago Manual, and Turabian; you can use or modify one of these, or construct your own from scratch.
This is fine in principle, but I worry that the means to describe the desired output lacks sufficient generality. Papyrus seems to assert that it knows best what you should want to do: stay within these limitations and you'll be fine. Personally, I prefer programs that put the power into the hands of the user. The problem stems from three causes:
You can't circumvent Papyrus's "intelligence," which is sometimes more simple-minded than reality. For instance, you have no way to say that the proper abbreviated form of Yuri Gagarin's first name is "Yu." (and if you actually enter his first name as "Yu.", Papyrus strips the period, thinking it's his whole first name).
You can't enter an option unless a dialog provides for it. For instance, there's no way to say that the first of multiple authors should have the first name written out but subsequent authors should have the first name abbreviated.
The language of the output formula is weak. For example, there's no if-then-else construct; so you can't say that if a book's abbreviated-title field is defined, it should be used, and otherwise its normal title field should be used. This is almost shocking when you consider the extensive prior art for letting the user express just this sort of thing (such as Helix abacus dataflow diagrams and FileMaker calculation fields).
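To show how little is being asked for, here is a rough Python sketch of the three rules that the text above says can't be expressed: a per-name abbreviation override, a first author treated differently from the rest, and a fall-back from abbreviated title to normal title. This is not Papyrus's formula language; the function names, the field names, and the override table are assumptions made up for the example.

```python
def initials(first_name, overrides=None):
    """Abbreviate a first name, honoring user-supplied exceptions (hypothetical table)."""
    overrides = overrides or {}
    if first_name in overrides:
        return overrides[first_name]
    return first_name[0] + "."


def format_authors(authors, overrides=None):
    """authors is a list of (last, first) pairs.

    The first author's first name is written out; later authors are abbreviated.
    """
    parts = []
    for position, (last, first) in enumerate(authors):
        shown = first if position == 0 else initials(first, overrides)
        parts.append(f"{last}, {shown}")
    return "; ".join(parts)


def book_title(ref):
    """If-then-else: use the abbreviated title when defined, else the normal title."""
    short = ref.get("abbreviated_title")
    return short if short else ref["title"]
```

For instance, `format_authors([("Dumas", "Alexandre"), ("Gagarin", "Yuri")], {"Yuri": "Yu."})` writes out the first author in full and gives the second the abbreviation "Yu." rather than a bare initial.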
Word Processor Integration -- At a basic level, Papyrus works with just about any word processor. You can copy and paste (or drag & drop) a reference in Papyrus into a word processor document; the result is a citation in a particular format. And you can export an entire group of references as citations in a particular format, as MacWrite or RTF, which most word processors can import.
But if your word processor is Nisus Writer or Microsoft Word, both of which are scriptable, Papyrus is much smarter. Your document can contain coded abbreviations for references, like this: [], meaning your reference whose ID number is 6. When you're done writing, Papyrus automatically examines your document and constructs, based on these abbreviations, both the citations and the bibliography. For example, if [] means my Lysistrata translation, then Papyrus will substitute "(Neuburg 1992)" for it, and will include the full citation in the bibliography that it appends. In my tests, this worked rather better with Word than with Nisus Writer.
This is lovely, but it doesn't go far enough. What if you encounter bugs in Papyrus's substitution algorithm, or in its scripting of the word processor? (I did.) Or what if you use a different word processor, like AppleWorks? To be sure, you can get around such problems manually; you could just export all your references in any needed formats, and then rearrange them within your project. But a far better solution would be for Papyrus itself to be scriptable; you might then, say, write an AppleScript or OneClick script that examines the number in the current selection, asks Papyrus for the citation for that reference in a given format, and pastes it into your document - essentially building your own automatic integration where Papyrus's fails.
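The logic of such a script is tiny. The Python sketch below stands in for the AppleScript or OneClick version: since Papyrus offers no scripting interface, the `lookup` function and its small table are purely hypothetical stand-ins for "asking Papyrus," and the style name "author-year" is likewise invented. The one sample record reuses the reference number and citation mentioned earlier.

```python
# Hypothetical stand-in for the reference database a scriptable Papyrus would expose.
SAMPLE_REFERENCES = {
    6: {"author": "Neuburg", "year": 1992},
}


def lookup(ref_id, style):
    """Pretend to ask the bibliography program for one citation in a given style."""
    ref = SAMPLE_REFERENCES[ref_id]
    if style == "author-year":
        return f'({ref["author"]} {ref["year"]})'
    raise ValueError(f"unknown style: {style}")


def citation_for_selection(selection, style, fetch=lookup):
    """Read the number in the current selection and return the text to paste."""
    ref_id = int(selection.strip())   # the selection is expected to hold just a number
    return fetch(ref_id, style)
```

With the number 6 selected, `citation_for_selection(" 6 ", "author-year")` yields the text to paste in its place. Everything difficult lives on the other side of `fetch`, which is exactly the part the user can't reach today.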
But Papyrus is not scriptable - ironic, considering the odium that the Papyrus documentation heaps upon AppleWorks for its lack of scriptability. Once again, rather than opening itself to the user's commands, Papyrus wants all the power for itself, driving other programs through scripts that the user can't see, modify, or work around.
Note-Taking -- On one hand, Papyrus holds great promise as a note-taking program, because of the power of its queries. Every reference can have multiple keywords, and you can define relationships between references, and between keywords, and use them in queries. For example, you can define a reference relationship "contradicts," and then perform a query which yields not only all references with (let's say) the keyword "determinism," but also all references which contradict any reference in that found set. This remarkable capability to evoke the structure of argumentatively related positions reminds me of MacEuclid, whose like I thought I'd never see again.
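The "contradicts" query can be pictured as a two-step set operation. The Python sketch below is only an illustration of that idea, not of how Papyrus implements it; the data layout (a dictionary of keyword sets and a list of pairs) and the sample reference numbers are assumptions.

```python
def query_with_contradictions(keywords_by_ref, contradicts, keyword):
    """Return references carrying the keyword, plus any reference that contradicts one of them.

    keywords_by_ref: {ref_id: set of keywords}
    contradicts: list of (source_id, target_id) pairs, read as "source contradicts target"
    """
    # Step 1: the found set, by keyword.
    found = {rid for rid, kws in keywords_by_ref.items() if keyword in kws}
    # Step 2: every reference that contradicts something in the found set.
    opponents = {src for src, dst in contradicts if dst in found}
    return found | opponents


# Invented sample data: reference 2 contradicts 1, and reference 4 contradicts 2.
sample_keywords = {
    1: {"determinism"},
    2: {"free will"},
    3: {"determinism", "ethics"},
    4: {"ethics"},
}
sample_contradicts = [(2, 1), (4, 2)]
result = query_with_contradictions(sample_keywords, sample_contradicts, "determinism")
```

Here `result` contains references 1 and 3 (by keyword) and 2 (because it contradicts 1), but not 4, which contradicts only a reference outside the found set.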
But alas, the same window which is so suitable for entry of brief reference fields is clumsy for comments longer than a sentence or two; nor is there any true hypertext, where a phrase becomes a link to another reference. Thus, I'd find Papyrus uncomfortable for note-taking, in contrast to a dedicated tool such as Palimpsest, or even an outliner. Papyrus needs a completely revised "note card" interface; additionally, it could take a cue from such programs as MORE, Helix, or In Control, by allowing a field to be an alias for opening a notes file with a different application. Papyrus seems once again, by its lack of such cooperation with other programs, to assert that it knows better than you do what you should want to do and how you should want to do it.
Import -- Papyrus can automatically import text bibliographies into reference fields; but this relies again upon Formats, and suffers from the same shortcomings. For instance, the input formula language (which is the same as the output formula language, a very odd design decision) lacks an "either/or" or "shortest match" construct. A genuine grep would have been really useful here; Nisus Writer's grep does a far better job of rearranging a text bibliography into canonical form than Papyrus could possibly do.
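As an illustration of what the two missing constructs buy you, here is a short sketch using Python's `re` module, which has both alternation ("either/or") and non-greedy ("shortest match") quantifiers. The layout of the input lines is an assumption, and the sample entries used to exercise it are invented; real downloaded bibliographies will be messier.

```python
import re

# One pattern handles two layouts of the year:
#   "Doe, J. (1990). Sample Title."   and   "Doe, J., 1990. Sample Title."
ENTRY = re.compile(
    r"^(?P<author>.+?)"                            # shortest match: stop at the first year
    r"(?:\s\((?P<y1>\d{4})\)|,\s(?P<y2>\d{4}))"    # either "(1990)" or ", 1990"
    r"\.\s(?P<title>.+?)\.$"                       # title, up to the final period
)


def parse_entry(line):
    """Split one text reference into fields, or return None if it doesn't fit."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    return {
        "author": m.group("author"),
        "year": int(m.group("y1") or m.group("y2")),
        "title": m.group("title"),
    }
```

Without the non-greedy `.+?`, the author field would swallow as much of the line as it could; without the `|`, each layout of the year would need a separate format.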
Crossing The Ts -- Papyrus is a splendid program. It is reliable, thoughtful, original, ingenious, straightforward. It is also easy to learn; the printed manuals rank with the best I have ever seen, and there is superb online and balloon help. It has many excellent touches I haven't had room to mention here. Doubtless my personal library contains books whose bibliographic style Papyrus would be hard pressed to emulate, but that's a minor issue; some last-minute hand tweaking is perfectly acceptable. If you maintain lots of references, generate citations in certain standard formats, and are using Microsoft Word, you should certainly give Papyrus a try.
But despite offering such excellent features, Papyrus doesn't turn out to be the bibliography manager of my dreams. As I investigated Papyrus, I discovered that it lacks a general quality important to my work: the openness and programmability that I seek in any major workhorse. A bibliography system is basically just a database, after all; and I already have several database applications, plus other utilities, that are scriptable. So I can use these to form a bibliography management system that works the way I want. At present, it's a toss-up as to whether Papyrus gives me a good enough reason not to do that. On the other hand, when Papyrus sports a more sophisticated formatting language, a better note-taking interface, and scriptability, along with a less implicitly restrictive philosophy, I'll be hooked.
Papyrus is $90, or $140 with printed manuals. A free demo version, limited to 200 references, is available for download. Papyrus requires System 7.0 or later, and is about an 11 MB installation (about 20 MB with full online help).