One of the burdens of publishing for nine years is that there are nine years’ worth of back issues that must be archived, organized, and made available to readers in useful ways. TidBITS’s efforts in this area go all the way back to our first issue – published as a HyperCard stack that automatically integrated future issues – and continue today with the latest features in our Web-based article database.
Webward Ho! In late 1996, with the Web undergoing explosive development, Matt Neuburg converted our back issues to HTML, pointing the way toward a full Web-based archive. A few months later, Adam fired the starting gun on a race to provide a way for readers to search the entire contents of TidBITS via the Web, signalling an end to readers needing to maintain local archives of TidBITS (see the article series "Search Engine Shootout").
Adam’s search engine shootout was a tremendous idea: it let TidBITS cover a variety of Macintosh-based search engine technologies without requiring us to become experts with all of them – and, in the end, TidBITS would get its own search engine! We announced the contest, reviewed the entries, and (with considerable consternation) selected a winner based on Apple e.g., a search technology from Apple that eventually became part of products like AppleShare IP, WebSTAR, and even Mac OS 8.5’s Sherlock.
A month or two before the search engine went online, I’d begun experimenting with an online article database for TidBITS, built using FileMaker Pro and Blue World’s Lasso. Although implemented as more of a thought experiment than a serious effort, it evolved into an adjunct of the full-text search engine that we called an "Author/Title" search. It didn’t allow users to search the full text of articles but was handy for locating articles by a specific author or within a particular date range.
More significantly, the FileMaker-based system gave us a way to refer to specific articles instead of a complete issue of TidBITS. We’d been wanting this level of focus for years (and it wasn’t offered by our Apple e.g.-based search engine), so we rolled this capability into the major Web site redesign unveiled in October of 1997 in TidBITS-400. From that point on, GetBITS URLs (like the one below) have been liberally sprinkled throughout TidBITS issues, pointing readers directly to relevant articles we had previously published.
A Sinister Plan — Although I didn’t explain it to anyone at the time, I had a secret agenda for all those GetBITS URLs. While they have an immediate benefit of taking readers to the right thing (an article, a MailBIT, an update, an article series, or, more recently, threads in the TidBITS Talk database), I thought they might also have a long term benefit. After all, we were integrating new articles into the database every week, and those articles contained GetBITS URLs that pointed to related resources in the same database. Over time, I hoped those links between items would become useful in and of themselves.
This may not come as a great revelation to many readers, but the fundamental strength of a database is not searching through information, but the capability to organize that information in useful ways. Fundamentally, a search just finds things, while a database can make those found objects smart – how smart depends on the database design and the point of view of the user. I hoped that by capturing the information about what other TidBITS material was relevant to a particular article, I’d eventually be able to make articles smarter.
Foiled! Unfortunately, even our best-laid plans don’t always work out. I don’t recall the precise moment when our collective frustration with the Apple e.g.-based shootout winner reached critical mass, but I hit my limit while Adam and Tonya were in Australia during the beginning of 1998 and I spent hours trying to prevent Apple e.g. from crashing constantly. Despite our best efforts, it seemed we’d have to provide a TidBITS search engine ourselves, and the "Author/Title" database was pressed into service.
Without getting into a lengthy discourse on the numerous issues and performance bottlenecks involved in serving FileMaker Pro databases to the Web, let’s just say that the transition was not without difficulty. The situation was further complicated by the premiere of the TidBITS Talk archive, our most ambitious database project to date, and the introduction of Sherlock, Mac OS 8.5’s Internet-savvy search tool.
How could Sherlock impact our databases? As useful as Sherlock might be, it’s responsible for many inappropriate (and unintended) queries to Internet search engines. Since Sherlock doesn’t make it convenient for users to activate or deactivate search resources selectively, users tend to leave all their plug-ins enabled. So, the TidBITS article database regularly receives queries unrelated to TidBITS. Sherlock search queries from this morning include "Beatles lyrics," "screenwriting tips," and "MIDI violin" – and unpublishable queries for adult materials. These searches tie up the database, and although we’ve implemented query requirements to reduce the burden, inappropriate searches remain troublesome. In any case, finding ways to reduce the database load caused by Sherlock took time and prevented me from working on my master plan for smarter articles… until now.
Towards Smarter Articles — Articles retrieved from the TidBITS database now present considerably more contextual information, which will hopefully be useful to TidBITS readers. It might be helpful to follow along in a Web browser, using Matt Neuburg’s review of Conflict Catcher 8 from TidBITS-446 as an example.
If you’ve accessed items in our database before, the first new thing you’ll notice is a set of links to the right of the article’s text, collected into several colored boxes. These boxes group together items related to the current article, including articles that appear in the same TidBITS issue, series the current article belongs to, specific TidBITS items referenced by the current article, and most significantly, articles that later referred back to the current article.
Looking at Matt’s Conflict Catcher review, you’ll see the article was later referenced by two additional TidBITS articles discussing subsequent updates to Conflict Catcher. (By the time you read this, that review will also know that this article refers to it.) You can also see that Matt pointed back to three previous Conflict Catcher reviews that have appeared in TidBITS over the years, as well as an article about the demise of quality printed documentation and a review of InformINIT. You’ll also see that the review is part of a series, along with a list of other articles that appeared in TidBITS-446.
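For readers who like to see the idea in code, here is a minimal sketch of the cross-linking described above. It is not how our FileMaker Pro and Lasso setup is actually built; the identifiers, field names, and sample records are all made up for illustration, and series membership is left out to keep it short. The point is only that once each article records which items it refers to, the "articles that later referred back" list can be derived by inverting those references.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Article:
    article_id: str   # hypothetical identifier, not a real GetBITS key
    issue: str        # issue label, e.g. "TidBITS-446"
    references: List[str] = field(default_factory=list)  # IDs this article points to


def build_backlinks(articles: List[Article]) -> Dict[str, List[str]]:
    """Invert the forward references: target ID -> IDs of articles that cite it."""
    backlinks: Dict[str, List[str]] = defaultdict(list)
    for art in articles:
        for target in art.references:
            backlinks[target].append(art.article_id)
    return backlinks


def related_items(article: Article,
                  articles: List[Article],
                  backlinks: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Group the items that would be listed beside a given article."""
    same_issue = [a.article_id for a in articles
                  if a.issue == article.issue and a.article_id != article.article_id]
    return {
        "same_issue": same_issue,
        "refers_to": list(article.references),
        "referred_by": list(backlinks.get(article.article_id, [])),
    }


# Made-up sample records; only the issue label "TidBITS-446" comes from the text.
ARTICLES = [
    Article("cc8-review", "TidBITS-446", ["older-review"]),
    Article("same-issue-item", "TidBITS-446"),
    Article("older-review", "earlier-issue"),
    Article("later-update", "later-issue", ["cc8-review"]),
]
BACKLINKS = build_backlinks(ARTICLES)
```

Calling `related_items(ARTICLES[0], ARTICLES, BACKLINKS)` on this sample data groups "same-issue-item" under the same issue, "older-review" under items the review refers to, and "later-update" under articles that refer back to it. Because the back-references are computed from the forward ones, adding a new article that cites an older one is enough for the older article to "know" about it.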
Because we’ve been using GetBITS URLs for only about eighteen months, they don’t appear in our older articles and, hence, older articles may not yet know about articles that refer to them or articles to which they themselves refer. Nonetheless, I have completely cross-linked about 550 articles going back to the latter part of 1994, while about 200 earlier articles still need to be fully integrated.
Since introducing TidBITS Talk last year, we’ve been repeatedly startled by the quality of discussion and information traversing that list, and it seemed appropriate to make relevant TidBITS Talk material available from TidBITS articles. So, if a TidBITS article mentions a discussion in TidBITS Talk or is itself specifically referenced by a message sent to TidBITS Talk, we display a link at the top of the article that takes you directly to the appropriate items in the TidBITS Talk archive, using a new browser window. (The TidBITS Talk links are also one of the few instances where we use more than one graphic on a Web page – with a total transfer burden of 367 bytes, I couldn’t resist.)
Towards a Smarter Presentation — Along with making TidBITS articles more useful and context-sensitive, we’ve also tried to enhance other ways people interact with our database. GetBITS URLs now appear in browser location fields and history lists, so it’s easier to bookmark specific articles and see which items you have previously visited. We’ve also enhanced our search results pages, and our main search form now offers some pre-formed queries that display recent articles TidBITS has published in particular categories, as well as listing the most popular articles in our database. In general, the HTML served by the database is cleaner; we’ve implemented some discrete changes that make the archive more accessible to Windows users; and commonly accessed items are much more Lynx-friendly. Finally, queries now search the contents of both TidBITS and NetBITS issues.
Today Searching, Tomorrow the World! We hope you enjoy these database changes in honor of TidBITS’s ninth anniversary, and that they make our content more accessible and useful to you. As usual, there are other grand schemes, plots, and enhancements we hope will bear fruit in the future – but we can’t tell you about them now. It would spoil the suspense!