As a general rule, TidBITS likes to be self-documenting: if we set up a new system or service, make some changes, or stop doing something, we generally write about it. We’ve even written about a few unusual events like the poll to determine whether we should continue listing product reviews from other magazines and our contest to find a full-text search engine.
We do this not only so TidBITS readers know what we’re up to, but also because we use Macs to solve real-world problems. Sure, industry news, product announcements, reviews, and analysis are necessary, but explaining how we do things can be as important as explaining how other people do them.
If you’ve read TidBITS for a while, you might remember that in August of 1996 we moved the TidBITS mailing list from a LISTSERV-based service at Rice University to a system we maintain and administer ourselves. We frequently receive questions about how we manage a mailing list the size of TidBITS. What hardware and software do we use? Did we do anything special to set up the mailing list? How does it all work, anyway?
Although there are now several mailing lists associated with TidBITS, the biggest and longest-standing is the TidBITS setext mailing list, which currently boasts about 50,000 subscribers. We also have separate mailing lists for some TidBITS translations; we host most of these (Dutch, French, and German, for example), while the Japanese translation team handles the sizable Japanese list separately.
Important Notes — Before going further, let me cover a few important administrative details. First, mailing lists hosted by TidBITS are absolutely confidential. We do not sell, share, lend, or disclose information about individual TidBITS subscribers to anyone for any reason or under any circumstances. Period.
Second, although subscription information is included in every TidBITS issue, it bears repeating. To subscribe to TidBITS, send email to <[email protected]>; to unsubscribe, send email to <[email protected]>. You can also use forms on many of our Web pages to subscribe. If you want to receive a TidBITS translation via email, check the translation’s Web pages to see if a separate mailing list exists.
To move your subscription to a new address, unsubscribe from your current address, then re-subscribe from your new address once you’re set up. If you no longer have access to your old address, send a message to <[email protected]> asking to update your subscription (be sure to include both your old and new address).
Geography & Server Tricks — Each week, we distribute TidBITS using StarNine’s ListSTAR/SMTP mailing list software, running on a mere Power Mac 6150/66 (a PowerPC 601 machine). The 6150 – along with another Power Mac that acts as our primary Web server – is kindly hosted in downtown Seattle, Washington, by Point of Presence Company (POPCO). When we distribute a TidBITS issue, the 6150 uses a fast Unix system at POPCO as a mail gateway because the Unix system pushes issues out to the Internet much faster, helping to ensure prompt delivery. In contrast, an SE/30 running Fog City’s LetterRip at Adam and Tonya’s house handles several of the smaller TidBITS translations lists.
I maintain the primary TidBITS mailing list using a set of FileMaker Pro databases. Those databases essentially live on my main desktop machine, a Power Mac 7600/120, and no one else uses them.
If this arrangement seems convoluted, you’re right. Understanding this setup means explaining TidBITS’s physical topography. Adam and Tonya work from their home well east of Seattle, and managed to get a 56K frame relay connection installed after battling US West for nine months; I also work from my home east of Seattle, where my Macs are connected to the Internet via a dedicated ISDN line; and TidBITS’s primary servers live in Seattle, where they enjoy a T1 connection. The upshot is that the subscription databases are physically separate from our distribution server.
How do we tie ListSTAR and FileMaker together? Since they don’t live on the same machine or the same AppleTalk network, we use a combination of ListSTAR rules, custom HyperCard stacks, FTP, and some sleight of hand. When someone subscribes to TidBITS, ListSTAR appends the name and address to a list of people who want to subscribe and sends an acknowledgment. (The same sort of processing is done for unsubscribe requests.) Every night, HyperCard-based automation on one of my Macs uses an administrative capability in NetPresenz’s FTP server to tell ListSTAR to quit. The automation then deftly downloads the lists of addresses to be added to or removed from the TidBITS mailing lists and replaces them with empty lists. Finally, the automation launches ListSTAR again, which resumes mail processing, unaware it’s just been hoodwinked.
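As an illustration only, the "download the list and leave an empty one behind" step can be sketched in a few lines of Python. This is not the HyperCard automation described above: it works on a local file rather than over FTP, it assumes the mail server is already stopped, and the function name and one-address-per-line file format are hypothetical.

```python
from pathlib import Path


def drain_pending(path: Path) -> list[str]:
    """Return the queued addresses and leave an empty file in their place.

    Assumes one request per line and that nothing else is writing to the
    file while this runs (the real system stops the list server first).
    """
    if not path.exists():
        return []
    entries = [line.strip() for line in path.read_text().splitlines() if line.strip()]
    path.write_text("")  # the server restarts with an empty queue
    return entries
```

Stopping the server before the swap matters: without it, a request that arrives between the read and the write would be silently lost.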
Now, the subscription information in the ListSTAR files must be integrated into the FileMaker database. Another HyperCard stack processes each of the ListSTAR files, taking the appropriate action for each address, then composing and sending an appropriate reply or subscription report via email. I chose HyperCard for this task because it’s very stable, I have years of experience with it, and building an interface is easy with its tools. HyperCard’s capabilities provide flexibility; for instance, since this system went live, email replies from the subscription database have been sent using a variety of tools, including Eudora, custom email XCMDs, Allegiant’s Marionet, Atul Butte’s TCP/IP Scripting Addition, Chuck Shotton’s scriptable TCP utility NetEvents, and FileMaker Pro 4’s built-in SMTP capabilities.
Why All This Trouble? There are simpler ways to host a mailing list. We could handle the whole thing in ListSTAR, avoiding both FileMaker and these scripting shenanigans. Or, we could pay someone to host TidBITS and never have to manage a mailing list again.
We rejected those options for a variety of reasons. First, as one of the largest Macintosh mailing lists, we wanted to find a Mac-based solution for distributing TidBITS. Back in 1996, no mailing lists the size of TidBITS were hosted using Macs, and we felt we had an opportunity to change that. Second, fees for hosting a large mailing list were extraordinary, and there was no guarantee of quality service or – importantly – the security of the TidBITS mailing list.
It is possible to host a mailing list the size of TidBITS using only ListSTAR. However, ListSTAR’s address list editor won’t open a mailing list with more than about 15,000 entries, making it difficult to search for addresses or names manually. Also, since none of us are in the same physical location as the ListSTAR server, we’d have to perform administrative tasks for the mailing list via Timbuktu Pro, which isn’t ideal.
There are other advantages to using a database to manage a mailing list, rather than a simple add-and-remove system. First, a relational database makes it possible to centralize subscription information for multiple mailing lists. If you subscribe to both TidBITS and NetBITS, you may have noticed that the subscription reports returned by the database list every subscription for your particular email address. So, if a subscriber writes to say that her email address has changed, I only have to update information in one place. Similarly, the ability to search and replace through a database makes it simple to correct or remove batches of email addresses when a domain name changes or disappears (remember eWorld?), eliminate duplicate addresses, or search for problematic subscriptions based on fuzzy criteria.
Also, mailing list programs don’t store much information about subscribers – usually just an email address and maybe a name. A database, on the other hand, can store any arbitrary information associated with a particular subscriber or subscription. Our database keeps an event log for each, including subscription requests, email problems, analysis results, and manual changes by the database administrator. I’ve also set up additional fields that serve as flags for malformed email addresses, repeated email problems, and other things I might want to know about. All this makes administration significantly simpler.
Another advantage to a database is that it doesn’t have to forget anyone. When a mailing list program like ListSTAR or LetterRip receives an unsubscribe request, it deletes the matching entry. At that point, the subscriber is gone: there’s no indication the person was ever subscribed. In contrast, when someone unsubscribes from TidBITS, no information is deleted. The database just sets a termination date for that subscription and sends a note indicating the subscription has been cancelled. All information about that subscriber and subscription is retained.
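The difference between deleting a record and closing it can be shown with a small Python sketch. The record layout here is hypothetical (the real data lives in FileMaker Pro, and its field names aren't given in this article); the point is only that unsubscribing sets an end date and logs an event instead of removing anything.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Subscription:
    address: str
    started: date
    ended: Optional[date] = None                 # termination date, if any
    events: list = field(default_factory=list)   # (date, description) pairs

    @property
    def active(self) -> bool:
        return self.ended is None


def unsubscribe(sub: Subscription, today: date) -> None:
    """Close the subscription without discarding its history."""
    if sub.active:
        sub.ended = today
    sub.events.append((today, "unsubscribe request"))
```

Because the record survives, later questions such as "was this address ever subscribed, and when did it leave?" can still be answered.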
Why? The database keeps this information in part because I feel it’s essential to managing a list the size of TidBITS. I have queries and tools that look for patterns in the information, such as entire domains that suddenly start reporting email delivery problems (this usually indicates a server-level spam filter run amok) or a subscriber who unsubscribes and re-subscribes to TidBITS every day or every week (usually a misconfigured autoreply). This information lets me deal with some problems proactively, and provides more information to work with when subscribers have trouble.
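One of those patterns, the address that keeps unsubscribing and re-subscribing, is easy to look for once events are retained. The Python sketch below is an assumption about how such a check could work, not the actual query: the event format and the three-cycle threshold are invented for illustration.

```python
from collections import Counter


def flapping_addresses(events, min_cycles=3):
    """Find addresses that have both subscribed and unsubscribed repeatedly.

    events: iterable of (address, action) pairs, where action is
    "subscribe" or "unsubscribe". An address qualifies when it has at
    least min_cycles of each.
    """
    subs = Counter(addr.lower() for addr, action in events if action == "subscribe")
    unsubs = Counter(addr.lower() for addr, action in events if action == "unsubscribe")
    return sorted(a for a in unsubs if min(subs[a], unsubs[a]) >= min_cycles)
```

A mailing list program that deletes entries on unsubscribe has no event history to run this kind of check against.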
List Profiling — Other benefits of a database are more subtle. When the TidBITS list was hosted at Rice University, the LISTSERV system there worked well for us – in no small part due to the diligent efforts of Mark Williamson. However, we were often frustrated that we knew so little about the TidBITS mailing list. We knew how large it was at any given time, and we could retrieve copies to search and analyze. But that didn’t tell us everything we wanted to know.
For instance, with a database it’s trivial for us to figure out how many different domains are represented on the TidBITS list at any time (currently 127, including exotic locales like Niue, San Marino, Cuba, and Iran), which Internet providers have the most subscribers (AOL, EarthLink, CompuServe, and Netcom), and how many people in Japan (1,582), Switzerland (333), or Singapore (139) subscribe to the English version of TidBITS.
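Counts like these reduce to grouping addresses by the part after the "@". A rough Python sketch, assuming only a plain list of addresses (the function name is hypothetical, and the real figures come from FileMaker Pro queries):

```python
from collections import Counter


def profile(addresses):
    """Tally subscribers by top-level domain and by full mail host."""
    tlds = Counter()
    hosts = Counter()
    for addr in addresses:
        host = addr.rsplit("@", 1)[-1].lower()
        hosts[host] += 1
        tlds[host.rsplit(".", 1)[-1]] += 1  # "jp", "ch", "com", ...
    return tlds, hosts
```

Here `len(tlds)` gives the number of distinct top-level domains, and `hosts.most_common(4)` lists the four hosts with the most subscribers.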
The subscription database provides other kinds of information as well. A big issue with services like AOL and cable television is the "churn rate" of subscribers: people who join a service and drop it after a certain period of time. A service’s churn rate is an indirect way to measure how well the service retains subscribers, and hence how valuable those subscribers feel the service is. The churn rate for some cable television services like HBO has been estimated to be as high as 40 percent; AOL’s churn rate has been placed over 50 percent at times.
Since TidBITS is free, its churn rate should be different from that of a commercial service, and we’ve been pleased to learn that’s the case. Since we took over the list in August of 1996, TidBITS’s average 30-day churn rate has been 1.2 percent, meaning more than 98 percent of the people who subscribe to TidBITS stick with it for more than 30 days. Similarly, TidBITS’s average 90-day churn rate is 1.9 percent, and our average annual churn rate is 8.3 percent. Another interesting statistic is that more than 72 percent of the people who were subscribed to TidBITS when we took over the list 18 months ago are still subscribed today.
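For readers who want to compute something similar for their own lists, here is one way an N-day churn rate could be defined in Python. The exact formula behind the figures above isn't spelled out in this article, so treat this as an assumption: the share of subscriptions old enough to judge that ended within N days of starting.

```python
from datetime import date, timedelta


def churn_rate(subscriptions, window_days, today):
    """Fraction of subscriptions that ended within window_days of starting.

    subscriptions: iterable of (started, ended) dates, with ended=None for
    subscriptions still active. Subscriptions younger than the window are
    skipped, since they haven't had time to churn or to stick.
    """
    window = timedelta(days=window_days)
    eligible = [(s, e) for s, e in subscriptions if today - s >= window]
    if not eligible:
        return 0.0
    churned = sum(1 for s, e in eligible if e is not None and e - s <= window)
    return churned / len(eligible)
```

Note that this only works because termination dates are kept; with a delete-on-unsubscribe list there would be no ended subscriptions left to count.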
Although these kinds of figures are imprecise measures, they imply that TidBITS subscribers perceive TidBITS as a valuable service, and that they stick with TidBITS over time. People often tell us these things when they send us email, but it’s good to know those sentiments are borne out over time across the entire range of TidBITS subscribers.
Follow the Bouncing Email — One of the major problems with administering a list the size of TidBITS is the bounced email. We currently receive 12 to 15 MB of bounced messages and email errors in response to every TidBITS issue. Even though much of that material is the complete text of issues that have been returned to us, it’s still too much to cull by hand. So I wrote a program, called Hired Thug, that does most of the work.
Hired Thug works on a simple principle: it looks for several dozen known patterns in the bounces we receive and tries to ferret out the email address that generated the error. Hired Thug then checks the found address against a list of email addresses that have had problems in the past. If the found address isn’t in that list, Hired Thug appends it along with the date; otherwise, Hired Thug compares the date of the existing entry with the current date. If the two dates are sufficiently close together, Hired Thug finds the problematic address in the subscriptions database, suspends all associated subscriptions, and makes a note of the type of email error we received.
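The logic above can be sketched in Python. This is not Hired Thug's code: the two bounce patterns are hypothetical stand-ins for its several dozen, the 14-day window is an invented value for "sufficiently close together," and refreshing the logged date after an older bounce is an assumption about a case the description leaves open.

```python
import re
from datetime import date, timedelta

# Hypothetical examples of bounce formats; real bounces vary widely.
BOUNCE_PATTERNS = [
    re.compile(r"User unknown:\s*<?([^\s<>]+@[^\s<>]+)>?", re.I),
    re.compile(r"<([^\s<>]+@[^\s<>]+)>\.\.\. .*(?:unknown|rejected)", re.I),
]


def extract_address(bounce_text):
    """Return the failing address found by the first matching pattern, or None."""
    for pattern in BOUNCE_PATTERNS:
        match = pattern.search(bounce_text)
        if match:
            return match.group(1).lower()
    return None


def record_bounce(problem_log, address, today, window=timedelta(days=14)):
    """Return True if the address is a repeat offender that should be suspended.

    problem_log maps address -> date of the last logged problem. A first
    bounce, or one long after the previous bounce, is only logged.
    """
    previous = problem_log.get(address)
    if previous is not None and today - previous <= window:
        return True
    problem_log[address] = today  # assumption: refresh the date on stale entries
    return False
```

When `record_bounce` returns True, the caller would look the address up in the subscriptions database, suspend its subscriptions, and note the type of error.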
Every week, Hired Thug pulls 400 to 1,000 email addresses out of the errors and bounce messages we receive; but only one quarter to one half of those addresses are "repeat offenders" who have returned an error to us recently. A short-lived problem with your email provider won’t cause your TidBITS subscription to be terminated, but sustained problems that cause repeated errors will result in suspended subscriptions. Also, if you try to subscribe to TidBITS and our confirmation message can’t be delivered, we suspend the subscription immediately.
Because our database distinguishes between subscriptions that have been cancelled and subscriptions that have been suspended due to email errors, trying to resubscribe from an address we’ve suspended won’t work. Instead (assuming your email is working correctly), you’ll receive a notice that your subscriptions are suspended and you can’t reactivate them. At that point, contact us at <[email protected]> and explain the situation. We can usually tell you the nature of the errors we’ve received and help you get your subscription running again.
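In code terms, the distinction is a third subscription state alongside "active" and "cancelled." The Python sketch below uses hypothetical state names and reply texts to show how a subscribe request might be handled differently depending on that state.

```python
from enum import Enum


class Status(Enum):
    ACTIVE = "active"
    CANCELLED = "cancelled"   # the subscriber asked to leave
    SUSPENDED = "suspended"   # repeated email errors


def handle_subscribe_request(current):
    """Return (new_status, reply) for a subscribe request.

    current is None for an address the database has never seen.
    """
    if current is Status.SUSPENDED:
        # Only an administrator can lift a suspension.
        return Status.SUSPENDED, "suspended: contact the list administrator"
    if current is Status.ACTIVE:
        return Status.ACTIVE, "already subscribed"
    return Status.ACTIVE, "subscribed"
```

A cancelled subscription can simply be reactivated, while a suspended one stays suspended until someone has looked at the underlying email problem.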
Although mailing list software like LetterRip and Lyris (not available for Macs) now offer error-handling capabilities, they still use a "simple delete" methodology, so all information about the problematic address is lost. By tracking errors ourselves, we’re better able to track problems like errant spam filters, mail loops, or mail processing changes made by large services like AOL or CompuServe.
If you administer your own mailing list and want to use Hired Thug to process bounces, you can’t. Hired Thug is tightly integrated with the TidBITS subscriptions system and can’t easily be converted into a stand-alone utility. However, Vince Sabio’s program SmartBounce offers similar features, and it’s designed to work with most mailing list software. I wrote Hired Thug before SmartBounce was available, but if Hired Thug weren’t working so well and tailored precisely to our needs, we’d probably use SmartBounce.
Show Me The Database — When people learn that we manage subscriptions via a database, they immediately want to know why we don’t let people manage their own subscriptions or let people send subscription commands from email addresses other than the particular one subscribed to TidBITS. Although these requests are well-meaning, we don’t allow either, both to preserve the confidentiality of our subscribers and to prevent simple forms of list tampering.
For instance, if we were to set up Web-based searches through our subscriptions database, it wouldn’t be long before a spammer used that capability to obtain thousands of email addresses. Similarly, if we accepted subscription commands on behalf of third parties, some people would use the capability to determine whether a particular address was subscribed, cancel someone else’s subscriptions, or even add someone to the TidBITS list without their permission. If this sounds far-fetched, consider that we receive a few dozen bogus subscription attempts every week (with addresses ranging from Bill Gates to Bill Clinton), and occasionally someone attempts to subscribe an address twenty or thirty times under slight variants of the same email address. Although the mailing list database isn’t tamper-proof (an impossible task even with digital signatures or photo ID), we try to make abuse difficult.
All For Now — Ideally, all this complexity will be invisible to TidBITS subscribers, who should be able to sign up and begin receiving issues with minimal effort. In the future, we hope to offer more advanced capabilities and subscription options, but for now, we’re pleased that the system is working well and that we can do it all from our Macs.