William Porter 23 September 2002

Mailsmith and Distributed Filtering, Part 1

We’ve never met, but I know something about you: you’re getting more email this year than you did last year, possibly a lot more. If you simply let messages pile up in your incoming and outgoing mailboxes, sooner or later you’ll have an organizational nightmare on your hands. The best way to prevent this nightmare (and the best way to deal with the mess if it has already developed) is to define and use email filters. Indeed, after allowing you to receive mail and send mail, helping you organize your mail is the single most useful thing an email client can do, and filtering is the number one tool for the job.

This article is a followup to "Mailsmith 1.5: Lean, Mean Email Machine," my review of Mailsmith in TidBITS-638. In that review, I stated my judgment that Mailsmith’s filtering options are more powerful, more flexible, and more varied than those of any other Mac OS email client. Mailsmith’s most distinctive feature, called "distributed filtering," is so novel that the editors of TidBITS have given me a chance to say a bit more about the subject, both so people considering Mailsmith come to appreciate what it might offer them, and so those already using Mailsmith can take full advantage of the power at their fingertips.

<https://tidbits.com/getbits.acgi?tbart=06870>

<http://www.barebones.com/products/mailsmith.html>

Distributed Filtering — You can use, and people do use, Mailsmith’s filters in the traditional way, simply sorting incoming messages into the appropriate destination mailboxes. Mailsmith’s traditional filters are powerful, perhaps more so than those in any other email program. But Mailsmith also provides a completely different and wholly original way to approach filtering: distributed filtering.

If you use traditional filters, every message, as soon as it hits the incoming mailbox, is examined by each and every filter you have defined. Even if the message happens to meet the test in filter 29, it must usually continue to be tested against filters 30 through 50. When all the filters have had a chance to examine the incoming message, the program determines which tests, if any, have been satisfied, then decides how to process the message, resolving conflicts between filters if necessary. Normally, the result is that the message is sent directly to the mailbox where you want it to end up. Note that, in this scenario, the way your mailboxes are organized has no effect whatsoever upon filtering.

Not so with Mailsmith’s distributed filtering, which uses the way your mailboxes are organized as a way of controlling and limiting the application of filters to incoming messages. Incoming messages are greeted initially by the mailboxes at the top level of the hierarchy, starting with the first one in alphabetical order. As soon as a mailbox "recognizes" an incoming message, that is, as soon as a test in one of the filters attached to a mailbox is met, that mailbox lays claim to the message. The message now continues to be examined by any mailboxes inside the one that claimed it; but the message will never be tested by filters attached to mailboxes at the first level of the hierarchy that come alphabetically after the mailbox that claimed it.
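The traversal just described can be sketched in a few lines of Python. This is only an illustrative model, not Mailsmith's actual implementation: the Mailbox class and the claim and route functions are hypothetical names, messages are modeled as plain dictionaries, and the sketch assumes that child mailboxes are consulted in alphabetical order just as top-level ones are.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Message = Dict[str, str]
Test = Callable[[Message], bool]


@dataclass
class Mailbox:
    name: str
    tests: List[Test] = field(default_factory=list)          # filters attached here
    children: List["Mailbox"] = field(default_factory=list)  # nested mailboxes


def claim(siblings: List[Mailbox], msg: Message) -> Optional[Mailbox]:
    """Offer msg to sibling mailboxes alphabetically; the first match claims it."""
    for box in sorted(siblings, key=lambda b: b.name.lower()):
        if any(test(msg) for test in box.tests):
            return box  # later siblings never get to test the message
    return None


def route(top_level: List[Mailbox], msg: Message) -> List[str]:
    """Return the names of the mailboxes that claimed msg, outermost first."""
    path: List[str] = []
    level = top_level
    while True:
        box = claim(level, msg)
        if box is None:
            return path
        path.append(box.name)
        level = box.children  # keep filtering inside the claiming mailbox
```

The point of the model is the early return in claim: once one mailbox at a given level matches, its later siblings are skipped and only its children remain in play.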

How Distributed Filtering Works — My description above is by necessity a bit abstract, so let’s look at a concrete example that shows the power of distributed filters.

Consider the following mailbox hierarchy, based loosely on my own setup. The (incoming) and (trash) mailboxes belong to Mailsmith – in other words, they are created by Mailsmith and cannot be moved, deleted or renamed. I created the other top-level mailboxes – clients, lists & subscriptions, and personal – which correspond to the three main server accounts (POP mailboxes) from which I download email.

 (incoming)

 (trash)

 clients

 lists & subscriptions

   – Mailsmith

      — Mailsmith / keep

   – FileMaker

      — FileMaker / keep

   – TidBITS

 personal

Let’s create filters that will catch messages from the first two server accounts:

 If Server Account Contains "clients"

 [Then] Deposit

and

 If Server Account Contains "lists"

 [Then] Deposit

Now attach the first filter to the clients mailbox and attach the second filter to the lists & subscriptions mailbox. (Note that my example filters look much like what you will see in the Mailsmith filter definition dialog. I’ve edited them only slightly to make them easier to understand here.)

When new mail arrives from the lists server account, it will be offered first to the clients mailbox for examination, because "clients" sorts alphabetically before "lists & subscriptions." But the message won’t match the criterion in the clients filter, so it will be passed to lists & subscriptions. The filter attached to that mailbox will match the message, so the message will be deposited in lists & subscriptions. The personal mailbox will never see it.

But the message is not home yet. It may be filtered further by the mailboxes inside lists & subscriptions. There is a mailbox named "TidBITS" in there, and let’s assume that this filter is attached to it:

 If To Contains "tidbits"

 [Then] Deposit

If our imaginary message happens to meet this test, it will end up deposited in the TidBITS mailbox. Using distributed filters with the "deposit" action, messages percolate through the mailbox hierarchy in a straightforward and efficient way.
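To make the walk-through concrete, here is a small Python simulation of the example hierarchy. It is a sketch under stated assumptions rather than a description of Mailsmith's internals: the dictionary keys server_account and to, the address tidbits@example.com, the case-insensitive reading of "Contains", and the alphabetical ordering of child mailboxes are all assumptions made for illustration. The filters on the Mailsmith and FileMaker mailboxes are left out because they have not been shown at this point.

```python
def contains(field, text):
    """Build a test roughly equivalent to 'If <field> Contains <text>'."""
    return lambda msg: text.lower() in msg.get(field, "").lower()


# Each mailbox is (name, attached test or None, child mailboxes).
HIERARCHY = [
    ("clients", contains("server_account", "clients"), []),
    ("lists & subscriptions", contains("server_account", "lists"), [
        ("Mailsmith", None, []),   # filter omitted in this sketch
        ("FileMaker", None, []),   # filter omitted in this sketch
        ("TidBITS", contains("to", "tidbits"), []),
    ]),
    ("personal", None, []),
]


def deposit(mailboxes, msg, trace):
    """Offer msg to each mailbox alphabetically; recurse into the one that claims it."""
    for name, test, children in sorted(mailboxes, key=lambda m: m[0].lower()):
        trace.append(f"offered to {name}")
        if test is not None and test(msg):
            trace.append(f"deposited in {name}")
            deposit(children, msg, trace)
            return  # later siblings never see the message


trace = []
deposit(HIERARCHY, {"server_account": "lists", "to": "tidbits@example.com"}, trace)
print("\n".join(trace))
```

In the printed trace the message is offered to clients, claimed by lists & subscriptions, and finally deposited in TidBITS; personal never appears.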

Why is this approach better? Setting up distributed filters is concrete. You can visualize the way your filters will work by simply looking at your mailbox list. This makes troubleshooting easier, too. None of my mailboxes have more than one or two filters attached to them; my incoming mailbox has no filters attached to it at all. If mail does not end up where it is supposed to end up, I just observe where it does end up in my folder hierarchy, and climb back up the mailbox tree until I find the branch where things went wrong. This process almost never requires looking at more than one or two filters. In Microsoft Entourage, by contrast, if you have fifty filters and one isn’t working, almost any of the other filters could potentially be causing the problem — not to mention Entourage’s mailing list rules and junk mail filters, both of which are located elsewhere in the program.

Filtering to the Max — So far we’ve looked only at the basics of distributed filtering. What’s most impressive about distributed filtering is not that it does what traditional filters do, just a little better, but rather that distributed filtering takes the whole idea of processing your mail to a new level. Consider the following:

I subscribe to the active and helpful Mailsmith Talk list. A filter initially deposits incoming mail from the list in a mailbox named "Mailsmith." When I find the time to read new messages, their status changes from unread to read automatically. I enjoy reading all the messages (traffic on the list is not so heavy that this is impossible) but I’m interested in saving only a handful each week. So as I read, if I want to keep a message for future reference, I use a simple keystroke I defined to mark the message with a custom label ("keep"). Now, inside my "Mailsmith" mailbox there is a child mailbox named "Mailsmith / keep," to which two filters are attached. Here is the first, named "Archiving."

 If ((Label Is Equal To "keep"

   Or From Contains "[email protected]")

   Or Answered Is Equal To True)

   And Read Is Equal To True

 [Then] Deposit

I’ve used parentheses above to show how Mailsmith interprets the criteria. This filter catches messages that have been read and that meet at least one of the first three criteria: I applied the label "keep" to them, they’re from me, or I replied to them.

What happens to the rest of the messages? They are processed by the following simple filter named "Trash."

 If Read Is Equal To True

 [Then] Transfer [to] "(trash)"

This filter simply takes everything that wasn’t caught by the first filter and moves it into the trash mailbox.

Note that the alphabetization of the filter names matters here. If the Trash filter got to the messages before the Archiving filter, well, all my read mail would get routed into the trash. I could make the Trash filter safer by adding more tests to it, but I have come to trust this setup completely.
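The interplay of the two filters can be modeled in Python as follows. This is a hedged sketch, not Mailsmith's logic verbatim: it assumes that filters on a mailbox run in alphabetical order by name and that the first one to match ends processing, the dictionary keys are invented for illustration, and me@example.com is a hypothetical placeholder standing in for the author's address.

```python
MY_ADDRESS = "me@example.com"  # hypothetical placeholder address


def archiving(msg):
    # ((Label is "keep" OR From contains my address) OR Answered) AND Read
    return ((msg["label"] == "keep" or MY_ADDRESS in msg["from"])
            or msg["answered"]) and msg["read"]


def trash(msg):
    # Anything that has been read and was not caught earlier
    return msg["read"]


FILTERS = {
    "Archiving": (archiving, "deposit"),
    "Trash": (trash, "transfer to (trash)"),
}


def reapply(msg):
    """Run the filters in alphabetical order by name; the first match wins."""
    for name in sorted(FILTERS):
        test, action = FILTERS[name]
        if test(msg):
            return f"{name}: {action}"
    return "no filter matched"
```

Renaming "Archiving" to something that sorts after "Trash" would, in this model, send every read message to the trash, which is exactly the hazard described above. Unread messages match neither filter and stay where they are.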

Of course, incoming messages are by definition unread, so these filters never catch new messages. They process messages after they have been read; most filters process messages before they are read. So how are these filters activated? Although I could automate the process by writing a simple AppleScript script that runs, say, every time I launch Mailsmith, I prefer to activate the filters manually, by using Mailsmith’s Re-Apply Filters command on selected mailboxes. Messages that had already been filtered once when they arrived are now filtered again, and since their properties have changed, they meet filter tests that they didn’t meet originally.

And so all my list traffic – hundreds of messages a day – is processed from cradle to grave, so to speak, by Mailsmith’s distributed filters. I don’t bother deleting messages one by one. Instead, as I read, I focus on what I want to keep, rather than on what I want to trash. This is far more efficient, since in most cases, I want to keep far fewer messages than I want to delete.

Contextual Filtering — But wait, distributed filtering is even cooler yet! You can attach the very same filter to many different folders, and its effect will be determined by the context in which it is applied.

All of my list mail is processed in exactly the same way as mail I receive from the Mailsmith Talk list. Mail from the various FileMaker lists I subscribe to is deposited initially in a "FileMaker" mailbox. Inside that mailbox, there is a child mailbox named "FileMaker / keep," to which are attached the same two filters attached to the "Mailsmith / keep" mailbox.

Look back at those two filters and you’ll see they test for properties that have nothing to do with whether a message came to the Mailsmith list or the FileMaker list. You can test in Entourage to see if a particular message is in a particular folder and respond accordingly, but that isn’t contextual filtering, because the test must be defined within the filter.
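A short Python sketch may help show why the same filter behaves differently depending on where it is attached. It is a deliberately simplified model: is_keeper is a hypothetical, cut-down version of the Archiving filter, and the ATTACHED table and destination function are invented for illustration. The key assumption is that "Deposit" means "into whichever mailbox this filter is attached to," so the filter itself never names a destination.

```python
def is_keeper(msg):
    """Simplified Archiving test: read, and either labeled 'keep' or answered."""
    return msg["read"] and (msg["label"] == "keep" or msg["answered"])


# One filter definition, attached to two different mailboxes.
ATTACHED = {
    "Mailsmith / keep": [is_keeper],
    "FileMaker / keep": [is_keeper],
}


def destination(parent, msg):
    """Where does a message sitting in `parent` land when filters are re-applied?"""
    child = f"{parent} / keep"
    if any(test(msg) for test in ATTACHED.get(child, [])):
        return child   # deposited in the mailbox the filter is attached to
    return parent      # nothing claimed it, so it stays put


msg = {"read": True, "label": "keep", "answered": False}
print(destination("Mailsmith", msg))
print(destination("FileMaker", msg))
```

The two calls use the same message and the same test function but yield different mailboxes, because the destination comes from the attachment point rather than from anything written into the filter.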

Filtering Multiple Accounts — Distributed filtering works exceptionally well for users like me who have multiple email accounts. It lets me route all mail from one account directly into that account’s top-level mailbox, and then filter further using content-based tests specific to the mail I get from that account. The content filtering works especially well for my list traffic, since list messages always come to the same address and are easy to match in a filter.

Unfortunately, not all of my incoming mail is so cooperative, and some of the uncooperative mail is extremely important. I try to encourage my clients to use a special email address when they write to me, so their mail ends up in a dedicated POP account. I can then snag it with this filter attached to the clients mailbox:

 If Server Account Contains "clients"

 [Then] Deposit

Inside the clients mailbox, I have special mailboxes defined for clients with active projects. Each of these mailboxes has attached to it a filter that catches mail specifically from that client. For example, the filter attached to the mailbox for a client named Not So Big Company, Inc., might look like this:

 If From Contains "@notsobig.com"

 [Then] Deposit

But as you might imagine, my clients do not always use the preferred address when they write to me. Sometimes client mail comes to my personal account instead. My solution is simply to attach the client-specific filters both to the top-level "clients" mailbox and to the individual client mailboxes inside it. That way, if the first filter doesn’t catch the message, the second filter will. Any given mailbox can have multiple filters attached to it.
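The double attachment can be sketched in Python like this. It is an illustrative model only: the dictionary keys, the mailbox name "Not So Big Company", and the case-insensitive matching are assumptions, and the two-level route function is hard-coded for this one client rather than being a general traversal.

```python
def via_clients_account(msg):
    """If Server Account Contains "clients"."""
    return "clients" in msg["server_account"].lower()


def from_not_so_big(msg):
    """If From Contains "@notsobig.com"."""
    return "@notsobig.com" in msg["from"].lower()


# The client-specific filter is attached at both levels.
CLIENTS_TESTS = [via_clients_account, from_not_so_big]
NOT_SO_BIG_TESTS = [from_not_so_big]


def route(msg):
    """Return the chain of mailboxes that claim msg, outermost first."""
    path = []
    if any(test(msg) for test in CLIENTS_TESTS):
        path.append("clients")
        if any(test(msg) for test in NOT_SO_BIG_TESTS):
            path.append("Not So Big Company")
    return path


# Client mail that arrived through the personal account is still caught.
print(route({"server_account": "personal", "from": "pat@notsobig.com"}))
```

Because "clients" sorts ahead of "personal", the clients mailbox gets the first look at such a message, and the redundant filter is what lets it claim mail that came through the wrong account.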

Is this approach better than simply defining a transfer-action filter and attaching it to the incoming mailbox? I think so. Even when there is a certain amount of redundancy in the way they are applied, distributed filters are still easier to define and troubleshoot, although it would be nice if Mailsmith’s filter list could show me to which mailboxes a given filter is currently attached.

Next week, I’ll finish up this explanation of Mailsmith’s innovative distributed filtering by examining how you can use distributed filtering to manage not just your incoming mail, but your outgoing mail as well. Plus, we’ll look at how distributed filtering can help you stem the ever-increasing tide of spam.
