Will Mayall 23 October 1997

Winning the MIME QP Doll

[NetBITS reader Donovan Watts <[email protected]> asked why it is that a number of email messages in Claris Emailer, mostly digest versions of mailing lists, contain equal symbols at the end of some lines. Although this annoyance is by no means limited to Emailer, we turned to Will Mayall of Fog City Software, developers of Emailer and the extremely neat LetterRip mailing list management software, for the answer, which turned out to be rather complex. Take it away, Will. -Adam]

<http://www.fogcity.com/>

The short answer to why you sometimes see equal symbols in your email is that Claris Emailer, like most modern email programs, is MIME (Multipurpose Internet Mail Extensions) compliant. The problem is that sometimes mail is forwarded by servers that strip out some of the information necessary to identify and decode the MIME content properly. This isn’t necessarily the fault of any of the programs, but is usually a result of a mixture of old and new standards.

To encode extended ASCII characters (8-bit ASCII), MIME-formatted messages generally use quoted-printable (QP) encoding. QP generally leaves normal ASCII (7-bit) alone and only encodes the extended ASCII. Each extended ASCII character is encoded as an equal symbol followed by the two-digit hex value of the character. That’s why you sometimes see things like =E3 embedded in the text. QP must also encode = symbols themselves (as =3D), since they are part of the encoding process. (This is a reason to avoid using = symbols as "cosmetic" items in text.)
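The encoding rule described above can be tried with the quopri module from Python's standard library. This is only an illustrative sketch: the sample text and the choice of the Latin-1 character set are assumptions made for the example, not anything specific to Emailer.

```python
import quopri

# Assumed sample text: "café" contains one 8-bit character (0xE9 in Latin-1),
# and "a=b" contains a literal equal symbol that QP must also escape.
raw = "café a=b".encode("latin-1")

encoded = quopri.encodestring(raw)
print(encoded)

# Decoding reverses the process and restores the original bytes.
restored = quopri.decodestring(encoded)
print(restored.decode("latin-1"))
```

The plain 7-bit letters pass through untouched, while the accented character and the equal symbol each become an = followed by two hex digits.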

QP also marks all "soft" line breaks with an = symbol followed by a carriage return. Soft line breaks occur when carriage returns are automatically inserted within paragraphs to keep encoded lines to 76 characters or fewer. The = symbols at the ends of lines are generally the most distinctive aspect of a QP-encoded message that has not been decoded.
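Soft line breaks can be observed the same way. The sketch below, again using Python's standard quopri module with an invented sample paragraph, encodes one long line and then decodes it; the encoder's exact choice of break positions is an implementation detail and is not shown.

```python
import quopri

# Assumed sample: a single paragraph with no line breaks, well over 76 characters.
long_paragraph = (
    b"Soft line breaks let a sender keep every transmitted line short "
    b"while the recipient still sees one unbroken paragraph after decoding."
)

encoded = quopri.encodestring(long_paragraph)

# Each encoded line that was split artificially ends with an equal symbol.
for line in encoded.split(b"\n"):
    print(len(line), line)

# A MIME-aware reader removes the "=" plus line break pairs again.
decoded = quopri.decodestring(encoded)
print(decoded == long_paragraph)
```

A program that never decodes the message shows the encoded form as is, which is where the trailing equal symbols come from.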

Incidentally, soft line breaks are one of the most obvious differences between most QP encoded messages and "old style" messages. Messages used to always have hard returns within paragraphs when the lines in paragraphs were longer than 75-80 characters. This was necessary due to limitations in some mail servers. QP maintains the line length restriction, but marks the artificial line breaks. Then, when the message is decoded, the line breaks are removed. Some email programs, most notably Netscape 2.0, improperly display the decoded paragraphs as a single long line of text that never wraps.

Also as an aside, the QP encoding was designed in such a way that even if a message has been encoded but then not decoded, the encoding is not so obnoxious as to make the message unreadable. This was an elegant solution to the problem of backwards compatibility.

There are two reasons why QP encoding might remain visible in a received message. First, if you use a non-MIME compliant email program, it won’t decode the QP encoding, so it will remain in the text. Second, if the necessary information that tells your email program to decode QP is missing, your email program won’t know how to do the decoding.

In email, information about the message is generally maintained in header lines, and MIME messages insert at least one header to identify that they are MIME messages. It looks like this:

Mime-Version: 1.0

In addition, if the message is QP encoded, there will also be at least the following header, although generally there are several others:

Content-transfer-encoding: quoted-printable

Email programs need to know that a message is a MIME QP encoded message for a message to be properly decoded. If a message lacks the above headers, it won’t be decoded.
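As a rough sketch of that decision, the following Python uses the standard email and quopri modules. The function name body_text, the two sample messages, and the assumption that the text is Latin-1 are all hypothetical choices for illustration; a real mail program does considerably more.

```python
import quopri
from email import message_from_string

def body_text(raw_message: str) -> str:
    """Decode the body only when the headers say it is MIME and QP encoded."""
    msg = message_from_string(raw_message)
    is_mime = msg.get("MIME-Version") is not None  # header names match case-insensitively
    encoding = (msg.get("Content-Transfer-Encoding") or "").strip().lower()
    payload = msg.get_payload()
    if is_mime and encoding == "quoted-printable":
        # Assumption: the sender used Latin-1; real mail names its charset in Content-Type.
        return quopri.decodestring(payload.encode("latin-1")).decode("latin-1")
    return payload  # without the headers, the encoding is left visible

with_headers = (
    "Mime-Version: 1.0\n"
    "Content-transfer-encoding: quoted-printable\n"
    "\n"
    "caf=E9 costs =3D 2\n"
)
stripped = "Subject: digest\n\ncaf=E9 costs =3D 2\n"

print(body_text(with_headers))
print(body_text(stripped))
```

The two messages have identical bodies; only the one that kept its headers comes out decoded.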

Although mail servers are not supposed to remove headers, there are a few miscreants that monkey with the headers. These are often gateways to local email systems. So, if your organization uses one of these gateway programs, your email program won’t know how to decode the QP encoding, even if it is capable of doing so.

Another common source of the problem is mailing list servers. Mailing list servers are not simple mail servers. Not only do they forward messages like other mail servers, they also create digests of multiple messages. The digests are where the problem arises.

Remember that the proper message headers are extremely important. Without the headers, the QP encoding is not decoded. In most digests, most of the headers for each individual message are removed. In particular, MIME headers generally bite the dust.

Since a digest can include both encoded and non-encoded messages, it is sent without QP headers and therefore won’t be decoded. Although annoying to some, this technique is in fact the conservative and correct thing to do. There is little that can be done to avoid the problem other than to encourage others not to send QP encoded messages to mailing lists.
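To see why labeling a whole mixed digest as QP would be risky, consider what happens when a decoder is run over text that was never encoded. The sample sentence below is invented for the illustration; the decoder is Python's standard quopri module.

```python
import quopri

# Assumed sample: ordinary, never-encoded text that happens to contain
# an equal symbol followed by two hex digits.
plain = b"Set the mask to x=FF before use.\n"

# A decoder that trusted a blanket QP header would treat "=FF" as an escape.
decoded = quopri.decodestring(plain)
print(decoded)
print(decoded == plain)
```

The unencoded message is silently damaged, which is worse than leaving a few stray equal symbols visible in the encoded ones.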

A solution to the issue of MIME messages in digests does exist. The MIME standard (RFC) defines a digest format, the multipart/digest content type, which retains enough information about each individual message within the digest that each individual message can be decoded. However, there are several problems even with MIME digests:

Many mailers do not decode MIME digests. MIME digests that are not decoded are readable but aren’t pretty.

For a MIME digest to be decoded properly, it should be burst into individual messages. However, for many people, the utility of a digest is that it’s a single document to scan through, whereas a burst digest requires the user to look at many separate messages.
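For the curious, bursting a MIME digest can be sketched with Python's standard email package. The digest text, the boundary string "sep", and the function name burst are made up for this example; the sketch assumes a well-formed multipart/digest in which every part is a complete message with its own headers.

```python
from email import message_from_string

# Hypothetical two-message digest. Each part starts with a blank line
# (no part headers), followed by a full message that keeps its own headers.
digest_text = """\
Mime-Version: 1.0
Content-Type: multipart/digest; boundary="sep"

--sep

Subject: first
Mime-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

caf=E9

--sep

Subject: second

plain text

--sep--
"""

def burst(text):
    """Split a MIME digest into (subject, decoded body) pairs."""
    digest = message_from_string(text)
    results = []
    for part in digest.get_payload():
        inner = part.get_payload(0)  # each digest part wraps exactly one message
        charset = inner.get_content_charset() or "us-ascii"
        # decode=True applies the message's own Content-Transfer-Encoding, if any
        body = inner.get_payload(decode=True).decode(charset, errors="replace")
        results.append((inner["Subject"], body.strip()))
    return results

messages = burst(digest_text)
for subject, body in messages:
    print(subject, "->", body)
```

Because each message carries its own headers inside the digest, the encoded one is decoded and the plain one is left alone.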
