5. Use of encoded-words in message headers

An encoded-word may appear in a message header or body part header according to the following rules:

  1. An encoded-word may replace a "text" token (as defined by RFC 822) in any Subject or Comments header field, any extension message header field, or any RFC 1521 body part field for which the field body is defined as "*text". An encoded-word may also appear in any user-defined ("X-") message or body part header field.

    Ordinary ASCII text and encoded-words may appear together in the same header field. However, an encoded-word that appears in a header field defined as "*text" MUST be separated from any adjacent encoded-word or "text" by linear-white-space.

  2. An encoded-word may appear within a comment delimited by "(" and ")", i.e., wherever a "ctext" is allowed. More precisely, the RFC 822 ABNF definition for "comment" is amended as follows:

           comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"

    A "Q"-encoded encoded-word which appears in a comment MUST NOT contain the characters "(", ")" or "\". An encoded-word that appears in a "comment" MUST be separated from any adjacent encoded-word or "ctext" by linear-white-space.

  3. An encoded-word may replace a "word" entity within a "phrase", for example, one that precedes an address in a From, To, or Cc header. The ABNF definition for "phrase" from RFC 822 thus becomes:

           phrase = 1*(encoded-word / word)

    In this case the set of characters that may be used in a "Q"-encoded encoded-word is restricted to: <upper and lower case ASCII letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_" (underscore, ASCII 95)>. An encoded-word that appears within a "phrase" MUST be separated from any adjacent "word", "text" or "special" by linear-white-space.

These are the ONLY locations where an encoded-word may appear. In particular, an encoded-word MUST NOT appear in any portion of an "addr-spec". In addition, an encoded-word MUST NOT be used in a Received header field.
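
As an illustration of the character restriction in rule 3, the following is a minimal sketch of a check that a "Q"-encoded encoded-word is acceptable within a "phrase". The names ENCODED_WORD, PHRASE_SAFE and q_word_ok_in_phrase are hypothetical, and the regular expression is a simplified approximation of the encoded-word syntax, not a full parser.

```python
import re
import string

# Characters allowed in a "Q" encoded-text inside a "phrase" (rule 3):
# ASCII letters, decimal digits, and ! * + - / = _
PHRASE_SAFE = frozenset(string.ascii_letters + string.digits + "!*+-/=_")

# Simplified, hypothetical pattern: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")


def q_word_ok_in_phrase(token: str) -> bool:
    """Return True if token is a "Q" encoded-word usable within a phrase."""
    m = ENCODED_WORD.fullmatch(token)
    if m is None or m.group(2).upper() != "Q":
        return False
    # Every character of the encoded-text must be in the restricted set.
    return all(ch in PHRASE_SAFE for ch in m.group(3))
```

A composer could apply such a check to each encoded-word it places before an address in a From, To, or Cc field, and fall back to hex-encoding any character outside the set.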

Each encoded-word MUST encode an integral number of octets. The encoded-text in each encoded-word must be well-formed according to the encoding specified; the encoded-text may not be continued in the next encoded-word. (For example, "=?charset?Q?=?= =?charset?Q?AB?=" would be illegal, because the two hex digits "AB" must follow the "=" in the same encoded-word.)
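
The well-formedness requirement can be sketched as a per-word check, applied to the encoded-text of each encoded-word in isolation. The function names are hypothetical. Accepting lower case hex digits in the "Q" check is an assumption made for leniency, and the "B" check simply relies on strict base64 decoding.

```python
import base64
import re

# In "Q" encoded-text, every "=" must be followed by two hex digits
# within the SAME encoded-word. Lower case hex is accepted here as an
# assumption; "?" and white space may not appear unencoded.
Q_TEXT = re.compile(r"(?:[^=?\s]|=[0-9A-Fa-f]{2})+")


def q_text_well_formed(encoded_text: str) -> bool:
    return Q_TEXT.fullmatch(encoded_text) is not None


def b_text_well_formed(encoded_text: str) -> bool:
    if not encoded_text:
        return False
    try:
        # validate=True rejects characters outside the base64 alphabet;
        # incomplete padding also raises an error.
        base64.b64decode(encoded_text, validate=True)
    except ValueError:
        return False
    return True
```

Under this sketch, the encoded-text "=" from the illegal example above is rejected, because the check never looks at the following encoded-word.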

Each encoded-word MUST represent an integral number of characters. A multi-octet character may not be split across adjacent encoded-words.
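
One way to honor this rule when composing is to fill each encoded-word character by character, never octet by octet. The sketch below is hypothetical (encode_words is not a standard function) and assumes a stateless charset such as UTF-8, the "B" encoding, and a caller-supplied length limit (max_len) for each encoded-word.

```python
import base64


def encode_words(text: str, charset: str = "utf-8", max_len: int = 75) -> list[str]:
    """Split text into "B" encoded-words without splitting any character."""
    prefix, suffix = f"=?{charset}?B?", "?="
    # Base64 emits 4 characters per 3 octets, so derive an octet budget.
    budget = (max_len - len(prefix) - len(suffix)) // 4 * 3
    words, chunk = [], b""
    for ch in text:
        octets = ch.encode(charset)  # all octets of ONE character
        if chunk and len(chunk) + len(octets) > budget:
            # Flush before the character that would not fit as a whole.
            words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
            chunk = b""
        chunk += octets
    if chunk:
        words.append(prefix + base64.b64encode(chunk).decode("ascii") + suffix)
    return words
```

Because each chunk ends on a character boundary, every resulting encoded-word can be decoded and converted to characters on its own. Adjacent encoded-words would then be joined with linear-white-space when placed in a header field.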

Only printable and white space character data should be encoded using this scheme. However, since these encoding schemes allow the encoding of arbitrary octet values, mail readers that implement this decoding should also ensure that display of the decoded data on the recipient's terminal will not cause unwanted side-effects.
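
A mail reader might guard against such side-effects by filtering control characters out of decoded text before display. The following is only a sketch of one possible policy: the function name safe_for_display is hypothetical, and the choice to replace every control character other than space and tab with U+FFFD is an assumption, not a requirement of this memo.

```python
import unicodedata


def safe_for_display(decoded: str, replacement: str = "\ufffd") -> str:
    """Replace control characters (e.g. ESC, BEL) in decoded header text."""
    out = []
    for ch in decoded:
        if ch in (" ", "\t"):
            out.append(ch)  # ordinary white space is kept
        elif unicodedata.category(ch) == "Cc":
            out.append(replacement)  # C0/C1 controls and DEL
        else:
            out.append(ch)
    return "".join(out)
```

With this policy an embedded escape sequence loses its ESC octet and is shown as inert text instead of being interpreted by the terminal.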

Use of these methods to encode non-textual data (e.g., pictures or sounds) is not defined by this memo. Use of encoded-words to represent strings of purely ASCII characters is allowed, but discouraged. In rare cases it may be necessary to encode ordinary text that looks like an encoded-word.
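
For that rare case, a minimal sketch of a "Q" encoder for ASCII text is shown below. The name q_encode_literal is hypothetical, and the conservative choice to leave only letters and digits unencoded is an assumption that keeps the result usable in any of the locations listed above.

```python
import string

SAFE = frozenset(string.ascii_letters + string.digits)


def q_encode_literal(text: str, charset: str = "US-ASCII") -> str:
    """Wrap ASCII text in a "Q" encoded-word so it is not misread as one."""
    parts = []
    for octet in text.encode("ascii"):
        ch = chr(octet)
        if ch in SAFE:
            parts.append(ch)
        elif ch == " ":
            parts.append("_")  # "_" represents a space in the "Q" encoding
        else:
            parts.append(f"={octet:02X}")  # "=" plus two upper case hex digits
    return f"=?{charset}?Q?{''.join(parts)}?="
```

For example, the literal text "=?foo?Q?bar?=" becomes "=?US-ASCII?Q?=3D=3Ffoo=3FQ=3Fbar=3F=3D?=", which decodes back to the original characters instead of being interpreted as an encoded-word itself.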