Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:
That one I am not sure. I got mails of the opposite opinions asking
for strict RFC 2047 compliance (in Jcode), especially when line folding
was concerned. So I made Encode::MIME::Header RFC 2047 compliant. But
I agree that =20 instead of '_' may be too much. Nevertheless, =20 is
exactly what RFC 2047 recommends;
RFC 2047
As a consequence, unencoded white space
characters (such as SPACE and HTAB) are FORBIDDEN within an
'encoded-word'.
I must re-read the RFC, but I think I am saying "don't encode multiple
ASCII words as one UTF-8 word".
For example, the character sequence
=?iso-8859-1?q?this is some text?=
would be parsed as four 'atom's, rather than as a single 'atom' (by
an RFC 822 parser) or 'encoded-word' (by a parser which understands
'encoded-words'). The correct way to encode the string "this is some
text" is to encode the SPACE characters as well, e.g.
=?iso-8859-1?q?this=20is=20some=20text?=
But likewise a traditional RFC822 Subject line
Subject: This is some text
_is_ 4 words
But
Subject: =?iso-8859-1?q?this=20is=20some=20text?=
Is one word.
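
To make the word-count point concrete, here is a rough Python sketch. It
approximates RFC 822 atoms by plain whitespace splitting, which is a
simplification and not a real header parser; the helper name count_atoms
is made up for this illustration.

```python
def count_atoms(header_value: str) -> int:
    """Count whitespace-separated tokens, a rough stand-in for RFC 822 atoms."""
    return len(header_value.split())


# A plain ASCII subject: four separate words.
plain = "This is some text"

# Unencoded spaces inside the encoded-word: it falls apart into four tokens.
broken = "=?iso-8859-1?q?this is some text?="

# Spaces encoded as =20: the whole thing stays a single token.
encoded = "=?iso-8859-1?q?this=20is=20some=20text?="

for label, value in (("plain", plain), ("broken", broken), ("encoded", encoded)):
    print(label, count_atoms(value))
```

The sketch only counts tokens; it says nothing about how a mail client
would then fold or display them.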
(3) 8-bit values which correspond to printable ASCII characters other
than "=", "?", and "_" (underscore), MAY be represented as those
characters. (But see section 5 for restrictions.) In
particular, SPACE and TAB MUST NOT be represented as themselves
within encoded words.
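
As an illustration of the rule quoted above, a minimal 'Q' encoder might
look like the following Python sketch. It is not how Encode::MIME::Header
is implemented; the function name q_encode_word and the
space_as_underscore flag are invented here. The safe set is the
conservative one RFC 2047 section 5 gives for encoded-words in a phrase
(letters, digits, "!", "*", "+", "-", "/"), and the sketch ignores the
75-character limit and line folding.

```python
# Characters that may appear as themselves (RFC 2047 section 5, rule 3).
Q_SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789!*+-/"
)


def q_encode_word(text, charset="utf-8", space_as_underscore=False):
    """Return a single 'Q' encoded-word for text (no length limit applied)."""
    out = []
    for byte in text.encode(charset):
        ch = chr(byte)
        if ch in Q_SAFE:
            out.append(ch)
        elif byte == 0x20 and space_as_underscore:
            # "_" is the shorthand for SPACE inside a Q encoded-word.
            out.append("_")
        else:
            # Everything else, including SPACE, "=", "?" and "_" itself.
            out.append("=%02X" % byte)
    return "=?%s?q?%s?=" % (charset, "".join(out))


print(q_encode_word("this is some text", "iso-8859-1"))
print(q_encode_word("this is some text", "iso-8859-1", space_as_underscore=True))
```

Either spelling of SPACE keeps raw whitespace out of the encoded-word; the
flag only chooses between the two representations discussed in this
thread.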
With this understood,
Suggestions:
- leave ASCII or even iso-8859-1 sequences as such
Only ASCII printable was allowed so I have to decline this one.
ASCII printable would solve most of my issues - my memory of RFC was
that iso-8859-1 was the "default" - if it is only ASCII then fine.
'MIME-Q' is already implemented that way. Bottom line is that I do not
want to give up RFC 2047 conformance.
Neither do I.
- wrap sequences of ch > 0xff in whichever of 'Q' or 'B' is shorter
(do both encodings and throw one away).
I'll consider this one instead. This one at least does not breach RFC
2047.
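
The "do both encodings and throw one away" idea could be sketched like
this in Python. This is only an illustration of the suggestion, not a
patch for Encode; the names q_payload and shortest_encoded_word are
hypothetical, ties go to 'Q' as an arbitrary choice, and the 75-character
limit on encoded-words is not handled.

```python
import base64

# Conservative set of characters left unencoded in 'Q' (RFC 2047 section 5).
Q_SAFE = set(
    "abcdefghijklmnopqrstuvwxyz"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "0123456789!*+-/"
)


def q_payload(data: bytes) -> str:
    """'Q' payload: safe characters as-is, SPACE as '_', the rest as =XX."""
    parts = []
    for byte in data:
        ch = chr(byte)
        if ch in Q_SAFE:
            parts.append(ch)
        elif byte == 0x20:
            parts.append("_")
        else:
            parts.append("=%02X" % byte)
    return "".join(parts)


def shortest_encoded_word(text, charset="utf-8"):
    """Encode text with both 'Q' and 'B' and keep the shorter encoded-word."""
    data = text.encode(charset)
    q = q_payload(data)
    b = base64.b64encode(data).decode("ascii")
    if len(q) <= len(b):
        return "=?%s?Q?%s?=" % (charset, q)
    return "=?%s?B?%s?=" % (charset, b)


# Mostly ASCII text tends to favour 'Q'; dense non-ASCII text favours 'B'.
print(shortest_encoded_word("Meeting notes from Andr\u00e9"))
print(shortest_encoded_word("\u65e5\u672c\u8a9e"))
```

Both candidates are valid encoded-words, so picking by length does not by
itself affect RFC 2047 conformance.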
Are patches in that direction likely to be accepted or do I build
a MIME-Smart on top ?
As I said, Encode::MIME::Header has these restrictions:
* the Encode API
* RFC 2047
This is very restrictive considering the nature of MIME Header
Encoding. Surprisingly the name space Encode::MIME itself remains
empty and maybe we can make use of it....
I probably will - there are a whole slew of Encode-oid issues with
body part of MIME.
Dan the Encode Maintainer
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/