ietf-822

Re: text --> IA5 ?

1991-04-09 07:17:46
Vincent's latest message was, as usual, thoughtful and reasonable.  It
also worries me.

The problem appears to be what I would characterize as the attempt to
*combine* two different content-types.   In this case, we have the
apparently reasonable request for nroff source with an international
character set.

This seems to violate a very fundamental assumption that we've been
making all along, which is that mail messages (or parts of messages) are
uniquely typed.  Here we have a sort of "multiple inheritance" problem,
and at least two radical solutions might spring to mind:

-- make character sets & content types independent (Vincent mentioned
this one, as did Timo earlier)
-- allow somehow for general-case multiple or cascaded content-types
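
To make the difference between these two options concrete, here is a
rough sketch in Python.  The class and field names are invented purely
for illustration and are not part of any proposal made on this list, and
the ordering of layers in the cascaded case is an assumption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class UniquelyTyped:
    """The working assumption so far: one type label per body part."""
    content_type: str


@dataclass
class IndependentCharset:
    """First option: the charset is carried separately from the type."""
    content_type: str
    charset: Optional[str] = None


@dataclass
class CascadedTypes:
    """Second option: an ordered list of types (ordering is hypothetical)."""
    layers: List[str]


# Vincent's example, expressed under each option.
nroff_option_1 = IndependentCharset("nroff", charset="iso-8859-1")
nroff_option_2 = CascadedTypes(["iso-8859-1", "nroff"])
```

Either shape means every receiving agent must be prepared for any
combination of the parts, which is where the complexity comes from.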

I'm very reluctant to go down either of these paths, because I think
that there is a LOT of potential complexity here to support what I
really think are likely to be "pathological" cases.  In this particular
example, nroff is an ancient and poorly-defined "rich text"
representation.  Unlike most such representations, it doesn't really
explicitly address character set issues, and so it is almost necessary
to think of nroff and character sets as independent.  However, this is
NOT the case for most modern text representations.  For example, when
Andrew (which is by no means state-of-the-art in its dealing with
multiple character sets) sends messages in non-standard character sets,
it uses a representation for them that keeps the overall message in
US ASCII.  In almost all other cases that I know of, rich
text formats explicitly deal with character set issues, as well they
should.  That is, there is a standard file format, for each of these
representations, that encodes character sets among other things.  This
is NOT the case for nroff, of course, for which charsets are
"out-of-band" information.

The question that remains, I think, is a simple one:  how many
troublesome cases like Vincent's example will there really be, and how
important are they?  Without much evidence to support it, I have a gut
feeling that the answer is "not too many and not too important."  If
this is the case, we can handle them much more simply with a small
proliferation of content-types, e.g.:

Content-Type: nroff/iso-8859-1; null; tbl, ms

In other words, if there aren't too many such cases, we can handle them
by defining different content-types to handle the charset-variations
within a type such as nroff.
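
As a minimal sketch of how a receiving agent might take such a header
value apart, consider the Python below.  It assumes the
semicolon-separated fields are a type name, a version (with "null"
meaning none), and a comma-separated resource list; that is a reading of
the example above, not something spelled out in it.  The function and
field names are hypothetical.

```python
from typing import List, NamedTuple, Optional


class ParsedType(NamedTuple):
    base: str                # e.g. "nroff"
    charset: Optional[str]   # e.g. "iso-8859-1"; None if no "/charset" suffix
    version: Optional[str]   # second field; "null" is treated as absent
    resources: List[str]     # third field, split on commas


def parse_content_type(value: str) -> ParsedType:
    """Split a value such as 'nroff/iso-8859-1; null; tbl, ms'."""
    fields = [f.strip() for f in value.split(";")]

    # The charset variant is folded into the type name after a slash.
    base, _, charset = fields[0].partition("/")

    version = fields[1] if len(fields) > 1 else None
    if version is not None and version.lower() == "null":
        version = None

    resources: List[str] = []
    if len(fields) > 2:
        resources = [r.strip() for r in fields[2].split(",") if r.strip()]

    return ParsedType(base.lower(), charset.lower() or None, version, resources)


parsed = parse_content_type("nroff/iso-8859-1; null; tbl, ms")
```

An agent that only knows plain nroff can still recognize the base type
and decide what to do about the charset variant it does not support.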

Anyway, the argument above is my gut response, which is strongly
motivated by a desire not to open new cans of worms at this stage.  I
fully realize that there are some cans that just have to be opened, and
I suppose I might yet be convinced that this is one of them, but I'm not
convinced just yet...  -- Nathaniel
