Re: How to handle a lot of character set Content-types

1991-05-03 00:05:26
I find it distressing that there's so much intense discussion of how  
to deal with zillions of these codesets simultaneously using all  
kinds of baroque header information and scheming when a wonderful  
answer is lying around waiting to be picked up.  Just map your codes  
into something, like Unicode

Below is a plan for getting to the future using multiple codesets.
Its basic characteristic is that things keep going as they are except
that we add informative headers and stop using illegal 8-bit transport.
We then have a basis for moving into a future which could be based
on a universal codeset. What is the Unicode plan? Unlike mine, it
couldn't go anywhere until a great many things had changed.

Bob Smart


Stage 1.

Promulgate rfc-xxxx and rfc-zzzz. The former needs tidying up.  Both
seem well thought out to me. For my stage 2 to work we need to have
Content-types for all the character sets in use today -- these
should perhaps be documented in the Assigned Numbers RFC.

Stage 2. 

Start changing UAs to add Content headers, and to do 7-bit encoding
of 8-bit stuff. Start changing MTAs to: (a) talk the new 8-bit when
appropriate; (b) add Content headers on behalf of dumb UAs; (c)
convert transport encoding to 7-bit where necessary and appropriate;
(d) convert 7-bit encodings to 8-bit when sending to nearby UAs which
the MTA knows can understand 8-bit but not Content headers.
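
As a rough illustration of MTA rules (b), (c) and (d), here is a
minimal sketch in Python. It is not taken from rfc-xxxx: the NextHop
record, the relay() function, the "Content-Transfer-Encoding" header
name, the "text/plain; charset=..." value and the choice of
quoted-printable as the 7-bit encoding are all assumptions made for
the example. Rule (a) is simply the fall-through case where nothing
needs converting.

```python
import quopri
from dataclasses import dataclass


@dataclass
class NextHop:
    """Hypothetical description of whoever receives the message next."""
    accepts_8bit: bool          # can take 8-bit transport
    understands_content: bool   # can interpret Content headers


def is_8bit(body: bytes) -> bool:
    """True if any byte in the body has its high bit set."""
    return any(b > 0x7F for b in body)


def relay(headers: dict, body: bytes, hop: NextHop,
          default_charset: str = "iso-8859-1"):
    """Return (headers, body) as they should be passed to the next hop."""
    headers = dict(headers)  # do not modify the caller's copy

    # (b) Label the message on behalf of a dumb UA (assumed header syntax).
    if "Content-Type" not in headers:
        headers["Content-Type"] = f"text/plain; charset={default_charset}"

    encoded = headers.get("Content-Transfer-Encoding") == "quoted-printable"

    if is_8bit(body) and not hop.accepts_8bit:
        # (c) The next hop is 7-bit only, so encode the body.
        body = quopri.encodestring(body)
        headers["Content-Transfer-Encoding"] = "quoted-printable"
    elif encoded and hop.accepts_8bit and not hop.understands_content:
        # (d) A nearby UA wants raw 8-bit and cannot decode for itself.
        body = quopri.decodestring(body)
        headers.pop("Content-Transfer-Encoding")

    # (a) Otherwise the body goes through unchanged.
    return headers, body
```

For example, relay({}, b"caf\xe9\n", NextHop(False, True)) gives back a
7-bit body plus the headers needed to undo the encoding, while the same
call with NextHop(True, True) leaves the body alone.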

These changes don't require any UA to change the way it handles the
body of a message. In fact everything at this point will still be
working in the same imperfect limp-along way as it does today. The
difference is that there will be no illegal 8-bit transport, and
messages will start to carry headers from which we can work out what
the body is meant to mean.
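
To see why the headers matter, here is a small Python sketch showing
that the same body bytes mean different things under different
character sets. The interpret() helper and the two charset names are
picked purely for illustration; the codec names are the ones Python
happens to use, not labels defined by rfc-xxxx.

```python
def interpret(body: bytes, charset: str) -> str:
    """Map raw body bytes to text using the charset named in a header."""
    return body.decode(charset)


raw = b"\xe9t\xe9"  # three bytes, two of them 8-bit

as_latin1 = interpret(raw, "iso-8859-1")  # French reading of the bytes
as_koi8 = interpret(raw, "koi8-r")        # Cyrillic reading of the bytes

# Without a header there is no way to tell which reading was intended.
print(as_latin1, as_koi8, as_latin1 == as_koi8)
```

The bytes on the wire are identical in both cases; only the label
tells the receiving UA which reading the sender meant.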

Stage 3.

Write sophisticated UAs that work with graphical interfaces to support
multiple character sets and other fancy stuff. 

Stage 4.

Having clarified the behaviour of SMTP with illegal verbs, the IETF can
promulgate optional add-ons to SMTP and rfc-822/xxxx in experimental
RFCs to introduce improved performance and functionality to
Internet mail. These can then be assessed, after operational
experience, for later promotion to proposed and recommended status. This
is where I see return-receipt coming in.

Stage 5.

Move to switch all UAs to a universal character set. I realise that
some people want to do this now, but that would be premature. Let's
give 10646 and Unicode a chance to fight it out first.

Obviously these stages would overlap a fair bit.