Dreaming about replacements (was IDN (was Did anyone tell Microsoft ye


moore(_at_)cs(_dot_)utk(_dot_)edu (Keith Moore)  wrote on 30.04.02 in 
<200204301731(_dot_)g3UHVHe18588(_at_)astro(_dot_)cs(_dot_)utk(_dot_)edu>:

> and there's so little benefit in "pure UTF-8" that it's conceivable
> that it would never be worth the trouble to transition to it.  for
> example, "pure UTF-8" wouldn't rid the user agent of the need to
> parse message header fields and to treat different parts of a
> structured field in different ways.


Hmm. So *if* the real solution is a completely new format, then that
format should address these things, too.

So what we would need - at least for headers - is some format that has a  
consistent scheme for structured information.

That immediately brings to mind XML. It is not universally loved, but it
conveniently has an existing solution to the character set problem, AND
many existing implementations.

I assume data would still have MIME types; the parameters would just be
encoded differently - I see no reason to invent a different data typing
scheme here.
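
To make that a bit more concrete, here is a rough sketch of what
XML-encoded headers carrying an ordinary MIME type and its parameters
might look like. The element and attribute names ("message", "subject",
"content-type", "param") are invented purely for illustration and are
not taken from any existing specification; the only real API used is
Python's standard xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET


def build_headers(subject, mime_type, params):
    """Serialize headers as one XML document (hypothetical schema)."""
    msg = ET.Element("message")
    ET.SubElement(msg, "subject").text = subject
    # The MIME type itself is unchanged; only its parameters are encoded
    # as child elements instead of ";name=value" text.
    ctype = ET.SubElement(msg, "content-type", {"type": mime_type})
    for name, value in params.items():
        ET.SubElement(ctype, "param", {"name": name, "value": value})
    # Serializing as UTF-8 lets XML carry non-ASCII header text directly.
    return ET.tostring(msg, encoding="utf-8")


def read_content_type(xml_bytes):
    """Return (mime_type, params) from headers built by build_headers."""
    root = ET.fromstring(xml_bytes)
    ctype = root.find("content-type")
    params = {p.get("name"): p.get("value") for p in ctype.findall("param")}
    return ctype.get("type"), params
```

A reader then gets the structured parts from a generic XML parser
rather than from per-field parsing rules, which is the point of the
consistent scheme.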

Hmm. But the right way to combine possibly binary blobs with these headers
isn't at all clear. We'd certainly like to avoid ASCII-encoding binary
data yet again - that's one of the things we'd like to get away from,
after all! So no gigantic CDATA sections.

Possibly the correct thing is to follow one XML blob (with everything that
in a MIME message is in headers, including internal ones in multiparts)
with a number of binary blobs, possibly with an HTTP/1.1-like minimal
length encoding ... hmm. Can't say I like that a lot ... but
MIME-header-like versions don't really seem to be any better either.
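
One possible reading of "XML blob followed by length-prefixed binary
blobs", as a minimal sketch only: every part, the XML header blob
included, is preceded by its byte length as a decimal ASCII line ending
in CRLF. That framing is an assumption made up for this example. It is
loosely inspired by HTTP/1.1 length handling but is not the actual
chunked encoding, and the function names are hypothetical.

```python
import io


def write_part(out, payload):
    """Write one part: decimal byte length, CRLF, then the raw bytes."""
    out.write(str(len(payload)).encode("ascii") + b"\r\n")
    out.write(payload)


def read_part(inp):
    """Read one part, or return None at end of input."""
    line = inp.readline()
    if not line:
        return None
    size = int(line.strip())
    payload = inp.read(size)
    if len(payload) != size:
        raise ValueError("truncated part")
    return payload


def pack(header_xml, blobs):
    """Header XML first, then each binary blob, all length-prefixed."""
    out = io.BytesIO()
    write_part(out, header_xml)
    for blob in blobs:
        write_part(out, blob)
    return out.getvalue()


def unpack(data):
    """Inverse of pack: return (header_xml, list_of_blobs)."""
    inp = io.BytesIO(data)
    parts = []
    while True:
        part = read_part(inp)
        if part is None:
            break
        parts.append(part)
    if not parts:
        raise ValueError("no header part found")
    return parts[0], parts[1:]
```

Because parts are read by length, the binary data never has to be
ASCII-encoded or escaped, even if it happens to contain CRLF or NUL
bytes.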

Comments?

MfG Kai