--On Friday, January 03, 2003 11:38 AM -0800
ned+ietf-822(_at_)mrochek(_dot_)com wrote:
[...]
Newsgroup names are an interesting case because they aren't something email
cares about. While an 8bit newsgroup name in some news-specific header
field may not survive a trip into and out of email, at least it doesn't
trash any of the email-specific parts of the message.
An argument to have this one thing be in 8bit would still be a very tough
sell, but far far easier than arguing that it is OK to have 8bit fields
that are shared between email and netnews.
Newsgroup names are a very hard problem. Note that the way Usenet works
right now requires that these be passed gracefully through e-mail: in
moderated newsgroups, posts are e-mailed to the moderator.
I'm not sure I understand the NEED for the "canonical" name to be pure
UTF-8 (that is, encoded Unicode) instead of some other encoding.
A similar problem happened in IMAP4 when people wanted to internationalize
mailbox names (which function similarly or identically to newsgroups,
depending on perspective). In IMAP4's case, we decided on a modified UTF-7
encoding over the wire. This means servers have an option of the underlying
canonical name: they can store the mUTF-7, they can store UTF-8, they can
store UTF-16, they can store in a legacy charset (and refuse all mailboxes
with names that can't be encoded in the legacy charset). IMAP4 mailboxes
can also appear in IMAP URIs, in which case they are encoded in UTF-8 and
then hex-escaped. The IMAP4 URI spec included C code to do this conversion.
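To make the two wire forms concrete, here is a rough Python sketch of both encodings. It is not the C code from the IMAP URI spec, just an illustration under my own assumptions: the function names are made up, only the encoding direction is shown, and "/" is left unescaped in the URI form on the assumption that it is acting as the hierarchy separator.

```python
import base64
from urllib.parse import quote


def _b64_run(run: str) -> str:
    # Modified BASE64 as used by IMAP4 mailbox names: the run is taken as
    # UTF-16BE, "=" padding is dropped, and "," replaces "/".
    raw = run.encode("utf-16-be")
    b64 = base64.b64encode(raw).decode("ascii")
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def encode_mutf7(name: str) -> str:
    """Encode a mailbox name in IMAP4 modified UTF-7 (hypothetical helper)."""
    out = []
    run = []  # pending characters that must be base64-encoded
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            # Printable US-ASCII: flush any pending run, then copy the
            # character through; "&" itself is written as "&-".
            if run:
                out.append(_b64_run("".join(run)))
                run = []
            out.append("&-" if ch == "&" else ch)
        else:
            run.append(ch)
    if run:
        out.append(_b64_run("".join(run)))
    return "".join(out)


def encode_for_imap_uri(name: str) -> str:
    """UTF-8 encode, then hex-escape (hypothetical helper).

    Assumption: "/" is kept literal as the hierarchy separator.
    """
    return quote(name, safe="/", encoding="utf-8")


if __name__ == "__main__":
    mailbox = "mail/台北"
    print(encode_mutf7(mailbox))         # form used over the IMAP4 wire
    print(encode_for_imap_uri(mailbox))  # form used inside an IMAP URI
```

The point is that the same underlying name has two different on-the-wire spellings, and neither one dictates what the server keeps as its canonical form.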
I don't think that introducing an all-new encoding is definitely the right
choice for Usenet. Of course, the draft currently does introduce a new
encoding, but it's only used some of the time.
Larry