ietf-822

Re: Unicode newsgroup name options

2003-02-21 15:42:06

On Fri, 21 Feb 2003, Russ Allbery wrote:
   We know that some existing software will work with
   UTF-8 newsgroup names out of the box without modification, although it
   will require some tweaking for ideal operation.

Step back from the tree to see the forest.  "Some tweaking" includes such
small details as not assuming that one byte means one screen space, which
has all sorts of implications in GUI programs and screen editors.  Don't
forget that you don't just have variable length characters in UTF-8, but
you also have variable width characters in Unicode, plus characters which
interact with other characters in interesting ways.

A bit more than a "tweak".
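
To make the byte/character/column distinction concrete, here is a rough
sketch in Python.  The group names are invented for illustration, and
display_width() is a hypothetical helper that only approximates what a
real terminal does: it treats combining marks as zero columns and East
Asian wide or fullwidth characters as two, and ignores everything else.

```python
import unicodedata

def display_width(text):
    """Approximate terminal columns; a simplification, not a full wcwidth."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue  # combining marks occupy no column of their own
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2  # wide and fullwidth characters take two columns
        else:
            width += 1
    return width

# Hypothetical group names: plain ASCII, precomposed e-acute,
# e + combining acute (U+0301), and CJK ideographs.
names = ["comp.lang.c", "fr.caf\u00e9", "fr.cafe\u0301", "\u65e5\u672c.\u8a71"]

for name in names:
    # UTF-8 byte count, code point count, approximate column count
    print(len(name.encode("utf-8")), len(name), display_width(name))
```

Only the first name gives three equal numbers.  The others give three
different ones, and any code that sizes a field or moves a cursor by byte
count gets them wrong.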

Stepping back further is the realization that it won't work in any
reasonable way if the program doing the actual display and user input
(whether terminal emulator or the news reading application itself) is not
configured for UTF-8.

   By comparison,
   punycode (C) we know won't work correctly with *any* existing software;
   the only reason why that column is a D instead of Y is that users can
   use the funny-looking encoded names and still participate in the
   groups.

Just as "some tweaking" was understated, "won't work" is overstated here.

Punycode names will work perfectly well as ASCII names.  If the program
doing the actual display and user input is not cognizant of punycode, it
will fall back on ASCII display/input that will work in the same way
around the world.
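
As an illustration of the fallback, the sketch below uses Python's
built-in "punycode" codec on a single name component.  The component is
invented, the helper names are hypothetical, and encoding each component
separately with no marker prefix is an assumption made for the example,
not something any newsgroup proposal is known to specify.

```python
def encode_component(component):
    """Encode one name component to its punycode (pure ASCII) form."""
    return component.encode("punycode").decode("ascii")

def decode_component(encoded):
    """Recover the Unicode component from its punycode form."""
    return encoded.encode("ascii").decode("punycode")

original = "b\u00fccher"             # hypothetical component with u-umlaut
encoded = encode_component(original)

print(encoded)                        # what punycode-unaware software sees
print(encoded.isascii())              # every character is plain ASCII
print(decode_component(encoded) == original)
```

Software that knows nothing about punycode stores, compares, and displays
the encoded string like any other ASCII name.  Only the decoding step for
presentation needs new code.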

The wildmat problem is a red herring.  Wildmat implementations need to be
cognizant of Unicode in far more substantial ways than merely overcoming
punycode issues.  A well-thought-out stringprep requirement will help
some, but then the stringprep has to be implemented.
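
A small sketch of what "cognizant of Unicode" means for pattern matching.
Python's fnmatch stands in for wildmat here (the two are similar but not
identical), NFC normalization stands in for whatever a stringprep profile
would actually require, and the group name is invented.

```python
import fnmatch
import unicodedata

name = "fr.caf\u00e9"      # hypothetical group name, precomposed e-acute
pattern = "fr.caf?"

# Character-level matching: "?" consumes the single character U+00E9.
char_match = fnmatch.fnmatchcase(name, pattern)

# Byte-level matching: U+00E9 is two bytes in UTF-8, so "?" falls short.
byte_match = fnmatch.fnmatchcase(name.encode("utf-8"),
                                 pattern.encode("utf-8"))

# Same visible text, decomposed (e + U+0301): no match until normalized.
decomposed = "fr.cafe\u0301"
raw_match = fnmatch.fnmatchcase(decomposed, name)
nfc_match = fnmatch.fnmatchcase(unicodedata.normalize("NFC", decomposed),
                                name)

print(char_match, byte_match, raw_match, nfc_match)
```

This prints "True False False True".  A byte-oriented matcher gets "?"
wrong on raw UTF-8, and even a character-oriented one needs names and
patterns normalized first.  Neither problem has anything to do with
punycode.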

-- Mark --

http://staff.washington.edu/mrc
Science does not emerge from voting, party politics, or public debate.