This discussion resembles many of the arguments from the pre-MIME days:
"8-bit clean MTAs are enough, fix that and we're done". I didn't agree.
Now, I can't resist a followup on the "MALAYALAM" example... In
http://www.unicode.org/charts/charindex.html {,2,3}
I looked for
cat charindex*.html | egrep '>2E|2E<'
Unless I'm mistaken, 0x2E == US ASCII '.'. Of course, the separator
itself is never transmitted in DNS transactions: it is coded
implicitly, in the length field that precedes each label.
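
To illustrate the length-field point, here is a minimal Python sketch of
how a dotted name is laid out on the wire. The function name encode_name
and the sample name www.example.org are chosen for illustration only;
compression pointers and everything else in a real DNS message are left
out.

```python
def encode_name(name: str) -> bytes:
    """Encode a dotted ASCII domain name as length-prefixed labels."""
    out = bytearray()
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"label length out of range: {label!r}")
        out.append(len(raw))  # one length octet per label
        out += raw            # then the label octets themselves
    out.append(0)             # zero-length root label ends the name
    return bytes(out)


wire = encode_name("www.example.org")
print(wire)           # b'\x03www\x07example\x03org\x00'
print(0x2E in wire)   # False: no '.' octet is transmitted
```

The dots of the textual form are gone; only the length octets 3, 7 and 3
mark where one label ends and the next begins.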
But how are "8-bit clean" applications supposed to handle domain
names where such characters (ones whose code contains a 0x2E octet)
occur mixed with "normal" US ASCII domain symbols and separators?
Think of
www."PEACE SYMBOL".org
www 0x2E 0x26 0x2E 0x2E org
(this discussion could have made use of some of that :-). So, there
is a need to encode anyway. My 10 öre: Let's stick with an encoding
that is "as little destructive as possible", i.e. 7 bit US ASCII, and
leave all the fancy improvements for the future.
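
The octet sequence in the www."PEACE SYMBOL".org example can be
reproduced with a short Python sketch. It assumes the symbol is carried
as its two raw big-endian octets (as in the example above) between
ordinary ASCII labels and dots, and that the application splits on the
0x2E octet; the helper name naive_split is hypothetical.

```python
PEACE = "\u262e"  # PEACE SYMBOL, U+262E


def naive_split(raw: bytes) -> list:
    """Split on the 0x2E octet, as a byte-oriented ASCII parser might."""
    return raw.split(b".")


# Assumed raw form: ASCII labels and separators, with the symbol
# inserted as its two big-endian octets 0x26 0x2E.
raw = b"www." + PEACE.encode("utf-16-be") + b".org"

print(raw.hex(" "))      # 77 77 77 2e 26 2e 2e 6f 72 67
print(naive_split(raw))  # [b'www', b'&', b'', b'org']
```

Under those assumptions the three intended labels come out as four
pieces, one of them empty, because the low octet of U+262E is
indistinguishable from the separator.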
Gunnar Lindberg
ALL AROUND-PROFILE 232E
BELOW, COMBINING BREVE 032E
BREVE BELOW, COMBINING 032E
CIRCLE, PLUS SIGN IN RIGHT HALF 2A2E
CJK Phonetics and Symbols Area 2E00
CJK Radicals Supplement 2E80
COMBINING BREVE BELOW 032E
CONTOUR INTEGRAL 222E
decimal point 002E
dot 002E
ESTIMATED SYMBOL 212E
FULL STOP 002E
HALF CIRCLE, PLUS SIGN IN RIGHT 2A2E
INTEGRAL, CONTOUR 222E
OVERRIDE, RIGHT-TO-LEFT 202E
PEACE SYMBOL 262E
period 002E
Phonetics and Symbols Area, CJK 2E00
PLUS SIGN IN RIGHT HALF CIRCLE 2A2E
PROFILE, ALL AROUND- 232E
Radicals Supplement, CJK 2E80
RIGHT-TO-LEFT OVERRIDE 202E
rlo 202E
Symbols Area, CJK Phonetics and 2E00