Robert Allerstorfer <roal(_at_)anet(_dot_)at> writes:
I agree that perl should accept all the IANA names.
As for the default names, _I_ decided to use the MIME name as the preferred
name when it existed - they seemed to be more "usable" (less embedded or at
least more systematic-looking punctuation, more familiar from e-mail
and HTTP headers etc.) We can revisit that if people think it would
help.
Yes, I also think that the MIME names, where they exist, are preferred. But,
continuing my example of 'shiftjis' used as default name by Encode,
this is not true.
Whoops - you are right, I had missed the _ removal. I think this is
a result of the historical fact that very early Encode was based
on Tcl's data (and to a lesser extent code) and Tcl uses "shiftjis"
or rather their file is ".../library/encoding/shiftjis.enc".
Tcl has/had two things which added "spin" to its names:
A. At least once upon a time it had to fit in an 8.3 DOS-oid filename space
B. Some of its encodings are targeted at X11 font encodings - hence
its 'jis0212' is a 16-bit fixed-length font-friendly one
which "we" call 'jis0212-raw'.
If you look at the entry of MIBenum 17 at
http://www.iana.org/assignments/character-sets
its preferred MIME name is 'Shift_JIS'. If there is a name marked as
'preferred MIME name' by IANA, this name is the recommended one. This
also meets the W3C guidelines. W3C also recommends using them in all
lowercase. Since they are case insensitive, I don't see any advantage
in not using them in all lowercase. The only allowed aliases for
shift_jis approved by IANA are 'MS_Kanji' and 'csShiftJIS', but not
'shiftjis'.
I concur. We should change the name _in_ our .ucm file, possibly
_of_ our .ucm file (though that is not really important to our scheme).
Another example where Perl meets IANA's convention as well as their
'preferred MIME name' is MIBenum 4 whose official name is
'ISO_8859-1:1987' but the preferred MIME name is the alias
'ISO-8859-1'. I would find it useful if Encode were revised to
know all names listed in the IANA list mentioned and default to their
preferred MIME names, all in lowercase. Maybe the unique ID number
("MIBenum") could also be taken into account.
I have no objection to that - and I doubt Dan will either.
Would you care to at least enumerate the cases we fail - or ideally
provide patch(es)?
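
To make the proposed naming rule concrete, here is a rough sketch of the
lookup it implies. It is written in Python purely for illustration and is
not Encode code. The table IANA_SAMPLE and the functions build_index and
resolve are hypothetical names, and the table holds only the two registry
entries and the names mentioned in this thread, not the full IANA list.

```python
# Hypothetical sample built from the two IANA entries discussed in this
# thread (MIBenum 17 and MIBenum 4). Only names quoted in the thread are
# listed; the real registry has many more entries and aliases.
IANA_SAMPLE = {
    17: {"name": "Shift_JIS", "mime": "Shift_JIS",
         "aliases": ["MS_Kanji", "csShiftJIS"]},
    4: {"name": "ISO_8859-1:1987", "mime": "ISO-8859-1",
        "aliases": []},
}


def build_index(registry):
    """Map every registered label, lowercased, to (MIBenum, default name)."""
    index = {}
    for mibenum, entry in registry.items():
        # Default to the preferred MIME name when one exists, otherwise
        # fall back to the official name; either way store it in lowercase.
        default = (entry["mime"] or entry["name"]).lower()
        for label in [entry["name"], entry["mime"], *entry["aliases"]]:
            if label:
                index[label.lower()] = (mibenum, default)
    return index


INDEX = build_index(IANA_SAMPLE)


def resolve(label):
    """Case-insensitive lookup; returns None for unregistered labels."""
    return INDEX.get(label.lower())


if __name__ == "__main__":
    for label in ["MS_Kanji", "csShiftJIS", "ISO_8859-1:1987", "shiftjis"]:
        print(label, "->", resolve(label))
```

Under this rule 'shiftjis' resolves to nothing, because it is not a
registered label. Whether such legacy names should stay accepted as extra
aliases is a separate decision that the sketch does not settle.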
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/