On Wed, Mar 20, 2002 at 05:50:42PM +0300, Anton Tagunov wrote:
> What we have is an ambiguous name. If we meant the 8-bit
> encoding (that has a MIME name) it should have been GB2312, not
> GB 2312.
This has been raised before, at:
http://archive.develooper.com/perl-unicode(_at_)perl(_dot_)org/msg00819.html
> But we have the 7-bit encoding. What name should it have so as not
> to be mistaken for the other one?
...in which I proposed to rename gb2312 to gb2312-raw to avoid the
ambiguity. The *other* one, the 8-bit MIME encoding, is euc-cn.
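
To make the raw/EUC distinction concrete, here is a minimal Python sketch (not Encode.pm code). It assumes the raw table uses 7-bit byte pairs in the range 0x21..0x7E and that EUC-CN is the same pair with the high bit set on both bytes. The function names are made up for illustration.

```python
def raw_to_euc_cn(raw: bytes) -> bytes:
    """Map 7-bit GB 2312 byte pairs (0x21..0x7E) to EUC-CN by setting the high bit."""
    if len(raw) % 2 != 0:
        raise ValueError("raw GB 2312 text must have an even number of bytes")
    if any(not 0x21 <= b <= 0x7E for b in raw):
        raise ValueError("byte outside the 7-bit GB 2312 range 0x21..0x7E")
    return bytes(b | 0x80 for b in raw)


def euc_cn_to_raw(euc: bytes) -> bytes:
    """Inverse mapping; only valid for the two-byte (non-ASCII) part of EUC-CN."""
    if len(euc) % 2 != 0 or any(not 0xA1 <= b <= 0xFE for b in euc):
        raise ValueError("expected only two-byte EUC-CN sequences")
    return bytes(b & 0x7F for b in euc)


raw = bytes([0x44, 0x63])    # one character in the 7-bit form
euc = raw_to_euc_cn(raw)     # same character in EUC-CN: b'\xc4\xe3'
# Python's "euc_cn" is an alias of its "gb2312" codec, i.e. the 8-bit form.
char = euc.decode("euc_cn")
```

The two forms carry the same character set; only the byte values differ, which is why one name cannot safely refer to both.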
And yes, this disagrees with iconv's and hc's conventions, in
which gb2312 is an alias for euc-cn, and the raw gb2312 is not
directly accessible:
    * EUC-CN = GB2312
      We implement this because it is the widely used representation
      of simplified Chinese.
Thus, the "=?GB2312?B?0LvQu8Tjo6E=?=" spam received by NI-S is
not encoded in perl's GB2312, but is "Thank you!" in EUC-CN.
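
The claim can be checked outside Perl. The sketch below uses Python's standard library, whose "gb2312" codec (like iconv's) is the 8-bit EUC-CN form; it is only an illustration, not part of Encode.pm.

```python
from email.header import decode_header

subject = "=?GB2312?B?0LvQu8Tjo6E=?="

# decode_header undoes the base64 layer and reports the declared charset.
payload, charset = decode_header(subject)[0]

# Every payload byte has the high bit set, so this cannot be the 7-bit table.
all_high = all(b >= 0xA1 for b in payload)

# Decoding as EUC-CN yields the Chinese text for "Thank you!".
text = payload.decode("euc_cn")
print(payload.hex(), charset, all_high, text)
```

The payload bytes are d0 bb d0 bb c4 e3 a3 a1, which decode as three Han characters followed by a fullwidth exclamation mark.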
Executive summary: Encode.pm isn't just for transport use, so we
have a namespace clash; neither interpretation of GB2312 is 'more
right' than the other. But as the main use of Encode.pm would be
(imho) in IO disciplines and "use encoding;", I'd suggest:
- Retain the file gb2312.enc.
- Alias /gb-?2312/i to 'euc-cn'.
- Make 'gb2312-raw' point to 'gb2312.enc'.
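
The lookup this proposal implies can be sketched as follows. This is a hypothetical Python model, not Encode.pm's actual resolver: the ENCODINGS and ALIASES tables and the resolve() function are invented for illustration. It assumes exact names are tried before pattern aliases, which matters because the unanchored pattern gb-?2312 would otherwise also match 'gb2312-raw'.

```python
import re

# Hypothetical registry: canonical name -> what backs it.
ENCODINGS = {
    "euc-cn": "8-bit EUC-CN",
    "gb2312-raw": "7-bit table loaded from gb2312.enc",
}

# Pattern aliases, mirroring /gb-?2312/i => 'euc-cn' (unanchored, case-insensitive).
ALIASES = [(re.compile(r"gb-?2312", re.IGNORECASE), "euc-cn")]


def resolve(name: str) -> str:
    """Return the canonical encoding name for a user-supplied name."""
    key = name.lower()
    if key in ENCODINGS:              # exact names win over aliases
        return key
    for pattern, target in ALIASES:
        if pattern.search(name):      # search(), like an unanchored Perl match
            return target
    raise LookupError(f"unknown encoding: {name!r}")


for candidate in ("GB2312", "gb-2312", "gb2312-raw", "EUC-CN"):
    print(candidate, "->", resolve(candidate))
```

Note that the spelling "GB 2312" (with a space) does not match the pattern as written, so it would remain unresolved under this sketch.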
Makes sense?
/Autrijus/