Hello, i18n/l10n people.
I've released String::Multibyte v1.03.
This module provides methods for manipulating
multibyte-encoded strings without relying on Perl's Unicode support,
so it also runs on older Perls such as 5.003 or 5.005.
This release adds support for the following Chinese and Korean encodings:
Big5, Big5Plus, EUC-TW, GB18030, GBK, Johab, and UHC.
(UTF-8, UTF-16BE, UTF-16LE, EUC-(CN|KR), EUC-JP,
and Shift-JIS were already supported.)
Here is an example:
#!perl
use String::Multibyte;
$gb18030 = String::Multibyte->new('GB18030');
$gb18030_len = $gb18030->length(
    "\xF7\xA1\x41\x98\x37\xA5\x37\x81\x40\x89\x32\xF9\x30");
print "$gb18030_len\n";
# you'll get 5.
$gb18030_sub = $gb18030->substr(
    "\xF7\xA1\x41\x98\x37\xA5\x37\x81\x40\x89\x32\xF9\x30",
    2, -1);
print $gb18030_sub;
# you'll get "\x98\x37\xA5\x37\x81\x40".
__END__
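To see why the sample string counts as 5 characters, here is a rough
sketch that splits a GB18030 byte string by hand. It is only an
illustration and is not taken from String::Multibyte's source: the
names $gb18030_char and gb18030_chars are made up for this example, the
pattern is a simplified assumption (one byte 0x00-0x7F, two bytes, or
four bytes; the bytes 0x80 and 0xFF are rejected), and it needs
Perl 5.005 or later because it uses qr//.

```perl
#!perl
use strict;

# Assumed, simplified pattern for one GB18030 character
# (illustration only; not the module's own definition).
my $gb18030_char = qr/
      [\x00-\x7F]                                    # one byte
    | [\x81-\xFE][\x40-\x7E\x80-\xFE]                # two bytes
    | [\x81-\xFE][\x30-\x39][\x81-\xFE][\x30-\x39]   # four bytes
/x;

# Hypothetical helper: split a byte string into GB18030 characters,
# dying if any part of the string does not fit the pattern.
sub gb18030_chars {
    my ($str) = @_;
    # \G anchors each match at the end of the previous one,
    # so matching stops at the first malformed byte.
    my @chars = $str =~ /\G($gb18030_char)/g;
    die "malformed GB18030 string\n" unless join('', @chars) eq $str;
    return @chars;
}

my $str = "\xF7\xA1\x41\x98\x37\xA5\x37\x81\x40\x89\x32\xF9\x30";
my @chars = gb18030_chars($str);

# Character count: F7A1 | 41 | 9837A537 | 8140 | 8932F930
print scalar(@chars), "\n";

# Same span as substr(..., 2, -1): start at the third character
# and leave off the last one.
my $sub = join '', @chars[2 .. $#chars - 1];
```

The second byte is what decides the width: a digit byte (0x30-0x39)
after a lead byte marks a four-byte character, which is why a plain
byte-wise length() or substr() would give the wrong answer here.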
It should appear on CPAN soon;
in the meantime it is available from my website:
tarball
http://homepage1.nifty.com/nomenclator/perl/String-Multibyte-1.03.tar.gz
HTML-ized POD (in UTF-8)
http://homepage1.nifty.com/nomenclator/perl/String-Multibyte.html
Regards,
SADAHIRO Tomoyuki SADAHIRO(_at_)cpan(_dot_)org