Hi Ken,
I am wondering if the simplest solution is to put in isascii() in
front of those tests in that function. We only really care about
those tests returning "true" for ASCII characters. Thoughts?
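
For illustration, a minimal sketch of that guard in C. The wrapper names
are hypothetical and not taken from nmh; the real function would
presumably just gain the isascii() test inline. It assumes a POSIX
system, since isascii() is an XSI extension rather than ISO C.

```c
#define _XOPEN_SOURCE 700   /* isascii() is POSIX/XSI, not ISO C */
#include <ctype.h>

/* Hypothetical helpers, not nmh code.  The locale-dependent ctype
 * test is only consulted when the value is in the ASCII range, so
 * bytes 0x80-0xff are never classified by the locale's tables. */
static int
ascii_isspace(int c)
{
    return isascii(c) && isspace(c);
}

static int
ascii_iscntrl(int c)
{
    return isascii(c) && iscntrl(c);
}
```

A side effect of the short-circuit is that a negative value from a
signed char never reaches isspace() or iscntrl(), because isascii() is
defined for every int and rejects it first.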
Just some tests, really, on Linux under different locales. One is
multibyte, the other two are single-byte, and one of those two uses
bytes that have the same values as the Unicode C1 control characters.
$ iconv -f utf-8 -t utf-8 <<<'From: ÁßÇÐË <a>' | LC_ALL=en_GB.utf8 uip/mhbuild -
From: =?UTF-8?B?w4HDn8OHw5DDiw==?= <a>
MIME-Version: 1.0
Content-Type: text/plain
$ iconv -f utf-8 -t utf-8 <<<ÁßÇÐË | hd
00000000 c3 81 c3 9f c3 87 c3 90 c3 8b 0a |...........|
0000000b
$ base64 -d <<<w4HDn8OHw5DDiw== | hd
00000000 c3 81 c3 9f c3 87 c3 90 c3 8b |..........|
0000000a
$
$ iconv -f utf-8 -t iso-8859-1 <<<'From: ÁßÇÐË <a>' | LC_ALL=en_GB.iso-8859-1 uip/mhbuild -
From: =?ISO-8859-1?B?wd/H0Ms=?= <a>
MIME-Version: 1.0
Content-Type: text/plain
$ iconv -f utf-8 -t iso-8859-1 <<<ÁßÇÐË | hd
00000000 c1 df c7 d0 cb 0a |......|
00000006
$ base64 -d <<<wd/H0Ms= | hd
00000000 c1 df c7 d0 cb |.....|
00000005
$
$ iconv -f utf-8 -t cp1252 <<<'From: †‡•œ <a>' | LC_ALL=en_GB.cp1252 uip/mhbuild -
From: =?WINDOWS-1252?B?hoeVnA==?= <a>
MIME-Version: 1.0
Content-Type: text/plain
$ iconv -f utf-8 -t cp1252 <<<†‡•œ | hd
00000000 86 87 95 9c 0a |.....|
00000005
$ base64 -d <<<hoeVnA== | hd
00000000 86 87 95 9c |....|
00000004
$
Attached is locale.c, which prints a lot of the locale's ctype.h values
in two forms. Its output looked ‘normal’ here.
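
As a rough idea of that kind of check, and explicitly not the attached
file, here is a sketch of a helper that summarises the ctype.h
classification of one byte under the current locale. The name
ctype_flags and the flag letters are made up for illustration.

```c
#include <ctype.h>

/* Hypothetical helper, not the attached locale.c.  Fills out[] with
 * one letter per ctype class that holds for byte c in the current
 * locale, or '-' where the class does not hold. */
static void
ctype_flags(unsigned char c, char out[8])
{
    out[0] = isalpha(c) ? 'a' : '-';
    out[1] = isdigit(c) ? 'd' : '-';
    out[2] = isspace(c) ? 's' : '-';
    out[3] = ispunct(c) ? 'p' : '-';
    out[4] = iscntrl(c) ? 'c' : '-';
    out[5] = isprint(c) ? 'P' : '-';
    out[6] = isupper(c) ? 'u' : '-';
    out[7] = '\0';
}
```

A caller would run setlocale(LC_ALL, "") first and then loop over 0 to
255, printing each byte's value next to its flags, so that the same
binary can be compared across LC_ALL settings.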
--
Cheers, Ralph.