On Sun, Dec 05, 2004 at 11:58:54AM +0900, Dan Kogai wrote:
Since Gisle's patch makes use of utf8n_to_uvuni(), it seems to be a
problem in the perl core. So I have checked utf8.c, which defines that function.
It does not appear to make use of PERL_UNICODE_MAX.
The patch against utf8.c fixes that.
But it breaks 2 core tests, t/op/tr.t and ext/Unicode/Normalize/t/illegal.t:
--- perl-5.8.x/utf8.c Wed Nov 17 23:11:04 2004
+++ perl-5.8.x.dan/utf8.c Sun Dec 5 11:38:52 2004
@@ -429,6 +429,13 @@
}
else
uv = UTF8_ACCUMULATE(uv, *s);
+ /* Checks if ord() > 0x10FFFF -- dankogai */
+ if (uv > PERL_UNICODE_MAX){
+ if (!(flags & UTF8_ALLOW_LONG)) {
+ warning = UTF8_WARN_LONG;
+ goto malformed;
+ }
+ }
if (!(uv > ouv)) {
/* These cannot be allowed. */
if (uv == ouv) {
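
For readers without utf8.c at hand, here is a rough standalone sketch of the
kind of check the hunk above adds: accumulate continuation bytes, and bail out
as soon as the running value passes 0x10FFFF. It is not the perl
implementation. The names sketch_decode and SKETCH_UNICODE_MAX are made up for
illustration, the constant is assumed to match PERL_UNICODE_MAX, and the
decoder only handles the classic 1-to-6-byte forms without any overlong
checking.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: same value as PERL_UNICODE_MAX; defined here only for the sketch. */
#define SKETCH_UNICODE_MAX 0x10FFFFUL

/* Hypothetical helper, not part of perl. Decodes one sequence starting at s,
 * with len bytes available. Returns 0 and stores the code point in *out on
 * success, -1 for a malformed sequence, -2 if the value exceeds
 * SKETCH_UNICODE_MAX. Overlong forms are not detected. */
static int sketch_decode(const unsigned char *s, size_t len, unsigned long *out)
{
    unsigned long uv;
    size_t need, i;

    assert(s != NULL && out != NULL);
    if (len == 0)
        return -1;

    /* The start byte determines how many continuation bytes follow. */
    if (s[0] < 0x80)                { uv = s[0];        need = 0; }
    else if ((s[0] & 0xE0) == 0xC0) { uv = s[0] & 0x1F; need = 1; }
    else if ((s[0] & 0xF0) == 0xE0) { uv = s[0] & 0x0F; need = 2; }
    else if ((s[0] & 0xF8) == 0xF0) { uv = s[0] & 0x07; need = 3; }
    else if ((s[0] & 0xFC) == 0xF8) { uv = s[0] & 0x03; need = 4; }
    else if ((s[0] & 0xFE) == 0xFC) { uv = s[0] & 0x01; need = 5; }
    else
        return -1;                  /* stray continuation byte, 0xFE or 0xFF */

    if (len < need + 1)
        return -1;                  /* sequence is truncated */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;              /* not a continuation byte */
        uv = (uv << 6) | (unsigned long)(s[i] & 0x3F);
        /* Range check inside the loop, at the same point as in the hunk above. */
        if (uv > SKETCH_UNICODE_MAX)
            return -2;
    }

    *out = uv;
    return 0;
}
```

With this sketch, the bytes F4 8F BF BF (U+10FFFF) still decode, while
F4 90 80 80 (0x110000) and the 6-byte form FD BF BF BF BF BF (0x7FFFFFFF) are
rejected. Any caller that deliberately passes values above 0x10FFFF through the
decoder would therefore see a change in behaviour.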
(this is UTF-8 mangled by an 8-bit terminal)
not ok 54 - translit w/complement
# Failed at t/op/tr.t line 229
Wide character in print at ./test.pl line 48.
# got 'ĬÃ
ÄĬÃ
Ä'
Wide character in print at ./test.pl line 48.
# expected 'ÄÃ
ÄÄÃ
Ä'
ok 55
ok 56 - translit w/deletion
ok 57
ok 58 - translit w/squeeze
ok 59
ok 60
ok 61
ok 62
ok 63 - UTF range
ok 64
ok 65
ok 66
ok 67
ok 68
not ok 69
# Failed at t/op/tr.t line 288
Wide character in print at ./test.pl line 48.
# got 'È'
# expected 'X'
not ok 70
# Failed at t/op/tr.t line 291
Wide character in print at ./test.pl line 48.
# got 'È'
# expected 'X'
and
not ok 91
# Failed test 91 in ext/Unicode/Normalize/t/illegal.t at line 53 fail #10
not ok 92
# Failed test 92 in ext/Unicode/Normalize/t/illegal.t at line 54 fail #10
not ok 93
# Failed test 93 in ext/Unicode/Normalize/t/illegal.t at line 55 fail #10
not ok 94
# Failed test 94 in ext/Unicode/Normalize/t/illegal.t at line 56 fail #10
ok 95
not ok 96
# Failed test 96 in ext/Unicode/Normalize/t/illegal.t at line 58 fail #10
not ok 97
# Failed test 97 in ext/Unicode/Normalize/t/illegal.t at line 59 fail #10
not ok 98
# Failed test 98 in ext/Unicode/Normalize/t/illegal.t at line 60 fail #10
not ok 99
# Failed test 99 in ext/Unicode/Normalize/t/illegal.t at line 61 fail #10
not ok 100
# Failed test 100 in ext/Unicode/Normalize/t/illegal.t at line 62 fail #10
not ok 101
# Failed test 101 in ext/Unicode/Normalize/t/illegal.t at line 53 fail #11
not ok 102
# Failed test 102 in ext/Unicode/Normalize/t/illegal.t at line 54 fail #11
not ok 103
# Failed test 103 in ext/Unicode/Normalize/t/illegal.t at line 55 fail #11
not ok 104
# Failed test 104 in ext/Unicode/Normalize/t/illegal.t at line 56 fail #11
ok 105
not ok 106
# Failed test 106 in ext/Unicode/Normalize/t/illegal.t at line 58 fail #11
not ok 107
# Failed test 107 in ext/Unicode/Normalize/t/illegal.t at line 59 fail #11
not ok 108
# Failed test 108 in ext/Unicode/Normalize/t/illegal.t at line 60 fail #11
not ok 109
# Failed test 109 in ext/Unicode/Normalize/t/illegal.t at line 61 fail #11
not ok 110
# Failed test 110 in ext/Unicode/Normalize/t/illegal.t at line 62 fail #11
ok 111
ok 112
I don't know what is at fault here: the tests or the patch.
Nicholas Clark