On Friday, April 19, 2002, at 05:01, Nick Ing-Simmons wrote:
I am not sure when the change went in, but current Encode.xs
has broken Tk804.
Ouch.
$encoding->decode($string,1)
now croaks if a character does not map. Croaking is fine as a default
for checking but Tk would like a value of check which does not croak,
but just returns leaving $string starting with the failing character.
I could do a G_EVAL but that is a lot of overhead, and does not tell me
which character position failed (unless $string is updated before
the croak.)
Yikes. I DID fix the behavior as documented. But it was not just
Encode::CN::HZ that was taking advantage of an UNDOCUMENTED feature after
all :).
(Tk does 10,000s of probes - found a character XXXX, have font
with encoding YYYY, can YYYY encode XXXX ? I hope to reduce that
number by refining the code but it will still do a lot)
With current Encode I don't get to try any interesting fonts
because it croaks when Tk asks iso-8859-1 if it can do the interesting
character :-(
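To make the requested behavior concrete, here is a minimal C sketch of a probe that stops quietly at the first unmappable character instead of aborting. It is not code from Encode.xs: the function names latin1_encode_prefix and latin1_can_encode are hypothetical, and it assumes only that ISO-8859-1 maps code points U+0000 through U+00FF straight to single bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of Encode.xs.
 * Encodes code points into ISO-8859-1 and stops quietly at the first
 * one that does not map. The return value is the number of code points
 * encoded, so cp[return value] is the failing character when the
 * result is less than n.
 */
size_t latin1_encode_prefix(const unsigned long *cp, size_t n,
                            unsigned char *out)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (cp[i] > 0xFF)   /* outside ISO-8859-1: stop, do not abort */
            break;
        out[i] = (unsigned char)cp[i];
    }
    return i;
}

/* The "can encoding YYYY encode character XXXX?" probe for one character. */
int latin1_can_encode(unsigned long cp)
{
    unsigned char byte;

    return latin1_encode_prefix(&cp, 1, &byte) == 1;
}
```

A caller gets both the yes/no answer and the failing position from the return value, with no exception machinery on the hot path.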
~!@#$%^&*()_+ (My feeling expressed in octet stream :)
Right now we have:
check == 0, fallback char (New and overdue - thanks!)
check == -1, perlqq \X{xxxx} style croak
Ah, it does not croak. It FALLS BACK that way.
otherwise \N{U+XXXX} style croak
(Did \N{U+XXXX} get (back) in ? - I seem to recall it got removed once.)
Didn't touch that part.
You have established the principle of check values meaning something
(which was always the plan).
Can I suggest though that we make it a bit mask - a stab at an initial
set of bits :
check == 0 - fallback
(check & 3) == 1 - croak
(check & 3) == 2 - warn
(check & 3) == 3 - silent return
(check & 4) - \x{xxxx} vs \N{U+XXXX}
If you like make $string adjustment optional
check & 8 - Update Don't bother to update $string.
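The bit layout proposed above could be decoded roughly as follows. This is only a sketch: the constant and function names are hypothetical (the thread gives numeric values only), and it does not assume which escape style bit 4 selects or which way bit 8 controls the $string adjustment.

```c
/* Hypothetical names for the proposed bit mask; not taken from Encode.xs. */
enum {
    CHECK_ACTION_MASK = 3, /* low two bits select the action             */
    CHECK_CROAK       = 1, /* (check & 3) == 1                           */
    CHECK_WARN        = 2, /* (check & 3) == 2                           */
    CHECK_SILENT      = 3, /* (check & 3) == 3, silent return            */
    CHECK_ESCAPE_BIT  = 4, /* chooses between \x{xxxx} and \N{U+XXXX}    */
    CHECK_STRING_BIT  = 8  /* governs the optional $string adjustment    */
};

/* check == 0 means "use the fallback character". */
int check_is_fallback(int check)
{
    return check == 0;
}

/* Returns CHECK_CROAK, CHECK_WARN or CHECK_SILENT. A result of 0 with
 * other bits set (for example check == 4) has no action defined in the
 * proposal. */
int check_action(int check)
{
    return check & CHECK_ACTION_MASK;
}

/* Non-zero when the escape-style bit is set. */
int check_escape_bit(int check)
{
    return (check & CHECK_ESCAPE_BIT) != 0;
}

/* Non-zero when the $string adjustment bit is set. */
int check_string_bit(int check)
{
    return (check & CHECK_STRING_BIT) != 0;
}
```

Keeping the action in the low two bits lets the flags be OR-ed in freely, for example CHECK_SILENT | CHECK_STRING_BIT, which is 11.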
Looks good to me. Maybe I should add constants for that. Maybe I would
modify which bits means what, however.
Thus
check == 0 - fallbacks
check == 1 - \N{U+XXXX} croak
check == 2 - \x{XXXX} croak
check == 3 - silent fail
check == 4 - Uninteresting
check == 5 - \N{U+XXXX} warn
check == 6 - \x{XXXX} warn
check == 11 - silent fail with $string updated (What Tk wants)
Better schemes welcome.
What good timing. I was about to release the next version. I'll take
a shower, implement them, and possibly add test suites for them before
the release.
Another alternative hinted at in old pods was passing check as an SV.
Then if SV was a scalar ref, then set $str to point at fail and return
reason code in the scalar.
This one is very attractive but too attractive when code freeze is
near. So let's go with bit masks for the time being.
PS:
To pick nits - Encode.xs's "layout" looks rather peculiar
with perl source's default tab setting of 8 and expected indent of 4,
and many of the files you have touched now have trailing whitespace
on ends of lines.
I've noticed that. Trailing spaces must be due to patches after patches
applied (that happens when you paste directly). That has already been
fixed in the upcoming version (I applied "indent-buffer" in Emacs :).
Dan the Encode Maintainer