ietf-822

Re: RFC 2047 and gatewaying

2003-01-10 16:15:11

Andrew Gierth wrote:
"Bruce" == Bruce Lilly <blilly(_at_)erols(_dot_)com> writes:


 Bruce> That would only work if it could be absolutely guaranteed that
 Bruce> no untagged data in any other character set might appear.  No
 Bruce> such guarantee is possible, and in fact Usenet abounds with
 Bruce> untagged charsets, of which, according to Andrew Gierth, "no
 Bruce> significant amount" is utf-8.

and as I also pointed out on USEFOR, the untagged utf-8 that _does_
appear can be distinguished from the other untagged 8-bit charsets by
means of a trivial heuristic with an extremely low error rate (no
false negatives, very few false positives).

Two observations:
1. if only an insignificant amount (of untagged utf-8) currently appears,
   extrapolating the observed error rate is quite risky; i.e. one might
   find much higher error rates if more untagged utf-8 were used.
2. non-zero error rates are probably acceptable for non-critical
   purposes (e.g. display), but are generally unacceptable for critical
   use (as in transmission via gateways to/from domains where strict
   transmission protocols are in effect).
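
For illustration only: the thread does not spell out the heuristic, so the
sketch below assumes it amounts to "contains 8-bit bytes and is well-formed
UTF-8". The function name looks_like_utf8 is hypothetical, not anything
proposed on USEFOR. It shows why such a test yields no false negatives (all
real UTF-8 is well-formed) yet can still yield false positives (some
byte sequences in other charsets happen to be well-formed UTF-8).

```python
def looks_like_utf8(data: bytes) -> bool:
    """Guess whether untagged 8-bit data is UTF-8 (assumed heuristic)."""
    if data.isascii():
        # Pure 7-bit data: nothing to classify, so make no UTF-8 claim.
        return False
    try:
        # Strict decoding rejects truncated, overlong and surrogate sequences.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


# Genuine UTF-8 is always accepted (no false negatives).
utf8_sample = "café".encode("utf-8")        # b'caf\xc3\xa9'
# Typical Latin-1 text is rejected: 0xE9 starts a sequence that never completes.
latin1_sample = "café".encode("latin-1")    # b'caf\xe9'
# A false positive: the Latin-1 text "Ã©" is byte-identical to UTF-8 "é".
ambiguous_sample = "Ã©".encode("latin-1")   # b'\xc3\xa9'

print(looks_like_utf8(utf8_sample))
print(looks_like_utf8(latin1_sample))
print(looks_like_utf8(ambiguous_sample))
```

The third sample is the kind of case observation 2 is concerned with: a
gateway acting on this guess would mislabel data that was never utf-8.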

