"Bruce" == Bruce Lilly <blilly(_at_)erols(_dot_)com> writes:
Bruce> That would only work if it could be absolutely guaranteed that
Bruce> no untagged data in any other character set might appear. No
Bruce> such guarantee is possible, and in fact Usenet abounds with
Bruce> untagged charsets, of which, according to Andrew Gierth, "no
Bruce> significant amount" is utf-8.
and as I also pointed out on USEFOR, the untagged utf-8 that _does_
appear can be distinguished from the other untagged 8-bit charsets by
means of a trivial heuristic with an extremely low error rate (no
false negatives, very few false positives).
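
The heuristic itself is not spelled out in this message. One plausible
form, assumed here purely for illustration, is "the data contains at
least one 8-bit octet and the whole of it is well-formed utf-8". The
function name looks_like_utf8 is hypothetical, and this is a sketch
under that assumption, not necessarily the test described on USEFOR:

```python
def looks_like_utf8(data: bytes) -> bool:
    """Guess whether untagged 8-bit data is utf-8.

    Assumed heuristic (not taken from the original message): the data
    must contain at least one octet >= 0x80 and must decode as strictly
    well-formed utf-8.
    """
    # Pure 7-bit data gives no evidence either way, so it is not
    # classified as untagged utf-8 here.
    if all(octet < 0x80 for octet in data):
        return False
    try:
        # Python's strict decoder rejects stray continuation octets,
        # truncated sequences, overlong forms and encoded surrogates.
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True
```

Under this assumption, real utf-8 always passes, so there are no false
negatives. A false positive needs legacy 8-bit text whose high octets
all happen to form valid multi-octet sequences, e.g. the two Latin-1
octets C3 A9, which is rare in natural-language text.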
--
Andrew.