Bruce Lilly writes:
> If you think carefully and objectively about why untagged 8-bit content
> in charsets other than utf-8 is bad, you will realize why untagged utf-8
> is also bad.
Untagged UTF-8 is good. It is more and more widely supported by readers,
and will eventually be the only format produced by writers. It works.
In contrast, a mess of multiple character encodings---even with tags---
is bad because of its unnecessary complexity. Throw away tags and it's
a disaster: you can't decode it.
---D. J. Bernstein, Associate Professor, Department of Mathematics,
Statistics, and Computer Science, University of Illinois at Chicago
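
The decoding problem described above can be sketched in a few lines of
Python. This is only an illustration, not part of the original message:
the sample byte and the three candidate charsets are arbitrary choices,
and is_valid_utf8 is a hypothetical helper name. It shows one untagged
8-bit byte decoding to a different character under each legacy charset,
while the same byte is rejected outright as UTF-8.

```python
# Assumption: a single untagged 8-bit byte, chosen arbitrarily for illustration.
raw = b"\xe9"

# Three legacy single-byte charsets (arbitrary picks). Each one accepts the
# byte, and each one maps it to a different character.
candidates = ["latin-1", "koi8-r", "iso-8859-7"]
decoded = {name: raw.decode(name) for name in candidates}

for name, text in decoded.items():
    print(f"{name}: U+{ord(text):04X}")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if data is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The lone byte is not well-formed UTF-8, but the real UTF-8 encoding of
# U+00E9 is.
print(is_valid_utf8(raw))
print(is_valid_utf8("\u00e9".encode("utf-8")))
```

Passing the UTF-8 check is strong evidence rather than proof: the legacy
charsets used here would also accept any well-formed UTF-8 byte sequence,
only as different text.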