Russ Allbery wrote:
It's best to think of all 8-bit character data as encoded. UTF-8 is
just as much an encoding as RFC 2047.
[...]
* Assuming that the world agrees on Unicode (there don't seem to be any
other viable options for a universal character set used all over the
world), there are at least three major encodings (UTF-8, UTF-16, and
UTF-32) that can be used with it, as well as a bunch of other minor
ones.
It's worse than that; there are at least three different versions of UTF-8.
They differ in the longer multi-byte sequences.
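
To make the point concrete, here is a rough Python sketch. The post does not say which versions are meant, so the choice here is an assumption: it compares standard UTF-8 with a CESU-8-style variant, which is one known case where the longer sequences differ. (Another is the older RFC 2279 definition, which allowed sequences of up to six bytes where RFC 3629 stops at four.) The helper name cesu8_style and the sample character are made up for illustration.

```python
# Illustration only: one code point outside the Basic Multilingual Plane,
# serialized in several ways.
ch = "\U0001F600"  # U+1F600, needs more than 16 bits

utf8 = ch.encode("utf-8")       # standard UTF-8: one 4-byte sequence
utf16 = ch.encode("utf-16-be")  # UTF-16: a surrogate pair, 2 x 2 bytes
utf32 = ch.encode("utf-32-be")  # UTF-32: a single 4-byte unit


def cesu8_style(text: str) -> bytes:
    """Encode each UTF-16 code unit separately as a UTF-8-style sequence.

    Hypothetical helper mimicking CESU-8: surrogates are encoded
    individually (3 bytes each) instead of the code point being
    encoded as one 4-byte sequence.
    """
    raw = text.encode("utf-16-be")
    units = [int.from_bytes(raw[i:i + 2], "big") for i in range(0, len(raw), 2)]
    # "surrogatepass" lets the UTF-8 codec emit lone surrogates.
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


variant = cesu8_style(ch)

for name, data in [("UTF-8", utf8), ("UTF-16BE", utf16),
                   ("UTF-32BE", utf32), ("CESU-8 style", variant)]:
    print(f"{name:13} {data.hex(' ')}")
```

For characters in the Basic Multilingual Plane the two UTF-8 forms produce identical bytes; they only diverge for code points above U+FFFF, which is why the difference shows up in the longer sequences.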