P.S. Does utf8 support surrogates? Surrogate pair is definitely the
No. Surrogates are solely for UTF-16. There's no need for surrogates
in UTF-8 -- if we wanted to encode U+D800 using UTF-8, we *could* --
BUT we should not. Encoding U+D800 as UTF-8 should not be attempted,
the whole surrogate space is a discontinuity in the Unicode code point
space reserved for the evils of UTF-16.
--
$jhi++; # http://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen