Nicholas Clark <nick(_at_)ccl4(_dot_)org> writes:
On Mon, Mar 18, 2002 at 08:01:55PM +0900, Dan Kogai wrote:
That reminds me of this question. What is the (de jure|de facto)
standard for a fallback character? Is it up to each module?
FYI, my humble Jcode uses a "blank square" (aka Tofu) and MacOS X uses a
single '?'.
links uses '*', which I find easier to read than '?'
(for all those *****y MSHTML pages that allege ISO-8859-1 and then use
MS sexed quotes, where the server should be reporting the page as a Windows charset)
I believe that the fallback should be configurable on a per-something
basis (per charset?) which then leaves us debating what the default should
be.
The existing encoding mechanisms provide a fallback character on a per-encoding
basis for the Unicode->xxxx direction. It is usually '?' for ASCII-oids.
What is not yet clear is how the API should enable that vs stopping vs ...
I agree that xxxx->Unicode should use U+FFFD - which is what it is for.
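
To make the three behaviours discussed above concrete (a default '?' fallback, a configurable fallback such as '*', and U+FFFD when decoding), here is a minimal sketch. It uses Python's codec error handlers purely as an illustration and is not the Perl Encode API under discussion; the names make_fallback_handler and "fallback_star" are hypothetical, invented for this example.

```python
import codecs


def make_fallback_handler(fallback):
    """Build an encode error handler that substitutes `fallback`
    for every character the target encoding cannot represent.
    (Hypothetical helper, named only for this sketch.)"""
    def handler(err):
        if not isinstance(err, UnicodeEncodeError):
            raise err
        # The codec may report a run of several bad characters at once,
        # so emit one fallback per character and resume after the run.
        return (fallback * (err.end - err.start), err.end)
    return handler


# Register a '*' fallback, in the spirit of what links does.
codecs.register_error("fallback_star", make_fallback_handler("*"))

text = "caf\u00e9 \u201cquoted\u201d"

# Unicode -> ASCII with the built-in fallback, which is '?'.
default_out = text.encode("ascii", errors="replace")

# Unicode -> ASCII with the configurable fallback registered above.
star_out = text.encode("ascii", errors="fallback_star")

# Bytes -> Unicode: malformed input becomes U+FFFD.
decoded = b"caf\xe9".decode("utf-8", errors="replace")

# The "stopping" alternative: strict mode raises instead of substituting.
try:
    text.encode("ascii", errors="strict")
    stopped = False
except UnicodeEncodeError:
    stopped = True
```

The point of the sketch is only that "substitute vs stop" and "which character to substitute" are independent choices, which is roughly the API question raised above.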
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/