----- Original Message -----
From: "Masataka Ohta"
To: "Tom.Petch" <sisyphus(_at_)dial(_dot_)pipex(_dot_)com>
Cc: "Ned Freed" <ned(_dot_)freed(_at_)mrochek(_dot_)com>; "ietf"
Sent: Wednesday, December 28, 2005 2:05 PM
Subject: Re: Troubles with UTF-8
The Unicode data I am thinking of may have come from an upper layer protocol
and needs to be passed transparently (as with an error or hello message,
even); it may or may not already be NUL-terminated (ever had that security
foul-up where some userid/password are entered/stored NUL-terminated and
some are not?) - hence I see the need to terminate the string in some other way,
to escape or in some other way transfer encode (parts of) the string. I looked
at existing RFCs, found many different approaches, all viable but none that
really said to me 'this is good engineering, this is best practice'. Hence,
floating the issue to see if there were any better ones out there. I think
which is of itself worth knowing.
You can do nothing.
The problem is that Unicode is stateful, with complex and
indefinitely long-term states, which is a lot worse than
properly profiled ISO 2022 such as that of RFC 1468, which
is the character encoding most widely used for Japanese.
Unicode is not even finite state, which means some pattern
matching and normalization problems are hard or unsolvable.
OTOH, if you start from scratch, you can have an encoding with
much shorter-term, finite states.
You've lost me here. I don't understand the use of state in the context of
Unicode (perhaps because I have just been reading about stateless server-side TLS
resumption:-). I suspect that in the context of transporting
param = value
with value being supplied by an upper layer protocol in UTF-8 then it is not a
concern. There is always base64 or quoted-printable:-(
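
To make the base64 / quoted-printable fallback concrete, here is a minimal sketch in Python. It assumes a hypothetical one-line "param = value" wire format; the names encode_param and decode_param are illustrative only and are not taken from any RFC. The value is opaque UTF-8 from the upper layer, possibly with an embedded NUL, and the transfer encoding keeps it intact without relying on any terminator.

```python
import base64
import quopri


def encode_param(name, value):
    """Build 'name = <base64>' from a Unicode value (hypothetical format)."""
    raw = value.encode("utf-8")            # bytes handed down by the upper layer
    return name + " = " + base64.b64encode(raw).decode("ascii")


def decode_param(line):
    """Split on the first ' = ' and recover the original Unicode value."""
    name, _, encoded = line.partition(" = ")
    return name, base64.b64decode(encoded).decode("utf-8")


# A value with a non-ASCII character and an embedded NUL.
value = "na\u00efve\x00tail"

line = encode_param("param", value)
name, decoded = decode_param(line)

# Quoted-printable alternative: still 7-bit safe, and mostly readable
# when the value is largely ASCII.
qp = quopri.encodestring(value.encode("utf-8"))
qp_decoded = quopri.decodestring(qp).decode("utf-8")
```

Either way the receiver never has to guess whether the value was NUL-terminated; the cost is size overhead (about a third for base64) and loss of readability on the wire.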