ietf

ABNF Re: Troubles with UTF-8

2005-12-24 12:31:43
Dave

Is this an OK use of RFC 4234? Reading it, I am not clear whether U+FEFF
should be specified as %xFE %xFF or whether %xFEFF is OK. And what is the
ABNF for any possible ISO 10646 character, all 97000 of them?

Tom Petch

----- Original Message -----
From: "Ned Freed" <ned(_dot_)freed(_at_)mrochek(_dot_)com>
To: "TomPetch" <sisyphus(_at_)dial(_dot_)pipex(_dot_)com>
Cc: "ietf" <ietf(_at_)ietf(_dot_)org>
Sent: Friday, December 23, 2005 7:13 PM
Subject: Re: Troubles with UTF-8
<snip>

B) Code point. Many standards are defined in ABNF [RFC4234], which allows
code points to be specified as, e.g., %b00001101, %d13 or %x0D, none of which
is terribly Unicode-like (U+000D). The result is standards that use one
notation in the ABNF and a different one in the body of the document; should
ABNF allow something closer to Unicode (as XML has done with &#x000D;)?
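
To make the notation point concrete, here is a minimal sketch showing that the binary, decimal and hexadecimal forms of an ABNF terminal value all name the same non-negative integer, which can then be printed in U+ style. The helper names `abnf_num_val` and `u_plus` are hypothetical, and the parser only handles a single value (no `.` concatenation or `-` ranges).

```python
import re

# Radix for each ABNF base indicator (RFC 4234 terminal values).
_BASES = {"b": 2, "d": 10, "x": 16}

def abnf_num_val(terminal: str) -> int:
    """Parse one single-value ABNF terminal such as %x0D (hypothetical helper)."""
    m = re.fullmatch(r"%([bdxBDX])([0-9A-Fa-f]+)", terminal)
    if m is None:
        raise ValueError(f"not a single-value ABNF terminal: {terminal!r}")
    base_char, digits = m.groups()
    # int() raises ValueError if a digit is not valid in the chosen radix.
    return int(digits, _BASES[base_char.lower()])

def u_plus(value: int) -> str:
    """Format an integer in the U+XXXX style used in Unicode prose."""
    return f"U+{value:04X}"

for form in ("%b00001101", "%d13", "%x0D"):
    print(form, "->", abnf_num_val(form), u_plus(abnf_num_val(form)))
```

All three forms parse to 13, i.e. U+000D, so the difference between the grammar and the body text is one of spelling only.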

ABNF is charset-independent: it maps onto non-negative integers, not
characters. Nothing prevents a specification from saying that a given ABNF
grammar specifies a series of Unicode characters represented in UTF-8, and
then using %xFEFF or whatever in the grammar itself.
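
As a rough illustration of that reading, the sketch below treats each ABNF value as a Unicode code point and only then applies UTF-8. Under that assumption %xFEFF is one character, while %xFE %xFF is two different characters. The function names are hypothetical, and the range check corresponds to an assumed rule along the lines of `%x0-D7FF / %xE000-10FFFF`, which is not taken from any RFC text quoted here.

```python
def utf8_octets(code_points):
    """Interpret integers as Unicode code points and encode them as UTF-8."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")

def is_unicode_scalar(cp: int) -> bool:
    """True for code points outside the surrogate block (assumed rule)."""
    return 0x0 <= cp <= 0xD7FF or 0xE000 <= cp <= 0x10FFFF

single = utf8_octets([0xFEFF])    # one value: the character U+FEFF
pair = utf8_octets([0xFE, 0xFF])  # two values: U+00FE followed by U+00FF

print(single.hex(" "))  # ef bb bf
print(pair.hex(" "))    # c3 be c3 bf

# Read as raw octets instead, FE FF is not valid UTF-8 at all.
try:
    bytes([0xFE, 0xFF]).decode("utf-8")
except UnicodeDecodeError:
    print("FE FF is not a valid UTF-8 sequence")
```

So whether %xFEFF or %xFE %xFF is right depends on what the specification says the integers stand for: code points in the first reading, individual octets in the second.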

<snip>


