ABNF Re: Troubles with UTF-8

Dave

Is this an OK use of RFC 4234?  Reading it, I am not clear whether U+FEFF should
be specified as %xFE %xFF or whether %xFEFF is OK.  And what is the ABNF for any
possible ISO 10646 character, all 97000 of them?

Tom Petch

----- Original Message -----
From: "Ned Freed" <ned(_dot_)freed(_at_)mrochek(_dot_)com>
To: "TomPetch" <sisyphus(_at_)dial(_dot_)pipex(_dot_)com>
Cc: "ietf" <ietf(_at_)ietf(_dot_)org>
Sent: Friday, December 23, 2005 7:13 PM
Subject: Re: Troubles with UTF-8
<snip>

B) Code point. Many standards are defined in ABNF [RFC4234], which allows code
points to be specified as, e.g., %b00001101, %d13 or %x0D, none of which are
terribly Unicode-like (U+000D).  The result is standards that use one notation
in the ABNF and a different one in the body of the document; should ABNF allow
something closer to Unicode (as XML has done with &#x000D;)?


ABNF is charset-independent, mapping onto non-negative integers, not
characters. Nothing prevents a specification from saying that a given ABNF
grammar specifies a series of Unicode characters represented in UTF-8 and using
%xFEFF or whatever in the grammar itself.
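
To make the "integers, not characters" point concrete, here is a rough Python
sketch. It is only an illustration: abnf_num_val is a hypothetical helper (not
part of any ABNF library), and it handles only single values and dotted
concatenations, not ranges such as %x30-39. It shows that %xFEFF is one
terminal value (0xFEFF) while %xFE %xFF is two (0xFE, then 0xFF), and that a
grammar whose terminals are taken to be UTF-8 octets would need %xEF.BB.BF to
match U+FEFF. Which reading applies is for the specification to state.

```python
import re
from typing import List


def abnf_num_val(token: str) -> List[int]:
    """Turn a simple ABNF num-val (%b..., %d..., %x...) into its integers.

    Hypothetical helper for illustration only. Dotted forms such as
    %x0D.0A yield several integers; value ranges are not supported.
    """
    match = re.fullmatch(r"%([bdx])([0-9A-Fa-f.]+)", token)
    if match is None:
        raise ValueError(f"not a simple ABNF num-val: {token!r}")
    base = {"b": 2, "d": 10, "x": 16}[match.group(1)]
    # int() raises ValueError if a part has digits invalid for the base.
    return [int(part, base) for part in match.group(2).split(".")]


# Three spellings of the same integer, 13, i.e. U+000D if read as a code point.
print([abnf_num_val(t) for t in ("%b00001101", "%d13", "%x0D")])

# One terminal value versus two terminal values.
print(abnf_num_val("%xFEFF"))
print(abnf_num_val("%xFE") + abnf_num_val("%xFF"))

# The UTF-8 octets of U+FEFF, for grammars defined over octets.
print([hex(octet) for octet in "\ufeff".encode("utf-8")])
```

Under the code-point reading, %xFEFF matches U+FEFF directly; %xFE %xFF would
instead describe two separate values, which is how U+FEFF happens to look as
big-endian UTF-16 octets.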

<snip>


