To put the special character issue more succinctly:
    id-spec := local-part "@" addr-spec
               ; Globally unique
               ; Characters in [RFC1738] unreserved
               ; must be escaped with [RFC1738] escape
That says it all! The words trying to explain the "restriction" can
be eliminated. Later on, the draft would say:
    To transform a cidurl or midurl into a valid content-id or
    message-id, replace the [RFC1738] escape sequences with the
    actual characters and surround the resultant string with the
    enclosing brackets, i.e.,

        content-id := "<" raw-spec ">"
        message-id := "<" raw-spec ">"
        raw-spec   := id-spec
                      ; escape sequences replaced with
                      ; actual characters
Seems pretty straightforward to me. Systems are free to %hex-encode
whatever they want.
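
As a rough illustration of the transformation described above, here is a
minimal Python sketch. It assumes the URL is written as "cid:" or "mid:"
followed directly by the id-spec; that prefix syntax and the function
name url_to_id are illustrative assumptions, not text from the draft.

```python
from urllib.parse import unquote


def url_to_id(url: str) -> str:
    """Turn a cid: or mid: URL into a bracketed content-id/message-id.

    Hypothetical helper: the "scheme:id-spec" layout is assumed here.
    """
    scheme, sep, id_spec = url.partition(":")
    if not sep or scheme.lower() not in ("cid", "mid"):
        raise ValueError(f"not a cid or mid URL: {url!r}")
    # Replace each %xx escape with the actual character,
    # then surround the result with the enclosing brackets.
    return "<" + unquote(id_spec) + ">"
```

Because the decoder accepts any %xx escape, it does not matter which
characters the sending system chose to encode.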
Thanks.../Ed
On Thu, 02 Nov 1995 21:29:21 PST Larry Masinter wrote:
where id-spec is a restricted form of "addr-spec" as defined
in [RFC822] and hostname and uchar are defined in [RFC1738,
sec 3.1]. The purpose of the restriction on addr-spec is to
eliminate special characters from the cid URL. Such
characters, if required, can be encoded using the [RFC1738]
%xx hex encoding escape mechanism included in uchar.
This still seems awkward. The problem is that the RFC 822 productions
allow unsafe characters in all parts of the IDs, and have special
quoting and bracketing rules. You really have two choices: one is to
include the full production for id-spec in parallel with the
production of addr-spec in RFC 822, but then allow (%hex)-encoding of
the individual terminal elements, or else just to define
id-spec to be (%hex)-encoded addr-spec, and leave out the
id-spec = local-part "@" hostname
production completely. Why NOT allow "@" to be URL-encoded, too?
These things aren't really going to be parsed.
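
A minimal Python sketch of the second choice, treating the id-spec as
nothing more than a (%hex)-encoded addr-spec with "@" encoded too. The
function name id_to_url and the "cid:" prefix are assumptions made for
illustration. Note that urllib's quote() escapes more characters than
[RFC1738] strictly requires and leaves "~" alone, so this is only one
possible encoder.

```python
from urllib.parse import quote, unquote


def id_to_url(msg_id: str, scheme: str = "cid") -> str:
    """Encode a content-id/message-id as a URL (hypothetical helper)."""
    addr_spec = msg_id.strip()
    # Drop the enclosing angle brackets if they are present.
    if addr_spec.startswith("<") and addr_spec.endswith(">"):
        addr_spec = addr_spec[1:-1]
    # safe="" escapes everything except letters, digits and "_.-~",
    # so "@", quotes and spaces all become %xx sequences.
    return f"{scheme}:{quote(addr_spec, safe='')}"


if __name__ == "__main__":
    url = id_to_url('<"a b"@example.com>')
    print(url)
    # Decoding the part after the scheme recovers the addr-spec.
    print(unquote(url.split(":", 1)[1]))
```

Nothing in the encoded form needs to be parsed; decoding the whole
string at once recovers the original addr-spec.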