spf-discuss

[spf-discuss] Re: Error in RFC4408: URL encoding

2006-10-01 06:24:16
wayne wrote:
 
it looks like we should never have mentioned "uric" at all.
Is "unreserved" the correct set to use?

| 3986 UNRESERVED3: ALNUM           - . _ ~

With that we'd percent-encode everything minus LDH and "._~".
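To make that rule concrete, here is a rough Python sketch. The name encode_unreserved_only is made up for illustration, and treating the input as UTF-8 octets is my assumption, not something RFC 4408 says:

```python
import string

# RFC 3986 "unreserved": ALPHA / DIGIT / "-" / "." / "_" / "~"
UNRESERVED = frozenset(string.ascii_letters + string.digits + "-._~")


def encode_unreserved_only(text: str) -> str:
    """Percent-encode every octet that is not in the unreserved set.

    Assumption: the text is encoded as UTF-8 before escaping.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in UNRESERVED:
            out.append(ch)                      # letters, digits, - . _ ~
        else:
            out.append("%{:02X}".format(byte))  # everything else
    return "".join(out)


print(encode_unreserved_only("user@example.org"))  # user%40example.org
```

This is the strictest choice: even characters like "/" and "@" that are harmless in many URL parts get escaped.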

Characters that must be percent-encoded are CTL and SP.  

And I thought that  " < > \ ^ ` { | } also "must" be encoded.
But RFC 4622 happily allows almost everything unencoded, the
"must" depends on the scheme, and the part of the URI.

For <uric> I found letters plus digits plus...
2396 URIC : !   $ % & ' ( ) * + , - . / : ; = ? @     _ ~
3986 URIC3: ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~
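The difference between the two rows can be checked mechanically. A small sketch, with the two sets typed in from the rows above (letters and digits left out, as in the table):

```python
# Punctuation in <uric> per RFC 2396, and the union of the sets in RFC 3986
# (the "URIC3" row above). Letters and digits are omitted on both sides.
URIC_2396 = set("!$%&'()*+,-./:;=?@_~")
URIC_3986 = set("!#$%&'()*+,-./:;=?@[]_~")

added = sorted(URIC_3986 - URIC_2396)
removed = sorted(URIC_2396 - URIC_3986)

print(added)    # ['#', '[', ']']
print(removed)  # []
```

So the 3986 row is a strict superset of the 2396 row, and the only additions are "#", "[" and "]".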

#-fragments are now (3986) considered as part of the URL.  But
any # before the fragment has to be encoded.  Unless you parse
the URL (depending on the scheme, some schemes have no concept
of fragment) you can't know what to do with a #.

The other difference between 2396 and 3986 is [ and ], which are
for IPv6 literals in the "authority" part (host, port, etc.) of
a URL.  Unless you parse the URL (see above, some schemes have
no concept of authority, e.g. im:, pres:, mailto:) you can't
know what to do with [ and ] either.
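As an illustration of why parsing matters, here is what Python's generic splitter reports for a URL with an IPv6 literal and a fragment, compared with a mailto: URL. The example URLs are invented, and urlsplit only applies the generic syntax, it knows nothing about what a given scheme allows:

```python
from urllib.parse import urlsplit

# A URL with an IPv6 literal in the authority part and a fragment.
http = urlsplit("http://[2001:db8::1]:8080/path#frag")
print(http.hostname)  # 2001:db8::1
print(http.port)      # 8080
print(http.fragment)  # frag

# A scheme with no authority part: there is no place for [ ] to be legal.
mail = urlsplit("mailto:user@example.org")
print(repr(mail.netloc))  # ''
print(mail.path)          # user@example.org
```

Only after a split like this can you tell whether a "#", "[" or "]" is a delimiter or data that needs to be encoded.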

<uric> is at least mentioned in appendix D.2 of 3986, that it's
wrong for our purposes isn't our fault: 

3986 URIC_D2:     $ % &         + , - . / : ; = ? @     _ ~
3986 URIC3  : ! # $ % & ' ( ) * + , - . / : ; = ? @ [ ] _ ~

Again excluding letters and digits.  URIC3 is what I got as the
union of all sets explicitly defined in RFC 3986.  Based on
what RFC 4622 does, it boils down to "leave all VCHAR as is
unless you know the scheme".  
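A sketch of that "leave VCHAR alone" rule in Python, with CTL and SP always encoded. The function name and the extra parameter are hypothetical; extra defaults to the nine characters listed earlier in this message that I thought "must" be encoded. Leaving a literal "%" untouched and using UTF-8 for non-ASCII input are assumptions on my part:

```python
# The nine characters from earlier in this message: " < > \ ^ ` { | }
LEGACY_UNSAFE = frozenset('"<>\\^`{|}')


def encode_scheme_agnostic(text: str, extra=LEGACY_UNSAFE) -> str:
    """Percent-encode CTL, SP, non-ASCII octets and anything in `extra`.

    All other VCHAR (0x21-0x7E) pass through unchanged, including "%".
    Pass extra=frozenset() to get the bare "leave all VCHAR as is" rule.
    """
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        is_vchar = 0x21 <= byte <= 0x7E
        if is_vchar and ch not in extra:
            out.append(ch)
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


print(encode_scheme_agnostic("a b<c>"))                   # a%20b%3Cc%3E
print(encode_scheme_agnostic("a b<c>", extra=frozenset()))  # a%20b<c>
```

Compared with the unreserved-only approach, this keeps "/", "?", "=" and friends readable, at the price of not being safe for every URL part.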

For backwards compatibility I'd still also percent encode any
" < > \ ^ ` { | } no matter what RFC 4622 wants or says.

Frank


-------
Sender Policy Framework: http://www.openspf.org/
Archives at http://archives.listbox.com/spf-discuss/current/