spf-discuss

Re: ANNOUNCE: SRS v0.15 documentation and code

2004-02-11 11:58:56
In <Pine(_dot_)LNX(_dot_)4(_dot_)53(_dot_)0402110051270(_dot_)29116(_at_)astray(_dot_)com> Shevek <spf(_at_)anarres(_dot_)org> writes:

I would appreciate feedback on both the documentation and the code.

Ok, I've thought about this some more.


First, I went and checked: according to RFC2821, the local part *MAY*
be case sensitive, so technically MTAs can't change the case.  I don't
think we can assume that every MTA preserves it, though.  So, I think
base64 encoding is out, and base36 is probably pretty reasonable.
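
To make the case problem concrete, here is a minimal Python sketch.  It
assumes an MTA that lowercases the local part; the helper name
to_base36 is mine and is not taken from the SRS code.  A base64 value
decodes to different bytes once its case is folded, while a base36
value decodes to the same number in either case.

```python
import base64

DIGITS36 = "0123456789abcdefghijklmnopqrstuvwxyz"

def to_base36(n: int) -> str:
    """Encode a non-negative integer using digits 0-9 and a-z only."""
    if n == 0:
        return "0"
    out = []
    while n:
        n, r = divmod(n, 36)
        out.append(DIGITS36[r])
    return "".join(reversed(out))

data = b"\x00\x10\x83"                      # arbitrary example bytes
b64 = base64.b64encode(data).decode("ascii")
# Simulate a (hypothetical) MTA that lowercases the local part.
b64_survives = base64.b64decode(b64.lower()) == data

value = int.from_bytes(data, "big")
b36 = to_base36(value)
# int(..., 36) accepts either case, so case folding is harmless here.
b36_survives = int(b36.upper(), 36) == value and int(b36.lower(), 36) == value

print(b64, b64_survives)
print(b36, b36_survives)
```

Note that encoding through an integer drops leading zero bytes, so a
real field would need a fixed width or some padding.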


Secondly, I think a few characters can be squeezed out of the format.

Combining the hash and the timestamp, both binary data, into one
base64-encoded field would save at least one character (the "+" that
currently separates them) and possibly a second character by reducing
the number of bits used by the timestamp.

If we really wanted to, we could also remove the "+" between the
SRS[01] and the first domain name and the "+" between the
hash/timestamp and the second domain.

So, if we really wanted to, we could remove 4 characters that are
using 6% of the 64 character "limit" on the local part.
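
The arithmetic can be sketched as follows.  The bit widths (a 24-bit
truncated hash and a 10-bit timestamp) are assumptions chosen only for
illustration, not the widths SRS actually uses, and chars_needed is a
hypothetical helper.

```python
def chars_needed(bits: int, base: int) -> int:
    """Smallest number of base-`base` digits that can hold `bits` bits."""
    length, capacity = 0, 1
    while capacity < 2 ** bits:
        capacity *= base
        length += 1
    return length

HASH_BITS = 24   # assumed width, for illustration only
TIME_BITS = 10   # assumed width, for illustration only

for base in (64, 36):
    # Two separate fields plus the "+" that separates them.
    separate = chars_needed(HASH_BITS, base) + 1 + chars_needed(TIME_BITS, base)
    # One field holding both values packed together.
    combined = chars_needed(HASH_BITS + TIME_BITS, base)
    print(base, separate, combined, separate - combined)

# Share of the 64 character local-part "limit" taken by 4 characters.
share = 4 / 64 * 100
print(share)
```

With these assumed widths, packing the two values into one field saves
one character in either base; dropping the remaining delimiters would
require the fields to have a fixed width so they can still be split.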


Meng said that he would try to gather some statistics on the frequency
of different envelope-from lengths.  Depending on what this data
shows, saving a few characters may be very important, or completely
irrelevant.


Ok, one other thing that is nagging me about this proposal is that I'm
not certain that this encoding is unambiguous.  It makes me nervous
that strings supplied by a third party are left unquoted in the
local-part.   In particular, the local-part can contain really nasty
looking stuff, including things that look like domains.

However, I've been checking this, and I can't find any problems.  The
old local-part is at the very end of the rewritten local-part (by
design, I'm almost certain) and the valid characters in a domain name
are pretty limited.  RFC2821/2822 say that "+" is not valid in a
domain, so it can be used as a safe delimiter.

So, I still feel nervous about this part, but I can't find anything
wrong with it.
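
Here is a sketch of why this parses unambiguously.  The field layout
SRS0+hash+timestamp+domain+old-local-part is an assumption made for
this example and may not match the v0.15 format exactly; the function
name is hypothetical.  Because "+" cannot appear in the domain and the
old local-part comes last, splitting from the left a fixed number of
times leaves any "+" inside the old local-part untouched.

```python
def split_rewritten(local_part: str) -> dict:
    """Split an (assumed) SRS0 local part into its fields.

    Assumed layout: SRS0+<hash>+<timestamp>+<domain>+<old-local-part>
    """
    # At most 4 splits: everything after the 4th "+" is the old local part.
    fields = local_part.split("+", 4)
    if len(fields) != 5 or fields[0] != "SRS0":
        raise ValueError("not in the assumed SRS0 layout")
    _, hash_, timestamp, domain, old_local = fields
    return {
        "hash": hash_,
        "timestamp": timestamp,
        "domain": domain,
        "old_local": old_local,
    }

# The original local part "user+tag" itself contains the delimiter.
parsed = split_rewritten("SRS0+ab12+9x+example.com+user+tag")
print(parsed)
```

This only works as long as the hash, the timestamp, and the domain
never contain a "+", which is exactly the assumption in question.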

I guess the other thing to ponder is whether there are MTAs out there
that are violating the RFCs and using "+" in a domain name.  A name
containing "+" *is* a valid domain name, according to RFC1035, but I
don't know if you could get away with using such a domain name as a
valid mail hostname in the real world.

Maybe Meng could dig up some real-world data on the types of
characters that are found in the domain names, in particular, if he
has ever seen a "+" in one.
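
A survey like that could be as simple as the following sketch, which
counts every character in the domain part that is not a letter, digit,
hyphen, or dot.  The sample addresses are made up, and the function
name is hypothetical.

```python
from collections import Counter
import string

# Letters, digits, hyphen, plus the dot that separates labels.
LDH = set(string.ascii_letters + string.digits + "-.")

def unusual_domain_chars(addresses):
    """Count characters outside LDH in the domain part of each address."""
    counts = Counter()
    for addr in addresses:
        # The domain is whatever follows the last "@".
        _, sep, domain = addr.rpartition("@")
        if not sep:
            continue  # no "@", so there is no domain to inspect
        counts.update(ch for ch in domain if ch not in LDH)
    return counts

sample = ["a@example.com", "b@odd+name.example", "no-at-sign"]
counts = unusual_domain_chars(sample)
print(counts)
```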


All-in-all, I think the SRS system looks very good.  Shevek has
cleared up the CPU usage issue.  The only thing I still have questions
about is how big of a problem the 64 character local-part limit is
going to be.


-wayne