spf-discuss

Re: draft-schlitt-spf-00pre2 suggestions

2004-10-17 13:02:55
In <1098003412(_dot_)13235(_dot_)172(_dot_)camel(_at_)alakazee> Andy Bakun 
<spf(_at_)leave-it-to-grace(_dot_)com> writes:

Issue 1:
        domain-spec    = *( macro-string )
                         1*( ( "." 1*ALPHA ) / macro-expand )
        macro-string   = *( macro-expand / macro-literal )
        
        macro-expand   = ( "%{" ALPHA transformer *delimiter "}" )
                         / "%%" / "%_" / "%-" / macro-expand-invalid
        macro-expand-invalid  = "%" ( %x21-24 / %x26-2C / %x2E-5E /
                                 %x60-7E )
                                 ; visible characters except "%", "_", 
                                   and "-"
        macro-literal  = %x21-24 / %x26-7E
                       ; visible characters except "%"

To make sure I'm reading this correctly, I want to clarify:
        
        1. the fact that a domain-spec can start with a period is kind
        of hidden in macro-literal

Yes, that has always been true.  

        2. a domain-spec that does start with a period matches both of
        the concatenated subexpressions of domain-spec

yes, a domain of ".foo" has always been syntactically allowed by the
SPF grammars.

        3. a macro is not allowed immediately after a period (all
        periods must be followed by 1*ALPHA) anywhere else than the
        start of the string.

This is not true.  It would be matched by the "*( macro-string )" part
of the above grammar.

        4. double leading periods are allowed (expansion of
        "*(macro-string)" and the concatenation of "macro-string" and
        '"." 1*ALPHA'

Yes, a domain of "..foo" has always been syntactically allowed by the
SPF grammars.


        5. a domain-spec can start with a slash (see issue 2 below)

Yep.  That is allowed (but ambiguous) in spf-draft-200406 and allowed
(unambiguously) in draft-schlitt-spf-00*.


Correct?  The first one is okay, the second is confusing, and the third
one goes against the examples in 8.2 (the others are just things I
noticed).  Also, 8.2 could use an example matching:

As I mentioned above, the third statement is not correct: the grammar
does allow a macro after a period, so it matches the examples in 8.2.


Issue 2:
In section 8.1, the sentence:

        Note that the two different macro contexts, domain-spec, and
        macro-string allow slightly different sets of legal visible
        characters.

parses funny because of a comma placement.

Thanks, fixed.  This sentence has been deleted since domain-spec and
macro-string both allow slashes in draft-schlitt-spf-00 (and
spf-draft-200406). 



Issue 3:
A review of RFC2234 says that the construct "<a>*<b>element" where <a>
and <b> are not given default to 0 and infinity respectively.  I can not
resolve the multiple leading variable repetition operators in:

        domain-spec    = *( macro-string )
                         1*( ( "." 1*ALPHA ) / macro-expand )
        macro-string   = *( macro-expand / macro-literal ) 

I can not figure out why the definition of macro-string would/could
match any of the following:

        macro-expand macro-expand
        macro-expand macro-expand macro-literal
        macro-literal macro-expand macro-literal
        etc...

Uh, because macro-string can be repeated zero or more times, and "/"
means alternation (regular expressions use "|" or "\|" instead of
"/").  I'm pretty sure I don't understand what you don't understand
and so I don't understand how to help you.  (Other than by writing
confusing sentences. ;-)
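
For anyone else puzzling over the repetition and alternation, here is a
rough sketch that transcribes the macro-string rule into a Python regular
expression.  It is only an illustration, not part of any draft.  It
assumes that "transformer" means optional digits followed by an optional
"r" (that rule is not quoted in this thread), and it leaves out
macro-expand-invalid on purpose.  The names MACRO_EXPAND, MACRO_LITERAL
and is_macro_string are made up for the example.

```python
import re

# macro-expand = ( "%{" ALPHA transformer *delimiter "}" ) / "%%" / "%_" / "%-"
# Assumption: transformer = *DIGIT [ "r" ]; macro-expand-invalid is omitted.
MACRO_EXPAND = r"%\{[A-Za-z][0-9]*r?[.\-+,/_=]*\}|%%|%_|%-"

# macro-literal = %x21-24 / %x26-7E  (visible characters except "%")
MACRO_LITERAL = r"[\x21-\x24\x26-\x7e]"

# macro-string = *( macro-expand / macro-literal )
# "*" becomes the trailing "*", and "/" becomes "|".
MACRO_STRING = re.compile("(?:" + MACRO_EXPAND + "|" + MACRO_LITERAL + ")*")


def is_macro_string(text):
    """Return True if the whole of text matches the macro-string sketch."""
    return MACRO_STRING.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["%{d}%{i}", "%{d}%{i}x", "x%{d}y", "", "%i", "a b"]:
        print(repr(sample), is_macro_string(sample))
```

The first three samples are the "macro-expand macro-expand" style
sequences asked about above, and they all match because the group may
repeat any number of times.  "%i" is rejected only because this sketch
drops the invalid-escape rule.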


[domain-spec definition]
The leading variable repetition operator is redundant.  I think it
should be removed from macro-string, since domain-spec is meant to match
the whole entity, and macro-string is meant to only match a specific
substring of that.

I think the redundant repetition operator should be removed from
domain-spec, not macro-string, and have done so in
draft-schlitt-spf-00.  macro-string needs to keep its repetition
operator because it is used elsewhere.


Issue 4:
The example in 8.1:
        
        A '%' character not followed by a '{', '%', '-', or '_'
        character MUST be interpreted as a literal '%'. Domains MUST NOT
        rely on this feature; they MUST escape % literals. For example,
        an explanation TXT record 
        
                Your spam volume has increased by 581%
                
        is incorrect. Instead, say 
        
                Your spam volume has increased by 581%%

is kind of ambiguous, as the end of the string comes after the % in the
incorrect one.  I agree that the null-width-assertion-$ does satisfy
"not followed by a '{', '%', '-', or '_' character", but an example with
an actual character at the end might make it clearer to those not well
versed in regular expressions (where you do this kind of stuff all the
time).  Might I suggest:

    Domain forgeries have been reduced by 581%%, according to our logs.

Or even sticking a period at the end of the current example (I also
think it might be wise to avoid mentioning spam since the immediate goal
is not to reduce spam, but I digress).

Good points about giving an example with a messy case and using the
word "spam".  I notice that I need to fix up the ABNF even more to
allow for the trailing "%" to be valid.  In particular, it should be
noted that while a trailing % gets left as a %, "% " is never allowed
because a space is always used to separate mechanisms.

So, the SPF record "v=spf1 a:foo%_bar.com" contains the single
mechanism a:, with a domain that contains a space.  The SPF record
"v=spf1 a:foo% bar.com" contains two mechanisms, the second one being
the invalid mechanism "bar.com".  There the % is at the end of the
macro-string; it is not the invalid percent escape "% ".
A "% " sequence is never possible to encounter.

Oh, wait, check that.  In explanation TXT RRs, a "% " *is* possible to
encounter, so the ABNF for all this stuff needs to be made even messier.
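
To make the SPF-record case concrete, here is a minimal sketch of the
order of operations: the record is split on spaces first, and only then
are the "%%", "%_" and "%-" escapes handled inside each term.  The
function names are invented for the example, and it deliberately ignores
"%{...}" macros and everything else a real parser has to do.

```python
import re

# Replacement text for the three two-character escapes.
# "%-" stands for a URL-encoded space.
ESCAPES = {"%": "%", "_": " ", "-": "%20"}


def split_terms(record):
    """Split an SPF record into terms; spaces always separate terms."""
    return [term for term in record.split(" ") if term]


def expand_escapes(text):
    """Expand %%, %_ and %-.  Any other "%" is left untouched."""
    return re.sub(r"%([%_-])", lambda m: ESCAPES[m.group(1)], text)


if __name__ == "__main__":
    for record in ["v=spf1 a:foo%_bar.com", "v=spf1 a:foo% bar.com"]:
        terms = split_terms(record)
        print(terms, [expand_escapes(t) for t in terms])
```

In the second record the "%" ends up as the last character of the term
"a:foo%", so the splitter never hands a "% " sequence to the macro code.
An explanation TXT RR is not split this way, which is why "% " can
show up there.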


This stuff is *UUUuugly*.

Again, I'll ask for comments on whether support for this invalid %
escape stuff, which MUST NOT ever be used, should just be eliminated
from my doc so that such escapes become syntax errors instead.

I just checked my survey of SPF records and there *are* a few cases
where people have written things like "%i", "%(i)", "{%i}", or even
"%.{d}" and all of these would become syntax errors.  Granted, none of
these currently do anything close to what I think the domain owners
intended with the current spec, but now they are silently ignored.


Issue 5:
In Section 8.1:

        If transformers or delimiters are provided, the replacement
        value for a macro letter is split into parts. After performing
        any reversal operation and/or removal of left-hand parts, the
        parts are rejoined using "." and not the original splitting
        characters. 
        
        By default, strings are split on "." (dots). Note that no
        special treatment is given to leading, trailing or consecutive
        delimiters, and so the list of parts may contain empty strings.
        Macros may specify delimiter characters which are used instead
        of ".". Delimiters MUST be one or more of the characters: 
        
                "." / "-" / "+" / "," / "/" / "_" / "="
          
There is a disconnect here between the words "delimiters" and "original
splitting characters" in the first paragraph above and the phrase 'split
on "."' in the second, in such a way that I originally thought the
examples in 8.2 were wrong.  I don't think it's clear enough that the
specified delimiter characters are used to explode the string, not
used for rejoining (which there is no way to specify).  I think part of
the problem is that the first paragraph talks about later steps and the
second paragraph talks about earlier steps.  Might I suggest a
rephrasing, putting the series of steps to be taken closer to each other
and in order:

        If transformers or delimiters are provided, the replacement
        value for a macro letter is split into parts, using "." by
        default or, if provided, the specified delimiter characters,
        taken from the set:
        
                "." / "-" / "+" / "," / "/" / "_" / "="
                
        Using a sole delimiter of "." is equivalent to not specifying a
        delimiter at all.  Transformations, either reversals and/or
        removal of left-hand parts, are then applied.  The parts are
        then rejoined using "." and not any of the original delimiters.
        
        Note that no special treatment is given to leading, trailing, or
        consecutive delimiters, and so the list of parts may contain
        empty strings.

I didn't snip any of your long comment in order to give people the
complete context.

I've been over (and rewritten) this particular section with Meng so
many times that I find all of it equally confusing.  I appreciate a
fresh pair of eyes on this section.  If others agree that the
suggested text is clearer, I will use it instead.



Issue 6:
I think the examples section in 8.2 could use an example of leading,
trailing or consecutive delimiters, to actually show what an empty
string will look like:

        <sender> is strong--bad@email.example.com
        
        %{l}                 strong--bad
        %{l-}                strong..bad

An example of multiple delimiters (since macro-expand contains
"*delimiter") would also be helpful:

        <sender> is j.random+folder@email.example.com
                
        %{l}                 j.random+folder
        %{lr}                random+folder.j
        %{l.+}               j.random.folder
        %{lr.+}              folder.random.j

Hmmm...  As Mark likes to say, the spec should not be a how-to, but I
can see your point.  I'll have to think about this one a little more
and see...
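
In the meantime, the split / transform / rejoin order from Issue 5 can
be sketched in a few lines of Python.  This is an illustration under
stated assumptions, not text for the spec: the function name and its
parameters are invented, "keep" is assumed to be the number of
right-hand parts to retain (the "removal of left-hand parts"), and a
keep value below 1 is simply rejected.

```python
import re

ALLOWED_DELIMITERS = ".-+,/_="


def transform(value, keep=None, reverse=False, delimiters="."):
    """Split value on the delimiters, transform the parts, rejoin with "."."""
    if not delimiters:
        delimiters = "."  # no delimiter given: split on "." by default
    for ch in delimiters:
        if ch not in ALLOWED_DELIMITERS:
            raise ValueError("invalid delimiter: %r" % ch)
    if keep is not None and keep < 1:
        raise ValueError("keep must be at least 1 in this sketch")

    # Step 1: split on any single delimiter character.  Leading, trailing
    # and consecutive delimiters get no special treatment, so empty
    # strings are kept in the list of parts.
    parts = re.split("[" + re.escape(delimiters) + "]", value)

    # Step 2: apply the transformations (reversal, then dropping
    # left-hand parts so only the last `keep` remain).
    if reverse:
        parts.reverse()
    if keep is not None:
        parts = parts[-keep:]

    # Step 3: always rejoin with ".", never with the original delimiters.
    return ".".join(parts)


if __name__ == "__main__":
    print(transform("strong--bad", delimiters="-"))
    print(transform("j.random+folder", reverse=True, delimiters=".+"))
```

With the local-parts from Issue 6, the first call corresponds to
"%{l-}" and the second to "%{lr.+}".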


-wayne