spf-discuss

Re: draft-schlitt-spf-00pre2 suggestions

2004-10-17 01:56:52
(Wayne, I'm glad you're back and excited about participating in whatever
capacity you see fit -- much better than not being around at all, in my
opinion)

I'm posting this to the list rather than sending to wayne directly
because I think some of these issues exist in other drafts (most likely
4, 5, and 6).  If I'm totally off my rocker, feel free to shoot me down
(but please clear up my confusion in the process).

On Sat, 2004-10-16 at 01:48, wayne wrote:
* The ABNF and wording for invalid percent escapes has been cleaned
  up, but I'm not sure if I like it.

  Right now, if you say "%a", which is an invalid percent escapement
  (only %-, %_ and %% are allowed), implementations are instructed to
  interpret it as the string "%a" instead of a syntax error.  Domain
  owners are told to never depend on this.  As a result, you have a
  bunch of messy standardization that doesn't really do much of
  anything.

  I would need to double check the actual deployed SPF records, but I
  strongly suspect that we could declare such constructs as syntax
  errors and not invalidate more than a handfull (if any) deployed SPF
  records.  It would certainly clean up the spec a little bit.
        
Issue 1:
        domain-spec    = *( macro-string )
                         1*( ( "." 1*ALPHA ) / macro-expand )
        macro-string   = *( macro-expand / macro-literal )
        
        macro-expand   = ( "%{" ALPHA transformer *delimiter "}" )
                         / "%%" / "%_" / "%-" / macro-expand-invalid
        macro-expand-invalid  = "%" ( %x21-24 / %x26-2C / %x2E-5E /
                                 %x60-7E )
                                 ; visible characters except "%", "_", 
                                   and "-"
        macro-literal  = %x21-24 / %x26-7E
                       ; visible characters except "%"

To make sure I'm reading this correctly, I want to clarify:
        
        1. the fact that a domain-spec can start with a period is kind
        of hidden in macro-literal
        
        2. a domain-spec that does start with a period matches both of
        the concatenated subexpressions of domain-spec
        
        3. a macro is not allowed immediately after a period (all
        periods must be followed by 1*ALPHA) anywhere else than the
        start of the string.
        
        4. double leading periods are allowed (via the expansion of
        "*( macro-string )" and the concatenation of "macro-string" and
        '"." 1*ALPHA')
        
        5. a domain-spec can start with a slash (see issue 2 below)

Correct?  The first one is okay, the second is confusing, and the third
one goes against the examples in 8.2 (the others are just things I
noticed).  Also, 8.2 could use an example matching:

        %{ir}a%{v}.example.com
i.e.:
        *( macro-string ) 1*("." 1*ALPHA)
in domain-spec.

Also, the examples in 8.2 seem to imply that macros need to be delimited
by periods (by design?).  Is this really the intent?
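
One way to check readings like these is to transcribe the ABNF into a
regular expression and try candidate strings against it.  The following
is only a sketch: the names DOMAIN_SPEC and is_domain_spec are made up
for illustration, and since the quoted ABNF does not define transformer
or delimiter, it assumes transformer = *DIGIT [ "r" ] and takes the
delimiter set from section 8.1.  It also writes the leading
"*( macro-string )" as a single macro-string, which accepts the same
strings.

```python
import re

# macro-expand, including the macro-expand-invalid alternative.
# ASSUMPTION: transformer = *DIGIT [ "r" ]; delimiter set from section 8.1.
MACRO_EXPAND = (
    r"(?:%\{[A-Za-z][0-9]*r?[.\-+,/_=]*\}"          # "%{" ALPHA transformer *delimiter "}"
    r"|%%|%_|%-"                                    # the three literal escapes
    r"|%[\x21-\x24\x26-\x2C\x2E-\x5E\x60-\x7E])"    # macro-expand-invalid
)

# macro-literal: visible characters except "%".
MACRO_LITERAL = r"[\x21-\x24\x26-\x7E]"

# macro-string = *( macro-expand / macro-literal )
MACRO_STRING = rf"(?:{MACRO_EXPAND}|{MACRO_LITERAL})*"

# domain-spec = *( macro-string ) 1*( ( "." 1*ALPHA ) / macro-expand )
# The outer "*" over macro-string is collapsed into MACRO_STRING itself.
DOMAIN_SPEC = re.compile(rf"{MACRO_STRING}(?:\.[A-Za-z]+|{MACRO_EXPAND})+")


def is_domain_spec(text):
    """Return True if the whole string matches the transcribed domain-spec."""
    return DOMAIN_SPEC.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in (".example.com", "..example.com", "/example.com",
                      "%{ir}a%{v}.example.com", "example"):
        print(candidate, is_domain_spec(candidate))
```

Under this transcription, a leading period, a double leading period, a
leading slash, and "%{ir}a%{v}.example.com" are all accepted, while a
bare "example" (no trailing '"." 1*ALPHA' or macro-expand) is not.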


Issue 2:
In section 8.1, the sentence:

        Note that the two different macro contexts, domain-spec, and
        macro-string allow slightly different sets of legal visible
        characters.

parses funny because of a comma placement.  I assume that "domain-spec"
and "macro-string" are the two contexts, which would make the phrase
"domain-spec and macro-string" an appositive (which happens to be a two
element enumerated list).  The original comma placement seems to treat
"macro contexts", "domain-spec", and "macro-string" solely as an
enumerated list, which I don't think is what you want.  Try:

        Note that the two different macro contexts, domain-spec and
        macro-string, allow slightly different sets of legal visible
        characters.

Which, without the appositive, is still readable:

        Note that the two different macro contexts allow slightly
        different sets of legal visible characters.

The next sentence:

        In particular, macro-string allows the slash character.

doesn't make much sense, since domain-spec is defined in terms of
macro-string, which contains macro-literal, which includes slash.  Of
course, if the previous sentence actually is an enumerated list with the
three elements "macro contexts", "domain-spec", and "macro-string", then
this sentence is correct, though not exactly clear: domain-spec also
allows the slash character, yet macro-string is called out as being
different from the other two.  Either way, I cannot find an exact list
of the contexts other than the ABNF.  Also, even if domain-spec were not
defined in terms of macro-string, it still would not explicitly exclude
slash, so making this distinction is not very useful.

If Issue 1 is fixed (possibly by removing "*( macro-string )" from
domain-spec), then this issue becomes moot, except for the suggested
difference in slash-character-inclusion in the different contexts, which
still won't be the case (because nothing in the ABNF specifically
excludes slash).


Issue 3:
RFC 2234 says that in the construct "<a>*<b>element", <a> and <b>
default to 0 and infinity respectively when they are not given.  I can
not resolve the multiple leading variable repetition operators in:

        domain-spec    = *( macro-string )
                         1*( ( "." 1*ALPHA ) / macro-expand )
        macro-string   = *( macro-expand / macro-literal ) 

I can not figure out why the definition of macro-string needs to be
able to match any of the following:

        macro-expand macro-expand
        macro-expand macro-expand macro-literal
        macro-literal macro-expand macro-literal
        etc...
        
since a series of macro-string are taken care of by the leading * in
domain-spec.  To expand these constructs:

        domain-spec = *( *(macro-expand / macro-literal ) )
                      1*( ( "." 1*ALPHA ) / macro-expand )

The leading variable repetition operator is redundant.  I think it
should be removed from macro-string, since domain-spec is meant to match
the whole entity, and macro-string is meant to only match a specific
substring of that.
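
The redundancy can be illustrated with a small bounded check.  This is
a sketch, not a proof: it stands in single letters for the ABNF
elements ("E" for macro-expand, "L" for macro-literal, "x" for anything
else, all hypothetical placeholders) and compares the nested and flat
repetitions on every token sequence up to a fixed length.

```python
import re
from itertools import product

# Token-level stand-ins (hypothetical, for illustration only):
#   "E" = macro-expand, "L" = macro-literal, "x" = neither.
NESTED = re.compile(r"(?:(?:E|L)*)*")  # *( *( macro-expand / macro-literal ) )
FLAT = re.compile(r"(?:E|L)*")         # *( macro-expand / macro-literal )


def same_language(max_len=6):
    """Compare both patterns on every token sequence up to max_len tokens."""
    for length in range(max_len + 1):
        for tokens in product("ELx", repeat=length):
            text = "".join(tokens)
            nested_ok = NESTED.fullmatch(text) is not None
            flat_ok = FLAT.fullmatch(text) is not None
            if nested_ok != flat_ok:
                return False
    return True


if __name__ == "__main__":
    print(same_language())
```

If the two patterns ever disagreed on a sequence, same_language would
return False; the check only covers sequences up to max_len tokens.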


Issue 4:
The example in 8.1:
        
        A '%' character not followed by a '{', '%', '-', or '_'
        character MUST be interpreted as a literal '%'. Domains MUST NOT
        rely on this feature; they MUST escape % literals. For example,
        an explanation TXT record 
        
                Your spam volume has increased by 581%
                
        is incorrect. Instead, say 
        
                Your spam volume has increased by 581%%

is kind of ambiguous, as the end of the string comes after the % in the
incorrect one.  I agree that the null-width-assertion-$ does satisfy
"not followed by a '{', '%', '-', or '_' character", but an example with
an actual character at the end might make it clearer to those not well
versed in regular expressions (where you do this kind of stuff all the
time).  Might I suggest:

    Domain forgeries have been reduced by 581%%, according to our logs.

Or even sticking a period at the end of the current example (I also
think it might be wise to avoid mentioning spam since the immediate goal
is not to reduce spam, but I digress).
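
To make the "not followed by" rule concrete, here is a minimal sketch
of how an implementation might treat percent literals.  The function
names are hypothetical, and the expansions of "%_" (a space) and "%-"
(a URL-encoded space, "%20") are my reading of the draft's macro
definitions rather than something quoted above; "%{" is left alone for
real macro expansion, which is out of scope here.

```python
import re

# ASSUMPTION: "%_" expands to a space and "%-" to "%20".
LITERAL_ESCAPES = {"%%": "%", "%_": " ", "%-": "%20"}


def expand_percent_literals(text):
    """Expand %%, %_ and %-; a '%' followed by anything else stays as is."""
    return re.sub(r"%[%_-]", lambda m: LITERAL_ESCAPES[m.group(0)], text)


def relies_on_lenient_percent(text):
    """True if text has a '%' not followed by '{', '%', '-' or '_'.

    Valid two-character sequences are removed first so that the second
    '%' of "%%" is not mistaken for a stray one.
    """
    remainder = re.sub(r"%[{%_-]", "", text)
    return "%" in remainder


if __name__ == "__main__":
    for sample in ("Your spam volume has increased by 581%",
                   "Your spam volume has increased by 581%%",
                   "reduced by 581%%, according to our logs."):
        print(repr(expand_percent_literals(sample)),
              relies_on_lenient_percent(sample))
```

Both forms of the 8.1 example expand to the same text, but only the
unescaped one is flagged as depending on the lenient interpretation,
whether the stray "%" is at the end of the string or followed by
another character.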

Issue 5:
In Section 8.1:

        If transformers or delimiters are provided, the replacement
        value for a macro letter is split into parts. After performing
        any reversal operation and/or removal of left-hand parts, the
        parts are rejoined using "." and not the original splitting
        characters. 
        
        By default, strings are split on "." (dots). Note that no
        special treatment is given to leading, trailing or consecutive
        delimiters, and so the list of parts may contain empty strings.
        Macros may specify delimiter characters which are used instead
        of ".". Delimiters MUST be one or more of the characters: 
        
                "." / "-" / "+" / "," / "/" / "_" / "="
          
There is a disconnect here between the words "delimiters" and "original
splitting characters" in the first paragraph above and the phrase 'split
on "."' in the second, such that I originally thought the examples in
8.2 were wrong.  I don't think it's clear enough that the specified
delimiter characters are used to explode the string, not to rejoin it
(there is no way to specify the rejoining character).  I think part of
the problem is that the first paragraph talks about later steps and the
second paragraph talks about earlier steps.  Might I suggest a
rephrasing, putting the series of steps to be taken closer to each other
and in order:

        If transformers or delimiters are provided, the replacement
        value for a macro letter is split into parts, using "." by
        default or, if provided, the specified delimiter characters,
        taken from the set:
        
                "." / "-" / "+" / "," / "/" / "_" / "="
                
        Using a sole delimiter of "." is equivalent to not specifying a
        delimiter at all.  Transformations (reversal and/or removal of
        left-hand parts) are then applied.  The parts are then rejoined
        using "." and not any of the original delimiters.
        
        Note that no special treatment is given to leading, trailing, or
        consecutive delimiters, and so the list of parts may contain
        empty strings.


Issue 6:
I think the examples section in 8.2 could use an example of leading,
trailing or consecutive delimiters, to actually show what an empty
string will look like:

        <sender> is strong--bad@email.example.com
        
        %{l}                 strong--bad
        %{l-}                strong..bad

An example of multiple delimiters (since macro-expand contains
"*delimiter") would also be helpful:

        <sender> is j.random+folder@email.example.com
                
        %{l}                 j.random+folder
        %{lr}                random+folder.j
        %{l.+}               j.random.folder
        %{lr.+}              folder.random.j
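
The split, transform, and rejoin steps from Issue 5 can be sketched in
a few lines, which also makes it easy to check the expansions in the
two tables above.  The function name and its parameters are
hypothetical; "keep" stands for the digit part of a transformer (keep
that many right-hand parts), which the examples here do not exercise.

```python
ALLOWED_DELIMITERS = ".-+,/_="


def expand_macro_value(value, delimiters="", reverse=False, keep=None):
    """Split value on the delimiters, transform the parts, rejoin with '.'.

    Hypothetical helper: an empty delimiters string means the default ".".
    """
    delimiters = delimiters or "."
    if any(d not in ALLOWED_DELIMITERS for d in delimiters):
        raise ValueError("delimiter outside the set allowed by section 8.1")

    # Split on every delimiter character; leading, trailing, and
    # consecutive delimiters are not special, so empty parts are kept.
    parts, current = [], ""
    for ch in value:
        if ch in delimiters:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)

    if reverse:
        parts.reverse()
    if keep:
        parts = parts[-keep:]  # drop left-hand parts, keep the rightmost

    # Always rejoin with ".", never with the original delimiters.
    return ".".join(parts)


if __name__ == "__main__":
    print(expand_macro_value("strong--bad"))                       # %{l}
    print(expand_macro_value("strong--bad", "-"))                  # %{l-}
    print(expand_macro_value("j.random+folder", ".+", reverse=True))  # %{lr.+}
```

The empty string between the two dashes in "strong--bad" survives as an
empty part, which is why %{l-} comes out with two consecutive dots.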


I apologize if things like this should have been pointed out sooner --
this is the first chance I've had to read a draft, any draft, from the
perspective of an implementor, so if my humble suggestions are worthy of
action, they might need to be changed in a number of drafts (most
notably the lentczner draft this one is based on).

Andy Bakun
spf(_at_)leave-it-to-grace(_dot_)com