Re: [sieve] Issues with RFC 5260 - Date/Index (was: Issues with specifications)

Hi, Cyrus!

On Wed, Sep 09, 2009 at 12:04:01PM -0400, Cyrus Daboo wrote:

--On September 9, 2009 6:01:20 PM +0200 Hannah Schroeter
<hannah(_at_)schlund(_dot_)de> wrote:

Is it okay to discuss issues (what seems like ambiguities or gaps to
me) in Sieve specifications (RFC texts, perhaps also drafts) here,
from an implementor's perspective?

Yes, you are more than welcome to do that.

Okay, thanks for your prompt reply.

And, what's the relationship between this list (sieve@) and
ietf-mx-filters?

sieve(_at_)ietf(_dot_)org is the new home for the mailing list. The old
@imc.org address will forward to the new list.

Ok, thanks.

I've currently got the task to add more extensions to our implementation.

I'm stalled with the Date/Index Extensions a bit because of a few things
that are unclear to me.

- Index extension: It seems it is not specified what happens if the
  index is out of range, that is it's greater than the number of headers
  actually found for the header names given to the test.

  if header :index 5 "Received" "..."

  if there are only 4 Received headers, for example.


The test is supposed to fail in this case. I agree the RFC could be clearer
about this.

  For this, I've gone on and made the test fail (not match) - even for
  :contains with an empty key -, but the RFC doesn't explicitly say
  this (in contrast to for example yielding a run-time error).


That's the correct behavior.
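
To make the agreed behavior concrete, here is a minimal Python sketch of index selection. The function names and the list-of-pairs header representation are hypothetical, not taken from any real implementation, and the match is a plain substring check that ignores comparators. An index below 1 is treated like any other out-of-range index purely to keep the sketch short.

```python
from typing import Iterable, List, Optional, Tuple

Header = Tuple[str, str]  # (field name, field value); hypothetical representation


def select_by_index(headers: List[Header], names: Iterable[str],
                    index: int, last: bool = False) -> Optional[str]:
    """Return the index-th (1-based) field among those named, or None if out of range."""
    wanted = {n.lower() for n in names}
    matches = [value for name, value in headers if name.lower() in wanted]
    if last:
        # Count from the end instead of from the start.
        matches.reverse()
    if not 1 <= index <= len(matches):
        return None
    return matches[index - 1]


def header_contains(headers: List[Header], names: Iterable[str],
                    index: int, key: str) -> bool:
    """Simplified ':contains' test restricted to one field occurrence."""
    value = select_by_index(headers, names, index)
    if value is None:
        # No such occurrence: the test fails, even for an empty key.
        return False
    return key in value
```

With only four Received fields, `header_contains(headers, ["Received"], 5, "")` is false rather than a run-time error.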

- The index extension doesn't specify its behaviour with respect to
  :count tests. (The date extension does.)
  For header, is it meant to yield 1 if the index is in range, 0
  otherwise? And for address, the number of addresses in the one
  selected header line (if the index is in range)?


The meaning of :count is tied to the test being performed. :index only
restricts the applicability of the test to a specific header field occurrence;
it does not change the interpretation of count in any way. So, when :index is
used with header, :count produces a 1 if the specific header with that index
exists, 0 otherwise. Address is more useful; it would return a count of all the
addresses in the header. Date would again be a 0 or 1 depending on the
existence of the header and whether or not it contains a valid date.

I don't see a need for further clarification here.
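
The :count semantics described above can be sketched as follows. This is only an illustration: the function names are hypothetical, `values` is assumed to be the list of field bodies already selected by header name, and Python's `email.utils` stands in for whatever address and date parsers an implementation really uses.

```python
from email.utils import getaddresses, parsedate_to_datetime
from typing import List


def count_header(values: List[str], index: int) -> int:
    """header + :index + :count: 1 if that occurrence exists, 0 otherwise."""
    return 1 if 1 <= index <= len(values) else 0


def count_address(values: List[str], index: int) -> int:
    """address + :index + :count: number of addresses in the selected field."""
    if not 1 <= index <= len(values):
        return 0
    return len(getaddresses([values[index - 1]]))


def count_date(values: List[str], index: int) -> int:
    """date + :index + :count: 1 only if the field exists and holds a valid date."""
    if not 1 <= index <= len(values):
        return 0
    try:
        parsedate_to_datetime(values[index - 1])
    except (TypeError, ValueError):
        # Older Python versions raise TypeError, newer ones ValueError.
        return 0
    return 1
```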

- Date extension:
  * Section 4.2 enumerates the possible values for the date-part argument.
    However, it does not specify any error handling for the case if the
    script uses an invalid value there; especially if that value is
    generated by use of the variables extension, thus possibly making
    static checking impossible.

    Should invalid date-part arguments be an error (static where possible,
    run-time otherwise)?


Yes, I think so. Our implementation certainly does.
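
A minimal sketch of such a check, assuming a hypothetical `SieveScriptError` exception and helper names of my own. The set lists the date-part values from section 4.2. Matching them case-insensitively and detecting variable references by looking for "${" are simplifying assumptions of this sketch, not something the RFC states.

```python
from typing import Optional

# date-part values enumerated in RFC 5260, section 4.2
DATE_PARTS = frozenset({
    "year", "month", "day", "date", "julian", "hour", "minute",
    "second", "time", "iso8601", "std11", "zone", "weekday",
})


class SieveScriptError(Exception):
    """Hypothetical error type for invalid scripts."""


def check_date_part(date_part: str) -> str:
    """Return the normalized date-part, or raise if it is not a known value."""
    part = date_part.lower()  # assumption: compare case-insensitively
    if part not in DATE_PARTS:
        raise SieveScriptError(f"unknown date-part: {date_part!r}")
    return part


def check_at_compile_time(raw_argument: str) -> Optional[str]:
    """Static check; returns None when the check must wait until run time."""
    if "${" in raw_argument:
        # Looks like a variable reference: only the expanded value can be
        # checked, so call check_date_part() on it at run time instead.
        return None
    return check_date_part(raw_argument)
```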

      A precedent would be the "envelope-part" argument for the envelope
    test where implementations SHOULD consider unknown envelope parts
    an error (but not "MUST"). For comparators (that are given as string),
    it's even stronger: unknown comparators are an error (because they
    are either declared with require and then require causes an error if
    it's not known, or they aren't, then it's an error to use an
    un-require'ed comparator; section 2.7.3 of the base spec, towards
    the end).


I'm not sure why it's a SHOULD as opposed to a MUST, but I have no problem with
it.

  * Section 4.3 tells about comparator vs. date-part interactions.
    Do I read it right and that section is just recommendations for
    users?


Yes, that's all it is.

I.e. the implementation shouldn't reject a script that
    uses i;ascii-numeric in a non-recommended way (e.g. for "date",
    where it will in fact just use the year), but at most warn the user
    that the comparator may lose information (because it strips
    the value at the dash separating year from month)?


I'm not sure how you'd issue such a warning and the specification certainly
doesn't require it.
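
For illustration, the information loss in question looks like this. The sketch below is a simplified model of how i;ascii-numeric reads a string (leading digits only; a string that does not start with a digit counts as positive infinity), not a full comparator, and the function name is hypothetical.

```python
import math


def ascii_numeric_value(text: str) -> float:
    """Value of a string under a simplified i;ascii-numeric reading."""
    end = 0
    while end < len(text) and "0" <= text[end] <= "9":
        end += 1
    if end == 0:
        # No leading digit: treated as positive infinity.
        return math.inf
    return int(text[:end])
```

Applied to the "date" date-part, `ascii_numeric_value("2009-09-09")` yields 2009, so any two dates in the same year compare equal.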

  * Date extraction: The specification (in section 4, before 4.1) says
    the implementation must be able to extract a date from the entire
    field content or from the end of a field, following a semicolon.
    As the obsolete syntax is still not weeded out, even in RFC 5322,
    this extraction may prove difficult.

    Received: [... received-info ...]; Mon, 10 (This is a ridiculous
      comment containing a semicolon: see here: ; ) Aug 2009 ...

    Undoing the folding is probably easy, but finding the *relevant*
    semicolon is difficult if I want to implement/use only a *Date*
    parser and spare myself the work of parsing the structure of the
    Received header too.


I have to say I'm not seeing the difficulty here. A common trick is to parse
from the end backwards. That way all you have to contend with is comments,
which requires nothing but a simple counter.

Alternately, you can preprocess the entire header and remove all the comments
first.
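
The backwards scan can be sketched roughly as below. This is an illustration with hypothetical function names, not code from any implementation. It assumes that the field body is passed without the field name and that backslash-escaped parentheses inside comments should not affect the counter.

```python
def _is_escaped(text: str, pos: int) -> bool:
    """True if text[pos] is preceded by an odd number of backslashes."""
    backslashes = 0
    i = pos - 1
    while i >= 0 and text[i] == "\\":
        backslashes += 1
        i -= 1
    return backslashes % 2 == 1


def date_portion(field_body: str) -> str:
    """Return the text after the last semicolon that is outside any comment.

    If there is no such semicolon, the whole (unfolded) body is returned.
    """
    unfolded = field_body.replace("\r\n", "").replace("\n", "")
    depth = 0  # comment nesting depth, counted while scanning backwards
    for pos in range(len(unfolded) - 1, -1, -1):
        ch = unfolded[pos]
        if ch in "()" and _is_escaped(unfolded, pos):
            continue  # quoted-pair inside a comment, not a delimiter
        if ch == ")":
            depth += 1
        elif ch == "(":
            if depth > 0:
                depth -= 1
        elif ch == ";" and depth == 0:
            return unfolded[pos + 1:].strip()
    return unfolded.strip()
```

On the contrived Received example, the semicolon inside the comment is skipped because the counter is nonzero there, so the returned text starts at "Mon, 10 (" and can be handed to an ordinary date parser that knows how to skip comments.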

    The Date parser will be able to remove/skip
    the obsolete comment, but not the leading received-info. So we'd
    have to do a first pass in *reverse* to skip over CFWS until we hit
    a semicolon (outside the CFWS we skipped).

      Does the spec (RFC 5260) require that or would a simple-minded
    implementation (strip everything up to the last semicolon,
    regardless of structure, which would fail in my contrived example,
    i.e. make the test return false and use a count of zero) fulfill the
    spec too?


It's a border case, but I'd have to say I'd regard it as noncompliant.

I should also note that we're planning on a revision to the date-index
specification to fix an issue with the example Julian date routines (they return
regular, not modified, Julian dates). We might as well clarify these issues as
well.
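
For reference, the distinction between the two can be sketched as below. This is not the RFC's example routine, just a small illustration using Python's `datetime`; the function names are hypothetical. The Modified Julian Day counts whole days since 17 November 1858, and the regular Julian day number for the same calendar date is larger by 2400001.

```python
from datetime import date

# Day 0 of the Modified Julian Day count.
_MJD_EPOCH_ORDINAL = date(1858, 11, 17).toordinal()


def modified_julian_day(d: date) -> int:
    """Whole days between 1858-11-17 and the given calendar date."""
    return d.toordinal() - _MJD_EPOCH_ORDINAL


def julian_day_number(d: date) -> int:
    """Regular (unmodified) Julian day number for the given calendar date."""
    # JD = MJD + 2400000.5; the integer day number, which begins at noon,
    # is therefore MJD + 2400001.
    return modified_julian_day(d) + 2400001
```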

                                Ned

Re: [sieve] Issues with RFC 5260 - Date/Index (was: Issues with specifications)