Re: My open issues with RFC3028bis

Hmm, did you miss my reply on these points to your previous post?
Rather than repeat my comments...

http://www.imc.org/ietf-mta-filters/mail-archive/msg06094.html

I think I was waiting for a response or more comments before making
changes.  I should have added them to the open issues list while
waiting.

Sorry, I probably really missed it. :(

Strings Containing Header Names

Section 2.4.2.2:

   A header name never contains a colon.  The "From" header refers to a
   line beginning "From:" (or "From   :", etc.).  No header will match
   the string "From:" due to the trailing colon.

No header will match it, but what's the result?


The result is the same as if you tried to test a header that isn't
present - the test fails unconditionally.

How about ""?


No, because the empty string doesn't guarantee a failure - there can be
patterns that match it.

This is actually spelled out in the third paragraph of section 5.7.
(The first paragraph's wording is pretty awkward, however.)
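
To make the difference concrete, here is a minimal Python sketch of a header
test. It is not taken from any implementation: the function name header_test,
the list-of-pairs message model, and the plain octet comparison (rather than
the default comparator) are all assumptions made for illustration. A name
containing a colon behaves like an absent header, while a header whose value
is the empty string can still be matched.

```python
def header_test(headers, names, keys, match_type="is"):
    """Hypothetical model of a Sieve header test.

    headers: list of (name, value) pairs from the message.
    names:   header names given in the script.
    keys:    key strings to compare against each header value.
    Comparison is octet-wise here, a simplification for illustration.
    """
    for wanted in names:
        if ":" in wanted:
            # A header name never contains a colon, so nothing can match.
            continue
        for name, value in headers:
            if name.strip().lower() != wanted.lower():
                continue
            for key in keys:
                if match_type == "is" and value == key:
                    return True
                if match_type == "contains" and key in value:
                    return True
    return False


headers = [("From", "a@example.org"), ("X-Empty", "")]

# "From:" can match no header, so the test fails whatever the key is.
print(header_test(headers, ["From:"], [""], "contains"))
# An absent header fails in exactly the same way.
print(header_test(headers, ["X-Missing"], [""], "contains"))
# An empty value is different: the empty key is contained in it.
print(header_test(headers, ["X-Empty"], [""], "contains"))
```

The last call is why substituting "" for the missing value would not do: the
test could then succeed.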

If users
write "From:", most likely they made a mistake.  Should that really not
cause an error?


If we were specifying this for the first time, yes, absolutely. But we're not
doing that. We're revising an existing proposed standard, and we're supposed to
keep changes to a minimum so as not to break existing compliant
implementations.

This is what our WG charter has to say:

(1) Revise the base sieve specification, RFC 3028, with the intention
    of moving it to draft standard. Substantive additions or revisions to
    the base specification are out of scope of this working group. However,
    the need to loosen current restrictions on side effects of tests as
    well as the need for a normative reference to the newly-defined comparators
    registry may necessitate a recycle at proposed.

This sort of change constitutes a substantive revision in my book, and is
therefore out of scope.

MIME-Encoded NUL Characters
Header Test With Invalid MIME Encoding In Header

Hmm.  To paraphrase (and mangle) something that Dave Crocker said at the
meeting in Minneapolis: when there's a spec for something, it's a bad
idea to try to address general problems arising from it anywhere but in
that spec.

As I quoted, I see a contradiction in the base spec.


Well, the "contradiction" seems to be between the handling RFC 2045 recommends
for invalid encodings versus the handling RFC 2047 recommends. But this is
really not a contradiction since the two documents are talking about disjoint
sorts of encodings - one of body data and the other of encoded words in the
message header. So, while the handling of broken encodings specified in RFC
2045 might perhaps be relevant to the body extension, I do not view it as in any
way relevant to the sieve base specification.

Now, having said that, there appear to me to be three issues here:

(1) What to do about "incorrect" encoded words, where the "incorrectness"
    manifests as them being syntactically invalid according to RFC 2047.
    In this case I believe RFC 2047 rules should apply: This is simply
    literal text, not encoded words.

(2) What to do about "incorrect" encoded words where the encoded payload
    doesn't match up with the rules for the specified encoding. From what
    I can tell RFC 2047 is silent on this issue. However, RFC 2047 does
    suggest that encoded-words that specify an unrecognized encoding might
    best be treated as ordinary text. I think it appropriate to suggest, but
    not require, this handling for encoded words with busted encoded
    content.

(3) What to do about encoded words that decode to a series of octets that
    contain NULs, CRs, LFs, or whatever. RFC 2047 is pretty clear that this
    is actually legal and implementations must take it into account, so I have
    no problem with saying sieve implementations must also allow such output.
    OTOH, if sufficient existing implementations have issues, we might want
    to allow some leeway by making this just a suggestion.
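
A rough Python sketch of the handling described in (1) through (3), under
stated assumptions: the names ENCODED_WORD, decode_q, decode_word and
decode_header_value are made up for this example, the regular expression is a
loose approximation of the RFC 2047 grammar, and the rule about ignoring
white space between adjacent encoded words is left out. It is meant only to
show the three cases side by side, not to describe any existing
implementation.

```python
import base64
import binascii
import re

# Loose approximation of an encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([^?\s]+)\?([^?\s]*)\?=")
HEX_DIGITS = "0123456789ABCDEFabcdef"


def decode_q(text):
    """Decode the "Q" encoding; raise ValueError on a malformed escape."""
    out = bytearray()
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            out.append(0x20)          # underscore stands for a space
            i += 1
        elif ch == "=":
            pair = text[i + 1:i + 3]
            if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
                raise ValueError("bad Q escape")
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(ord(ch))       # ValueError if not a single octet
            i += 1
    return bytes(out)


def decode_word(match):
    """Decode one encoded-word, or return it unchanged if it is broken."""
    charset, encoding, payload = match.groups()
    try:
        if encoding.upper() == "B":
            raw = base64.b64decode(payload, validate=True)
        elif encoding.upper() == "Q":
            raw = decode_q(payload)
        else:
            raise ValueError("unrecognized encoding")
        return raw.decode(charset)
    except (binascii.Error, ValueError, LookupError):
        # Case (2): busted payload, unknown encoding or unknown charset.
        # Suggested (not required) handling: leave it as ordinary text.
        return match.group(0)


def decode_header_value(value):
    # Case (1): text that is not a syntactically valid encoded-word never
    # matches the pattern, so it passes through as literal text.
    return ENCODED_WORD.sub(decode_word, value)


# Case (3): a NUL in the decoded output is passed through, not rejected.
print(repr(decode_header_value("=?us-ascii?Q?a=00b?=")))
print(decode_header_value("=?utf-8?B?!!!?="))
print(decode_header_value("=?utf-8?Q?broken"))
```

If (3) were relaxed to a suggestion, an implementation could instead return
the encoded word unchanged when the decoded octets contain NUL, CR or LF.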

Either
interpretation could be valid, but only one probably is.  The fact that
the same Sieve code yields different results on different implementations
is not something to be described in the MIME RFCs.


Given that encoded words can specify any charset they like, including very
obscure ones, and different implementations are bound to provide different
charset support, I find it hard to get excited about different implementations
behaving differently in this regard. We're already there even with inputs that
are valid in every way.

String Arguments

As you can see, I agree with you and probably everybody else here
now.  Yet, the RFC still does not specify it.  If the collation draft
determines caseful or caseless matching, then please refer to it doing
so.  That does not answer how other extension names are being matched,
although we agree it should be caseful.


We do case-sensitive matches on require strings, so this is OK as far as we're
concerned. Hopefully there isn't a popular implementation that has spawned a
corpus of require "ENVELOPE"; scripts out there.
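
For illustration only, a tiny Python sketch of what a case-sensitive check on
require strings amounts to. The capability set and the name check_require are
hypothetical and not taken from our implementation.

```python
# Hypothetical set of capabilities an implementation supports.
SUPPORTED = {"fileinto", "envelope", "reject"}


def check_require(names, supported=SUPPORTED):
    """Return the required names that are not supported.

    The comparison is an exact, case-sensitive string match.
    """
    return [name for name in names if name not in supported]


print(check_require(["fileinto", "envelope"]))  # nothing rejected
print(check_require(["ENVELOPE"]))              # rejected: case differs
```

A script containing require "ENVELOPE"; would therefore be refused under this
rule even though "envelope" is supported.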

                                Ned