Re: WGLC on draft-ietf-sieve-mime-loop


I like these edits a lot! +1 (with a very large value of 1).

One editing note, recall that RFC's should always use the Oxford comma in
lists of three items or more. I've noted two missing commas below. (Sorry,
this is uber-nit-picking; the RFC Editor should pick these up anyways.)

Aaron


On Fri, Aug 10, 2007, Nigel Swinson 
<Nigel(_dot_)Swinson(_at_)mailsite(_dot_)com> said:


WRT to: http://tools.ietf.org/html/draft-ietf-sieve-mime-loop-03 and some
related discussion of http://tools.ietf.org/html/draft-ietf-sieve-body-06

Abstract

I don't think it's necessary to start the abstract by saying what can't be
done with an existing standard.  By stating what this extension offers you
implicitly propose that no other standard offers this.

-   The Sieve email filtering language has no way to examine individual
-   MIME parts or any way to manipulate those individual parts.  However,
-   being able to filter based on MIME content is important.  This
-   document defines extensions for these needs.
+   This document defines extensions to the Sieve email filtering language
+   to permit analysis and manipulation of the MIME body parts of an email
+   message.

1.  Introduction

I don't think it's necessary to justify the current extension based on what
other standards don't allow you to do.  Also while it might be helpful to
refer to other related standards, it seems to just create headaches when you
want to revise the standard later, so I'd suggest:

-   Sieve scripts are used to make decisions about the disposition of an
-   email message.  The base Sieve specification,
-   [I-D.ietf-sieve-3028bis], defines operators for looking at the
-   message headers, such as addresses and the subject.  Other extensions
-   provide access to the body of the message ([I-D.ietf-sieve-body]), or
-   allow you to manipulate the header of the message
-   ([I-D.ietf-sieve-editheader]).  But none of these extensions take
-   into account that MIME messages ([RFC2045]) are often complex
   MIME messages ([RFC2045]) are often complex
   objects, consisting of many parts and sub-parts.  This extension
   defines mechanisms for performing tests on MIME body parts, looping
   through the MIME body parts, extracting information from a MIME body
   part, changing the contents of a MIME body part, and enclosing the
-   entire message with a wrapper.
+  entire message within a wrapper.

3.  Sieve Loops

   The base Sieve language has no looping mechanism.  Given that
   messages may contain multiple parts, in order to support filters that
   apply to any and all parts, we introduce a new control command:
   "for_every_part", which is an iterator that walks though every MIME
   part of a message, including nested parts, and applies the commands
   in the specified block to each of them.  The iterator will start with
   the first MIME part (as its current context) and will execute a
-   command block (Sieve commands enclosed by { ...}).  Upon completion
+   command block (Sieve commands enclosed by {...}).  Upon completion
   of this command block, the iterator advances to the next MIME part
   (as its current context) and executes the same command block again.


   "for_every_part" commands can be nested inside other "for_every_part"
   commands.  When this occurs, the nested "for_every_part" iterates
-   over the MIME parts contained within the MIME part current being
+   over the MIME parts contained within the MIME part currently being
   targeted by the nearest enclosing "for_every_part" command.  If that
   MIME part is a terminal MIME part (i.e. does not contain other MIME
   parts) then the nested "for_every_loop" is simply ignored.

4.1.  Test "header"

   The "header" test is extended with the addition of a new ":mime"
-   tagged argument, which takes a number of other arguments.
+   tagged argument and its associated options.

   Usage:  header [:mime] [:anychild] [MIMEOPTS]
      [COMPARATOR] [MATCH-TYPE]
      <header-names: string-list> <key-list: string-list>

   Usage:  The definition of [MIMEOPTS] is:

      Syntax: ":type" / ":subtype" / ":contenttype" /
      ":param" <param-list: string-list>

   When the ":mime" tagged argument is present in the "header" test, it
-   will parse the MIME header lines in a message so that tests can be
+   will parse the MIME header lines in the message so that tests can be
   performed on specific elements.

I don't think the section below reads well, as I think the list with
introductory partial sentence construct should mean you can prefix each list
item with the indroductory partial sentence and it still read well.  Below
you'd end up with:

- If the ":anychild" tagged argument is NOT specified if used within the
context of a "for_every_part" iterator, etc....
- If the ":anychild" tagged argument is NOT specified if used outside the
context of a "for_every_part" iterator, etc...

Also I think the behaviour should be explained when :anychild is used with
for_every_part.

-   If the ":anychild" tagged argument is NOT specified:
-
-   o  If used within the context of a "for_every_part" iterator, the
-      "header" test will examine the headers associated with the current
-      MIME part context from the loop.
-
-   o  If used outside the context of a "for_every_part" iterator, the
-      "header" test will examine only the outer, top-level, headers of
-      the message.
+   When used outside the context of a "for_every_part" iterator, and
+   without an ":anychild" tagged argument, the "header" test will examine
+   only the outer top-level RFC2822 headers of the message.
+
+   When used inside the context of a "for_every_part" iterator, and
+   without an ":anychild" tagged argument, the "header" test will examine
+   the headers associated with the current MIME part context from the loop.
+
-   If the ":anychild" tagged argument IS specified, the "header" test
+  When used outside the context of a "for_every_part" iterator, and
+  with an ":anychild" tagged argument, the "header" test
   will examine all MIME body parts and return true if any of them
   satisfies the test.
+
+  When used inside the context of a "for_every_part" iterator, and
+  with an ":anychild" tagged argument, the "header" test will examine
+  the current MIME part context and all it's nested MIME body parts,
+  returning true if any of them satisfies the test.


   Example:

   require ["mime", "fileinto"];

-   if header :mime :anychild :contenttype :comparator
+   if header :mime :anychild :contenttype
             "Content-Type" "text/html"
   {
       fileinto "INBOX.html";
   }

   In this example, any message that contains any MIME part with a
   content-type of "text/html" is saved to the mailbox "INBOX.html".

Reading over this again I don't think it's clear when you should use
for_every_part, and when you should use :anychild.  I think we should offer
an example that illustrates why we have both mechanisms.  Consider the
question, "Which one of these should I use:

   if header :mime :anychild :contenttype :comparator
             "Content-Type" "text/html"
   { ... }

Or:

    for_every_part {
       if header :mime :contenttype :comparator
             "Content-Type" "text/html"
       { ... }
    }

Perhaps change the for_every_part example.  It had the same ":comparator"
bug that the previous example had anyway.

   Example:

   require ["mime", "for_every_part", "fileinto"];

   for_every_part
   {

+       if allof (
-       if header :mime :param "filename" :comparator
-          "Content-Disposition" "important"
+      header :mime :param "filename" :contains
+         "Content-Disposition" "important",
+      header :mime :subtype "Content-Type" "pdf",
+       size :over "100K")
       {
           fileinto "INBOX.important";
           break;
       }
   }

-   In this example, any message that contains any MIME part with a
-   content-disposition with a filename parameter containing the text
-   "important" is saved to the mailbox "INBOX.important".

+    In this example, any message that contains a MIME part that
+    has a content-disposition with a filename parameter containing the text
+   "important", has a content-subtype of "pdf" and is bigger than 100 Kb
+    is saved to the mailbox "INBOX.important".


needs a comma after "pdf".


4.2.  Test "address"

I've tried and failed to find any specification on the "content-from"
header.  Perhaps partially because googling for the term "content-from" is
fairly unfruitful, and there's no mention in 2045.  So I'd suggest this
section be dropped, or changed to reference where "content-from" is defined.
I'm also not clear at all why you can't just use a vanilla address test.  ie
why doesn't this work:

    if address :is :all "content-from" "tim(_at_)example(_dot_)com"

I can understand utility in:

    if address :mime :anychild :is :all "content-from" 
tim(_at_)example(_dot_)com

In which case I'd rather it was just:

    if address :anychild :is :all "content-from" tim(_at_)example(_dot_)com

And the ":mime" be the way of accessing the type, subtype, parameters of
those structured headers.

4.3.  Test "exists"

Again I don't see why you couldn't just write:

if exists :anychild "content-md5"

5.  Action Replace

   If the :mime parameter is not specified, the replacement string is a
-   text/plain part.
+   text/plain part in UTF-8.

   If the :mime parameter is specified, then the replacement string is,
   in fact, a MIME entity as defined in [RFC2045] section 2.4, including
+ both MIME headers and content.
-   both MIME headers and content.  If the optional :mime parameter is
-   not supplied, the reason string is considered to be a UTF-8 string.

6.  Action Enclose

   A new Sieve action command is defined to allow an entire message to
   be enclosed as an attachment to a new message.  After enclosure,
   subsequent actions affecting the message header or content use the
-   newly create message instead of the original message; this means that
+   newly created message instead of the original message; this means that
   any use of a "replace" action or other similar actions should be
   executed before the "enclose" action.

The text implies that enclose happens immediately, and therefore subsequent
actions affecting the message header/content apply to the enclosing message,
but then it goes on to say if there are multiple enclose actions, then it's
only the last one that will win.  This means if I have an enclose action,
I'll have to remember that the outer body part is the new "enclosing" body
part, and should I get another enclose action then I should "replace" that
enclosing body part rather than enclose the message yet again such that it
has two layers of enclosing.

This sounds like a real hassle to implement for questionable utility, I'd
rather two enclose actions just add two layers of enclosing mime structure
and body text body parts be present.  Just like they would if the message
goes through two separate Sieve scripts, both of which add an enclose.
Either that or define that the enclose action should happen at the end of
the script and therefore doesn't affect the mime structure during the
existing processing (that might even be inside a for_every_part construct).
I'd actually suggest we leave it to the implementation like we do for
fileinto.  The implementation can choose to have the enclose happen
immediately, or at the end.

7.  Action extract_text

Shouldn't we leave this to the body test?  I know in section 5 of
http://tools.ietf.org/html/draft-ietf-sieve-body-06 it says that we
shouldn't set the match variables, but perhaps it should define an optional
argument of :extracttext in the body spec to do this if we really wanted
them.

8.  Sieve Capability Strings

-   A Sieve implementation that defines the ":mime" tagged arguments to
+  A Sieve implementation that defines the ":mime" tagged argument to
+   the "header" test and the ":anychild" tagged argument to
   the "header", "address" and "exists" commands will advertise the
   capability string "mime".


Needs a comma after "address".

   A Sieve implementation that defines the "extract_text" action will
   advertise the capability string "extract_text".  Note that to be
   useful, the "extract_text" action also requires the "variables"
-   [I-D.ietf-sieve-variables] and "mime" capabilities.
+   [I-D.ietf-sieve-variables] and "for_every_part" capabilities.

9.1.  Example 1

   A Sieve script to replace all the Windows executable attachments in a
   message would be:

This example isn't indented either and should be:

    require [ "for_every_part", "mime", "replace" ];
    for_every_part
    {
-  if ( anyof (
+  if anyof (
             header :mime :contenttype :is "Content-Type" "application/exe",
             header :mime :param "filename"
               ["Content-Type", "Content-Disposition"] :matches "*.com" )
      {
        replace "Executable attachment removed by user filter";
      }
    }

9.2.  Example 2

   A Sieve script to warn the user about executable attachment types
   would be:

   require [ "for_every_part", "mime", "enclose" ];

   for_every_part
   {
     if header :mime :param "filename"
        ["Content-Type", "Content-Disposition"] :matches
          ["*.com", "*.exe", "*.vbs", "*.scr",
           "*.pif", "*.hta", "*.bat", "*.zip" ]
     {
       # these attachment types are executable
-       enclose :subject "Warning" "
+      enclose :subject "Warning" text:
   WARNING! The enclosed message contains executable attachments.
   These attachments types may contain a computer virus program
-   that can infect your computer and potentently damage your data
+   that can infect your computer and potentently damage your data.

   Before clicking on these message attachments, you should verify
   with the sender that this message was sent by them and not a
   computer virus.
-   ";
    .
       break;
     }
   }

9.3.  Example 3

   A Sieve script to extract subject and text out of messages from the
-   boss
   boss:

11.  Security Considerations

   The "enclose" action creates an entirely new message, as compared to
   just redirecting or forwarding the existing message.  Therefore, any
   site policies applicable to message submission should be enforced.
-   site policies applicable to message submission should be enforced
-   here.

   The looping specification specified here provides easier access to
   information about the message contents, which may also be achieved
   through other sieve tests.  This is not believed to raise any
   additional security issues beyond those for the Sieve "envelope" and
+ "body" ([I-D.ietf-sieve-body]) tests.
-   "body" tests.

Cheers

Nigel

--