ietf-mta-filters

Re: Comments on draft-daboo-sieve-mime-00.txt

2004-03-15 17:03:45

I've spent quite a lot of time thinking about this particular extension too and
can't wait to implement it, but I have never felt comfortable with any of the
syntax proposed so far.  So here's another proposal to throw into the pool for
thought:

I think MIME consists of two significant parts here: there's a set of new
headers, and then there's the facility for multiple body parts.  Looking at the
list of new headers, most of them seem to fit neatly into our existing tests
(Mime-Version, Content-Id, etc).  The one that has significant additional
structure is the Content-Type header, which is defined as:

     content := "Content-Type" ":" type "/" subtype
                *(";" parameter)
                ; Matching of media type and subtype
                ; is ALWAYS case-insensitive.

Then looking more generically we've also got:

     disposition := "Content-Disposition" ":"
                    disposition-type
                    *(";" disposition-parm)

And there are probably others.  So we have this general syntax:

     mime-header := header-name ":"
                    header-value
                    *(";" param-name "=" param-value)

So let's provide a tagged argument to the header, exists and address tests that
parses out this structure and lets us select exactly which bits we want to
test against:

   Syntax:   header [HEADER-PART] [COMPARATOR] [MATCH-TYPE]
             <header-names: string-list> <key-list: string-list>

    HEADER-PART: ":value" / ":parameter" <param-names: string-list>

- If we specify the ":value" header-part, then the test operates only on the
header-value, excluding any parameters.
- If we specify the ":parameter" header-part, then its string list names the
parameters whose values we should search in.
- If neither header-part is specified, the header test operates on the
entire header value as normal.
- Other HEADER-PARTs could be defined in the future if we have other headers
with a different defined structure.
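
As a rough illustration of the selection rules above (not part of the proposal
itself), here is a minimal Python sketch.  The function names split_header and
select, and the carrier header name "X-Structured", are hypothetical; the
parsing is delegated to the standard library's Message.get_params.

```python
from email.message import Message


def split_header(raw_value):
    """Split 'value; name=val; ...' into (value, {name: val})."""
    msg = Message()
    msg["X-Structured"] = raw_value  # hypothetical carrier header, only used for parsing
    parts = msg.get_params(header="X-Structured")
    value = parts[0][0]  # first element is the bare header-value
    params = {name.lower(): val for name, val in parts[1:]}
    return value, params


def select(raw_value, header_part=None, param_names=()):
    """Return the strings that a match-type would be applied to."""
    if header_part is None:
        return [raw_value]  # whole header value, as today
    value, params = split_header(raw_value)
    if header_part == ":value":
        return [value]
    if header_part == ":parameter":
        wanted = [name.lower() for name in param_names]
        return [params[name] for name in wanted if name in params]
    raise ValueError("unknown header-part: %r" % header_part)


raw = 'attachment; filename="setup.exe"; size=1024'
print(select(raw, ":value"))
print(select(raw, ":parameter", ["filename"]))
```

The match-type (":contains", ":is", ...) would then run over the returned list
exactly as it does over whole header values today.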

If we think :value is too confusing, since it could be read as either "the
header value without its parameters" or "the whole header value", then perhaps
we could use :valueonly, :noparameters, :mimevalue or something else.  Maybe
the prefix could act like a "namespace" that names the header parsing algorithm
to use; if other headers share a common structure, we could define further
namespaces for their parsing methods.


The above would test only in the "active body part", which by default is the
top-level body part, i.e. the email headers we've all been dealing with
already.  It would NOT recursively look for headers in all child body parts;
that would be done with the second half of my proposal.  A server that doesn't
have a MIME parser, or is worried about the extra overhead of parsing the MIME
structure, wouldn't support this second part, but could still support
header-parts.

The second significant half of MIME permits us to recursively define body 
parts, which each have a set of headers, and content.  So any solution that we 
have needs to cope with the potential recursive nature of the MIME structure.  
So what if we define a new tagged argument for use with all of our existing 
tests that specifies that we should recursively look in the body parts too?  
Call it :bodyparts.  So this means that if I write a test that says:

    if header :contains "Content-Type" "text/html"

Then this would only match if the top-level Content-Type is text/html, i.e. the
message consists of a single text/html body part.  However, if I write the
following, it would match if there is ANY text/html body part anywhere in the
legal MIME structure of the message:

    if header :bodyparts :contains "Content-Type" "text/html"

With the new header-part argument this would more correctly become:

    if header :bodyparts :value :contains "Content-Type" "text/html"
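
To make the difference between the plain test and the :bodyparts variant
concrete, here is a small Python sketch.  It assumes that :bodyparts covers the
top-level headers as well as every nested part, and that matching is
case-insensitive; header_contains and the sample message are made up for
illustration.

```python
from email import message_from_string

RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="b1"

--b1
Content-Type: text/plain

hello
--b1
Content-Type: text/html

<p>hello</p>
--b1--
"""


def header_contains(msg, name, key, bodyparts=False):
    """Substring test on a header, optionally in every body part."""
    # walk() yields the message itself first, then all nested parts.
    parts = msg.walk() if bodyparts else [msg]
    key = key.lower()
    return any(
        key in value.lower()
        for part in parts
        for value in part.get_all(name, [])
    )


msg = message_from_string(RAW)
# Top level is multipart/mixed, so the plain test does not match ...
print(header_contains(msg, "Content-Type", "text/html"))
# ... but one of the nested parts does.
print(header_contains(msg, "Content-Type", "text/html", bodyparts=True))
```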

Then if we combine this with allof(), we could further define that multiple 
tests within an allof() must apply to the same body part.  So if we say:

    if allof :bodyparts (header :value :contains "Content-Type" "image/jpeg",
                                  size :over 100K)

Then this would catch all messages that contain an image/jpeg body part that is
over 100K.  However, if we say the following, then this would catch any message
that contains a body part that is image/jpeg and a body part (potentially a
different one) that is over 100K in size:

    if allof (header :bodyparts :value :contains "Content-Type" "image/jpeg",
                size :bodyparts :over 100K)
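
The two readings can be sketched in Python as follows.  This is only an
illustration: it assumes a part's "size" means the length of its decoded
content (the proposal doesn't define this), and all function names are
hypothetical.

```python
from email.message import EmailMessage

LIMIT = 100 * 1024  # Sieve's "100K"


def is_jpeg(part):
    return "image/jpeg" in part.get_content_type()


def over_limit(part):
    # Assumption: size of a part = length of its decoded payload.
    payload = part.get_payload(decode=True)  # None for multipart containers
    return payload is not None and len(payload) > LIMIT


def allof_same_part(msg):
    """allof :bodyparts (...): both tests must hold for one and the same part."""
    return any(is_jpeg(p) and over_limit(p) for p in msg.walk())


def allof_any_parts(msg):
    """allof (header :bodyparts ..., size :bodyparts ...): parts may differ."""
    parts = list(msg.walk())
    return any(is_jpeg(p) for p in parts) and any(over_limit(p) for p in parts)


# A message with a tiny JPEG and a large non-JPEG attachment.
msg = EmailMessage()
msg.set_content("see attached")
msg.add_attachment(b"\xff\xd8" + b"\x00" * 10,
                   maintype="image", subtype="jpeg", filename="small.jpg")
msg.add_attachment(b"\x00" * 200_000,
                   maintype="application", subtype="octet-stream",
                   filename="big.bin")

print(allof_same_part(msg), allof_any_parts(msg))
```

For this sample message only the second form matches, because no single part
is both a JPEG and over the limit.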

This then fits very neatly with the body extension, and we get a test like this,
which would match all messages that contain a text body part that contains
the word "Spam":

    if allof :bodyparts (header :value :contains "Content-Type" "text",
                                  body :contains "Spam")

For more examples, I'll go through Cyrus's examples and show how they would
look under this proposal.  Before I do, I note that many of them rely on an "in
any header value" or an "in any parameter value" concept.  We haven't found a
need for these so far, and I'm not sure we need them.  If we do, then I'd
propose we have new tests like "anyheader" and "anyexists", and a new
header-part of ":anyparameter".

Also, to match Cyrus's examples like for like, I have to use my :bodyparts
argument on all of them, even though that isn't always necessary, and removing
it may actually create a better test.

1) Test the presence of a specific MIME header:

    if (mimeheader :name "Content-Foo" :exists)

if exists :bodyparts "Content-Foo"

(Reads as: If there is any "Content-Foo" header in any body part)

2) Test the presence of a specific parameter in any MIME header:

    if (mimeheader :parameter-name "filename" :exists)

if anyexists :bodyparts :parameter "filename"

(Reads as: If there is any header in any body part which has a filename 
parameter)

3) Test the presence of a specific parameter in a specific MIME header:

    if (mimeheader :name "Content-Disposition" :parameter-name "filename" :exists)

if exists :bodyparts :parameter "filename" "Content-Disposition"

(Reads as: If there is a Content-Disposition header in any body part which has 
a filename parameter)
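
A minimal Python sketch of what this test would have to do, combining the
body-part walk with the parameter selection.  The helper name param_exists and
the sample message are invented for illustration; an optional substring key is
included only to show how a ":contains" check on the same parameter could
reuse the same walk.

```python
from email import message_from_string

RAW = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="b1"\n'
    "\n"
    "--b1\n"
    "Content-Type: text/plain\n"
    "\n"
    "invoice attached\n"
    "--b1\n"
    "Content-Type: application/octet-stream\n"
    'Content-Disposition: attachment; filename="Invoice.EXE"\n'
    "\n"
    "MZ\n"
    "--b1--\n"
)


def param_exists(msg, header, param, key=None):
    """True if any body part has `param` on `header` (optionally containing key)."""
    for part in msg.walk():
        value = part.get_param(param, header=header)  # None if header/param absent
        if value is None:
            continue
        if key is None:
            return True
        # RFC 2231 encoded values come back as tuples; this sketch skips them.
        if isinstance(value, str) and key.lower() in value.lower():
            return True
    return False


msg = message_from_string(RAW)
print(param_exists(msg, "Content-Disposition", "filename"))
print(param_exists(msg, "Content-Disposition", "filename", ".exe"))
```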

4) Test for text anywhere in a specific MIME header

    if (mimeheader :name "Content-Disposition" :text :contains ".exe")

if header :bodyparts :contains "Content-Disposition" ".exe"

(Reads as: If there is a Content-Disposition header in any body part which 
contains the string ".exe")

5) Test for text only in the value (not parameters) in a specific MIME 
header

    if (mimeheader :name "Content-Disposition" :value :contains "attach")

if header :bodyparts :value :contains "Content-Disposition" "attach"

(Reads as: If there is a Content-Disposition header in any body part which 
contains the string "attach" in the value excluding the parameters)

6) Test for text in any parameter values only in a specific MIME header

    if (mimeheader :name "Content-Disposition" :parameter-value :contains ".exe")

if header :bodyparts :anyparameter :contains "Content-Disposition" ".exe"
 
(Reads as: If there is a Content-Disposition header in any body part which 
contains the string ".exe" in any parameter)

7) Test for text in a specific parameter value only in a specific MIME 
header

    if (mimeheader :name "Content-Disposition" :parameter-name "filename" :parameter-value :contains ".exe")

if header :bodyparts :contains :parameter "filename" "Content-Disposition" ".exe"

(Reads as: If there is a Content-Disposition header in any body part which 
contains the string ".exe" in the filename parameter)

8) Test for text in any MIME header

    if (mimeheader :text :contains ".exe")

if anyheader :bodyparts :contains ".exe"

(Reads as: If there is any header in any body part which contains the string 
".exe")

I'd imagine this to be more useful though:

if anyof :bodyparts (header :anyparameter :contains "Content-Disposition" ".exe",
                     header :anyparameter :contains "Content-Type" ".exe")
 
9) Test for text only in the value (not parameters) in any MIME header

    if (mimeheader :value :contains "7bit")

if anyheader :bodyparts :value :contains "7bit"

(Reads as: If there is a header in any body part which contains the string 
"7bit" in the value not including the parameters)

10) Test for text in any parameter values only in any MIME header

    if (mimeheader :parameter-value :contains "us-ascii")

if anyheader :bodyparts :anyparameter :contains "us-ascii"
 
(Reads as: If there is any header in any body part which contains the string 
"us-ascii" in any parameter)

11) Test for text in a specific parameter value only in any MIME header

    if (mimeheader :parameter-name "filename" :parameter-value :contains ".exe")

if anyheader :bodyparts :parameter "filename" :contains ".exe"


(Reads as: If there is any header in any body part which contains the string 
".exe" in any "filename" parameter)


What do you think?

Nigel