Hi,
Regarding draft-ietf-sieve-spamtestbis-02.txt:
Mostly it looks pretty nice (despite the length of this note). I have
one potential conflict, and then a nitpicky thing (with many instances)
that could be completely ignorable.
The conflict: There is text describing the two forms of result strings
that the underlying implementation provides for testing against. One
form is a digit string and some optional text, with the digit string's
value used in relational comparisons, and the text used in string
comparisons. There's a warning against using the string part as it's
non-portable. The other form is a simple string "untested" that
indicates that no test has been done, and that also is used in string
comparisons. So basically both forms can be used in string comparisons,
which is kind of ugly and ambiguous. If the recommendation is to avoid
"untested" and use the non-:percent form instead, why not just drop the
"untested" result? Otherwise, you need a prohibition against the
underlying implementation returning the string "untested" as the
optional part when a digit string is returned. Or, perhaps, have
the untested result be "0 untested" vs "0[ anything-else]" for tested and
clear.
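To make the ambiguity concrete, here is a minimal Python sketch. The
helper names and the sample result strings are hypothetical, and the two
functions only approximate Sieve's ":is" and ":contains" match types;
none of this is taken from the draft.

```python
# Hypothetical stand-ins for Sieve string match types (not from the draft).
def is_match(result: str, key: str) -> bool:
    """Approximates ':is': the whole string must equal the key."""
    return result == key


def contains_match(result: str, key: str) -> bool:
    """Approximates ':contains': the key may appear anywhere."""
    return key in result


# Illustrative result strings an implementation might provide.
never_checked = "untested"   # the bare "no check was done" form
checked = "3 untested"       # digit string whose optional text is "untested"

# A substring match cannot tell the two forms apart.
both_contain = (contains_match(never_checked, "untested")
                and contains_match(checked, "untested"))

# An exact match can, but only because the second string has a numeric prefix.
only_bare_is = (is_match(never_checked, "untested")
                and not is_match(checked, "untested"))
```

In this sketch the clash shows up for substring-style matches, which is
the case the options above (drop the bare form, or forbid "untested" as
optional text) would rule out.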
The nits: my big bother is the overloading of the words "spamtest" and
"virustest" to refer to both the new Sieve verbs and the underlying
implementation's analysis (and words about the "result" and the "return"
from the commands). The Sieve-enabled application interprets the
underlying test results, normalizes them, and gives them as input to
the command; the command then applies some logic to that normalized
evaluation and essentially produces a true or false result.
E.g.:
3.1. General Considerations
The "spamtest" and "virustest" tests described below can both return
a string that starts with a numeric value, followed by an optional
space (%x20) character and optional arbitrary text.
I understand what this means, and it may not even be confusing except to
ultra-literal readers, but still: it talks about what "spamtest" and
"virustest" (the names of the two new commands) return. The commands
themselves return (or evaluate to) true or false; their input is the
normalized result described in the quoted sentence. I've mentioned this
before and perhaps some of it has been improved. But still..
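The distinction can be sketched in Python as follows. This is only an
illustration under stated assumptions: the function name
spamtest_command and its parameters are hypothetical, it handles only
the digit-string form, and it uses plain integer comparison rather than
a real "i;ascii-numeric" comparator. The relation names are the match
keys of the Sieve relational extension.

```python
import operator

# Relation keys as used by the Sieve relational extension.
RELATIONS = {
    "gt": operator.gt, "ge": operator.ge,
    "lt": operator.lt, "le": operator.le,
    "eq": operator.eq, "ne": operator.ne,
}


def spamtest_command(normalized: str, relation: str, value: str) -> bool:
    """Hypothetical model of the test command.

    Input: the normalized result string supplied by the implementation.
    Output: the command's own result, which is always True or False.
    """
    head = normalized.split(" ", 1)[0]  # digits before the optional text
    if not (head.isascii() and head.isdigit()):
        raise ValueError("sketch handles only the digit-string form")
    return RELATIONS[relation](int(head), int(value))


# The implementation supplies the string; the command yields a boolean.
verdict = spamtest_command("3 some tool-specific text", "ge", "3")
```

The string is what goes in; the boolean is what the command "returns".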
To be specific, here are the places that I think contribute to the
overloading of the terms, and some suggestions.
Abstract
Current text:
The SIEVE email filtering language "spamtest", "spamtestplus" and
"virustest" extensions permit users to use simple, portable commands
for spam and virus tests on email messages. Each extension provides
a new test using matches against numeric 'scores'. It is the
responsibility of the underlying SIEVE implementation to do the
actual checks that result in values returned by the tests.
Suggested text (last sentence):
It is the responsibility of the underlying SIEVE implementation to do
the actual checks that result in proper input to the tests.
1. Introduction and Overview
Current text:
The purpose of this document is to introduce two SIEVE tests that can
be used to implement 'generic' tests for spam and viruses in messages
processed via SIEVE scripts. These tests return a string containing
a range of numeric values that indicate the severity of spam or
viruses in a message, or a string that indicates the message has not
passed through any spam or virus checking tools, or provides a direct
indication of whether the message has been tested for spam or not.
The spam and virus checks themselves are handled by the underlying
SIEVE implementation in whatever manner is appropriate, and the
implementation maps the results of these checks into the numeric
ranges defined by the new tests. Thus a SIEVE implementation can
have a spam test that implicitly checks for third-party spam tool
headers and determines how those map into the spamtest numeric range.
I would rearrange slightly and disambiguate. And frankly I would move
some of the details down to the section 3 intro, and leave the overview
more overviewy, e.g. just say here that the new tests relieve the script
writer of knowing the intimate details of the spam tests.
Suggested text:
The purpose of this document is to introduce two SIEVE tests that can
be used to implement 'generic' tests for spam and viruses in messages
processed via SIEVE scripts. The spam and virus checks themselves
are handled by the underlying implementation in whatever manner is
appropriate, so that the SIEVE spam and virus test commands can be
used in a portable way.
And then move the specifics down to 3.1, q.v.
3.1. General Considerations
Current text:
The "spamtest" and "virustest" tests described below can both return
a string that starts with a numeric value, followed by an optional
space (%x20) character and optional arbitrary text. The numeric
value can be compared to specific values using the SIEVE relational
[I-D.ietf-sieve-3431bis] extension in conjunction with the "i;ascii-
numeric" comparator [I-D.newman-i18n-comparator], which will test for
the presence of a numeric value at the start of the string, ignoring
any additional text in the string. The additional text can be used
to carry implementation specific details about the tests performed
and descriptive comments about the result. Tests can be done using
standard string comparators against this text if it helps to refine
behaviour, however this will break portability of the script as the
text will likely be specific to a particular implementation.
Suggested text:
The "spamtest" and "virustest" tests described below evaluate the
results of implementation-specific spam and virus checks in a
portable way. (The implementation may, for example, check for
third-party spam tool headers and determine how those map into the
way the test commands are used.) To do this, the underlying SIEVE
implementation provides a normalized result string as one of the
inputs to each test command. The normalized result string is
considered to be the value on the left hand side of the test, and
the comparison values given in the test command are considered to be
on the right hand side. [e.g., something like what RFC 3431 says.]
The normalized result string may be provided in one of two formats:
1. A digit string, with its value being within a range of numeric
values used in the specific SIEVE command, indicating the
severity of spam or viruses in a message or whether the check
was done at all. This may optionally be followed by a space
(%x20) character and arbitrary text. The numeric value will be
used when a relational test is done. The optional arbitrary
text can be used to carry implementation-specific details about
the tests, or for descriptive comments about the result. This
optional text will be used when standard string comparisons are
used.
2. A string indicating that the message has not passed through any
spam or virus checking tools. This string is used when
standard string comparisons are used.
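A minimal Python sketch of how the two formats could be told apart.
The type and function names are hypothetical and the sample strings are
illustrative; the only rule assumed is the one stated above (leading
digits, then an optional space and optional text).

```python
from typing import NamedTuple, Optional


class NormalizedResult(NamedTuple):
    """Hypothetical parsed view of a normalized result string."""
    value: Optional[int]  # numeric part (format 1), or None (format 2)
    text: str             # optional text (format 1) or whole string (format 2)


def parse_normalized_result(s: str) -> NormalizedResult:
    # Count the leading ASCII digits.
    i = 0
    while i < len(s) and s[i] in "0123456789":
        i += 1
    if i == 0:
        # Format 2: no digit string, e.g. the "not checked" indicator.
        return NormalizedResult(None, s)
    # Format 1: digit string, optional single space, optional text.
    rest = s[i:]
    if rest.startswith(" "):
        rest = rest[1:]
    return NormalizedResult(int(s[:i]), rest)


with_text = parse_normalized_result("3 tool-specific note")
bare_number = parse_normalized_result("10")
not_checked = parse_normalized_result("untested")
```

A relational test would look only at the value field, and a string test
only at the text field, which keeps the two kinds of comparison apart.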
3.2. Test spamtest
[...]
Current text:
The "spamtest" test evaluates to true if the spamtest result matches
the value.
Suggested text:
The "spamtest" test evaluates to true if the normalized result
matches the value.
3.2.1. spamtest without :percent argument
Current text:
When the ":percent" argument is not present in the "spamtest" test,
the result of the test is a string starting with a numeric value in
the range "0" (zero) through "10", with meanings summarised below:
Suggested text:
When the ":percent" argument is not present in the "spamtest" test,
the normalized result string provided for the left side of the
test starts with a numeric value ...
Current text:
In this example, any message that has not passed through a spam check
tool will be filed into the mailbox "INBOX.unclassified". Any
message with a spamtest value greater than or equal to "3" is filed
into a mailbox called "INBOX.spam-trap" in the user's mailstore.
Suggested text:
Any message with a normalized result value ...
3.2.2. spamtest with :percent argument
Current text:
When the ":percent" argument is present in the "spamtest" test, the
result of the test is a string starting with a numeric value in the
range "0" (zero) through "100", with meanings summarised below, or
Suggested text:
When the ":percent" argument is present in the "spamtest" test, the
normalized result string provided for the left side of the test
starts with a numeric value ...
Current text:
In this example, any message that has not passed through a spam check
tool will be filed into the mailbox "INBOX.unclassified". Any
message with a spamtest percentage value greater than or equal to
"30" is filed into a mailbox called "INBOX.spam-trap" in the user's
mailstore.
Suggested text:
Any message with a normalized result value greater than or equal..
3.3. Test virustest
[...]
Current text:
The "virustest" test evaluates to true if the virustest result
matches the value.
Suggested text:
... evaluates to true if the normalized result string matches the
value ...
Current text:
The virustest result is a string starting with a numeric value in the
range "0" (zero) through "5", with meanings summarised below:
Suggested text:
The normalized result string provided for the left side of the
test starts with a numeric value ...
Current text:
In this example, any message that has not passed through a virus
check tool will be filed into the mailbox "INBOX.unclassified". Any
message with a virustest value equal to "4" is filed into a mailbox
called "INBOX.quarantine" in the user's mailstore. Any message with
a virustest value equal to "5" is discarded (removed) and not
delivered to the user's mailstore.
Suggested text:
Any message with a normalized result value equal to "5" ...
mm