ietf-mta-filters

Re: Comments on draft-melnikov-sieve-external-lists-02.txt

2009-07-19 16:52:54

Would we need to have :list-is, :list-matches, :list-contains, or can we
get away with just :list, which would check for the exact match?

Making the match type part of the list argument is just moving the information
around and not changing the underlying semantics. Having any specification at
all of the match type other than "it's matching a list" is exactly what we
don't want and cannot support.

Perhaps some specific examples will make this clear. Suppose I have a list of
whitelisted addresses, but because of privacy concerns this list contains the
SHA-1 hash of each address. The only match type such a list would support is
:is. And it's an inherent characteristic of the list that that's the only
match type.
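
To make the hashed case concrete, here is a rough Python sketch. The class and
function names are made up for illustration, and the normalization step
(trim plus lowercase) is an assumption; a real list would have to document its
own rule. The point is only that a store of digests can answer "is this exact
value present?" and nothing else.

```python
import hashlib


def address_digest(address: str) -> str:
    """Hex SHA-1 of a trimmed, lowercased address (assumed normalization)."""
    return hashlib.sha1(address.strip().lower().encode("utf-8")).hexdigest()


class HashedWhitelist:
    """Hypothetical list that stores digests only, never the addresses."""

    def __init__(self, digests):
        self._digests = frozenset(digests)

    def contains_exact(self, address: str) -> bool:
        # The only test available: hash the candidate and look the digest up.
        return address_digest(address) in self._digests


# The list owner publishes digests; the original addresses are not recoverable.
whitelist = HashedWhitelist(
    address_digest(a) for a in ["ned@example.com", "alexey@example.org"]
)
```

There is no way to implement :contains or :matches against such a store short
of guessing candidate addresses and hashing each one.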

Now suppose I have a list of half a million or so strings I don't want to see
as part of a subject line. I've used some algorithm or other to turn this into
a data structure that will tell me if any of those substrings appear in a
single pass over the subject. The implied match type for such a list is
:contains.
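
One possible "algorithm or other" here is Aho-Corasick; that choice is mine for
illustration and nothing above depends on it. The Python sketch below uses
hypothetical names and is case-sensitive for brevity. It builds the automaton
once and then answers the question in a single left-to-right pass over the
subject.

```python
from collections import deque


class SubstringList:
    """Aho-Corasick automaton: does any stored string occur in a text?"""

    def __init__(self, strings):
        self._goto = [{}]     # per-state transitions: character -> state
        self._fail = [0]      # per-state failure link
        self._hit = [False]   # True if a stored string ends at this state
        for s in strings:
            if not s:
                continue      # an empty string would match everything
            state = 0
            for ch in s:
                nxt = self._goto[state].get(ch)
                if nxt is None:
                    nxt = len(self._goto)
                    self._goto.append({})
                    self._fail.append(0)
                    self._hit.append(False)
                    self._goto[state][ch] = nxt
                state = nxt
            self._hit[state] = True
        # Breadth-first pass to fill in failure links, shallow states first.
        queue = deque(self._goto[0].values())
        while queue:
            state = queue.popleft()
            for ch, nxt in self._goto[state].items():
                f = self._fail[state]
                while f and ch not in self._goto[f]:
                    f = self._fail[f]
                self._fail[nxt] = self._goto[f].get(ch, 0)
                # A state is a hit if any suffix of its path is a stored string.
                self._hit[nxt] = self._hit[nxt] or self._hit[self._fail[nxt]]
                queue.append(nxt)

    def contains_any(self, text: str) -> bool:
        state = 0
        for ch in text:       # single pass over the text
            while state and ch not in self._goto[state]:
                state = self._fail[state]
            state = self._goto[state].get(ch, 0)
            if self._hit[state]:
                return True
        return False


blocked = SubstringList(["free money", "act now", "abcd", "bc"])
```

Note that the compiled automaton has no sensible way to answer an :is or
:matches query; the match type is baked into the data structure.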

Now consider a list consisting of IP addresses and net masks. I want to check
and see if a given IP address matches anything on the list. (Assuming the list
content is properly normalized, this can be done for IPv4 with no more than 32
exact match comparisons, 128 for IPv6. Of course other, more sophisticated
approaches are also possible.) This list has an effective match type that
corresponds to nothing we currently have in sieve, yet it is a perfectly
reasonable sort of list to want to match against.
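
A minimal Python sketch of the exact-match approach, using the standard
ipaddress module. The class name is hypothetical, and it assumes the list
entries can be normalized to network/prefix form. It does one set lookup per
prefix length, masking the candidate address down each time.

```python
import ipaddress


class NetworkList:
    """Hypothetical list of networks, stored normalized for exact lookup."""

    def __init__(self, cidrs):
        # strict=False clears host bits, e.g. "192.0.2.77/24" -> 192.0.2.0/24.
        self._networks = {ipaddress.ip_network(c, strict=False) for c in cidrs}

    def matches(self, address: str) -> bool:
        ip = ipaddress.ip_address(address)
        # Probe from the longest prefix (/32 or /128) down to /0.
        for prefix_len in range(ip.max_prefixlen, -1, -1):
            candidate = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
            if candidate in self._networks:
                return True
        return False


blocklist = NetworkList(["192.0.2.0/24", "198.51.100.7/32", "2001:db8::/32"])
```

None of :is, :contains, or :matches describes what matches() is doing here,
which is the point: the comparison belongs to the list.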

Now think for a moment what's involved with having a list that supports :matches
or worse, :regex. You now have a list of patterns that have to be checked one
by one, and checking each one is pretty expensive. In other words,
enumerability is a requirement for such a list, and depending on how things are
stored, the match type may still be implied by the underlying storage
mechanism.
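
For contrast, a pattern list looks something like the Python sketch below. The
names are hypothetical, and the glob translation is a simplification: it
handles only "*" and "?" and ignores the backslash escapes that :matches
defines. Every lookup walks the whole list and runs one pattern match per
entry, so the list has to be enumerable and the cost grows with its size.

```python
import re


def glob_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a simplified glob ('*' and '?' only) into a compiled regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


class PatternList:
    """Hypothetical list of glob patterns; every entry is tried in turn."""

    def __init__(self, patterns):
        self._compiled = [glob_to_regex(p) for p in patterns]

    def matches(self, value: str) -> bool:
        # No shortcut available: enumerate the patterns and test each one.
        return any(rx.fullmatch(value) for rx in self._compiled)


patterns = PatternList(["*@example.com", "postmaster@*", "user?@example.org"])
```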

Again, there is no doubt that, for the subset of lists that are enumerable,
having the match type be separate makes sense and provides the most
flexibility. But many of the lists we're going to want to check aren't
enumerable, either because the underlying technology doesn't support it or
because enumeration is simply too expensive. And I really have to wonder
if allowing all the combinations when most of them are going to produce
a runtime error is the right way to design this.

And if we decide not to make :list a MATCH-TYPE, the text is going to have to
state that any particular combination of MATCH-TYPE and :list may produce a
runtime error for some subset of the available set of lists. This may even
extend to the default :is MATCH-TYPE, creating a situation where a MATCH-TYPE
has to be specified.

                                Ned