
Re: Gen-ART LC/Telechat review of draft-freed-sieve-in-xml-05

2009-08-20 17:01:12
The thread so far has gotten difficult to follow, so I'm going to try to reset the conversation. I think we have been disagreeing in 2 areas.

The first is how much the document should say about how much semantic knowledge of sieve is expected for editors. On rereading the thread, I think we're expending a lot of energy here on something that really isn't that important. So I yield that point.

OTOH, I think we've highlighted some interoperability issues that could be important.

First, I think we are talking past each other about applying requirements to sieve editors in general--that is, editors that do not implement this spec. I never intended my comments to apply to non-XML based sieve editors. It would be nice if they interoperated, but that's not what this draft is about. (For that matter, if there is any attempt to standardize their behavior beyond a hope that they all create correct sieve scripts, I am not aware of it.)

But I think the draft creates opportunities for failure between implementations of the xml format. Let me try to make my previous questions to the work group more concrete. I see 3 interop scenarios entirely among implementations of this draft. Again, _all_ of my comments and questions _only_ concern the set of editors and processors that implement this draft.

1) Interop between an arbitrary xml-based editor and an arbitrary xml to native sieve processor - I think this is covered reasonably well, and is explicitly mentioned as a goal of the draft.

2) Interop among different xml-based editors - that is, can editor B operate on XML created by editor A, and can A operate on the result without losing data? (I am not expecting that B would be able to edit or render every feature supported by A). This is not well supported by the current draft due to the fact that editor B would be allowed to remove any metadata inserted by editor A.

This could be fixed with a normative requirement to not delete metadata elements that you don't understand, and possibly not to delete elements from xml namespaces you don't understand. Failing that, it would help a lot to offer non-normative guidance that any metadata or extension namespace elements are likely to go AWOL between editing sessions.
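
To make scenario 2 concrete, here is a minimal sketch of an editor B that changes only what it understands and leaves everything else in the tree. The namespace URI in SIEVE_NS, the element layout, the "urn:example:editor-a" extension namespace, and the rename_folder helper are all assumptions for illustration, not text from the draft.

```python
import xml.etree.ElementTree as ET

SIEVE_NS = "urn:ietf:params:xml:ns:sieve"  # assumed namespace URI for the XML format
EDITOR_A_NS = "urn:example:editor-a"       # hypothetical extension namespace of editor A

# A script as editor A might have written it, with its own metadata attached.
SCRIPT = f"""<sieve xmlns="{SIEVE_NS}" xmlns:a="{EDITOR_A_NS}">
  <displaydata><a:layout column="2"/></displaydata>
  <action name="fileinto"><str>lists.old</str></action>
</sieve>"""


def rename_folder(xml_text, old, new):
    """Editor B: rewrite fileinto targets and touch nothing else."""
    root = ET.fromstring(xml_text)
    for action in root.iter(f"{{{SIEVE_NS}}}action"):
        if action.get("name") != "fileinto":
            continue
        for s in action.findall(f"{{{SIEVE_NS}}}str"):
            if s.text == old:
                s.text = new
    # Elements B does not understand are never visited, so they are
    # serialized back out unchanged (prefixes may be renamed).
    return ET.tostring(root, encoding="unicode")


edited = rename_folder(SCRIPT, "lists.old", "lists.new")
```

An editor that instead rebuilds the document from its own internal model would drop the a:layout element, which is the failure I am worried about.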

3) Interop among different xml to sieve conversion processors - that is, can I take a sieve script in xml, convert it to native sieve with processor C and convert back to xml with processor D without losing data? Can I do the same converting from native sieve to XML and back to sieve? This scenario is jeopardized by the SHOULD (rather than MUST) level requirement to use the structured comment format to store metadata.

This could be fixed by strengthening the requirement to a MUST. Failing that, it would help to include the motivation for making this a SHOULD rather than a MUST, and guidance on the consequences of not following the SHOULD.
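
A rough sketch of why scenario 3 depends on both processors agreeing on the comment format. The "[meta]" marker, META_RE, and both function names are hypothetical; the draft defines its own structured comment syntax, which is not reproduced here.

```python
import re

# Hypothetical structured-comment convention shared by processors C and D.
META_RE = re.compile(r"/\* \[meta\] (.*?) \*/", re.S)


def to_sieve_comment(metadata):
    """Processor C: store metadata in the agreed comment format."""
    return f"/* [meta] {metadata} */"


def metadata_from_sieve(script):
    """Processor D: recover whatever metadata uses the agreed format."""
    return META_RE.findall(script)


# C follows the shared format, so D can recover the metadata.
script_c = to_sieve_comment("group=FILE_TO_FOLDER") + '\nfileinto "lists";\n'

# Another processor exercises the SHOULD and picks its own format.
script_other = '# group=FILE_TO_FOLDER\nfileinto "lists";\n'

recovered = metadata_from_sieve(script_c)
lost = metadata_from_sieve(script_other)
```

Here recovered holds the original metadata string, while lost is empty: the second script is still valid Sieve, but the metadata does not survive the trip back to XML.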

Does the work group expect to support scenarios 2 and/or 3? If the answer is "no", then I think the draft needs a scope or applicability statement to that effect. Otherwise, I am afraid implementors will approach this with incorrect interoperability expectations.

You mentioned that there were practical engineering considerations preventing MUST level requirements for scenarios 2 and 3--do those considerations still apply if we limit the discussion to XML-based editors and processors that implement this draft?

If the answer is yes to 2 or 3, then I think the draft needs either the stronger normative language or the non-normative guidance I mention above.




On Aug 18, 2009, at 7:54 PM, Ned Freed wrote:


On Aug 16, 2009, at 11:01 AM, Ned Freed wrote:

[...]

>> it would be helpful to have a sentence or two somewhere (maybe
>> in the intro) to explicitly say so. My confusion might be around the
>> meaning of the term "client" in this context.
>
> No, I think your confusion is that you read a lot more into the text
> than it
> actually says. There's a pretty big difference between "no semantic
> understanding whatsoever" and "an incomplete semantic understanding".

I think the confusion is that the text says very little one way or the other. You have assumptions in mind about the semantic knowledge of an
editor that are not explicitly stated.

On the contrary, we have made _no_ assumptions whatsoever about it. And the draft reflects that. You, OTOH, appear to have approached this with a set of assumptions in your head that I for one frankly don't comprehend. Perhaps - and this is just speculation on my part - this is because, as you have stated, you haven't done much work using XML tools. If so, then you need to understand that this document assumes considerable familiarity with XML and the tools used to
manipulate it. And given the topic of the document this is a perfectly
reasonable assumption to make IMO.

A reader that was not privy to
the process of creating this draft may come with a different set of
assumptions, and may not draw the inferences you expect them to.

In my case, it seemed counter-intuitive that an implementer would be
willing to implement sieve semantics but unwilling to deal with the
syntax.

And this is a case in point. The purpose of this specification is to provide a means of representing Sieve using an alternate syntax without changing any of the language semantics. As such, the audience is *exactly* the group of people who are "willing to implement sieve semantics but unwilling to deal with the
syntax". (And from all indications - there are now alternative XML
representations for many other application formats - this is a pretty large
group.)

I have to say that approaching such a specification with the idea that its entire goal is counterintuitive is a pretty good recipe for confusion on your part. And I don't think any amount of clarifying prose can possibly assist you
in dealing with such a fundamental expectation mismatch.

Your "template" comment below illustrates a case where that
makes more sense.

Again, the extent to which an editor understands and can deal with Sieve semantics is largely orthogonal to the representation format. There are extant Sieve editors that don't use the XML representation and which understand essentially no Sieve semantics at all - they are controlled by embedded comments in special formats only, and treat the Sieve material between the comments as opaque. Just think how easy it would be for some other Sieve generation
facility to confuse such an editor.

>
>> Is the expectation that
>> an "editor" must be semantically aware of sieve, but a processor does
>> not (beyond the list of "controls")?
>
> The expectation is that the amount of semantic understanding an
> editor is going
> to need will very much depend on the range of operations the editor
> is able to
> perform. Simple template-based systems will only manipulate labelled
> blocks of
> Sieve code without any understanding of what that code does. A more
> sophisticated editor might need to have a detailed knowledge of how
> blocks in
> Sieve work, or how to build conditional expressions, or even the
> detailed
> semantics of various tests and actions.

That paragraph clarifies a lot. I think it would be helpful to include
it in the draft.

I disagree. The above paragraph might make sense to have in some sort of Sieve
usage document. It's unnecessary and distracting here.

>
>> ...
>
>> Instead of round trip "conversion", I should have said round-trip
>> "editing". My concern is, if I create a script using Editor A, then
>> later edit it with Editor B, any metadata created by Editor A is
>> likely to be lost.
>
> And that's a valid concern to have. Again, there are going to be
> cases where
> one editor has no choice but to strip the information added by
> another. This is
> simply how things are; there's nothing this or any other
> representation scheme
> can do to eliminate this possibility.
>
>> Is that the intent?
>
> It's not a matter of intent. It is simply an unavoidable reality.
>
>> If so, it's probably worth
>> mentioning that an editor needs to be able to deal rationally with
>> the
>> loss of its own metadata.
>
> First, while it is certainly desirable for all editors to have this
> characteristic, there are going to be cases where it cannot possibly
> work this way. So this can't be a requirement.

So am I understanding correctly that it's unreasonable to expect an
editor to just leave metadata alone if it doesn't understand it,

It depends on the context. Hopefully the XML format will help make it a little
easier to do this in some cases. But certainly not all.

and
it's also unreasonable to expect an editor to behave in a sane manner
if its metadata gets stripped?

Again, it depends on the context.

It seems like there are three choices here: You can expect editors to
preserve metadata from other editors, you can allow stripping of
metadata and expect editors to deal rationally with its loss, or you
can expect that if a user uses more than one editor over the lifetime
of a script, one or both of the editors is likely to fail in a non-
graceful way.

Did the working group really choose the third option?

It isn't a question of what was chosen. The WG came up with one of the simplest language syntaxes imaginable - the ABNF for Sieve is *tiny* - but any language with sufficient flexibility to represent any sort of useful subset of the scripts people want to write to process email is going to be one that's too complex for many editors to want to understand fully. And since editors aren't always going to have full semantic understanding, they cannot be expected in all cases to be able to manipulate the full set of possible sieves produced by
other systems without screwing up.

Of course the WG could have imposed some requirements on this, saying in effect "you must fully understand Sieve in order to be a compliant editor". But such a requirement would either have been roundly ignored, or implementors would choose some other language that doesn't have such requirements. And again, this document is absolutely not the place for stating such requirements, even if they
made sense to have, which IMO they do not.

Put another way, the language you appear to be seeking here is one that is
trivially shown to be overconstrained by engineering realities into
nonexistence.

>
> Second, even if it were appropriate to make this a requirement, this
> document
> isn't the place for it. All this document does is describe an XML
> representation for Sieve. All of the requirements it imposes are
> directed at
> the representation and the process of converting to or from that
> representation.
>
> But since there is no requirement that a Sieve editor use this XML
> representation at all - and in practice most extant Sieve editors
> operate
> directly on the native Sieve format - imposing requirements on
> editors here
> makes little if any sense.

I fail to understand why it is acceptable to put requirements on
processors but not on editors. Certainly no one would expect an editor
that does not implement this specification to be bound by any
requirements in it.

And that's precisely the problem. Most editors operate directly on the regular Sieve representation, not the XML representation. If you want to impose a requirement on Sieve editors, this is not the place to do it because you're
only hitting a fraction of the audience.

For that matter, you already have (admittedly
weak)  2119 language referring to editors

Actually, there is exactly one constraint the document imposes on editors (the other compliance language explains a couple of things editors are explicitly
allowed to do), which has to do with the contents of displayblock and
displaydata not being allowed to include comment close sequences. This is done to simplify conversion processing and, unlike the requirements you want to impose, applies only to Sieve editors operating on the XML representation. So
it is appropriate for this document to state such a requirement.

(That said, properly speaking this should be a Schema and RNG constraint, but it turns out to be very difficult to do in those languages, so we cheated and did it as a prose constraint. In other words, this is a kluge to get around a limitation in the specification language, just like text descriptions attached to ABNF do similar stuff on a regular basis in many other specifications.)
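
For illustration only, a prose constraint of this kind could be checked in code along the following lines. This is a sketch, not part of the draft: the namespace URI, the reading that displayblock is constrained in its attributes while displaydata is constrained in its attributes and content, and the helper names are all assumptions.

```python
import xml.etree.ElementTree as ET

SIEVE_NS = "urn:ietf:params:xml:ns:sieve"  # assumed namespace URI
COMMENT_CLOSE = "*/"                        # Sieve bracketed-comment terminator


def contains_comment_close(elem, include_content):
    """True if the element's attributes (and optionally content) hold '*/'."""
    values = list(elem.attrib.values())
    if include_content:
        for node in elem.iter():          # descendants' attributes
            values.extend(node.attrib.values())
        values.extend(elem.itertext())    # all nested text and tails
    return any(COMMENT_CLOSE in v for v in values)


def find_unsafe_display_elements(xml_text):
    """Return the local names of display elements violating the constraint."""
    root = ET.fromstring(xml_text)
    unsafe = []
    for name, include_content in (("displayblock", False), ("displaydata", True)):
        for elem in root.iter(f"{{{SIEVE_NS}}}{name}"):
            if contains_comment_close(elem, include_content):
                unsafe.append(name)
    return unsafe


GOOD = f'<sieve xmlns="{SIEVE_NS}"><displaydata><note>fine</note></displaydata></sieve>'
BAD = f'<sieve xmlns="{SIEVE_NS}"><displaydata><note>ends */ here</note></displaydata></sieve>'
```

An editor could run such a check before saving, since metadata containing the terminator would end the enclosing Sieve comment early once converted.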

But if you are unwilling to place normative requirements around this,

It isn't a question of what I'm willing or unwilling to do, but rather what I, as an individual author working on a WG document, am able or unable to do. The stuff you appear to be after clearly doesn't belong in this document or AFAICT in any other document the WG plans to produce. If you want to see various general requirements on Sieve editors written down somewhere you're going to
have to convince the WG that such an effort is worth it.

it would still help quite a bit to have some non-normative guidance to
the effect that, since there is no requirement for an editor to
preserve metadata from another editor, an editor implementation can
expect to have its metadata removed from any given script. If it does
not handle this gracefully, bad user experiences are likely to result.

Again, while such discussion might arguably be useful, this is not the place
for it and I'm not the one you need to convince to do it.

>> >> Why not MUST? Wouldn't violation of this requirement introduce
>> >> interoperability problems between different implementations?
>> >
>> > It's a SHOULD because the WG believed that there may be some
>> > exception cases
>> > where an alternate format makes more sense.
>
>> Can you offer (in the text) some examples of those exceptional cases,
>> and the consequences thereof?
>
> I see no need to.
>
>> My concern is that it seems like violating the should would pretty
>> much break interoperability between processors, wouldn't it?
>
> Sure, which is why it's a SHOULD, not a MAY. Again, this is the
> compliance
> level the WG decided was appropriate. Even if I agreed with you,
> this is not a
> simple editorial nit that I can change on my own.

It has been my experience that SHOULD level requirements that both
significantly impact interoperability and offer no explicit guidance
about the consequences of violation are some of the biggest sources of
interoperability problems in existing specs.

I'm starting to think that the WG had very limited expectations of
interoperability between implementations that use this format.

Realistic expectations would be closer to the mark. But again, you persist in confusing issues inherent in automatic generation and modification of Sieve code with this specific representation format. To the extent this specification attempts to address this, it is by relieving implementors of the burden of having yet another parser and supporting yet another syntax, and by selecting a syntax which has a vast array of very powerful manipulative tools available. We hope that this will help make some of the problems inherent in this space a
little easier to overcome.

I
recall a sentence stating that you expected interoperability between
editors and processors. I think an average reader would expect
interoperability among multiple editor implementations and among
multiple processor implementations. If the work group did not intend
that degree of interop, it would be extremely helpful to have some
sort of applicability statement to that effect.

Again you're asking for all sorts of stuff that far, far, far exceeds the
purview of this specification.

>
>> Or at
>> least cause encoded metadata to get lost if you convert from XML to
>> sieve using one processor, and back to xml with another?
>
> That's the obvious case where such a loss would occur.
>
>> >
>> >> -- Security Considerations, last paragraph:
>> >
>> >> You mention that potentially executable content can be
>> introduced via
>> >> other namespaces, and that "appropriate security precautions"
>> should
>> >> be taken. I think this needs more discussion, as I am not sure an
>> >> implementor will understand what the authors considered
>> appropriate.
>> >
>> > The point of Sieve namespaces is to allow multiple XML vocabularies
>> > to be used
>> > in a single document. This is a completely open ended mechanism and
>> > it is not
>> > our intent to label any particular use as inappropriate. As such,
>> > unless you
>> > have some specific text in mind, I for one fail to see what could
>> be
>> > added here
>> > that would be useful.
>
>> Maybe an example of the sorts of bad behavior that could be enabled
>> by this would help.
>
> I think introducing another XML vocabulary into this document simply
> for
> purposes of showing that you can put bad stuff in XML would be
> belaboring the
> obvious.
>
>> Are you concerned that a scriptable editor that
>> stores scripts in metadata could be attacked by hand coding scripts
>> into structured comments in native Sieve?
>
> For that to happen there would have to be a pretty serious bug in the
> conversion process, so no, this is not the concern here at all.
>
>> Buffer overflow attacks on
>> conversion processors?
>
> This would be another sort of conversion process bug and not
> relevant to the
> concern at hand.
>
> All this text is doing is pointing out the rather obvious fact that XML
> namespaces allow you to mix vocabularies in a single document. As
> such, it
> is possible to drag in some other vocabulary that has its own set of
> security
> problems.
>
> If this still isn't clear to you I'm sorry, but I'm at a loss as to
> how
> to explain it further.

I think it's clear to me after reading your explanation. Am I correct
in understanding that the point of that sentence was that any given
namespace may have its own set of security considerations, and that
is beyond the scope of this document? If that is a correct
understanding, then I suggest replacing the last sentence with
something to the effect of:

"Such facilities will come with their own sets of security
considerations, which are beyond the scope of this document."

I really don't think this is that much clearer, but I can live with changing
it to read:

Such material will necessarily have its own security
considerations, which are beyond the scope of this document.

Also, you elided one of the questions from my previous email without
responding:

>
>
>> -- Section 4.1, paragraph 11:  "Implementations MAY use this to
>> represent complex data
>> about that sieve such as a natural language representation of sieve
>>   or a way to provide the sieve script directly."
>
>> I'm not sure I understand the last part --are you saying this can be
>> used as an alternate encoding of the script?
>
> Of course not. Since when do we have programs capable of taking
> completely
> arbitrary natural language statements and reliably encoding them into
> programming language statements?
>
> I see nothing unclear about this at all.

I get the part about representing a "natural language representation",
but what did you intend by "... or a way to provide the script
directly"?

My intent was to say exactly what was said - a UI could present Sieve
statements directly to the user. Really, I cannot see anything unclear about
this at all and I am completely at a loss to explain it further.

                                Ned

_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf