Re: draft-freed-sieve-in-xml status?


On Sun, Dec 14, 2008 at 9:59 PM, Ned Freed 
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:

On Sun, Dec 14, 2008 at 6:06 PM, Ned Freed 
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:

i note that the draft describes the infoset rather than defining it in
the standard way. is there a reason for this decision?


I don't know what "the standard way" is you're referring to. Perhaps you
could provide a reference to an RFC where this has been used?

AIUI XML is maintained by w3c (rather than IETF) so is a
recommendation. http://www.w3.org/TR/xml-infoset/ is the current
document.


Quite true; however, the IETF has its own specification for how XML is supposed
to be used in RFCs: RFC 3470. And while infosets are mentioned as one approach to
specifying things about an XML format, there's no recommendation, let alone
requirement, that they be used.

This document is a little unusual in that it's defining a mapping of, if you
will, a non-XML infoset onto XML. As such, the natural approach seemed to be to
first discuss the structure of the language being mapped, then explain the
mapping, and finish up with additional unique-to-XML semantics.

i agree that most of this arrangement is natural. it's just that jumping to
a schema seems - to me - a little premature and inflexible.


First of all, the use of XML Schema is in fact too inflexible to be allowed
to continue. The next revision will use Relax instead.


XML schema is flexible but the flexibility comes at the price of
readability. one of the relax variants would be a better choice.

however (in my experience) the generative tools commonly used for XML
and web service binding, and editor generation tend not to offer good
relax support. IMO the draft should offer secondary informative XML
Schema or Schemata to assist developers using these tools.

But I'm still a little confused as to what you're asking for here. If you're
asking for removal of the explicit inline XML syntax examples in favor of a
more abstract approach, I'd be fine with that if there's a WG consensus to make
such a change.


no - i'm very happy with the syntax examples

i would like to see the approach used in RFC 5023 (and others)
adopted, adding a normative description of the XML and making the
schema only informative.

Beyond that, however, lies a slippery slope. If what you're after is a
restatement of Sieve elements and semantics in infoset form, that is not going
to happen on my watch. RFC 5228 is the definitive source of information about
such things. It may be a little awkward for implementors to have to interpolate
back through the specifications to get at the meaning of things they have in
their XML, but the alternative of having two separate specifications that are
bound to be inconsistent in some way or other is much worse.


no, not restatement

This approach is perhaps not the best choice for someone coming at this trying
to get at Sieve semantics starting with XML, but I believe consumers of the
document with that mindset will be distinctly in the minority. The main focus
here is to provide people familiar with Sieve a means of mapping Sieve to XML
so that XML tools can be applied.

my experience is entirely opposite

developers that use the java libraries i work on have good XML skills but
lack a good understanding of underlying mail technologies (for example
sieve). there is a large and growing requirement for integration
between mail and enterprise systems (typically coding in Java and .NET
but also ruby and python). developers from enterprise backgrounds are
typically strong on web+xml but very weak on mail.


Yep, I've seen a lot of this as well. And the problem encompasses far more than
Sieve: For example, a lot of people who are unfamiliar with email don't
understand very basic concepts such as the separation between envelope and
message content. (This particular issue actually pokes through into Sieve in
the form of whether an envelope or header test is appropriate.)


i beg to differ slightly on this one

some enterprise mail processing may happen during the SMTP transaction
but it is more typical for the mail processing to happen after storage. not all
mail stored arrives through SMTP and so it is typical for any envelope
information to be reduced to simple MIME headers. most developers in
these mail processing environments do not need to understand the
difference between envelope and message content because - for them -
there is no difference.

Sieve works very well as a general MIME document processing language.
the envelope tests are - in many ways - peculiar since the rest of the
specification really isn't mail specific. there are potentially some
very interesting applications in this area so it would be a shame - i
think - for the expert group to focus too strongly on SMTP at the
expense of other IMHO equally valid Sieve use cases.

But here's the dilemma: This stuff is complicated and in some cases fairly
subtle. This in turn means that the reiteration of even a subset of the
underlying design principles that implementors need to know takes up a lot of
space and will still fall short of the mark of giving the necessary guidance.
But it may lead to the belief that reading this specification (or for that
matter this one and RFC 5228) is in fact sufficient to understand how to use
Sieve. It quite simply isn't.


again, i beg to differ

sieve is very similar structurally to the guerrilla standards used in
enterprise mail systems for more than 5 years now. for most mail
processing applications, only the container builders need to have a
good understanding of the protocols. application developers are
offered a safe environment and an OOP interface. i see no reason why
sieve should be any different.

IMO what's needed is a proper architectural specification for email. We're
trying to get one of those done, but progress has been very slow.


providing that mail is interpreted sufficiently broadly, i agree

<snip>

Had this been the more usual case of simply defining an XML format, I have to
admit that I would have gone with the informal approach used in, say, RFC 2629.
I'm not all that keen on lots of formalism - IMO it often hinders
understanding more than it helps.

IMHO the problem is getting the right level of formalism

more modern approaches to specification (eg Atom
http://www.rfc-editor.org/rfc/rfc5023.txt etc) tend to make the schema
only informative and the description of the infoset normative. this
would be more flexible for example, by allowing different schema
languages to be used, alterations in namespace or additional
annotations in foreign vocabularies.


But doesn't this fly in the face of your earlier suggestion of doing this
by annotating the schema?


i'll explain a little more what i meant by that

i was suggesting that it might be worthwhile creating an independent,
clean room schema (based on the RFC), documenting it then releasing
under a suitable FOSS license (MIT). similar - in spirit - to the
Annotated XML Specification
http://www.xml.com/pub/a/axml/axmlintro.html.

- robert