On Sun, Dec 28, 2008 at 10:31 PM, Ned Freed
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:
On Sun, Dec 14, 2008 at 9:59 PM, Ned Freed
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:
On Sun, Dec 14, 2008 at 6:06 PM, Ned Freed
<ned(_dot_)freed(_at_)mrochek(_dot_)com> wrote:
i note that the draft describes the infoset rather than defining it in
the standard way. is there a reason for this decision?
I don't know what "the standard way" is you're referring to. Perhaps you
could provide a reference to an RFC where this has been used?
AIUI XML is maintained by w3c (rather than IETF) so is a
recommendation. http://www.w3.org/TR/xml-infoset/ is the current
document.
Quite true, however, the IETF has its own specification for how XML is
supposed to be used in RFCs: RFC 3470. And while infosets are mentioned as
one approach to specifying things about an XML format, there's no
recommendation, let alone requirement, that they be used.
This document is a little unusual in that it's defining a mapping of, if you
will, a non-XML infoset onto XML. As such, the natural approach seemed to be
to first discuss the structure of the language being mapped, then explain the
mapping, and finish up with additional unique-to-XML semantics.
i agree that most of this arrangement is natural. it's just that jumping to
a schema seems - to me - a little premature and inflexible.
First of all, the use of XML Schema is in fact too inflexible to be allowed
to continue. The next revision will use Relax instead.
XML schema is flexible but the flexibility comes at the price of
readability. one of the relax variants would be a better choice.
however (in my experience) the generative tools commonly used for XML
and web service binding, and editor generation tend not to offer good
relax support. IMO the draft should offer secondary informative XML
Schema or Schemata to assist developers using these tools.
The problem is that the unique particle attribution limitation in XML Schema
effectively precludes using it without some compromises. I am therefore
opposed to continuing to include it.
adopting a standard prefix - sieve, say - is all that is required
is this really too much to ask?
But I'm still a little confused as to what you're asking for here. If you're
asking for removal of the explicit inline XML syntax examples in favor of a
more abstract approach, I'd be fine with that if there's a WG consensus to
make such a change.
no - i'm very happy with the syntax examples
i would like to see the approach used in RFC 5023 (and others)
adopted, adding a normative description of the XML and making the
schema only informative.
Personally, I find the RFC 5023 approach, like the XOPEN object descriptions
it's similar to, to be almost totally unreadable. Maybe it's the only
reasonable way to do it when the element structure is quite complex, but
that's not the case here.
So, absent some fairly strong support for this from others in the group, I'm
not going to pursue this.
is there anyone else in the group - excepting you and myself - who cares
enough to contribute at all to this discussion?
sieve). there is a large and growing requirement for integration
between mail and enterprise systems (typically coding in Java and .NET
but also ruby and python). developers from enterprise backgrounds are
typically strong on web+xml but very weak on mail.
Yep, I've seen a lot of this as well. And the problem encompasses far more
than Sieve: For example, a lot of people who are unfamiliar with email don't
understand very basic concepts such as the separation between envelope and
message content. (This particular issue actually pokes through into Sieve in
the form of whether an envelope or header test is appropriate.)
i beg to differ slightly on this one
some enterprise mail processing may happen during the SMTP transaction
but it is more typical for the mail processing to happen after storage.
not all mail stored arrives through SMTP and so it is typical for any
envelope information to be reduced to simple MIME headers.
Robert, with all due respect, you may have substantial expertise on the XML
front, but your comments here are actually doing little more than illustrate
the validity of my argument that there's a general issue with people not
getting how email works that isn't going to get fixed by anything we do here.
ned - with all due respect - your comments illustrate a lack of
understanding of this class of mail server
If you want this addressed the place to look is the email architecture
specifications being worked on by Dave Crocker.
And it is NOT typical for envelope information to be stored as headers.
for the class of application (enterprise mail servers is a name that's
sometimes used but quite possibly that's not familiar to others in the
group), unfortunately it is
There are several reasons for this:
(1) Envelopes only exist between the time of submission and final delivery.
Transport actions do record certain bits of envelope information in
trace header fields and final delivery is supposed to copy some additional
envelope information into a couple of header fields, but these are NOT
a message envelope and it is a mistake to assume they are.
(2) During the time the envelope exists it is highly mutable, often changing
form at every hop. This makes header storage of envelope information
somewhat problematic.
true
(3) There are several SMTP extensions that add to the envelope in various ways,
requiring negotiation of what envelope information can and cannot be
passed from one system to another. This tends to interact badly with
schemes that store envelope information as a static part of the message.
(when SMTP delivery is just the first step in mail processing, this
isn't such a problem)
(4) The fact that protocols other than SMTP are used for various email
operations doesn't necessarily impact header/envelope separability.
Other protocols maintain this separation and at least one of them, X.400,
actually has a far greater degree of separation than SMTP does.
(not all protocols maintain this separation and if any mail enters
through those protocols, this information will not be available)
(5) Because there are effectively no controls on what ends up in headers, it
is fairly easy for the separation between "header" headers and "envelope"
headers to get lost. Among other things, this can create serious
security vulnerabilities.
(only during SMTP delivery)
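The separation argued over in points (1) to (5) can be illustrated with a minimal Python sketch. It is not taken from either poster: the addresses, the `envelope` dictionary, and the helper names `header_recipients`, `header_test`, and `envelope_test` are all hypothetical, chosen only to show that the recipients the transport was given need not appear anywhere in the message header.

```python
from email.message import EmailMessage
from email.utils import getaddresses

# Hypothetical message content: what the author wrote in the header.
msg = EmailMessage()
msg["From"] = "Newsletter <news@example.org>"
msg["To"] = "Subscribers <subscribers@example.org>"
msg["Subject"] = "Monthly notice"
msg.set_content("Hello, list.")

# Hypothetical envelope: what the transport was told (MAIL FROM / RCPT TO).
# It is carried separately from the message and is not derived from it.
envelope = {
    "mail_from": "bounces@example.org",
    "rcpt_to": ["alice@example.net", "bob@example.com"],
}


def header_recipients(message):
    """Return the addresses named in the To and Cc header fields."""
    fields = [str(v) for name in ("To", "Cc") for v in message.get_all(name, [])]
    return [addr for _, addr in getaddresses(fields)]


def header_test(message, address):
    """Loosely analogous to a header-based test: looks only at the header."""
    return address in header_recipients(message)


def envelope_test(env, address):
    """Loosely analogous to an envelope-based test: looks only at RCPT TO."""
    return address in env["rcpt_to"]


print(header_test(msg, "alice@example.net"))
print(envelope_test(envelope, "alice@example.net"))
```

Under these assumptions the header names only the list address, while the envelope names the people who actually receive the message, so a filter or archive that sees only the header cannot answer "who received this?".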
Now, this is not to say there aren't various ad-hoc schemes in use where
active envelope information ends up getting stuffed into the header. Such
schemes date back to BITNET's use of X-Envelope-To: to work around the 8x8
limit and probably long before. But in my experience at least these things
invariably fail to provide a full and correct mapping for all of the possible
information that can exist in an SMTP (or X.400) envelope. And as a
consequence they invariably cause problems because of their inability to
truly express envelope semantics.
Indeed, if you have to capture envelope information in a static form - the
main current use-case for this is compliance archiving - in most cases you're
better off NOT using header-based schemes. We even have a standard format
defined for this: Batch SMTP. Although the format that's probably used the
most is the one Exchange generates that they call "envelope journaling",
which puts the envelope in the first text part of a MIME multipart. (On a
side note, if anyone knows where there's a precise and complete specification
of the syntax used for envelope journaling, I'd appreciate a pointer.)
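As a rough illustration of capturing an envelope in static form without touching the header, the following Python sketch writes the envelope as SMTP commands in front of the unmodified message, in the spirit of Batch SMTP. The function name `to_bsmtp`, the `helo` default, and the addresses are assumptions made for this example; it covers only the basic commands, not the SMTP extensions mentioned in point (3), and it is not a complete or validated implementation of the Batch SMTP format.

```python
def to_bsmtp(mail_from, rcpt_to, message_text, helo="archive.example.org"):
    """Serialize an envelope plus message as a batch of SMTP commands.

    Hypothetical helper: only HELO, MAIL FROM, RCPT TO, DATA and QUIT are
    emitted; no SMTP extension parameters are represented.
    """
    lines = [f"HELO {helo}", f"MAIL FROM:<{mail_from}>"]
    # One RCPT TO per envelope recipient, independent of the To/Cc header.
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in rcpt_to]
    lines.append("DATA")
    for line in message_text.splitlines():
        # Dot-stuffing, as in SMTP: a leading "." in the content is doubled
        # so it cannot be mistaken for the end-of-data marker.
        lines.append("." + line if line.startswith(".") else line)
    lines += [".", "QUIT"]
    return "\r\n".join(lines) + "\r\n"


batch = to_bsmtp(
    "bounces@example.org",
    ["alice@example.net", "bob@example.com"],
    "From: news@example.org\nTo: subscribers@example.org\n\nHello, list.\n",
)
print(batch)
```

The point of the design is that the message header is stored exactly as received, while the sender and recipients the transport actually used sit outside it, so the two cannot be confused.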
i agree that stuffing this information into headers is a bad idea (i
intended to observe not advocate above)
i disagree that this is a protocol problem with a protocol solution -
it's simply a poor choice of data representation by the designers.
more modern approaches to meta-data use namespacing and this prevents
loss of information. (this is often then compounded by confusing a
dead MIME document with a live email.)
most developers in
these mail processing environments do not need to understand the
difference between envelope and message content because - for them -
there is no difference.
Yeah, that's what a lot of them think. The problem is they're quite simply
wrong, and it isn't a harmless thing to be wrong about. I get plenty of
support calls from customers who got screwed by this lack of understanding.
And it is NOT a minor detail when someone sets up a compliance archiving
system that ends up in many cases not being able to determine who actually
sent or received a given message. (I only wish I was making up this example.)
:-)
(that's why containers are now often used in the enterprise)
Sieve works very well as a general MIME document processing language.
Actually that's not Sieve's purpose at all and it isn't something Sieve is
currently good at. In fact we've only recently taken the first fairly
tentative step down the MIME processing path with the MIME loops extension
and possibly the convert extension. We'll see how well that turns out, but I
have to say I'm not optimistic that it will replace existing ad-hoc MIME
processing facilities like MIMEdefang.
AIUI these limitations only apply to multipart documents. outside
email, multipart documents are not so common and sieve works fine on
those.
the envelope tests are - in many ways - peculiar since the rest of the
specification really isn't mail specific. there are potentially some
very interesting applications in this area so it would be a shame - i
think - for the expert group to focus too strongly on SMTP at the
expense of other IMHO equally valid Sieve use cases.
I don't object to the use of Sieve in other contexts - in principle. But the
devil is in the details. A good example of this is the use of Sieve in an IMAP
server as defined in draft-ietf-lemonade-imap-sieve-05.txt. This doesn't seem
like too much of a stretch from existing usage, but when I reviewed this
document a while back I found all sorts of semantic mismatches, some of them
quite serious.
But here's the dilemma: This stuff is complicated and in some cases fairly
subtle. This in turn means that the reiteration of even a subset of the
underlying design principles that implementors need to know takes up a lot of
space and will still fall short of the mark of giving the necessary guidance.
But it may lead to the belief that reading this specification (or for that
matter this one and RFC 5228) is in fact sufficient to understand how to use
Sieve. It quite simply isn't.
again, i beg to differ
sieve is very similar structurally to the guerrilla standards used in
enterprise mail systems for more than 5 years now. for most mail
processing applications, only the container builders need to have a
good understanding of the protocols. application developers are
offered a safe environment and an OOP interface. i see no reason why
sieve should be any different.
Understanding of the protocols isn't necessary, but I'm very much afraid that
there's no avoiding an understanding of email semantics if you want things to
work properly. We may wish it were otherwise, but it just isn't.
depends on what you mean by that
if you're talking about processing as part of SMTP processing (or any
other protocol) then i agree
but often when people think they are processing email, the use case
boils down to essentially processing a dead MIME document with
meta-data that has been previously delivered through some protocol or
other. they may do some other protocol stuff - such as forwarding some
result - but that's essentially acting as an email client. this is
usually within the capability of most general developers without a
good understanding of email semantics.
- robert