Re: simplemail draft RFC available

I've read over the draft, and naturally I have lots of comments.


Thanks for the feedback.


The first metacomment, which shouldn't surprise Bill or Evan but may
surprise other folks, is that I LOVE this proposal and don't at all mind
the "competition" with text/enriched.  I'm not really sure that I
perceive them as 100% competitive anyway in that there may be a role for
both (for example, simplemail is easier to generate by hand, but
enriched may be easier to generate in the context of existing word
processors).


No argument here.

Given the fluidity of the situation, however, I've concluded that I
should publish the text/enriched draft as an Experimental RFC rather
than a Proposed Standard RFC.  I'd encourage Bill & Evan to do the same
with simplemail, but that's their decision.


I'll have to plead ignorance here.  I'm new to this and don't know the
difference between the various flavors of RFC (nor, actually, how to
make one official).

As another metacomment, I sincerely hope that nobody feels the need to
start a "simplemail vs enriched" war.   For my part, I don't know if I'm
going to have time to write my own implementation of text/simplemail any
time soon (even text/enriched will be a challenge at this point) but
I'll happily include a good PD text/simplemail interpreter in the
metamail distribution if anyone offers me one.


I had about 75% of an implementation of an earlier draft (written in
perl).  I should have time to update/finish it soon.  Bill said that
he was working on one as well.  Mine can go to plain text or
nroff-like backspace sequences.  What other markup languages would be
useful for metamail?  (I suppose I could probably hack it to convert
to text/enriched...)

As a final metacomment, my perception -- not having tried to implement
it yet -- is that a minimal text/simplemail interpreter will be a little
more complex than a minimal text/enriched interpreter.  I'd love to be
proven wrong.  I have gotten very positive response to the simple
text/enriched interpreter that I provide as an appendix, and I'd
encourage you to try to include something similar.


Well, it depends on what you mean by minimal.  One of the goals of
simplemail was that *no* processing be necessary in order for it to be
readable.  Since most of the markup is common already, the only thing
that you might want to strip are the leading colons and perhaps equals
signs.  The colons can be handled with something like:

    :' while ($line = <>)
    :' {
    :'   $line =~ s/^((\w*>\s*)*)\s*:\S? /$1/;
    :'   print $line;
    :' }

To do more complicated text formatting (line folding, footnote
extraction, hanging tags, etc.) is a little more complicated, but we
should be able to show a PD proof by example that hopefully runs in
the hundreds of lines.

-- At the risk of sounding like Dave Crocker (horrors!), I'd *really*
like to see a formal grammar in this document.  (Dave probably passed
out when he read the previous sentence.  Either that, or he assumes this
message is a forgery.)   I'm *pretty sure* that the specification is
unambiguously parsable, but I'm not positive.  A formal grammar would be
reassuring.


It would, wouldn't it.  Unfortunately, it's probably easier to
implement than to state :-).  This mostly has to do with the rules
"_x_ is implicitly terminated by the end of the paragraph if no
closing delimiter is seen" and "a paragraph ends with a blank line or
a change of quote prefix or indentation".  These rules are necessary
to allow safe handling of trimmed and broken up quoted material and
are easily understood by a human, but look like they'd be messy to
write formally.  Still, an attempt should be made.  Want to take a
crack at it, Bill?
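
For what it's worth, the second rule can be sketched in a few lines
of Perl.  This is only an illustration under assumptions of mine, not
anything taken from the draft: the quote prefix is taken to be a run
of `>` marks with optional initials, indentation is whatever white
space follows the prefix, and the name split_paragraphs is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: group lines into paragraphs.  A paragraph ends at
# a blank line, or when the quote prefix or the indentation changes.
sub split_paragraphs {
    my @lines = @_;
    my (@paragraphs, @current);
    my ($last_prefix, $last_indent);

    for my $line (@lines) {
        chomp(my $copy = $line);
        # Assumed line shape: quote prefix, indentation, then text.
        my ($prefix, $indent, $text) = $copy =~ /^((?:\s*\w*>)*)(\s*)(.*)$/;
        $prefix =~ s/\s+//g;    # "> >" and ">>" count as the same prefix
        my $blank = $text eq '';

        my $changed = @current
            && ($prefix ne $last_prefix || length($indent) != $last_indent);

        # Close the running paragraph on a blank line or a change.
        if (@current && ($blank || $changed)) {
            push @paragraphs, [@current];
            @current = ();
        }
        next if $blank;

        push @current, $copy;
        ($last_prefix, $last_indent) = ($prefix, length $indent);
    }
    push @paragraphs, [@current] if @current;
    return @paragraphs;    # list of array references, one per paragraph
}
```

Since every implicit "_x_" termination happens at one of these
boundaries, a formatter built this way could reset its delimiter
state once per returned paragraph.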

The only *real* nastiness (which really needs to be formalized) is how
to decide that the first delimiter of a particular type in a paragraph
is a *closing* delimiter (the opening one got dropped) if the
formatter is trying to be smart.  (It is allowed to be out of phase
for the course of a paragraph.  No information should be lost.)
Section 3.2 points out that an opening delimiter cannot be followed by
white space, but a line such as

    : > him*! ...

should probably treat the asterisk as a close.  Handling this right
depends on knowing about alphanumerics and punctuation and how they
work in various character sets.
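
A rough Perl sketch of that kind of guess follows.  Only the first
rule comes from Section 3.2; the other two are heuristics I am
assuming for illustration, the name classify_delimiter is made up,
and Perl's \w (which also matches underscore) is just a stand-in for
"alphanumeric" that ignores the character set question entirely.

```perl
use strict;
use warnings;

# Hypothetical classifier: guess whether the delimiter at offset $pos in
# $text opens or closes a span, using only its neighbouring characters.
sub classify_delimiter {
    my ($text, $pos) = @_;
    # Treat the start and end of the line as white space.
    my $before = $pos > 0 ? substr($text, $pos - 1, 1) : ' ';
    my $after  = $pos + 1 < length($text) ? substr($text, $pos + 1, 1) : ' ';

    # Rule from Section 3.2: an opener cannot be followed by white space.
    return 'close' if $after =~ /\s/;
    # Assumed heuristic: word character before, non-word character after,
    # as in "him*!", reads as the end of an emphasized word.
    return 'close' if $before =~ /\w/ && $after !~ /\w/;
    # Assumed heuristic: non-word before, word character after, opens.
    return 'open' if $before !~ /\w/ && $after =~ /\w/;
    return 'unknown';
}
```

With these rules the asterisk in "him*! ..." comes out as a close,
while a delimiter with word characters on both sides stays
undecided.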

-- Section 3.2.3 makes reference to sections 4.5.1 and 4.6.  Those
sections don't exist.  In fact, section 4 is very short, and there's no
section 5.  This seems a tad strange, but the page numbers don't
indicate any missing pages...  (Note that all my comments refer to the
postscript version.)


Between my giving the draft to Bill and his posting it, all of the
major section numbers decreased by one (my section one was the
"Status", which he moved onto the title page--a move I agree with).
Evidently not all of the cross references got updated.  These should
refer to 3.5.1 and 3.6.

-- You've avoided the issue of non-ASCII character sets.  I think this
is important, but the issues are complex, especially for multibyte
character sets.  I've recently come up with a new idea for how to do
text/enriched with multibyte character sets, which involves making the
metacharacters ("<" and ">" in the case of text/enriched;
text/simplemail has a few more) parameterizable on the content-type
line.  You might want to check this out (I'll be publishing a new
text/enriched spec shortly) and consider a similar approach for
text/simplemail.


I'll take a look at it.  One of the headaches I saw with being
cognizant of/dependent on the character set was figuring out how to
handle a message in one character set quoting one in another.  But
it's an issue that will have to be addressed sooner or later.

-- By the way, I commend you on the general formatting of your
postscript version -- it was a pleasure to read, and a good argument for
the value of formatted RFCs.


Blush.

-- Section 3.2.3 doesn't tell how to include backquotes in literal text.
(Perhaps this is intended to be explained in the mythical sections
4.5.1 or 4.6, though.)


The general doubling rule is supposed to apply here.  I'm not thrilled
with doubling, but we couldn't agree on a quote character.
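
To make the doubling concrete, here is a small Perl sketch.  It
assumes (my reading, not text from the draft) that a literal span is
delimited by single backquotes and that a doubled backquote inside it
stands for one backquote; the name literal_spans is made up.

```perl
use strict;
use warnings;

# Hypothetical helper: return the literal spans found in $text, assuming
# a span is delimited by single backquotes and a doubled backquote inside
# it stands for one literal backquote.
sub literal_spans {
    my ($text) = @_;
    my @spans;
    # A span body is any run of non-backquotes or doubled backquotes.
    while ($text =~ /`((?:[^`]|``)*)`/g) {
        (my $literal = $1) =~ s/``/`/g;    # undo the doubling
        push @spans, $literal;
    }
    return @spans;
}
```

Under those assumptions, a span written with a doubled backquote
between "a" and "b" yields the three characters a, backquote, b.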

-- I was disappointed by the content of section 3.7.2 ("External
References") because I was hoping it was going to give a method for
referencing other MIME body parts that might be inside the same MIME
message as the text/simplemail body part.  Given the other mechanisms
you've defined, this should be pretty easy to add.  How about
<<<content-id>>> as a syntax?


Something like that probably isn't a bad idea.  We were trying to keep
ad-hoc determiners to a minimum, so I'd rather not introduce another
set, but something like `<<= content-id>>` might work.  The only
problem I see is that it would become meaningless (wouldn't it?) when
quoted.

In general, a beautiful document.  Great work!  -- Nathaniel


Thanks again,

Evan Kirshenbaum                       +------------------------------------
    HP Laboratories                    | Ye knowe ek, that in forme of speche
    3500 Deer Creek Road, Building 26U |    is chaunge
    Palo Alto, CA  94304               | Withinne a thousand yer, and wordes
                                       |    tho 
    kirshenbaum(_at_)hpl(_dot_)hp(_dot_)com | That hadden prys now wonder nyce and
    (415)857-7572                      |    straunge
                                       | Us thenketh hem, and yet they spake
                                       |    hem so
                                       |                     Chaucer