Re: I-D ACTION:draft-shafranovich-feedback-report-01.txt


On Wed May 25 2005 15:26, Yakov Shafranovich wrote:


Bruce Lilly wrote:


Review of         draft-shafranovich-feedback-report-01      by B. Lilly

...


   [X] The document should probably be split: registration of media
       types in the standards tree requires an RFC of any type.
       Registration procedures (which should be specified for
       registration of extension items) are typically specified in a BCP
       RFC.  Specifications with conformance criteria (which should be
       explicit) are typically Standards Track RFCs.


Can you elaborate as to why the document must be split? For example, RFC 
3464 includes the MIME registration in the same document. Why should 
this document, which is also a child-of-RFC3462 be any different?


Note that I said "probably should", not "must". Note that there are
companion documents to 3464; RFCs 3461-3463. Likewise for the MTRK
documents, RFCs 3885-3888.  One reason to consider separating different
issues is to facilitate progress of the Standards Track part (those that
specify conformance criteria, syntax specifications, etc.).  If you mix
other issues and it becomes necessary to change the document, the status
on the Standards Track may be reset and/or the timer for advancement to
the next stage may be reset.  In particular, registration procedures do
not need the type of phased roll-in used for Standards Track; as long as
the procedure is acceptable to IESG and IANA, it can be a BCP document
and you don't have to worry about it anymore as the syntax etc.
specification gets revised as it progresses along the Standards Track.

The document suffers from the following serious defects:

...


   [X] missing or inadequate internationalization considerations


I am not sure what this is referring to. This is a message intended for 
machines, not humans, and follows the same conventions as RFC 3462 from 
which it descends. What internationalization issues have to be discussed?


You should then state that the keywords are case-independent protocol
elements.  See RFC 2277 section 2.

   [X] incompatible with one or more Internet Standards


Can you be more specific? What Internet standards is this document not 
compatible with, what specific issues are there, etc.?


Specifying HTTP and (to a lesser extent) SMTP syntax for message fields
is incompatible with the message format.  There is a conflict with
RFC 2822 Subject field semantics.  The use of MUST and SHOULD conflicts
with RFC 2119.  BCP 90 registrations are missing.  The security
considerations are inadequate vs. RFC 2026.  Then there is the RFC 2277
issue noted above.  The good news is that all of those things can be fixed.

Specific issues with the draft:

...

   o The draft implies that "opt-out" is viable.  It is not, and the
     draft will meet some considerable resistance if it does not
     distance itself from that spammer-supported concept.  (opt-out is
     not viable because there are more than 10 billion potential senders
     of unsolicited material (nearly that many individuals, plus a large
     number of corporate entities).  Responding to a single message from
     each, at 5 seconds per response, working continuously, would take
     more than 1,500 years of uninterrupted effort.  How long do you
     plan to live, and do you wish to do something other than "opt-out"
     of unsolicited messages?)
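
The arithmetic behind the "more than 1,500 years" figure can be checked
with a short sketch.  The inputs are the ones quoted above (10 billion
senders, 5 seconds per response); the variable names and the 365.25-day
year are illustrative assumptions, not anything taken from the draft.

```python
# Rough check of the opt-out effort estimate quoted above.
# Assumption: one response per potential sender, no breaks, 365.25-day years.
POTENTIAL_SENDERS = 10_000_000_000      # "more than 10 billion"
SECONDS_PER_RESPONSE = 5                # "5 seconds per response"
SECONDS_PER_YEAR = 365.25 * 24 * 3600   # 31,557,600 seconds

total_seconds = POTENTIAL_SENDERS * SECONDS_PER_RESPONSE
years_of_effort = total_seconds / SECONDS_PER_YEAR

print(f"{years_of_effort:.0f} years of uninterrupted effort")
```

With those inputs the result lands a little under 1,600 years, which is
consistent with the figure given in the review.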


This is a technical standard, not an editorial page. The "opt-out" 
option is included to be used in feedback loops from ISPs to legit bulk 
mailers such as SkyList, etc., not a general opt-out mechanism. It is 
perfectly viable according to the ISPs and the bulk mailers I have 
spoken to.


I'd expect to hear that from bulk emailers.  ISPs that I've talked to are
opposed to "opt-out" (as distinct from "opt-in").  If you intend for the
mechanism to be distinct from the other uses (only between ISPs and bulk
mailers, in contrast to other uses by individuals, etc. in reporting abuse),
then it's probably best to put it in a separate specification.  Incidentally,
the draft says "To inform email service provides [sic] about opt-out
requests", which sounds different from "from ISPs to [...] bulk mailers".

   o The draft states "this format is intended specifically for
     communications among providers", implying that a mere individual
     (not a provider) cannot use it to report abuse to a government
     agency (also not a provider), for example.


I will change it to read "this format is intended primarily for 
communications among providers, but can also be used in other 
situations". Would that be more clear?


It's an improvement, but explicitly listing examples or specifying
particular uses would be better still.  Otherwise you may find a
frequently-asked question "what other situations?".

   o The draft states "The first MIME part of the message contains a
     human readable description of the report" but does not state
     whether or not that part can be a MIME composite type (e.g.
     multipart/alternative).  Nor does it provide for (e.g.) an audio
     type, which might be appropriate if the recipient is known to be
     visually impaired.


This is a child-of-RFC3462. Whatever is specified there, applies here. 
Last time I checked, I did not see any of those options there or in any 
standards that descend from it such as DSNs and MSG-TRACK.


Well, these are comments on this draft, not on other documents.  It is
unclear from the draft what exactly is permitted.  The first sections
of DSNs, MDNs, and MTSNs are typically machine-generated.  It's unclear
whether you expect that to be the case for the reports described in the
draft under discussion.  If so, it's unclear where they would come from
(DSNs can be generated by MDAs, MDNs by MUAs); in this case a MUA won't
have access to SMTP MAIL FROM, and obviously an MDA or MTA isn't a UA, so
won't have anything to put in "User-Agent"...  If not, then that means
manually generated, and some people might want to provide multiple
formats (e.g. plain text plus formatted text), etc.

   o The draft states "it is RECOMMENDED that the entire original email
     message be included without any modification" but does not indicate
     how such a message containing a virus or other malware can be
     successfully conveyed in the presence of filtering (at the sender's
     site, in transit, or at the intended recipient's site) without
     encryption and/or encoding.


This issue was brought up by someone already. One option would be to 
relay just the headers of the message (an option included in this 
document). However, the consensus in the discussions that I had with 
ISPs is that this is something out of scope for this document. Rather, 
this is an issue for the abuse desks (sender and receiver) to deal with.

I do not see any other plausible solution short of mucking around with 
the RFC3462 format and making this even more complicated. Both ISPs and 
abuse desks have expressed their desire to keep this simple.


Dealing with the issue early is the best way to keep it simple for the
affected parties.  If the specification doesn't deal with the issue,
chances are that there won't be implementations that handle it.  And
users don't generally hand-craft complex MIME messages using
multipart/report -- they rely on implementations to do that.
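
To illustrate what "rely on implementations" means in practice, here is
a minimal sketch of assembling a three-part multipart/report with
Python's standard email package.  The report-type value, the
"message/feedback-report" media type name for the second part, and the
function and argument names are assumptions made for illustration; the
sketch does not address the filtering problem raised above, since the
original message is attached as plain message/rfc822.

```python
from email import message_from_string
from email.mime.base import MIMEBase
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_report(description, fields, original_source):
    """Assemble a three-part multipart/report (hypothetical helper).

    description     -- human-readable text for the first part
    fields          -- machine-readable "Name: value" lines, second part
    original_source -- source text of the reported message, third part
    """
    # Assumed report-type value; the keyword becomes "report-type".
    report = MIMEMultipart("report", report_type="feedback-report")

    # Part 1: human-readable description.
    report.attach(MIMEText(description, "plain", "us-ascii"))

    # Part 2: machine-readable fields (media type name is an assumption).
    machine_part = MIMEBase("message", "feedback-report")
    machine_part.set_payload(fields)
    report.attach(machine_part)

    # Part 3: the original message, unmodified, as message/rfc822.
    report.attach(MIMEMessage(message_from_string(original_source)))
    return report


example = build_report(
    "This is a feedback report for a message received at example.net.\n",
    "Feedback-Type: other\n",
    "From: someone@example.com\nSubject: hello\n\nbody text\n",
)
```

Even this small case involves three nested media types and a
Content-Type parameter, which is why users are unlikely to construct
such reports by hand.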

   o The draft states "The subject line of the feedback report SHOULD be
     the same as the included email message", which conflicts with the
     defined semantics of the Subject field as stated in RFC 2822, viz.
     a description of the topic of the message containing it, not a
     purported description of a different message.


This was included due to the fact that smaller ISPs tend to use the 
Subject line for manual processing. This is currently the accepted 
convention and I don't know if mandating anything else will actually 
accomplish something.

However, what about prefixing the subject as follows:

"[FEEDBACK REPORT:] subject"
?


See draft-malamud-subject-line-05.  As noted by another reviewer, the
original message Subject is available in the third part.

   o The draft refers to RFC 2616, which is an HTTP specification that
     uses a different field syntax from the Internet Message Format.


I did not find any other standard that defines the user agent field. The 
"User-Agent" and "Mailer" fields used by email programs are not defined 
in any standard that I was able to locate. Perhaps a new registration or 
document for the user agent field should be written, OR the syntax 
copied from RFC 2616 and included in this document in long hand.


HTTP's "separators" differ from RFC 2822's "specials", and there are other
differences (e.g. LWS vs. FWS).  You might want to instead consider
something like RFC 3798 Reporting-UA (though beware that that specification
is ambiguous).  Or maybe something like RFC 3464 Reporting-MTA (since, as
noted above, a UA won't have access to SMTP MAIL FROM).
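
The separators/specials mismatch can be made concrete with a small
sketch.  The two character sets are transcribed from the RFC 2616
"separators" and RFC 2822 "specials" productions; the variable names are
illustrative only.

```python
# Character sets transcribed from the two grammars.
# RFC 2616 separators: ( ) < > @ , ; : \ " / [ ] ? = { } SP HT
HTTP_SEPARATORS = set('()<>@,;:\\"/[]?={} \t')
# RFC 2822 specials: ( ) < > [ ] : ; @ \ , . DQUOTE
RFC2822_SPECIALS = set('()<>[]:;@\\,."')

# Characters that delimit tokens in HTTP but not in the message format.
only_in_http = HTTP_SEPARATORS - RFC2822_SPECIALS
# Characters that are special in the message format but not HTTP separators.
only_in_2822 = RFC2822_SPECIALS - HTTP_SEPARATORS

print(sorted(only_in_http))
print(sorted(only_in_2822))
```

The first set contains "/", "?", "=", "{", "}", space and tab; the second
contains only ".".  A product token such as "Name/1.0" is therefore
tokenized differently under the two grammars.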

   o In several proposed fields, e.g.  "Original-Mail-From:", the draft
     makes statements such as "The format of this field is defined in
     section 4.1.1.2 of RFC 2821", whereas there is no such definition
     of any such fields in the referenced RFCs.


It is referring to the following:

"MAIL FROM:" ("<>" / Reverse-Path)
                        [SP Mail-parameters] CRLF


That's fine for a definition of the SMTP MAIL FROM command syntax, but
it's an SMTP command, not a message field.

Specifically to everything that appears after MAIL FROM. The format is 
probably the same as Return-Path header, but I wanted to reference the 
original SMTP transaction directly.


Do you really mean no space or comments after the colon?  That's very
unusual for a structured message field.
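
The difference matters to anyone writing a parser.  Below is a minimal
sketch of the two readings: a strict one that follows the SMTP command
grammar literally (nothing between the colon and the value) and a
tolerant one that allows whitespace after the colon, as message fields
normally do.  The function name, the strict flag, and the choice of
field-name characters (printable ASCII except colon, as in RFC 2822)
are assumptions for illustration; comments in parentheses are not
handled.

```python
import re

# Field name: printable US-ASCII except ":" (33-57 and 59-126).
FIELD_LINE = re.compile(r"^([\x21-\x39\x3b-\x7e]+):([ \t]*)(.*)$")


def parse_field(line, strict=False):
    """Split one 'Name: value' line into (name, value).

    With strict=True, whitespace directly after the colon is rejected,
    mirroring a literal reading of the SMTP MAIL FROM syntax.
    """
    match = FIELD_LINE.match(line)
    if match is None:
        raise ValueError(f"not a field line: {line!r}")
    name, gap, value = match.groups()
    if strict and gap:
        raise ValueError(f"whitespace after colon not allowed: {line!r}")
    return name, value.rstrip()


print(parse_field("Original-Mail-From:<user@example.com>", strict=True))
print(parse_field("Original-Mail-From: <user@example.com>"))
```

A generator following the tolerant reading will produce lines that a
strict parser rejects, so the specification needs to say which reading
is intended.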

   o The draft seeks to define a media type with multiple fields (but
     N.B. not *header* fields in this case), but does not provide enough
     detail:

      o Where's the syntax specification for the format?

      o What order should the fields appear in?

      o Is the order significant?

      o May empty lines appear between fields?


This is a child-of-RFC3462.


But other documents provide a reasonable specification (e.g.
RFC 3798 section 3, including all subsections).

      o What about the promised extensibility?


There is an extensibility section.


Are there any X- fields (as in 3798 section 3.3)?

      o Where are the syntax specifications for the fields?


Are you referring to ABNF?


ABNF or a similar formal syntax format, including a normative reference to
the formal syntax specification.  ABNF is usually used, as there are
validation tools available (sort of).

      o Where are the BCP 90 registration templates for the fields?


DSN and message tracking standards do not register their fields with the
BCP 90 IANA registry. Since this is a similar child-of-RFC3462, I do not
see why it should be any different.


BCP 90 (a.k.a. RFC 3864) came after DSN and MDN specifications, and MTSN
was already in the queue when BCP 90 went into effect.  Your specification
comes after BCP 90.  Congratulations! You get to register fields (just as
RFC 4021 did, and a number of current drafts do).

   o The media type registration form doesn't say anything about a
     charset parameter, or about required charsets.  Can I send such a
     report in EBCDIC?


This is a child-of-RFC3462, everything that applies there applies here 
as well (only 7bit ASCII).


It would help to say so.

   o The draft proposes establishing an IANA registry for header fields
     (actually fields which do not appear in a header).  There is
     already such a registry and corresponding registration procedure as
     established by BCP 90.  That mechanism could be used by extending
     BCP 90 in small ways to accommodate specifying applicability of
     fields to the defined media type proposed in the draft.


See above comment re BCP 90 (and take a look at the DSN and MSGTRACK 
stuff as well).


Commented on above.

   o There is no mention of architectural or internationalization
     considerations w.r.t. keywords.  Are keywords case-independent
     protocol elements or text?  Is it OK to use "Feedback-Type:
     Betrug"?


Are you referring to the feedback-type keywords?


Yes.  See the comments about RFC 2277 above.
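
If the keywords are declared to be case-independent protocol elements,
a receiver would compare them roughly as in the sketch below.  The
keyword set is partly an assumption: "virus", "other" and "opt-out"
appear in this discussion, "fraud" is assumed as the English counterpart
of the "Betrug" example, and the function name is hypothetical.

```python
# Keyword set is illustrative; "fraud" is an assumption, see the lead-in.
REGISTERED_TYPES = {"fraud", "opt-out", "other", "virus"}


def normalize_feedback_type(value):
    """Return the registered keyword matching value, or None.

    Keywords are treated as case-independent protocol elements: they are
    compared without regard to case, and never translated.
    """
    candidate = value.strip().lower()
    return candidate if candidate in REGISTERED_TYPES else None


print(normalize_feedback_type("Virus"))
print(normalize_feedback_type("Betrug"))
```

Under this reading "Virus" and "VIRUS" both match the registered
keyword, while "Betrug" matches nothing because it is localized text
rather than a protocol element.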

   o The draft provides a catch-all "other - any other feedback that
     doesn't fit into other types".  How does one distinguish a
     subsequent extension from "other" (hint: only register types that
     have a specific definition)?


Other specifically refers to a case where the abuse reporter DOES NOT 
want to specify a specific feedback type and leaves that task to the 
receiver.


Wouldn't it be simpler and less confusing to make the Feedback-Type field
optional?  Then a reporter can simply leave out the field if none of the
registered values apply.

   o There is provision for one specific type of malware -- "virus", but
     not other types (worms, dialers, keystroke loggers, logic bombs,
     etc.).  The Security Glossary FYI may be a useful reference.


I assumed that it includes all malware.


You might want to pick a more general term.  Or define more keywords to
cover the other cases.

   o The Security considerations section is completely unacceptable.


Obviously, this is only a draft at this point and has not been sent to 
the IESG or anyone else for approval just yet. Rather, it is still 
something that needs work, and specific suggestions would be highly 
welcome. Blanket statements like the one above are not very helpful - of 
course I am aware that this section is not finished yet. If you would 
like to help writing it, I would be very happy.


A starting point would be to review what is specified in DSN, MDN, and
MTSN specifications, and review the guidelines to writing security
considerations sections.  Ultimately it's going to take some careful
thought.

   o References are normally unnumbered sections.


Can you provide a reasoning for this or is this simply something the 
community is used to?


"Instructions to Requests for Comments (RFC) Authors",
draft-rfc-editor-rfc2223bis-08.txt.  Not a big deal.

   o Examples are badly broken.


Aside from the various 2822 problems highlighted above, how so?


Those are the currently machine-detectable problems.  There seem to be
very many.  The excessively long lines are also a problem for the draft
format.  There may be problems in the examples related to the specified
format and its fields, but without a precise syntax specification, one
cannot build a validator to check that...  In particular, see the "no
space after colon" comment above re. "Original-Mail-From" and compare
to example A.3.