ietf-smtp

Re: RFC 5321 VRFY and quoting syntax

2011-05-11 15:45:15



On 05/11/2011 08:47 PM, Murray S. Kucherawy wrote:

-----Original Message-----
From: owner-ietf-smtp(_at_)mail(_dot_)imc(_dot_)org 
[mailto:owner-ietf-smtp(_at_)mail(_dot_)imc(_dot_)org] On Behalf Of A. Rothman
Sent: Monday, May 09, 2011 8:03 AM
To: IETF SMTP
Subject: RFC 5321 VRFY and quoting syntax

1. The syntax definition for commands such as VRFY and EXPN (e.g.
section 4.1.1.6) is

vrfy = "VRFY" SP String CRLF

with the syntax of "String" defined in section 4.1.2 as:

String         = Atom / Quoted-string

however, section 3.5.1 states:

     The character string arguments of the VRFY and EXPN commands cannot
     be further restricted due to the variety of implementations of the
     user name and mailbox list concepts.

What is not entirely clear to me is which is true: is any string of
characters valid (syntax-wise), or must it be a Quoted-string (in double
quotes) if any non-atext characters appear in it? This is especially
confusing having seen various examples online (and some discussions on
this mailing list as well) where pointed brackets are included in the
VRFY argument (using a Path syntax similar to the one defined for the
MAIL FROM and RCPT TO commands), but with no quoting.
I'm missing how your first sentence here is a conflict.  It has to be a 
quoted-string if there are any non-atext characters, otherwise the quoting is 
not needed.
This is not exactly an issue of conflict, but rather of leaving room for interpretation. The ABNF is indeed well-defined on its own, as you state. However, when one reads that the argument 'cannot be further restricted' (is that before or after quoting?) and sees many examples out in the wild, most notably the brackets but also others (as in Hector's previous response, which uses backslash escaping outside of a quoted-string), it raises doubt as to whether 'cannot be further restricted' means 'you can do anything you want' or whether the argument must still abide by the rules of quoting. Having many examples violating the rules of quoting gives more weight to the first interpretation. Or, in more practical terms: since there are many common violations, the RFC wording might have some room for improvement with regard to being more explicit and preventing these violations.
Also, a good point on the angle brackets.  The syntax seems to suggest they're 
not valid (or, indeed, need quoting themselves).  I'm sure it's commonly 
accepted this way since the most commonly used SMTP commands (MAIL and RCPT) 
require them but most implementations also usually tolerate their absence.
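
As a rough illustration of the strict ABNF reading discussed in point 1, the sketch below classifies a VRFY/EXPN argument as an Atom, a Quoted-string, or neither. The function name `classify_string` and the regular expressions are illustrative assumptions written from the RFC 5321 section 4.1.2 grammar (with atext taken from RFC 5322); they are not taken from any existing implementation, and real servers are often more lenient.

```python
import re

# atext as defined in RFC 5322: letters, digits and a fixed set of symbols.
ATEXT = r"[A-Za-z0-9!#$%&'*+/=?^_`{|}~-]"

# Atom = 1*atext
ATOM = re.compile(rf"{ATEXT}+")

# Quoted-string = DQUOTE *QcontentSMTP DQUOTE
# qtextSMTP       = %d32-33 / %d35-91 / %d93-126
# quoted-pairSMTP = %d92 %d32-126
QUOTED_STRING = re.compile(
    r'"(?:[\x20-\x21\x23-\x5b\x5d-\x7e]|\\[\x20-\x7e])*"'
)


def classify_string(arg: str) -> str:
    """Return 'atom', 'quoted-string' or 'invalid' for a VRFY/EXPN argument.

    Hypothetical helper: it applies the ABNF literally and ignores the
    'cannot be further restricted' text in section 3.5.1.
    """
    if ATOM.fullmatch(arg):
        return "atom"
    if QUOTED_STRING.fullmatch(arg):
        return "quoted-string"
    return "invalid"


if __name__ == "__main__":
    for example in ["jsmith", "<jsmith@example.com>", '"<jsmith@example.com>"']:
        print(example, "->", classify_string(example))
```

Under this literal reading, an unquoted `<jsmith@example.com>` is rejected because `<`, `@`, `.` and `>` are not atext, while the same text inside double quotes is accepted.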

2. Section 4.1.2 defines the backslash-escaped character mechanism in
quoted-pairSMTP, which is used only in a Quoted-string (within
double-quotes), and does not mention such escaping outside of a
Quoted-string. The following text section states:

     Note that the backslash, "\", is a quote character, which is used to
     indicate that the next character is to be used literally (instead of
     its normal interpretation).  For example, "Joe\,Smith" indicates a
     single nine-character user name string with the comma being the
     fourth character of that string.

So, it is unclear whether this paragraph applies only to the
Quoted-strings defined above, or to any characters in any argument to
any command, or only to mailboxes (discussed in the preceding paragraph)
or some other definition of when it does and does not apply.
It appears to apply only within quoted strings.  Its main function then would 
be to escape quotation marks within quoted strings, because everything else is 
not a character that separates tokens.
If you believe the ABNF, that's correct. If you believe the text in that paragraph... not so sure any more. Even in the example within the paragraph of "Joe\,Smith", it is unclear whether the quotes are part of the actual argument, or just a way to embed the example within a paragraph of explanatory text. Here too, there are various violations out there which can legitimately point to this paragraph and say 'it says so in the RFC', so it would be better imho to update the spec's wording to be unambiguous, e.g. "Note that the backslash, "\", when used within a quoted string, is a quote character..."
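
To make the ABNF-only interpretation of point 2 concrete, here is a minimal sketch that resolves backslash escapes only inside a Quoted-string and rejects anything else. The name `unquote_smtp` and the pattern are assumptions for illustration; the grammar is the one from RFC 5321 section 4.1.2, and the sketch takes the double quotes in the RFC's example to be part of the argument, which is exactly the point in doubt above.

```python
import re

# Quoted-string per RFC 5321 section 4.1.2, with the content captured.
QUOTED_STRING = re.compile(
    r'"((?:[\x20-\x21\x23-\x5b\x5d-\x7e]|\\[\x20-\x7e])*)"'
)


def unquote_smtp(arg: str) -> str:
    """Strip the surrounding quotes and resolve quoted-pairs.

    Hypothetical helper: backslash escaping is honoured only inside a
    Quoted-string, so an unquoted argument containing a backslash is
    treated as a syntax error rather than unescaped.
    """
    match = QUOTED_STRING.fullmatch(arg)
    if match is None:
        raise ValueError("argument is not a Quoted-string")
    # Replace each backslash + character pair with the character itself.
    return re.sub(r"\\(.)", r"\1", match.group(1))


if __name__ == "__main__":
    name = unquote_smtp('"Joe\\,Smith"')
    print(name, len(name), name[3])
```

With the RFC's example, the result is the nine-character string `Joe,Smith` with the comma as the fourth character, matching the explanatory text.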
3. Still regarding VRFY (and maybe also EXPN?) section 3.5.1 states:

     If a normal (i.e., 250) response is returned,
     the response MAY include the full name of the user and MUST include
     the mailbox of the user.  It MUST be in either of the following
     forms:

        User Name <local-part@domain>
        local-part@domain

Whereas section 3.5.2 claims:

     When normal (2yz or 551) responses are returned from a VRFY or EXPN
     request, the reply MUST include the <Mailbox> name using a
     "<local-part@domain>" construction

Notably, the second example in the former section does not comply with
the latter section (as it contains no pointed brackets). Which is the
correct form?
I would say the angle-bracket form is what would be expected, but you're right, 
this looks like an ambiguity.  I would recommend opening an errata item for it 
if that hasn't already been done.
ok, I'll do that when this thread ends, if no one comes up with a counter-explanation.
-MSK
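
Until the ambiguity in point 3 is resolved, a client may want to accept both reply forms from section 3.5.1. The following is a deliberately loose sketch of that idea; `parse_vrfy_reply_text` and both patterns are hypothetical, operate on the reply text after the 250 code, and do not attempt to validate the mailbox against the full RFC 5321 grammar (quoted local-parts containing spaces, for instance, are not handled).

```python
import re

# "User Name <local-part@domain>" (the name may be empty)
BRACKETED = re.compile(r"^(?P<name>.*?)\s*<(?P<mailbox>[^<>\s]+@[^<>\s]+)>$")
# "local-part@domain" with no angle brackets
BARE = re.compile(r"^(?P<mailbox>[^<>\s]+@[^<>\s]+)$")


def parse_vrfy_reply_text(text: str):
    """Return (full_name_or_None, mailbox), or None if no mailbox is found."""
    text = text.strip()
    match = BRACKETED.match(text)
    if match:
        return (match.group("name") or None, match.group("mailbox"))
    match = BARE.match(text)
    if match:
        return (None, match.group("mailbox"))
    return None


if __name__ == "__main__":
    for reply in ["User Name <local-part@domain>", "local-part@domain"]:
        print(parse_vrfy_reply_text(reply))
```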