ietf-openpgp

Re: [openpgp] User ID conventions (it's not really a RFC2822 name-addr)

2019-11-06 01:37:43
Hi Brian,

On Wed, 06 Nov 2019 01:05:46 +0100,
brian m. carlson wrote:
On 2019-11-05 at 22:35:11, Neal H. Walfield wrote:
I'm considering using the following "grammar".  (I've put grammar in
scare quotes, because it is not a valid grammar according to RFC 5322
due to several ambiguities.  In particular, the production "*WS
[name] *WS" is ambiguous when applied to a string containing a single
whitespace character: the whitespace character could match the first
WS or the second one.  In practice, this ambiguity doesn't matter,
because we only care about what the "name", "comment-content" and
"addr-spec" productions match.)

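To make the ambiguity concrete, here is a minimal Python sketch (not from the original proposal) that enumerates every way "*WS [name] *WS" can split a string.  The function name `derivations` is hypothetical, and the check on `name` is deliberately simplified: it only enforces that a name cannot start with WS and treats every other character as acceptable.

```python
def derivations(text):
    """List every (leading WS count, name, trailing WS count) split of
    `text` under the production `*WS [name] *WS`.

    Simplifying assumption: a name is valid if it is non-empty and does
    not start with a space; the other name-char rules are not checked.
    """
    n = len(text)
    results = []
    for lead in range(n + 1):
        if text[:lead].strip(" "):
            break  # the leading part must consist of spaces only
        for trail in range(n - lead + 1):
            if text[n - trail:].strip(" "):
                break  # the trailing part must consist of spaces only
            middle = text[lead:n - trail]
            # [name] is optional, so an empty middle is allowed.
            if middle == "" or middle[0] != " ":
                results.append((lead, middle, trail))
    return results


# A single space has two derivations, and both yield an empty name.
print(derivations(" "))
```

Both derivations produce the same (empty) name, which is why the ambiguity has no practical effect on what the "name" production matches.
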
     WS                 = 0x20 (space character)

     comment-specials   = "<" / ">" /   ; RFC 2822 specials - "(" and ")"
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\" /
                          "," / "." /
                          DQUOTE

     atext-specials     = "(" / ")" /   ; RFC 2822 specials - "<" and ">".
                          "[" / "]" /
                          ":" / ";" /
                          "@" / "\" /
                          "," / "." /
                          DQUOTE

     atext              = ALPHA / DIGIT /   ; Any character except controls,
                          "!" / "#" /       ;  SP, and specials.
                          "$" / "%" /       ;  Used for atoms
                          "&" / "'" /
                          "*" / "+" /
                          "-" / "/" /
                          "=" / "?" /
                          "^" / "_" /
                          "`" / "{" /
                          "|" / "}" /
                          "~" /
                          \u{80}-\u{10ffff} ; Non-ascii, non-control UTF-8

     name-char-start    = atext / atext-specials

     name-char-rest     = atext / atext-specials / WS

     name               = name-char-start *name-char-rest

     comment-char       = atext / comment-specials / WS

     comment-content    = *comment-char

     comment            = "(" *WS comment-content *WS ")"

     addr-spec          = dot-atom-text "@" dot-atom-text
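
As a rough illustration of the "name" and "comment" productions above, the following Python sketch translates them into regular expressions.  It is not the code the grammar was derived from; the names `is_name` and `comment_content` are hypothetical.  The range \u{80}-\u{10ffff} is taken literally from the grammar, so in Python it also admits the C1 control characters (U+0080 to U+009F) and lone surrogates.

```python
import re

# Character sets copied from the grammar above.
ATEXT = ("A-Za-z0-9"
         + re.escape("!#$%&'*+-/=?^_`{|}~")
         + "\u0080-\U0010ffff")
COMMENT_SPECIALS = re.escape('<>[]:;@\\,."')
ATEXT_SPECIALS = re.escape('()[]:;@\\,."')

# name = name-char-start *name-char-rest
NAME_RE = re.compile(
    f"[{ATEXT}{ATEXT_SPECIALS}][{ATEXT}{ATEXT_SPECIALS} ]*")

# comment = "(" *WS comment-content *WS ")"
# comment-char already includes WS, so the whole body is matched at
# once and the surrounding spaces are stripped afterwards.
COMMENT_RE = re.compile(
    rf"\((?P<content>[{ATEXT}{COMMENT_SPECIALS} ]*)\)")


def is_name(s):
    """Return True if `s` matches the `name` production in full."""
    return NAME_RE.fullmatch(s) is not None


def comment_content(s):
    """Return the trimmed comment-content of `s`, or None if `s` is
    not a complete `comment`."""
    m = COMMENT_RE.fullmatch(s)
    return m.group("content").strip(" ") if m else None


print(is_name("Neal H. Walfield"))   # True
print(is_name(" Neal"))              # False: a name cannot start with WS
print(comment_content("( work )"))   # work
```

Note that under these rules a name may contain "(" and ")" but not "<" or ">", while a comment may contain "<" and ">" but not "(" or ")".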

dot-atom-text isn't defined here, so it isn't clear to me what it
includes.  Does it permit UTF-8 in addresses according to the SMTPUTF8
RFCs?

Thanks for catching that.  When turning my code into a grammar, I
somehow forgot that production.


The dot-atom-text production is unchanged from e.g. RFC 2822:

   dot-atom-text      = 1*atext *("." 1*atext)

But since we've extended atext to include non-control UTF-8
characters, this should allow international email addresses.
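
A minimal Python sketch of the resulting "addr-spec" check follows, assuming the RFC 2822 form of dot-atom-text, 1*atext *("." 1*atext), combined with the extended atext from the grammar.  The name `parse_addr_spec` is hypothetical, and the sketch checks syntax only; it says nothing about whether a mail server would accept the address.

```python
import re

# atext as extended in the grammar: ASCII atoms plus U+0080..U+10FFFF.
ATEXT = ("A-Za-z0-9"
         + re.escape("!#$%&'*+-/=?^_`{|}~")
         + "\u0080-\U0010ffff")

# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = f"[{ATEXT}]+(?:\\.[{ATEXT}]+)*"

# addr-spec = dot-atom-text "@" dot-atom-text
ADDR_SPEC_RE = re.compile(
    f"(?P<local>{DOT_ATOM_TEXT})@(?P<domain>{DOT_ATOM_TEXT})")


def parse_addr_spec(s):
    """Return (local part, domain) if `s` is a complete addr-spec,
    otherwise None."""
    m = ADDR_SPEC_RE.fullmatch(s)
    return (m.group("local"), m.group("domain")) if m else None


print(parse_addr_spec("neal@walfield.org"))
print(parse_addr_spec("用户@例子.中国"))      # non-ASCII is accepted
print(parse_addr_spec("a..b@example.org"))    # None: empty atom
```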

RFC 6531 (the SMTPUTF8 RFC) extends atext as follows:

  atext   =/  UTF8-non-ascii
    ; extend the implicit definition of atext in
    ; RFC 5321, Section 4.1.2, which ultimately points to
    ; the actual definition in RFC 5322, Section 3.2.3

  https://tools.ietf.org/html/rfc6531#section-3.3

which, I think, is what I did above.

But I've only skimmed RFC 6531, so I might have missed something else.

Thanks!

:) Neal

_______________________________________________
openpgp mailing list
openpgp(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/openpgp