Paul Smith wrote:
How do you think you should parse this header according to RFC 2822?
From: Joe \(Joseph\) Bloggs <joe(_at_)joe(_dot_)com>
My reading of RFC 2822 says that this is three "tokens", 'Joe', '\' and
an unfinished comment '(Joseph) Bloggs <joe(_at_)joe(_dot_)com>'
As far as I can see you can have quoted pairs like '\)' in comments, but
not outside comments, so the '\' before the '(' isn't a quoting '\' but
a real '\'. Outside comments you use " characters for quoting.
Is this right, or am I missing something?
AFAICS, syntactically correct formations of the line could be:
From: "Joe (Joseph) Bloggs" <joe(_at_)joe(_dot_)com>
From: (Joe \(Joseph\) Bloggs) joe(_at_)joe(_dot_)com
From: Joe (\(Joseph\)) Bloggs <joe(_at_)joe(_dot_)com>
I am not sure the second one is a valid legacy format.
My view is: why are you parsing the display name?
There are really just two parts here:
angle-addr --> <joe(_at_)joe(_dot_)com>
display-name --> everything else, who cares!
Generally, you have these two formats to check:
displayname <addr-spec> current
addr-spec (displayname) legacy
The display name should be passive and transparent.
So the parser should first look for the angle-addr to get the address;
everything else becomes the display name.
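
To illustrate, here is a minimal sketch in Python of the "angle-addr first"
approach for the two formats listed above. It is a heuristic, not a full
RFC 2822 parser, and the function name split_from_header and the
example.com addresses are hypothetical, chosen only for this example. It
assumes the angle-addr, when present, is the last thing in the header
value, and it leaves the display name untouched rather than parsing it.

```python
import re

def split_from_header(value):
    """Split a From: header value into (display_name, address).

    Heuristic sketch only: the display name is treated as opaque text.
    """
    value = value.strip()

    # Current format: display-name <addr-spec>
    # Take the trailing <...> as the address, everything before it as the name.
    match = re.search(r'<([^<>]*)>\s*$', value)
    if match:
        address = match.group(1).strip()
        display = value[:match.start()].strip()
        return display, address

    # Legacy format: addr-spec (displayname)
    # No angle-addr, so use the first space as the delimiter.
    address, _, rest = value.partition(' ')
    display = rest.strip()
    if display.startswith('(') and display.endswith(')'):
        display = display[1:-1].strip()  # drop the outer comment parentheses
    return display, address


if __name__ == "__main__":
    print(split_from_header(r'Joe \(Joseph\) Bloggs <joe@example.com>'))
    print(split_from_header('joe@example.com (Joe Bloggs)'))
```

With this approach the backslashes and parentheses in the display name
never need to be interpreted; they are simply carried along as text.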
If there is no angle address, then look for the quoted string or comment
for the display name and the remainder is the address. In fact, as I
mentioned above, the second example does not appear to be valid as the
comment should follow the address. Therefore, you can check for the
first space as a delimiter if there is no angle address. But even if it
was acceptable, you can detect it is not a valid address since it is
Hector Santos, CTO