
Re: Parsing a header

2007-12-03 05:55:46

Paul Smith wrote:

How do you think you should parse this header according to RFC 2822?

From: Joe \(Joseph\) Bloggs <joe(_at_)joe(_dot_)com>

My reading of RFC 2822 says that this is three "tokens", 'Joe', '\' and an unfinished comment '(Joseph) Bloggs <joe(_at_)joe(_dot_)com>'

As far as I can see you can have quoted pairs like '\)' in comments, but not outside comments, so the '\' before the '(' isn't a quoting '\' but a real '\'. Outside comments you use " characters for quoting.

Is this right, or am I missing something?

AFAICS, syntactically correct formations of the line could be:

From: "Joe (Joseph) Bloggs" <joe(_at_)joe(_dot_)com>
From: (Joe \(Joseph\) Bloggs) joe(_at_)joe(_dot_)com
or even
From: Joe (\(Joseph\)) Bloggs <joe(_at_)joe(_dot_)com>

Hi Paul,

I am not sure the second one is a valid legacy format.

My view is: why are you parsing the display name at all?

There are really just two parts here:

    angle-addr   -->  <joe(_at_)joe(_dot_)com>
    display-name -->  everything else, who cares!

Generally, you have these two formats to check:

    displayname <addr-spec>      current
    addr-spec (displayname)      legacy

The display name should be passive and transparent.

So the parser should first look for the angle-addr to get the address; everything else becomes the display name.

If there is no angle-addr, then look for a quoted string or comment for the display name; the remainder is the address. In fact, as I mentioned above, the second example does not appear to be valid, since in the legacy format the comment should follow the address. Therefore, if there is no angle-addr, you can use the first space as the delimiter. But even if the second example were acceptable, you can detect that its first token is not a valid address, since it is a comment.
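A rough Python sketch of that approach might look like the following. It is only a heuristic split along the lines described above, not a full RFC 2822 parser. The function name split_from, the two regular expressions, and the example.com addresses are assumptions made for illustration, and returning None for a token that looks like a comment is just one possible way to flag it.

```python
import re

# Hypothetical helpers: a heuristic split, not a complete RFC 2822 parser.
ANGLE_ADDR = re.compile(r"<([^<>]*)>")
# Legacy form: an address token followed by a parenthesized comment.
LEGACY = re.compile(r"^([^\s()]+)\s*\((.*)\)\s*$")


def split_from(value):
    """Return (display_name, address); address is None if none is usable."""
    value = value.strip()

    # Step 1: look for an angle-addr; everything else is the display name.
    matches = list(ANGLE_ADDR.finditer(value))
    if matches:
        m = matches[-1]  # assume the last <...> is the address
        display = (value[:m.start()] + " " + value[m.end():]).strip()
        return display, m.group(1).strip()

    # Step 2: no angle-addr, so try "addr-spec (displayname)".
    m = LEGACY.match(value)
    if m:
        return m.group(2).strip(), m.group(1)

    # Step 3: fall back to the first whitespace as the delimiter.
    parts = value.split(None, 1)
    if not parts or "(" in parts[0] or ")" in parts[0]:
        # The first token is part of a comment, so it is not an address.
        return value, None
    return (parts[1] if len(parts) > 1 else ""), parts[0]


if __name__ == "__main__":
    for header in (
        r"Joe \(Joseph\) Bloggs <joe@example.com>",
        "joe@example.com (Joe Bloggs)",
        r"(Joe \(Joseph\) Bloggs) joe@example.com",
    ):
        print(split_from(header))
```

Note that the display name is passed through untouched, backslashes and all, which is the point: it is treated as opaque text and only the address is extracted.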

Makes sense?


Hector Santos, CTO
