Re: [Nmh-workers] Thoughts: header/address parsing

2014-08-04 20:26:36

On Aug 4, 2014, at 6:05 PM, Ken Hornstein <kenh(_at_)pobox(_dot_)com> wrote:

I can't speak for things like Received, but at least for headers
containing comma-separated addresses you're supposed to (according to
the RFC) add a comma and logically combine them together (we mostly do
okay on that).

If I'm reading Norm's request the right way, he wants something that can easily 
be parsed with things like cut and awk.  I know that's what I would be looking 
for.  So, let's say you were given

  To: NMH Workers <nmh-workers(_at_)nongnu(_dot_)org> (NMH Workers Unite!)

you would get something like

  addr\tnmh-workers(_at_)nongnu(_dot_)org
  phrase\tNMH Workers
  comment\tNMH Workers Unite!

The idea being each structured part of the header would be broken out with an 
identifying tag, and with the content attached in a completely un-encoded 
manner.  So you would have 

  TAG HT CONTENT EOL

(in pseudo-BNF), with CONTENT being fully decoded UTF-8 text.
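
A rough sketch of that line format in Python, assuming the header has already been parsed into its parts. The function name emit_records is made up for illustration and is not an existing nmh interface; the tag names come from the example above, and the example.org address is a placeholder.

```python
def emit_records(parts):
    """Render (tag, content) pairs as TAG HT CONTENT EOL lines.

    Assumes content is already fully decoded text. A tab in the tag or a
    newline in either field would break the line format, so reject those.
    """
    lines = []
    for tag, content in parts:
        if "\t" in tag or "\n" in tag or "\n" in content:
            raise ValueError("field would break the TAG HT CONTENT format")
        lines.append(f"{tag}\t{content}\n")
    return "".join(lines)


# One recipient, broken out into its structured parts.
print(emit_records([
    ("addr", "nmh-workers@example.org"),
    ("phrase", "NMH Workers"),
    ("comment", "NMH Workers Unite!"),
]), end="")
```

Because the tag never contains a tab, something like cut -f2- or awk -F'\t' can pull the content back out even when the content itself contains tabs.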

Looking at this I might have answered my own question, at least partially for 
the To: case.  Each recipient in the To: header would print its own 
addr/phrase/comment tuple.  You could separate recipients in the header by 
blank lines to let awk figure out the individual recipients.  But how do you 
deal with 

  mhparse to first-last

??
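
For the blank-line idea, the reading side might look like the sketch below, with Python standing in for awk's paragraph mode. The name parse_recipients is hypothetical, the addresses are placeholders, and it assumes each tag appears at most once per recipient. It does nothing to answer the first-last question.

```python
def parse_recipients(text):
    """Group TAG HT CONTENT lines into one dict per recipient.

    Recipients are assumed to be separated by blank lines. Only the first
    tab on a line is treated as the separator, so content may contain tabs.
    """
    recipients = []
    current = {}
    for line in text.split("\n"):
        if not line:
            # A blank line closes the current recipient, if any.
            if current:
                recipients.append(current)
                current = {}
            continue
        tag, _, content = line.partition("\t")
        current[tag] = content
    if current:
        recipients.append(current)
    return recipients


sample = (
    "addr\tone@example.org\n"
    "phrase\tFirst Person\n"
    "\n"
    "addr\ttwo@example.org\n"
)
for number, recipient in enumerate(parse_recipients(sample), start=1):
    print(number, recipient.get("addr"))
```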

--lyndon

