
Re: [Nmh-workers] nmh internals: full MIME integration

2014-07-26 05:12:29
Hi Ken,

Right now a call to the MIME parsing routines ends up slurping in the
whole message, but that's not desirable for a lot of programs (scan,
pick).  It seems like parsing all of the message's headers is generally
worthwhile; that (usually) fits within a single stdio buffer, so doing
extra work there shouldn't be a huge problem.

If we're having lazy evaluation of MIME parts, which is good, can it
also cover the headers?  `pick --list-id <foo@bar.com>' isn't concerned
with decoding Subject and all those Received headers.  It may not sound
like much, but we have folders with tens of thousands of emails.
get_header() could note minimal details of each header it comes across
whilst searching for the List-ID but not bother too much about their
contents.
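
To make "note minimal details" concrete, here is a rough sketch of what I
have in mind.  It is not nmh code: struct hdr_span and index_headers() are
names invented for illustration, and it assumes the message is already in
memory with bare LF line endings.  It records only where each field's name
and raw value sit, and decodes nothing.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record: offsets into the message buffer, nothing decoded. */
struct hdr_span {
    size_t name_off, name_len;   /* field name, without the colon */
    size_t val_off, val_len;     /* raw value, leading space and folds included */
};

/* Walk the header block of `msg` and note where each field lives.
 * Stops at the blank line that ends the headers, or when `max` fields
 * have been recorded.  Returns the number of fields recorded. */
size_t
index_headers(const char *msg, struct hdr_span *out, size_t max)
{
    size_t n = 0;
    const char *p = msg;

    while (*p != '\0' && *p != '\n') {
        const char *eol = strchr(p, '\n');
        if (eol == NULL)
            eol = p + strlen(p);

        if ((*p == ' ' || *p == '\t') && n > 0) {
            /* Folded continuation line: stretch the previous value over it. */
            out[n - 1].val_len = (size_t)(eol - msg) - out[n - 1].val_off;
        } else {
            const char *colon = memchr(p, ':', (size_t)(eol - p));
            if (n == max)
                break;
            if (colon != NULL) {
                out[n].name_off = (size_t)(p - msg);
                out[n].name_len = (size_t)(colon - p);
                out[n].val_off = (size_t)(colon + 1 - msg);
                out[n].val_len = (size_t)(eol - (colon + 1));
                n++;
            }
        }
        p = (*eol == '\n') ? eol + 1 : eol;
    }
    return n;
}
```

A caller after List-ID would compare names against the recorded spans and
only unfold or decode the one value it cares about; the rest stay as
offsets.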

Also, http://www.ietf.org/rfc/rfc2919.txt says only one List-ID per
email;  does nmh have knowledge of one-off headers so it can stop
reading headers on the first match?  The pick example above uses the
`--' form because pick doesn't know of List-ID, unlike, say, Subject;
perhaps it needs to know of more official headers so it can make use of
one-off-ness.
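
Something like the following is what I mean by one-off-ness; treat it as a
sketch under assumptions rather than a patch.  single_headers[],
is_single_header() and find_first_header() are made-up names, the table is
only a sample of fields their RFCs limit to one occurrence, and the scan
assumes an in-memory message with LF line endings and ignores folded
continuation lines in the returned value.

```c
#include <stddef.h>
#include <string.h>
#include <strings.h>   /* strcasecmp, strncasecmp (POSIX) */

/* Hypothetical sample of fields that may appear at most once. */
static const char *const single_headers[] = {
    "List-ID", "Subject", "Date", "Message-ID", NULL
};

/* Non-zero if `name` is in the one-off table (case-insensitive). */
int
is_single_header(const char *name)
{
    size_t i;

    for (i = 0; single_headers[i] != NULL; i++)
        if (strcasecmp(name, single_headers[i]) == 0)
            return 1;
    return 0;
}

/* Return a pointer to the raw value of the first `name` field, storing
 * its length in *vlen, or NULL if the header block ends first.
 * *scanned counts the header lines examined before stopping. */
const char *
find_first_header(const char *msg, const char *name, size_t *vlen, int *scanned)
{
    size_t nlen = strlen(name);
    const char *p = msg;

    *scanned = 0;
    while (*p != '\0' && *p != '\n') {      /* blank line ends the headers */
        const char *eol = strchr(p, '\n');
        if (eol == NULL)
            eol = p + strlen(p);
        (*scanned)++;

        if ((size_t)(eol - p) > nlen && strncasecmp(p, name, nlen) == 0) {
            const char *q = p + nlen;
            while (q < eol && (*q == ' ' || *q == '\t'))
                q++;
            if (q < eol && *q == ':') {
                q++;
                while (q < eol && (*q == ' ' || *q == '\t'))
                    q++;
                *vlen = (size_t)(eol - q);
                return q;                   /* matched: read no further */
            }
        }
        p = (*eol == '\n') ? eol + 1 : eol;
    }
    return NULL;
}
```

When is_single_header() says yes, the first hit from find_first_header()
is conclusive and the remaining headers need never be looked at; for
repeatable fields such as Received the caller would have to keep going.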

Whilst looking at pick's source, I found MHPDEBUG;  I don't think it's
documented but could be useful for those learning pick?  Perhaps it
should be -debug instead?

    $ MHPDEBUG=x pick -from tom -and -lbr --list-id foo -o -sub Foo -rbr .
    AND
    | PATTERN(header) ^from[        ]*:.*tom
    | OR
    | | PATTERN(header) ^list-id[   ]*:.*foo
    | | PATTERN(header) ^subject[   ]*:.*Foo
    pick: no messages match specification
    $

Also-also, access to raw and decoded headers would be nice, e.g. I
sometimes want to find Subjects that have `=?utf-8?' in them.

The Content struct would be extended to indicate whether or not the
complete message had been parsed; programs that just needed to examine
the header would simply parse out those headers in the message.
Because address parsing is common, we could parse out all of the
addresses as well during header reading.  We could also maintain a
list of headers that contain addresses (right now each program has to
keep that list locally) and make a function/macro to query that.
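
For reference, the shared list and query function could be as small as the
sketch below.  The names address_headers[] and is_address_header() are
invented here, and the table is just the address-bearing fields from
RFC 5322, not whatever each nmh program currently keeps.

```c
#include <stddef.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Hypothetical shared table of fields whose values are address lists. */
static const char *const address_headers[] = {
    "From", "Sender", "Reply-To", "To", "Cc", "Bcc",
    "Resent-From", "Resent-Sender", "Resent-To", "Resent-Cc", "Resent-Bcc",
    NULL
};

/* Non-zero if `name` is a field that carries addresses. */
int
is_address_header(const char *name)
{
    size_t i;

    for (i = 0; address_headers[i] != NULL; i++)
        if (strcasecmp(name, address_headers[i]) == 0)
            return 1;
    return 0;
}
```

A header reader could call is_address_header() to decide whether a field
is a candidate for address parsing at all, and still defer the parsing
itself until something asks for the addresses.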

That's the kind of overhead that would be nice to see done only on
demand.

Cheers, Ralph.

