
Re: [Nmh-workers] some indexing results

2011-02-07 14:44:11
From: Valdis(_dot_)Kletnieks(_at_)vt(_dot_)edu
Date: Mon, 07 Feb 2011 09:01:15 -0500

On Mon, 07 Feb 2011 08:54:13 GMT, Peter Maydell said:

Dunno if 100 chars will be enough if and when we finally add
enough MIME support for scan to do something sensible with
MIME-encoded bodies (i.e., print the start of the text/plain bit).

Keep in mind that there isn't any requirement that the
first bodypart be a text/plain.  It's often a text/html and you
need to go scanning another 2-3K into the <censored> thing
before you find stuff that's not markup.  Exchange in particular
seems enamored of sending 10-12K of inline CSS for a 4-5 word
message.
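
To make the point above concrete, here is a minimal sketch of how the "first few words" of a message might be found. It is written in Python with the standard email and html.parser modules, not nmh's C code. The names first_words, _TextOnly and _decoded are hypothetical, and the 100-character limit is only the figure mentioned earlier in the thread. It prefers the first text/plain part and otherwise strips markup (including inline CSS) from the first text/html part.

```python
from email import message_from_string
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect character data, skipping <style> and <script> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0  # depth inside style/script elements

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)


def _decoded(part):
    """Undo the transfer encoding and decode using the declared charset."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:  # unknown charset name in the header
        return payload.decode("utf-8", errors="replace")


def first_words(msg, limit=100):
    """Return up to `limit` characters of readable body text (hypothetical helper)."""
    html_fallback = None
    for part in msg.walk():
        if part.is_multipart():
            continue
        ctype = part.get_content_type()
        if ctype == "text/plain":
            return " ".join(_decoded(part).split())[:limit]
        if ctype == "text/html" and html_fallback is None:
            html_fallback = part
    if html_fallback is None:
        return ""
    parser = _TextOnly()
    parser.feed(_decoded(html_fallback))
    parser.close()
    return " ".join(" ".join(parser.chunks).split())[:limit]
```

This sketch does not distinguish inline parts from text attachments, and it reads the whole HTML part rather than stopping after the first few words, so a real implementation would probably want both refinements.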

my theory on this is, MH's mime awareness is better than it was but
still nowhere near good enough.  for example:

1. mhshow should not exist; we should merge its functionality into show.
2. mhstore should not exist; we should merge its functionality into burst.
3. text/plain with long lines should be word wrapped (via "fmt" or similar).
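
As a rough illustration of item 3, the sketch below wraps only the over-long lines of a text/plain body, using Python's standard textwrap module as a stand-in for "fmt". The function name wrap_plain and the 72-column default are assumptions, not anything nmh defines.

```python
import textwrap


def wrap_plain(text, width=72):
    """Wrap lines longer than `width`; leave shorter lines untouched."""
    out = []
    for line in text.splitlines():
        if len(line) <= width:
            out.append(line)
            continue
        # Do not split long words (for example URLs) or break at hyphens.
        wrapped = textwrap.wrap(
            line, width=width, break_long_words=False, break_on_hyphens=False
        )
        out.extend(wrapped or [""])  # a long all-whitespace line becomes empty
    return "\n".join(out)
```

Because long words are never split, a single token longer than the width (a URL, say) still produces an over-long line, which seems preferable to breaking it.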

i'm not currently scheduling time to work on those things. but if i do end
up integrating the index stuff such that i have to rototill the internal
interfaces used by scan and pick to be more opaque and less "FILE *" based,
i'll make every effort to make it possible to parse the MIME to find the
"first few words" needed by scan. once i have that logic, i'll use it
when building the index, so that no MIME decoding will have to be done on
the output from the database. that's not because i worry about the processing
time, but because storing an extra couple hundred or thousand bytes of
boilerplate MIME headers per message would really hurt the disk cache
locality and blow out the size of the sleepycat *.db files.
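
The idea of decoding once at index-build time can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's standard dbm.dumb module as a stand-in for the sleepycat (Berkeley DB) files mentioned above, and the key format, the record layout and the names index_record, read_record and roundtrip are all hypothetical.

```python
import dbm.dumb
import os
import tempfile


def index_record(subject, snippet, limit=100):
    """Pack only what scan would display: subject plus a short decoded snippet."""
    return (subject + "\x00" + snippet[:limit]).encode("utf-8")


def read_record(raw):
    """Unpack a record; no MIME parsing is needed at this point."""
    subject, _, snippet = raw.decode("utf-8").partition("\x00")
    return subject, snippet


def roundtrip(key, subject, snippet):
    """Store one record in a throwaway database and read it back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scanindex")  # hypothetical index file name
        with dbm.dumb.open(path, "c") as db:
            db[key] = index_record(subject, snippet)
        with dbm.dumb.open(path, "r") as db:
            return read_record(db[key])
```

Each stored value is bounded by the subject length plus the snippet limit, so the record size does not grow with the amount of MIME boilerplate in the original message.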

