nmh-workers

Re: [Nmh-workers] Maybe time for a new release?

2016-03-08 20:20:12
> in the past, a simple line oriented UNIX utility such as "grep -r" could
> be used on the mhdir, with good results. with MIME that's no longer
> true, either because of quoted-printable, base64, or nested objects.
>
> one solution proposed for this was to add some new helper tools similar
> to find(1) and/or xargs(1) which could execute line-oriented tools
> within the context of defanged and decoded content.
>
> another solution proposed would write defanged/decoded content into
> files that are stored alongside the original (as received in SMTP)
> content. some new option to "folder" or "sortm" could generate these for
> older mhdirs, whereas inc and rcvstore would just write both forms.
>
> i think i favoured both approaches :-). was there an end to the argument?

Well ... no?  I mean, I didn't view it as an argument, myself.  But let
me be clear: nothing I am planning on doing involves changing the MH
store.  Nothing I am planning on doing would PRECLUDE that; it's just
that I don't view it as part of my scope.

Let me explain my larger view.  I know lots of people still want to be
able to use Unix text processing utilities on a MH store.  But I hate
to be the one who has to explain this ... that hasn't been a realistic
goal since the advent of MIME.  The model that "email is text" just
isn't valid anymore.

Now, I know you and most everybody _know_ this, Paul ... but you haven't
really accepted what it means.  What it means to me is that you can't
really expect to use Unix text processing tools on RFC 5322-format files;
they don't meet the definition of "text" by any stretch of the imagination,
and different parts of the file can have different encoding schemes and
character sets (and parts aren't even guaranteed to contain human-readable
text).  So to me, trying to use Unix text processing tools on RFC 5322-format
files in 2016 is simply a fool's errand.  Yeah, it may work fine ...
but you're guaranteed to have it not work at some point.  And we can't
really fix that without breaking a long-standing guarantee.  I know, I
know ... it was written on stone tablets, brought down from Mountain View
by Marshall Rose at the dawn of the First Age, that "MH files are text
files!".  Those tablets were pretty much smashed by Nathaniel Borenstein
in the Second Age.  (I think now we're in the Third Age).
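
To make the failure mode concrete, here is a minimal sketch in Python (not
nmh code; the sample message and the function names are made up for
illustration).  It assumes a message whose body was stored with a base64
Content-Transfer-Encoding.  A byte-level search of the stored file, which
is all grep can do, misses a phrase that a search of the decoded body finds.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_stored_message(body: str) -> bytes:
    """Return the on-disk bytes of an illustrative message with a base64 body."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"
    msg["To"] = "recipient@example.org"
    msg["Subject"] = "status"
    # Force base64 so the body is not readable in the stored file.
    msg.set_content(body, cte="base64")
    return msg.as_bytes()


def raw_search(stored: bytes, phrase: str) -> bool:
    """What a grep-style tool sees: the undecoded bytes of the file."""
    return phrase.encode("utf-8") in stored


def decoded_search(stored: bytes, phrase: str) -> bool:
    """Parse the message and search the decoded text body instead."""
    msg = message_from_bytes(stored, policy=policy.default)
    return phrase in msg.get_content()


stored = build_stored_message("The quarterly report is attached.\n")
print("raw search finds it:    ", raw_search(stored, "quarterly report"))
print("decoded search finds it:", decoded_search(stored, "quarterly report"))
```

The same thing happens with quoted-printable whenever the phrase contains a
non-ASCII character or straddles a soft line break.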

Slightly more seriously ... email != text, that's pretty much a given
today.  If we want to have the MH store consist of "email", then it can't
be text.  If we want to change the concept of what an MH store is, then
that would break a lot of third-party tools.

So, getting around to my larger point ... my goal here is to make it
so all of the MIME tools natively deal with MIME.  That means the
standard internal API is MIME-aware, things like %{body} in scan(1) are
MIME-aware (it could be the decoded version of the first text part), and
"pick" would know how to search inside a MIME-encoded text part after
converting everything to the native character set.  You get the idea.
This requires a new API, parsing routines, etc etc ... that's the part
I'd like to work on.
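
As a rough illustration of what "MIME-aware" could mean for a %{body}-style
lookup and a pick-style search, here is a sketch using Python's standard
email package.  It is not the proposed nmh API, which would be C and does
not exist yet; first_text_body, message_matches, and the sample message are
all hypothetical names chosen for this example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def build_sample() -> bytes:
    """Illustrative message: a Latin-1 quoted-printable text part plus a binary part."""
    msg = EmailMessage()
    msg["Subject"] = "sample"
    msg.set_content("Gr\u00fc\u00dfe aus Z\u00fcrich\n",
                    charset="iso-8859-1", cte="quoted-printable")
    msg.add_attachment(b"\x00\x01\x02", maintype="application",
                       subtype="octet-stream", filename="blob.bin")
    return msg.as_bytes()


def first_text_body(msg: EmailMessage) -> str:
    """Decoded text of the first text/plain part, or '' if there is none."""
    part = msg.get_body(preferencelist=("plain",))
    return part.get_content() if part is not None else ""


def message_matches(msg: EmailMessage, needle: str) -> bool:
    """pick-like search: look inside every text part after decoding it to str."""
    for part in msg.walk():
        if part.get_content_maintype() == "text" and needle in part.get_content():
            return True
    return False


stored = build_sample()
parsed = message_from_bytes(stored, policy=policy.default)
print(first_text_body(parsed).strip())
print(message_matches(parsed, "Z\u00fcrich"))
```

Here get_content() undoes the transfer encoding and converts from the part's
declared character set, so the caller only ever compares ordinary strings.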

Now people have suggested doing something like storing each MIME part in
its own file, somewhere in a directory corresponding to each message.
I'm not exactly OPPOSED to that, per se ... I think it would be a lot of
work for relatively little gain, when (for instance) if I got the "new
MIME" code working, you could search through email fine with pick(1).  I
don't recall a suggestion about helper tools a la find(1) and/or xargs,
but that could just be my memory going; I'm not sure how that would
work.  I think we need to do the "new MIME" architecture to make things
better in the long run, so I view this as orthogonal to having a Unix
text processing-friendly store.  Since I personally view the text
processing-friendly store as redundant, it's not something I want to spend
my free time on.  I think the "new MIME" interface would make doing that
easier, as the code to extract those files would be simpler.
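
For a sense of how small the extraction step becomes once a MIME-aware
parser is available, here is a sketch in Python of writing each decoded
part into a per-message directory.  The layout (numbered files, .txt for
text parts, .bin for everything else) and the name explode_message are
assumptions made up for this example; nobody has specified an actual
format.

```python
from email.message import EmailMessage
from pathlib import Path
from typing import List


def explode_message(msg: EmailMessage, outdir: Path) -> List[Path]:
    """Write every non-multipart part of msg, decoded, into outdir.

    Text parts are converted to UTF-8 so line-oriented tools can read them;
    other parts are written as their decoded bytes.
    """
    outdir.mkdir(parents=True, exist_ok=True)
    written = []
    # Skip multipart containers; only leaf parts carry content.
    leaf_parts = (p for p in msg.walk() if not p.is_multipart())
    for index, part in enumerate(leaf_parts, start=1):
        if part.get_content_maintype() == "text":
            path = outdir / f"{index}.txt"
            path.write_text(part.get_content(), encoding="utf-8")
        else:
            path = outdir / f"{index}.bin"
            path.write_bytes(part.get_payload(decode=True) or b"")
        written.append(path)
    return written
```

With files like these sitting next to the original message, "grep -r" over
the .txt files would work again, at the cost of storing the content twice.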

So, I guess the TL;DR answer is:

- There were lots of ideas, but nothing concrete in terms of people
  volunteering to write code.
- I wasn't planning on doing anything like that as part of my work.
  My work wouldn't stop that and might make it easier.

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
