nmh-workers

Re: [Nmh-workers] I need to learn more about MIME

2014-09-12 13:40:26
I have scanned all of and have read and understood much of RFC 2045 and RFC
2046; though how much I will retain is problematic. The going was slow because
of a character flaw that limits the amount of pain I can endure each day.
Also, I took two days off to write a program to decode quoted-printable, as a
form of occupational therapy (Yeah, I know, Perl's MIME::QuotedPrint does
that.)

Wow, that's some dedication!  Congrats!

RFC 2045 seems to require that mail lines have Microsoft line endings, but
fetchmail delivers mail with UNIX line endings. Did they arrive here with
Microsoft line endings?

So ... this is kind of an issue that gets glossed over.  It's subtle as
to what's going on here.

As you and everyone else know, text files on Unix have lines that simply
end in a LF.  All of the email RFCs (including non-MIME RFCs like RFC
822 and earlier) specify that lines end with CR LF.  The convention has
always been that when you exchange email with other systems you make
sure that lines end with CR LF but you convert to the local line convention
for local storage.

Mostly nmh doesn't have to deal with this; email is treated as
traditional Unix text files with just LF line endings, and mail in a
spool file already has a LF line ending; whatever put it there stripped
out the CRs for us.  But there are some important exceptions:

- Retrieving email via POP and sending email via SMTP; both of these
  protocols specify CR LF line endings, so nmh has to strip/append CRs
  where appropriate.
- Dealing with text content encoded with base64; that specifically has
  to have CR LF for a line separator.  We used to get that wrong, but
  that's been fixed for 1.6.
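
To make the strip/append step concrete, here is a minimal sketch of the two conversions. It is not nmh's actual code; the function names lf_to_crlf() and crlf_to_lf() are invented for illustration, and the sketch assumes the whole message fits in a NUL-terminated buffer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Convert bare LF to CR LF (wire format for SMTP/POP).
 * Returns a malloc'd NUL-terminated string the caller must free,
 * or NULL on allocation failure.  A LF already preceded by a CR
 * is left alone so the conversion is safe to apply twice. */
char *lf_to_crlf(const char *in)
{
    size_t n = strlen(in);
    /* Worst case: every byte is a bare LF, doubling the length. */
    char *out = malloc(2 * n + 1);
    char *p = out;
    char prev = '\0';

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        if (in[i] == '\n' && prev != '\r')
            *p++ = '\r';
        *p++ = in[i];
        prev = in[i];
    }
    *p = '\0';
    return out;
}

/* Convert CR LF to LF in place (local Unix storage format).
 * A CR that is not followed by a LF is kept as-is. */
void crlf_to_lf(char *s)
{
    char *dst = s;

    for (char *src = s; *src != '\0'; src++) {
        if (src[0] == '\r' && src[1] == '\n')
            continue;   /* drop the CR; the LF is copied next time round */
        *dst++ = *src;
    }
    *dst = '\0';
}
```

A caller would run lf_to_crlf() on a message just before handing it to SMTP, and crlf_to_lf() on data read from POP before writing it to a local folder.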

Is it true that, possibly apart from mhbuild, the nmh user does
not have explicit control of the content type of a message?

Well, that's a pretty big exception; with mhbuild you can literally create
any MIME message you can imagine.

Based on my experimentation, nmh must have a fairly complex heuristic
for determining content type. For example, I attached a file named
"Chart.pj", that is, a file whose suffix was not .java. But human
intelligence reveals it to be a Java source file. But nmh knew it was a
java file; nmh set:

      Content-Type: text/x-java; charset="us-ascii"; name="Chart.pj"

Is there any documentation of that heuristic, maybe in source code
comments?

We kind of skip over the details here.  The specifics are that we
call a function called mime_type(), in sbr/mime_type.c.  If you have
a 'file' command that supports the --mime-type option, it will use that.
If that doesn't work or doesn't exist, we use entries in the user's or
system nmh profile (mhn.defaults).  So if you have:

mhshow-suffix-text/x-java: .pj

That would cause that to happen.
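
The suffix fallback amounts to a table lookup. The following is only a sketch of that idea, not the real mime_type() from sbr/mime_type.c: the table, the struct, and the name type_from_suffix() are all hypothetical, the table stands in for mhshow-suffix- profile entries, and exact (case-sensitive) matching is an assumption.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for mhshow-suffix-<type> profile entries. */
struct suffix_entry {
    const char *suffix;   /* including the leading dot */
    const char *type;     /* MIME content type */
};

static const struct suffix_entry suffix_table[] = {
    { ".pj",  "text/x-java" },   /* the entry from the example above */
    { ".txt", "text/plain"  },   /* illustrative only */
};

/* Return the content type for a filename based on its last suffix,
 * or NULL if the name has no suffix or no table entry matches. */
const char *type_from_suffix(const char *filename)
{
    const char *dot = strrchr(filename, '.');

    if (dot == NULL)
        return NULL;
    for (size_t i = 0; i < sizeof suffix_table / sizeof suffix_table[0]; i++) {
        if (strcmp(dot, suffix_table[i].suffix) == 0)
            return suffix_table[i].type;
    }
    return NULL;
}
```

With this table, type_from_suffix("Chart.pj") yields "text/x-java", while a name with no suffix yields NULL and the caller would need some other default.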

The above referenced file, Chart.pj, contains only ASCII characters,
and has no line longer than 63 characters. Why did nmh use a
Content-Transfer-Encoding of quoted-printable instead of text/plain?

That decision is made in scan_content() in uip/mhbuildsbr.c.  Well, let
me rephrase that.  For MIME _bodies_, that's where the decision is made.
For headers that happens in sbr/encode_rfc2047.c.

The exact decision is here:

        int wants_q_p = (containsnul || linelen || linespace || checksw);

        containsnul means the content contains at least one byte of '0x00'.
        linelen means 'longer than maxunencoded', which defaults to 78.
        linespace means blanks at the end of the line.
        checksw means you're generating an MD5 checksum (honestly, we
        should get rid of that code).

I suspect you're running afoul of the 'linespace' test.
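
For anyone who wants to check a file against those tests by hand, here is a rough sketch of how the flags could be computed. It is not the real scan_content(); struct scan_result and scan_body() are invented names, the function wants_q_p() just mirrors the expression quoted above, and treating both space and tab as a trailing blank is an assumption.

```c
#include <stddef.h>

/* Flags modeled on the variables in the expression quoted above. */
struct scan_result {
    int containsnul;   /* at least one 0x00 byte */
    int linelen;       /* some line longer than maxunencoded */
    int linespace;     /* some line ends in a blank (space or tab) */
};

/* Scan a body with LF line endings and record which tests trip. */
struct scan_result scan_body(const char *buf, size_t len, size_t maxunencoded)
{
    struct scan_result r = { 0, 0, 0 };
    size_t cur = 0;      /* length of the current line so far */
    char last = '\0';    /* last byte seen on the current line */

    for (size_t i = 0; i < len; i++) {
        char c = buf[i];

        if (c == '\0')
            r.containsnul = 1;
        if (c == '\n') {
            if (cur > 0 && (last == ' ' || last == '\t'))
                r.linespace = 1;
            cur = 0;
            last = '\0';
            continue;
        }
        cur++;
        if (cur > maxunencoded)
            r.linelen = 1;
        last = c;
    }
    /* A final line with no terminating LF still counts. */
    if (cur > 0 && (last == ' ' || last == '\t'))
        r.linespace = 1;
    return r;
}

/* Same shape as the decision quoted above. */
int wants_q_p(struct scan_result r, int checksw)
{
    return r.containsnul || r.linelen || r.linespace || checksw;
}
```

Running scan_body() over the contents of a file like Chart.pj with maxunencoded set to 78 and finding linespace set would point at a trailing blank on some line as the trigger.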

When I send a test message with non-ASCII characters sometimes nmh uses
a Content-Transfer-Encoding of quoted-printable and sometimes base64. I
can't predict which it will use.

For MIME text parts, nmh will never pick base64 (by my reading of the code).
For 8-bit headers, nmh will pick the shorter encoding (this is mentioned
on the mhbuild man page).  If you can find a counter-example I would love to
see it!
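
The "shorter encoding" rule for headers can be sketched by comparing payload lengths for the RFC 2047 "Q" and "B" encodings. This is an illustration only, not the logic in sbr/encode_rfc2047.c: the function names are invented, the set of characters passed through unencoded in "Q" (ASCII letters, digits, and space written as '_') is a deliberately conservative assumption, and breaking ties in favor of "Q" is also an assumption.

```c
#include <ctype.h>
#include <stddef.h>

/* Payload length if n bytes were "Q" encoded (RFC 2047).
 * Assumption: only ASCII letters and digits pass through unchanged,
 * a space becomes '_', and every other byte becomes =XX. */
size_t q_encoded_len(const unsigned char *s, size_t n)
{
    size_t len = 0;

    for (size_t i = 0; i < n; i++) {
        if (s[i] < 0x80 && (isalnum(s[i]) || s[i] == ' '))
            len += 1;
        else
            len += 3;
    }
    return len;
}

/* Payload length if n bytes were "B" (base64) encoded:
 * every group of up to 3 input bytes becomes 4 output characters. */
size_t b_encoded_len(size_t n)
{
    return 4 * ((n + 2) / 3);
}

/* Return 'Q' or 'B', whichever gives the shorter payload.
 * Ties go to 'Q' here because it stays more readable. */
char pick_header_encoding(const unsigned char *s, size_t n)
{
    return q_encoded_len(s, n) <= b_encoded_len(n) ? 'Q' : 'B';
}
```

Under these assumptions a mostly-ASCII header with one accented character comes out as "Q", while a header made up entirely of multi-byte UTF-8 characters comes out as "B", which would explain seeing both encodings in test messages.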

Is it correct that nmh does not support the partial subtype, described
in RFC 2046 Section 5.2.2?

Actually, that is false; nmh (and MH) do support the partial subtype!
Although I doubt it's been tested lately.  There's a whole section about
message/partial in the mhstore man page.  The -split option to send(1)
is likewise untested.  Before you ask, I'm not sure how big the split
size is.  And I have a sneaking suspicion that few MUAs understand
message/partial at this point.

--Ken

_______________________________________________
Nmh-workers mailing list
Nmh-workers(_at_)nongnu(_dot_)org
https://lists.nongnu.org/mailman/listinfo/nmh-workers
