procmail

Re: Bart to the rescue (was: Re: What am I doing wrong? (was: Re: why won't it match something in the body for me?

2002-10-12 19:35:45
daniel lance herrick wrote:
Bart wrote:
Procmail is expecting to get the messages with unix line termination, e.g.
just "\n" at the end of each line.

I'm not sure about that, it would seem to expect the LF... but read on.

There's the explanation. The file I dumped with the perl filter had a
whole lot of ^M-s in it when I looked at it with less.
[...]
I thought RFC822, dating back to the days of Western Union TeleType
terminals, specified <cr><lf> as the line separator in e-mail. Am I just
wrong about that? Or is something expected to convert line-ends to unix
conventions before the message gets delivered?

I think you're basically correct.

From RFC2822 (supersedes 822):

   The body of a message is simply lines of US-ASCII characters.  The
   only two limitations on the body are as follows:

   - CR and LF MUST only occur together as CRLF; they MUST NOT appear
     independently in the body.

   - Lines of characters in the body MUST be limited to 998 characters,
     and SHOULD be limited to 78 characters, excluding the CRLF.
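
As a rough illustration of those two limitations, here is a minimal Python sketch that checks a message body as it would appear on the wire. The name body_violations is hypothetical and belongs to no mail library; the check covers only the MUST limits quoted above, not the 78-character SHOULD.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def body_violations(body: bytes) -> list:
    """Return descriptions of the quoted MUST limits that `body` breaks.

    Hypothetical helper: `body` is assumed to be the raw wire-format body.
    """
    problems = []
    if BARE_CR_OR_LF.search(body):
        problems.append("bare CR or LF")
    # Lines are delimited by CRLF; the length limit excludes the CRLF itself.
    for number, line in enumerate(body.split(b"\r\n"), start=1):
        if len(line) > 998:
            problems.append(f"line {number} exceeds 998 characters")
    return problems
```

A body dumped from a unix mailbox (LF only) would be reported as having a bare LF, which is consistent with the text having been converted to the local format after delivery.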


I'm picking the mail up with fetchmail from an IMAP server on my ISP
(solaris the last time I had reason to know), I sent it there with pine
(linux) connecting to port 25 at the ISP. After whatever processing he
does, he used procmail as the local delivery agent and I ran it through a
big .procmailrc there [...]
If
they're not required by the protocol, who put those ^M-s in the message?
For that matter, those ^M-s are not in the mailbox files, so why does the
procmail delivery agent strip them after screening the message instead of
stripping them before screening the message?

If it is instead getting lines with dos/windows termination "\r\n", it
will think the entire message is the header, because it will never find
the sequence "\n\n" which means end of header/start of body.

No. According to the RFCs, that's CRLFCRLF. LFs or CRs MUST NOT appear
independently in the body... or anywhere. In fact, although sendmail
doesn't care, qmail gets downright petulant about it. :-\
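
To make the difference concrete, here is a small Python sketch of a parser that looks for the blank line between header and body. It only illustrates the behaviour Bart describes; it is not procmail's actual code, and split_header and the sample message are made up for the example.

```python
def split_header(message: bytes, separator: bytes):
    """Split `message` at the first occurrence of `separator`.

    Returns (header, body). If the separator never occurs, the whole
    message is treated as header and the body is empty.
    """
    header, found, body = message.partition(separator)
    return (header, body) if found else (message, b"")


# A made-up message in wire format, i.e. with CRLF line endings.
wire = b"Subject: test\r\nFrom: a@example.com\r\n\r\nbody line\r\n"

# Looking for the unix blank line fails: everything ends up in the header.
unix_header, unix_body = split_header(wire, b"\n\n")

# Looking for CRLFCRLF finds the boundary.
wire_header, wire_body = split_header(wire, b"\r\n\r\n")
```

With the sample above, unix_body comes back empty while wire_body holds the body line, so a recipe that matches on the body would have nothing to match against in the first case.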

Plaintext is always converted to the local format... or should be. In other
words the line terminators should be changed to a plain LF on a unix
system, or a plain CR on a Mac. But what "goes over the wire" has to have
CRLF... you are correct in that, brother dinosaur.
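
A minimal Python sketch of that conversion, assuming the whole message is held in memory as bytes. The function names to_local and to_wire are hypothetical, and real MTAs and MUAs do this in their own ways.

```python
def to_local(wire_text: bytes, local_eol: bytes = b"\n") -> bytes:
    """Convert wire-format text (CRLF) to the local line terminator.

    The default is LF for unix; pass b"\r" for the classic Mac convention.
    """
    return wire_text.replace(b"\r\n", local_eol)


def to_wire(local_text: bytes) -> bytes:
    """Convert unix-style text (LF) to wire format (CRLF).

    Any CRLF already present is first collapsed to LF so that it is not
    turned into CR CR LF by the second replacement.
    """
    return local_text.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")
```

Note that to_wire deliberately tolerates input with mixed endings, which is the kind of "text file" that can otherwise go out only partly converted.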

Thank-you, Bart. You identified exactly what is happening. I wonder how
many years it would have taken me to solve that one.

I think Bart's on the right track: since procmail is running on a unix
system, it is indeed expecting to see unix line termination... so why isn't
that happening?

It could be the way the message is being sent, as well as the way it is
being received. That's where I discovered qmail's propensity for pedantry:
I had a "text file" which was not entirely encoded according to the line
termination semantics of the host platform... and hence was not properly
"converted" by the MUA on the way out.

--

Fred Morris
m3047(_at_)inwa(_dot_)net

