procmail

Re: bad message id's

1998-03-19 01:49:02
When I wrote my version up, I didn't have access to the RFCs (so I
left it pretty loose).  Now my curiosity is piqued, so I'll confirm/
denounce any statements below.  :)

  * Permit empty pair of quotes (is this legal?)

Yup.

  * Allow iterations over [pseudocode] (".*"|[^"]+) with a + 
    (I take it the localpart can't be empty)

Must be one character long or more.

  * Disallow spaces and tabs after @ and require at least one char

I *think* the RFC says that spaces and tabs are allowed, if
the domain is enclosed in quotes (which strikes me as just plain
weird).  But it's also 2:30 in the morning and I'm reading RFCs,
which strikes me as equally strange.

  * Permit whitespace after closing broket

I'm pretty sure that this isn't allowed.  The RFC spells out everything
that's allowed up to the closing bracket, but doesn't explicitly say
"No, you have to stop here."  So maybe you're right.  I'm
going to leave mine without this allowance and see what happens...

Can't there be backslash-escaped quotes inside double-quoted strings?

Yup.

Any other problems?

Can also have escaped carriage returns.
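
For anyone following along, here is a rough sketch of the rules
discussed above, written as a Python regex rather than a procmail
recipe.  It is not the filter either of us is running, and it is much
looser than the full RFC grammar.  The names (MSGID_RE,
looks_like_msgid) are made up for illustration.  It assumes: a local
part made of one or more quoted strings (possibly empty, with
backslash escapes, including an escaped carriage return) or
non-special characters; no spaces or tabs after the @; at least one
character of domain; and nothing after the closing bracket.

```python
import re

# A double-quoted string: may be empty, and a backslash escapes the next
# character (with re.DOTALL that includes a carriage return or newline).
QUOTED = r'"(?:[^"\\]|\\.)*"'

# Local part: one or more quoted strings or single non-special characters.
# The outer + is what forces the local part to be non-empty.
LOCAL = rf'(?:{QUOTED}|[^"\s<>@])+'

# Domain: at least one character, no whitespace (a deliberate simplification).
DOMAIN = r'[^\s<>@]+'

MSGID_RE = re.compile(rf'<{LOCAL}@{DOMAIN}>', re.DOTALL)


def looks_like_msgid(value: str) -> bool:
    """True if the whole value matches the loose <local@domain> pattern."""
    # fullmatch rejects anything before "<" or after ">", including
    # trailing whitespace after the closing bracket.
    return MSGID_RE.fullmatch(value) is not None
```

Using fullmatch means trailing whitespace is rejected, which matches
the stricter reading; strip the value before the call if you want the
more permissive behaviour.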

Just as a data point, I have been using the "pedestrian" version of
this filter for several months (since 1997/08/06 07:15:59 sez RCS, but
that was a tweak of an earlier version) and it only caught one piece
of legit mail (and that was from a guy who was trying to write his own
MUA :-) ... but then that's mostly mail from a limited set of sources
(mailing lists, mostly Sendmail sites, mostly people using Pine, elm,
mutt, Pegasus, Eudora, etc).

I'm ashamed to say that I've been testing the filters on all incoming
mail at our site (with my boss's permission, of course), so I get
a fairly wide range of messages as my testbed (about 20,000 messages
daily).  

Surprisingly, the valid messages that I've been catching are blatantly
non-compliant, e.g.

   Message-ID: <199803112212.OAA20352>
   Message-Id: <199803112008.MAA28585>
   Message-ID: <Wed, 11 Mar 98 13:57:40 PST_16@ccm.hf.intel.com>
   Message-ID: <91168 98/03/12*vsnl.itbbombay@ems.vsnl.net.in>
   Message-ID: IN*VSNL*;9803121606581984502

The biggest problem is nonquoted spaces or the lack of a '@domain'
identifier.  Sometimes you just have to wonder what these people
were thinking when the software was written...
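
To make those two failure modes concrete, here is a small Python
sketch that sorts the sample values quoted above (written with plain
"@" and ".") by what is wrong with them.  The function name diagnose
and the wording of the reasons are invented for this example; it is a
rough diagnostic, not an RFC parser and not the filter itself.

```python
import re
from typing import List


def diagnose(msgid: str) -> List[str]:
    """Return rough reasons why a Message-ID value looks non-compliant."""
    problems = []
    if not (msgid.startswith("<") and msgid.endswith(">")):
        problems.append("no angle brackets")
    # Remove double-quoted runs first so quoted spaces are not counted.
    unquoted = re.sub(r'"(?:[^"\\]|\\.)*"', "", msgid, flags=re.DOTALL)
    if re.search(r"\s", unquoted):
        problems.append("unquoted whitespace")
    # Require an @ followed by at least one plausible domain character.
    if not re.search(r"@[^\s<>@]", unquoted):
        problems.append("no @domain")
    return problems


samples = [
    "<199803112212.OAA20352>",
    "<Wed, 11 Mar 98 13:57:40 PST_16@ccm.hf.intel.com>",
    "<91168 98/03/12*vsnl.itbbombay@ems.vsnl.net.in>",
    "IN*VSNL*;9803121606581984502",
]

for s in samples:
    print(s, "->", diagnose(s))
```

An empty list means none of these three checks fired, which is not the
same as saying the value is fully compliant.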

Chris
