procmail

Re: Implication of bad message id's (please)?

1997-12-09 15:49:57
I ask:
 a) Are there any real problems with a user sending mail, unknowingly
    containing a bad message id?
 b) How can this occur in the first place -- how could our user accidentally
    send out email with a bad message id?

This is the filter that is catching (apparently) the space in the Message-Id:
#############################################################################
# From era eriksson <era(_at_)iki(_dot_)fi> in procmail-d Digest V97 #308:
# Note: There are spaces and tabs in that filter below:
#       ^Message-Id:[SpaceTab]*<[^SpaceTab<>@]+@[^SpaceTab<>@]+>[SpaceTab]*$
:0:
* ! ^Message-Id:[     ]*<[^   <>@]+@[^   <>@]+>[         ]*$
$SPAMDIR/IN.badMessageId
##############################################################################

The message-id with a space before the closing angle bracket certainly
appears to be faulty per rfc822.

Era's recipe does not let every valid message-id through. I have been
using the following with some success. It is the source I feed to a
preprocessor, so some of the comments must be removed before it can
be used.

############################################### spam .................. MID-R -C
## rfc822 says message-id = word *("." word) "@" sub-domain *("." sub-domain) ##
## I have seen valid mail with <winATT-3.01-userid-999> as the message-id.    ##
################################################################################
  TEMPRULE                              ## clear temporary rule variable
  :0                                    ##
  * MID ?? ^^^^                         ## do we have anything saved as MID?
  { TEMPRULE=0 }                        ## no? score 0 (haven't seen it yet)
  :0 E                                  ## don't bother examining it if
  * ! RULE ?? stlth                     ## this is stealth garbage
  { :0                                  ## don't parse this without help
    * ! MID ?? ^^<([^][()<>,;:\"        ]+(\.[^][()<>,;:\"      ]+)*|\"[^"]+\")\
         @((\[.*\]|[^][()<>,;:\"        ]+)(\.(\[.*\]|[^][()<>,;:\"     ]+))*\
                  )>[   ]*^^            ## and anything else ain't up to rfc!
    { TEMPRULE=R }                      ## so tag it R for rfc violation
    :0                                  ## now, has anything funny been stuck
    * MID ?? 0000000000|2345678|$\ATPLAY   ## into the id?
    { TEMPRULE=${TEMPRULE}C }           ## score it a C for constant and boot it
  }                                     ##
  RULE="$RULE${TEMPRULE:+ MID-$TEMPRULE}"  ## add TEMPRULE to RULE
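
The tags this can leave in RULE, together with a sketch of a later
recipe that acts on them, might look like the following. The follow-up
recipe is hypothetical: the folder name is simply borrowed from the
quoted filter above, and a scoring setup would more likely weigh MID
tags together with whatever else is in RULE.

```procmail
# What the recipe above appends to RULE:
#   MID empty or missing                  -> " MID-0"
#   MID fails the rfc822 pattern          -> " MID-R"
#   MID contains one of the constants     -> " MID-C"
#   both of the previous two              -> " MID-RC"
#   MID well formed, or stealth seen      -> nothing appended

# Hypothetical follow-up: file anything tagged R, C, or RC.
:0:
* RULE ?? MID-(R|C)
$SPAMDIR/IN.badMessageId
```

Nothing is appended in the last case because TEMPRULE is still unset,
so ${TEMPRULE:+ MID-$TEMPRULE} expands to the empty string.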

A couple of notes about the recipe:

o RULE is a variable which holds the names of any other spam rules
  tripped earlier.  In particular, if an earlier rule decided that the
  stealth mailer was used, I skip the message-id check, because in my
  experience stealth always forges it.
o The message-id (if there is one) has already been extracted into MID
  at this point.
o ATPLAY is my email address on this system.
o The regexp for a valid message-id allows quoted strings in the local
  part of the address, which era's recipe does not, but rfc822 does. It
  also allows arbitrary strings between square brackets in the domain
  portion, as my reading of the definition of a domain-literal
  indicates.
o The comment in the recipe about valid mail with an invalid message-id
  just shows that there are plenty of broken mailers, as you have
  discovered. The rest should be clear.
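
The assumptions in these notes could be set up along the following
lines before the recipe runs. This is only a sketch: the address is a
placeholder, and pulling MID out with formail is one common way to do
it, not necessarily how it is done in the setup described here.

```procmail
# Sketch only; the values below are placeholders.

# Start with no spam rules tripped (a bare name unsets the variable).
RULE

# Stand-in for the local address used in the constant check.
ATPLAY="user@example.com"

# The message is fed to formail on stdin: -x extracts the named header
# field, -z trims leading and trailing whitespace from what it extracts.
# MID is left empty when there is no Message-Id: header.
MID=`formail -zxMessage-Id:`
```

With MID empty, the first test in the recipe (MID ?? ^^^^) matches and
the message is scored 0 rather than being run through the pattern.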

As to the other part of your question: I can't think of a single thing
that will break with an invalid message-id.

-- 
Rik Kabel          Old enough to be an adult              
rik(_at_)netcom(_dot_)com