procmail

Re: procmail filtering for specific attachments (M$ Word in this case)

2004-11-03 11:10:00
* Marc Feldesman <feldesmanm(_at_)gmail(_dot_)com> [2004-11-03 12:32]:

> I receive student papers weekly in email format.  The students have
> been told that the only acceptable formats for such papers are
> Microsoft Word's .doc format or the more cross-compatible .rtf.  I
> won't accept papers in any other format.  Currently I retrieve mail,
> filter on the subject line (which will always contain a keyword to
> facilitate the filtering), and then determine whether the attachment
> is in the correct format.  If the format is correct, the student
> automatically receives an email receipt telling them that I received
> their document in an apparently acceptable form.

I have a script that should serve you well.  Interestingly, I use it
for the opposite purpose: I detect Word doc attachments so that I can
reply to the sender, telling them that it's bad etiquette to assume
the recipient can read a vendor-specific format, and unreasonable to
expect the recipient to have purchased the commercial tools needed.

Anyway, you could use the same script and just write your canned
response in anti_word.msg (obviously with different wording than
mine).

Here is the script.  You'll have to fill in some blanks, like the
$MYSELF variable, which should be an expression that matches all your
email addresses.

  PRC_LIBRARY_DIR=$HOME/procmail
  TEMPDIR=$HOME/tmp
  MAILDIR=$HOME/Mail
  
  ORIG_MSG        ="$TEMPDIR/preserved_orginal.msg"
  WORD_DOC_NOTICE ="$PRC_LIBRARY_DIR/anti_word.msg"
  AUTORESP_OUTBOX ="$MAILDIR/sent-mail/autoresp"
  WORD_SENDERS    ="$MAILDIR/wordusers.cache"
  
  :0 c: captureorigmsg.lock
  * -230^0
  *   10^0 B ?? Content-Disposition: attachment
  *   10^0 B ?? (file)?name=.*\.doc
  *   10^0 B ?? >15000
  *   30^0 B ?? Content-Type: application/msword
  *$  70^0  ^TO_.*\/$MYSELF
  *$  70^0 !^X-Loop: $MATCH
  *   70^0 !^FROM_DAEMON
  | (echo "---Original Message---"; \
     formail -X "From:" \
             -X "To:" \
             -X "Subject:"; \
     echo) > $ORIG_MSG
  
  WORDSCORE=$=
  
  # Be sure to add the following line to your
  # $HOME/.mh_profile file:
  #
  # mhstore-store-text: -
  #
  # This is necessary in order for the following
  # recipe to function properly.
  #
  :0 c: captureorig.lock
  *$ $WORDSCORE^0
  | mhstore -file - -type text/plain | sed -e 's/^/> /' >> $ORIG_MSG
  
  :0 Whc: worddoc.lock
  *$ $WORDSCORE^0
  | formail -rD 8192 $WORD_SENDERS
  
    :0 ehc         # if the name was not in the cache
    | (formail -rtI"Precedence: junk" \
                 -A"X-Loop: $MATCH" \
                 -I"From: $MATCH" ; \
       cat $WORD_DOC_NOTICE ; \
       echo; \
       cat $ORIG_MSG ; rm $ORIG_MSG \
      ) | tee -a $AUTORESP_OUTBOX | $SENDMAIL -oi -t -f $MATCH
  
  WORDSCORE=-1

This script is tested, though not too thoroughly, because people know
where I stand on this issue and rarely send me Word docs.  So if
anyone sees opportunities for improvement, please post!
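
For anyone adjusting the weights in the first recipe, the scoring
arithmetic can be checked by hand.  What follows is a rough shell
sketch, not part of the script itself: word_score is a made-up
helper, each body condition is treated as simply met (1) or unmet
(0), and the three 70-point header conditions are assumed to hold.
procmailsc(5) has the exact rules; as I understand it, size
conditions such as >15000 are scored by a ratio formula rather than
a plain yes/no, so treat that term as an approximation.

```shell
#!/bin/sh
# word_score DISP NAME SIZE TYPE  (hypothetical helper, for illustration)
# Each argument is 1 if that body condition is met, 0 otherwise:
#   DISP  Content-Disposition: attachment       weight 10
#   NAME  (file)?name=.*\.doc                    weight 10
#   SIZE  body larger than 15000 bytes           weight 10 (simplified)
#   TYPE  Content-Type: application/msword       weight 30
# The three header conditions (addressed to me, no X-Loop, not from a
# daemon) are assumed met and contribute 70 each.
word_score() {
    echo $(( -230 + 10*$1 + 10*$2 + 10*$3 + 30*$4 + 70 + 70 + 70 ))
}

word_score 1 1 1 1   # every condition met             -> 40
word_score 0 0 1 1   # msword type, no filename match  -> 20
word_score 1 0 1 0   # attachment disposition only     -> 0
```

A recipe only fires when its final score is positive, so the last
case (a total of 0) would not trigger the autoresponse.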

BTW Marc- I don't have anything for RTFs, but this script could be
adapted for it.
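
Adapting it would presumably mean swapping the filename pattern and
the content type in the first recipe.  One way to try candidate
patterns before putting them in a recipe is to run them through
grep -E, since procmail's regular expressions are largely
egrep-compatible.  This is only a sketch: the sample headers are
invented, and the two patterns (including the assumption that RTF
arrives as application/rtf or text/rtf) are guesses modelled on the
.doc conditions, not something tested against real student mail.

```shell
#!/bin/sh
# Sample MIME part headers, invented for illustration.
sample='Content-Type: application/rtf; name="essay.rtf"
Content-Disposition: attachment; filename="essay.rtf"'

# Candidate patterns, modelled on the .doc conditions in the script.
name_re='(file)?name=.*\.rtf'
type_re='Content-Type: (application|text)/rtf'

# procmail matches case-insensitively by default, hence -i.
for re in "$name_re" "$type_re"; do
    if printf '%s\n' "$sample" | grep -Eiq "$re"; then
        echo "match: $re"
    else
        echo "no match: $re"
    fi
done
```

Patterns that behave as expected here could then replace the .doc
and application/msword conditions, keeping the same weights.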

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail