procmail

Re: Auto File Retrieval Recipe

1996-04-14 15:33:20
Paul O Bartlett <pobart(_at_)access(_dot_)digex(_dot_)net> writes:
...
   Because I myself am sometimes a sloppy typist, I would like to give
others the same privilege.  In particular, I want to allow more than one
space between keywords in the   Subject: get file {whatever}  line in
the requesting email header.  The recipe below successfully allows more
than one space before the "get" and "file" keywords.  However, if there
is not exactly one space between "file" and the filename, the recipe
fails and generates the following response by return email:


Your condition currently reads:
 * ^Subject: +get +file +\/[^ ]*

Procmail's regexps are slightly non-standard in that operators like *
and + normally match the _least_ possible number of times.  For normal
boolean conditions this is fine: it doesn't change the set of strings
that a condition matches, it just speeds up the matching.

However, when you start using the \/ operators and $MATCH, this doesn't
work right.  The result is that * and + operators on the _right-hand_
side of the \/ will do standard longest/greatest match, while those on
the _left-hand_ side continue to be procmail's shortest/least match.

In your case, the + on the space after 'file' will match once (the
minimum number of times), then procmail will successfully match the
"[^ ]*" zero times, and $MATCH will be empty.

The solution is to force procmail to match at least one character on the
right hand side of the \/ operator.  For instance, changing the "[^ ]*"
to "[^ ]+" will work.  procmail will be forced to find the first non-space
character after it matches "file ".

   * ^Subject: +get +file +\/[^ ]+
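
If you don't have a test mailbox handy, the effect can be imitated
outside procmail.  What follows is only an analogy in Python, not
procmail's own matcher: the lazy "+?" quantifiers stand in for the
shortest match on the left of \/, and the capture group stands in for
what ends up in $MATCH.  The sample Subject: line is made up.

```python
import re

# Hypothetical request line with two spaces before the filename.
subject = "Subject: get file  foo.txt"

# Lazy quantifiers (+?) imitate procmail's shortest match left of \/;
# the capture group imitates the text right of \/ that lands in $MATCH.
star = re.search(r"^Subject: +?get +?file +?([^ ]*)", subject)
plus = re.search(r"^Subject: +?get +?file +?([^ ]+)", subject)

# With [^ ]* the group is empty: one space is consumed, then zero
# non-space characters satisfy the pattern.
print(repr(star.group(1)))

# With [^ ]+ the matcher has to eat both spaces to reach a non-space
# character, so the group holds the filename.
print(repr(plus.group(1)))
```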



...
   Second point.  Some of the files in my ftp lib are binary, so this
recipe as it stands will not work for them.  Is there a reasonable mod
to this code so that filenames ending in .exe, .com, .zip, .gif, or
.jpg will be automatically UUENCODEd?

You can always match against the filename and go from there, but rather
than limit myself to the ending I remember to add, I'd be tempted to
cheat and use perl's -T and -B operators to make the choice.  To quote
the perlfunc manpage:

             The -T and -B switches work as follows.  The first
             block or so of the file is examined for odd
             characters such as strange control codes or
             characters with the high bit set.  If too many odd
             characters (>30%) are found, it's a -B file,
             otherwise it's a -T file.  Also, any file containing
             null in the first block is considered a binary file.
             If -T or -B is used on a filehandle, the current
             stdio buffer is examined rather than the first
             block.  Both -T and -B return TRUE on a null file,
             or a file at EOF when testing a filehandle.  Because
             you have to read a file to do the -T test, on most
             occasions you want to use a -f against the file
             first, as in next unless -f $file && -T $file.
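
For anyone who wants to see the shape of that heuristic without
reading perl's source, here is a rough Python sketch of the -T test as
the manpage describes it.  It is not perl's implementation: the block
size of 512 bytes and the set of control characters treated as
ordinary are my assumptions, and the function names are made up.

```python
def looks_text(path, block_size=512, threshold=0.30):
    """Rough analogue of perl's -T, following the perlfunc description."""
    with open(path, "rb") as fh:
        block = fh.read(block_size)   # block_size is an assumed value
    if not block:
        return True                   # an empty file passes -T (and -B)
    if b"\0" in block:
        return False                  # a null byte means binary
    # Assumption: these control codes are common enough in text to be
    # treated as ordinary; everything else below 32 or above 127 is "odd".
    ordinary = b"\b\t\n\f\r\x1b"
    odd = sum(1 for b in block if b > 127 or (b < 32 and b not in ordinary))
    return odd / len(block) <= threshold


def choose_cte(path):
    """Pick an encoding the way the recipe does when no override exists."""
    return "quoted-printable" if looks_text(path) else "base64"
```

Like the real test, this only looks at the start of the file, so a
file with a clean first block and binary data later on is still
reported as text.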

Oh, and I'd also suggest using MIME base64 encoding instead of
uuencode; it's much more portable.  I'll use the mimencode program
here.  In fact, I'll be really paranoid and do quoted-printable across
non-binary files to cover any high bit set characters.  To be supremely
flexible, if a file .meta.$FILE exists, then it is assumed to contain
other header lines for the outgoing message, possibly overriding the
determined Content-Transfer-Encoding:, Content-Type:, and
Content-Disposition:  headers.  Thus if you know that perl
misdetermines the file "foo" as non-binary just because the first block
is clean ascii, but the rest of the file is binary, you can put the line:

        Content-Transfer-Encoding: base64

in .meta.foo, and the script below will not even bother running perl
in the first place.  I'm leaving off the initial conditions and the
extraction of the filename into $MATCH, as (except for the regexp
problem from the first question) that looked fine.  Have fun!
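
The order of the decisions just described can be summarized in a few
lines of Python.  This is only a sketch of the logic, not part of the
recipe: the function and parameter names are made up, the .meta file
is assumed to have been parsed into a dict already, and the plain
equality test on "base64" is a simplification.

```python
def outgoing_headers(filename, meta, detect_text):
    """Fill in the three MIME headers, letting .meta entries win.

    meta        -- dict of header name -> value from the .meta file
    detect_text -- callable returning True for a text file; it is only
                   called when meta gives no Content-Transfer-Encoding
    """
    cte = meta.get("Content-Transfer-Encoding")
    if cte is None:
        # No override, so fall back on the text/binary guess.
        cte = "quoted-printable" if detect_text() else "base64"

    ct = meta.get("Content-Type")
    if ct is None:
        ct = "application/octet-stream" if cte == "base64" else "text/plain"

    cd = meta.get("Content-Disposition")
    if cd is None:
        cd = f"attachment; filename={filename}" if cte == "base64" else "inline"

    return {
        "Content-Transfer-Encoding": cte,
        "Content-Type": ct,
        "Content-Disposition": cd,
    }
```

Passing the detector as a callable mirrors the point made above: when
the .meta file already names an encoding, the detection step is never
run.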

Philip Guenther

...
    FILE=$MATCH
    LOG=$WHOFROM

    :0 a
    {
        # Check for meta information
        :0 fh
        * ? test -f ./.meta.$FILE
        | cat - ./.meta.$FILE

        # Do we already know what encoding to use?
        :0
        * ^Content-Transfer-Encoding: *\/[^ ]+
        { CTE = $MATCH }

        :0 E
        {
            # Nope. Ask perl.  I think the '$' is unneeded here, but...
            # Note that for a binary file, -T will return 0 which when given
            # as an exit code looks like success.  Thus the backwardness.
            :0
            * $ ? perl -e "exit(-T '$FILE')"
            { CTE = base64 }

            :0 E
            { CTE = quoted-printable }
        }

        # How about the Content-Type?
        :0
        * ^Content-Type: *\/[^ ]+
        { CT = $MATCH }

        :0 E
        {
            # Assume that base64 encoded files are application/octet-stream,
            # and everything else is text/plain
            :0
            * CTE ?? base64
            { CT = application/octet-stream }

            :0 E
            { CT = text/plain }
        }


        :0
        * ^Content-Disposition: *\/[^ ]+
        { CD = $MATCH }

        :0 E
        {
            # This is sorta fancy, but let's support the Content-Disposition
            # header.  base64 encoded files default to attachment and include
            # the filename, while everything else is inline.  See also RFC 1806.
            :0
            * 1^1 ^Content-Transfer-Encoding: +base64
            * 1^1 CTE ?? base64
            { CD = "attachment; filename=$FILE" }

            :0 E
            { CD = inline }
        }

        # Okay, stick all the proper headers in.  This could be rolled into
        # the following recipes if desired, replacing the 'cat -' in each.
        :0 fh
        | formail -I"Content-Transfer-Encoding: $CTE" \
                  -I"Content-Type: $CT" \
                  -I"Content-Disposition: $CD"

        # base64?
        :0 h
        * CTE ?? base64
        | (cat -; mimencode -b ./$FILE) | $SENDMAIL -oi -t

        # nope.  How about quoted-printable?
        :0 h
        * CTE ?? quoted-printable
        | (cat -; mimencode -q ./$FILE) | $SENDMAIL -oi -t

        # nope.  Some other encoding was named (8bit or binary, say), so just cat it
        :0
        | cat - ./$FILE | $SENDMAIL -oi -t
    }
}
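
If mimencode isn't installed where you want to experiment, the two
encodings used in the delivery recipes above can be tried with
Python's standard library.  This is an illustration of the encodings
themselves, not a drop-in replacement for mimencode, and the output
may differ from mimencode's in details such as line wrapping.  The
sample bytes are made up.

```python
import base64
import quopri

# Made-up sample containing one byte with the high bit set (0xE9).
data = b"caf\xe9 au lait\n"

# base64: what the recipe uses for files judged to be binary.
b64 = base64.encodebytes(data)

# quoted-printable: mostly readable, with high-bit bytes escaped as =XX.
qp = quopri.encodestring(data)

print(b64)
print(qp)

# Both encodings are reversible.
assert base64.decodebytes(b64) == data
assert quopri.decodestring(qp) == data
```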
