A few hours ago I wrote:
> Edward J. Sabol writes on 16 October 1997 at 11:46:36:
>> The exact solution is left as an exercise for the reader.
> I'll play around with this and see what I come up with...more to
This line-by-line parsing could be useful. The only problem I had was
with blank lines: I couldn't get the regexp modified correctly, so I
used scoring to fix things up. There's probably a better way.
Anyway, here's what I think is a decent first pass at a reasonably good
way to reduce a MIME multipart/alternative message (Netscape's text
and HTML) to just the first part (text). Because INCLUDERC is used
recursively, two files are needed; both are appended below.
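
For anyone who wants to check the logic outside of procmail, here is a
rough Python sketch of the same line-by-line idea: walk the body, start
collecting at the first boundary line, and stop at the second. This is
not part of the recipes; the function name first_part and the sample
message are made up for illustration, and it compares whole lines where
the recipe uses a regexp anchored only at the start of the line.

```python
def first_part(body: str, boundary: str) -> str:
    """Return the raw text (part headers included) of the first MIME part."""
    delimiter = "--" + boundary
    in_boundary = False  # becomes True once the first delimiter line is seen
    part = []
    for line in body.splitlines(keepends=True):
        stripped = line.rstrip("\r\n")
        if stripped == delimiter or stripped == delimiter + "--":
            if in_boundary:
                break  # second delimiter: the first part is complete
            in_boundary = True
            continue  # the delimiter line itself is never collected
        if in_boundary:
            part.append(line)  # blank lines are kept exactly as they are
    return "".join(part)


# Hypothetical sample body, in the style of a text + HTML message.
body = (
    "This is a multi-part message in MIME format.\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "Hello, world.\n"
    "\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<p>Hello, world.</p>\n"
    "--XYZ--\n"
)
print(first_part(body, "XYZ"), end="")
```

As in the recipes, the collected part still begins with its own
Content-Type: line, which has to be moved into the message header
afterwards.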
Next exercise: extract the Nth part of a MIME multipart message.
So this *was* about procmail after all :-) My "some-filter..." can be
done in procmail.
Dan
------------------- message is author's opinion only ------------------
J. Daniel Smith <DanS(_at_)bristol(_dot_)com>
http://www.bristol.com/~DanS
Bristol Technology B.V. +31 33 450 50 50, ...51 (FAX)
Amersfoort, The Netherlands {info,jobs}(_at_)bristol(_dot_)com
----
TMPDIR=/tmp/$LOGNAME
MAILDIR=$TMPDIR/procmail.out
DEFAULT=$MAILDIR/$LOGNAME
VERBOSE=yeah
SHELL=/bin/sh
:0
* ! ? test -d $TMPDIR || mkdir $TMPDIR
{
  # Bail out if directory didn't exist and couldn't be created
  EXITCODE=127
  # Unsetting HOST makes procmail stop processing this rcfile
  HOST
}

:0
* ! ? test -d $MAILDIR || mkdir $MAILDIR
{
  # Ditto
  EXITCODE=127
  HOST
}
# ... your experimental recipes here
RCDIR=$HOME/.procmail/mime
LINEBUF=100000
:0B
* $> ${LINEBUF}
{
  # message too large
}
:0E
* ^Mime-Version:
* ^Content-Type:[ ]*multipart/alternative;[ ]*boundary="?\/[^"]+
{
  boundary=$MATCH

  # Grab the whole body and hand it to the recursive rcfile
  :0B
  * ^^\/(.*$)+
  {
    BODYLINES = $MATCH
    INCLUDERC=$RCDIR/extract-part.rc

    :0
    * part ?? .
    {
      # Replace the body with the extracted part
      :0bfwi
      | echo "$part"

      # get the Content-Type:
      :0B
      * $ ^Content-Type:[ ]*\/[^ ]+
      {
        content_type=$MATCH

        # and the contents of this MIME part
        :0B
        * $ ^Content-Type:[ ]*$\content_type$\/(.|$)*
        {
          :0bfwi
          | echo "$MATCH"

          :0hwfi
          | formail -I "Content-Length:" -I "Content-Type: $content_type"
        }
      }

      # A Content-Type: isn't required
      :0Ehwfi
      | formail -I "Content-Length:"
    }
  }
}
----
# extract-part.rc (the file included via INCLUDERC above)
# From: "Edward J. Sabol" <sabol(_at_)alderaan(_dot_)gsfc(_dot_)nasa(_dot_)gov>
# To: Procmail Mailing List <procmail(_at_)Informatik(_dot_)RWTH-Aachen(_dot_)DE>
# Date: Thu, 16 Oct 1997 11:46:36 -0400
:0
* BODYLINES ?? ^^(.*$)\/(.*$)+
{ REMAININGLINES = $MATCH }
:0E
{ REMAININGLINES }
:0
* BODYLINES ?? ^^\/.*$
{ THISLINE = $MATCH }
:0E
{ THISLINE }
# .*$ sucks up blank lines; this puts it back
:0
* 1^1 BODYLINES ?? ^$
{ b=$= }
:0
* 1^1 REMAININGLINES ?? ^$
{ r=$= }
:0
* $ b ?? $r
{ }
:0E
{
  THISLINE="$THISLINE
"
}
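:0
* $ THISLINE ?? ^--$\boundary(--)?
{
  # Never copy the boundary line itself into the part
  THISLINE

  :0
  * in_boundary ?? yes
  {
    # Second boundary seen: the first part is complete
    in_boundary=no
    stop=yes
    THISLINE
  }

  :0E
  { in_boundary=yes }
}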
# Now recurse if there are any remaining lines.
:0
* REMAININGLINES ?? .
{
  BODYLINES = $REMAININGLINES

  :0
  * stop ?? yes
  { }

  :0E
  * in_boundary ?? yes
  { part=$part$THISLINE }

  # $_ is the name of the current rcfile, so this includes itself again
  INCLUDERC = $_
}