procmail

Re: convert a HTML multipart message to a plain Text formated Message

2006-10-21 02:33:35
Udi Mottelo wrote:

    Are you doing an exercise to learn procmail, or do you need it in
    production?  I ask because you fork to external programs and
    back to procmail over and over.  You can group the 'fw's into
    one shell script with sed/awk or perl and free up the queue of
    messages on the server.  Remember that if you need to fix
    something under water, you don't call a diver and teach him
    locksmithing; you call a locksmith and teach him how to dive.

A bit of both.
Some Chinese senders send HTML email that is
not printable without prior HTML-to-text conversion,
but these are only very few emails a week.


The reason for forking to external programs
that way is very simple: "I don't know any better!",
at least for the moment.

Reasons for thinking so:

demime is surely better at deciding which text
parts to preserve than I am with my scripting abilities.

stripmime.pl from Adam Glass is able to
preserve all MIME types passed to it via
the -i parameter.

makemime -c adds a new MIME header to the
plain text file extracted by demime.

reformime -r8 standardizes the MIME.
 
#make sure /tmp/tmp.000 exists and is empty
:0 w
* ? ( echo "" > /tmp/tmp.000 )
{ X="" }

    The parentheses are unnecessary.  If you set SHELL=/bin/sh (recommended)
    you can just write  >/tmp/tmp.000
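
A minimal shell sketch of the difference between the two forms, assuming a POSIX /bin/sh. The scratch directory from mktemp is a hypothetical stand-in so that the sketch does not touch /tmp/tmp.000; the point is only that echo "" leaves a newline in the file, while a bare redirection leaves it truly empty.

```shell
#!/bin/sh
# Hypothetical scratch directory, used instead of /tmp/tmp.000.
dir=$(mktemp -d)
f="$dir/tmp.000"

# echo "" writes a single newline, so the file is not empty.
echo "" > "$f"
after_echo=$(wc -c < "$f" | tr -d ' ')

# A bare redirection creates or truncates the file without writing anything.
: > "$f"
after_truncate=$(wc -c < "$f" | tr -d ' ')

echo "after echo: $after_echo byte(s), after truncation: $after_truncate byte(s)"
rm -rf "$dir"
```

The ":" no-op in front of the redirection is only there for portability; some shells treat a redirection without a command differently.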

#nice to know
:0
* ?  > /tmp/tmp.000
{   }
#removing the email header
:0 fw
|formail -I ""

    Why remove the header?  You can use only the body with the 'b'
    or 'B' flags in any recipe.
Because it is inside a nested recipe, and I
probably misunderstood this passage from

http://www.stimpy.net/procmail/tutorial/ref/procmailrc.html

Quote:
On a nesting block, the flags `H'  and  `B'
only affect the conditions leading up to the block,
the flags `h' and `b' have no effect whatsoever.


:0 bfw
|makemime -c "text/plain;" -o /tmp/tmp.001 -

will do it then
--------


#erasing the body of the original mail
#inserting in the body the first boundary

    Just for example (1)+(2)=

#BOUNDARY is handed to awk with -v so that awk sees it as a string
:0 fbw
| awk -v b="$BOUNDARY" '{ printf("%s", b); exit }'


I have to learn about awk before using it.

#put it all together and repair missing boundaries,
#character sets and ..... with reformime
:0 fw
|cat - /tmp/tmp.001 /tmp/tmp.000|reformime -r
    I use metamail to unpack all the parts into one directory.  The
    command `file *' will give the type of the parts (not in MIME's
    type format); then you can group and rebuild the message with the
    parts that you want.  If you want to try it, I'll send you my very
    little script.
stripmime.pl extracts all MIME parts into one single file,
still MIME encoded.
I don't need to reassemble the parts I want,
but it would surely be interesting to learn some more,
so it would be nice of you to send me your script.

The way I do it produces only 2 tmp files, regardless
of how many parts an email has!

To find all the Content-Type: whatever/xxxxx headers,
I now include the recipe content.rc.

---------------
#remime.rc
****

CONTENTTYPES=`grep -e Content-Type: `

XCONT="Content-Type:"
INCLUDERC=content.rc

CONTENTTYPES #UNSET
CTYPE        #UNSET
XCONT        #UNSET

---------------
######################
#content.rc

:0  #GETTING THE
* $ CONTENTTYPES ?? $XCONT\/.*[^$NL]

{ 
CTYPE=$MATCH 

:0 #making a comma-separated
   # parameter line for stripmime.pl -i
* ! MATCH ?? multipart
* ! MATCH ?? text
* $ MATCH ?? ([$WS]?|[$WS]+)\/[a-zA-Z0-9-]+/[a-zA-Z0-9-]+[^;]
{     
:0 #no duplicate Content-Type: whatever/xxxx
* ! $ CPLINE ?? $MATCH
{
:0 #comma only if there is more than 1
* CPLINE ?? .
{ CPLINE=$CPLINE,$MATCH } 
:0 E
{ CPLINE=$MATCH }
}
}
}

:0 A #next loop only if the last one was successful
{
 XCONT=$XCONT$CTYPE$NL[$WS]?Content-Type: 
 INCLUDERC=content.rc
}

#### END
###################
--------
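
For comparison, the same list could probably be built in one external filter without the recursive INCLUDERC, along the lines of the "one shell script" suggestion quoted above. This is only a sketch under assumptions: list_content_types is a hypothetical helper name, it reads the message from stdin, it only looks at lines that start with Content-Type:, and it assumes the type/subtype sits on the same line as the header name.

```shell
#!/bin/sh
# Hypothetical helper: print a comma-separated list of the unique
# Content-Type values found on stdin, skipping multipart/* and text/*
# in the same way content.rc does.
list_content_types() {
    # keep only the Content-Type header lines
    grep -i '^Content-Type:' |
    # strip the header name, then everything after the type/subtype
    sed -e 's/^[^:]*:[[:space:]]*//' -e 's/[;[:space:]].*$//' |
    # MIME types are case-insensitive, so normalize to lower case
    tr 'A-Z' 'a-z' |
    # drop the types that should not be passed to stripmime.pl -i
    grep -v -e '^multipart/' -e '^text/' |
    # keep the first occurrence of each type, in order
    awk '!seen[$0]++' |
    # join the remaining lines with commas
    paste -s -d, -
}
```

From a recipe it might be called as CPLINE=`$HOME/bin/list_content_types.sh`, where the script path is again only an assumption and the script would have to call the function at its end.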

#get all MIME parts -i $CPLINE
:0 fw
|$HOME/perlscript/stripmime.pl -i $CPLINE  -m  -h

CPLINE #UNSET

:0
/tmp/proctmp.000

----------
the rest of my mumbo jumbo


If you like, I would appreciate some comments from you on this.
What I am really not sure about is whether it is correct for
a recipe to call itself, but I see no other way.
There is probably more that I should think about.


Matthias

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail