Ok, here's what happens when someone crosses the border from annoying to
obnoxious. Park this on your hard disk. Oh and yeah, don't tell egroups...
Volker
##### strip_egroups_ad.rc
#
# Resource file for procmail. Run with:
# INCLUDERC=yourpath/strip_egroups_ad.rc
# in your $HOME/.procmailrc
#
# Removes the first ad in eGroups list emails.
# First and last line of ad are matched by $start and $end, and some string
# inside the ad by $admatch. The whole ad is replaced by $note.
#
# The usual headache with bare-bones Unix-rubbish: the sed solution never(!)
# works under Solaris 2.7 because -e handles neither newlines nor nested
# '{ }'-lists. Solaris awk is also too dumb - nawk is required.
# Needless to say, the GNU tools never have a problem. Long live the Penguin!
#
# In the public domain.
# Volker Kuhlmann <v(_dot_)kuhlmann(_at_)elec(_dot_)canterbury(_dot_)ac(_dot_)nz>
# 31 Aug 2000
#
:0
* ^Delivered-To:.*@egroups\.com
* ^Mailing-List:.*@egroups\.com
# With awk (change nawk to gawk etc. if necessary):
#
{
end='...--------------------.*--------------------[-~>=_|e]*$'
start="^$end"
admatch='http:\/\/.*\.egroups\.com\/.*\/'
note='\[obnoxious eGroups ad removed\]'
:0 fbw
| nawk "\
BEGIN { ad=0; done=0 }\
done { print; next }\
ad && \$0 ~ \"$end\" { \
ad=0; \
if (match(text,\"$admatch\")) {\
print \"$note\"; done=1\
} else {\
print text \$0; text=\"\" }\
next\
}\
\$0 ~ \"$start\" { ad=1 }\
ad { text=text \$0 \"\n\" }\
!ad { print }\
"
}
# With sed:
#
# Write $start and $end to match the whole line, but do not(!) anchor $end at
# the start of the line using "^".
# The /$admatch/! condition is necessary to remove the first ad line, in case
# $end also matches the start line.
# The patterns are kept reasonably broad so that they still match if egroups
# changes some characters in these lines.
#
# Adapted from the sed FAQ:
# :t
# /BLOCK_TOP/,/BLOCK_END/ {
# /BLOCK_END/! { N; b t; }
# /regex/s/^.*BLOCK_END//
# }
# Suppose the beginning of the block is indicated by 'BLOCK_TOP' and
# the end of the block is indicated by 'BLOCK_END'. If the expression
# 'regex' appears anywhere within the block, the entire block should
# be deleted.
# The most difficult part was to get the quoting right for procmail...
#
#{
# end='...-\{20,\}.*-\{20,\}[-~>=_|e]*$'
# start="^$end"
# admatch='http:\/\/.*\.egroups\.com\/.*\/'
# note='\[obnoxious eGroups ad removed\]'
# :0 fbw
# | sed \
# -e ':t' \
# -e "/$start/,/^$end/ { \
# /^.*$end/! { N; b t; }; \
# /$admatch/! { N; b t; }; \
# /$admatch/ { \
# s/^.*$end/$note/ ; \
# :tt; \
# n; b tt; \
# }; \
# }"
#}
##### EOF strip_egroups_ad.rc
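
To try the awk logic outside procmail before trusting it with real mail,
something like the following sketch may help. It is not part of the rc file.
The function names strip_ad and sample_message, the awk variable names, and
the sample message are all made up for illustration. The patterns are passed
in with -v and use "[.]" instead of "\." so that no backslash quoting is
needed, and a POSIX awk (nawk, gawk, mawk) is assumed to be installed as
"awk".

```shell
#!/bin/sh
# Standalone harness for the awk filter; everything here is illustrative.

# Same patterns as in the rc file, with "[.]" in place of "\." so that
# awk -v does not have to deal with backslashes.
end='...--------------------.*--------------------[-~>=_|e]*$'
start="^$end"
admatch='http://.*[.]egroups[.]com/.*/'
note='[obnoxious eGroups ad removed]'

# Filter stdin to stdout, replacing the first matching ad block by $note.
strip_ad() {
    awk -v start_re="$start" -v end_re="$end" \
        -v ad_re="$admatch" -v note="$note" '
        done { print; next }              # first ad handled: pass the rest
        ad && $0 ~ end_re {               # closing line of a candidate block
            ad = 0
            if (match(text, ad_re)) { print note; done = 1 }
            else { printf "%s", text; print; text = "" }
            next
        }
        $0 ~ start_re { ad = 1 }          # opening line of a candidate block
        ad  { text = text $0 "\n" }       # buffer lines inside the block
        !ad { print }
    '
}

# Made-up message body in the general shape of an eGroups ad.
sample_message() {
    cat <<'EOF'
Hello list,
------------------------------ eGroups Sponsor ------------------------------~-~>
Some sponsor text.
http://click.egroups.com/1/0000/0/_/_/_/0000000000/
---------------------------------------------------------------------------_->
Actual message text.
EOF
}

sample_message | strip_ad
```

Run as is, this should print the greeting, the replacement note, and the
message text, with the four ad lines gone. A block framed by the same
separator lines but without an egroups URL is passed through unchanged, as in
the rc file.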
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail