
Re: processing emails that bounce.

2002-05-22 01:55:03
At 22:32 2002-05-21 -0700, Paul Thomas did say:
[snip-snip-snip]

> Ensuring that your list rejects posts from non-subscribers is a good first
> step in blocking such idiots.  Adding loop checking (to ensure that a

Huh?

What don't you understand about mailing lists rejecting posts from non-subscribers being a good thing? It avoids most spam submissions (spammers tend not to _subscribe_ to lists - they just spam at harvested addresses), and also avoids a variety of braindead bounces (which don't always come from mailer-daemon or postmaster) that are sent to the list address instead of the owner-list address.
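To make the non-subscriber check concrete, here is a minimal sketch in Python (not the procmail setup discussed in this thread). The roster contents and the function name `is_from_subscriber` are made up for illustration, and it only looks at the From: header, which is trivially forged, so treat it as a first-pass filter rather than real authentication.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical subscriber roster; a real list manager keeps this in its own files.
SUBSCRIBERS = {"alice@example.org", "bob@example.net"}

def is_from_subscriber(raw_message: str, subscribers: set) -> bool:
    """Return True if the From: header address is on the roster."""
    msg = message_from_string(raw_message)
    _name, address = parseaddr(msg.get("From", ""))
    return address.lower() in subscribers

# Two made-up submissions: one from a subscriber, one from a harvester.
post = "From: Alice <Alice@Example.org>\nSubject: hello\n\nbody\n"
spam = "From: deals@spam.example\nSubject: buy now\n\nbody\n"

print(is_from_subscriber(post, SUBSCRIBERS))
print(is_from_subscriber(spam, SUBSCRIBERS))
```

A message with no From: header at all falls through as a non-subscriber, which is probably what you want for the odd bounce that shows up at the list address.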

> message which has already passed through your list isn't being sent back
> into it, say by some braindead wannabe MTA that ignores the envelope data

Huh?

Go manage a large mailing list with an international audience sometime and experience first hand all the stupid error messages you can get from recipient mail servers and the things people do with email en route. Heck, some DSNs retain the original subject, and others even come FROM the address they're telling you isn't valid any longer...

> delivered as a UNIQUE message out to EACH RECIPIENT.  Normally, if you

Yikes! You mean that might afford other functions such as mail-merge,
including the actual subscriber address in case of need of unsubscription,
and even database driven applications...

Whatever rows your boat. If you're running multiple mailing lists off of a fat pipe, and want to deal with tens of thousands of users across multiple lists (the vast majority of which deliver without errors or warnings), each with scores of messages a day, and send each message out as a unique SMTP transaction, that's fine by me.

Most network admins I know would just as soon not waste the bandwidth (then again, most net admins I know don't deal with the list management aspect).

The majority of bounces are easily discernible for what they are - it's the ones that aren't that are a problem.

So how many minutes does it take you to deliver to 20,000 subscribers?

Do you realize how much queue space that requires (when each of those has separate recipient headers and nifty click here to uns*bscribe links)?

Multiply the subscriber list (let's say 20K) by 25 messages a day (that's a rather light discussion list - fortunately most people just lurk), each about 8KB in size (sometimes larger, but let's average them at 8KB). That's 4GB of network traffic in one day, for one list. That, if distributed evenly over a 24 hour period is around 400Kbit/second of bandwidth utilization. One list, not a lot of message traffic. Now, multiply by multiple lists, add some more traffic, and oh, pray to operate a webserver off of it too (not for lack of processor power, but bandwidth). A T1 suddenly doesn't quite go as far as it used to.
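For anyone who wants to check the arithmetic, here is the same back-of-the-envelope calculation as a short Python sketch. The variable names are mine, 8KB is taken as 8,000 bytes, and a T1 is taken as 1.544 Mbit/s; SMTP and TCP overhead are ignored, so real-world numbers would come out somewhat higher.

```python
# Back-of-the-envelope traffic for per-recipient delivery of one list.
subscribers = 20_000
messages_per_day = 25
avg_message_bytes = 8_000          # "about 8KB", taken as decimal kilobytes

bytes_per_day = subscribers * messages_per_day * avg_message_bytes
bits_per_second = bytes_per_day * 8 / (24 * 60 * 60)   # spread evenly over a day
t1_bits_per_second = 1_544_000     # nominal T1 line rate

print(f"{bytes_per_day / 1e9:.1f} GB/day")                 # 4.0 GB/day
print(f"{bits_per_second / 1000:.0f} Kbit/s average")      # 370 Kbit/s average
print(f"{bits_per_second / t1_bits_per_second:.0%} of a T1")  # 24% of a T1
```

So the "around 400Kbit/second" figure works out to roughly 370Kbit/second before overhead, or about a quarter of a T1 for one fairly quiet list, and that is assuming the traffic is spread perfectly evenly, which it never is.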

More processor is cheap enough. Doubling or tripling your bandwidth however, well, that costs extra - for most people, it costs significantly more than a new cheap PC.

Doing unique deliveries is great for periodic newsletters. It totally sucks for discussion lists, especially as the traffic climbs. It also sucks the royal one when many users suffer the "mailbox full" malady. If you can fire off a message with 200 recipients at hotmail, and 20 of them bounce because the yahoos don't check their email often enough, then you still only sent ONE message, and you also get only ONE bounce. Send the same message as unique messages, and you sent 200 times as much stuff, and received 20 bounces.
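The hotmail example can be sketched as a quick count in Python. The addresses are invented, the function name `group_by_domain` is mine, and the model is deliberately simplistic: it assumes one SMTP transaction per destination domain and that the receiving end reports all failures for a message in a single bounce, which is the best case described above (plenty of servers cap recipients per transaction or bounce each failure separately).

```python
from collections import defaultdict

def group_by_domain(recipients):
    """Group envelope recipients by destination domain (one message per domain)."""
    by_domain = defaultdict(list)
    for address in recipients:
        domain = address.rsplit("@", 1)[1].lower()
        by_domain[domain].append(address)
    return by_domain

# 200 made-up recipients at one provider, 20 of whom have full mailboxes.
recipients = [f"user{i}@hotmail.com" for i in range(200)]
full_mailboxes = set(recipients[:20])

# Batched: one message per domain, at most one bounce per message sent.
batched = group_by_domain(recipients)
batched_sent = len(batched)
batched_bounces = sum(1 for addrs in batched.values() if full_mailboxes & set(addrs))

# Unique: one message per recipient, one bounce per failed recipient.
unique_sent = len(recipients)
unique_bounces = len(full_mailboxes)

print(f"batched: {batched_sent} sent, {batched_bounces} bounce(s)")   # 1 sent, 1 bounce(s)
print(f"unique:  {unique_sent} sent, {unique_bounces} bounce(s)")     # 200 sent, 20 bounce(s)
```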

You'd really love it when you get digest bounces from outfits that insist on sending back the whole damn digest (fortunately, my procmail scripts for majordomo - called Seneschal - truncate bounced digests to just the body-included headers).

> makes the email exchange efficient - one copy of the message and a
> buggerin' large list of envelope recipients. To uniquely hash each message

If you want a _really_ efficient car, leave it parked in the driveway.

Since the smallest most "efficient" car in my stable is my wife's 5.0L Mustang, and it goes straight on up from there (in both displacement and cylinder count), I'd have to say that anyone who knows me personally knows pretty well that pure efficiency isn't my game when it comes to cars. A parked car doesn't provide you with the work product which is the purpose of having a car - there's nothing wrong with wanting to attain efficiency. With a car, that'd start with ensuring that combustion is complete, rather than running unburnt fuel out the tailpipe - you might go real fast and get lots of power by running over rich, but you're not going to go very far on a tank of gas. Apply the same theme to operating an email list.

Strive for a balance.

the public Internet with 1.5ghz workstations to check their emails and
view pics of their grandkids, etc., and this is what is driving
the marketplace.

You can throw the fastest multiprocessor machine at something - if your _bandwidth_ is wasted away, it won't matter. These 1.5GHz grannies are plugged into the internet with a modem and AOL after all.

I assure you that you don't want to be running the per-recipient mailing list that some blue haired grandma sends the five 1MB BMP snapshots of her new grandkid to...

I bet you think I'm kidding.

> A few months ago, I wrote a collection of procmail scripts to work in
> conjunction with several majordomo lists I'm involved with

Should we go check at the Majordomo site?

I didn't say it was published. They're employed on a large automotive site, where the procmail filters are invoked as part of the alias which normally invokes the majordomo and list archiving stuff. It all blends in nicely, including that each list has its own separate set of parameters used by procmail (which optional filters to use, some values for scoring, etc), even though they're all running from the same set of scripts and invoked with the same syntax.

to them. No doubt a tedious brute-force approach, but then so is
addressing the issues of email-based worms/virii.

Spot attachment. Parse for executable or a handful of known viruses (particularly of the variety that you can't even reply to). Bad attachment. No bone.
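As a rough illustration of the "spot attachment, check for executable" step, here is a sketch using Python's standard email package. This is not how the procmail filters mentioned above do it; the extension list and the name `has_blocked_attachment` are assumptions for the example, and filename matching alone won't catch anything that lies about its name.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

# Hypothetical blocklist of executable-style extensions; tune to taste.
BLOCKED_EXTENSIONS = (".exe", ".scr", ".pif", ".bat", ".com", ".vbs")

def has_blocked_attachment(raw: bytes) -> bool:
    """Return True if any MIME part carries a filename with a blocked extension."""
    msg = message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        filename = part.get_filename()
        if filename and filename.lower().endswith(BLOCKED_EXTENSIONS):
            return True
    return False

# Build a made-up submission with a suspicious attachment.
bad = EmailMessage()
bad["From"] = "someone@example.org"
bad["Subject"] = "pics of the grandkids"
bad.set_content("See attached!")
bad.add_attachment(b"MZ", maintype="application",
                   subtype="octet-stream", filename="Photo.SCR")

# And a plain-text one with no attachment.
good = EmailMessage()
good["From"] = "someone@example.org"
good.set_content("Just text.")

print(has_blocked_attachment(bad.as_bytes()))
print(has_blocked_attachment(good.as_bytes()))
```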


Our luck, the original query is for a 20 person mailing list with two messages a month...

---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
