Re: Are SPF fault tolerant ? How to make SPF records changed correctly ?

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Monday 12 July 2004 01:06 pm, Andrew G. Tereschenko wrote:

Can somebody clarify this situation for me:

I'm a small or mid-size company in a developing country.
We use a single colocated server at a local ISP to send and receive our
email.
We configured our SPF record to accept mail only from our ISP's IP range;
everything else is "-all".
Something like:
"v=spf1 +mail.ourisp.com/24 -all"
(Instead of -all we could use ?all or ~all, but some servers may start
blocking on those too, so the situation is similar.)


Since SPF piggybacks on the DNS infrastructure, all of these problems are 
already addressed.

1. But then there is a disaster: fire, tornado, earthquake (think of
Silicon Valley, CA) or 9/11.

Our ISP's lines are damaged. Our server is lost.
To help out, an ISP from another state provides us with a replacement
server and configures it in its _own_ network.

But here is the problem:
_Cached_SPF_records_ prevent us from sending email to our clients.
All messages are rejected with 550 because of the "-all" SPF rule and an
IP address unknown to our clients.

Cached MX records are okay: all undelivered mail addressed to us is held in
mail queues, and those MTAs that find our new MXes will deliver it to us.
With a delay, but all of it will be delivered.

What do you propose?
Must our new server keep retrying after every 550 delivery error?
How can we tell a genuine 550 from an SPF 550? Would searching for a
"Local Policy" error message be okay (the "Fail" error says nothing about
this)?


This is the same problem as having your receiving MTAs, or HTTP servers, or 
whatever cut off from the internet due to a disaster. The normal method of 
protecting against this is to have receiving MTAs, HTTP servers, etc... 
across the globe on different networks, and have DNS point to all of them, 
or publish different DNS records for different regions. This is, of course, 
expensive and complicated, but it is the only way to get true redundancy.

The problem with sending mail is similar. If you want a fault tolerant 
sending process, you'll have to put in redundancy world wide.

How will you send email if you are cut off from the internet?

2. Another situation. We find that the workload of our current server has
increased (or that pricing at another ISP is better), so we need to move
to another ISP.
How much time (based on TTL) will it take before our moved server can send
email correctly without "550" retries?


Depends on the TTL of your DNS records. Moving HTTP servers, receiving MTAs, 
and other services runs into the same problem.

What I've seen done in the past is a few days before the move, change the 
TTL to a few minutes. Then, after the move, publish new records with TTLs 
of a few minutes (just in case there are mistakes). After checking the 
systems are working, extend the TTLs to whatever you normally have them - 
24 hours or whatnot.
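
As a rough sketch of that schedule in Python: the function name, the 
example dates, and the two-hour verification window below are 
illustrative assumptions, not part of SPF or DNS. The only real constraint 
it encodes is that the TTL must be lowered at least one full old TTL before 
the move, so that no cache is still holding a long-lived copy. Lowering it 
a few days earlier, as described above, is safer still.

```python
from datetime import datetime, timedelta


def plan_ttl_migration(move_at, normal_ttl, short_ttl, verify_window):
    """Return a list of (when, action) steps for a low-TTL record change."""
    # Latest safe moment to lower the TTL: one full normal TTL before the
    # move, so every cached copy with the long TTL has expired by then.
    lower_by = move_at - normal_ttl
    # After the move, an old record can stay cached for up to short_ttl.
    old_expired_at = move_at + short_ttl
    # Hypothetical grace period for checking that mail really flows.
    restore_at = old_expired_at + verify_window
    return [
        (lower_by, "lower TTL (latest safe time)"),
        (move_at, "publish new records, keep the short TTL"),
        (old_expired_at, "old records gone from caches, check the systems"),
        (restore_at, "restore the normal TTL"),
    ]


if __name__ == "__main__":
    steps = plan_ttl_migration(
        move_at=datetime(2004, 7, 20, 12, 0),
        normal_ttl=timedelta(hours=24),
        short_ttl=timedelta(minutes=5),
        verify_window=timedelta(hours=2),
    )
    for when, action in steps:
        print(when.isoformat(sep=" "), action)
```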

Receiving SPF-aware MTAs should abide by the TTLs published in DNS. In fact, 
they should probably query DNS directly and do no local caching of their 
own.
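
If a receiver does keep a local cache anyway, "abiding by the TTL" amounts 
to something like the following Python sketch. The class and its method 
names are made up for illustration, and a real MTA would normally leave 
this to its resolver rather than implement it itself.

```python
import time


class TtlCache:
    """Toy record cache that never serves an entry past its TTL."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the expiry logic can be tested.
        self._clock = clock
        self._entries = {}

    def put(self, name, value, ttl_seconds):
        # Remember the value together with its absolute expiry time.
        self._entries[name] = (value, self._clock() + ttl_seconds)

    def get(self, name):
        entry = self._entries.get(name)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the caller has to ask DNS again.
            del self._entries[name]
            return None
        return value
```

A miss (None) means the caller must go back to DNS, which is how a changed 
SPF record gets picked up once the old TTL has run out.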

There is a rule of thumb: 1/4 of your resources (bandwidth, memory, CPU, 
rackspace) should be unused. When you start intruding into that 1/4, start 
making plans for an upgrade. If you hit 90% resource usage, emergency plans 
have to be made for an upgrade. So you should never run out of resources 
because you are well prepared, right?
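
That rule of thumb is easy to turn into a check. A minimal Python sketch, 
with a function name and status strings that are purely illustrative:

```python
def capacity_status(used, total):
    """Classify resource usage by the 1/4-headroom rule of thumb."""
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = used / total
    if fraction >= 0.90:
        # 90% or more in use: emergency upgrade plans are needed.
        return "emergency"
    if fraction > 0.75:
        # Eating into the spare quarter: start planning an upgrade.
        return "plan upgrade"
    return "ok"
```

The same function works for bandwidth, memory, CPU or rack space, as long 
as used and total are in the same units.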

3. One more. We do not change ISPs, but our ISP wants to change the IP
netblock it owns, OR simply change the IP of our server, OR change the IP
addresses of the dial-up pool we use to send email.
The same question: how do we do this correctly? Do we have to delegate our
DNS management to the ISP in addition to server management?
Or must we take the burden of such a change on ourselves?

Are there any recommendations on TTLs for SPF records and on the
caching/validation process, other than the
http://www.ietf.org/rfc/rfc1537.txt recommended TTL (downtime) = 1 day?
Can you document all the steps required to change SPF record data in the
situations described above?


If you have a contract with your ISP for a static IP, and they change your 
IP, your servers won't work anymore. SPF will be the least of your 
problems.

If you don't have such a contract, then you'll have to point to their 
records or rely on their DNS services. Somehow, you'll have to work with 
them to figure out what SPF records are appropriate.


AFAIK, nothing like this will happen with DomainKeys. Cached DNS values
only benefit it; they do no harm in such situations.
SPF relies on the current network configuration, but the Internet was
originally designed to change its configuration/routing even in the case
of WW3 and a USSR A-bomb attack.


Yes, routing is adaptive. But if the endpoints are destroyed or cut off, 
then a connection can't be made, and it doesn't matter how adaptive routing 
is.

DNS is merely a service that translates names to addresses. It's also a way 
to publish a database of information about a domain. If you change IP 
addresses, but don't update the DNS database, then you can't expect DNS to 
be correct. In the same way, if you change your sending policy, but don't 
update the SPF records in DNS, you can't expect it to be correct.
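
One way to catch that kind of mismatch before your mail starts bouncing is 
to check the new sending address against the record you actually publish. 
The Python sketch below is deliberately simplified: it looks only at ip4: 
mechanisms and ignores a, mx, include, redirect and everything else a real 
SPF evaluator handles. The function name and the addresses (taken from the 
documentation ranges) are illustrative assumptions.

```python
import ipaddress


def ip4_authorized(spf_record, sender_ip):
    """Return True if sender_ip matches a passing ip4: term in spf_record."""
    ip = ipaddress.ip_address(sender_ip)
    # Skip the leading "v=spf1" and walk the remaining terms in order.
    for term in spf_record.split()[1:]:
        # "+" is the default (pass) qualifier and may be omitted.
        term = term.lstrip("+")
        if term.startswith("ip4:"):
            network = ipaddress.ip_network(term[4:], strict=False)
            if ip in network:
                return True
    return False


if __name__ == "__main__":
    old_record = "v=spf1 ip4:192.0.2.0/24 -all"
    new_record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 -all"
    new_server = "198.51.100.7"
    print("old record covers new server:", ip4_authorized(old_record, new_server))
    print("new record covers new server:", ip4_authorized(new_record, new_server))
```

Publishing a record that lists both the old and the new range before the 
move means either server passes while caches catch up.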

- -- 
Jonathan M. Gardner
Mass Mail Systems Developer, Amazon.com
jonagard(_at_)amazon(_dot_)com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFA9EunBFeYcclU5Q0RAgFkAJwLtKEixNAKcweP5v1ou6WeqUB1pQCgt9rR
E09+NQQWeZKcO0lI+4equhE=
=M9gq
-----END PGP SIGNATURE-----