On Fri, Feb 05, 2016 at 06:42:34AM -0800, Ned Freed wrote:
> > The implementation and documentation of this was joint work with
> > Wietse back in early 2006. These days, when STARTTLS fails, Postfix
> > tries other MX hosts first and if they all fail, defers the mail
> > initially. Cleartext fallback kicks in on the second delivery
> > attempt if STARTTLS fails again.
> Actually, I consider this approach as unacceptable unless the second delivery
> attempt occurs within a minute or two. (Which, incidentally, is a much shorter
> retry period after deferral than the standards recommend.)
The default is 5 minutes, with doubling exponential backoff up to
a cutoff of somewhat over an hour:
$ postconf -d {min,max}imal_backoff_time
minimal_backoff_time = 300s
maximal_backoff_time = 4000s
(These, combined with per-destination concurrency limits to avoid
overwhelming remote systems that come back after a period of
downtime, and throttling of destinations when enough back-to-back
deliveries fail, do a better job of avoiding hammering remote
systems than the recommendations in the standards, while substantially
reducing delay when transmission fails on the first attempt).
I typically override these to min/max = 225s/7200s (this reduces
congestion delay if the queue happens to hold enough long-term
deferred mail, since with these settings the fraction that is active
at any given time is reduced by a factor of 2 or so).
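To make the two pairs of settings concrete, here is a rough sketch of the
retry waits they imply. It assumes plain doubling from the minimum, capped
at the maximum, as described above; the real queue manager's scheduling
depends on more than this, and the function names are made up for
illustration.

```python
from itertools import accumulate


def backoff_delays(minimal, maximal, attempts):
    """Waits (in seconds) before each of the next `attempts` retries.

    Simplified model: start at `minimal`, double after every failed
    attempt, and never wait longer than `maximal`.
    """
    delays = []
    delay = minimal
    for _ in range(attempts):
        delays.append(min(delay, maximal))
        delay *= 2
    return delays


def retry_times(minimal, maximal, attempts):
    """Seconds elapsed since the first failure at each retry."""
    return list(accumulate(backoff_delays(minimal, maximal, attempts)))


if __name__ == "__main__":
    # (label, minimal_backoff_time, maximal_backoff_time) in seconds
    for label, lo, hi in (("default", 300, 4000), ("override", 225, 7200)):
        print(label, backoff_delays(lo, hi, 7), retry_times(lo, hi, 7))
```

Under this model the first retry, which is the one where cleartext fallback
can kick in, comes 300s after the initial failure with the defaults and
225s after it with the override.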
As for "unacceptable", you might find that the following fall into
that category:
* IIRC Sendmail never falls back to cleartext if STARTTLS is
advertised.
* Microsoft's Schannel TLS stack at outlook.com and in Exchange
by default solicits client certs it does not use and then rejects
client connections that happen to present a certificate chain with
an MD5 signature, even when it is the self-signature of a root CA,
as with CAcert.org.
* The ancient Schannel implementation in Exchange 2003 ignores all
but the first 64 ciphers in the client's TLS hello, and has a
broken 3DES-CBC implementation that fails post-handshake: trailing
garbage in the TLS response to EHLO breaks the established TLS
connection after "MAIL FROM". Only RC4-SHA and RC4-MD5 work (but
modern Schannel at outlook.com refuses to negotiate RC4).
$ posttls-finger -o tls_medium_cipherlist=RC4-SHA -c -Ldebug microsoft.com
posttls-finger: initializing the client-side TLS engine
posttls-finger: setting up TLS connection to microsoft-com.mail.protection.outlook.com[207.46.163.138]:25
posttls-finger: microsoft-com.mail.protection.outlook.com[207.46.163.138]:25: TLS cipher list "RC4-SHA:!aNULL"
posttls-finger: SSL_connect:before/connect initialization
posttls-finger: SSL_connect:SSLv2/v3 write client hello A
posttls-finger: SSL_connect error to microsoft-com.mail.protection.outlook.com[207.46.163.138]:25: lost connection
* DANE/DNSSEC at mail.mil and the many domains it is MX for is
fubar'ed, because their firewall "protects" the DNS servers by
blocking lookups for exotic records such as TLSA.
$ dig -t tlsa +noall +comment +ans _25._tcp.pri-jeemsg.eemsg.mail.mil.
;; connection timed out; no servers could be reached
$ dig -t a +noall +comment +ans _25._tcp.pri-jeemsg.eemsg.mail.mil.
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 14873
;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1
* Despite being notified at the beginning of Aug 2015, isphuset.no
still has not upgraded from a buggy PowerDNS version that
botches TLSA record denial of existence for the domains of all
their DNSSEC-hosted customers. [ In the example below, the
name internot.no seems rather apt. ]
$ unbound-host -t tlsa -v _25._tcp.internot.no.
_25._tcp.internot.no. has no TLSA record (BOGUS (security failure))
validation failure <_25._tcp.internot.no. TLSA IN>: nodata proof failed from 195.35.82.103
As for a delay of under 5 minutes in delivering email to such broken
sites: for most users it is a reasonable trade-off for reducing
needless cleartext fallback in the face of routine transmission glitches.
Though in the case of mail.mil, internot.no and the like, one has
to explicitly disable DANE support for those domains, since in
order to avoid downgrade attacks there is no fallback when TLSA
record lookups time out or return SERVFAIL.
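The no-fallback rule can be sketched as a small decision function. This is
an illustrative model only, not Postfix code: the outcome names, the action
strings and the dane_disabled flag (standing in for a per-destination
policy override) are all assumptions made up for the example.

```python
from enum import Enum


class TlsaLookup(Enum):
    """Hypothetical outcomes of a TLSA lookup for an MX host."""
    SECURE_RECORDS = "DNSSEC-validated TLSA records"
    SECURE_ABSENT = "DNSSEC-validated denial of existence"
    TIMEOUT = "lookup timed out"
    SERVFAIL = "server failure"
    BOGUS = "DNSSEC validation failed"


def dane_action(result, dane_disabled=False):
    """Pick a delivery action for one destination.

    A lookup that fails (timeout, SERVFAIL, bogus) cannot be told apart
    from an attacker suppressing the TLSA records, so the only safe
    response is to defer rather than to downgrade.
    """
    if dane_disabled:
        # Explicit per-destination opt-out: TLSA is never consulted.
        return "opportunistic TLS"
    if result is TlsaLookup.SECURE_RECORDS:
        return "authenticate with DANE"
    if result is TlsaLookup.SECURE_ABSENT:
        # Provably no TLSA records: ordinary opportunistic TLS is fine.
        return "opportunistic TLS"
    return "defer"


if __name__ == "__main__":
    for outcome in TlsaLookup:
        print(outcome.name, "->", dane_action(outcome))
```

In this model mail.mil corresponds to TIMEOUT and internot.no to BOGUS, so
both defer indefinitely until DANE is disabled for them by hand.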
--
Viktor.