Today, I finally got around to doing something that has been on my
TODO list for about 6 months. I took the ABNF out of the SPF spec and
converted it into a regular expression. So, I now have a regular
expression that matches valid SPF records and rejects invalid ones.
I used egrep to run this regular expression over a list of 591475 SPF
records that I had found in the .com domains and it took 1.25 seconds
on my 900MHz PIII. I think this shows that you can easily do complete
syntax checking on all SPF records without any significant performance
penalty.
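
For anyone who wants to try the same kind of bulk check from Ruby rather than egrep, here is a minimal sketch. The pattern TOY_SPF is a deliberately tiny stand-in (version tag plus "all" and "ip4:" terms only), not the full expression, and partition_records is a hypothetical helper name. The sample records are made up for illustration, and timings will differ from the egrep figure above.

```ruby
require 'benchmark'

# Toy stand-in for the full SPF expression: it only knows the version
# tag plus "all" and "ip4:" terms.  This is an assumption made to keep
# the sketch short; it is NOT the complete grammar.
TOY_SPF = %r{\A[Vv]=[Ss][Pp][Ff]1
  (?:\x20+
    [\x2b\x2d\x3f~]?
    (?:[Aa][Ll][Ll]|
       [Ii][Pp]4:\d{1,3}(?:\x2e\d{1,3}){3}(?:/\d+)?))*
  \x20*\z}x

# Hypothetical helper: split records into those that match the
# pattern and those that do not.
def partition_records(records, pattern = TOY_SPF)
  records.partition { |rec| pattern.match?(rec) }
end

# Made-up sample input; a real run would read one record per line
# from a file instead.
records = [
  "v=spf1 ip4:192.0.2.0/24 -all",
  "v=spf1 -all",
  "v=spf1 ip4:192.0.2 -all",   # malformed ip4 term (three octets)
  "v=spf2 -all",               # wrong version tag
]

valid = invalid = nil
elapsed = Benchmark.realtime do
  valid, invalid = partition_records(records)
end

puts "#{valid.size} valid, #{invalid.size} invalid in #{format('%.6f', elapsed)}s"
```

Swapping the full expression in for TOY_SPF should be enough to reproduce the check on a real list of records.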
Using this regular expression, I have already discovered a couple of
bugs in the SPF spec's ABNF and several more in the test suite. I
think this regular expression can be used to help develop and/or test
your SPF implementations.
I have linked to the regular expression on the
http://www.schlitt.net/spf/tests/ webpage. It currently comes in two
forms, one that works for egrep, and one that uses ruby/perl style
"extended" regular expressions.
Just for giggles, here is the "extended" version of the regular
expression:
%r{[Vv]=[Ss][Pp][Ff]1
(?:\x20+
(?:[\x2b\x2d\x3f~]?
(?:[Aa][Ll][Ll]|
[Ii][Nn][Cc][Ll][Uu][Dd][Ee]:
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[Aa]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
[Mm][Xx]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
[Pp][Tt][Rr]
(?::
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))?|
[Ii][Pp]4:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])(?:/\d+)?|
[Ii][Pp]6:
(?:::|
(?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}|
(?:[0-9A-Fa-f]{1,4}:){1,8}:|
(?:[0-9A-Fa-f]{1,4}:){7}:[0-9A-Fa-f]{1,4}|
(?:[0-9A-Fa-f]{1,4}:){6}(?::[0-9A-Fa-f]{1,4}){1,2}|
(?:[0-9A-Fa-f]{1,4}:){5}(?::[0-9A-Fa-f]{1,4}){1,3}|
(?:[0-9A-Fa-f]{1,4}:){4}(?::[0-9A-Fa-f]{1,4}){1,4}|
(?:[0-9A-Fa-f]{1,4}:){3}(?::[0-9A-Fa-f]{1,4}){1,5}|
(?:[0-9A-Fa-f]{1,4}:){2}(?::[0-9A-Fa-f]{1,4}){1,6}|
[0-9A-Fa-f]{1,4}:(?::[0-9A-Fa-f]{1,4}){1,7}|
:(?::[0-9A-Fa-f]{1,4}){1,8}|
(?:[0-9A-Fa-f]{1,4}:){6}(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
\x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){6}:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
\x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){5}:(?:[0-9A-Fa-f]{1,4}:)?
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){4}:(?:[0-9A-Fa-f]{1,4}:){0,2}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){3}:(?:[0-9A-Fa-f]{1,4}:){0,3}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
(?:[0-9A-Fa-f]{1,4}:){2}:(?:[0-9A-Fa-f]{1,4}:){0,4}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
[0-9A-Fa-f]{1,4}::(?:[0-9A-Fa-f]{1,4}:){0,5}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
::(?:[0-9A-Fa-f]{1,4}:){0,6}
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]))(?:/\d+)?|
[Ee][Xx][Ii][Ss][Tt][Ss]:
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d))|
[Rr][Ee][Dd][Ii][Rr][Ee][Cc][Tt]=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[Ee][Xx][Pp]=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*
(?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d)|
[A-Za-z][\x2d\x2e0-9A-Z_a-z]*=
(?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
%%|
%_|
%\x2d|
[!-\x24&-~])*))*\x20*}x
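
One thing to watch when using this from Ruby: as posted, the expression carries no anchors, so a plain match succeeds on any string that merely contains a valid record (even just "v=spf1"). A minimal sketch of one way to require a whole-record match follows. SPF_BODY is a hypothetical, heavily shortened stand-in for the expression above, and valid_spf? is an assumed helper name.

```ruby
# SPF_BODY is a hypothetical, heavily shortened stand-in for the full
# expression (version tag plus "all" terms only), used so the sketch
# stays self-contained.
SPF_BODY = %r{[Vv]=[Ss][Pp][Ff]1
  (?:\x20+[\x2b\x2d\x3f~]?[Aa][Ll][Ll])*
  \x20*}x

# Interpolating a Regexp keeps its own flags, so the /x option of
# SPF_BODY still applies inside the anchored wrapper.
SPF_RECORD = /\A#{SPF_BODY}\z/

# Hypothetical helper: true only if the whole string is accepted.
def valid_spf?(record)
  SPF_RECORD.match?(record)
end

puts SPF_BODY.match?("v=spf1 this-is-garbage")   # prints true  (unanchored)
puts valid_spf?("v=spf1 this-is-garbage")        # prints false (anchored)
puts valid_spf?("V=SPF1 -ALL")                   # prints true
```

The same wrapping should work with the full expression in place of SPF_BODY.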
-wayne