spf-discuss

A regular expression that matches SPF records

2005-06-27 22:53:33


Today, I finally got around to doing something that has been on my
TODO list for about 6 months.  I took the ABNF out of the SPF spec and
converted it into a regular expression.  So, I now have a regular
expression that matches valid SPF records and rejects invalid ones.
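
To give a feel for how an ABNF-to-regex conversion can go, here is a
minimal sketch in Ruby that builds a pattern for a small subset of the
grammar (only the "all" and "ip4" mechanisms) by composing one small
regular expression per rule.  The constant names and the helper
valid_subset_record? are made up for illustration, and this is not
necessarily how the full expression was produced.

```ruby
# Simplified stand-ins for a few ABNF rules; names are illustrative only.
QUALIFIER = /[+\-?~]/
OCTET     = /(?:25[0-5]|2[0-4]\d|1\d{2}|[1-9]\d|\d)/
IP4_ADDR  = /#{OCTET}(?:\.#{OCTET}){3}/
IP4_MECH  = /[Ii][Pp]4:#{IP4_ADDR}(?:\/\d+)?/
ALL_MECH  = /[Aa][Ll][Ll]/

# A directive is an optional qualifier followed by one mechanism.
DIRECTIVE = /#{QUALIFIER}?(?:#{ALL_MECH}|#{IP4_MECH})/

# A record is the version tag followed by space-separated directives.
# \A and \z force the pattern to cover the whole string.
RECORD = /\A[Vv]=[Ss][Pp][Ff]1(?:\x20+#{DIRECTIVE})*\x20*\z/

# Hypothetical helper: true if text is a valid record in this subset.
def valid_subset_record?(text)
  RECORD.match?(text)
end

puts valid_subset_record?("v=spf1 ip4:192.0.2.0/24 -all")  # prints "true"
puts valid_subset_record?("v=spf1 ip4:256.0.2.1 -all")     # prints "false"
```

Each rule becomes one named piece, so a bug found in the grammar only
needs to be fixed in one place before the full pattern is rebuilt.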

I used egrep to run this regular expression over a list of 591475 SPF
records that I had found in the .com domains and it took 1.25 seconds
on my 900MHz PIII.  I think this shows that you can easily do complete
syntax checking on all SPF records without any significant performance
penalty.
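
A similar bulk check can be sketched in Ruby.  The helper name
check_records and the STAND_IN pattern below are assumptions made for
this example; STAND_IN only checks the version tag and the spacing, so
the full expression would have to be substituted for a real syntax
check.

```ruby
# Hypothetical helper: runs a pattern over a list of records and returns
# the number that matched plus the records that did not.
def check_records(lines, pattern)
  valid = 0
  invalid = []
  lines.each do |line|
    record = line.chomp
    if pattern.match?(record)
      valid += 1
    else
      invalid << record
    end
  end
  [valid, invalid]
end

# Stand-in pattern: version tag followed by space-separated terms.
# It does NOT validate the terms themselves.
STAND_IN = /\A[Vv]=[Ss][Pp][Ff]1(?:\x20+\S+)*\x20*\z/

sample = ["v=spf1 mx -all\n", "v=spf2 mx -all\n", "v=spf1 a ~all\n"]

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
valid, invalid = check_records(sample, STAND_IN)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "#{valid} valid, #{invalid.size} invalid in #{format('%.4f', elapsed)}s"
```

Since check_records only calls each on its first argument, something
like File.foreach("records.txt") (a hypothetical file with one record
per line) can be passed in place of the sample array.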


Using this regular expression, I have already discovered a couple of
bugs in the SPF spec's ABNF and several more in the test suite.  I
think this regular expression can be used to help develop and/or test
your SPF implementations.



I have linked to the regular expression on the
http://www.schlitt.net/spf/tests/ webpage.  It currently comes in two
forms, one that works for egrep, and one that uses ruby/perl style
"extended" regular expressions.


Just for giggles, here is the "extended" version of the regular
expression:

%r{[Vv]=[Ss][Pp][Ff]1
   (?:\x20+
      (?:[\x2b\x2d\x3f~]?
         (?:[Aa][Ll][Ll]|
            [Ii][Nn][Cc][Ll][Uu][Dd][Ee]:
            (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
               %%|
               %_|
               %\x2d|
               [!-\x24&-~])*
            (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
               %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
               %%|
               %_|
               %\x2d)|
            [Aa]
            (?::
               (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d|
                  [!-\x24&-~])*
               (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
                  %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
            [Mm][Xx]
            (?::
               (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d|
                  [!-\x24&-~])*
               (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
                  %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d))?(?:(?:/\d+)?(?://\d+)?)?|
            [Pp][Tt][Rr]
            (?::
               (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d|
                  [!-\x24&-~])*
               (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
                  %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
                  %%|
                  %_|
                  %\x2d))?|
            [Ii][Pp]4:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
            (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
            (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
            (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])(?:/\d+)?|
            [Ii][Pp]6:
            (?:::|
               (?:[0-9A-Fa-f]{1,4}:){7}[0-9A-Fa-f]{1,4}|
               (?:[0-9A-Fa-f]{1,4}:){1,8}:|
               (?:[0-9A-Fa-f]{1,4}:){7}:[0-9A-Fa-f]{1,4}|
               (?:[0-9A-Fa-f]{1,4}:){6}(?::[0-9A-Fa-f]{1,4}){1,2}|
               (?:[0-9A-Fa-f]{1,4}:){5}(?::[0-9A-Fa-f]{1,4}){1,3}|
               (?:[0-9A-Fa-f]{1,4}:){4}(?::[0-9A-Fa-f]{1,4}){1,4}|
               (?:[0-9A-Fa-f]{1,4}:){3}(?::[0-9A-Fa-f]{1,4}){1,5}|
               (?:[0-9A-Fa-f]{1,4}:){2}(?::[0-9A-Fa-f]{1,4}){1,6}|
               [0-9A-Fa-f]{1,4}:(?::[0-9A-Fa-f]{1,4}){1,7}|
               :(?::[0-9A-Fa-f]{1,4}){1,8}|
               (?:[0-9A-Fa-f]{1,4}:){6}(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
               \x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               (?:[0-9A-Fa-f]{1,4}:){6}:(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])
               \x2e(?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               (?:[0-9A-Fa-f]{1,4}:){5}:(?:[0-9A-Fa-f]{1,4}:)?
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               (?:[0-9A-Fa-f]{1,4}:){4}:(?:[0-9A-Fa-f]{1,4}:){0,2}
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               (?:[0-9A-Fa-f]{1,4}:){3}:(?:[0-9A-Fa-f]{1,4}:){0,3}
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               (?:[0-9A-Fa-f]{1,4}:){2}:(?:[0-9A-Fa-f]{1,4}:){0,4}
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               [0-9A-Fa-f]{1,4}::(?:[0-9A-Fa-f]{1,4}:){0,5}
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])|
               ::(?:[0-9A-Fa-f]{1,4}:){0,6}
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5])\x2e
               (?:\d|[1-9]\d|1\d{2}|2[0-4]\d|25[0-5]))(?:/\d+)?|
            [Ee][Xx][Ii][Ss][Tt][Ss]:
            (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
               %%|
               %_|
               %\x2d|
               [!-\x24&-~])*
            (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
               %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
               %%|
               %_|
               %\x2d))|
         [Rr][Ee][Dd][Ii][Rr][Ee][Cc][Tt]=
         (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
            %%|
            %_|
            %\x2d|
            [!-\x24&-~])*
         (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
            %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
            %%|
            %_|
            %\x2d)|
         [Ee][Xx][Pp]=
         (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
            %%|
            %_|
            %\x2d|
            [!-\x24&-~])*
         (?:\x2e(?:[A-Za-z]|[A-Za-z](?:[\x2d0-9A-Za-z]?)*[0-9A-Za-z])|
            %\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
            %%|
            %_|
            %\x2d)|
         [A-Za-z][\x2d\x2e0-9A-Z_a-z]*=
         (?:%\x7b[CDHILOPR-Tcdhilopr-t]\d*[Rr]?[\x2b-/=_]*\x7d|
            %%|
            %_|
            %\x2d|
            [!-\x24&-~])*))*\x20*}x
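
Two things are worth keeping in mind when using the expression above
from Ruby.  The x flag makes the engine ignore literal whitespace in
the pattern, which is why spaces are written as \x20.  Also, the
pattern as shown carries no anchors, so a plain match can succeed on a
substring; for whole-record validation it can be wrapped in \A and \z.
The sketch below shows the difference using a deliberately tiny
stand-in pattern; UNANCHORED and ANCHORED are names invented for this
example.

```ruby
# Tiny stand-in for the full pattern: version tag plus optional "all" terms.
UNANCHORED = /[Vv]=[Ss][Pp][Ff]1(?:\x20+[Aa][Ll][Ll])*\x20*/

# Interpolating a Regexp into another Regexp embeds it as a group, so the
# anchors apply to the pattern as a whole.
ANCHORED = /\A#{UNANCHORED}\z/

junk = "garbage v=spf1 all trailing junk"

puts UNANCHORED.match?(junk)         # prints "true": a substring matched
puts ANCHORED.match?(junk)           # prints "false": whole string required
puts ANCHORED.match?("v=spf1 all")   # prints "true"
```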




-wayne

