mhonarc-commits

CVS: mhonarc/MHonArc/lib mhidxrc.pl,2.17,2.18 mhinit.pl,2.56,2.57

2011-01-09 01:52:28
Update of mhonarc/MHonArc/lib
Modified Files:
	mhidxrc.pl mhinit.pl 
Log Message:
Simplified the core address detection regexp.  Although the RFCs
allow for funky addresses, such usage is rare today.  The more
complicated regex causes performance problems with large messages;
the newer one does not.
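
As a rough illustration of the simplified pattern, the sketch below
copies the new $AddrExp from the mhinit.pl diff in this message and
applies it to a sample string.  The find_addresses helper and the
sample addresses are made up for this example and are not part of
MHonArc.

```perl
use strict;
use warnings;

# New detection pattern, copied from the mhinit.pl 2.57 diff.
my $AddrExp = '\b[A-Za-z\d\-\.+%_]+@[A-Za-z\d.\-]+\.[A-Za-z]{2,4}\b';

# Hypothetical helper (not a MHonArc function): return every
# substring of $text that the pattern matches.
sub find_addresses {
    my ($text) = @_;
    my @hits = $text =~ /($AddrExp)/g;
    return @hits;
}

my $sample = 'Mail <jane.doe@example.org> or bob+list@mail.example.com.';
print "$_\n" for find_addresses($sample);

# The top-level label is limited to 2-4 letters, so an address with a
# longer final label is not matched by this pattern.
my @skipped = find_addresses('user@example.museum');
print "long final label matched: ", scalar(@skipped), "\n";
```

The trade-off is visible in the last call: the pattern accepts only
common address shapes in exchange for being cheap to evaluate.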

Also modified addressmodifycode to be the following when spammode
is set:

  s|@(.+)$|'@'.('x' x length($1))|ge

This has the same effect as the previous regexp, but is simpler.
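
A small sketch of what the two substitutions do to a plain address
follows.  It assumes the code is applied to a bare address string;
the helper names mask_old and mask_new and the sample address are
invented for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: previous spam-mode substitution (from the
# mhidxrc.pl 2.17 side of the diff), applied to a copy of the input.
sub mask_old {
    my ($addr) = @_;
    $addr =~ s|([\!\%\w\.\-+=/]+@)([\w\-]+\.[\w\.\-]+)|$1.('x' x length($2))|ge;
    return $addr;
}

# Hypothetical helper: new spam-mode substitution (2.18).  Everything
# after the first "@" up to the end of the string becomes "x".
sub mask_new {
    my ($addr) = @_;
    $addr =~ s|@(.+)$|'@'.('x' x length($1))|ge;
    return $addr;
}

my $addr = 'jane.doe@example.org';
print mask_old($addr), "\n";
print mask_new($addr), "\n";
```

For a bare address like this one, both versions keep the local part
and replace each character of the domain with "x".  A string with no
"@" is left unchanged by either version.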


======================================================================
FILE: mhonarc/MHonArc/lib/mhidxrc.pl
<http://www.mhonarc.org/cgi-bin/viewcvs.cgi/*checkout*/mhonarc/MHonArc/lib/mhidxrc.pl?rev=2.18>

<http://www.mhonarc.org/cgi-bin/viewcvs.cgi/mhonarc/MHonArc/lib/mhidxrc.pl.diff?r1=2.17&r2=2.18&diff_format=h>
--- mhidxrc.pl	15 Mar 2004 21:07:18 -0000	2.17
+++ mhidxrc.pl	9 Jan 2011 07:52:25 -0000	2.18
@@ -750,5 +750,5 @@
 if ($AddressModify eq "") {
     $AddressModify =
-        q{s|([\!\%\w\.\-+=/]+@)([\w\-]+\.[\w\.\-]+)|$1.('x' x length($2))|ge}
+        q{s|@(.+)$|'@'.('x' x length($1))|ge}
         if $SpamMode;
     $IsDefault{'AddressModify'} = 1;

======================================================================
FILE: mhonarc/MHonArc/lib/mhinit.pl
<http://www.mhonarc.org/cgi-bin/viewcvs.cgi/*checkout*/mhonarc/MHonArc/lib/mhinit.pl?rev=2.57>

<http://www.mhonarc.org/cgi-bin/viewcvs.cgi/mhonarc/MHonArc/lib/mhinit.pl.diff?r1=2.56&r2=2.57&diff_format=h>
--- mhinit.pl	31 Dec 2010 20:34:00 -0000	2.56
+++ mhinit.pl	9 Jan 2011 07:52:25 -0000	2.57
@@ -300,7 +300,13 @@
 $VarExp    = '\$([^\$]*)\$'  if !defined($VarExp) || $VarExp !~ /\S/;
 
-##  Regexp for address/msg-id detection (looks like cussing in cartoons)
-$AddrExp  = '[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+';
-$HAddrExp = '[^()<>@,;:\/\s"\'&|]+(?:@|&\#[xX]0*40;|&64;)[^()<>@,;:\/\s"\'&|]+';
+##  Regexp for address/msg-id detection: In the past we had
+##  patterns that are more insync with RFC (2)822, but much
+##  of the world today uses simplier syntax.  Also, the more
+##  general versions have serious performance problems on
+##  large strings.
+#$AddrExp  = '\b[^()<>@,;:\/\s"\'&|]+@[^()<>@,;:\/\s"\'&|]+\b';
+#$HAddrExp = '\b[^()<>@,;:\/\s"\'&|]+(?:@|&\#[xX]0*40;|&64;)[^()<>@,;:\/\s"\'&|]+\b';
+$AddrExp   = '\b[A-Za-z\d\-\.+%_]+@[A-Za-z\d.\-]+\.[A-Za-z]{2,4}\b';
+$HAddrExp  = $AddrExp;
 
 ##  Text clipping function and source file: Set in mhopt.pl.
