procmail

Re: Regexp fails in scoring recipe

2003-05-15 01:12:57
I found the answers to my questions:
Q1. Why did my original recipe stop working?
Q2. Why does the recipe fail in production mode while succeeding in test mode?

The reason became evident while I was testing the new non-scoring recipe in production mode: it was also failing when the traffic report had only road work events in the locations of interest. In other words, the new recipe was matching road work events even though the regexp was designed to match everything except road work events. To debug this, I used the \/ token to capture the matching text in MATCH and wrote it to the log file. This is what I found:

A1: The traffic report body was in DOS format! The DOS line terminator is a carriage return followed by a newline, whereas Unix uses only a newline. The regexp was looking for "[^ ].........$", and the file contained "road work^M$", where ^M represents the carriage return character. The carriage return counted as one of the characters before the end of the line, so the regexp matched road work events in production mode, which is the failure I was seeing.
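To illustrate A1, here is a minimal sketch in Python rather than procmail. It assumes Python's re module behaves like procmail's regexp engine for this fragment, which holds for "." and "$" here but not in general. The sample line "Route 1 road work" is hypothetical; only the regexp fragment and the trailing "road work" come from the report above.

```python
import re

# Regexp fragment quoted in the post; the full recipe is not shown there.
PATTERN = re.compile(r"[^ ].........$")

# Hypothetical report line; only the trailing "road work" comes from the post.
unix_line = "Route 1 road work"
dos_line = unix_line + "\r"  # same line with a DOS carriage return before the newline

unix_match = PATTERN.search(unix_line)
dos_match = PATTERN.search(dos_line)

# Unix format: the character ten positions from the end is a space, so no match.
print(unix_match)

# DOS format: the carriage return takes up one of the nine "." positions,
# so "[^ ]" lands on the "r" of "road" and the regexp matches.
# repr() makes the otherwise invisible carriage return visible.
print(repr(dos_match.group()))
```

Printing the match with repr() plays the same role as logging the text captured by \/: it exposes a character that most viewers hide.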

A2: I use Netscape 7 Mail as my mail client. When I used Netscape to display the source of the unmodified traffic report, it stripped out the carriage returns. Not knowing that Netscape had modified the report, I would copy it to a file and test the recipe on this altered copy. The recipe succeeded in test mode because the carriage returns were gone.
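One way to catch this kind of mismatch is to count the DOS line terminators in the raw bytes of a test file before trusting it. The sketch below is only an illustration: the function name count_crlf and both sample messages are hypothetical, standing in for the delivered report and the copy saved from the mail client.

```python
def count_crlf(raw: bytes) -> int:
    """Return the number of CR LF line terminators in raw message bytes."""
    return raw.count(b"\r\n")


# Hypothetical stand-ins for the message as delivered and the copy
# pasted from the mail client's "view source" window.
delivered = b"Route 1 road work\r\nRoute 9 accident\r\n"
saved_copy = b"Route 1 road work\nRoute 9 accident\n"

print(count_crlf(delivered), count_crlf(saved_copy))
```

For a real file, the bytes would have to be read in binary mode, for example with open(path, "rb"), so that Python does not translate the line endings itself.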

Either the incoming traffic reports changed from Unix format to DOS format in mid-April, or my mail server stopped converting DOS format messages to Unix format. I think the former is more likely because I was able to find messages in my mail archives that contain carriage returns and pre-date the onset of the recipe failure, which suggests the server was not converting them back then either. In either case, at least I have a solution now: invoke the dos2unix command on incoming mail before using regular expressions involving newline.
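The effect of that fix can be sketched in Python as well. The helper dos_to_unix below is a hypothetical stand-in for the dos2unix command and covers only its basic behaviour of turning CR LF into LF. The sample body is made up, and the regexp is the fragment quoted in A1, not the full recipe.

```python
import re


def dos_to_unix(raw: bytes) -> bytes:
    """Replace every CR LF pair with a single LF (basic dos2unix behaviour)."""
    return raw.replace(b"\r\n", b"\n")


# Regexp fragment from A1, compiled for bytes; MULTILINE lets "$" match
# before each newline, not only at the end of the whole body.
PATTERN = re.compile(rb"[^ ].........$", re.MULTILINE)

# Hypothetical DOS-format body containing only a road work event.
dos_body = b"Route 1 road work\r\n"
unix_body = dos_to_unix(dos_body)

before = PATTERN.search(dos_body)   # matches because of the trailing carriage return
after = PATTERN.search(unix_body)   # no match once the carriage return is gone

print(before is not None, after is None)
```

In procmail itself the conversion would be done by piping the message through a filter before the recipes that depend on line endings run. This sketch only shows why the conversion changes the outcome.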

Kevin


Tony L. Svanstrom wrote:

Kevin, this might be a stupid question, but since you said that it used to
work and then stopped... Have you checked if the e-mails are encoded?

If they are encoded nowadays, your tests would/could fail, but since the e-mail
client decodes them you might not have noticed (and your saved test messages
would look just as before).




_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail