procmail

Re: PDF spam

2007-07-22 12:10:04
Yes, but the problem with that is that spammers put images inside the PDF
documents, so you would probably have to extract the image from the
PDF and then run OCR on it to capture any spam-related keywords.


On 7/22/07, fleet(_at_)teachout(_dot_)org <fleet(_at_)teachout(_dot_)org> wrote:

Look at the first five (or so) characters of the base64-encoded attachment.
The encoding of a PDF file always seems to start with 'JVBER'.  One can also
identify GIF, JPG, PNG, etc. the same way.

So something like:

# Body line starting with the base64 encoding of a PDF header: file in PDF-junk
:0B
* ^JVBER
PDF-junk
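
As a rough check of the prefixes mentioned above, here is a minimal Python
sketch (not part of the original post) that base64-encodes the well-known
leading bytes of each file type. The names MAGIC, base64_prefix and PREFIXES
are made up for illustration. Only the characters that are fully determined
by those leading bytes are kept, since the character after them depends on
whatever follows in the file.

```python
import base64

# Well-known leading ("magic") bytes of some common attachment types.
MAGIC = {
    "pdf": b"%PDF-",
    "gif": b"GIF8",
    "jpg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
}


def base64_prefix(magic: bytes) -> str:
    """Return the base64 characters fully determined by the magic bytes."""
    # Each base64 character carries 6 bits, so only this many characters
    # are independent of the bytes that follow the magic number.
    stable_chars = (len(magic) * 8) // 6
    return base64.b64encode(magic).decode("ascii")[:stable_chars]


PREFIXES = {name: base64_prefix(magic) for name, magic in MAGIC.items()}

if __name__ == "__main__":
    for name, prefix in PREFIXES.items():
        print(name, prefix)
```

The PDF entry comes out as JVBERi, which agrees with the 'JVBER' used in the
recipe. The other prefixes could be dropped into similar conditions, assuming
the attachment is base64-encoded and its first encoded line starts at the
beginning of a body line.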

                               - fleet -
____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail

