Re: Zapping repeated .sig appendices

1996-09-03 19:23:46
i certainly agree that it would be far more courteous for people to do
what you describe, but given the difficulty of breaking millions of people
of (what we consider to be) their bad habits, why not try to solve the
problem another way: with procmail?


Brandon,

Your signature is an excellent example of why it would be very difficult
to automatically remove signatures from incoming e-mail.  Please define
a "rule" by which a program can determine that your e-mail message ends,
and your signature starts.  Please do not make the rule specific to
*your* signature, but make it general, so it can be applied to my
signature, and all of your other correspondents' signatures.

Unless the filter can read English (and even if it could), I can think
of no heuristic to apply which would allow me to determine what was
signature (which should be removed) and what was "content" (which should
be kept).

since the problem is not one of removing *all* signatures that are *ever*
received,  but those that clutter up messages where there has been
repeated replying, this problem isn't as insoluble as it may appear.  

if the replying is between a small number of people, then you would expect
to see the same signatures more than once in the message (at least some of
the time).  this could be the starting point:  any series of three or more
lines that is repeated within the message (with > and similar marks
ignored) would be removed.  the version with the > or | or whatever marks
would be the one(s) removed, and the actual (newly applied) signature
would be left.  At first this would only catch a few signatures, but these
could be saved to a file, so that anytime any set of three or more lines
matched a saved sig (with the > marks ignored, of course, so that it
could match) then the signature would be removed if and only if it did
have the > marks.  so these marks would be ignored for matching purposes,
but they would be used to determine whether or not to remove the
signature, so fresh signatures would be left.  as the signature kill file
grew larger, it could be used to remove more and more people's
extraneous signatures.
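
the repeated-block idea above can be sketched roughly as follows.  this is
only a minimal sketch in Python, not a procmail recipe: the names
zap_repeated_quoted_blocks, known_sigs and min_len are made up for
illustration, and only '>' and '|' are assumed to be quote marks.  a block
is dropped only when it carries quote marks and either repeats within the
message or matches a saved signature.

```python
import re

# Assumption: quote marks are '>' or '|', possibly nested and space-padded.
QUOTE_PREFIX = re.compile(r"^\s*(?:[>|]\s*)+")


def strip_quote(line):
    """Return (text without quote marks, whether the line was quoted)."""
    m = QUOTE_PREFIX.match(line)
    if m:
        return line[m.end():].strip(), True
    return line.strip(), False


def zap_repeated_quoted_blocks(lines, known_sigs=(), min_len=3):
    """Drop quoted runs of min_len lines that repeat or match a saved sig."""
    stripped = [strip_quote(line) for line in lines]
    texts = [text for text, _ in stripped]
    quoted = [was_quoted for _, was_quoted in stripped]

    # Index every window of min_len consecutive lines by its unquoted text.
    windows = {}
    for i in range(len(lines) - min_len + 1):
        key = tuple(texts[i:i + min_len])
        if not any(key):  # a run of blank lines proves nothing
            continue
        windows.setdefault(key, []).append(i)

    # Windows taken from the saved signature "kill file".
    known = set()
    for sig in known_sigs:
        sig_lines = [s.strip() for s in sig]
        for i in range(len(sig_lines) - min_len + 1):
            known.add(tuple(sig_lines[i:i + min_len]))

    drop = set()
    for key, starts in windows.items():
        if len(starts) > 1 or key in known:
            for s in starts:
                # Marks are ignored for matching but decide removal,
                # so a fresh (unquoted) signature is left alone.
                if all(quoted[s:s + min_len]):
                    drop.update(range(s, s + min_len))
    return [line for i, line in enumerate(lines) if i not in drop]
```

note that any quoted text repeated verbatim would be dropped too, not just
signatures, and that writing newly found blocks back to the kill file is
left out here.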


another possible way of finding sigs (which, if it could be
implemented, would work independently of the previous idea, though
perhaps in addition to it) would be to look for repeated use of
non-alphanumeric characters, either in horizontal or vertical repetition.
any column of asterisks, or row of hyphens, to give two examples, would be
an obvious clue to the presence of a signature.  how exactly these clues
would be used is unclear, but i mention this as a possible alternative to
look into.  and again, this couldn't by any means weed out all signatures.
so neither solution is complete and perfect.
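
one way these clues might be collected is sketched below, again as a rough
Python illustration.  the thresholds (5 identical characters in a row, 3
lines in a column) and the names has_row_run, column_run_lines and
clue_lines are assumptions chosen for the example.  the input is assumed
to have its '>' marks already stripped, since a column of quote marks
would otherwise count as a clue.

```python
import re

# Assumption: 5 or more identical non-alphanumeric characters in a row,
# e.g. "-----", "*****" or "_____", count as a horizontal clue.
ROW_RUN = re.compile(r"([^A-Za-z0-9\s])\1{4,}")


def has_row_run(line):
    """True if the line contains a horizontal run of one punctuation char."""
    return ROW_RUN.search(line) is not None


def char_at(line, col):
    """Character at a column, treating short lines as space-padded."""
    return line[col] if col < len(line) else " "


def is_decoration(ch):
    return not ch.isalnum() and not ch.isspace()


def column_run_lines(lines, min_height=3):
    """Indices of lines sharing one punctuation char in the same column
    on at least min_height consecutive lines."""
    hits = set()
    width = max((len(line) for line in lines), default=0)
    for col in range(width):
        start = 0
        for i in range(1, len(lines) + 1):
            same = (i < len(lines)
                    and char_at(lines[i], col) == char_at(lines[start], col))
            if not same:
                # The run start..i-1 has just ended; keep it if it qualifies.
                if (i - start >= min_height
                        and is_decoration(char_at(lines[start], col))):
                    hits.update(range(start, i))
                start = i
    return hits


def clue_lines(lines):
    """Sorted indices of lines showing either kind of repetition."""
    rows = {i for i, line in enumerate(lines) if has_row_run(line)}
    return sorted(rows | column_run_lines(lines))
```

the result is only a list of suspicious line numbers; what to do with them
is still the open question.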



If there were a global convention which all users would agree upon, we
could apply a rule to follow the convention.  Such a convention might be
to begin signatures with a blank line, followed by a line of
underscores, then arbitrary text.

However, there is always the case that someone might not be following
the convention and may include a nice table in their file.  For example,
a table of salary increments which begins with a blank line, and then a
line of underscores would be thrown away from an incoming e-mail
message. 

only if:
          1)  it had resend marks in the left column, e.g. '>'.
          2)  it met whatever other criteria were used to match
              signatures, like <3-12characters>@<3-20characters>
              somewhere in the sig, and whatever other things you
              could find to use
          3)  it fit whatever size range the user specified, like from
              3-10 lines.  (most tables would be larger, i think.)
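
the three conditions above could be combined into one test, sketched here
in Python.  the function name looks_like_quoted_sig and the exact
character classes in the address pattern are assumptions; only the 3-12 /
3-20 length limits and the 3-10 line range come from the list above.

```python
import re

# Assumption: quote marks are '>' or '|' in the left column.
QUOTE_PREFIX = re.compile(r"^\s*(?:[>|]\s*)+")

# <3-12 characters>@<3-20 characters>; the lookarounds stop the pattern
# from matching only a slice of a longer address.
ADDRESS = re.compile(r"(?<![\w.+-])[\w.+-]{3,12}@[\w.-]{3,20}(?![\w.-])")


def looks_like_quoted_sig(block, min_lines=3, max_lines=10):
    """Apply the three conditions to a candidate block of lines."""
    # 3) the block fits the user-specified size range
    if not (min_lines <= len(block) <= max_lines):
        return False
    # 1) every line carries resend marks in the left column
    if not all(QUOTE_PREFIX.match(line) for line in block):
        return False
    # 2) something address-like appears somewhere in the block
    return any(ADDRESS.search(line) for line in block)
```

a quoted salary table would pass condition 1 but would normally fail 2 or
3, which is the point of requiring all of them together.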





In case I haven't made my point yet, here is one of the many signatures
I've collected over the years.  How can you tell where the e-mail ends
and the signature starts?

" Work is for people who can't play volleyball. "
                      
                    o
                         ,, 
         ,    ,,  ____    \\                            
        o|    ||  |  |      o                          
        |/'   o   |__|      |              
        |     |   |  |      |             o
        />    |   |  |      <\           //\
       //    />   |  |       \\         '' <\    
            //    |  |                      \\
////////\\\\__////////////////\\\\\\\\\\\\\\__///////////////////////
_____________________________________________________________________
Alan Stebbens <stebbens(_at_)sgi(_dot_)com>      
http://reality.sgi.com/stebbens



some signatures, or parts of signatures, would slip through.

i'm not denying that this would be a difficult project, just that
reforming the entire internet community would be far harder, especially
on a relatively small point (compared to, say, flaming or spamming).

  -brandon


p.s.  please note this is the only copy of my signa
  .o8                                            .o8                        
 "888                                           "888                        
  888oooo.  oooo d8b  .oooo.   ooo. .oo.    .oooo888   .ooooo.  ooo. .oo.   
  d88' `88b `888""8P `P  )88b  `888P"Y88b  d88' `888  d88' `88b `888P"Y88b  
  888   888  888      .oP"888   888   888  888   888  888   888  888   888  
  888   888  888     d8(  888   888   888  888   888  888   888  888   888  
  `Y8bod8P' d888b    `Y888""8o o888o o888o `Y8bod88P" `Y8bod8P' o888o o888o 

  brandon d. zylstra                                     
brandon(_at_)umich(_dot_)edu
