procmail

Re: Removing multiple spaces from Subject lines

2001-04-08 10:11:30
From: Trevor Jenkins <trevor(_dot_)jenkins(_at_)suneidesis(_dot_)com>

On Fri, 6 Apr 2001, Dallman Ross <dman(_at_)nomotek(_dot_)com> wrote:

From: Trevor Jenkins <trevor(_dot_)jenkins(_at_)suneidesis(_dot_)com>

Now to solve the problem of overly long subject lines. 
                            ^^^^^^^^^^^^^^^^^^^^^^^^^

The following section of sed code, taken from a longer set of
statements

I tried a completely different approach to this part of the problem
(expunging multiple spaces). I used tr with its --squeeze-repeats
option. Works fine. :-)

Although it might be less efficient to run tr than sed, using tr
makes it [snip]

I'm confused at the continuity, or lack thereof.  I see, of course,
that the subject-line of this thread is mainly about killing extra
spaces.  But the tangent I'd addressed was about long subject-lines.
Now you've quoted that, and replied, sans segue, as if I'd been
discussing killing extra spaces again.  IDGI.  (I don't get it.)

As for killing extra spaces, last year I posted a way to do that
with procmail alone.  I don't claim that it's more efficient or
easier to follow than a sed or tr approach, though.  But here it
is, again.

For my own purposes, I was trying to capture subjects containing
"commands" (defined as such only by me and my .procmailrc).  These
"commands" begin with "#" in the subject and contain only all-
caps.  I use the model for various things.  But this can, of
course, be reused generally for any header-line that is captured
to MATCH.

The indented part is from a .procmailrc.  Note that "$WHITESPACE" has
already been defined somewhere above as a space or a tab:


  :0 Di  # accepted commands start with `#' and are all-caps
  * $ SUBJ ?? ^^#.*\/[^$WHITESPACE].*
  * ! MATCH ?? [a-z]
  {
     myCMD  = $MATCH  # myCMD is now the part of SUBJ after `#' and whitespace

     INCLUDERC = .normalize_whitespace.rc

     ## The INCLUDERC gets rid of inappropriate whitespace, if
     ## any, in myCMD (without calling an external program like
     ## sed).  It will be invoked recursively, at most  n - 1
     ## times, where `n' represents the number of words expected
     ## in the myCMD expression.  Since procmail has no natural
     ## looping mechanism, we have to kludge one by putting
     ## our pseudo-function recipe set into an INCLUDERC.  The
     ## INCLUDERC then calls itself recursively until the test
     ## fails.  (My thanks to Volker Kulhmann for reminding me
     ## of this trick.)  The INCLUDERC is viewable via WWW at:
     ##
     ## http://nomotek.com/_private/.normalize_whitespace.rc


     LOG = "  ::: myCMD IS >$myCMD< $NL"
     
  }  # Now myCMD's words are separated by exactly one space.
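
For readers who want to trace the two conditions outside procmail, here
is a rough Python sketch of the same acceptance test.  It is only an
illustration, not part of the .procmailrc: the function name
extract_command is made up for this example, and the sketch assumes that
$WHITESPACE holds a space and a tab.

```python
import re


def extract_command(subj):
    """Return the text after '#' and any whitespace, or None if rejected.

    Hypothetical helper; it follows the comments in the recipe above.
    """
    # First condition: the subject must start with '#'.  The captured part
    # begins at the first non-whitespace character after the '#'.
    m = re.match(r"#[ \t]*([^ \t].*)", subj)
    if m is None:
        return None
    cmd = m.group(1)
    # Second condition (negated in the recipe): any lowercase letter means
    # this is not a command.  The D flag makes the test case-sensitive,
    # which is also how Python's re behaves by default.
    if re.search(r"[a-z]", cmd):
        return None
    return cmd
```

Under these assumptions, extract_command("#  SEND   HELP") gives back
"SEND   HELP" with its inner spacing untouched, while "#send help" and
"Re: #FOO" are both rejected.  Cleaning up the inner spacing is the job
of the INCLUDERC.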


Okay, so what was in the INCLUDERC file ".normalize_whitespace.rc"?
Here it is:


:0 bi
* $ myCMD ?? ^^\/.*($TAB|$SPACE$SPACE)
* $ MATCH ?? ^^\/.*[^$WHITESPACE]
{
   # If we're here, there was some inappropriate whitespace

   headCMD = $MATCH

   :0 bi
   * $ myCMD ?? ^^$headCMD[$WHITESPACE]+\/[^$WHITESPACE].*
   { tailCMD = $MATCH }

   myCMD = "${headCMD}${SPACE}${tailCMD}"

   INCLUDERC = .normalize_whitespace.rc
}
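
The head/tail recursion can be hard to picture from the recipe alone, so
here is a rough Python sketch of the same steps.  It is an illustration
under stated assumptions, not a replacement for the INCLUDERC: the
function name normalize_whitespace is made up, $WHITESPACE is assumed to
be a space and a tab, and an empty tail is assumed when the inner recipe
does not match.

```python
import re


def normalize_whitespace(cmd):
    """Collapse runs of spaces/tabs between words, one run per call."""
    # Outer condition 1: greedy match from the start through the LAST tab
    # or double space in the string.
    m = re.match(r"(.*(?:\t|  ))", cmd)
    if m is None:
        return cmd  # nothing left to fix; the recursion stops here
    # Outer condition 2: trim that match so it ends on a non-whitespace
    # character.  This is what the recipe stores in headCMD.
    h = re.match(r"(.*[^ \t])", m.group(1))
    if h is None:
        return cmd  # only whitespace before the run; the block is skipped
    head = h.group(1)
    # Inner recipe: tailCMD starts at the first non-whitespace character
    # after the head and the whitespace run.  re.escape is this sketch's
    # choice, so that regex metacharacters in the head are taken literally.
    t = re.match(re.escape(head) + r"[ \t]+([^ \t].*)", cmd)
    tail = t.group(1) if t else ""  # assumption: empty tail on no match
    # Rejoin with exactly one space, then "include" ourselves again.
    return normalize_whitespace(head + " " + tail)
```

Because the outer match is greedy, each pass repairs the last offending
run first.  For "SEND   HELP  NOW" the first pass yields
"SEND   HELP NOW", the second "SEND HELP NOW", and the third finds
nothing to fix: two rewrites for three words, in line with the n - 1
bound mentioned in the recipe comments.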

-- 
Netcom has imploded.  Please now use NOTnetcom.com for mail.
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Ex-Netcommies:  Mail "forwards" for free forwarding service!
NOT affiliated with EarthLink, Inc.'s Netcom brand identity.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
