Re: Re^2: Condition with BODY length?

Mitsuru Furukawa <furu(_at_)009(_dot_)com> writes:

Philip Guenther <guenther(_at_)gac(_dot_)edu> wrote:
<snip>
Philip>        :0 bw
Philip>        |forward_to_pager
Philip> 
Philip> Where forward_to_pager contains:
Philip> 
Philip>        #!/usr/local/bin/perl
Philip>        # ...or wherever
Philip>        undef $/;
Philip>        $_ = <>;
Philip>        s/\n/ /g;
Philip>        while(length($_)) {
Philip>            open(PAGER, "|pager -with -any -needed -args") or die "fork: $!";
Philip>            print PAGER substr($_, 0, 100), "\n";
Philip>            close(PAGER) or die "exec or write: $!";
Philip>            substr($_, 0, 100) = '';
Philip>        }

Frankly, I need to study perl a little bit more to understand this.
So meanwhile, could you comment on my version of the recipe below?
It works fine in the one-byte world, but not on my two-byte-world emails.
I need to figure out some way to avoid cutting a two-byte char
in the middle. Any ideas?


Oog.  Multibyte characters are nasty.  Is the pager limit 100 bytes, or
100 characters?  Does your pager even understand multibyte
characters, or do they come through as gobbledygook?

I heard that substr in jperl does not support two-byte chars but split does.
Anyway, I would prefer to solve it using "domestic" perl, probably by
comparing the last byte with the char-code range and ......


Hmm, it sounds like jperl supports multibyte characters in regexp
searches, in which case you may be able to replace the substr() calls
above with something like:

        s/^(.{1,100})//;
        print PAGER "$1\n";

With a non-multibyte-aware perl you would be best off building the string
to send byte by byte, keeping track of how many actual characters you've
gotten so far and quitting when that hits your limit.
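
To make the byte-by-byte idea concrete, here is a minimal sketch in plain
perl.  It assumes an EUC-JP-style encoding in which any byte of 0x80 or
above starts a two-byte character; that is my assumption, not something
established in this thread, and Shift_JIS or ISO-2022-JP mail would need
a different width test.  The names char_width, next_chunk and split_chars
are made up for illustration, and the limit is counted in characters.

```perl
#!/usr/local/bin/perl
use strict;

# ASSUMPTION: a byte >= 0x80 starts a two-byte character (EUC-JP style,
# ignoring the SS2/SS3 prefixes).  Replace this test for other encodings.
sub char_width {
    my ($byte) = @_;
    return ord($byte) >= 0x80 ? 2 : 1;
}

# Return (chunk, rest), where chunk holds at most $limit whole characters.
sub next_chunk {
    my ($text, $limit) = @_;
    die "limit must be at least 1\n" if $limit < 1;
    my $len = length($text);
    my ($pos, $chars) = (0, 0);
    while ($pos < $len && $chars < $limit) {
        my $width = char_width(substr($text, $pos, 1));
        $width = $len - $pos if $width > $len - $pos;   # truncated last char
        $pos += $width;
        $chars++;
    }
    return (substr($text, 0, $pos), substr($text, $pos));
}

# Split the whole string into a list of chunks.
sub split_chars {
    my ($text, $limit) = @_;
    my @chunks;
    while (length($text)) {
        my ($chunk, $rest) = next_chunk($text, $limit);
        push @chunks, $chunk;
        $text = $rest;
    }
    return @chunks;
}

# Demo: two ASCII bytes, two two-byte characters, one ASCII byte.
my @demo = split_chars("ab\xA4\xA2\xA4\xA4c", 2);
print scalar(@demo), " chunks, ", join(",", map { length } @demo), " bytes\n";
```

In forward_to_pager the loop would then send each element of
split_chars($_, 100) to the pager in place of the substr() calls.  If the
pager limit turns out to be in bytes rather than characters, the loop
would instead stop when $pos + $width would exceed the limit.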

# split body of mail into 100 byte(actually 96 to leave 4 for SEQ) files
# THIS WORKS FINE IN ONE-BYTE WORLD!!!!!!!!!!!!!!
# use sh control flow with external files

  YMDHMSP=`date "+%y%m%d.%H%M%S.$$"`
  :0bfw         # remove empty lines, leading whitespace, quoted lines
  | sed -e '/^[       ]*$/d' -e 's/^[         ]*//' \
  -e '/^[     ]*>.*$/d'
  :0bfw         # more efficient to use tr > THANKS TO PHILIP.
   | perl -pe 's/\n/ /g' 
  :0bcw
  | split -b96 - $YMDHMSP
  SPLITFILES=`ls -1 $YMDHMSP*`
  COUNT=`echo $SPLITFILES | wc -w`
  :0
  * $ ? test -f $SPLITFILES
  | i=1 ;\
    for file in $SPLITFILES; do \
       (echo "Subject: paging from splitcharloop.rc [$i/$COUNT]" ; \
        echo "To: mynumber(_at_)pager(_dot_)domain" ; \
        echo "From: furu(_at_)009(_dot_)com" ; \
        echo "Cc: furu(_at_)009(_dot_)com" ; \
        echo "X-Loop: furu(_at_)009(_dot_)com" ; \
        echo "$i>" ; cat "$file" ; \
        ) | $SENDMAIL -oi -t ; \
        i=`expr $i + 1` ; \
    done ; \
    rm -f "$YMDHMSP"*


Looks like it would work to me, assuming the _program_ split is multibyte
character aware.

Philip>        :0
Philip>        * B ?? 1^1 > 1
Philip>        { }
Philip>        BODYLEN = $=

I understand that a condition line such as * 1^1 . counts up the length
of the search area.
But I do not understand how 1^1 > 1 counts up the length.
Or is it B that puts the length value in $= ?
Could you explain?


When in doubt, check the manpage (procmailsc(5)):

Weighted length conditions
     If the length of the actual mail is M then:
 
          * w^x  > L
 
     will generate an additional score of:
 
                     x
              /  M  \
          w * | --- |
              \  L  /
 
     And:
 
          * w^x  < L
 
     will generate an additional score of:
 
                     x
              /  L  \
          w * | --- |
              \  M  /
 
     In both cases, if L=M, this will add w to the score.  In the
     former  case  however, larger mails will be favoured, in the
     latter case smaller mails will be favoured.


With w=x=L=1, you get

        1*(M/1)^1 == M
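
If you want to play with the formula, here is a quick sketch in perl.
length_score is a made-up name; it only mirrors the arithmetic quoted
from the manpage and is not procmail's actual code, so any internal
rounding or limits procmail applies are not modelled.

```perl
#!/usr/local/bin/perl
use strict;

# Score added by "* w^x > L" or "* w^x < L" for a mail of length $m,
# following the manpage formula only.
sub length_score {
    my ($w, $x, $op, $l, $m) = @_;
    my $ratio = $op eq '>' ? $m / $l : $l / $m;
    return $w * $ratio ** $x;
}

print length_score(1, 1, '>', 1, 2500), "\n";     # "1^1 > 1" on a 2500-byte body
print length_score(1, 1, '>', 5000, 2500), "\n";  # "1^1 > 5000" on the same body
```

The first call gives the length itself, which is why the recipe can pick
it up from $=; the second shows how a larger L scales the score down.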


Philip Guenther