procmail

Re: hex value in condition

1997-08-07 00:23:00
On Thu, 07 Aug 1997 08:11:07 +0900, Mitsuru Furukawa <furu(_at_)009(_dot_)com>
wrote:
era eriksson <era(_at_)iki(_dot_)fi> wrote:
 era> > Did I clarify the question?
 era> Yes. Can't you just put those four bytes there? There is no notation
 era> for using octal/hexadecimal/whatever codes in place of real
 era> characters. (If you want that, you can write a preprocessor of some
Yes, I could just put the 4 bytes there.
HOWEVER, it could be tricky to enter EUC-coded characters (seldom used on my PC)
in an environment where I have to deal with at least three kinds of
code systems to handle my own language ;-<
So, I came up with the following solution:
HENSHIN=`perl -e 'print "\xca\xd6\xbf\xae"'`
:0
* $ B ?? .*\/$HENSHIN
  { 
    :0
    | echo "$MATCH" > henshin
  } 

I'm unfamiliar with how EUC works, but there's always the possibility
of misalignment (a match that straddles two adjacent characters), so
perhaps you should check for that. If your characters are always
four-byte entities, just require a multiple of four bytes between the
previous line break and the match:

    * $ B ?? ^(....)*\/$HENSHIN
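
To see what the alignment check buys you, here is a small Python sketch.
It is only an illustration: Python's re module is not procmail's regex
engine, the sample lines are made up, and the helper name hits() is
hypothetical. It assumes, as above, that every character occupies exactly
four bytes.

```python
import re

HENSHIN = b"\xca\xd6\xbf\xae"

# Like ".*$HENSHIN": the four bytes may start at any offset in the line.
ANYWHERE = re.compile(b".*" + re.escape(HENSHIN))

# Like "^(....)*$HENSHIN": only whole four-byte units may precede the match.
ALIGNED = re.compile(b"^(?:....)*" + re.escape(HENSHIN), re.MULTILINE)

# Made-up sample lines (assumption: every character is four bytes wide).
good = b"\xa1\xa1\xa1\xa1" + HENSHIN                  # one unit, then the target
straddle = b"\xa1\xa1\xca\xd6" + b"\xbf\xae\xa1\xa1"  # target bytes span two units


def hits(pattern, line):
    """Return True if the compiled pattern matches somewhere in the line."""
    return pattern.search(line) is not None


print(hits(ANYWHERE, straddle))  # True: a false positive across a boundary
print(hits(ALIGNED, straddle))   # False: offset 2 is not a multiple of four
print(hits(ALIGNED, good))       # True: the target starts at offset 4
```

The unanchored pattern accepts the straddling line because it never asks
where the four bytes begin; the anchored one rejects it.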

If, as the case may be, you might have random stretches of normal
ASCII with the 8th bit clear, and four-byte EUC characters which all
have the 8th bit set, you could try something like

    HENSHIN=`echo "yta/rg==" | mmencode -u` # probably not more efficient?
    ASC=`echo "AC1/" | mmencode -u`         # might use Perl after all

    * $ B ?? ^([$ASC]*([^$ASC][^$ASC][^$ASC][^$ASC])*)*\/$HENSHIN

$ASC actually expands to ^@-^?, where ^@ is the null character and ^?
is DEL. If you're comfortable putting bare nulls and other control
characters in your files, you don't need this trick. (Or you can just
approximate with [       -~], that is, tab and space through tilde.)
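
If you want to double-check what those two encoded strings contain and
how the combined pattern behaves, the following Python sketch may help.
It swaps in Python's base64 module for "mmencode -u" and Python's re
module for procmail's own matcher, so treat it as an approximation; the
sample lines and the helper name hits() are invented for the example,
and the four-byte assumption is carried over from the text.

```python
import base64
import re

# Decode the same strings that are piped through "mmencode -u" above.
HENSHIN = base64.b64decode("yta/rg==")
ASC = base64.b64decode("AC1/")

print(HENSHIN)  # b'\xca\xd6\xbf\xae'
print(ASC)      # b'\x00-\x7f', i.e. NUL, a hyphen, DEL

# Build the pattern the way the recipe does, by dropping ASC into brackets:
# runs of 7-bit bytes, interleaved with groups of four 8-bit bytes.
NON = b"[^" + ASC + b"]"
MIXED = re.compile(
    b"^(?:[" + ASC + b"]*(?:" + NON * 4 + b")*)*" + re.escape(HENSHIN),
    re.MULTILINE,
)

# Made-up sample lines: an ASCII prefix followed by four-byte units.
good = b"Re: " + b"\xa1\xa1\xa1\xa1" + HENSHIN + b" ok"
straddle = b"Re: " + b"\xa1\xa1\xca\xd6" + b"\xbf\xae\xa1\xa1"


def hits(line):
    """Return True if the mixed ASCII/four-byte pattern matches the line."""
    return MIXED.search(line) is not None


print(hits(good))      # True
print(hits(straddle))  # False: the target bytes span two units
```

Keep test inputs short if you experiment with this: the nested
repetition can backtrack heavily in a backtracking engine like Python's
when a long line fails to match.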

The mmencode trick is perhaps not more efficient than Perl, but I
wanted to come up with an alternative. (I tried uudecode first but it
became very unwieldy :-)

As a side note, you should probably use the i flag on the echo recipe
(that is, ":0 i"). Since input is being ignored, you can also add h or
b (whichever of header and body is likely to be smaller) to minimize
the amount of data that gets ignored.

Hope this helps,

/* era */

-- 
Defin-i-t-e-ly. Sep-a-r-a-te. Gram-m-a-r.  <http://www.iki.fi/~era/>
 * Enjoy receiving spam? Register at <http://www.iki.fi/~era/spam.html>
