procmail

Re: hex value in condition

1997-08-07 16:08:00
era eriksson <era(_at_)iki(_dot_)fi> wrote:
<snip>
era> I'm unfamiliar with how EUC works, but there's always the possibility
era> of misalignment, so I guess perhaps you should check for that. If your
era> character glyphs are always four-byte entities, just check for
era> multiples of four characters after the previous line break: 
era> 
era>     * $ B ?? ^(....)*\/$HENSHIN
era> 
era> If, as the case may be, you might have random stretches of normal
era> ASCII with the 8th bit clear, and four-byte EUC characters which all
era> have the 8th bit set, you could try something like
era> 
era>     HENSHIN=`echo "yta/rg==" | mmencode -u` # probably not more efficient?
era>     ASC=`echo "AC1/" | mmencode -u`         # might use Perl after all
era> 
era>     * $ B ?? ^([$ASC]*([^$ASC][^$ASC][^$ASC][^$ASC])*)*\/$HENSHIN
era> 
era> $ASC actually expands to ^@-^?, where ^@ is the null character and ^?
era> is DEL. If you're comfortable putting bare nulls and other control
era> characters in your files, you don't need this trick. (Or you can just
era> approximate with [  -~], that is, tab and space through tilde.)
era> 
era> The mmencode trick is perhaps not more efficient than Perl, but I
era> wanted to come up with an alternative. (I tried uudecode first but it
era> became very unwieldy :-)

To be candid, it was a little bit beyond my comprehension ;-(
Especially the "yta/rg==" and "AC1/" parts were puzzling to me.
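
For what it is worth, the two puzzling strings look like plain base64, the MIME encoding that `mmencode -u` undoes. A minimal sketch, assuming a `base64` command with a `-d` decode option (as in GNU coreutils) can stand in for `mmencode -u`; the helper name `hexbytes` is made up here just to display the result:

```shell
# Hypothetical helper: print stdin as one run of lowercase hex digits.
hexbytes() {
  od -An -tx1 | tr -d ' \n'
}

# Assumption: `base64 -d` decodes the same way `mmencode -u` does.
HENSHIN_HEX=$(printf '%s' 'yta/rg==' | base64 -d | hexbytes)
ASC_HEX=$(printf '%s' 'AC1/' | base64 -d | hexbytes)

echo "HENSHIN bytes: $HENSHIN_HEX"
echo "ASC bytes:     $ASC_HEX"
```

Under that assumption, the first string is four bytes (CA D6 BF AE), which would be two 2-byte EUC characters rather than one four-byte glyph, and the second is the three bytes NUL, "-", DEL, that is, the range ^@-^? for use inside [...].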

But if you are concerned about 2-byte Japanese characters and 1-byte
ASCII characters getting mixed up in the matching operation, that
check is not necessary. EUC Japanese characters can coexist with ASCII
characters "safely", and no byte of an EUC character would be
misinterpreted as an ASCII character.
That is the reason I need to convert to the EUC code system before
manipulating Internet mail, which is transmitted in another code system.
FYI, mail changes to yet another splendid code system on my PC!
Sounds exciting? :-)
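
The "safe" coexistence can be checked at the byte level. A rough sketch, assuming the HENSHIN word is the four EUC bytes written in octal below (an assumption, since the raw bytes are not shown in this message); the helper name `count_ascii_bytes` is hypothetical:

```shell
# Hypothetical helper: count the bytes on stdin that have the 8th bit clear.
count_ascii_bytes() {
  LC_ALL=C tr -d '\200-\377' | wc -c | tr -d ' '
}

# Assumed EUC bytes for the HENSHIN word (octal escapes for CA D6 BF AE).
EUC_WORD=$(printf '\312\326\277\256')

# The EUC word alone, then the same word after four ASCII bytes ("Re: ").
IN_EUC=$(printf '%s' "$EUC_WORD" | count_ascii_bytes)
IN_MIXED=$(printf 'Re: %s' "$EUC_WORD" | count_ascii_bytes)

echo "ASCII-range bytes in EUC word:   $IN_EUC"
echo "ASCII-range bytes in mixed line: $IN_MIXED"
```

If the assumption holds, none of the EUC bytes falls in the ASCII range, so only the four bytes of "Re: " are counted in the mixed line.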

era> As a side note, you should probably use an :i flag on the echo recipe.
era> Since input is being ignored, you can also put in an :h or :b
era> (whichever is likely to be smaller) to minimize the amount of data
era> that gets ignored.
era> 
era> Hope this helps,

Thanks.
Actually, I want to use this $HENSHIN match to delete the
quoted portion of a reply message from cc:Mail, that is, everything
from
____________________________ HeNsHiN ________________________________
to the end of the mail.
Does
  sed -e '/^_* $HENSHIN _*$/,$d'
work?
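
One thing to watch in that command: inside single quotes the shell does not expand $HENSHIN, so sed would look for the literal text "$HENSHIN". A minimal sketch of a double-quoted variant, assuming HENSHIN holds the EUC bytes CA D6 BF AE (an assumption) and using a made-up function name `strip_quoted`; it has not been tried inside an actual procmail recipe:

```shell
# Assumed EUC bytes for the HENSHIN word (octal escapes for CA D6 BF AE).
HENSHIN=$(printf '\312\326\277\256')

# Hypothetical helper: delete from the cc:Mail separator line to end of input.
# Double quotes let the shell expand $HENSHIN; \$ passes a literal $ to sed.
# LC_ALL=C makes sed treat the 8-bit bytes as plain single-byte characters.
strip_quoted() {
  LC_ALL=C sed -e "/^_* $HENSHIN _*\$/,\$d"
}

# Made-up sample message: one line of reply, the separator, one quoted line.
SAMPLE=$(printf 'my reply\n_____ %s _____\nquoted text\n' "$HENSHIN")
RESULT=$(printf '%s\n' "$SAMPLE" | strip_quoted)
echo "$RESULT"
```

With the sample input above, only the line before the separator should remain.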
So far, I have received only Japanese cc:Mail messages with such a portion.
To prepare for domestic cc:Mail messages with a quotation,
could anyone tell me the corresponding word in domestic cc:Mail?
Is it "Reply"? "REPLY"? "Quote"? Or?
TIA.
_/_/_/      Mitsuru FURUKAWA      _/_/_/
_/_/_/        Tokyo, Japan        _/_/_/
_/_/_/     mailto:furu(_at_)009(_dot_)com    _/_/_/
_/_/_/  http://www.009.com/furu/  _/_/_/
