On Thu, 07 Aug 1997 08:11:07 +0900, Mitsuru Furukawa <furu(_at_)009(_dot_)com>
wrote:
era eriksson <era(_at_)iki(_dot_)fi> wrote:
era> > Did I clarify the question?
era> Yes. Can't you just put those four bytes there? There is no notation
era> for using octal/hexadecimal/whatever codes in place of real
era> characters. (If you want that, you can write a preprocessor of some
Yes, I could just put 4 bytes,
HOWEVER it could be tricky to put EUC code chars (seldom used on my PC)
in an environment where I have to deal with at least three kinds of
code system to handle my own language;-<
So, I came up with the following solution:
HENSHIN=`perl -e 'print "\xca\xd6\xbf\xae"'`
:0
* $ B ?? .*\/$HENSHIN
{
:0
| echo "$MATCH" > henshin
}
I'm unfamiliar with how EUC works, but there's always the possibility
of misalignment, so I guess perhaps you should check for that. If your
character glyphs are always four-byte entities, just check for
multiples of four characters after the previous line break:
* $ B ?? ^(....)*\/$HENSHIN
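To see what the ^(....)* anchor buys you, here is a rough sketch outside of procmail. It assumes grep -E behaves closely enough to procmail's egrep-style regexps for this purpose, and it uses the ASCII string WXYZ as a stand-in for the four-byte sequence; the function name aligned is made up for the illustration.

```shell
# Succeeds only if WXYZ starts at an offset that is a multiple of four
# from the beginning of the line (stand-in for the four EUC bytes).
aligned() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^(....)*WXYZ'
}

# Needle at offset 4: one full four-byte group precedes it.
if aligned "abcdWXYZ"; then r1=match; else r1=nomatch; fi

# Needle at offset 2: it straddles a group boundary, so it is rejected.
if aligned "abWXYZcd"; then r2=match; else r2=nomatch; fi

echo "$r1 $r2"
```

A plain .*WXYZ would accept both lines, which is exactly the misalignment problem.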
If, as the case may be, you might have random stretches of normal
ASCII with the 8th bit clear, and four-byte EUC characters which all
have the 8th bit set, you could try something like
HENSHIN=`echo "yta/rg==" | mmencode -u` # probably not more efficient?
ASC=`echo "AC1/" | mmencode -u` # might use Perl after all
* $ B ?? ^([$ASC]*([^$ASC][^$ASC][^$ASC][^$ASC])*)*\/$HENSHIN
$ASC actually expands to ^@-^?, where ^@ is the null character and ^?
is DEL. If you're comfortable putting bare nulls and other control
characters in your files, you don't need this trick. (Or you can just
approximate with [ -~], that is, tab and space through tilde.)
The mmencode trick is perhaps not more efficient than Perl, but I
wanted to come up with an alternative. (I tried uudecode first but it
became very unwieldy :-)
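If you want to double-check what the two encoded strings contain, something like the following should do. This is only a sketch: it assumes a base64 command that accepts -d (as in GNU coreutils) in place of mmencode -u, and the helper name hexbytes is invented for the example.

```shell
# Decode a base64 string and print its bytes as lowercase hex, no spaces.
# Assumption: base64 -d is available; mmencode -u plays the same role.
hexbytes() {
    printf '%s' "$1" | base64 -d | od -An -tx1 | tr -d ' \n'
}

henshin_hex=$(hexbytes "yta/rg==")   # the four bytes being searched for
asc_hex=$(hexbytes "AC1/")           # NUL, hyphen, DEL: the range ^@-^?

echo "HENSHIN bytes: $henshin_hex"
echo "ASC bytes:     $asc_hex"
```

The first line should show ca d6 bf ae run together, matching the \x escapes in the Perl version, and the second 00 2d 7f.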
As a side note, you should probably put an i flag on the echo recipe
(:0 i), since echo never reads its input and procmail would otherwise
see a write error. You can also add h or b (whichever part is likely
to be smaller) to minimize the amount of data that gets fed to echo
and ignored.
Hope this helps,
/* era */
--
Defin-i-t-e-ly. Sep-a-r-a-te. Gram-m-a-r. <http://www.iki.fi/~era/>
* Enjoy receiving spam? Register at <http://www.iki.fi/~era/spam.html>