I'm embarrassed that it took me so long to figure out how to enter [ESC]$B.
Using vi, I knew I had to use Ctrl+V, but I didn't realize I actually
had to press the Esc key :-) .
I thank era eriksson, Satoru Manita, and Philip Guenther for their
kindness and patience.
* ^[\$B (entered with vi by Ctrl+V,Esc,\,$,B)
does a fine job of finding all posts containing Japanese, with one
exception. Posts from people using the AOL 4.0 mailer seem to have
Japanese coded in a completely different way. No Esc at all. What I
see is mostly pairs of characters separated by "^" -- I'm guessing this
is Ctrl. The first character of each pair is a capital letter (very
often "A") with some diacritic, e.g., circumflex, acute, or grave. The
second character can be anything, it seems. I can't show you the
diacritics, but a typical line is
^At^AI^Ie^I^C^AA^N1/4^AI^AR^A}^A^Ah and so on
(the "1/4" is a single character)
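To look at the raw bytes outside of vi, something like the following Python sketch might help. It checks a byte string for the three bytes ESC, $, B and counts bytes with the high bit set. The function names are made up for illustration, and Shift_JIS is used only as an example of an 8-bit Japanese encoding that has no escape sequences; nothing here establishes which encoding the AOL 4.0 mailer actually uses.

```python
# Sketch only: compare how two Japanese encodings look at the byte level.
# Assumption: Shift_JIS merely stands in for "some 8-bit encoding without Esc";
# the real AOL 4.0 encoding is not known from the post.

ESC_DOLLAR_B = b"\x1b$B"  # the three bytes ESC, $, B


def has_jis_escape(raw: bytes) -> bool:
    """True if the bytes contain the ESC $ B sequence the recipe looks for."""
    return ESC_DOLLAR_B in raw


def count_high_bytes(raw: bytes) -> int:
    """Number of bytes with the high bit set (values 128-255)."""
    return sum(1 for b in raw if b >= 0x80)


sample = "日本語"
jis = sample.encode("iso2022_jp")   # 7-bit encoding that switches sets with escapes
sjis = sample.encode("shift_jis")   # 8-bit encoding, no escape sequences

for name, raw in (("iso2022_jp", jis), ("shift_jis", sjis)):
    print(name, raw, has_jis_escape(raw), count_high_bytes(raw))
```

Running the same two functions on the raw bytes of one of the AOL posts (read in binary mode) would show whether it has any Esc at all and how many 8-bit bytes it carries.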
I think I could find this stuff by searching on [Ctrl] plus, say,
capital A with a circumflex accent (I believe this is character code
194, right?), or capital A with an acute accent (code 193?). How can I
put these in a recipe? Or could I use weighted scoring and look for
many [Ctrl]? Is it possible to use vi to enter just a [Ctrl] in the
recipe, and then count them?
* -1^0
* 1^1 [Ctrl] plus A with a circumflex accent
* -5^0
* 1^1 [Ctrl]
are the kind of things I'm thinking of.
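As a rough way to reason about what those two conditions would do, here is a small Python sketch of the arithmetic. It assumes that a bare "* -5^0" line contributes -5 exactly once, that "* 1^1 pattern" adds 1 for every match, and that the recipe fires only when the total is positive. It also assumes the accented letters are single ISO 8859-1 bytes (194 and 193). The function name and the sample bytes are invented for illustration; this is not procmail itself and the sample is not real AOL output.

```python
import re

# Assumption: the accented capitals are single ISO 8859-1 (Latin-1) bytes.
A_CIRCUMFLEX = bytes([194])  # decodes to "Â" in Latin-1
A_ACUTE = bytes([193])       # decodes to "Á" in Latin-1

# Byte pattern matching either accented capital A.
PATTERN = re.compile(b"[" + A_ACUTE + A_CIRCUMFLEX + b"]")


def additive_score(base: int, weight: int, pattern, body: bytes) -> int:
    """Model of `* base^0` followed by `* weight^1 pattern`.

    The base is added once; every non-overlapping match adds `weight`.
    """
    return base + weight * len(pattern.findall(body))


# Hypothetical message bodies: a control byte, an accented A, a plain letter.
chunk = bytes([1, 194, ord("t")])
six_hits = chunk * 6
five_hits = chunk * 5

for body in (six_hits, five_hits):
    score = additive_score(-5, 1, PATTERN, body)
    print(score, "match" if score > 0 else "no match")
```

Under those assumptions, a base of -5 means at least six matches are needed before the score goes positive, and a base of -1 means at least two.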
With Philip Guenther's help I've relearned some basic weighted scoring,
even though I didn't need it to find [ESC]$B. Sorry for the dumb
questions.
Dick Moores rdm(_at_)netcom(_dot_)com