smkelly(_at_)filenet(_dot_)com said:
- I need to convert the strings to 2 Unicode formats,
one like this \ua9e0 for each character
one like this &#a9e0 for each character
I think in the latter case, you might really want "&#43488;" (decimal
number, terminated with semi-colon), if your intention is to produce HTML
numeric entities for Unicode characters.
One basic approach (assuming $_ contains a utf8 string) is:
# convert non-ascii to "\uHHHH":
s/([^[:ascii:]])/sprintf("\\u%04x",ord($1))/eg;
# convert non-ascii to "&#nnnn;":
s/([^[:ascii:]])/sprintf("&#%d;",ord($1))/eg;
and similarly for other variants. Look at the perlre man page (perlrecharclass
in newer Perl releases), in the section on "POSIX character class syntax",
regarding the "[:ascii:]" expression.
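
As a rough sketch of how the two substitutions could be tried out end to end: the helper names (to_u_escapes, to_numeric_entities) and the sample input are made up for illustration, and the input is assumed to arrive as raw UTF-8 bytes, so it is decoded into Perl characters first. Without that step, the regex would match individual bytes rather than whole characters.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: replace each non-ASCII character with \uHHHH.
sub to_u_escapes {
    my ($text) = @_;
    $text =~ s/([^[:ascii:]])/sprintf("\\u%04x", ord($1))/eg;
    return $text;
}

# Hypothetical helper: replace each non-ASCII character with a
# decimal HTML numeric entity, &#nnnn; (note the trailing semicolon).
sub to_numeric_entities {
    my ($text) = @_;
    $text =~ s/([^[:ascii:]])/sprintf("&#%d;", ord($1))/eg;
    return $text;
}

# Assumed sample input: UTF-8 bytes for "cafe" with e-acute, a space,
# and the character U+A9E0. decode() turns the bytes into characters.
my $bytes = "caf\xC3\xA9 \xEA\xA7\xA0";
my $chars = decode('UTF-8', $bytes);

print to_u_escapes($chars), "\n";         # caf\u00e9 \ua9e0
print to_numeric_entities($chars), "\n";  # caf&#233; &#43488;
```

One caveat with the \u form: for characters above U+FFFF, "%04x" produces five or more hex digits. Formats that expect exactly four digits after \u (JSON and Java source, for example) need such characters written as a surrogate pair instead, which this sketch does not do.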
David Graff