perl-unicode

Re: In-Band Information Considered Harmful

1998-10-22 21:55:17
On Thu, Oct 22, 1998 at 11:32:58PM -0400, Ilya Zakharevich wrote:
Suppose that utf8.pm knows about screen-width of chars (whatever this
means, for me width 0 and 1 is enough).  Suppose that <b> and </b>
above denote 0-screen-width inband data (say, encoded as utf chars
above 1<<32 which are known to be 0-width, or Unicode inband
"Language" chars).  Then
  use re_ignore_zerowidth_char;
or
  use utf8_screen_width;
(or whatever) makes /hello there/i match "<b>Hello</b> there!".  
I already raised this question here on p5p in slightly different terms,
without reference to screen-width.  Now I think it pairs well with
screen width support.
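
For concreteness, the effect described above can be roughly emulated in plain Perl without either pragma. This is only a sketch under stated assumptions: the two private-use code points standing in for <b> and </b>, and the helper name zw_tolerant, are hypothetical and are not part of utf8.pm or of any proposed module. The helper builds a pattern that tolerates any run of markers between the characters of a literal string.

```perl
use strict;
use warnings;

# Hypothetical zero-width in-band markers: two private-use code points
# playing the roles of <b> and </b>. These are assumptions made for
# illustration, not anything utf8.pm defines.
my $B_ON  = "\x{E000}";
my $B_OFF = "\x{E001}";

# Emulate "ignore zero-width chars" by allowing any run of markers
# between consecutive characters of a literal string.
sub zw_tolerant {
    my ($literal) = @_;
    my $gap  = "[\x{E000}\x{E001}]*";
    my $body = join $gap, map { quotemeta($_) } split //, $literal;
    return qr/$body/i;
}

my $text = "${B_ON}Hello${B_OFF} there!";
my $re   = zw_tolerant("hello there");
print $text =~ $re ? "match\n" : "no match\n";
```

Run as written, this prints "match". The same result could be had by deleting the markers from a copy of the string before matching, but then match offsets would no longer line up with the original text.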

Interesting.

Allow the re engine to be more useful... but what would you do about
tags with arguments?

    <font size=+2>H</font>appy now I am.

You can't assign a unicode character to each combination/permutation... :-)

mark

--
  markm(_at_)nortel(_dot_)ca  /  markm(_at_)freenet(_dot_)carleton(_dot_)ca     
 _______________________
.  .  _  ._  . .   .__    .  . ._. .__ .   . . .__  | Northern Telecom Ltd. |
|\/| |_| |_| |/    |_     |\/|  |  |_  |   |/  |_   | Box 3511, Station 'C' |
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, ON    K1Y 4H7 |